I need to parse a Matroska file. The first few bytes of the file are given below.
0x1a 0x45 0xdf 0xa3 0x01 0x00 0x00 0x00 0x00 0x00 0x00 0x23 0x42 0x86 0x81 0x01
0x42 0xf7 0x81 0x01 0x42 0xf2 0x81 0x04 0x42 0xf3 0x81 0x08 0x42 0x82 0x88 0x6d
0x61 0x74 0x72 0x6f 0x73 0x6b 0x61 0x42 0x87 0x81 0x04 0x42 0x85 0x81 0x02 0x18
0x53 0x80 0x67 0x01 0x00 0x00 0x00 0x00 0x33 0xdb 0x10 0x11 0x4d 0x9b 0x74 0x40
0x42 0xbf 0x84 0x11 0xac 0x83 0x8a 0x4d 0xbb 0x8b 0x53 0xab 0x84 0x15 0x49 0xa9
0x66 0x53 0xac 0x81 0xe5 0x4d 0xbb 0x8c 0x53 0xab 0x84 0x16 0x54 0xae 0x6b 0x53
0xac 0x82 0x01 0x56 0x4d 0xbb 0x8c 0x53 0xab 0x84 0x12 0x54 0xc3 0x67 0x53 0xac
0x82 0x11 0x5c 0x4d 0xbb 0x8d 0x53 0xab 0x84 0x1c 0x53 0xbb 0x6b 0x53 0xac 0x83
0x33 0xd9 0x1c 0xec 0x01 0x00 0x00 0x00 0x00 0x00 0x00 0x94 0x00 0x00 0x00 0x00
0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
I am trying to parse this file and have parsed the first 59 bytes successfully, so I am now at the 60th byte. Starting there, the bytes are 0x11 0x4d 0x9b 0x74, which means the SeekHead is starting.
I used mkvinfo to view the parsed data. According to mkvinfo, the SeekHead starts at 59, which is fine.
It looks like the first Seek entry starts at 71. What is stored between positions 59 and 71? This is the part I am not able to understand.
Can somebody please help me understand this part?
You should parse the bytes like this† (refer to the Matroska specification for details):
A simplified ASCII graphic of the structure of these specific bytes is as follows:
I've drawn the master elements (elements which contain other elements) as boxes.
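If you want to reproduce this nesting programmatically, here is a minimal Python sketch that walks the bytes from the question and prints each element's offset, indented by depth. The names `NAMES`, `MASTERS`, `read_id`, `read_size` and `walk` are my own for illustration and do not come from any library. The ID table covers only the elements that occur in this dump, and unknown-size elements are not handled.

```python
# First 144 bytes of the file, copied from the hex dump in the question.
DATA = bytes.fromhex(
    "1a 45 df a3 01 00 00 00 00 00 00 23 42 86 81 01"
    "42 f7 81 01 42 f2 81 04 42 f3 81 08 42 82 88 6d"
    "61 74 72 6f 73 6b 61 42 87 81 04 42 85 81 02 18"
    "53 80 67 01 00 00 00 00 33 db 10 11 4d 9b 74 40"
    "42 bf 84 11 ac 83 8a 4d bb 8b 53 ab 84 15 49 a9"
    "66 53 ac 81 e5 4d bb 8c 53 ab 84 16 54 ae 6b 53"
    "ac 82 01 56 4d bb 8c 53 ab 84 12 54 c3 67 53 ac"
    "82 11 5c 4d bb 8d 53 ab 84 1c 53 bb 6b 53 ac 83"
    "33 d9 1c ec 01 00 00 00 00 00 00 94 00 00 00 00"
)

# Element IDs that appear in this dump (IDs keep their length-marker bits).
NAMES = {
    0x1A45DFA3: "EBML", 0x4286: "EBMLVersion", 0x42F7: "EBMLReadVersion",
    0x42F2: "EBMLMaxIDLength", 0x42F3: "EBMLMaxSizeLength", 0x4282: "DocType",
    0x4287: "DocTypeVersion", 0x4285: "DocTypeReadVersion",
    0x18538067: "Segment", 0x114D9B74: "SeekHead", 0xBF: "CRC-32",
    0x4DBB: "Seek", 0x53AB: "SeekID", 0x53AC: "SeekPosition", 0xEC: "Void",
}
# Master elements contain other elements instead of a plain value.
MASTERS = {0x1A45DFA3, 0x18538067, 0x114D9B74, 0x4DBB}


def vint_length(first_byte):
    """Width in bytes of a variable-length integer, from its first byte."""
    if first_byte == 0:
        raise ValueError("invalid variable-length integer")
    return 9 - first_byte.bit_length()  # leading zero bits + 1


def read_id(data, pos):
    """Return (element ID, width). The marker bits stay part of the ID."""
    n = vint_length(data[pos])
    return int.from_bytes(data[pos:pos + n], "big"), n


def read_size(data, pos):
    """Return (data size, width). The marker bit is masked off the value."""
    n = vint_length(data[pos])
    raw = int.from_bytes(data[pos:pos + n], "big")
    return raw & ((1 << (7 * n)) - 1), n


def walk(data, pos, end, depth=0):
    """Collect (offset, depth, name, data size) for every element in [pos, end)."""
    rows = []
    while pos < end:
        elem_id, id_len = read_id(data, pos)
        size, size_len = read_size(data, pos + id_len)
        body = pos + id_len + size_len
        rows.append((pos, depth, NAMES.get(elem_id, hex(elem_id)), size))
        if elem_id in MASTERS:
            # Clamp to the available bytes: the dump is only a file prefix.
            rows += walk(data, body, min(body + size, end), depth + 1)
        pos = body + size
    return rows


rows = walk(DATA, 0, len(DATA))
for offset, depth, name, size in rows:
    print(f"{offset:>4}  {'  ' * depth}{name} (data size {size})")
```

Offsets are counted from 0, so the SeekHead line is printed with offset 59 and its first two children with offsets 65 and 71.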
To answer your specific question:
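Counting from 0, the SeekHead begins at byte 59 with its 4-byte ID (0x11 0x4d 0x9b 0x74). Its size field begins 4 bytes later at byte 63 and is 2 bytes long (0x40 0x42, which encodes a data size of 66 bytes). After that, a CRC-32 element begins at byte 65 and occupies 6 bytes: a 1-byte ID (0xbf), a 1-byte size (0x84, meaning 4) and 4 bytes of checksum. After that, at byte 71, the first Seek element is found.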
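The same arithmetic can be checked with a few lines of Python. This is only a sketch over the 12 bytes at offsets 59 to 70 from the question; `vint_len` and the other names are illustrative, not part of any library.

```python
# Bytes 59..70 of the file (0-based), copied from the hex dump in the question.
BASE = 59
chunk = bytes.fromhex("11 4d 9b 74 40 42 bf 84 11 ac 83 8a")


def vint_len(first_byte):
    """Width of a variable-length integer: leading zero bits of its first byte, plus one."""
    return 9 - first_byte.bit_length()


# SeekHead header: the ID, then the size field.
id_len = vint_len(chunk[0])
size_offset = BASE + id_len
size_len = vint_len(chunk[id_len])
raw_size = int.from_bytes(chunk[id_len:id_len + size_len], "big")
seekhead_size = raw_size & ((1 << (7 * size_len)) - 1)  # drop the length-marker bit

# CRC-32 child: ID, a one-byte size field, then the checksum itself.
crc_offset = size_offset + size_len
crc_rel = crc_offset - BASE
crc_id_len = vint_len(chunk[crc_rel])
crc_size = chunk[crc_rel + crc_id_len] & 0x7F  # size field is a single byte here
first_seek_offset = crc_offset + crc_id_len + 1 + crc_size

print(size_offset, seekhead_size, crc_offset, first_seek_offset)  # 63 66 65 71
```

The SeekHead data therefore spans bytes 65 through 130 (65 + 66 = 131), which is where the next element after the SeekHead starts.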
†I've just parsed this mentally by hand; hopefully I haven't made any errors or typos.