MKV seekhead parsing

I need to parse a Matroska file. The first few bytes of the file are given below.

0x1a 0x45 0xdf 0xa3 0x01 0x00 0x00 0x00 0x00 0x00 0x00 0x23 0x42 0x86 0x81 0x01
0x42 0xf7 0x81 0x01 0x42 0xf2 0x81 0x04 0x42 0xf3 0x81 0x08 0x42 0x82 0x88 0x6d
0x61 0x74 0x72 0x6f 0x73 0x6b 0x61 0x42 0x87 0x81 0x04 0x42 0x85 0x81 0x02 0x18
0x53 0x80 0x67 0x01 0x00 0x00 0x00 0x00 0x33 0xdb 0x10 0x11 0x4d 0x9b 0x74 0x40
0x42 0xbf 0x84 0x11 0xac 0x83 0x8a 0x4d 0xbb 0x8b 0x53 0xab 0x84 0x15 0x49 0xa9
0x66 0x53 0xac 0x81 0xe5 0x4d 0xbb 0x8c 0x53 0xab 0x84 0x16 0x54 0xae 0x6b 0x53
0xac 0x82 0x01 0x56 0x4d 0xbb 0x8c 0x53 0xab 0x84 0x12 0x54 0xc3 0x67 0x53 0xac
0x82 0x11 0x5c 0x4d 0xbb 0x8d 0x53 0xab 0x84 0x1c 0x53 0xbb 0x6b 0x53 0xac 0x83
0x33 0xd9 0x1c 0xec 0x01 0x00 0x00 0x00 0x00 0x00 0x00 0x94 0x00 0x00 0x00 0x00
0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00

I am trying to parse this file and have parsed the first 59 bytes successfully, so I am now at the 60th byte. From there the bytes are 0x11 0x4d 0x9b 0x74, which means the SeekHead is starting.

I used mkvinfo to view the parsed data. According to mkvinfo, as shown below, the SeekHead starts at 59, which is fine.

[Image: mkvinfo output for the SeekHead]

It looks like the first Seek entry starts at 71. What is stored between positions 59 and 71? This is the part I am not able to understand.

Can somebody please help me understand this part?

Answer by Cornstalks (best answer)

You should parse the bytes like this (refer to the Matroska specification for details):

0x11 0x4d 0x9b 0x74 (element ID: SeekHead)
0x40 0x42 (element size: 66)
0xbf (element ID: CRC-32)
0x84 (element size: 4)
0x11 0xac 0x83 0x8a (4-byte CRC-32 value)
0x4d 0xbb (element ID: Seek)
0x8b (element size: 11)
0x53 0xab (element ID: SeekID)
0x84 (element size: 4)
0x15 0x49 0xa9 0x66 (SeekID value; refers to Info element ID)
0x53 0xac (element ID: SeekPosition)
0x81 (element size: 1)
0xe5 (SeekPosition value: 229)
0x4d 0xbb (element ID: Seek)
0x8c (element size: 12)
0x53 0xab (element ID: SeekID)
0x84 (element size: 4)
0x16 0x54 0xae 0x6b (SeekID value; refers to Tracks element ID)
0x53 0xac (element ID: SeekPosition)
0x82 (element size: 2)
0x01 0x56 (SeekPosition value: 342)
0x4d 0xbb (element ID: Seek)
0x8c (element size: 12)
0x53 0xab (element ID: SeekID)
0x84 (element size: 4)
0x12 0x54 0xc3 0x67 (SeekID value; refers to Tags element ID)
0x53 0xac (element ID: SeekPosition)
0x82 (element size: 2)
0x11 0x5c (SeekPosition value: 4444)
0x4d 0xbb (element ID: Seek)
0x8d (element size: 13)
0x53 0xab (element ID: SeekID)
0x84 (element size: 4)
0x1c 0x53 0xbb 0x6b (SeekID value; refers to Cues element ID)
0x53 0xac (element ID: SeekPosition)
0x83 (element size: 3)
0x33 0xd9 0x1c (SeekPosition value: 3397916)
0xec (element ID: Void)
[I stopped parsing here]
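
Both the element IDs and the element sizes above are EBML variable-length integers: the number of leading zero bits in the first byte gives the length, IDs keep the length marker bit, and sizes have it masked off. A minimal Python sketch of that decoding follows. The helper names (`vint_length`, `read_element_id`, `read_element_size`) are invented for illustration, and the reserved "unknown size" encoding is not handled.

```python
def vint_length(first_byte):
    """Length in bytes of an EBML variable-length integer, from its first byte."""
    if first_byte == 0:
        raise ValueError("no length marker in the first byte")
    # One byte, plus one more per leading zero bit: 0x11 -> 4, 0x40 -> 2, 0x84 -> 1.
    return 9 - first_byte.bit_length()


def read_element_id(buf, pos):
    """Return (element ID, bytes consumed). The marker bit stays part of the ID."""
    n = vint_length(buf[pos])
    return int.from_bytes(buf[pos:pos + n], "big"), n


def read_element_size(buf, pos):
    """Return (element size, bytes consumed). The marker bit is masked off."""
    n = vint_length(buf[pos])
    raw = int.from_bytes(buf[pos:pos + n], "big")
    return raw & ((1 << (7 * n)) - 1), n


# The first six bytes of the SeekHead from the question.
header = bytes.fromhex("114d9b74 4042")
element_id, id_len = read_element_id(header, 0)
size, size_len = read_element_size(header, id_len)
print(hex(element_id), id_len, size, size_len)  # 0x114d9b74 4 66 2
```

The same two helpers decode every ID/size pair in the listing, for example `0x4d 0xbb 0x8b` as ID 0x4dbb with size 11.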

A simplified ASCII graphic of the structure of these specific bytes is as follows:

+- SeekHead -------+
| CRC-32           |
| +- Seek--------+ |
| | SeekID       | |
| | SeekPosition | |
| +--------------+ |
| +- Seek--------+ |
| | SeekID       | |
| | SeekPosition | |
| +--------------+ |
| +- Seek--------+ |
| | SeekID       | |
| | SeekPosition | |
| +--------------+ |
| +- Seek--------+ |
| | SeekID       | |
| | SeekPosition | |
| +--------------+ |
+------------------+
Void

I've drawn the master elements (elements which contain other elements) as boxes.
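
Because a master element is just a run of child elements placed back to back, the box diagram can be reproduced with a short recursive walk. The sketch below is only an illustration: `walk`, `read_vint`, `NAMES` and `MASTERS` are names invented here, the ID table covers just the elements that occur in these bytes, and error handling is omitted.

```python
# Element IDs that occur in these bytes; MASTERS lists the ones that contain children.
NAMES = {
    0x114D9B74: "SeekHead", 0xBF: "CRC-32", 0x4DBB: "Seek",
    0x53AB: "SeekID", 0x53AC: "SeekPosition", 0xEC: "Void",
}
MASTERS = {0x114D9B74, 0x4DBB}

# Bytes 59..130 of the file from the question (the whole SeekHead element).
SEEK_HEAD = bytes.fromhex(
    "114d9b74 4042 bf84 11ac838a "
    "4dbb8b 53ab84 1549a966 53ac81 e5 "
    "4dbb8c 53ab84 1654ae6b 53ac82 0156 "
    "4dbb8c 53ab84 1254c367 53ac82 115c "
    "4dbb8d 53ab84 1c53bb6b 53ac83 33d91c"
)


def read_vint(buf, pos, keep_marker):
    """Decode one variable-length integer; IDs keep the marker bit, sizes drop it."""
    n = 9 - buf[pos].bit_length()
    value = int.from_bytes(buf[pos:pos + n], "big")
    if not keep_marker:
        value &= (1 << (7 * n)) - 1
    return value, n


def walk(buf, pos, end, depth=0, out=None):
    """Collect (depth, name, size) for every element between pos and end."""
    out = [] if out is None else out
    while pos < end:
        element_id, id_len = read_vint(buf, pos, keep_marker=True)
        size, size_len = read_vint(buf, pos + id_len, keep_marker=False)
        data_start = pos + id_len + size_len
        out.append((depth, NAMES.get(element_id, hex(element_id)), size))
        if element_id in MASTERS:
            # Children live inside the payload of a master element.
            walk(buf, data_start, data_start + size, depth + 1, out)
        pos = data_start + size
    return out


for depth, name, size in walk(SEEK_HEAD, 0, len(SEEK_HEAD)):
    print("  " * depth + f"{name} (size {size})")
```

The indentation of the printed lines mirrors the nesting of the boxes: one SeekHead, a CRC-32 and four Seek elements inside it, and a SeekID plus SeekPosition inside each Seek.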

To answer your specific question:

"It looks like the first Seek entry starts at 71. What is stored between positions 59 and 71? This is the part I am not able to understand."

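The SeekHead begins at byte 59 with its 4-byte element ID. Its size field begins 4 bytes later, at byte 63, and is 2 bytes long. After that, a CRC-32 element begins at byte 65 and occupies 6 bytes (1-byte ID, 1-byte size, 4-byte value). After that, at byte 71, the first Seek element is found.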
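
These offsets fall out of simple addition once the length of each header field is known. A small sketch of that arithmetic, using the field lengths read off the bytes above (the variable names are illustrative only):

```python
seek_head_id_offset = 59                         # 0x11 0x4d 0x9b 0x74, a 4-byte element ID
seek_head_size_offset = seek_head_id_offset + 4  # 0x40 0x42, a 2-byte size field
crc32_offset = seek_head_size_offset + 2         # first child of the SeekHead
crc32_total_len = 1 + 1 + 4                      # ID 0xbf, size 0x84, 4-byte value
first_seek_offset = crc32_offset + crc32_total_len

# The SeekHead payload is 66 bytes long and starts where the CRC-32 element starts.
seek_head_end = crc32_offset + 66

print(seek_head_size_offset, crc32_offset, first_seek_offset, seek_head_end)
# 63 65 71 131
```

Byte 131 is the 0xec in the dump, so the Void element starts immediately after the SeekHead ends.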

I've just parsed this mentally by hand; hopefully I haven't made any errors or typos.
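
One way to double-check a hand parse like this is to confirm that the child sizes add up to the parent's declared size and that the offsets agree with each other. The sketch below does that for the numbers above. It goes one step past the parse shown here, so the following are assumptions: the Segment data is taken to start at byte 59 (Segment ID at byte 47, then a 4-byte ID and an 8-byte size field), the Void element's size field `0x01 0x00 0x00 0x00 0x00 0x00 0x00 0x94` is read as 148, and SeekPosition is taken to be relative to the first byte of the Segment data, which is how the Matroska specification describes Segment positions.

```python
# (name, header bytes, payload bytes) for each child of the SeekHead, from the hand parse.
children = [
    ("CRC-32", 2, 4),
    ("Seek (Info)", 3, 11),
    ("Seek (Tracks)", 3, 12),
    ("Seek (Tags)", 3, 12),
    ("Seek (Cues)", 3, 13),
]
payload_total = sum(header + payload for _, header, payload in children)
assert payload_total == 66  # matches the SeekHead size field 0x40 0x42

# SeekPosition payloads are big-endian unsigned integers.
seek_positions = {
    "Info": int.from_bytes(bytes.fromhex("e5"), "big"),
    "Tracks": int.from_bytes(bytes.fromhex("0156"), "big"),
    "Tags": int.from_bytes(bytes.fromhex("115c"), "big"),
    "Cues": int.from_bytes(bytes.fromhex("33d91c"), "big"),
}
assert seek_positions == {"Info": 229, "Tracks": 342, "Tags": 4444, "Cues": 3397916}

# Assumption: positions count from the first byte of the Segment data.
segment_data_start = 47 + 4 + 8
absolute = {name: segment_data_start + pos for name, pos in seek_positions.items()}

# Assumption: the Void element at byte 131 has a 1-byte ID, an 8-byte size field
# and 148 bytes of padding.
void_end = 131 + 1 + 8 + 148
assert void_end == absolute["Info"]  # the padding ends where Info should begin

print(absolute)
```

Under those assumptions the padding ends at byte 288, exactly where the first SeekPosition points, which suggests the parse is consistent.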