This might be too specific, and too long, for anyone to read and be able to help. But maybe there is someone out there who has done this before.
I am currently using the trusty, but not very accurate, libdvdread library to read ISO files/devices. The specific implementation is not so important in this case; the question is more about HOW to read UDF file systems. I have read both the Ecma-167 and udf260 PDF files, many times.
So, first, let's look at an ISO image from IMGBURN, which appears to work the same as an image copied from a retail Blu-ray.
The blocks are laid out as:
Block 32 TagID: 1 TAGID_PRI_VOL
Block 33 TagID: 4 TAGID_IMP_VOL
Block 34 TagID: 5 TAGID_PARTITION
Block 35 TagID: 6 TAGID_LOGVOL
Block 36 TagID: 7 TAGID_UNALLOC_SPACE
Block 37 TagID: 8 TAGID_TERM
Block 64 TagID: 9 TAGID_LOGVOL_INTEGRITY
Block 65 TagID: 8 TAGID_TERM
Block 256 TagID: 2 TAGID_ANCHOR
Block 288 TagID: 266 TAGID_EXTFENTRY
Block 320 TagID: 256 TAGID_FSD
Block 321 TagID: 8 TAGID_TERM
Block 322 TagID: 266 TAGID_EXTFENTRY
Block 323 TagID: 257 TAGID_FID
Block 324 TagID: 266 TAGID_EXTFENTRY
Block 325 TagID: 266 TAGID_EXTFENTRY
Block 326 TagID: 257 TAGID_FID
Block 327 TagID: 266 TAGID_EXTFENTRY
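For anyone wanting to reproduce a block listing like the one above, here is a minimal sketch of reading the 16-byte descriptor tag at the start of a block, based on the tag layout in ECMA-167. The function name parse_tag is made up for illustration and is not part of libdvdread; the sketch checks only the tag checksum, not the descriptor CRC.

```python
import struct

# Names as used in the listings in this post.
TAG_NAMES = {1: "TAGID_PRI_VOL", 2: "TAGID_ANCHOR", 256: "TAGID_FSD",
             257: "TAGID_FID", 266: "TAGID_EXTFENTRY"}

def parse_tag(block):
    """Return (tag_id, tag_location) if the descriptor tag looks valid, else None.

    parse_tag is a hypothetical helper, not a libdvdread function.
    """
    if len(block) < 16:
        return None
    # TagIdentifier, DescriptorVersion, TagChecksum, Reserved,
    # TagSerialNumber, DescriptorCRC, DescriptorCRCLength, TagLocation
    tag_id, _ver, checksum, _res, _serial, _crc, _crc_len, location = \
        struct.unpack_from("<HHBBHHHI", block, 0)
    # The checksum is the sum of bytes 0-3 and 5-15, modulo 256
    # (byte 4 holds the checksum itself and is skipped).
    expected = (sum(block[0:4]) + sum(block[5:16])) & 0xFF
    if tag_id == 0 or checksum != expected:
        return None
    return tag_id, location
```

The returned tag location is the block address the descriptor itself claims to live at, so comparing it with the address you computed is a cheap way to see which base an offset is relative to.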
So we start reading the ISO image:
* Read Block 256 which is 2 TAGID_ANCHOR
* Read Block 32 which is 1 TAGID_PRI_VOL
* Read Block 33 which is 4 TAGID_IMP_VOL
* Read Block 34 which is 5 TAGID_PARTITION
Partiton: number 0, start 288, length 2291200, AccessType 1
* Read Block 35 which is 6 TAGID_LOGVOL
LogVolume 2048:70:2
Volume 0 type 01 (len 6) Seq 0001 Part 0000
Volume 1 type 02:
Partition identifier: '*UDF Metadata Partition'
Metadata Partition MainLoc 00000000, MirrorLoc 0022F5A9, BitmapLoc FFFFFFFF, AllocSize 00000020, AlignSize 0020, Flags 1.
returning Start 288
Found partition at 288 length 2291200
Starting scan from 288 (metadata adjusted)
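The "Metadata Partition" line in the log comes from the type 2 partition map inside the Logical Volume Descriptor. Below is a sketch of decoding that 64-byte map, assuming the field layout I understand UDF 2.50/2.60 to use (the field widths agree with the widths printed in the log). The function and key names are made up for illustration.

```python
import struct

def parse_metadata_partition_map(pm):
    """Decode a type 2 "*UDF Metadata Partition" map; return a dict or None.

    Hypothetical helper; the layout is an assumption taken from the UDF 2.60 text.
    """
    if len(pm) < 64:
        return None
    (map_type, map_len, entity, vol_seq, part_num, main_loc, mirror_loc,
     bitmap_loc, alloc_unit, align_unit, flags) = struct.unpack_from(
        "<BB2x32sHHIIIIHB5x", pm, 0)
    # EntityID: 1 flag byte, 23 identifier bytes, 8 suffix bytes.
    ident = entity[1:24].rstrip(b"\x00").decode("ascii", "replace")
    if map_type != 2 or map_len != 64 or ident != "*UDF Metadata Partition":
        return None
    return {
        "vol_seq": vol_seq,
        "part_num": part_num,       # the physical partition this map sits on
        "main_loc": main_loc,       # block of the metadata file's file entry
        "mirror_loc": mirror_loc,   # same, for the mirror file
        "bitmap_loc": bitmap_loc,   # 0xFFFFFFFF when there is no bitmap file
        "alloc_unit": alloc_unit,
        "align_unit": align_unit,
        "flags": flags,
    }
```

The three locations are block numbers inside the physical partition, which is why adding partition.Start to MainLoc lands on the filetype 250 entry.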
So far so good. You can see I take the "metadata main file location", in this example 0, and add it to the partition start, to go look for the FSD. This appears to work even with examples where "metadata main file location" is not 0. More on that later.
Looking for the FSD now, the first item we find is:
* Read Block 288 which is 266 TAGID_EXTFENTRY
TagID 266 with filetype 250
Metadata Main at location 32 (+partition.start 320)
The UDF 2.60 spec defines filetype=250 as the "metadata main file" (Ecma-167 itself only leaves that range of file types open for agreement), and the AD.Location points to the metadata. It is also possible to find a filetype=251 (metadata mirror file).
I use the "metadata main file" location (here, 32) as an indirect pointer to where to "really" look for the FSD. For some reason, this is partition.Start + Location (288 + 32) = 320.
At block 320 we find the FSD. So maybe I am on the right track.
Now Scanning from 320
* Read Block 320 which is 256 TAGID_FSD
RootICB at 2 length 2048
MapICB starting at 320,2 -> 322
Great, we read the FSD, and it has RootICB at "+2". Now, I would have expected this to be "partition.Start + 2" (288+2), but this does not work. What does work is "FSD_Location + 2" (320+2). Can this really be the case?
In DVD ISOs, FSD_Location=0 (first block on the partition, as there is no ExtFileInfo with filetype 250 in the way), so using this logic still works.
Let's assume it is correct:
* Read Block 322 which is 266 TAGID_EXTFENTRY
libdvdread: reading AD chain 0
UDFMapICB TagID 266 ExtFile with filetype 4
Part.Start 288 FSD loc 320 RootICB 2 (len 2048) File has loc 3
So, the block at 322 is indeed an ExtFileInfo, of filetype==4 (directory) and lives at location = +3. Again, this appears to be "fsd_location + 3" = 323.
Found '/' at 323 (size 152).
* Read Block 323 which is 257 TAGID_FID
DVDReadDir(.)
DVDReadDir(BDMV)
etc
Success. It goes on to list all the contents.
Here is where I get confused. I use the OSX "newfs_udf" to create a UDF test image for myself:
# mkfile 1G roger.iso
# newfs_udf -eu -r 2.60 -v HIGHLANDER roger.iso
# hdiutil attach -imagekey diskimage-class=CRawDiskImage -nomount roger.iso
# hdiutil mount -nobrowse roger.iso
# mkdir /Volumes/HIGHLANDER/A.DIRECTORY.ENTRY
The blocks are:
Block 20 TagID: 1 TAGID_PRI_VOL
Block 21 TagID: 4 TAGID_IMP_VOL
Block 22 TagID: 5 TAGID_PARTITION
Block 23 TagID: 6 TAGID_LOGVOL
Block 24 TagID: 7 TAGID_UNALLOC_SPACE
Block 25 TagID: 8 TAGID_TERM
Block 36 TagID: 9 TAGID_LOGVOL_INTEGRITY
Block 37 TagID: 8 TAGID_TERM
Block 256 TagID: 2 TAGID_ANCHOR
Block 257 TagID: 264 TAGID_SPACE_BITMAP
Block 289 TagID: 266 TAGID_EXTFENTRY
Block 290 TagID: 266 TAGID_EXTFENTRY
Block 291 TagID: 256 TAGID_FSD
Block 292 TagID: 266 TAGID_EXTFENTRY
Block 293 TagID: 266 TAGID_EXTFENTRY
Block 294 TagID: 266 TAGID_EXTFENTRY
Block 295 TagID: 266 TAGID_EXTFENTRY
Block 296 TagID: 266 TAGID_EXTFENTRY
Block 323 TagID: 264 TAGID_SPACE_BITMAP
Block 449 TagID: 259 TAGID_INDIRECTENTRY
Reading this ISO also works well, at least initially;
* Read Block 256 which is 2 TAGID_ANCHOR
* Read Block 20 which is 1 TAGID_PRI_VOL
* Read Block 21 which is 4 TAGID_IMP_VOL
* Read Block 22 which is 5 TAGID_PARTITION
Partiton: number 0, start 257, length 523774, AccessType 4
* Read Block 23 which is 6 TAGID_LOGVOL
LogVolume 2048:70:2
Volume 0 type 01 (len 6) Seq 0001 Part 0000
Volume 1 type 02:
Partition identifier: '*UDF Metadata Partition'
Metadata Partition MainLoc 00000020, MirrorLoc 0007FDFD, BitmapLoc 00000021, AllocSize 00000020, AlignSize 0001, Flags 0.
returning Start 257
Found partition at 257 length 523774
Starting scan from 289 (metadata adjusted)
* Read Block 289 which is 266 TAGID_EXTFENTRY
TagID 266 with filetype 250
Metadata Main at location 34 (+partition.start 291)
* Read Block 291 which is 256 TAGID_FSD
RootICB at 1 length 2048
* Read Block 292 which is 266 TAGID_EXTFENTRY
UDFMapICB TagID 266 ExtFile with filetype 4
Part.Start 257 FSD loc 291 RootICB 1 (len 2048) File has loc 34
Notice the metadata main file location is +32 here (MainLoc 00000020); adding that to the partition start (257 + 32 = 289) correctly finds the ExtFileInfo with filetype=250. That entry then points to +34, which correctly gets us the FSD at 257 + 34 = 291!
The FSD has RootICB at +1 (block 292), and the ExtFileInfo there reports location 34. Counted from the FSD again, that is 291+34 = 325.
And there the trail is lost. I would guess it SHOULD be 292, but...
In the block list, you can see there are no FIDs at all. Looking at the hexdump of this ISO image, the "A.DIRECTORY.ENTRY" can be found in block 292, which is an ExtFileEntry. It is the very one that sent us off to location 34.
I thought an ExtFileInfo only contained one (1) file descriptor, with the ICB pointing to its data. And yet, inside this block at offset +380 or so, we have root (null), "A.DIRECTORY.ENTRY" and ".Trashes".
I guess my question is: does OSX pack the FIDs into the ExtFileEntry somehow, and go without separate FID blocks? Is this "valid"? How do I detect this situation? Is there something in this ExtFileInfo that indicates I should "not follow the Location to look for FIDs" and instead "keep parsing this block for more entries"?
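Wherever the directory bytes end up living, they are a back-to-back run of FIDs, so the same walker can be pointed at a separate FID block or at bytes found inside another descriptor. Here is a sketch assuming the FID layout from ECMA-167 (38 fixed bytes, then implementation use, then the name, padded to a multiple of 4). iter_fids and decode_name are made-up names, and the tag checksum is not verified here.

```python
import struct

def decode_name(raw):
    """Decode a UDF file identifier: first byte is the compression ID."""
    if not raw:
        return ""                      # the parent entry has no name
    if raw[0] == 16:
        return raw[1:].decode("utf-16-be", "replace")
    return raw[1:].decode("latin-1")   # compression ID 8: one byte per character

def iter_fids(buf):
    """Yield (name, is_dir, is_parent, icb_block) for each FID packed in buf."""
    pos = 0
    while pos + 38 <= len(buf):
        (tag_id,) = struct.unpack_from("<H", buf, pos)
        if tag_id != 257:              # TAGID_FID; anything else ends the run
            break
        characteristics, l_fi = struct.unpack_from("<BB", buf, pos + 18)
        # ICB is a long_ad: extent length, logical block, partition reference.
        _length, icb_block, _part_ref = struct.unpack_from("<IIH", buf, pos + 20)
        (l_iu,) = struct.unpack_from("<H", buf, pos + 36)
        start = pos + 38 + l_iu
        name = decode_name(buf[start:start + l_fi])
        yield (name, bool(characteristics & 0x02),
               bool(characteristics & 0x08), icb_block)
        # Each FID is padded so the next one starts on a 4-byte boundary.
        pos += (38 + l_iu + l_fi + 3) & ~3
```

The parent entry shows up with an empty name and the parent bit set, which would match the "root (null)" entry seen in the hexdump.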
When computing the ICB, I have to use "fsd_location + icb.location" for directories, but for files (to read the actual file data) I have to use "partition.Start + icb.location". This works as expected (directories list, and the files read back with no differences), but it does not seem correct.
If you read all that, you are awesome :) Now, if you could just give me some clues...
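OK, I feel I have got the hang of it all now, from both ECMA-167 and the BSD UDF sources. The missing magic was the allocation type in ICBTAG.Flags: when the low three bits are 3, instead of having a list of "AD" at the end of the block, it puts the actual "file contents" in the AD space. It is some sort of "space saver" style, used when the file data is small enough to fit in the rest of the block (less than 2048 bytes here).
It also cleared up the use of the FSD as the offset when the metadata partition is present. All directory content is contained in the metadata file (main and mirror), so those locations are relative to the start of that file, which is where the FSD sits in both images.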
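As a sketch of that check: the allocation type is the low three bits of the ICB tag flags, and when it is 3 the bytes after the extended attributes are the data itself rather than descriptors. The offsets below assume the File Entry (TagID 261) and Extended File Entry (TagID 266) layouts from ECMA-167; embedded_data is a made-up name, not a libdvdread call.

```python
import struct

ALLOC_SHORT, ALLOC_LONG, ALLOC_EXTENDED, ALLOC_EMBEDDED = 0, 1, 2, 3

def embedded_data(block):
    """Return the data stored inside a (Extended) File Entry, or None.

    None means the entry uses ordinary allocation descriptors, so the
    caller should follow them instead. Hypothetical helper.
    """
    (tag_id,) = struct.unpack_from("<H", block, 0)
    if tag_id == 261:
        fixed = 176        # size of the fixed part of a File Entry
    elif tag_id == 266:
        fixed = 216        # size of the fixed part of an Extended File Entry
    else:
        return None
    # The ICB tag starts at byte 16; its Flags field is the last 2 of its 20 bytes.
    (flags,) = struct.unpack_from("<H", block, 34)
    if flags & 0x07 != ALLOC_EMBEDDED:
        return None
    # L_EA and L_AD are the last two 32-bit fields of the fixed part.
    l_ea, l_ad = struct.unpack_from("<II", block, fixed - 8)
    start = fixed + l_ea
    return bytes(block[start:start + l_ad])
```

For a directory (filetype 4) the returned bytes are the FIDs themselves, so they can be parsed in place without following any Location.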
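Put as arithmetic, a block number inside the metadata partition is an offset into the metadata file, and the metadata file's own extents are blocks of the physical partition. A small sketch follows; the function name is made up, and the extent lengths in the example are placeholders, since the logs above do not show them.

```python
def metadata_block_to_sector(partition_start, extents, block):
    """Map a metadata-partition block number to an absolute block on the image.

    extents: the metadata file's extents in file order, each given as
    (start_block_in_partition, length_in_blocks). Hypothetical helper.
    """
    for start, length in extents:
        if block < length:
            return partition_start + start + block
        block -= length   # skip past this extent and try the next one
    raise ValueError("block lies beyond the end of the metadata file")

# IMGBURN image: partition at 288, metadata file data at +32
# (the length 512 is a placeholder, not a value from the logs).
imgburn = [(32, 512)]
print(metadata_block_to_sector(288, imgburn, 0))   # FSD
print(metadata_block_to_sector(288, imgburn, 2))   # RootICB

# newfs_udf image: partition at 257, metadata file data at +34.
osx = [(34, 512)]
print(metadata_block_to_sector(257, osx, 1))       # RootICB
```

With the numbers from the two images this gives 320 and 322 for IMGBURN and 292 for the OSX image, matching the blocks where the FSD and root directory entries were actually found. File data, which resolved against partition.Start in my tests, would skip the metadata file offset.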