I am using BrokenThorn's tutorial for OS development. My confusion is about this piece of code, which is responsible for reading the next cluster number of the file:
mov ax, WORD [cluster] ; identify current cluster from FAT
; is the cluster odd or even? Just divide it by 2 and test!
mov cx, ax ; copy current cluster
mov dx, ax ; copy current cluster
shr dx, 0x0001 ; divide by two
add cx, dx ; sum for (3/2)
mov bx, 0x0200 ; location of FAT in memory
add bx, cx ; index into FAT
mov dx, WORD [bx] ; read two bytes from FAT
test ax, 0x0001
jnz .ODD_CLUSTER
From my reading of online sources and threads, this is what I have found:
- The first cluster number in the root directory entry for the file is 2 bytes long. For FAT12, only the lower 12 bits of these 2 bytes are used.
- The FAT for FAT12 stores in the following format:
vwX uYZ
where XYZ is one cluster number and uvw is another. I have a question regarding this: which represents the lower-numbered FAT entry and which represents the higher?
However, looking at the code, I cannot understand how the above 2 facts (if assumed to be correct) are being used. Initially, ax has the 2 bytes from the root directory, and its lower 12 bits could be used directly. But that is not being done. Also, how is the vwX uYZ format being parsed here?
If someone could explain this in some detail and point out any mistakes I have made, it would be very helpful.
The starting cluster number is used as an index into the FAT. Since it is FAT12, every 2 clusters correspond to 3 bytes.
The whole 16 bits of ax are used. Since the higher 4 bits of ax will be 0 from the starting cluster number, that is equivalent to using only the lower 12 bits (unless there is a corrupted directory entry, which could make you index into nowhere).
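The indexing step can be modelled outside the boot sector. The following is a minimal sketch in C (not the tutorial's code), assuming the FAT starts at byte offset 0 of some buffer; the function name fat12_entry_offset is made up for illustration. It computes the same cluster + cluster/2 value that the add cx, dx instruction produces.

```c
#include <stdint.h>

/* Byte offset of the FAT12 entry for `cluster` within the FAT.
   Each entry is 1.5 bytes, so the offset is cluster * 3 / 2,
   computed here as cluster + cluster / 2 (the shr/add pair in the assembly). */
static uint16_t fat12_entry_offset(uint16_t cluster)
{
    return (uint16_t)(cluster + (cluster >> 1));
}
```

Under this formula, clusters 2 and 3 map to byte offsets 3 and 4, and cluster 4 starts the next 3-byte group at offset 6. In the assembly, 0x0200 (where the FAT was loaded) is added to this offset before the 2-byte read.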
The format is better written as vw Xu YZ. Recall that x86 is little-endian. When you read 2 bytes on x86 and they are stored as vw Xu, the actual number read is Xuvw. Mask to keep only the lower 12 bits and you get uvw. Similarly, when you read Xu YZ, the actual number read is YZXu. Shift right by 4 bits and you get YZX. Incidentally, this means that the actual format is likely to be vw Zu XY.
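To make the mask and shift concrete, here is a small C model of the whole lookup. It is only a sketch under assumptions: the FAT is an in-memory byte array, the name fat12_next_cluster is hypothetical, and the even/odd branches follow the description above, since the question's snippet stops at the jnz.

```c
#include <stddef.h>
#include <stdint.h>

/* Return the 12-bit FAT entry for `cluster` from an in-memory FAT12 table. */
static uint16_t fat12_next_cluster(const uint8_t *fat, uint16_t cluster)
{
    /* Entry starts at byte cluster * 3 / 2. */
    size_t offset = (size_t)cluster + (cluster >> 1);

    /* Little-endian 16-bit read, equivalent to mov dx, WORD [bx]. */
    uint16_t word = (uint16_t)(fat[offset] | (fat[offset + 1] << 8));

    if (cluster & 1)
        return (uint16_t)(word >> 4);   /* odd cluster: keep the upper 12 bits */
    return (uint16_t)(word & 0x0FFF);   /* even cluster: keep the lower 12 bits */
}
```

With the three bytes 12 34 56 (so v=1, w=2, X=3, u=4, Y=5, Z=6), entry 0 reads the word 0x3412 and masks it to 0x412 (uvw), while entry 1 reads 0x5634 and shifts it to 0x563 (YZX).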