I was going through the B-Tree topic in Introduction to Algorithms by Cormen et. al. And I was having a difficulty in implementing the disk-operations of the pseudocode in a real program. This might be the situation because few descriptions of the objects is not clear to me here. So could anyone guide me as, how to proceed.
The following are the excerpts from the text, presenting the situation of the development:
We model disk operations in our pseudocode as follows. Let
x
be a pointer to an object. If the object is currently in the computer's main memory, then we can refer to the fields of the object as usual:key[x]
, for example. If the object referred to byx
resides on disk, however, then we must perform the operationDisk-Read (x)
to read objectx
into main memory before we can refer to its fields. (We assume that ifx
is already in main memory, thenDisk-Read (x)
requires no disk accesses; it is a "no-op.") Similarly, the operationDisk-Write(x)
is used to save any changes that have been made to the fields of objectx
. The typical pattern for working with an object is as follows:x <- a pointer to some object Disk-Read (x) operations that access and/or modify the fields of x Disk-Write(x) // Omitted if no fields of x were changed. other operations that access but do not modify fields of x
So it is quite obvious that if x
is supposed to store an object which on the disk, we cannot refer to it unless we bring it into main-memory, but how can we have x
pointing to an object in the disk in the first place, this is something, I am having problem to implement.
I could not understand how to implement these disk operations in an actual program.
Creating an empty B-tree
To build a B-tree
T
, we first useB-Tree-Create
to create an empty root node. The procedure use an auxiliary procedureAllocate-Node
, which allocates one disk page to be used as a new node inO(1)
time. (I know to allocate on heap, but here are they talking about allocating directly on disk? Moreover ifx
holds reference to an object allocated on disk then without bringing it into main-memory we cannot possibly work its attributes as per the previous excerpt). We can assume that a node created byAllocate-Node
requires noDisk-Read
, since there is as yet no useful information stored on the disk for that node.(If we do not read it, how can we set the attributes ofx
?)В-Tree-Create (T): 1 x <- Allocate-Node() 2 leaf[x] <- true 3 n[x] <- 0 4 Disk-Write (x) 5 root[T] <- x
I know how to allocate on heap and say use fwrite()
in C programming language
to write it onto the disk, but how to incorporate the linking in the disk? Should be use ftell()
to get the start of an object in file and accordingly make the linking?
I do not quite get how to elegantly write the objects to the disk, maintaining the linked structure. (The text Data Structures Using C and C++ By Aaron M. Tenenbaum et. al. only deals with the topic pictorially without hardcore implementation. And I am yet to take up a formal DBMS course)
Please guide me,if possible. Thank you..
[Note I went through this question as well as this, but the answers suggested contains huge codes without any documentation or vivid comments, googling the stuff yields B-tree maintained in main-memory(which is not what they are designed for). Could anyone just pin point to me the thing in a simpler programming language like C
, so that I can understand the sketch of the process without going into the details of hundreds of lines of codes, with far more complicacies like acquiring lock, then releasing lock,defragment,etc.] [Moreover there is a video lecture on the topic here, but alas without the details of the implementation]