Elegantly write the objects of B-tree onto the disk, maintaining the linked structure, in a simple programming language

159 views Asked by At

I was going through the B-Tree topic in Introduction to Algorithms by Cormen et. al. And I was having a difficulty in implementing the disk-operations of the pseudocode in a real program. This might be the situation because few descriptions of the objects is not clear to me here. So could anyone guide me as, how to proceed.

The following are the excerpts from the text, presenting the situation of the development:

We model disk operations in our pseudocode as follows. Let x be a pointer to an object. If the object is currently in the computer's main memory, then we can refer to the fields of the object as usual: key[x], for example. If the object referred to by x resides on disk, however, then we must perform the operation Disk-Read (x) to read object x into main memory before we can refer to its fields. (We assume that if x is already in main memory, then Disk-Read (x) requires no disk accesses; it is a "no-op.") Similarly, the operation Disk-Write(x) is used to save any changes that have been made to the fields of object x. The typical pattern for working with an object is as follows:

    x <- a pointer to some object 
    Disk-Read (x) 
    operations that access and/or modify the fields of x 
    Disk-Write(x) // Omitted if no fields of x were changed. 
    other operations that access but do not modify fields of x

So it is quite obvious that if x is supposed to store an object which on the disk, we cannot refer to it unless we bring it into main-memory, but how can we have x pointing to an object in the disk in the first place, this is something, I am having problem to implement.

I could not understand how to implement these disk operations in an actual program.

Creating an empty B-tree

To build a B-tree T, we first use B-Tree-Create to create an empty root node. The procedure use an auxiliary procedure Allocate-Node, which allocates one disk page to be used as a new node in O(1) time. (I know to allocate on heap, but here are they talking about allocating directly on disk? Moreover if x holds reference to an object allocated on disk then without bringing it into main-memory we cannot possibly work its attributes as per the previous excerpt). We can assume that a node created by Allocate-Node requires no Disk-Read, since there is as yet no useful information stored on the disk for that node.(If we do not read it, how can we set the attributes of x?)

  В-Tree-Create (T): 
  1 x <- Allocate-Node() 
  2 leaf[x] <- true 
  3 n[x] <- 0 
  4 Disk-Write (x) 
  5 root[T] <- x

I know how to allocate on heap and say use fwrite() in C programming language to write it onto the disk, but how to incorporate the linking in the disk? Should be use ftell() to get the start of an object in file and accordingly make the linking?

I do not quite get how to elegantly write the objects to the disk, maintaining the linked structure. (The text Data Structures Using C and C++ By Aaron M. Tenenbaum et. al. only deals with the topic pictorially without hardcore implementation. And I am yet to take up a formal DBMS course)

Please guide me,if possible. Thank you..

[Note I went through this question as well as this, but the answers suggested contains huge codes without any documentation or vivid comments, googling the stuff yields B-tree maintained in main-memory(which is not what they are designed for). Could anyone just pin point to me the thing in a simpler programming language like C, so that I can understand the sketch of the process without going into the details of hundreds of lines of codes, with far more complicacies like acquiring lock, then releasing lock,defragment,etc.] [Moreover there is a video lecture on the topic here, but alas without the details of the implementation]

0

There are 0 answers