How do I optimize Heap sort being used in sorting Class objects for Burrows Wheeler Transform?

110 views Asked by At

I am trying to implement the burrows wheeler`s transform. I have a Dictionary class which contains index (int) and data (string). The Dictionary class is being used to build the suffix array. I am sorting the suffix array of Dictionary objects through heap sort. When I have short text < 10KiB, the BWT works fine with heap sort, but when I feed larger text files like > 100KiB the programs breaks. I have a feeling that I am doing something wrong in the heap sort implementation. Here is my code of heap sort implemented in Dictionary class which contains two data members (int index and string data) :

void Dictionary::maxHeapify(Dictionary *obj, int size, int i){
    int largest = i;
    int left = 2 * i + 1;
    int right = 2 * i + 2;
    Dictionary tmp;

    while (left < size && strCmp(obj[left].data, obj[largest].data) > 0){
        largest = left;
    }

    while (right < size && strCmp(obj[right].data, obj[largest].data) > 0){
        largest = right;
    }

    if (largest != i){
        // Swap
        tmp = obj[i];
        obj[i] = obj[largest];
        obj[largest] = tmp;
        maxHeapify(obj, size, largest);
    }
}

void Dictionary::heapSort(Dictionary *obj, int size){
    Dictionary tmp;
    for (int i = size/2-1; i >= 0; i--){
        maxHeapify(obj, size, i);
    }

    for (int i = size-1; i > 0; i--){
        // Swap
        tmp = obj[0];
        obj[0] = obj[i];
        obj[i] = tmp;
        maxHeapify(obj, i, 0);
    }
}

Node: I will provide the BWT class code if required.

Edit: Here is the BWT class code:

class BWT {
private:

    string input;
    int size;

public:

    BWT(string input, int size){
        this->input = input;
        this->size = size;
    }

    string bwt(){
        Dictionary *dict = new Dictionary[size];
        Dictionary *a;

        for (int i = 0; i < size; i++){
            dict[i].index = i;
            for (int j = i; j <= size; j++){
                dict[i].data += input[j];
            }
        }

        a.heapSort(dict, size);

        string bwt;
        for (int i = 0; i < size; i++){
            int x = dict[i].index - 1;
            if (x < 0){
                x += size;
            }
            bwt += input[x];
        }
        bwt += '\0';
        return bwt;
    }
};
0

There are 0 answers