I try my best to look up solutions to problems before asking on here, but I've hit a bit of a brick wall. While there is one for Java, it didn't help me enough to understand where I went wrong with my implementation. So without further ado, here's some background, and then the problem.
Background/Reason for attempting this
To keep things short, the reason I'm making this isn't just for learning purposes, and no this not a homework assignment as my Data Structures and Algorithms class is a year from now, it's for practical general use, without worrying too much on optimization. So, hence if it seems a bit inefficient, please feel free to point it out and let me know how I can improve it, but 100% optimization isn't my true goal. I've also created a simple logging tool to help me keep track of all the elements in the linked list to help debug the algorithm, and even though it does what I intended, I still can't figure out how to fix it.
Problem
It seems that some runs work perfectly, I can see that it correctly divides and conquers the list, yet other times, at random places I see repeating places. For instance, here is a good run from the Linked List. Then if I run it again, I either get similar good results or bad ones like this
As can be seen from the bad, I seem to have messed up somewhere, and one of the nodes is pointing to a node that, directly or indirectly, points to itself.
What I've Tried.
I've attempted to create a smaller data structure to keep track of the head and tail nodes for each split list, aptly sub_list_t.
typedef struct {
Node *head;
Node *tail;
size_t size;
} sub_list_t;
The reasoning behind this is because, I figured it would make it much easier to interact with, and I even have it's own individual separate functions; also the original Linked List is rather large to be needlessly creating 3 per sort (List_One, List_Two, and Final_List). The original list contains rwlocks to attempt to make them concurrent, so the overhead of creating 3n short-lived locks wouldn't make much sense. Hence, a mini/sub list is easier and better, IMO.
The function that splits and merges each sub_list into a final list, sort_lists, is implemented here.
static sub_list_t *sort_list(sub_list_t *list, Linked_List_Compare compare){
if(list->size == 1) {
return list;
}
size_t mid = list->size / 2;
sub_list_t *list_one = sub_list_of(list, 0, mid-1);
list_one = sort_list(list_one, compare);
sub_list_t *list_two = sub_list_of(list, mid, list->size - 1);
list_two = sort_list(list_two, compare);
sub_list_t *final_list = merge_lists(list_one, list_two, compare);
free(list_one);
free(list_two);
return final_list;
}
The function that splits each sub_list down into more sub_lists, is implemented here...
static sub_list_t *sub_list_of(sub_list_t *list, unsigned int begin, unsigned int end){
sub_list_t *sub_list = malloc(sizeof(sub_list_t));
int i = 0;
size_t size = 0;
Node *node = list->head;
while(i < begin) {
node = node->next;
i++;
}
sub_list->head = node;
while(i++ <= end){
node = node->next;
size++;
}
sub_list->tail = node;
sub_list->size = size;
return sub_list;
}
I've a feeling the problem is above. I had to change it multiple times, since I'm unsure. It should accurately get the head, but the tail, I'm not certain, although it looks good enough.
static void append_to_list(sub_list_t *list, Node *node){
if(!list->size){ // If list->size == 0
list->tail = list->head = node;
list->size++;
return;
}
// The problem must be here then? This is the only part of code that modifies what the nodes point to.
list->tail->next = node;
node->prev = list->tail;
list->tail = node;
list->size++;
}
Simple constructor for sub_list is here, only used at beginning when Linked_List creates the sub_list or to create an empty final_list in merge_list, implemented below.
static sub_list_t *sub_list_create(Node *head, Node *tail, size_t size){
sub_list_t *list = malloc(sizeof(sub_list_t));
list->head = head;
list->tail = tail;
list->size = size;
return list;
}
Function to append the rest of a list to the final list, which also could be culprit, although I can't find the fault in it.
static void append_list_to_list(sub_list_t *list_src, sub_list_t *list_dst){
Node *node = NULL;
int i = 0;
size_t old_size = list_dst->size;
for(node = list_src->head; i++ < list_src->size; node = node->next) append_to_list(list_dst, node);
}
Finally, the merge_list, which handles merging and comparisons, which seems to be where the problems originates, as it makes all calls to append_to_list and append_list_to_list...
static sub_list_t *merge_lists(sub_list_t *list_one, sub_list_t *list_two, Linked_List_Compare compare){
sub_list_t *final_list = sub_list_create(NULL, NULL, 0);
while(list_one->size > 0 && list_two->size > 0){
if(compare(list_one->head->item, list_two->head->item) <= 0){
append_to_list(final_list, list_one->head);
list_one->head = list_one->head->next;
list_one->size--;
} else {
append_to_list(final_list, list_two->head);
list_two->head = list_two->head->next;
list_two->size--;
}
}
if(list_one->size > 0) {
append_list_to_list(list_one, final_list);
}
if(list_two->size > 0) {
append_list_to_list(list_two, final_list);
}
return final_list;
}
If I'm doing something wrong, let me know. Also, if this is considered a bad way to ask a question, also please let me know. I'll work on a runnable example in ideone right away.
Edit: Here. Mostly replicated results, infinite looping, nodes pointing indirectly to itself, etc.
Edit2: I managed to implement an insertion sort rather easily, by just swapping out the data held by the nodes rather than where the nodes point to... within 2 minutes... yet I've been trying to do the merge sort for about 2 days to no avail. Yet I still don't know how it precisely even worked.
A bottom up merge sort is more appropriate for sorting a linked list. An efficient method for doing this is to use an array of pointers to the first nodes of lists, where the number of nodes pointed to by array[i] is 2^i (2 to the power i), or array[i] is an empty list.
The double linked list can be treated as a single linked list during the merge sort and then the previous links fixed once the list is sorted.
Nodes are removed from the original list one at a time and merged into the array, then the lists in the array are merged to form a single sorted list. Since it seems you need this code soon, here is example C code: