using qsort to sort two arrays simultaneously?

3.6k views Asked by At

I can sort a array of pointers to words so that they are ordered alphabetically, the problem is that I need to ALSO sort an integer array (the number of times that specific word is used) so that the integers are in the same place as their respective words:

my code:

for (i = 0; i < numWords; i++) {
    // prints out the words and their frequency respectively
    printf("%s - %d\n", dictionary[i], frequency[i]); 
}

//sorts the dictionary so that the words are 'alphabetical'
qsort(dictionary, numWords, sizeof(char *), rstrcmp);  
printf("\nafter qsort\n");  //checkmark

for (i = 0; i < numWords; i++) {
    // prints the word list alphabetically, but the frequencies are no longer matched
    printf("%s - %d\n", dictionary[i], frequency[i]); 
}

...comparison function V

int rstrcmp(const void *p1, const void *p2) {
    return strcmp(*(char * const *)p1, *(char * const *)p2);
}
3

There are 3 answers

0
Turix On BEST ANSWER

A simple thing to do would be to use a struct to store word/frequency pairs and then sort an array of these structs.

For example:

struct WordFrequency
{
    const char * word;
    int frequency;
} wordFreqs[numWords];        // Assumes numWords is static/global and constant...

Then:

for (i = 0; i < numWords; i++) {
    printf("%s - %d\n", dictionary[i], frequency[i]);
    wordFreqs[i].word = dictionary[i];
    wordFreqs[i].frequency = frequency[i];
}

//sorts the dictionary so that the words are 'alphabetical'
qsort(wordFreqs, numWords, sizeof(struct WordFrequency), wfcmp);  

for (i = 0; i < numWords; i++) {
    printf("%s - %d\n", wordFreqs[i].word, wordFreqs[i].frequency); 
}

And:

int wfcmp(const void *p1, const void *p2) {
     return strcmp(((const struct WordFrequency *)p1)->word, ((const struct WordFrequency *)p2)->word);
}
5
Jonathan Leffler On

The standard qsort() function cannot do as you wish directly. All else apart, how does it know (or how do you tell it) which two arrays to sort in parallel?

You either have to change the data structure (use an array of a structure type), or you have to write your own sort function. Of the two, changing the data structure is probably the easier.

There is another option — but a somewhat contorted one. You could create an array of int with entries such that:

for (int i = 0; i < N; i++)
    index[i] = i;

You then pass this array to the sort function, along with a comparator that knows the base addresses of the two arrays. The qsort() function permutes the data in the array; the comparator looks at the data in the other arrays. The other two arrays have to be global (at least file scope) variables, or you need global variables that are pointers that can be initialized with the the base addresses of the two arrays.

After the sort, you can use array1[index[i]] and array2[index[i]] to access the ith element of the sorted arrays.

One other option if you're on BSD: you could use the qsort_r() function:

 void qsort_r(void *base, size_t nel, size_t width, void *thunk,
              int (*compar)(void *, const void *, const void *));

The 'thunk' is a pointer that's passed to the comparator as the first argument. You could use this with the index-array scheme to pass the pointers to the two arrays into the comparator, so you wouldn't need file scope variables at all. You still can't do two independent swaps, though, so you'd have to use the index-array scheme.

2
hobbs On

One approach that you might find useful for sorting parallel arrays: create an array of integers (size_ts to be strictly correct) and initialize it with the values 0 through numWords-1. Then qsort that array using a comparison function that does strcmp(dictionary[*(int *)p1], dictionary[*(int *)p2], then use the sorted array of indices to permute both dictionary and frequency at the same time (this is very easily done by copying, or a little less easily done in-place with swaps: here is an example of the latter).

Turix probably has the better solution though — using an array of structs instead of two arrays avoids the whole problem.