Freeing array of dynamic strings / lines in C

108 views Asked by At

I am writing a program that is sorting the lines from the input text file. It does its job, however I get memory leaks using valgrind.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char* getline(FILE * infile)
{
    int size = 1024;
    char * line = (char*)malloc(size);
    int temp;

    int i=0;
    do
    {
        (temp = fgetc(infile));
        if (temp !=EOF)
            line[i++]=(char)temp;
        if (i>=size)    
        {
            size*=2;
            line = (char*)realloc(line, size);
        }

    }while (temp != '\n' && temp !=EOF);
    if (temp==EOF)
        return NULL;
    return line;
}

void print_line (char * line)
{
    printf("%s", line);
}

int myCompare (const void * a, const void * b )
{
    const char *pa = *(const char**)a;
    const char *pb = *(const char**)b;

    return strcmp(pa,pb);
}

int main (int argc, char* argv[])
{
    FILE * infile;
    FILE * outfile;

    infile = fopen(argv[1], "r");
    if (infile==NULL)
    {
        printf("Error");
        exit(3);
    }
    outfile = fopen(argv[2], "w");
    if (outfile==NULL)
    {
        printf("Error");
        exit(3);
    }
    char * line;
    char **all_lines;
    int nlines=0;

    while((line=getline(infile))!=NULL)
    {
        print_line(line);
        nlines++;
    }

    all_lines=malloc(nlines*sizeof(char*));
    rewind(infile);
    int j=0;
    printf("%d\n\n", nlines);

    while((line=getline(infile))!=NULL)
    {

        all_lines[j]=line;
        j++;
    }

    qsort(all_lines, nlines, sizeof(char*), myCompare);

    for (int i=0; i<nlines;i++)
    {
        print_line(all_lines[i]);
        fprintf(outfile, "%s", all_lines[i]);
    }

    for(int i =0; i<nlines;i++)
    {
        free(all_lines[i]);
    }

    free(all_lines);
    fclose(infile);
    fclose(outfile);
    return 0;   
}

Any ideas where they might come from? I loop over all_lines[] and free the content, then free all_lines itself.

UPDATE:

Ok, so I've done the updates you have suggested. However, now valgrind throws error for functions fprintf in my program. Here is what it says:

11 errors in context 2 of 2:

==3646== Conditional jump or move depends on uninitialised value(s)
==3646==    at 0x40BA4B1: vfprintf (vfprintf.c:1601)
==3646==    by 0x40C0F7F: printf (printf.c:35)
==3646==    by 0x80487B8: print_line (sort_lines.c:44)
==3646==    by 0x804887D: main (sort_lines.c:77)
==3646==  Uninitialised value was created by a heap allocation
==3646==    at 0x4024D12: realloc (vg_replace_malloc.c:476)
==3646==    by 0x8048766: getline (sort_lines.c:30)
==3646==    by 0x804889A: main (sort_lines.c:75)

I would like to know why it reports error on simply fprintf those lines to a text file. I've looked up that it is something concerning gcc optimalization turning fprint into fputs but I dont get this idea

1

There are 1 answers

2
chqrlie On BEST ANSWER

There are multiple problems in your code:

function getline:

  • the string in the line buffer is not properly '\0' terminated at the end of the do / while loop.
  • It does not free the line buffer upon end of file, hence memory leak.
  • It does not return a partial line at end of file if the file does not end with a '\n'.
  • neither malloc not realloc return values are checked for memory allocation failure.
  • You should realloc the line to the actual size used to reduce the amount of memory consumed. Currently, you allocate at least 1024 bytes per line. Sorting the dictionary will require 100 times more memory than needed!

function MyCompare:

The lines read include the '\n' at the end. Comparison may yield unexpected results: "Hello\tworld\n" will come before "Hello\n". You should strip the '\n' in getline and modify the printf formats appropriately.

function main:

  • You do not check if command line arguments are actually provided before trying to open them with fopen, invoking undefined behaviour
  • The first loop that counts the number of lines does not free the lines returned by getline... big memory leak!
  • The input stream cannot always be rewinded. If your program is given its input via a pipe, rewind will fail except for very small input sizes.
  • The number of lines read in the second loop may be different from the one counted in the first loop if the file was modified asynchronously by another process. It the file grew, you invoke undefined behaviour when you load it, if it shrank, you invoke undefined behaviour when you sort the array. You should reallocate the all_lines array as you read the lines in a single loop.

Printing the lines before sort is not very useful and complicates testing.