The language I am working in is C.
I am trying to use a mix of built-in C string functions to take a list of space-separated tokens and "convert" it into a list of tokens in which each quoted run is merged into a single token.
A string like
echo "Hello 1 2 3 4" test test2
gets converted to
[echo] ["Hello] [1] [2] [3] [4"] [test] [test2]
I then use my code (at bottom) to attempt to convert it into something like
[echo] [Hello 1 2 3 4] [test] [test2]
For some reason the second 'token' in the quoted statement gets overridden. Here's a snippet of the code that runs over the token list and converts it to the new one.
    for (int i = 0; i < counter; i++) {
        if ( (strstr(tokenized[i],"\"") != NULL) && (inQuotes == 0)) {
            inQuotes = 1;
            tokenizedQuoted[quoteCounter] = tokenized[i];
            strcat(tokenizedQuoted[quoteCounter]," ");
        } else if ( (strstr(tokenized[i],"\"") != NULL) && (inQuotes == 1)) {
            inQuotes = 0;
            strcat(tokenizedQuoted[quoteCounter],tokenized[i]);
            quoteCounter++;
        } else {
            if (inQuotes == 0) {
                tokenizedQuoted[quoteCounter] = tokenized[i];
                quoteCounter++;
            } else if (inQuotes == 1) {
                strcat(tokenizedQuoted[quoteCounter], tokenized[i]);
                strcat(tokenizedQuoted[quoteCounter], " ");
            }
        }
    }
In short, appending a space to a char * means that the memory it points to needs more bytes. Since you do not provide them, you are overwriting the first byte of the following "word" with '\0', so the char * to that word is interpreted as the empty string. Note that writing to a location that has not been reserved is undefined behavior, so really ANYTHING could happen (from a segmentation fault to "correct" results with no errors).

Use malloc to create a new buffer for the expanded result with enough bytes for it (do not forget to free the old buffers if they were malloc'd).
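
A minimal sketch of that approach, under a few assumptions: the helper name merge_quoted and its signature are hypothetical (they are not from the question), every output token is a freshly malloc'd copy so the original token buffer is never written to, and the quote characters are dropped to match the desired output shown in the question. An unterminated quote simply runs to the last token.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from the question): merges tokens[0..count-1]
 * into out[], joining every run of tokens enclosed in double quotes with
 * single spaces and dropping the quote characters themselves.
 * out[] must have room for at least `count` pointers.
 * Each out[] entry is a new malloc'd string that the caller must free.
 * Returns the number of merged tokens, or -1 if an allocation fails. */
static int merge_quoted(char **tokens, int count, char **out)
{
    assert(tokens != NULL && out != NULL);

    int n = 0;                      /* number of merged tokens written */
    int i = 0;                      /* first input token of the current group */

    while (i < count) {
        int in_quotes = 0;
        int j = i;                  /* last input token of the current group */
        size_t len = 0;             /* upper bound on bytes needed */

        /* Extend the group until the quotes are balanced or input ends. */
        for (;;) {
            for (const char *p = tokens[j]; *p != '\0'; p++) {
                if (*p == '"') {
                    in_quotes = !in_quotes;
                }
            }
            /* +1 covers the joining space, or the final '\0'. */
            len += strlen(tokens[j]) + 1;
            if (!in_quotes || j == count - 1) {
                break;
            }
            j++;
        }

        char *buf = malloc(len);
        if (buf == NULL) {
            /* Release what was already allocated before reporting failure. */
            for (int k = 0; k < n; k++) {
                free(out[k]);
            }
            return -1;
        }

        /* Copy the group into the new buffer, skipping quote characters. */
        char *w = buf;
        for (int k = i; k <= j; k++) {
            if (k > i) {
                *w++ = ' ';
            }
            for (const char *p = tokens[k]; *p != '\0'; p++) {
                if (*p != '"') {
                    *w++ = *p;
                }
            }
        }
        *w = '\0';

        out[n++] = buf;
        i = j + 1;
    }
    return n;
}
```

Called on the eight tokens from the question with an out array of eight pointers, this should produce the four tokens echo, Hello 1 2 3 4, test and test2. Because the length is computed before anything is copied, no strcat into an undersized buffer is needed, and the caller frees each out[] entry when done.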