How to use strtok to tokenize an expression in C++


I need to tokenize a mathematical expression using strtok. My code splits the input, but the delimiters do not end up in my vector: when I run it, the output is 2x 4y 6 3. How can I keep the delimiters in the vector so that the output is 2x + 4y ^ 6 - 3? My code:

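#include <cstring>
#include <iostream>
#include <string>
#include <vector>
using namespace std;

int main()
{
    vector<string> finalVector;
    char input[1024] = "2x+4y^6-3";
    // "\t" is the tab character; the original "/t" added '/' and 't' as delimiters.
    const char *delims = "^+-/()\t";
    char *token = strtok(input, delims);
    while (token != NULL) {
        finalVector.push_back(token);   // only the operands are stored
        token = strtok(NULL, delims);
    }
    for (size_t i = 0; i < finalVector.size(); i++)
        cout << finalVector.at(i) << " ";
    return 0;
}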

There are 2 answers

David Hammen (best answer)

strtok replaces the found delimiter with the null character. The delimiter is irretrievably gone.

If you make a copy of your string before the first call to strtok, you can recover the delimiter:

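#include <cstdio>
#include <cstdlib>
#include <cstring>

int main()
{
  char input[1024] = "2x+4y^6-3";
  // strtok overwrites delimiters, so tokenize a copy and leave input intact.
  char* to_strtok = strdup(input);   // strdup is POSIX, not standard C++
  const char* delims = "^+-/()\t";   // "\t" (tab), not "/t"
  char* token;
  for (token = strtok(to_strtok, delims);
       token != 0;
       token = strtok(0, delims))
  {
    // The same offset in the untouched original holds the delimiter
    // that ended this token.
    char delim = input[token - to_strtok + strlen(token)];
    if (delim != '\0')
    {
      printf("token=\"%s\" delim='%c'\n", token, delim);
    }
    else
    {
      printf("last token=\"%s\"\n", token);
    }
  }
  free(to_strtok);                   // strdup allocates with malloc
  return 0;
}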

Nir Friedman

I know your question asks how to do this with strtok, but my sense is that strtok will eventually cause you pain. I think you should at least consider using the Boost tokenizer, which supports this. In fact, Boost's char_separator supports a combination of dropped and kept delimiters; kept delimiters are returned as their own tokens:

// char_sep_example_2.cpp
#include <iostream>
#include <boost/tokenizer.hpp>
#include <string>

int main()
{
  std::string str = ";;Hello|world||-foo--bar;yow;baz|";
  typedef boost::tokenizer<boost::char_separator<char> > 
      tokenizer;
  boost::char_separator<char> sep("-;", "|", boost::keep_empty_tokens);
  tokenizer tokens(str, sep);
  for (tokenizer::iterator tok_iter = tokens.begin();
      tok_iter != tokens.end(); ++tok_iter)
      std::cout << "<" << *tok_iter << "> ";

  std::cout << "\n";
  return 0;
}

The output is:
<> <> <Hello> <|> <world> <|> <> <|> <> <foo> <> <bar> <yow> <baz> <|> <>

This does what you want pretty easily. My guess is that this will save you a lot of time. Reference: http://www.boost.org/doc/libs/1_58_0/libs/tokenizer/char_separator.htm