I'm using boost::tokenizer to tokenize a string in C++, then I want to pass it to execv.
Consider the following code snippet (compilable):
#include <iostream>
#include <cstdlib>
#include <string>
#include <vector>
#include <unistd.h>   // for execv() and _exit()
#include <boost/tokenizer.hpp>
// I will put every token into this vector
std::vector<const char*> argc;
// this is the command I want to parse
std::string command = "/bin/ls -la -R";
void test_tokenizer() {
    // tokenizer is needed because arguments can be in quotes
    boost::tokenizer<boost::escaped_list_separator<char> > scriptArguments(
        command,
        boost::escaped_list_separator<char>("\\", " ", "\""));
    boost::tokenizer<boost::escaped_list_separator<char> >::iterator argument;
    for(argument = scriptArguments.begin();
        argument != scriptArguments.end();
        ++argument) {
        argc.push_back(argument->c_str());
        std::cout << argument->c_str() << std::endl;
    }
    argc.push_back(NULL);
}
void test_raw() {
    argc.push_back("/bin/ls");
    argc.push_back("-l");
    argc.push_back("-R");
    argc.push_back(NULL);
}
int main() {
    // this works OK
    /*test_raw();
    execv(argc[0], (char* const*)&argc[0]);
    std::cerr << "execv failed";
    _exit(1);
    */
    // this is not working
    test_tokenizer();
    execv(argc[0], (char* const*)&argc[0]);
    std::cerr << "execv failed";
    _exit(2);
}
When I run this program with test_tokenizer(), it prints 'execv failed' (although it prints the arguments nicely first).
However, if I change test_tokenizer() to test_raw(), it runs fine.
There must be an easy solution, but I haven't found it.
PS: I also dropped the code into an online compiler with boost support here.
boost::tokenizer saves each token by value (and by default as a std::string) in the token iterator. Therefore the character array that argument->c_str() points to may be modified or invalidated whenever the iterator is modified, and its lifetime ends with that of argument at the latest. Consequently your program has undefined behavior when you try to use argc.

If you want to keep using boost::tokenizer, I would suggest keeping the tokens in a std::vector<std::string> and transforming them into a pointer array afterwards.