To solve my problem here, I want to know if/how I can define the second variable of the command line arguments in a format other than char** argv or char* argv[]. The reason is that pybind11 doesn't allow either of those in the inputs of a function. Here are the methods I have tried:
Method 1:
#include <stdio.h>
int main(int argc, int* argv_){
    for (int i = 0; i < argc; ++i){
        printf("%s\n", (char *)(argv_[i]));
    }
}
The rationale behind this method is that a pointer is intrinsically an integer and by casting the address to a char
pointer, one should be able to get the strings. Thanks for your kind support in advance.
Method 2:
#include <stdio.h>
#include <string>
int main(int argc, std::string* argv_){
    for (int i = 0; i < argc; ++i){
        printf("%s\n", argv_[i].c_str());
    }
}
Method 3:
#include <stdio.h>
#include <string>
#include <vector>
int main(int argc, std::vector<std::string> argv_){
    for (int i = 0; i < argc; ++i){
        const char* argv__ = argv_[i].c_str();
        printf("%s\n", argv_[i].c_str());
    }
}
issue:
Unfortunately, all of the above methods lead to the infamous segmentation fault.
I would appreciate it if you could help me understand what the problem is (i.e., where the memory leak is) and how to solve it.
workaround/hack:
In the comments I'm being told that if any form other than main(), main(int argc, char** argv), or main(int argc, char* argv[]) is used, it will unavoidably lead to a segmentation fault. However, the code below works:
#include <stdio.h>
int main(int argc, long* argv_){
    for (int i = 0; i < argc; ++i){
        printf("%s\n", (char *)(argv_[i]));
    }
}
This works on an Ubuntu minimal with g++ 7.4.0, and on Windows 10 with the Visual Studio 2019 compilers. However, it does not compile with clang. As others have pointed out, this is not a solution and is very bad practice. It can cause undefined behavior depending on the compiler, operating system and the current state of the memory. This should not be used in any actual code ever. The main function in any C/C++ code must be of the forms main(), main(int argc, char** argv), or main(int argc, char* argv[]).
Let's try to tackle the plethora of issues that have cropped up during the lengthy discussion, one by one.
Question 1: Why do I get a segfault when using some non-standard parameters (like string vector or int pointer) to main?
The parameter types int, char ** are defined that way by both the C and the C++ standard. Non-standard extensions aside, you cannot use other types.
ISO/IEC 9899 (The C Language), 5.1.2.2.1 Program startup, requires main to be defined either with no parameters, or with the two parameters int argc, char *argv[] (or equivalent), "or in some other implementation-defined manner".
That last clause allows for those extensions I mentioned. One such extension I know of is the third parameter carrying the environment (compare environ) that GCC with glibc accepts on Unix-like systems:
https://www.gnu.org/software/libc/manual/html_node/Program-Arguments.html#Program-Arguments
Question 2: How do I hack around this?
You don't.
Using different types than those defined by the standard, or by compiler extensions, is Undefined Behavior, which can -- but does not need to -- lead to segfaults. Do not invoke undefined behavior. Do not "hack around" the standard. It is not a "workaround", let alone a "solution", it is broken code that can blow up in your face any time.
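To illustrate why the outcome differs between compilers and platforms, here is a small sketch that compares the size of a pointer with the sizes of int and long. The function names are invented for this example. On common 64-bit Linux targets int is smaller than a pointer, so indexing an int * walks the argument array in the wrong step size, while long happens to have the same size; on 64-bit Windows long is smaller than a pointer as well. Even where the sizes match, the program is still undefined behavior; matching sizes only explain why it may appear to work.

```cpp
#include <cstddef>
#include <cstdio>

// Number of bytes argv_[i] advances per index when argv_ has type T*.
template <typename T>
constexpr std::size_t element_stride() {
    return sizeof(T);
}

// True only if indexing a T* would step through memory in pointer-sized units.
// This says nothing about whether the access is legal; it is not.
template <typename T>
constexpr bool stride_matches_pointer() {
    return sizeof(T) == sizeof(char*);
}

// Prints the sizes on the current platform; the values are implementation-defined.
void report_strides() {
    std::printf("char*: %zu bytes\n", element_stride<char*>());
    std::printf("int:   %zu bytes (matches pointer: %d)\n",
                element_stride<int>(), stride_matches_pointer<int>() ? 1 : 0);
    std::printf("long:  %zu bytes (matches pointer: %d)\n",
                element_stride<long>(), stride_matches_pointer<long>() ? 1 : 0);
}
```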
Question 3: How do I pybind a third-party function that takes a char ** as parameter?
You don't, as this is not a datatype supported by pybind.
Question 4: How do I interface such a function through pybind, then?
You write a wrapper function that, on the front end, takes parameters supported by pybind (e.g. std::vector< std::string >), appropriately marshals those, and then calls the third-party backend function for you with the marshalled arguments. (Then, of course, doing the same in reverse for the return type, if required.)
For an idiomatic example on how to do that, see the answer by @TedLyngmo.
Question 5: Can I pybind to a third-party main?
This is ill-advised, as main is a special function, and the called code may make assumptions (like atexit callbacks) that your calling code does not, and cannot, comply with. It is certainly not a function the third party ever expected to be called as a library function.
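To make the atexit point concrete, here is a small sketch with a made-up third_party_main that registers a cleanup handler. When it is called as an ordinary function, it returns to the caller but the process keeps running, so the cleanup the code was counting on has not happened yet. Inside a Python extension that would presumably mean the handler only runs when the interpreter process itself exits.

```cpp
#include <cstdlib>

// Flag flipped by the exit handler so its effect can be observed.
static bool cleanup_ran = false;

static void cleanup() {
    cleanup_ran = true;
}

// Stand-in for a third-party main (hypothetical). It assumes the process ends
// right after it returns, at which point cleanup() would run.
int third_party_main(int /*argc*/, char** /*argv*/) {
    std::atexit(cleanup);
    return 0;
}

// Calls the stand-in like a library function and reports whether the handler
// has run by the time control is back in the caller.
bool cleanup_ran_after_call() {
    char name[] = "prog";
    char* argv[] = {name, nullptr};
    third_party_main(1, argv);
    return cleanup_ran;
}
```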