C++ symbol has different size in shared object


I have been working on a cross-platform windowing library intended specifically for OpenGL, currently focusing on Linux. I am using glload to manage OpenGL extensions, and it is compiled, along with other libraries that I will use later, into a .so. This .so is dynamically loaded as you would expect, but at run time the program gives the following output (manually wrapped so it is easier to read):

_dist/x64-linux-debug/bin/test: Symbol `glXCreateContextAttribsARB' has \
different size in shared object, consider re-linking

Now, obviously I have tried re-linking, going as far as rebuilding the entire project many times (testing things out, not just blindly hoping it will magically make it all better). The program does seem to be willing to run, as it produces some of the logging output I would expect. I have used nm to confirm that the symbol is in the .so:

nm _dist/x64-linux-debug/lib64/libvendor.so | grep glXCreateContextAttribsARB
00000000009e0e78 B glXCreateContextAttribsARB

If I use readelf to look at the symbols being defined, I get the following (again, I have manually wrapped the first three lines for formatting's sake):

readelf -Ws _dist/x64-linux-debug/bin/test \
_dist/x64-linux-debug/lib64/libvendor.so | \
grep glXCreateContextAttribsARB
   348: 000000000062b318  8 OBJECT  GLOBAL DEFAULT  26 glXCreateContextAttribsARB
   421: 000000000062b318  8 OBJECT  GLOBAL DEFAULT  26 glXCreateContextAttribsARB
  1370: 00000000009e0e78  8 OBJECT  GLOBAL DEFAULT  25 glXCreateContextAttribsARB
 17464: 00000000009e0e78  8 OBJECT  GLOBAL DEFAULT  25 glXCreateContextAttribsARB

I am afraid that this is about all I can offer to help, as I really do not know what to try or look into. I am sure more info will be needed, so please just say and I will provide what I can. I am running these commands from my project root, in case you are wondering.


There are 4 answers

0
Roland Sarrazin On

I have faced a tedious issue related to objects of different sizes, so I want to share my experience, even though it is clear to me that it is only one possible reason for different object sizes and not necessarily the OP's.

The symptoms were objects of different sizes in debug mode, none in release mode. The linker produced the corresponding warnings. The symbol names were hard to decipher but related to some unnamed static variables in instances of class templates.

The reason was the debug logging feature à la LOG("Do something.");. The LOG macro used the ANSI C macro __FILE__, which expanded to a different path depending on whether the header was included by the application or by the shared library. And this string was exactly the aforementioned unnamed static variable.

Even more tedious was the fact that due to our make environment the __FILE__ macro sometimes expanded to, let's say, C:\temp\file.h and sometimes to C:\other\..\temp\file.h so that building the application and the library from the same place didn't solve the problem either.
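To make the size difference concrete, here is a minimal sketch. The two string literals are stand-ins for what __FILE__ expanded to in each build (the paths are the ones quoted above; the variable names are made up for illustration). It only shows that a static string initialised from __FILE__ has a size that depends on the path spelling, not how the real LOG macro was written.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for the expansion of __FILE__ in each build.
// The two spellings are the ones quoted in the answer above.
constexpr char file_seen_by_app[] = "C:\\temp\\file.h";
constexpr char file_seen_by_lib[] = "C:\\other\\..\\temp\\file.h";

// A static char array initialised from __FILE__ occupies
// (path length + 1) bytes, so the "same" variable gets a different
// size in the application and in the shared library.
constexpr std::size_t size_in_app = sizeof(file_seen_by_app);
constexpr std::size_t size_in_lib = sizeof(file_seen_by_lib);

static_assert(size_in_app != size_in_lib,
              "same source line, different object size");
```

Both paths name the same file, yet the two objects differ in size, which is the kind of mismatch the warning reports.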

I hope this bit of experience saves some of you some time.

1
AudioBubble On

The runtime is noticing that glXCreateContextAttribsARB as compiled in the shared object, and glXCreateContextAttribsARB as compiled in the main program (or maybe even some other shared object previously linked) have different sizes. This means that, in the separate builds for the shared object and whatever else references that object, they must be looking at different code (probably in a shared object) where this is defined. Sometimes this occurs because they are looking at different files, sometimes this occurs because of different #defines causing different interpretations of the same file. Whatever the reason, you absolutely need to make sure that the same symbol (e.g. a structure) is defined the same way (i.e. with the same member variables and size) across everything that is linked together at runtime.
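As a rough illustration of the "different #defines" case, the sketch below uses two namespaces to stand in for two separately compiled binaries that included the same header. The struct, its members and the WITH_STATS macro are all hypothetical; nothing here is taken from glload or the OP's code.

```cpp
#include <cassert>
#include <cstddef>

// Each namespace stands in for one separately compiled binary that
// included the "same" header under different preprocessor settings.
namespace main_program {            // as if built with -DWITH_STATS
struct Settings {
    int    id;
    double stats[4];                // only present when WITH_STATS is defined
};
}

namespace shared_object {           // as if built without -DWITH_STATS
struct Settings {
    int id;
};
}

constexpr std::size_t size_in_main_program  = sizeof(main_program::Settings);
constexpr std::size_t size_in_shared_object = sizeof(shared_object::Settings);

// A global of type Settings would be recorded with these two different
// sizes in the two symbol tables.
static_assert(size_in_main_program != size_in_shared_object,
              "one name, two layouts");
```

In a real build the two definitions share one name, so neither the compiler nor the static linker sees the conflict within a single translation unit; it only surfaces when the pieces are brought together.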

It's actually a very good thing that the loader flags this, because two parts of the code interpreting the same bit of memory in different ways at runtime is a catastrophe. (It is not too much of an exaggeration to say that anything could happen if this went unnoticed.)

You might want to try just loading up the executable in gdb (without running it) and typing

info types

to see where it is defined, and then load the shared object in gdb (without running it) and doing another info types there to see what each of them thinks it's looking at. If it's the same thing, check the preprocessor directives.

1
Employed Russian On

wilsonmichaelpatrick's answer is mostly correct, but using gdb is likely not the fastest way to find the problem, and will likely not work at all if you have a non-debug build.

First, you should confirm that there in fact is a problem:

readelf -Ws _dist/x64-linux-debug/bin/test _dist/x64-linux-debug/lib64/libvendor.so |
  grep glXCreateContextAttribsARB

This should show the symbol being defined in test and libvendor.so, with different size.

Second, re-link test and libvendor.so with the -Wl,-y,glXCreateContextAttribsARB flag. That will tell you which object files (or libraries) provide the (different) definitions.

Finally, preprocess the sources that produce the above object files with the -E and -dD flags, and see what's different between them.

Update:

I need help digesting what it is saying

Don't be helpless. Read man readelf, or just run it by hand. You'll see something like this:

readelf -Ws /bin/date | head -5

Symbol table '.dynsym' contains 75 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND 
     1: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND __ctype_toupper_loc@GLIBC_2.3 (2)

This tells you the meaning of the data you've got. In particular, this tells you that the size of the symbol in test and in libvendor.so is the same (8). Therefore, the problem is not in these two ELF files, but somewhere else. Run readelf on your other libraries, and look for definition of glXCreateContextAttribsARB that has a different size. Then follow the rest of the procedure.
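If you have many libraries to check, reading the Size column by eye gets tedious. The following is a small sketch of pulling the size, type and name out of one row of readelf -Ws output, assuming the column layout shown in the header above (Num, Value, Size, Type, Bind, Vis, Ndx, Name). The SymbolRow and parse_symbol_row names are made up for this example.

```cpp
#include <cassert>
#include <cctype>
#include <sstream>
#include <string>

// Hypothetical helper type: the fields of a symbol row we care about.
struct SymbolRow {
    unsigned long size = 0;
    std::string   type;
    std::string   name;
};

// Parse one row of `readelf -Ws` output, assuming the column order
//   Num: Value Size Type Bind Vis Ndx Name
// Returns false for header lines and for rows without a name.
inline bool parse_symbol_row(const std::string& line, SymbolRow& out) {
    std::istringstream in(line);
    std::string num, value, size_text, bind, vis, ndx;
    if (!(in >> num >> value >> size_text >> out.type
             >> bind >> vis >> ndx >> out.name)) {
        return false;
    }
    // The header row has the word "Size" in this column, not a number.
    if (!std::isdigit(static_cast<unsigned char>(size_text[0]))) {
        return false;
    }
    // Base 0 accepts both decimal and 0x-prefixed sizes.
    out.size = std::stoul(size_text, nullptr, 0);
    return true;
}
```

Feeding it the rows from the question gives a size of 8 for every definition of glXCreateContextAttribsARB, which is why the mismatch has to come from some other library.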

0
Carlo Wood On

In most cases you're probably just linking against the wrong library (a different version). For example, you have libfoo installed twice and link your executable with -L /path/to/version1 -lfoo but during runtime you link with /path/to/version2 (you can see this one with ldd yourprogram).

One reason could be that the executable was linked with -Wl,-rpath,/path/to/version1 but (as recent linker versions do) this sets the RUNPATH entry in the dynamic section rather than RPATH, while you have LD_LIBRARY_PATH=/path/to/version2. When RUNPATH is used, LD_LIBRARY_PATH takes precedence over it. In this case delete the library from /path/to/version2 (or remove that path from LD_LIBRARY_PATH).
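The precedence can be sketched as a toy model of the search order documented in the ld.so man page: RPATH is consulted first but only when no RUNPATH is present, then LD_LIBRARY_PATH, then RUNPATH. This is a simplification that ignores secure-execution mode, the ld.so cache and the default directories. The LinkSettings and search_order names are invented for the example.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical container for the three path sources discussed above.
struct LinkSettings {
    std::vector<std::string> rpath;            // DT_RPATH entries
    std::vector<std::string> runpath;          // DT_RUNPATH entries
    std::vector<std::string> ld_library_path;  // from the environment
};

// Simplified model of the loader's directory search order.
inline std::vector<std::string> search_order(const LinkSettings& s) {
    std::vector<std::string> order;
    // DT_RPATH is only honoured when DT_RUNPATH is absent.
    if (s.runpath.empty()) {
        order.insert(order.end(), s.rpath.begin(), s.rpath.end());
    }
    // LD_LIBRARY_PATH comes next, ahead of DT_RUNPATH.
    order.insert(order.end(),
                 s.ld_library_path.begin(), s.ld_library_path.end());
    order.insert(order.end(), s.runpath.begin(), s.runpath.end());
    return order;
}
```

With RUNPATH set to /path/to/version1 and LD_LIBRARY_PATH set to /path/to/version2, the model lists version2 first; with the same directory recorded as RPATH instead, version1 comes first.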

EXAMPLE

$ minimal
/home/carlo/minimal: Symbol `_ZN6libcwd8libcw_doE' has different size in shared object, consider re-linking
COREDUMP    : /home/carlo/projects/libcwd/libcwd/elfxx.cc:2381: void libcwd::elfxx::objfile_ct::load_dwarf(): Assertion `size == sizeof(address)' failed.

(libcwd is smart enough to see it too; i.e. the problem here is with libcwd):

$ ldd minimal | grep libcwd_r
        libcwd_r.so.5 => /usr/local/install/6.0.0-1ubuntu2/lib/libcwd_r.so.5 (0x00007f0b69840000)

$ echo $LD_LIBRARY_PATH
/usr/local/install/6.0.0-1ubuntu2/lib

$ objdump -a -x minimal | grep PATH
  RUNPATH /opt/gitache/libcwd_r/888f62c44fd64f1486176bf9e35b36f79612790017c31f95e117fc59743a54ca/lib

Unsetting LD_LIBRARY_PATH or removing libcwd from that path results in

$ unset LD_LIBRARY_PATH
$ ldd minimal | grep libcwd_r
        libcwd_r.so.5 => /opt/gitache/libcwd_r/888f62c44fd64f1486176bf9e35b36f79612790017c31f95e117fc59743a54ca/lib/libcwd_r.so.5 (0x00007f11d7298000)

and things work again. Or alternatively I could add to my CMakeLists.txt of the project:

set(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -Wl,--disable-new-dtags")

After which we get,

$ objdump -a -x minimal | grep PATH
  RPATH                /opt/gitache/libcwd_r/888f62c44fd64f1486176bf9e35b36f79612790017c31f95e117fc59743a54ca/lib

which now has precedence over LD_LIBRARY_PATH and therefore also solves the issue. This is not the recommended way however: if you set LD_LIBRARY_PATH you should know what you are doing. If that doesn't work, you should fix LD_LIBRARY_PATH or remove the offending library.