How to make an old C codebase with many globals reentrant

411 views Asked by At

I'm working with a large, old C codebase (an interpreter) that uses global variables a great deal, with the result that I cannot have two instances of it at once. Is there a straightforward (ideally automated) approach to convert this code to something reentrant? i.e. some refactor tool that would make all globals part of a struct and prepend the pointer to all variables?

Could I convert to C++ and wrap the entire thing in a class definition?

6

There are 6 answers

0
Kirill Kobelev On BEST ANSWER

I would recommend to convert your project into C++11 project and change all your static vars into threadlocal.

This can be up to several days of work depending on the size of your project. In certain cases this will work.

6
Mats Petersson On

I'm not aware of any "ready made" solution for this type of problem.

As a general rule, global variables are going to make it hard to make the code reentrant.

If you can remove all the global variables [simply delete the globals and see where you get compiler errors]. Replace the globals with a structure, and then use a structure per instance that is passed along, you'd be pretty much done (as long as the state of the interpreter instances is independent, and the instances don't need to know about each other). [Of course, you may need to have more than a single structure to solve the problem, but your global variables should be possible to "stick in a structure"].

Of course, making the structure and the code go together as a C++ class (which may have smaller classes as part of the solution) would be the "next step", but it's not entirely straight forward to do this, if you are not familiar with C++ and class designs.

3
yosim On

Are you trying to make it reentrant in order to be able to make it multi-thread, and divide the work between threads? If so, I would consider making it multi process, instead of multy-thread,

0
Ira Baxter On

A program transformation tool that can handle the C language would be able to automate such changes. It needs to be able to resolve each symbol to its declaration, and preprocessor conditionals are likely to be trouble. I'd name one, but SO zealots object when I do that.

Your solution of a struct containing the globals is the right idea; you need rewrite that replaces each global declaration with a slot member, and each access to a global with access to the corresponding struct member.

The remaining question is, where does the struct pointer come from? One answer is a global variable that is multiplexed when threads are switched; a better answer if available under your OS is the get the struct pointer from thread local variables.

0
Martin James On

What I usually do with interpreters is go straight to a class with instance vars rather than globals. Not sure what you are interpreting, but it could be possible to pass in a file path or string container that the class interprets with an internal thread, so encapsulating the entire interpretation run.

0
user4815162342 On

It is possible to wrap the whole thing in a class definition, but it will not work for code that takes addresses of functions and passes them to C code. Also, converting a large legacy code base to be compilable by a C++ compiler is tedious enough that it probably outweighs the effort of removing the global variables by hand.

Barring that one, you have two options:

  1. Since you need reentrancy to implement threading, it might be easiest to declare all global variables thread-local. If you have a C compiler that supports thread-locals, this is typically as easy as slapping a __thread (or other compiler-specific keyword) before every declaration. Then you create a new interpreter simply by creating a new thread and initializing the interpreter in the normal way.

  2. If you cannot use __thread or equivalent, then you have to do a bit more footwork. You need to place all global variables in a structure, and replace every access to global variable foo with get_ctx()->foo. This is tedious, but straightforward. Once you are done, get_ctx() can allocate and return a thread-local interpreter state, using an API of your choosing.