Consider the Chromium codebase. It's huge, around 4 GB of pure code, if I'm not mistaken. But however humongous it may be, it's still modular in nature, and it implements a lot of interesting features in its internals.
For example, I'd like to extract the WebSocket implementation from the sources, but that's not easy to do by hand. If we go to https://github.com/chromium/chromium/tree/main/net/websockets we'll see lots of header files. To compile the code as a "library" we need them plus their implementations in the .cpp files. The trick is that these header files include other header files in other directories of the Chromium project, and those in turn include others...
BUT if there are no circular dependencies, we should be able to get to the bottom of this tree, where header files include nothing (or include only already compiled libraries). At that point all the files needed for this dependency subtree are in place, so we can compile a chunk of the original codebase separately from the rest of it.
That's the idea. At least in theory.
Does anyone know how it could be done? I've found this repo and this repo, but they only show the dependency graph and do not have the functionality to extract a tree from it.
There should be a tool for this already, I suppose. It's just hard to phrase the search for Google. Or perhaps I'm mistaken and this approach wouldn't really work?
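The traversal described above can be sketched without any compiler support. This is only a rough illustration under stated assumptions: the function name `include_closure` is made up for this example, quoted `#include` paths are assumed to be written relative to the source root, and preprocessor conditionals are ignored, so the result can contain more files than a real build would need. Because visited files are remembered, circular includes do not cause an endless loop.

```python
import re
from collections import deque
from pathlib import Path

# Matches lines such as:  #include "net/websockets/foo.h"
# Angle-bracket includes (<vector>) are deliberately ignored.
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*"([^"]+)"', re.MULTILINE)


def include_closure(roots, source_root):
    """Return all files reachable from `roots` through quoted #include lines.

    Assumption: every quoted include path is relative to `source_root`.
    """
    source_root = Path(source_root)
    seen = set()
    queue = deque(Path(r) for r in roots)
    while queue:
        rel = queue.popleft()
        if rel in seen:
            continue  # already handled; this also breaks include cycles
        path = source_root / rel
        if not path.is_file():
            continue  # system, generated, or otherwise missing header
        seen.add(rel)
        text = path.read_text(errors="ignore")
        for inc in INCLUDE_RE.findall(text):
            queue.append(Path(inc))
    return sorted(p.as_posix() for p in seen)
```

This only finds headers. The matching implementation files would still have to be located separately, for example by looking for a source file with the same base name next to each header.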
Your compiler is almost surely capable of extracting this dependency information, because build systems use it to figure out incremental builds. In `gcc`, for instance, we have the `-MMD` flag.

Suppose we have four compilation units, `ball.cpp`, `football.cpp`, `basketball.cpp`, and `hockey.cpp`. Each source file includes a header file of the same name. Also, `football.hpp` and `basketball.hpp` each include `ball.hpp`.

If we compile these with `-MMD`, this will produce, in addition to the object files, some files with names like `basketball.d` that contain dependency information in the form of Makefile rules. It's simple enough to read these into, say, a Python script, and then just take the union of all the dependencies of the files you want to include.
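The `.d` files that gcc writes with `-MMD` are small Makefile fragments of the form `target: prerequisite ...`, possibly wrapped with a trailing backslash. A minimal sketch of the "read them and take the union" step might look like the following. The function names `parse_depfile` and `union_of_deps` are made up for this example, and the parsing assumes paths contain neither spaces nor colons (so Windows drive letters would need extra handling).

```python
from pathlib import Path


def parse_depfile(path):
    """Return the prerequisites listed in one Makefile-style .d file."""
    text = Path(path).read_text()
    # A trailing backslash continues the rule on the next line; join them.
    text = text.replace("\\\n", " ")
    deps = []
    for line in text.splitlines():
        if ":" not in line:
            continue  # blank line or something that is not a rule
        # Everything after the first colon is the prerequisite list.
        _, _, prereqs = line.partition(":")
        deps.extend(prereqs.split())
    return deps


def union_of_deps(depfiles):
    """Return the sorted union of prerequisites across several .d files."""
    result = set()
    for depfile in depfiles:
        result.update(parse_depfile(depfile))
    return sorted(result)
```

For the example above, one would pass the `.d` files of the translation units of interest, for instance `union_of_deps(Path(".").glob("*ball*.d"))`.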
EDIT: In fact, Python may even be overkill. In the situation above, if you wanted to get all dependencies for anything containing the word "ball," a short UNIX pipeline could do it: concatenate the matching `.d` files, break their contents into individual names at whitespace and `:` characters, and keep each name once.

The result is a list of everything the ball-related files depend on, which skips `hockey.cpp` and `hockey.hpp` because they aren't dependencies of any file with "ball" in its name. (Of course, in your case you might use "websockets" instead of "ball," and if there is some directory structure instead of everything being in the root directory, you may have to do a bit to compensate for that.)
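One way to compensate for a directory structure is to copy each dependency to a new location while keeping its path relative to the source root, so that the include paths keep resolving. The following is a hedged sketch, not a finished tool: `copy_preserving_layout` is a hypothetical helper, and it assumes the dependency list holds paths relative to the source root. Absolute paths, paths that climb out of the root, and missing files are skipped.

```python
import shutil
from pathlib import Path


def copy_preserving_layout(rel_paths, source_root, dest_root):
    """Copy files from source_root to dest_root, keeping relative paths."""
    source_root, dest_root = Path(source_root), Path(dest_root)
    copied = []
    for rel in rel_paths:
        rel = Path(rel)
        src = source_root / rel
        # Skip anything that is not a plain file inside the source tree.
        if rel.is_absolute() or ".." in rel.parts or not src.is_file():
            continue
        dst = dest_root / rel
        dst.parent.mkdir(parents=True, exist_ok=True)  # recreate directories
        shutil.copy2(src, dst)
        copied.append(rel.as_posix())
    return sorted(copied)
```

The returned list shows what was actually copied, which makes it easy to spot dependencies that were skipped and need to be supplied some other way.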