I am asking this basic question to set the record straight. I have referred to this question and its currently accepted answer, which is not convincing. The second most voted answer gives better insight, but it is not perfect either.
While reading the text below, try to distinguish between the `inline` keyword and the “inlining” concept.
Here is my take:
The "inlining" concept
This is done to save the call overhead of a function. It is similar to macro-style code replacement. There is nothing to dispute here.
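As a minimal sketch of the concept (the function names here are made up for illustration), the second function below shows roughly what the first one could look like after an optimizer substitutes the body of `square()` at each call site. Whether a given compiler actually does this depends on its heuristics and flags.

```cpp
// A small function: an ordinary call to it involves call/return overhead.
int square(int x) {
    return x * x;
}

// The code as written: two calls to square().
int sum_of_squares_call(int a, int b) {
    return square(a) + square(b);
}

// Roughly what the optimizer may produce after inlining: the body of
// square() replaces each call, so no call instruction remains.
int sum_of_squares_inlined(int a, int b) {
    return (a * a) + (b * b);
}
```

Both versions compute the same result; only the generated code differs.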
The `inline` keyword
Perception A
The `inline` keyword is a request to the compiler, usually used for smaller functions, so that the compiler can optimize them and make calls faster. The compiler is free to ignore it.
I partially dispute this for the reasons below:
- Larger and/or recursive functions are not inlined anyway, and the compiler ignores the `inline` keyword completely.
- Smaller functions are automatically inlined by the optimizer irrespective of whether the `inline` keyword is mentioned.
It's quite clear that the user doesn't have any control over function inlining through the `inline` keyword.
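The two cases in the list above can be sketched as follows. The function names are hypothetical, and the comments describe what optimizers commonly do, not a guarantee. Inspecting the generated assembly for a particular compiler and optimization level is the only way to know what happened.

```cpp
// Marked inline, but recursive: the keyword is accepted, yet the compiler
// cannot simply substitute the body at every call, because the body itself
// contains another call to the same function.
inline unsigned long long factorial(unsigned n) {
    return n <= 1 ? 1ULL : n * factorial(n - 1);
}

// Not marked inline, but tiny: optimizers commonly inline a function like
// this at higher optimization levels when its definition is visible.
int twice(int x) {
    return 2 * x;
}
```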
Perception B
`inline` has nothing to do with the concept of inlining. Putting `inline` ahead of big or recursive functions won't help, while smaller functions won't need it to be inlined.
The only deterministic use of `inline` is to maintain the One Definition Rule.
That is, if a function is declared with `inline`, then only the things below are mandated:
- Even if its body is found in multiple translation units (e.g. the header is included in multiple `.cpp` files), the compiler will generate only one definition and avoid the multiple-symbol linker error. (Note: if the bodies of that function differ, the behavior is undefined.)
- The body of the `inline` function has to be visible in every translation unit that uses it. In other words, declaring an `inline` function in a `.h` file and defining it in only one `.cpp` file will result in an “undefined symbol” linker error for the other `.cpp` files.
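The two rules above can be sketched with a header like the one below. The file name `counter.h` and the function `next_id()` are assumptions made for illustration only. The block is written as a single file so it compiles on its own; the comments describe what would happen if several `.cpp` files included it.

```cpp
// --- counter.h (hypothetical header, included by several .cpp files) ---
#ifndef COUNTER_H
#define COUNTER_H

// Defined in the header. Without `inline`, every .cpp file that includes
// this header would emit its own external definition of next_id(), and the
// link step would typically fail with a duplicate-symbol error.
inline int next_id() {
    static int id = 0;  // a single object shared by all translation units
    return ++id;
}

// Counter-example for the second rule, left commented out: an inline
// function that is only declared here and defined in a single .cpp file.
// Other .cpp files that call it would typically fail to link.
// inline int missing_body();

#endif  // COUNTER_H
```

Because the function is `inline` with external linkage, its function-local `static` refers to the same object everywhere, so the identifiers keep counting up across translation units.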
Verdict
IMO, perception “A” is entirely wrong and perception “B” is entirely right.
There are some quotes in the standard on this; however, I am expecting an answer that logically explains whether this verdict is correct or not.
Email reply from Bjarne Stroustrup:
"For decades, people have promised that the compiler/optimizer is or will soon be better than humans for inlining. This may be true in theory, but it still isn't in practice for good programmers, especially in an environment where whole-program optimization is not feasible. There are major gains to be had from judicious use of explicit inlining."
I wasn't sure about your claim:
I've heard that compilers are free to ignore your `inline` request, but I didn't think they disregarded it completely.

I looked through the GitHub repositories for Clang and LLVM to find out. (Thanks, open source software!) I found that the `inline` keyword does make Clang/LLVM more likely to inline a function.

The Search
Searching for the word `inline` in the Clang repository leads to the token specifier `kw_inline`. It looks like Clang uses a clever macro-based system to build the lexer and other keyword-related functions, so there's nothing direct like `if (tokenString == "inline") return kw_inline` to be found. But here in ParseDecl.cpp, we see that `kw_inline` results in a call to `DeclSpec::setFunctionSpecInline()`.

Inside that function, we set a bit and emit a warning if it's a duplicate `inline`:

Searching for `FS_inline_specified` elsewhere, we see it's a single bit in a bitfield, and it's used in a getter function, `isInlineSpecified()`:

Searching for call sites of `isInlineSpecified()`, we find the codegen, where we convert the C++ parse tree into LLVM intermediate representation:

Clang to LLVM
We are done with the C++ parsing stage. Now our `inline` specifier is converted to an attribute of the language-neutral LLVM `Function` object. We switch from Clang to the LLVM repository.

Searching for `llvm::Attribute::InlineHint` yields the method `Inliner::getInlineThreshold(CallSite CS)` (with a scary-looking braceless `if` block):

So we already have a baseline inlining threshold from the optimization level and other factors, but if it's lower than the global `HintThreshold`, we bump it up. (`HintThreshold` is settable from the command line.)

`getInlineThreshold()` appears to have only one call site, a member of `SimpleInliner`:

It calls a virtual method, also named `getInlineCost`, on its member pointer to an instance of `InlineCostAnalysis`.

Searching for `::getInlineCost()` to find the versions that are class members, we find one that's a member of `AlwaysInline` - which is a non-standard but widely supported compiler feature - and another that's a member of `InlineCostAnalysis`. It uses its `Threshold` parameter here:

`CallAnalyzer::analyzeCall()` is over 200 lines and does the real nitty-gritty work of deciding if the function is inlineable. It weighs many factors, but as we read through the method we see that all its computations manipulate either the `Threshold` or the `Cost`. And at the end:

But the return value named `ShouldInline` is really a misnomer. In fact the main purpose of `analyzeCall()` is to set the `Cost` and `Threshold` member variables on the `CallAnalyzer` object. The return value only indicates the case when some other factor has overridden the cost-vs-threshold analysis, as we see here:

Otherwise, we return an object that stores the `Cost` and `Threshold`.

So we're not returning a yes-or-no decision in most cases. The search continues! Where is this return value of `getInlineCost()` used?

The Real Decision
It's found in `bool Inliner::shouldInline(CallSite CS)`. Another big function. It calls `getInlineCost()` right at the beginning.

It turns out that `getInlineCost` analyzes the intrinsic cost of inlining the function - its argument signature, code length, recursion, branching, linkage, etc. - and some aggregate information about every place the function is used. On the other hand, `shouldInline()` combines this information with more data about a specific place where the function is used.

Throughout the method there are calls to `InlineCost::costDelta()` - which will use the `InlineCost`'s `Threshold` value as computed by `analyzeCall()`. Finally, we return a `bool`. The decision is made. In `Inliner::runOnSCC()`:

`InlineCallIfPossible()` does the inlining based on `shouldInline()`'s decision.

So the `Threshold` was affected by the `inline` keyword, and is used in the end to decide whether to inline.

Therefore, your Perception B is partly wrong because at least one major compiler changes its optimization behavior based on the `inline` keyword.

However, we can also see that `inline` is only a hint, and other factors may outweigh it.
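To summarize the flow traced above, here is a deliberately simplified model of the decision. It is not LLVM's code: every name in it (`CalleeInfo`, `effective_threshold`, `should_inline`) is hypothetical, and the real analysis weighs many more factors. It only captures the two points from the walkthrough: the hint can raise the threshold, and the final answer still depends on comparing cost against that threshold.

```cpp
// Hypothetical, simplified model; none of these names come from LLVM.
struct CalleeInfo {
    int cost;              // estimated cost of inlining this callee
    bool has_inline_hint;  // true if the function was declared `inline`
};

// The hint only matters when the hint threshold is higher than the
// baseline chosen from the optimization level.
int effective_threshold(const CalleeInfo& callee,
                        int base_threshold,
                        int hint_threshold) {
    if (callee.has_inline_hint && hint_threshold > base_threshold)
        return hint_threshold;
    return base_threshold;
}

// Inline only if the estimated cost stays under the effective threshold.
bool should_inline(const CalleeInfo& callee,
                   int base_threshold,
                   int hint_threshold) {
    return callee.cost <
           effective_threshold(callee, base_threshold, hint_threshold);
}
```

In this model, with illustrative thresholds of 225 (base) and 325 (hint), a callee of cost 300 is inlined only when it carries the hint, while a callee of cost 1000 is rejected either way. That matches the conclusion that `inline` nudges the decision without forcing it.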