Capturing a `thread_local` in a lambda

Capturing a thread_local in a lambda:

#include <iostream>
#include <thread>
#include <string>

struct Person
{
    std::string name;
};

int main()
{
    thread_local Person user{"mike"};
    Person& referenceToUser = user;

    // Works fine - Prints "Hello mike"
    std::thread([&]() {std::cout << "Hello " << referenceToUser.name << std::endl;}).join();

    // Doesn't work - Prints "Hello"
    std::thread([&]() {std::cout << "Hello " << user.name << std::endl;}).join();

    // Works fine - Prints "Hello mike"
    std::thread([&user=user]() {std::cout << "Hello " << user.name << std::endl;}).join();
}

https://godbolt.org/z/zeocG5ohb

It seems that if I use the original name of a thread_local variable inside the lambda, I get the instance belonging to the thread that runs the lambda. But as soon as I take a reference or pointer to the thread_local, it refers to the originating thread's instance.

What are the rules here? Can I rely on this analysis?

There are 2 answers

Answer by Jan Schultke (best answer)

Similar to local static objects, local thread_local (implicitly static thread_local) objects are initialized when control passes through their declaration for the first time. For thread_local, this happens separately on each thread.

The thread you are creating never executes main, only the main thread does, so you're accessing user before its lifetime has begun on the extra thread.

Your three cases explained

std::thread([&]() {std::cout << "Hello " << referenceToUser.name << std::endl;}).join();

We are capturing referenceToUser which refers to the user on the main thread. This is okay.

std::thread([&]() {std::cout << "Hello " << user.name << std::endl;}).join();

We are accessing user on the extra thread before its lifetime has begun. This is undefined behavior.

std::thread([&user=user]() {std::cout << "Hello " << user.name << std::endl;}).join();

Once again, we are referencing the user from the main thread here, which is the same as the first case.

Possible fix

If you declare user outside of main, then each thread's user will be initialized no later than its first use on that thread, regardless of whether that thread ever runs main:

thread_local Person user{"mike"};

int main() {
    // ...
}

Alternatively, declare user inside of your lambda expression.


Note: It's not necessary to capture thread_local objects. The second example could be [] { ... }.

Answer by Yakk - Adam Nevraumont

[&] is a very dangerous thing to do when your lambda executes outside of the immediate context. It is very useful within that context; otherwise, it is a bad plan.

In this case, you are being bitten by the fact that [&] does not capture global, local static or thread local variables. So using [] in this case would behave the same.

thread_local Person user{"mike"};
Person& referenceToUser = user;

std::thread([&]() {std::cout << "Hello " << referenceToUser.name << std::endl;}).join();

this captures referenceToUser by reference. referenceToUser in turn refers to the thread_local variable in the main thread.

std::thread([&]() {std::cout << "Hello " << user.name << std::endl;}).join();

this is identical to

std::thread([]() {std::cout << "Hello " << user.name << std::endl;}).join();

the use of [&] here makes you believe it is capturing user by reference, but it is not. Inside the lambda, user names the new thread's own instance of the thread_local variable declared in main. As that thread never passed the initialization line of that variable, you have just done UB.

std::thread([&user=user]() {std::cout << "Hello " << user.name << std::endl;}).join();

here you explicitly create a new reference variable user at lambda creation, bound to the main thread's instance.

The basic rule is: never use [&] when creating a lambda to pass to a std::thread.

This is appropriate use of [&]:

foreach_chicken( [&](chicken& c){ /* process chicken */ } );

the lambda is expected to exist within the current scope, and is going to be executed locally. [&] is safe.

auto pop = [&]()->std::optional<int>{
  if (queue.empty()) return std::nullopt;
  auto x = std::move(queue.front());
  queue.pop_front();
  return x;
};
while (auto x = pop()) {
}

this is another example of a valid use of [&], as this pop operation is being refactored into a helper and maybe run more than once in the local function.

But if the lambda is not being run locally or could live beyond the current scope, [&] is a toxic option that leads to surprises and bugs in pretty much every case I've seen it used.