Accessing disjoint entries in global HashMap for lifetime of thread in Rust

405 views Asked by gatoWololo At 27 January 2021 at 00:00

My current project requires recording some information for various events that happen during the execution of a thread. These events are saved in a global struct indexed by the thread id:

RECORDER1: HashMap<ThreadId, Vec<Entry>> = HashMap::new();

Every thread appends new Entry values to its own vector, so threads access "disjoint" vectors. Of course, Rust requires synchronization primitives to make the above work, so the real implementation looks like:

use lazy_static::lazy_static;
use std::collections::HashMap;
use std::sync::Mutex;

struct Entry {
    // ... not important.
}

#[derive(Clone, Eq, PartialEq, Hash)]
struct ThreadId;

// lazy_static necessary to initialize this data structure.
lazy_static! {
    /// Global data structure. Threads access disjoint entries based on their unique thread id.
    /// "Outer" mutex necessary as lazy_static requires the value to be Sync (so cannot use RefCell).
    static ref RECORDER2: Mutex<HashMap<ThreadId, Vec<Entry>>> = Mutex::new(HashMap::new());
}
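
For concreteness, here is a minimal sketch of how a thread might record an event through this single global lock. It is not the project's actual code: it assumes `std::sync::OnceLock` as a std-only stand-in for lazy_static and the real `std::thread::ThreadId` in place of the placeholder struct, and `recorder` and `record` are hypothetical helper names.

```rust
use std::collections::HashMap;
use std::sync::{Mutex, OnceLock};
use std::thread::{self, ThreadId};

struct Entry; // Contents are not important here.

/// Hypothetical accessor: lazily creates the global map on first use.
/// (Assumption: OnceLock stands in for lazy_static to stay std-only.)
fn recorder() -> &'static Mutex<HashMap<ThreadId, Vec<Entry>>> {
    static RECORDER: OnceLock<Mutex<HashMap<ThreadId, Vec<Entry>>>> = OnceLock::new();
    RECORDER.get_or_init(|| Mutex::new(HashMap::new()))
}

/// Hypothetical helper: appends one entry to the calling thread's vector.
/// Every call takes the single global lock, which is the contention point.
fn record(entry: Entry) {
    let mut map = recorder().lock().expect("Unable to acquire global lock.");
    map.entry(thread::current().id()).or_default().push(entry);
}

fn main() {
    // Four threads, each recording three entries under its own thread id.
    let handles: Vec<_> = (0..4)
        .map(|_| {
            thread::spawn(|| {
                for _ in 0..3 {
                    record(Entry);
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().expect("Worker thread panicked.");
    }
    let map = recorder().lock().unwrap();
    println!("{} threads recorded entries", map.len());
}
```

Each thread only ever touches its own vector, yet every `record` call still serializes on the one outer mutex.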

This works, but all threads contend on the same global lock. It would be nice if a thread could "borrow" its respective vector for the lifetime of the thread so it could write all the entries it needs without needing to lock every time (I understand the outer lock is necessary for ensuring threads don't insert into the HashMap at the same time).

We can do this by adding an Arc and some more interior mutability via a Mutex for the values in the HashMap:

use std::sync::Arc;

lazy_static! {
    static ref RECORDER: Mutex<HashMap<ThreadId, Arc<Mutex<Vec<Entry>>>>> = Mutex::new(HashMap::new());
}

Now we can "check out" our entry when a thread is spawned:

use std::sync::MutexGuard;

fn local_borrow() {
    std::thread::spawn(|| {
        let mut recorder = RECORDER.lock().expect("Unable to acquire outer mutex lock.");
        let my_thread_id: ThreadId = ThreadId {}; // Get thread id...

        // Insert entry in hashmap for our thread.
        // Omit logic to check if key-value pair already existed (it shouldn't).
        recorder.insert(my_thread_id.clone(), Arc::new(Mutex::new(Vec::new())));

        // Get "reference" to vector
        let local_entries: Arc<Mutex<Vec<Entry>>> = recorder
            .get(&my_thread_id)
            .unwrap() // We just inserted this entry, so unwrap.
            .clone(); // Clone on the Arc to acquire a "copy".

        // Release the outer lock now; otherwise it stays held until the closure
        // returns and blocks every other thread.
        drop(recorder);

        // Lock once, use multiple times.
        let mut local_entries: MutexGuard<_> = local_entries.lock().unwrap();
        local_entries.push(Entry {});
        local_entries.push(Entry {});
    });
}

This works and is what I want. However, due to API constraints I have to access the MutexGuard from widely different places across the code without the ability to pass the MutexGuard as an argument to functions. So instead I use a thread local variable:

thread_local! {
    /// This variable is initialized lazily. Due to API constraints, we use this thread_local! to
    /// "pass" LOCAL_ENTRIES around.
    static LOCAL_ENTRIES: Arc<Mutex<Vec<Entry>>> = {
        let mut recorder = RECORDER.lock().expect("Unable to acquire outer mutex lock.");
        let my_thread_id: ThreadId = ThreadId {}; // Get thread id...

        // Omit logic to check if key-value pair already existed (it shouldn't).
        recorder.insert(my_thread_id.clone(), Arc::new(Mutex::new(Vec::new())));

        // Get "reference" to vector
        recorder
            .get(&my_thread_id)
            .unwrap() // We just inserted this entry, so unwrap.
            .clone() // Clone on the Arc to acquire a "copy".
    }
}

I cannot make LOCAL_ENTRIES a MutexGuard<_>, since thread_local! requires a 'static lifetime and the guard borrows from the Mutex it locks. So currently I have to call .lock() every time I want to access the thread-local variable:

fn main() {
    let handle = std::thread::spawn(|| {
        // Record important message.
        LOCAL_ENTRIES.with(|entries| {
            // We have to lock every time we want to write to LOCAL_ENTRIES. It would be nice
            // to lock once and hold on to the MutexGuard for the lifetime of the thread, but
            // this is not possible due to the lifetime on the MutexGuard.
            let mut entries = entries.lock().expect("Unable to acquire lock");
            entries.push(Entry {});
        });
    });
    // Wait for the thread; otherwise main may return before it records anything.
    handle.join().expect("Thread panicked");
}

Sorry for all the code and explanation but I'm really stuck and wanted to show why it doesn't work and what I'm trying to get working. How can one get around this in Rust?
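
One possible direction, sketched under assumptions and not offered as a definitive answer: keep a plain `RefCell<Vec<Entry>>` in the thread-local (no lock is needed, since only the owning thread can reach it) and move its contents into the global map once, when the thread is done. The names `recorder`, `record`, and `flush_local` are hypothetical, `OnceLock` stands in for lazy_static, the real `std::thread::ThreadId` replaces the placeholder struct, and the sketch assumes nothing needs to read a thread's entries before that thread flushes.

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::sync::{Mutex, OnceLock};
use std::thread::{self, ThreadId};

struct Entry; // Contents are not important here.

/// Hypothetical accessor for the global map (OnceLock instead of lazy_static).
fn recorder() -> &'static Mutex<HashMap<ThreadId, Vec<Entry>>> {
    static RECORDER: OnceLock<Mutex<HashMap<ThreadId, Vec<Entry>>>> = OnceLock::new();
    RECORDER.get_or_init(|| Mutex::new(HashMap::new()))
}

thread_local! {
    /// Per-thread buffer. RefCell suffices because no other thread can reach it.
    static LOCAL_ENTRIES: RefCell<Vec<Entry>> = RefCell::new(Vec::new());
}

/// Records an entry without taking any lock: only thread-local data is touched.
fn record(entry: Entry) {
    LOCAL_ENTRIES.with(|entries| entries.borrow_mut().push(entry));
}

/// Moves everything buffered so far into the global map, locking exactly once.
fn flush_local() {
    let buffered = LOCAL_ENTRIES.with(|entries| std::mem::take(&mut *entries.borrow_mut()));
    let mut map = recorder().lock().expect("Unable to acquire outer mutex lock.");
    map.entry(thread::current().id()).or_default().extend(buffered);
}

fn main() {
    let handle = thread::spawn(|| {
        record(Entry);
        record(Entry);
        flush_local(); // Must be called explicitly before the thread exits.
    });
    handle.join().expect("Worker thread panicked.");

    let total: usize = recorder().lock().unwrap().values().map(Vec::len).sum();
    println!("recorded {total} entries");
}
```

The trade-off in this sketch is that entries are invisible to other threads until `flush_local` runs, and a thread that exits without calling it loses its buffered entries.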

Or am I getting hung up on the cost of locking the mutex? Each Arc<Mutex<Vec<Entry>>> is only ever locked by its own thread, so the lock is always uncontended when it is acquired. Does that make the cost of the atomic lock operation negligible?
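
One way to check this empirically is a rough timing loop. The sketch below is an informal measurement, not a rigorous benchmark: it times repeated lock-and-push on a mutex that only one thread ever touches against pushes into a plain Vec. The iteration count `N`, the `id` field on `Entry`, and the helper names are arbitrary assumptions, and results will vary by platform and build profile.

```rust
use std::sync::{Arc, Mutex};
use std::time::{Duration, Instant};

#[allow(dead_code)]
struct Entry {
    id: u64, // Hypothetical payload so that pushes actually write memory.
}

const N: usize = 1_000_000;

/// Pushes N entries, taking the (always uncontended) lock for every push.
fn push_with_lock(entries: &Arc<Mutex<Vec<Entry>>>) -> Duration {
    let start = Instant::now();
    for i in 0..N {
        entries
            .lock()
            .expect("Unable to acquire lock")
            .push(Entry { id: i as u64 });
    }
    start.elapsed()
}

/// Pushes N entries into a plain Vec as a baseline without any locking.
fn push_without_lock(entries: &mut Vec<Entry>) -> Duration {
    let start = Instant::now();
    for i in 0..N {
        entries.push(Entry { id: i as u64 });
    }
    start.elapsed()
}

fn main() {
    let locked = Arc::new(Mutex::new(Vec::new()));
    let mut plain = Vec::new();

    let with_lock = push_with_lock(&locked);
    let without_lock = push_without_lock(&mut plain);

    println!("with lock: {with_lock:?}, without lock: {without_lock:?}");
}
```

Building with `--release` and comparing the two durations gives a first impression of the per-lock overhead on a given machine.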

Thanks for any thoughts. Here is the complete example in Rust Playground.
