C++ std::vector access using software transactional memory

780 views Asked by At

I'm currently trying to wrap my head around the problem of thread-safety using C++ STL containers. I recently tried to implement a thread safe std::vector by using a std::mutex as a member variable, just to then realize that although I could make member functions thread-safe by locking the lock, I couldn't make lib functions like std::sort thread-safe, since they only get the begin()/end() iterators, which is a result of the fundamental split between containers and algorithms in the STL in general.

So then I thought, if I can't use locks, how about software transactional memory (STM)?

So now I'm stuck with this:

#include <atomic>
#include <cstdlib>
#include <iostream>
#include <thread>
#include <vector>

#define LIMIT 10

std::atomic<bool> start{false};
std::vector<char> vec;

void thread(char c)
{
    while (!start)
        std::this_thread::yield();

    for (int i = 0; i < LIMIT; ++i) {
        __transaction_atomic {
        vec.push_back(c);
        }
    }
}

int main()
{
    std::thread t1(thread, '*');
    std::thread t2(thread, '@');

    start.store(true);

    t1.join();
    t2.join();

    for (auto i = vec.begin(); i != vec.end(); ++i)
        std::cout << *i;

    std::cout << std::endl;

    return EXIT_SUCCESS;
}

Which I compile with:

g++ -std=c++11 -fgnu-tm -Wall

using g++ 4.8.2 and which gives me the following error:

error: unsafe function call to push_back within atomic transaction

Which I kinda get, since push_back or sort or whatever isn't declared transaction_safe but which leaves me with the following questions:

a) How can I fix that error?

b) If I can't fix that error, then what are these transactional blocks usually used for?

c) How would one implement a lock-free thread-safe vector?!

Thanks in advance!

Edit: Thanks for the answers so far but they don't really scratch my itch. Let me give you an example: Imagine I have a global vector and access to this vector shall be shared amongst multiple threads. All threads try to do sorted inserts, so they generate a random number and try to insert this number into the vector in a sorted manner, so the vector stays sorted all the time (including duplicates ofc). To do the sorted insert they use std::lower_bound to find the "index" where to insert and then do the insert using vector.insert().

If I write a wrapper for the std::vector that contains a std::mutex as a member, than I can write wrapper functions, e.g. insert which locks the mutex using std::lock_guard and then does the actual std::vector.insert() call. But std::lower_bound doesn't give a damn about the member mutex. Which is a feature, not a bug afaik.

This leaves my threads in quite a pickle because other threads can change the vector while someone's doing his lower_bound thing.

The only fix I can think of: forgett the wrapper and have a global mutex for the vector instead. Whenever anybody wants to do anything on/with/to this vector, he needs that lock.

THATS the problem. What alternatives are there for using this global mutex. and THATS where software transactional memory came to mind.

So now: how to use STMs on STL containers? (and a), b), c) from above).

3

There are 3 answers

0
Svalorzen On

I'm not sure I understand why you cannot use mutexes. If you lock the mutex each time you are accessing the vector then no matter what operation you are doing you are certain that only a single thread at a time is using it. There is certainly space for improvement depending on your needs for the safe vector, but mutexes should be perfectly viable.

lock mutex -> call std::sort or whatever you need -> unlock mutex

If on the other side what you want is to use std::sort on your class, then again it is a matter of providing thread-safe access and reading methods through the iterators of your container, as those are the ones that std::sort needs to use anyway in order to sort a vector, since it is not a friend of containers or anything of the sort.

0
Michael Kohne On

I believe that the only way you can make an STL container 100% thread safe is to wrap it in your own object (keeping the actual container private) and use appropriate locking (mutexes, whatever) in your object in order to prevent multi-thread access to the STL container.

This is the moral equivalent of just locking a mutex in the caller around every container operation.

In order to make the container truly thread safe, you'd have to muck about with the container code, which there's no provision for.

Edit: One more note - be careful about the interface you give to your wrapper object. You can't very well go handing out references to stored objects, as that would allow the caller to get around the locking of the wrapper. So you can't just duplicate vector's interface with mutexes and expect things to work.

0
Chromozon On

You can use simple mutexes to make your class thread safe. As stated in another answer, you need to use a mutex to lock the vector before use and then unlock after use.

CAUTION! All of the STL functions can throw exceptions. If you use simple mutexes, you will have a problem if any function throws because the mutex will not be released. To avoid this problem, wrap the mutex in a class that releases it in the destructor. This is a good programming practice to learn about: http://c2.com/cgi/wiki?ResourceAcquisitionIsInitialization