Lock-free Shared Memory in C++ for Variable length Records

2k views Asked by At

I am newbee to IPC. The Writer process writes the data into shared memory, Many reader processes reads the data. The data to be written have a unique identifier, has to be be indexed by unique key for the faster access( like STL::map or hashmap for lookup). Also data is a varible length record ( XML ) ( average lenght is 200-250 bytes ). OS is solaris 10 (i86pc) on Intel Xeon Quad Core Server.

Total data size is more than 200G. But we will be keeping only latest data in the shared memory. Historical data resides in file. shared memory size would be around 4G~6G.

No external library avaiable like Boost::interprocess

I have couple of questions, may be many

  1. Which is more efficient : shared_memory or mmap (Memory Mapped Files)
  2. How to build indexes for variable length record. [i have no idea, May be some hashing?].
  3. Would this be neat if XML is converted into fixed size structure ( Trade off - size of the structure will be huge, nearly 300+ possible fields )
  4. Can we place any STL in the shared_memory by providing custom allocator .?
  5. Is it possible to implement without semaphores (lockless implementation using CAS ).

Thanks

How about this.

|--------------------------|
| start_id   |   end_id    |   ->  range of msg id present in the segment
|--------------------------|
| id1 | start_mem | length |   ->
|--------------------------|   ->
| id2 | start_mem | length |   -> table of index for the actual data
|--------------------------|   ->
| id3 | start_mem | length |   ->
|--------------------------|   ->
| id4 | start_mem | length |   ->
|--------------------------|   ->
|                          |
|                          |
|                          |
|       data segment       |
|       varibale length    |
|       xml are stored     |
|                          |
|                          |
|--------------------------|

When the new data arrives and segment is full. oldest data is erased in a circular fashion. There can be a possibility of more than 1 record need to erased.

2

There are 2 answers

1
peeyush On

Is it possible to implement without semaphores (lockless implementation using CAS ).

Since lockless ( lock-free ) implementations are hard to design and we might end up in chaos, before going for lock-free solution you should consider following aspects and alternatives:

  • If there are many threads in system, so that scheduler is likely to preempt the thread which is holding the lock, consequently all other thread will be waiting for lock ( if not than lock-free is not going to give significant improvement ).
  • If this can be solvable using readers-writers lock. ( writers are significantly less than readers ).
  • If lock contention is less likely then you might consider spin-locks, which will give you as equal as lock-free performance.
3
bdonlan On

The easiest solution, if you require complex indexing and other such things, you should really consider a service-oriented architecture instead of shared memory. Simply designate one process to be your master cache process, and have it accept local connections (over unix domain sockets, or TCP sockets, or whatever) from other processes that need the data. This makes things much, much simpler.

If you don't choose this route, be warned that shared memory is hard. What you ask is definitely doable in shared memory - you can create a heap allocator in this shmem chunk, etc etc. STL allocators might work, but don't expect any third-party libraries to be happy with STL allocators using custom pointer types. You will need locks (if you're clever you might be able to avoid them in some cases, but not all, and in this case definitely kiss STL goodbye), and you will have to rebuild everything that you usually take for granted.

Again, I highly recommend you start with a simple cache daemon. This scales just fine most of the time, it just adds a bit more latency.