Is there a concurrency problem here? How to test it during development?


Scenario: There are 'n' teams, each working on its own virtual 'wall' (like Facebook's wall). Each team sees only its own wall and the posts on it. A post can be edited by its author or, if so configured, by another team member. Assume it is configured that way, since it's a must-have.

Design/technology decisions: RESTful web app using Restlet + Glassfish/Java + MySQL (EDIT: Using Apache DBUtils for DB access. No ORM; it seemed like overkill)

Question: Multiple teams (say T1, T2 and T3) log on, each with some number of members. There is concurrent data access within a team, but not across teams, i.e., different teams access disjoint data sets. To optimize frequent reads/writes from/to the DB, we are considering a TeamGateway that controls access to the DB and handles concurrency. The web server would cache the data retrieved by the teams to speed up reads (and also to help update the list of wall posts).

  • Q1: Is this (TableGateway per team + cache) even required? If not, how do you suggest it be handled?
  • Q2: If so, does the TableGateway (for each team) need to be coded as thread-safe (synchronized methods)? Let's say we have a class/registry TableGatewayFinder with a static method that returns the TableGateway to use for that particular team (using a hashmap).

If 6 people from each of T1 to T3 log on, would ONLY 3 TableGateways be created? Would that help catch concurrent writes (a simple timestamp comparison before committing, or a "conflict-flagged" append) and effectively manage the caching? We plan on having identity maps for the entities: there are 4-5 different entities that need to be tracked, 4 in a composition hierarchy and another one associated with each of the 4.
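For concreteness, here is a minimal sketch of the registry described above, assuming team IDs are plain strings. `TableGateway` is reduced to a hypothetical stub, and `forTeam` and `gatewayCount` are illustrative names, not part of any existing code. Using `ConcurrentHashMap.computeIfAbsent` means each team gets exactly one gateway even when several members ask for it at the same moment:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical stub: a real gateway would wrap the DB access for one team.
final class TableGateway {
    private final String teamId;

    TableGateway(String teamId) {
        this.teamId = teamId;
    }

    String teamId() {
        return teamId;
    }
}

// Registry with a static lookup method, as described in Q2.
final class TableGatewayFinder {
    private static final ConcurrentMap<String, TableGateway> GATEWAYS =
            new ConcurrentHashMap<>();

    private TableGatewayFinder() {
    }

    // computeIfAbsent is atomic per key: concurrent callers for the same
    // team all receive the same gateway instance.
    static TableGateway forTeam(String teamId) {
        return GATEWAYS.computeIfAbsent(teamId, TableGateway::new);
    }

    static int gatewayCount() {
        return GATEWAYS.size();
    }
}
```

With 18 lookups spread over T1 to T3, this registry ends up holding 3 gateways. Note that this only makes the lookup safe; whether the gateway's own methods need synchronization is a separate question.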

How would one unit test the gateway (TDD based or after the fact)?

Thanks in advance!


There are 3 answers

1
Enno Shioji On BEST ANSWER

If you just write to the DB, or to a cache solution on top of the DB (e.g. Spring+Hibernate+EhCache), you don't need to worry about corrupting your tables. In other words, there is no concurrency issue from a low-level point of view.

If you want to write a cache yourself and deal with the concurrency issues yourself, that will take some effort. If you shard your cache, have a "global lock" (i.e. synchronize on a common mutex) per partition, and acquire this lock for every access, then that would work, although it's not the most performant way to do it. Doing anything other than a global lock would involve quite a lot of work.
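A minimal sketch of that per-partition locking idea, assuming an in-memory cache keyed by arbitrary objects. The class name `PartitionedCache` and its methods are hypothetical; each partition is a plain `HashMap` whose every access goes through that partition's own monitor:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sharded cache: one lock per partition, taken for every access.
final class PartitionedCache<K, V> {

    // Each partition object doubles as the mutex guarding its own map.
    private static final class Partition<K, V> {
        private final Map<K, V> entries = new HashMap<>();
    }

    private final List<Partition<K, V>> partitions;

    PartitionedCache(int partitionCount) {
        if (partitionCount <= 0) {
            throw new IllegalArgumentException("partitionCount must be positive");
        }
        List<Partition<K, V>> list = new ArrayList<>(partitionCount);
        for (int i = 0; i < partitionCount; i++) {
            list.add(new Partition<>());
        }
        this.partitions = list;
    }

    // floorMod keeps the index non-negative even for negative hash codes.
    private Partition<K, V> partitionFor(K key) {
        return partitions.get(Math.floorMod(key.hashCode(), partitions.size()));
    }

    V get(K key) {
        Partition<K, V> p = partitionFor(key);
        synchronized (p) {
            return p.entries.get(key);
        }
    }

    void put(K key, V value) {
        Partition<K, V> p = partitionFor(key);
        synchronized (p) {
            p.entries.put(key, value);
        }
    }

    V remove(K key) {
        Partition<K, V> p = partitionFor(key);
        synchronized (p) {
            return p.entries.remove(key);
        }
    }
}
```

Threads touching different partitions never block each other, while threads touching the same partition are fully serialized. That is simple to reason about, which is the main point of the "global lock per partition" approach.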

This is a minor point, but I'm not sure why you'd want to use an identity hash map. I can't think of any particular reason to do that (if you are thinking about performance, the performance of a normal hash map is the last thing you need to worry about in this situation!).

If your entities are articles, then you probably have another form of concurrency issue, like the one solved by version control software such as SVN or Mercurial. If you don't build merging capability into your app, it becomes an annoyance when you edit somebody's article only to find that somebody else has "committed" another edit before you. Whether you need to add such a capability depends on the use case.
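One common way to at least detect such conflicting edits is an optimistic version check. The sketch below is an in-memory illustration only, with hypothetical names (`VersionedPost`, `WallPostStore`). With a database, the same check is usually expressed as a conditional update along the lines of `UPDATE post SET body = ?, version = version + 1 WHERE id = ? AND version = ?` (table and column names assumed), treating "0 rows affected" as a conflict:

```java
import java.util.concurrent.ConcurrentHashMap;

// Immutable snapshot of a post together with the version it was saved as.
final class VersionedPost {
    final String text;
    final long version;

    VersionedPost(String text, long version) {
        this.text = text;
        this.version = version;
    }
}

// Hypothetical in-memory store illustrating optimistic conflict detection.
final class WallPostStore {
    private final ConcurrentHashMap<Long, VersionedPost> posts = new ConcurrentHashMap<>();

    VersionedPost create(long id, String text) {
        VersionedPost post = new VersionedPost(text, 1);
        posts.put(id, post);
        return post;
    }

    VersionedPost read(long id) {
        return posts.get(id);
    }

    // Returns true if the edit was saved, false if the post is missing or
    // somebody else saved a newer version since expectedVersion was read.
    boolean update(long id, long expectedVersion, String newText) {
        VersionedPost current = posts.get(id);
        if (current == null || current.version != expectedVersion) {
            return false;
        }
        // replace() succeeds only if the entry is still the snapshot we checked.
        return posts.replace(id, current, new VersionedPost(newText, expectedVersion + 1));
    }
}
```

If two members read version 1 and both try to save, the first `update` succeeds and the second returns `false`. The app can then show the newer text and ask the second editor to merge or retry, instead of silently overwriting.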

As for testing your app for concurrency, unit testing is not a bad option. Concurrent unit tests make it much easier to catch concurrency bugs. Writing them is very tough, though, so I recommend that you go through a good book like "Java Concurrency in Practice" before writing them. It's better to catch your concurrency bugs before integration testing, when it becomes hard to guess what the hell is going on!
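As a rough illustration of what a concurrent unit test can look like, the helper below starts several threads, holds them at a latch so they all begin at the same moment, and reports how many of them threw. `ConcurrentTestHarness` and `runConcurrently` are hypothetical names; only standard `java.util.concurrent` classes are used:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical test helper: runs the same task on many threads at once.
final class ConcurrentTestHarness {

    private ConcurrentTestHarness() {
    }

    // Returns the number of threads whose task threw an exception or error.
    static int runConcurrently(int threads, Runnable task) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch ready = new CountDownLatch(threads); // all workers started
        CountDownLatch start = new CountDownLatch(1);       // the starting gun
        CountDownLatch done = new CountDownLatch(threads);  // all workers finished
        AtomicInteger failures = new AtomicInteger();
        try {
            for (int i = 0; i < threads; i++) {
                pool.execute(() -> {
                    ready.countDown();
                    try {
                        start.await(); // wait so that all threads contend together
                        task.run();
                    } catch (Throwable t) {
                        failures.incrementAndGet();
                    } finally {
                        done.countDown();
                    }
                });
            }
            ready.await();
            start.countDown();
            if (!done.await(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
            return failures.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }
}
```

A test would call something like `runConcurrently(8, () -> gateway.save(post))` (with `gateway` and `post` being whatever is under test) and then assert on the final state. A passing run does not prove the absence of races, since interleavings vary from run to run, so such tests are usually repeated many times.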

UPDATE:
@Nupul: That's a difficult question to answer. However, if you just have 18 humans typing stuff, my bet is that writing to the DB every time would be just fine.

If you store state only in the DB, you should get rid of any unnecessary mutexes (and IMO you should not store state anywhere other than the DB unless you have a very good reason to do so in your situation).

It's easy to make a mistake and hold a mutex while doing something slow like a network operation, which causes severe usability issues (e.g. the app does not respond for many seconds). It's also easy to introduce nasty concurrency bugs such as thread deadlocks.

So my recommendation would be to keep your app stateless and just write to the DB every time. Should you find any performance issues due to DB access, turning to a cache solution like EhCache would be the best bet.

Unless you want to learn from the project or have to deliver an app with extreme performance requirements, I don't think writing your own cache layer is justified.

1
Gnat On

Unit testing might not be the best approach for concurrency issues. Instead, you could try a web-based performance tool such as JMeter or Rational Performance Tester to test how the app performs, and to check that the wall contents stay valid, as you ramp up the number of users. These tools let you give each user different posting behaviour.

0
Stephen C On

Focussing on the 2 questions in the title.

Is there a concurrency problem here?

Yes. Obviously. The only way to avoid the possibility of concurrency problems would be to make the product single-threaded, and that would have major performance and usability issues.

How to test it during development?

That's a hard question.

Obviously you need unit tests. But classic unit tests don't really cut it for concurrency issues, because tricky concurrency bugs tend to show up rarely.

A better approach is load testing, as described by @Gnat.

But for best results, you need to acknowledge that testing is not the (whole) solution, and add the following:

  • Staff who are experienced in designing, coding and testing concurrent apps.
  • Paying a lot of attention to the approach / recommendations of "Java Concurrency in Practice" by Goetz et al.
  • Lots of design and code reviews.
  • Thorough use of static code analysers to pick up potential concurrency problems.