Transactional Database and Preventing Deadlocks

460 views Asked by At

I am in the process of developing cloud services in which synchronises local databases (an order taking POS system) to a central remote database. Our remote database is running the latest version of mysql 5.6 with innoDB table engine for each of the tables. Basically it acts as a transactional database that experiences a lot of transactions mainly writing to (ie. INSERT and UPDATE) and some times experiences reading when viewing reports etc. Inevitably we experienced our first database deadlock (please note this occurred prior to upgrading mysql 5.6), and my understanding of the cause of deadlock may of been where there are 2 connections trying to either READ or WRITE to a row at the same time. I also understand deadlocks are common and require correct code to try/catch deadlocks in which I believe I have managed to accomplish. In order to mitigate deadlocks I was thinking of setting up 2 databases which are a mirror of each other A) Database that is written to b) Database that is read

Basically the syncing from local to remote will be writing to database A, and using rsync in the background would mirror database A to B, which database B would be used to read and query the reports, since rsync doesn't use database connections to read the data for syncing.

My question is, would this structure work in terms of mitigating deadlocks and have any significant performance issues on the server?

I hope my question makes sense in what I am trying to achieve.

Thanks in advance.

3

There are 3 answers

3
rockcode On

Database replication can help in improving performance and availability. But, there can be various aspects you might want to review before starting to replicate databases.

I am assuming that you have turned on knobs to log slow queries. I would recommend to analyze and see if there are opportunities to optimize your problem causing queries. Another thing is, although it is not always easy to re-model your data, but it can help at times. Verify that your reporting queries are using the indexes as desired. You can look into data partitioning as well.

At some point, depending on your data volume and reporting requirements, you can very well think about using NoSQL databases for analytical data.

2
Zenilogix On

Deadlocks aren't a result of two connections trying to write to the same row at the same time. It's just a little bit more complex than that.

The sort of scenario which will produce a deadlock goes something like this:

There are two resources (could be two rows in a table, two tables, two files etc., which we will identify as "A" and "B").

  1. Connection #1 locks resource "A"
  2. Connection #2 locks resource "B"
  3. Connection #1 proceeds to attempt to lock "B" (while still holding its lock on "A"). Since connection #2 has "B" locked, connection #1 waits for the lock to be released
  4. Connection #2 (still holding its lock on "B") attempts to lock "A". Since connection #1 has "A" locked, connection #2 waits for that lock to be released.

Both connections are now in a state where each is waiting on the other to release its lock; they are deadlocked. The deadlock can only be broken by one connection aborting its action to release its lock, allowing the other to finish.

This sort of thing can happen at any level of granularity in a database from rows to pages to tables, and can happen with data or indexes. Leaving it to the database engine to do its own optimistic locking generally gives you best performance, but can result in deadlocks. You may need to do some of your inserts/updates in serializeable transations (although this will come at a performance penalty). Consider transactions which touch more than one table as potential participants in a deadlock scenario.

0
Rick James On

Give up on totally eliminating deadlocks. Focus on "mitigating" them.

Whenever practical, touch the tables and rows in some canonical order. A simple example is to sort the elements in an IN clause. I think that the following could cause a deadlock:

UPDATE ... WHERE id IN (3,7) -- in one connection
UPDATE ... WHERE id IN (7,3) -- in another connection

But if you sorted the ids in all transactions, the worst you would get is that one connection would wait for the other to release a row lock. (This is where innodb_lock_wait_timeout comes into play.)

Another example involves

BEGIN;
SELECT ... FROM table1  FOR UPDATE;
SELECT ... FROM table2  FOR UPDATE;
...
COMMIT;

Of course, there could be cases where you cannot predict which ids or tables are needed in a transaction, so you cannot do some simple sort. That's why you must be ready to catch deadlocks.

Simply replay the transaction when you do get a deadlock. OK, it may not be simple if you have other code interspersed in the the transaction.