Ok so here's the schema, which is pretty self-explanatory:
STORE(storeID, name, city)
PRODUCT(productID, name, brand)
PRODUCT_FOR_SALE(productID, storeID, price)
I have 2 transactions: T1 and T2.
T1
raises by 5% the price for any product sold in any store in 'London'.
T2
lowers by 10% the price for any product whose cost is >= $1050
What I am asked is to tell what kind of concurrency anomaly they may result in, and what isolation level I should apply to which transaction to make it safe.
The code for the transactions is not given, but I suppose it would be something on the lines of:
# T1:
BEGIN;
UPDATE product_for_sale
SET price = price + ((price/100) *5)
WHERE storeID IN (SELECT storeID FROM store WHERE city='London')
COMMIT;
# T2:
BEGIN;
UPDATE product_for_sale
SET price = price - (price/10)
WHERE price >= 1050
COMMIT;
My "guess" to what might happen with READ COMMITTED
(default) is:
Considering a product P, sold in 'London' for $1049
- both transactions begin
- they both consider their row sets: T1 will consider all products sold in London (which includes P), T2 will consider products whose price is $1050 or more (which excludes P)
- T1 commits and sets the price of P to $1101 but, since P wasn't in T2's row set to begin with, the change goes unnoticed, and T2 commits without considering it
Which, if I'm not messing up definitions, should be a case of phantom read, which would be fixed if I set T2 to ISOLATION LEVEL REPEATABLE READ
First, it is not quite clear what you mean with a concurrency issue. It could be:
something that could conceivably be a problem, but is handled by PostgreSQL automatically so that no problems arise.
something that can generate an unexpected error or an undesirable result.
For 1., this is handled by locks that serialize transactions that are trying to modify the same rows concurrently.
I assume you are more interested in 2.
What you describe can happen, but it is not a concurrency problem. It just means that T1 logically takes place before T2. This will work just fine on all isolation levels.
I might be missing something, but the only potential problem I see here is a deadlock between the two statements:
They both can update several rows, so it could happen that one of them updates row X first and then tries to update row Y, which has already been updated by the other statement. The first statement then is blocked. Now the second statement wants to update row Y and is blocked too.
Such a deadlock is broken by the deadlock resolver after one second by killing off one of the statements with an error.
Note that deadlocks are not real problems either, all your code has to do is to repeat the failed transaction.