I'd like to clarify what to except from a single UPDATE SQL statement executed asynchronously with regard to other running transactions. This sample scenario assumes there is already a LogTable on the server:
+-- --------+-------------+
| some_text | in_progress |
+-----------+-------------+
| Q-001 | 2 |
| Q-002 | 2 |
| Q-003 | 2 |
| Q-004 | 2 |
| Q-005 | 2 |
+-----------+-------------+
LogTable is accessed by four independent participants. Three of them are log writers and one is log analyzer (in general there are many log writers and one analyzer):
--
-- Logger1 (Transaction 1)
--
-- Do some other unrelated work here
...
INSERT INTO LogTable VALUES ("A-000", "0");
INSERT INTO LogTable VALUES ("A-001", "0");
...
INSERT INTO LogTable VALUES ("A-999", "0");
-- Do some other unrelated work here
...
COMMIT;
--
-- Logger2 (Transaction 2)
--
-- Do some other unrelated work here
...
INSERT INTO LogTable VALUES ("B-000", "0");
INSERT INTO LogTable VALUES ("B-001", "0");
...
INSERT INTO LogTable VALUES ("B-999", "0");
-- Do some other unrelated work here
...
COMMIT;
--
-- Logger3 (Transaction 3)
--
-- Do some other unrelated work here
...
INSERT INTO LogTable VALUES ("C-000", "0");
INSERT INTO LogTable VALUES ("C-001", "0");
...
INSERT INTO LogTable VALUES ("C-999", "0");
-- Do some other unrelated work here
...
COMMIT;
--
-- Analyzer (Transaction 4)
--
-- Do some other unrelated work here
...
UPDATE LogTable SET in_progress=1;
-- Do some other unrelated work here
...
SELECT count(*) FROM LogTable WHERE in_progress=1;
COMMIT;
Log analyzer, because of its nature and analysis complexity needs to see the complete log sets as they are produced by log writers and their transactions. Analyzer shall not capture partial sets of logs (like for example only A-000 to A-157). It is all or nothing what is expected. Either all 1000 logs from "A-" series or nothing from "A-" series. Same should hold true for any other series too.
Would the simple UPDATE statement do the job here and mark consistent logs subset for further processing? In other words - is there any other result than 0, 1000, 2000 or 3000 possible for a analyzer in my sample? There are no further assumptions about execution order or duration of loggers and analyzer. All possible scenarios are on the table.
All loggers and analyzer code is running in transactions with Read Commited isolation level. My primary target here is Informix RDBMS, but I'd like to have this concept portable to any other decent database engine.