how to synchronize mysql database requests?

1.5k views Asked by At

I have a lot of entries in a table that are fetched for performing jobs. this is scaled to several servers.

when a server fetches a bunch of rows to add to its own job queue they should be "locked" so that no other server fetches them. when the update is performed a timestamp is increased and they are "unlocked".

i currently do this by updating a field that is called "jobserver" in the table that defaults to null with the id of the jobserver.

a job server only selects rows where the field is null.

when all rows are processed their timestamp is updated and finally the job field set to null again.

so i need to synchronize this:

$jobs = mysql_query("
SELECT itemId 
FROM items 
WHERE 
jobserver IS NULL 
AND 
DATE_ADD(updated_at, INTERVAL 1 DAY) < NOW()
LIMIT 100
");

mysql_query("UPDATE items SET jobserver = 'current_job_server' WHERE itemId IN (".join(',',mysql_fetch_assoc($jobs)).")");

// do the update process in foreach loop
// update updated_at for each item and set jobserver to null

every server executes the above in an infinite loop. if no fields are returned, everything is up 2 date (last update is not longer ago than 24 hours) and is sent to 10 minutes.

I currently have MyIsam and i would like to stay with it because it had far better performance than innodb in my case, but i heard that innodb has ACID transactions.

So i could execute the select and update as one. but how would that look and work?

the problem is that i cannot afford to lock the table or something because other processes neeed to read/write and cannot be locked.

I am also open to a higher level solution like a shared semaphore etc. the problem is the synchronization needs to be across several servers.

  • is the approach generally sane? would you do it differently?

  • how can i synchronize the job selectino to ensure that two servers dont update the same rows?

2

There are 2 answers

2
JohnC On BEST ANSWER

You can run the UPDATE first but with the WHERE and LIMIT that you had on the SELECT. You then SELECT the rows that have the jobserver field set to your server.

0
Ian Clelland On

If you can't afford to lock the tables, then I would make the update conditional on the row not being modified. Something like:

$timestamp = mysql_query("SELECT DATE_SUB(NOW(), INTERVAL 1 DAY)");

$jobs = mysql_query("
SELECT itemId 
FROM items 
WHERE 
jobserver IS NULL 
AND 
updated_at < ".$timestamp."
LIMIT 100
");

// Update only those which haven't been updated in the meantime
mysql_query("UPDATE items SET jobserver = 'current_job_server' WHERE itemId IN (".join(',',mysql_fetch_assoc($jobs)).") AND updated_at < ".$timestamp);

// Now get a list of jobs which were updated
$actual_jobs_to_do = mysql_query("
SELECT itemId
FROM items
WHERE jobserver = 'current_job_server'
");

// Continue processing, with the actual list of jobs

You could even combine the select and update queries, like this:

mysql_query("
UPDATE items
SET jobserver = 'current_job_server'
WHERE jobserver IS NULL
AND updated_at < ".$timestamp."
LIMIT 100
");