I have 5+ simultaneously processes selecting rows from the same mysql table. Each process SELECTS 100 rows, PROCESS IT and DELETES the selected rows.
But I'm getting the same row selected and processed 2 times or more.
How can I avoid it from happening on MYSQL side or Ruby on Rails side?
The app is built on Ruby On Rails...
Eric's answer is good, but I think I should elaborate a little...
You have some additional columns in your table say:
Default them all to NULL.
Then you have the worker processes "claim" the rows by doing:
Then you process the claimed rows with SELECT ... WHERE lockhost='myhostname' and lockpid=12345
After you finish processing a row, you make whatever updates are necessary, and set lockhost, lockpid and locktime back to NULL (or delete it).
This stops the same row being processed by more than one process at once. You need the hostname, because you might have several hosts doing processing.
If a process crashes while it is processing a batch, you can check if the "locktime" column is very old (much older than processing can possibly take, say several hours). Then you can just reclaim some rows which have an old "locktime" even though their lockhost is not null.
This is a pretty common "queue pattern" in databases; it is not extremely efficient. If you have a very high rate of items entering / leaving the queue, consider using a proper queue server instead.