How (in code) can I prevent two people starting the same crowdsourcing task at once?


I'm trying to build a Django app for a translation crowdsourcing task.

For each task in the database, I have an is_completed boolean flag that is set when the user completes the task. I also have a 'give me a random task' button, which chooses from the list of uncompleted tasks.

My question is this. How do I prevent two users being given the same task, if one user clicks the button shortly after another?

I was thinking of setting a has_started flag on the row when a task is loaded, and excluding started tasks from the list of random available tasks. But what if the user starts a task and then closes the page without finishing it? The flag would never get unset, and I'd end up with a lot of tasks stuck as started but never finished.

Could I flag this in a cleverer way with session variables that expire, perhaps? But I know it's hard to capture the 'user closes page' event reliably in JavaScript.

Thanks!


There are 4 answers

slifty On BEST ANSWER

Instead of making has_started a flag, you could make it a timestamp and decide on a reasonable amount of time for task completion (which will allow you to assume that a task has been dropped after X minutes).

There is a risk that this will result in multiple translations of the same thing (i.e. if someone is really really slow and the job is recirculated early), but I think it will cover most cases.
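As a rough sketch of the timestamp idea: the snippet below treats a task as available when it is unfinished and was either never started or started longer ago than a timeout. The names `started_at`, `TASK_TIMEOUT`, `is_available` and `pick_random_task`, the 30-minute value, and the use of plain dicts instead of Django model instances are all assumptions made for illustration.

```python
import random
from datetime import datetime, timedelta

# Assumed value: how long a started task is reserved before it is recirculated.
TASK_TIMEOUT = timedelta(minutes=30)


def is_available(task, now):
    """True if the task is unfinished and either never started or timed out."""
    if task["is_completed"]:
        return False
    started_at = task["started_at"]
    return started_at is None or now - started_at > TASK_TIMEOUT


def pick_random_task(tasks, now):
    """Pick a random available task and stamp it as started; None if none left."""
    candidates = [t for t in tasks if is_available(t, now)]
    if not candidates:
        return None
    task = random.choice(candidates)
    task["started_at"] = now  # a timestamp instead of a boolean flag
    return task
```

This only shows the availability rule. With a real database, the "check and stamp" step would still need to happen atomically (for example in a single conditional UPDATE), otherwise two simultaneous requests can pick the same row.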

teuneboon On

I would use locking. Add a lock_time field to the task table and set it to the current time as soon as a user starts a task. Then have JavaScript on the page refresh lock_time at a regular interval, say every 10 seconds. If a task's lock_time is more than 30 seconds old, the user has presumably gone away and you can "break" the lock.
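The server side of that heartbeat lock could look something like this. It is only a sketch: the `locked_by` field is an assumption (the answer only mentions lock_time, but you need some way to tell whose lock it is), the function names are made up, and a dict stands in for the database row.

```python
from datetime import datetime, timedelta

# From the answer: a lock older than 30 seconds is considered broken.
LOCK_EXPIRY = timedelta(seconds=30)


def lock_is_stale(lock_time, now):
    """True if there is no lock or the last heartbeat is too old."""
    return lock_time is None or now - lock_time > LOCK_EXPIRY


def try_lock(task, user_id, now):
    """Take the lock if the task is free or its lock has gone stale."""
    if task["locked_by"] is not None and not lock_is_stale(task["lock_time"], now):
        return False
    task["locked_by"] = user_id
    task["lock_time"] = now
    return True


def heartbeat(task, user_id, now):
    """Called by the periodic JavaScript request; refreshes the caller's lock."""
    if task["locked_by"] != user_id:
        return False  # the lock was broken and taken over by someone else
    task["lock_time"] = now
    return True
```

A failed `heartbeat` is the signal to tell the user that the task has been handed to someone else.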

John La Rooy On

You'll have to use a timeout. There are no javascript events for "user spills coffee on computer" or "user does a hard reset" etc.

GolezTrol On

I think you'd best set the userid and the startdate on start.

When you update a database like this --

UPDATE task t 
SET t.userid = :USERID, t.lastprogress = sysdate() 
WHERE t.userid is null and t.taskid = :TASKID

-- you will notice 0 modified records when a task is already assigned to a user. This addresses your first problem.
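To see the "0 modified records" behaviour in action, here is a small self-contained demo using Python's built-in sqlite3 module rather than Django and MySQL, so the SQL is adapted slightly (CURRENT_TIMESTAMP instead of sysdate()). The `claim_task` helper and the table layout are assumptions for the example. In Django, something like `Task.objects.filter(pk=task_id, userid__isnull=True).update(...)` should give you the same information, since `update()` returns the number of rows matched.

```python
import sqlite3


def claim_task(conn, task_id, user_id):
    """Assign the task to user_id only if nobody holds it yet.

    The check and the assignment happen in one UPDATE statement, so two
    concurrent callers cannot both succeed. Returns True if we got the task.
    """
    cur = conn.execute(
        "UPDATE task SET userid = ?, lastprogress = CURRENT_TIMESTAMP "
        "WHERE userid IS NULL AND taskid = ?",
        (user_id, task_id),
    )
    conn.commit()
    return cur.rowcount == 1  # 0 rows modified means someone else has it


# Demo with an in-memory database and a single unassigned task.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE task (taskid INTEGER PRIMARY KEY, userid INTEGER, lastprogress TEXT)"
)
conn.execute("INSERT INTO task (taskid) VALUES (1)")
conn.commit()

first = claim_task(conn, 1, 10)   # user 10 clicks the button first
second = claim_task(conn, 1, 20)  # user 20 clicks shortly after
```

When a claim fails, the view can simply pick another random task and try again.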

Then, if you also save a last-modified date, you can run a cron job that releases abandoned tasks, meaning tasks that haven't been modified for a certain period of time. That is a separate problem, though: it's hard to find the right threshold, since you can declare a task abandoned too early or too late.

If every modification also updates this date, a user can even work on a task for a longer time, without it being stolen by someone else, as long as they do regular saves. Also, when saving the modification data (you can write a routine to do that), you can check if the userid still matches. If the userid of the task is NULL (cron decided 'abandoned') or another userid (abandoned task picked up by someone else), you can raise an error to tell the user that the task no longer belongs to them.
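A sketch of both pieces, the cleanup and the ownership check on save, again with sqlite3 so it runs on its own. The 30-minute threshold, the `translation` and `is_completed` columns, storing `lastprogress` as seconds, and the names `release_abandoned`, `save_progress` and `TaskNoLongerYours` are all assumptions for the example.

```python
import sqlite3

# Assumed threshold: seconds without a save before a task counts as abandoned.
ABANDON_AFTER = 30 * 60


class TaskNoLongerYours(Exception):
    """Raised when a save arrives for a task the user no longer holds."""


def release_abandoned(conn, now):
    """What the cron job would run: unassign tasks with no recent progress."""
    cur = conn.execute(
        "UPDATE task SET userid = NULL "
        "WHERE userid IS NOT NULL AND is_completed = 0 AND lastprogress < ?",
        (now - ABANDON_AFTER,),
    )
    conn.commit()
    return cur.rowcount  # number of tasks put back in the pool


def save_progress(conn, task_id, user_id, text, now):
    """Save work and refresh lastprogress, but only if user_id still owns the task."""
    cur = conn.execute(
        "UPDATE task SET translation = ?, lastprogress = ? "
        "WHERE taskid = ? AND userid = ?",
        (text, now, task_id, user_id),
    )
    conn.commit()
    if cur.rowcount == 0:
        # userid is NULL (released by cron) or belongs to another user
        raise TaskNoLongerYours(f"task {task_id} is not assigned to user {user_id}")
```

Because the ownership test is part of the UPDATE's WHERE clause, a late save can never overwrite the work of whoever picked the task up afterwards.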