I want to design a job scheduler cluster, which contains several hosts to do cron job scheduling. For example, a job which needs run every 5 minutes
is submitted to the cluster, the cluster should point out which host to fire next run, making sure:
- Disaster tolerance: if not all of the hosts are down, the job should be fired successfully.
- Validity: only one host to fire next job run.
Due to disaster tolerance, job cannot bind to a specific host. One way is all the hosts polling a DB table(certainly with lock), this guaranteed only one host gets the next job run. Since it often locks table, is there any better design?
I did require something like this long ago, when synchronisation was done with floppy disks. You should be clear about three things, which seem to be simple, but in distributed environment the arent :-)
"Synchronisation Sections" If you get a net split, which means your cluster is split in two seperate sections wich can communicate inside the sections, but not between the two sections, the "fire the job exactly once" can only acquired per synchronisation section.
"Disaster" If almost all times all computers are up and running and only very seldom one fails, and the failure of two is almost unthinkable, its a completely different thing, than every host is running only part time, the connections are unstable, or the synchronisation is done by dial-up connections or by floppys. If you want even deal with a net split, it becomes really really complicated. If you want to deal with malicious hosts, you have another Problem.
"Validity" Fire every job exactly once... you have to synchronize faster than the job firing interval.
edit: Tipp for scheduler-tasks design. I have a big text file, wich contains lines. Every line is a job task, starting with job-type, then time to execute, then command and last but not least a optional resubmission-interval for repeating tasks. Syncing means merging. Executed tasks are deleted. If resubmission is on, then a new task is inserted or appended.
In an ideal world, every host ist allways connected to the others, I would implement something like a token ring. If there is no master, one is selected by the hosts, and the master is expected to schedule everything until he is not sending heardbeats for some time. If there are two masters, they negotiate for one of them to become master(maybe lower MAC-Adress... whatever).
If you have to deal with malicious hosts, you can use some byzantine gerenals-problem solution. The selection of the master is allready pretty good proofed against malicious hosts. With a little bit of rsa-krypto the selected master can signature every command, resend attacks can be treated with timestamps or growing indices... voila.
only as a story from an onld programmer, not intended for today everything is allways connected to the internet world: My big problem about 20 years ago was, that the hosts were synchronized from once a hour and once a day to once a week or once a month. So the solution was to have different commands: 1. execute on every host at a given date (wich is far enough in the future for synchronisation) 2. execute on a host, where "whoami" contains a certain substring. 3. execute on a random host with little probability, and send an acknowledgement to all others, that it is allready executed.
The third command-type does something like "fire only once", if the synchronisation is much faster than the probability of execution. It needs no master-slave architecture and it works pretty well, if you know the synchronisation intervalls.