If I schedule a worker to run every day on Heroku, how can I be sure it doesn't run twice or get skipped?

I'm a bit confused as to how Clockwork, Sidekiq, and Redis all fit together on Heroku given that Heroku restarts dynos at least once a day.

Say I have a worker that runs once a day, so it's configured in config/clock.rb as:

module Clockwork
  every(1.day, 'Resend confirmation') { ResendConfirmationWorker.perform_async }
end

In this worker, I get all the users who have created an account but haven't confirmed it within two days, and resend a confirmation email to each of them.

class ResendConfirmationWorker
  include Sidekiq::Worker
  sidekiq_options queue: :resend_confirmation, retry: false

  def perform
    d = Time.zone.now - 2.days
    users = User.where.not(confirmation_sent_at: nil)
                .where(confirmed_at: nil)
                .where(created_at: d.beginning_of_day..d.end_of_day)

    users.find_each do |user|
      user.send_confirmation_instructions
    end
  end
end

Let's say someone signs up on Monday, this job runs on Wednesday, finds them, and sends them a second confirmation email. Then the dyno gets restarted for whatever reason, and the job runs again. They'll get yet another email. Or alternatively, if the restart happens the moment before the job needs to run, then they won't get anything.

How can Clockwork keep track of intervals of 24 hours or longer, given that its “lifespan” in a Heroku dyno is shorter than that? Is there a way to manage this limitation without constantly saving this sort of state to the database?

There are 2 answers

spickermann

IMO you need more information in the database if you want to avoid such issues. A state machine might help, or an explicit second_confirmation_send_at column that records when the second email was sent.

That would allow you to write the query in your job like this:

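# Unconfirmed users whose first email is over two days old
# and who have not yet been sent a second one.
users = User.where('confirmation_sent_at < ?', 2.days.ago)
            .where(confirmed_at: nil)
            .where(second_confirmation_send_at: nil)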

Then it no longer matters whether the job runs several times a day or, by accident, a day late, as long as the job sets second_confirmation_send_at for each user right after sending the email.
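
To make the idea concrete, here is a minimal plain-Ruby simulation of such a job, so it runs without Rails. The User struct and the resend_due and run_job helpers are hypothetical stand-ins for the model, the query, and the worker's perform method. It assumes the job stamps second_confirmation_send_at right after sending, and that already-confirmed users are skipped, as in the question's query.

```ruby
# Plain-Ruby simulation of an idempotent resend job; no Rails required.
# User, resend_due, and run_job are hypothetical names for illustration.
User = Struct.new(:email, :confirmation_sent_at, :confirmed_at,
                  :second_confirmation_send_at, keyword_init: true)

DAY = 24 * 60 * 60 # seconds

# Stand-in for the ActiveRecord query: unconfirmed users whose first email
# is more than two days old and who have not yet received a second one.
def resend_due(users, now)
  users.select do |u|
    u.confirmed_at.nil? &&
      u.second_confirmation_send_at.nil? &&
      !u.confirmation_sent_at.nil? &&
      u.confirmation_sent_at < now - 2 * DAY
  end
end

# Stand-in for the worker's perform: "send" the email, then stamp the user
# so that later runs skip them. Returns the addresses mailed in this run.
def run_job(users, now)
  resend_due(users, now).map do |u|
    # In the real app this is where u.send_confirmation_instructions goes.
    u.second_confirmation_send_at = now
    u.email
  end
end

now = Time.utc(2024, 1, 10, 9, 0, 0)
users = [
  User.new(email: 'old@example.com', confirmation_sent_at: now - 3 * DAY),
  User.new(email: 'new@example.com', confirmation_sent_at: now - 3600),
  User.new(email: 'done@example.com', confirmation_sent_at: now - 3 * DAY,
           confirmed_at: now - DAY)
]

first_run  = run_job(users, now)           # the normal daily run
second_run = run_job(users, now + 60)      # duplicate run after a restart
late_run   = run_job(users, now + 2 * DAY) # a run that arrives late
```

In this sketch the duplicate run mails nobody, and the late run still picks up the user who became eligible in the meantime. In the real worker the stamp would be something like user.update!(second_confirmation_send_at: Time.current) after the send. The send and the stamp are not atomic, so a crash between them could still cause one extra email.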

Alexander Luna

If you know that you will execute it every Wednesday, I suggest using Heroku Scheduler (https://devcenter.heroku.com/articles/scheduler). It lets you run a specific command at a set interval (every 10 minutes, every hour, or every day), with less complexity than running your own clock process.
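
Since Scheduler's longest interval is daily, a weekly job is usually run every day with a day-of-week check inside the command. A minimal sketch of that guard follows. The resend_day? and run_if_scheduled helpers are hypothetical, and the block stands in for enqueueing the worker.

```ruby
require 'date'

# Hypothetical helper: Scheduler invokes the command every day, and the
# command itself decides whether today is a day it should do any work.
def resend_day?(date)
  date.wednesday?
end

# Hypothetical entry point for the scheduled command. The block stands in
# for enqueueing the worker (ResendConfirmationWorker.perform_async).
def run_if_scheduled(date)
  return :skipped unless resend_day?(date)

  yield
  :ran
end

calls = []
wed_result = run_if_scheduled(Date.new(2024, 1, 10)) { calls << :enqueued } # a Wednesday
thu_result = run_if_scheduled(Date.new(2024, 1, 11)) { calls << :enqueued } # a Thursday
```

In the app, the scheduled command would call something like run_if_scheduled(Date.today) { ResendConfirmationWorker.perform_async }. A guard like this only controls which days the job does work. It does not by itself prevent a duplicate or missed run.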