Mailman saves Mails multiple times

1.1k views Asked by At

I've got the mailman gem integrated into my rails project. It fetches emails from gmail successfully. In my app there is a model Message for my emails. The emails are properly saved as Message model.

The problem is that the emails are saved multiple times sometimes and I can't recognize a pattern. Some emails are saved once, some two times and some are saved three times.

But I can't find the failure in my code.

Here is my mailman_server script:

script/mailman_server

#!/usr/bin/env ruby
# encoding: UTF-8
require "rubygems"
require "bundler/setup"
require File.expand_path(File.join(File.dirname(__FILE__), '..', 'config', 'environment'))
require 'mailman'

Mailman.config.ignore_stdin = true
#Mailman.config.logger = Logger.new File.expand_path("../../log/mailman_#{Rails.env}.log", __FILE__)

if Rails.env == 'test'
  Mailman.config.maildir = File.expand_path("../../tmp/test_maildir", __FILE__)
else
  Mailman.config.logger = Logger.new File.expand_path("../../log/mailman_#{Rails.env}.log", __FILE__)
  Mailman.config.poll_interval = 15
  Mailman.config.imap = {
    server: 'imap.gmail.com',
    port: 993,  # usually 995, 993 for gmail
    ssl: true,
    username: '[email protected]',
    password: 'my_password'
  }
end

Mailman::Application.run do
  default do
    begin
      Message.receive_message(message)
    rescue Exception => e
      Mailman.logger.error "Exception occurred while receiving message:\n#{message}"
      Mailman.logger.error [e, *e.backtrace].join("\n")
    end
  end
end

The email is processed inside my Message class:

  def self.receive_message(message)
    if message.from.first == "[email protected]"
      Message.save_bcc_mail(message)
    else
      Message.save_incoming_mail(message)
    end
  end

  def self.save_incoming_mail(message)
    part_to_use = message.html_part || message.text_part || message
    if Kontakt.where(:email => message.from.first).empty?
      encoding = part_to_use.content_type_parameters['charset']
      Message.create topic: message.subject, message: part_to_use.body.decoded.force_encoding(encoding).encode('UTF-8'), communication_partner: message.from.first, inbound: true, time: message.date
    else
      encoding = part_to_use.content_type_parameters['charset']
      Message.create topic: message.subject, message: part_to_use.body.decoded.force_encoding(encoding).encode('UTF-8'), communication_partner: message.from.first, inbound: true, time: message.date, messageable_type: 'Company', messageable_id: Kontakt.where(:email => message.from.first).first.year.id
    end
  end

  def self.save_bcc_mail(message)
    part_to_use = message.html_part || message.text_part || message
    if Kontakt.where(:email => message.to.first).empty?
      encoding = part_to_use.content_type_parameters['charset']
      Message.create topic: message.subject, message: part_to_use.body.decoded.force_encoding(encoding).encode('UTF-8'), communication_partner: message.to.first, inbound: false, time: message.date
    else
      encoding = part_to_use.content_type_parameters['charset']
      Message.create topic: message.subject, message: part_to_use.body.decoded.force_encoding(encoding).encode('UTF-8'), communication_partner: message.to.first, inbound: false, time: message.date, messageable_type: 'Company', messageable_id: Kontakt.where(:email => message.to.first).first.year.id
    end
  end

I have daemonized the mailman_server with this script:

script/mailman_daemon

#!/usr/bin/env ruby

require 'rubygems'  
require "bundler/setup"  
require 'daemons'

Daemons.run('script/mailman_server') 

I deploy with capistrano.

This are the parts which are responsible for stopping, starting and restarting my mailman_server:

script/deploy.rb

set :rails_env, "production" #added for delayed job  
after "deploy:stop",    "delayed_job:stop"
after "deploy:start",   "delayed_job:start"
after "deploy:restart", "delayed_job:restart"
after "deploy:stop",    "mailman:stop"
after "deploy:start",   "mailman:start"
after "deploy:restart", "mailman:restart"

namespace :deploy do
  desc "mailman script ausfuehrbar machen"
  task :mailman_executable, :roles => :app do
   run "chmod +x #{current_path}/script/mailman_server"
  end

  desc "mailman daemon ausfuehrbar machen"
  task :mailman_daemon_executable, :roles => :app do
   run "chmod +x #{current_path}/script/mailman_daemon"
  end
end

namespace :mailman do  
  desc "Mailman::Start"
  task :start, :roles => [:app] do
   run "cd #{current_path};RAILS_ENV=#{fetch(:rails_env)} bundle exec script/mailman_daemon start"
  end

  desc "Mailman::Stop"
  task :stop, :roles => [:app] do
   run "cd #{current_path};RAILS_ENV=#{fetch(:rails_env)} bundle exec script/mailman_daemon stop"
  end

  desc "Mailman::Restart"
  task :restart, :roles => [:app] do
   mailman.stop
   mailman.start
  end
end

Could it be that multiple instances of the mailman server are started during my deploy at nearly the same time and then each instance polls nearly at the same time? The second and third instance pools before the first instance marks the email as read and polls and processes the email as well?

Update 30.01.

I had set the polling intervall to 60 seconds. but that changes nothing.

I checked the folder where the mailman pid file is stored. there is only one mailman pid file. So there is definitely only one mailman server running. I checked the logfile and can see, that the messages are fetched multiple times:

Mailman v0.7.0 started
IMAP receiver enabled ([email protected]).
Polling enabled. Checking every 60 seconds.
Got new message from '[email protected]' with subject 'Test nr 0'.
Got new message from '[email protected]' with subject 'Test nr 1'.
Got new message from '[email protected]' with subject 'test nr 2'.
Got new message from '[email protected]' with subject 'test nr 2'.
Got new message from '[email protected]' with subject 'test nr 3'.
Got new message from '[email protected]' with subject 'test nr 4'.
Got new message from '[email protected]' with subject 'test nr 4'.
Got new message from '[email protected]' with subject 'test nr 4'.

So that seems to me, that the problem is definitely in my mailman server code.

Update 31.1.

Seems to me, that is has something to do with my production machine. when I'm testing this in development with the exact same configuration (changed my local database from sqlite to mysql this morning to test it) as on the production machine I don't get duplicates. Probably is everything ok with my code, but there is a problem with the production machine. Will ask my hoster if they could see a solution for this. To fix this I will go with Ariejan'S suggestion.

The solution: I found the problem. I deploy to a machine where the tmp directory is a shared one between all releases. I forgot to define the path where the pid file of the mailman_daemon should be saved. So it was saved in the script directory instead of the /tmp/pids directory. Because of this the old mailman_daemon could not be stopped after a new deploy. That had led to an army of working mailman_daemons which were polling my mailaccount... After killing all these processes all went well! No more duplicates!

2

There are 2 answers

0
coderuby On BEST ANSWER

I found the problem. I deploy to a machine where the tmp directory is a shared one between all releases. I forgot to define the path where the pid file of the mailman_daemon should be saved. So it was saved in the script directory instead of the /tmp/pids directory. Because of this the old mailman_daemon could not be stopped after a new deploy. That had led to an army of working mailman_daemons which were polling my mailaccount... After killing all these processes all went well! No more duplicates!

4
Ariejan On

This may be some concurrency/timing issue. E.g. new mails are imported before the ones currently processing have been saved.


Edit: Just noticed you have Mailman.config.poll_interval set to 15. This means it will check for new messages every 15 seconds. Try increasing this value to the default 60 seconds. Regardless of this setting, it might be a good idea to add the deduplication code I mentioned below.


My tip would be to also store the message_id from each email, so you can easily spot duplicates.

Instead of:

Message.create(...)

do:

# This makes sure you have the latest pulled version.
message = Message.find_or_create(message_id: message.message_id)
message.update_attributes(...)

# This makes sure you only import it once, then ignore further duplicates.
if !Message.where(message_id: message.message_id).exists?
  Message.create(...)
end

For more info on message_id: http://rdoc.info/github/mikel/mail/Mail/Message#message_id-instance_method

Remember that email and imap are not meant to be consistent data stores like you'd expect Postgres or Mysql to be. Hope this helps you sort out the duplicate mails.