Rails Griddler and conversation / email threading

I'm trying to work out how best to connect / thread a chain of emails. This seems like such a common problem that I was surprised I couldn't easily find information on how other people have dealt with it. The only thing I found was a post about JWZ threading, which looked more concerned with piecing together a thread within one email. Could anyone point me to some current solutions?

I'm using the thoughtbot griddler gem to process incoming emails into a Message model and a separate Contact model, and I have a third model, Reply, for storing replies.

My current thinking is to thread them by the unique contact and the subject line. The subject line will change slightly, though, e.g. from "This subject" to "Re: re: This subject". I could use a regex to strip out the "re:"s, or I could use something like amatch to do fuzzy string comparisons.

But then what should happen when the same subject appears for the same user 2 months later? I could add some logic based on the message dates so that threads only include recent emails. There might also be something useful stored in the email header itself.

  • User (by unique email address)
  • Unique Subject line (regex re: processing issues?)
  • Current date (emails must be date relative to each other)
  • Some other clues to look for in the email header?
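
To make the first three criteria concrete, here is a minimal sketch of the subject-and-date heuristic. Everything in it is an assumption for illustration: the `REPLY_PREFIX` pattern, the 30-day `THREAD_WINDOW`, and the plain hashes standing in for Message records are hypothetical, not part of Griddler.

```ruby
# Hypothetical pattern: one or more leading "Re:" / "Fw:" / "Fwd:" markers.
REPLY_PREFIX = /\A\s*(?:(?:re|fwd?)\s*:\s*)+/i

# Assumed cut-off: messages more than 30 days apart start a new thread.
THREAD_WINDOW = 30 * 24 * 60 * 60 # seconds

# Strip reply/forward prefixes and normalise case and whitespace.
def normalize_subject(subject)
  subject.to_s.sub(REPLY_PREFIX, '').strip.downcase
end

# Two messages belong together when sender, normalised subject and
# date proximity all agree. Each message is a Hash with :from,
# :subject and :date (a Time).
def same_thread?(a, b, window = THREAD_WINDOW)
  a[:from].downcase == b[:from].downcase &&
    normalize_subject(a[:subject]) == normalize_subject(b[:subject]) &&
    (a[:date] - b[:date]).abs <= window
end
```

This only matches messages from the same address, so a reply from the other party would need the comparison to run against the contact rather than the sender.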

I have a rough idea of how to do it. I'm just curious to see some current implementations, and I can't seem to find any.

Any pointers would be greatly appreciated!

There are 2 answers

0
nort On

Email threads are a linked list; the headers contain enough information to reconstruct the list from its component parts.
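
As a rough illustration of the linked-list idea, the sketch below follows parent links back to the first message of a thread. It assumes each stored message records its own id and the id of the message it replies to; the `PARENT_OF` hash and both method names are hypothetical stand-ins for whatever the Message model stores.

```ruby
# Hypothetical store: message id => id of the message it replies to
# (nil for the first message in a thread).
PARENT_OF = {
  'a@example.com' => nil,
  'b@example.com' => 'a@example.com',
  'c@example.com' => 'b@example.com',
  'x@example.com' => nil
}.freeze

# Follow parent links until there is no known parent. The id we stop
# at identifies the thread. `seen` guards against reference loops.
def thread_root(message_id, parent_of)
  seen = {}
  current = message_id
  while (parent = parent_of[current]) && parent_of.key?(parent) && !seen[parent]
    seen[current] = true
    current = parent
  end
  current
end

# Group every known message id under its thread root.
def group_into_threads(parent_of)
  parent_of.keys.group_by { |id| thread_root(id, parent_of) }
end
```

A message whose parent was never received simply becomes the root of its own thread here, which may or may not be the behaviour you want.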

Inspect the email headers and look for a few specific ones.

The key ones you'll use are Message-ID, In-Reply-To and References. These headers give you information about which message was replied to and what other ids matter to the email thread itself.
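
A minimal sketch of pulling those three headers out of a message follows. It assumes the headers are available as a plain Hash of name to raw value; `extract_message_ids` and `threading_info` are hypothetical helper names, and taking the last ancestor as the direct parent is a simplifying assumption.

```ruby
# Pull every "<id>" token out of a raw header value (String or nil).
def extract_message_ids(value)
  value.to_s.scan(/<([^<>\s]+)>/).flatten
end

# headers: Hash of header name => raw value, e.g.
#   { 'Message-ID' => '<c@example.com>', 'References' => '<a@...> <b@...>' }
def threading_info(headers)
  # Header names are case-insensitive, so normalise the keys first.
  h = headers.each_with_object({}) { |(k, v), acc| acc[k.to_s.downcase] = v }

  message_id  = extract_message_ids(h['message-id']).first
  references  = extract_message_ids(h['references'])
  in_reply_to = extract_message_ids(h['in-reply-to']).first(1)

  # References lists ancestors oldest first; In-Reply-To is appended
  # in case the sending client left it out of References.
  ancestors = (references + in_reply_to).uniq
  ancestors.delete(message_id)

  { message_id: message_id, parent_id: ancestors.last, ancestors: ancestors }
end
```

Storing `message_id` and `parent_id` on each record is enough to walk a reply chain later; the full `ancestors` list helps when an intermediate message never arrived.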

The easiest way to see the headers of an email is to open the 'Original Message' view in Gmail (from the More menu).

2
dimid On

There is a new gem named Msgthr, which is an implementation of JWZ's algorithm. It doesn't match on subjects, senders or dates, so it's not exactly what you're looking for, but I think it's a good start.
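
To give a feel for the core idea behind that style of threading, here is a heavily simplified sketch in plain Ruby. It is not Msgthr's code and not the full JWZ algorithm (for instance, it never re-parents a message that already has a parent, and it skips subject grouping and pruning); `Node`, `build_threads` and `ancestor?` are hypothetical names.

```ruby
# One node per message id. `msg` stays nil for ids that only ever
# appear in References (placeholders for messages never received).
Node = Struct.new(:id, :msg, :parent, :children)

# True if `candidate` is `node` itself or sits on its parent chain.
def ancestor?(candidate, node)
  node = node.parent while node && !node.equal?(candidate)
  !node.nil?
end

# messages: Array of Hashes with :id, :refs (oldest first, may be nil)
# and :msg. Returns the root nodes, one per thread.
def build_threads(messages)
  nodes = Hash.new { |h, id| h[id] = Node.new(id, nil, nil, []) }

  messages.each do |m|
    nodes[m[:id]].msg = m[:msg]
    # Link refs[0] -> refs[1] -> ... -> this message, pair by pair.
    chain = (m[:refs] || []) + [m[:id]]
    chain.each_cons(2) do |parent_id, child_id|
      parent = nodes[parent_id]
      child = nodes[child_id]
      # Keep existing links and refuse links that would form a loop.
      next if child.parent || ancestor?(child, parent)
      child.parent = parent
      parent.children << child
    end
  end

  nodes.values.reject(&:parent)
end
```

Because placeholder nodes are created for ids that are referenced but missing, two replies to a lost message still end up in the same thread.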

The neatest thing about Msgthr is that it's container-agnostic, hence you don't have to install requirements such as TMail, as in Frederik Dietz's ruby port. This also means it can be used for other types of communications.

Here's some sample code. Given a list of messages, let's group them into threads:

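require 'msgthr'

thr = Msgthr.new
# Map each message id to the array representing its thread.
threads = {}
[1, 11, 12, 2, 21, 211].each { |id| threads[id] = [id] }
# When a child is linked to its parent, make the child share the parent's thread.
my_add = lambda do |id, refs, msg|
  thr.add(id, refs, msg) do |parent, child|
    threads[child.mid] = threads[parent.mid]
  end
end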
# Create the following structure:
# 1
#  \_ 1.1
#  \_ 1.2
# 2
#  \_ 2.1
#      \_ 2.1.1
my_add.call(1, nil, '1')
my_add.call(11, [1], '1.1')
my_add.call(12, [1], '1.2')
my_add.call(2, nil, '2')
my_add.call(21, [2], '2.1')
my_add.call(211, [21], '2.1.1')

thr.thread!
thr.rootset.each do |cnt|
  threads[cnt.mid][0] = cnt.msg
end

Disclosure: I'm one of the contributors to the gem.