How does the messenger maintain the sequencing of the messages during chat and when users log in again?

2.5k views Asked by At

I was asked this question in an interview and was unable to answer it.

How does FB messenger order the messages on user side when two messages are concurrent in order to avoid view difference in display order during the chat period and when user visits the messenger again. I thought that we can store a timestamp with each message, which is the time the message is received by the server. However, this will not ensure the correct ordering of messages for clients.

Take a scenario where the server timestamp cannot determine the exact order of messages would look like this:

  1. User-1 sends a message M1 to the server for User-2.
  2. The server receives M1 at T1.
  3. Meanwhile, User-2 sends a message M2 to the server for User-1.
  4. The server receives the message M2 at T2, such that T2 > T1.
  5. The server sends message M1 to User-2 and M2 to User-1.
  6. So User-1 will see M1 first and then M2, whereas User-2 will see M2 first and then M1.

I read that resolve this issue, we can use Vector clocks but was unable to understand how the message sequencing be preserved for different users during the chat and when users log in again.

In the above scenario, user1 will see M1 followed by M2 whereas user2 will see M2 followed by M1. Now if each user also generates a sequence number or timestamp for each of its message to each of the client (separately). Then in scenario above user1 will send message M1 with sequence <1 (user1 seq), 0(user2 seq) > and user2 will send message M2 with sequence <0 (user1 seq), 1(user2 seq) >. So when both the message arrives at user1 and user2 they will have: M1 <1, 0> M2 <0, 1>

Now let’s say user1 sends more messages M3 <2, 1> and M4 <3, 1> then each of client will have following msgs. M1 <1, 0> M2 <0, 1> M3 <2, 1> M4 <3, 1>

So in this case when the user is logged in the display order for user-1 and user-2 during chat will be M1,M2,M3,M4 and M2,M1,M3,M4 respectively. Now, I want to know how will the same order be preserved for user-1 and user-2 end when the login again ?

Thanks.

1

There are 1 answers

1
Shivam On

The problem here is how we will generate a consistent chat conversation for each user from these sequence numbers.

Let's assume a conversation between Alice and Bob.

Message sequence structure:

message<Alice seq number,  Bob sequence number>

One thing to note is numbers in M1, M2, M3,... are just used to differentiate the messages and don't have any relation with the actual message sequence.

View from Alice side:

1) Alice sends M1<1,0>
2) Bob sends M2<1,1>
3) Alice sends M3<2,1>
Now, Bob sends one message(M5) but before Alice gets that, Alice sends one more message.
4) Alice sends M4<3,1>
And now, she received a message from Bob.
5) Bob sends M5<2,2> 
Since Bob didn't get M4 before sending M5 the Alice sequence number in M5 is 2. 
If he would have got that, the M5 would look like M5<3,2>.

Now, View from Bob side:

1) Alice sends M1<1,0>
2) Bob sends M2<1,1>
3) Alice sends M3<2,1>
Now, Bob sends message M5 before getting M4 from Alice
4) Bob sends M5<2,2>
5) Alice sends M4<3,1>

Now when Alice logins next time server will fetch the data and sort it:

1) First sort with Bob sequence number. 
2) if two or more messages have the same Bob's sequence number then sort it in Alice's sequence number within them.

Similarly for Bob

1. First sort the message-ids with respect to Alice sequence number.
2. if two or more messages have the same Alice's sequence number then sort it in Bob's sequence number within them.

So for Alice, it would be in the order of Bob's sequence number:

M1<1,0>  
M2<1,1>  
M3<2,1>  
M4<3,1>  
M5<2,2>  

For Bob, it would be in the order of Alice's sequence number:

M1<1,0>  
M2<1,1>  
M3<2,1>  
M5<2,2>  
M4<3,1>

How we will store the message sequences in the database:

enter image description here

How a client will know which is his/her sequence number?

In our example we decide that the first number will be Alice sequence number and the second will be Bob's. But in real-time how this decision will be made. This can easily be solved if we make a convention that the first sequence number will always be the sender's sequence number and the second one is receivers. So when someone receives a message then he knows that the first sequence number is the sender's sequence number. and when he prepares the next message he increments his sequence number from the last received message and puts it in the first place and takes the sender's sequence number from the received message and put it in second place.

How server will know that which sequence number has to be stored where?

Now since we defined the above convention if the server gets a message from Alice the first field will be Alice sequence number and the second will be Bob's sequence number so it will store in that way. Similarly, it does it for Bob also.

Note: I was also looking for the solution for the above problem but didn't get anything on the net that can help so made my own solution. Please correct me if it breaks any use case so that we can improve it or try something else.