How does STUN perform ICE connectivity check on Candidate Pairs?

I have gone through RFC 5389 and RFC 5245 and the newer RFC 8445. I understand how STUN works in returning the Server Reflexive Address or Relayed Address. The request is sent to the STUN server.

My fundamental question is about ICE connectivity check using STUN. RFC 8445 states on Page 10:

"...At the end of
this process, each ICE agent has a complete list of both its
candidates and its peer's candidates.  It pairs them up, resulting in
candidate pairs.  To see which pairs work, each agent schedules a
series of connectivity checks.  Each check is a STUN request/response
transaction that the client will perform on a particular candidate
pair by sending a STUN request from the local candidate to the remote
candidate."

To run connectivity checks on candidate pairs, the STUN message must, at a minimum, have provision for the target IP address, port, and protocol. Where is this STUN message structure described? Where can I find details of how STUN completes this connectivity check?

There are 2 answers

abhayAndPoorvisDad On BEST ANSWER

I can understand the difficulty in interpreting the RFC's description of the process. Let me attempt to simplify.

Suppose I (agent A) gather the following candidates at my end:

  1. IP1,P1
  2. RIP2,P2
  3. TIP3,P3

Similarly, my peer (agent B) has its own set of candidates:

  1. (B)IP1,P4
  2. (B)RIP2,P5
  3. (B)TIP3,P6

Let's fast forward to a future where media is flowing well. For media in the direction A->B, there are two transport addresses involved. Since UDP is being used to send media, the socket has a source address and a destination address. Let us call them SrcIP_A, SrcPort_A and SrcIP_B, SrcPort_B.

It must be clear that SrcIP_A, SrcPort_A is one of A's candidates and SrcIP_B, SrcPort_B is one of B's candidates.

Now, coming back to the present, from the perspective of A: to achieve smooth media flow from A->B, we just need to settle on the pair we will eventually use from the sets we already have.

This is where STUN comes into the picture. Remember that a STUN request needs to be sent to a particular IP,Port, and the response reports the NATted external address that the STUN server observed as the source of the request.
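To make the "observed address in the response" concrete, here is a minimal sketch of decoding an IPv4 XOR-MAPPED-ADDRESS attribute value as laid out in RFC 5389. The function name is my own, and the sketch assumes the attribute value has already been extracted from the response; IPv6 is not handled.

```python
import socket
import struct
from typing import Tuple

MAGIC_COOKIE = 0x2112A442  # fixed value defined in RFC 5389


def decode_xor_mapped_address_ipv4(value: bytes) -> Tuple[str, int]:
    """Decode the value part of an IPv4 XOR-MAPPED-ADDRESS attribute.

    Hypothetical helper for illustration only.
    """
    # Layout: 1 reserved byte, 1 family byte, 2-byte X-Port, 4-byte X-Address.
    _reserved, family, x_port, x_addr = struct.unpack("!BBHI", value[:8])
    if family != 0x01:
        raise ValueError("only IPv4 (family 0x01) is handled in this sketch")
    # The port is XORed with the top 16 bits of the magic cookie,
    # the address with the whole cookie.
    port = x_port ^ (MAGIC_COOKIE >> 16)
    addr = x_addr ^ MAGIC_COOKIE
    return socket.inet_ntoa(struct.pack("!I", addr)), port


# Attribute value taken from the RFC 5769 sample response.
print(decode_xor_mapped_address_ipv4(bytes.fromhex("0001a147e112a643")))
```

The agent that sent the request can then compare this decoded address with the local candidate it sent from.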

So A creates 9 pairs, matching each of its own candidates with each of its peer's candidates. For each pair, it then sends a STUN request from the base of the local candidate (see RFC 8445, page 14) to the remote candidate. Now, the remote side B MUST have STUN server logic implemented on its own side for any traffic it receives on its candidates. So, basically, the socket receiving packets needs to be able to distinguish between media and STUN packets. In the latter case, it sends back a STUN response indicating where it received the request from.
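The 3 x 3 pairing can be sketched as a plain cross product. This is only an illustration using the placeholder names from this answer; a real ICE agent additionally restricts pairs to matching components and address families, prioritizes them, and prunes redundant ones, none of which is shown here.

```python
from itertools import product

# Placeholder candidates from the example above (not real addresses).
local_candidates = [("IP1", "P1"), ("RIP2", "P2"), ("TIP3", "P3")]
remote_candidates = [("(B)IP1", "P4"), ("(B)RIP2", "P5"), ("(B)TIP3", "P6")]


def form_pairs(local, remote):
    """Pair every local candidate with every remote candidate."""
    return list(product(local, remote))


pairs = form_pairs(local_candidates, remote_candidates)
for local, remote in pairs:
    print(f"{local[0]},{local[1]} -> {remote[0]},{remote[1]}")
print(len(pairs), "pairs")
```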

Let's assume that, while iterating, A is at the following combinations.

  1. IP1,P1 Vs (B)RIP2,P5 Here the request might reach B, since the reflexive address (B)RIP2,P5 leads to B inside its NAT. The observed address returned will be the reflexive address of IP1,P1. At A's side, when the response is received, it will discard this pair since the contained address is not IP1,P1.
  2. RIP2,P2 Vs (B)IP1,P4 This will clearly fail, since you cannot send to (B)IP1,P4, which is a private address.
  3. RIP2,P2 Vs (B)RIP2,P5 Here the request might reach B, since the reflexive address (B)RIP2,P5 leads to B inside its NAT. The observed address returned will also be RIP2,P2. So this can be marked as the "valid pair".

Hope I have been clear.

xdumaine On

You'll find the STUN message structure described in RFC 5389, section 6: https://www.rfc-editor.org/rfc/rfc5389#page-10

Notable pieces of the description:

STUN messages are encoded in binary using network-oriented format (most significant byte or octet first, also commonly known as big-endian). The transmission order is described in detail in Appendix B of RFC 791 [RFC0791]. Unless otherwise noted, numeric constants are in decimal (base 10).

All STUN messages MUST start with a 20-byte header followed by zero or more Attributes. The STUN header contains a STUN message type, magic cookie, transaction ID, and message length.
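As a rough illustration of that 20-byte header, the sketch below packs and unpacks a bare Binding request. The helper names are my own. Note that none of the header fields holds a target address: the destination is simply the IP and port the UDP datagram is sent to. An actual ICE connectivity check would also append attributes (for example USERNAME, PRIORITY and MESSAGE-INTEGRITY), which are omitted here.

```python
import os
import struct

MAGIC_COOKIE = 0x2112A442  # fixed value defined in RFC 5389
BINDING_REQUEST = 0x0001   # class "request", method "Binding"


def build_binding_request(transaction_id: bytes = None) -> bytes:
    """Build a header-only Binding request (hypothetical helper)."""
    if transaction_id is None:
        transaction_id = os.urandom(12)  # 96-bit transaction ID
    if len(transaction_id) != 12:
        raise ValueError("transaction ID must be exactly 12 bytes")
    # type (2 bytes), message length (2), magic cookie (4), transaction ID (12)
    # Message length is 0 because no attributes follow the header.
    return struct.pack("!HHI12s", BINDING_REQUEST, 0, MAGIC_COOKIE, transaction_id)


def parse_header(data: bytes):
    """Split the first 20 bytes into (type, length, cookie, transaction ID)."""
    return struct.unpack("!HHI12s", data[:20])


request = build_binding_request()
print(len(request), parse_header(request)[:3])
```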

The most significant 2 bits of every STUN message MUST be zeroes. This can be used to differentiate STUN packets from other protocols when STUN is multiplexed with other protocols on the same port.
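A receiving socket could use this rule roughly as follows to separate STUN from media. This is a sketch with a made-up function name; it checks only the two leading zero bits, the minimum length, and the magic cookie, and a production demultiplexer would likely validate more (for example the FINGERPRINT attribute when present).

```python
import struct

MAGIC_COOKIE = 0x2112A442  # fixed value defined in RFC 5389


def looks_like_stun(packet: bytes) -> bool:
    """Heuristic check that a UDP payload is a STUN message (illustrative)."""
    if len(packet) < 20:          # shorter than the fixed STUN header
        return False
    if packet[0] & 0xC0:          # top two bits must both be zero
        return False
    (cookie,) = struct.unpack("!I", packet[4:8])
    return cookie == MAGIC_COOKIE


stun_packet = struct.pack("!HHI12s", 0x0001, 0, MAGIC_COOKIE, b"\x00" * 12)
rtp_like_packet = b"\x80" + b"\x00" * 19  # RTP version 2 sets the top bit

print(looks_like_stun(stun_packet), looks_like_stun(rtp_like_packet))
```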

The message type defines the message class (request, success response, failure response, or indication) and the message method (the primary function) of the STUN message. Although there are four message classes, there are only two types of transactions in STUN: request/response transactions (which consist of a request message and a response message) and indication transactions (which consist of a single indication message). Response classes are split into error and success responses to aid in quickly processing the STUN message.
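The class and method bits are interleaved inside the 16-bit message type field, so pulling them apart takes a little bit masking. The sketch below follows the bit layout in RFC 5389, section 6; the function and dictionary names are my own.

```python
# Class values formed from the two class bits C1 C0.
CLASS_NAMES = {
    0: "request",
    1: "indication",
    2: "success response",
    3: "error response",
}


def decode_message_type(msg_type: int):
    """Return (class name, method number) for a STUN message type."""
    # C1 is bit 8 and C0 is bit 4 of the message type.
    msg_class = ((msg_type & 0x0100) >> 7) | ((msg_type & 0x0010) >> 4)
    # The method is the remaining 12 bits with the class bits squeezed out.
    method = (
        (msg_type & 0x000F)
        | ((msg_type & 0x00E0) >> 1)
        | ((msg_type & 0x3E00) >> 2)
    )
    return CLASS_NAMES[msg_class], method


for value in (0x0001, 0x0101, 0x0111, 0x0011):
    print(hex(value), decode_message_type(value))
```

Method 1 is Binding, which is the method used for connectivity checks.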