Sunday 13 April 2014

IPv6-in-IPv6 + HBHO Extension Headers: The answer to the ultimate question of RFC compliance, the universe, and everything

So there is this person with whom I've been discussing Contiki/6LoWPAN-related topics for ages. However, since he's based on the other side of the Atlantic, we'd never had a chance to meet in person. IETF 89 took place in London this year and he was attending, so it was an excellent opportunity to finally meet and I jumped on a train without hesitation.

He, a bunch of other (IETF) IoT / Contiki enthusiasts and I met up after the IETF meeting and went for dinner and a few drinks. Among them was a key player in IETF's ROLL WG. Naturally, lots of the conversation was on Contiki-related topics, and at some point we started discussing ROLL's protocols.

Among other topics, we spoke about RPL (some pronounce it 'ripple') and MPL (some pronounce it 'mipple'). I emphatically recommended that, for obvious pronunciation-related reasons, they should refrain from using the acronym NPL for any of their subsequent protocols...

Moving swiftly on...

A little background

Now that Contiki features its shiny support for 6LoWPAN multicast, I've been spending some of my spare time implementing the most recent version of the MPL Internet-Draft (ID), currently v08.

Simple scenario: A 6LoWPAN connected to the rest of the Internet. For simplicity's (and my own sanity's) sake, let's for a second assume that the world suddenly woke up one day and IPv6 had been deployed globally. A host somewhere in the Internet sends a multicast datagram to a Global-Scope IPv6 address, and some nodes inside the 6LoWPAN are interested in this multicast group. Shimples!

Or maybe it's not that simple. You see, MPL defines an option carried in the IPv6 Hop-by-Hop Options (HBHO) extension header, called the 'MPL Option', which is present in all MPL Data Messages. Why is that a problem? Well, it's a problem because MPL only operates inside the 6LoWPAN, so a multicast datagram that originates somewhere outside it won't carry an MPL Option.

Let's try to visualise this from a network layer perspective (omitting layer 2 and UDP headers for clarity). Observe how a multicast datagram originating at an Internet host will not have an MPL Option extension header in it (right). However, inside the 6LoWPAN, the MPL Option must be there (left).

In the MPL draft, there is also a discussion about the Realm-Local multicast scope. This is a new scope, currently emerging as part of draft-ietf-6man-multicast-scopes (currently at v04). The relevance of the Realm-Local scope is not immediately obvious. Please bear with me...

An example of a perfectly smooth conversation

So there we were, eating dinner and generally having a blast, till I decided to mention the Realm-Local scope, which was when things started going south... The conversation below was predominantly between myself and a person very active within IETF's ROLL WG. Let's call him "Him", to preserve anonymity, privacy etc.

Me: Is Realm-Local moving forward? I'm planning to add support for it in Contiki since MPL uses it.
Him: Yes, it's moving forward.
Me: In practical terms, what's the difference between Realm-Local and... <blah blah>?
Him: <Explanation>

(so far so good)...

An example of a conversation NOT to continue, because it stops being perfectly smooth...

Him (continues): Thus, when a multicast datagram enters the 6LoWPAN, the LBR is going to send it as Realm-Local.
Me (dazed and confused): Do you mean a datagram with a Global-Scope destination?
Him: Yes.
Me (dazed and confused): Eeeerm. What? Why would we change the destination address? Why not just forward it inside the 6LoWPAN with the same destination?
Him: No no, you're not changing it. But the datagram doesn't have an MPL option.

(Remember the image earlier. No MPL Option in the original datagram)

Me: Why not just add the MPL option? (I should have known the answer but I didn't)
Him: NO! You may not add extension headers en-route!
Me: OK. Then what?
Him: You encapsulate with IPv6-in-IPv6. The inner header stays as-is, the outer one uses Realm-Local destination.

That was shock number one. Now, to give credit where credit's due, the MPL ID is quite explicit about this scenario: "IPv6-in-IPv6 encapsulation MUST be used", hence the "I should have known" comment above.

Let's try to expand the previous diagram to explain what I was thinking (top structure of network headers) and what "Him" was explaining (bottom).
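
For those who prefer bytes to boxes, here is a rough Python sketch of what "Him" was explaining: the original datagram is left untouched and wrapped in an outer IPv6 header that carries the Hop-by-Hop Options header with the MPL Option. Treat it as an illustration only, not as Contiki code. The function names, the example addresses (including ff03::fc as the Realm-Local destination) and the MPL Option type value 0x6D are my assumptions, and the option is shown in its shortest form (no seed-id).

```python
import ipaddress
import struct

IPPROTO_HOPOPTS = 0   # Hop-by-Hop Options extension header
IPPROTO_UDP = 17
IPPROTO_IPV6 = 41     # payload is itself an IPv6 datagram (IPv6-in-IPv6)

MPL_OPTION_TYPE = 0x6D  # assumption: value used for the MPL Option
PADN_OPTION_TYPE = 1    # PadN padding option


def ipv6_header(src, dst, next_header, payload_len, hop_limit=64):
    """Build a fixed 40-byte IPv6 header (traffic class and flow label 0)."""
    version_tc_flow = 6 << 28
    fixed = struct.pack("!IHBB", version_tc_flow, payload_len,
                        next_header, hop_limit)
    return (fixed
            + ipaddress.IPv6Address(src).packed
            + ipaddress.IPv6Address(dst).packed)


def hbh_with_mpl_option(next_header, sequence):
    """Build an 8-byte Hop-by-Hop Options header holding one MPL Option."""
    # Shortest MPL Option: type, data length 2, flags (S=0), sequence.
    mpl_option = struct.pack("!BBBB", MPL_OPTION_TYPE, 2, 0x00, sequence)
    # Two bytes of PadN bring the header up to a multiple of 8 bytes.
    padn = struct.pack("!BB", PADN_OPTION_TYPE, 0)
    # Hdr Ext Len of 0 means "8 bytes in total".
    return struct.pack("!BB", next_header, 0) + mpl_option + padn


def encapsulate(inner, lbr_addr, realm_local_dst, sequence):
    """Wrap an unmodified datagram in an outer header plus MPL Option."""
    hbh = hbh_with_mpl_option(IPPROTO_IPV6, sequence)
    outer = ipv6_header(lbr_addr, realm_local_dst, IPPROTO_HOPOPTS,
                        len(hbh) + len(inner))
    return outer + hbh + inner


if __name__ == "__main__":
    # Datagram from an Internet host to a Global-Scope multicast group,
    # with 8 placeholder bytes standing in for the UDP header.
    inner = ipv6_header("2001:db8::1", "ff0e::1234", IPPROTO_UDP, 8) + bytes(8)
    packet = encapsulate(inner, "2001:db8:1::1", "ff03::fc", sequence=1)
    print(len(inner), "bytes in,", len(packet), "bytes out")
```

The inner datagram, Global-Scope destination and all, comes out the other end exactly as it went in; only the outer header and its Hop-by-Hop Options header are new.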

Fair enough, at the end of the day, as discussed in a previous post, Contiki currently doesn't implement either, so if we're gonna do it, we can always do it correctly right from the start... Or can we?

But then I just had this Eureka moment: Haaaang on a minute. Wait wait wait. RPL also defines its own HBHO [RFC 6553] and the situation with incoming unicast datagrams is identical: They are not carrying the RPL HBHO.

Therefore, I simply had to insist:

Me: Let's totally forget about multicast for a moment. Plain old unicast datagram sent from the Internet. When the datagram is entering the 6LoWPAN, it needs the RPL HBHO and the border router adds it.
Him: Noooope. You must do IPv6-in-IPv6.

Houston, we have a problem!

Simply put, "Him" just told me we're not RFC-compliant... Once again, let's have a look at the usual diagram, this time modified for a unicast datagram entering the RPL network. Observe how I've replaced the MPL Option with the RPL Option.
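
To make the difference concrete, here is a hedged Python sketch of the "just add it" approach, i.e. a border router splicing a Hop-by-Hop Options header carrying the RPL Option into the datagram it is forwarding. This is not Contiki's actual code: the function names are made up for illustration, and the option layout (type 0x63 followed by 4 bytes of data: flags, RPLInstanceID, SenderRank) is my reading of RFC 6553.

```python
import struct

IPPROTO_HOPOPTS = 0     # Hop-by-Hop Options extension header
RPL_OPTION_TYPE = 0x63  # RPL Option type, as I read RFC 6553


def rpl_hbh(next_header, instance_id, sender_rank, flags=0):
    """Build an 8-byte Hop-by-Hop Options header holding one RPL Option."""
    # Next Header, Hdr Ext Len (0 = 8 bytes), then the option itself:
    # type, data length 4, flags, RPLInstanceID, SenderRank (16 bits).
    return struct.pack("!BBBBBBH", next_header, 0,
                       RPL_OPTION_TYPE, 4, flags, instance_id, sender_rank)


def insert_rpl_option_in_place(datagram, instance_id, sender_rank):
    """Splice the RPL Option into a datagram without any encapsulation.

    Assumes the datagram starts with a plain 40-byte IPv6 header and has
    no Hop-by-Hop Options header of its own yet.
    """
    header = bytearray(datagram[:40])
    payload = datagram[40:]

    # The new extension header takes over the original Next Header value.
    hbh = rpl_hbh(header[6], instance_id, sender_rank)
    header[6] = IPPROTO_HOPOPTS

    # The Payload Length has to grow by the size of the inserted header.
    old_len = struct.unpack("!H", header[4:6])[0]
    header[4:6] = struct.pack("!H", old_len + len(hbh))

    return bytes(header) + hbh + payload
```

Note how the original header gets rewritten (both Next Header and Payload Length change), which is exactly the en-route modification "Him" objected to.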

Despite the comedy value of the conversation, this last bit is an even more serious problem than the situation with MPL's HBHO. You see, we (Contiki) don't currently support incoming multicast datagrams, but we sure as hell support unicasts entering a RPL network. In a sense, it's better not to support a spec than to support it in a non-compliant fashion. I like to think of Contiki as THE operating system for the IoT. Due to the embedded, super optimized way its TCP/IP stack has been implemented, I fear it would take a very considerable effort to implement IPv6-in-IPv6. Not to mention the code size and complexity increase.

Me: Well, sorry, but IPv6-in-IPv6 with Contiki ain't gonna happen any time soon.

I know that the people I was talking to can't be wrong about those things. Originally, I was hoping that they misunderstood the use-case I was trying to describe, or that I misunderstood what they were trying to explain to me.

Nevertheless, the discussion prompted me to re-visit RFC 6553 (RPL Option), as well as draft-ietf-roll-trickle-mcast-08 (MPL's spec, including MPL Option). Having done so, it appears, dear reader, that they understood the question perfectly well and that I understood the answer perfectly well too...

In other words, we just found out that Contiki is not as RFC Compliant as we might have liked it to be. If that's the case, we should probably open a bug in the issue tracker...

Now, blogs are about personal opinions, are they not? Well, I know that this one is. Even though I'm a lot wiser now (thanks, "Him"), my personal opinion is not a lot different from what I said during the dinner in question:

Me: Well, sorry, but IPv6-in-IPv6 with Contiki ain't gonna happen any time soon.

The entire situation is a bit of a mess. I can see why the RPL Option is useful. What I can't see is the need to over-complicate a spec that targets constrained environments and, therefore, constrained nodes. RPL is becoming the de-facto standard for 6LoWPANs, and it's already complex enough. Nor can I see the need to harm its implementability for something that brings no huge benefit.

Happy coding!