Author Archive

More on MSRP: Chunking

April 27th, 2010 by Ben Campbell under SIP

As I mentioned in past blog entries, MSRP is designed to carry arbitrary content. Not only does that mean it can carry any type of data, or at least any kind that can be represented in MIME, it means it can carry arbitrarily large data. MSRP also lets you use the same TCP connection for multiple sessions.
The downside to this is that TCP always delivers data in the order sent. If you start sending a very large piece of content (say, video), but then need to send something else that’s very important, the recipient’s endpoint won’t be able to see the very-important-message until it completely receives the video. We call this head-of-line blocking. MSRP’s chunking feature lets you get around this problem by interleaving content over a single connection.
For the purposes of this discussion, let’s define the term “message” to mean a whole piece of content, in the form of a complete MIME object. That could be a short “text/plain” object, or it could be a huge “video/mp4” object. A “chunk” is a piece of a message. That piece could be the whole message, which would be likely for the small “text/plain” message, or it could be just a fragment of the message. To break a message into chunks, you first encode the message. You take one “message” and cut it into pieces, instead of creating a bunch of smaller “messages.”
MSRP uses the “byte-range” header field for this purpose. The byte-range field tells you what portion of the message is in a particular chunk. It also tells you the overall size of the message. For example, in the case of a short message that is fully contained in one chunk, you might see the following:

Byte-Range: 1-100/100

This means the chunk contains bytes 1 through 100 out of a total length of 100. On the other hand, for a chunk from the middle of a bigger message the Byte-Range header field might look like:

Byte-Range: 4097-6144/65535

The sending endpoint puts each chunk into a separate MSRP SEND request. The receiving endpoint reassembles the chunks into a whole message by inspecting the Byte-Range headers. Both parties can use the total length value to provide progress information to the users. All the chunks from a given message share the same value in the Message-ID header field. 
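As a rough illustration of the sender's side, here is a minimal Python sketch that cuts one message into chunks and computes the Byte-Range value for each. The function name and the fixed chunk size are assumptions made for the example; nothing in MSRP requires equal-sized chunks.

```python
def split_into_chunks(message: bytes, chunk_size: int):
    """Yield (byte_range_value, chunk_bytes) pairs for one message.

    Positions are 1-based and inclusive, matching the Byte-Range
    examples in this post.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    total = len(message)
    for offset in range(0, total, chunk_size):
        piece = message[offset:offset + chunk_size]
        start = offset + 1            # first byte carried by this chunk
        end = offset + len(piece)     # last byte carried by this chunk
        yield f"{start}-{end}/{total}", piece


# A 148-byte message cut after 137 bytes
headers = [value for value, _ in split_into_chunks(b"x" * 148, 137)]
print(headers)  # ['1-137/148', '138-148/148']
```

Every chunk produced this way would go out in its own SEND request with the same Message-ID.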
Here’s an example of a SEND request containing an entire message:

MSRP d93kswow SEND
To-Path: msrp://bob.example.com:8888/9di4eae923wzd;tcp
From-Path: msrp://alicepc.example.com:7777/iau39soe2843z;tcp
Message-ID: 12339sdqwer
Byte-Range: 1-16/16
Content-Type: text/plain

Hi, I’m Alice!
-------d93kswow$

And here’s an example of a message broken into two chunks:

MSRP d93kswow SEND
To-Path: msrp://bobpc.example.com:8888/9di4eae923wzd;tcp
From-Path: msrp://alicepc.example.com:7654/iau39soe2843z;tcp
Message-ID: 12339sdqwer
Byte-Range: 1-137/148
Content-Type: message/cpim

To: Bob
From: Alice
DateTime: 2006-05-15T15:02:31-03:00
Content-Type: text/plain

ABCD
-------d93kswow+

MSRP op2nc9a SEND
To-Path: msrp://bobpc.example.com:8888/9di4eae923wzd;tcp
From-Path: msrp://alicepc.example.com:7654/iau39soe2843z;tcp
Message-ID: 12339sdqwer
Byte-Range: 138-148/148
Content-Type: message/cpim

1234567890
-------op2nc9a$
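On the receiving side, reassembly could look something like the following Python sketch. The class and method names are hypothetical, and it assumes one assembler per Message-ID, with the start position taken from the Byte-Range header and the continuation flag taken from the last character of the chunk's closing line. It does not handle overlapping chunks.

```python
class MessageAssembler:
    """Collects the chunks of one message (one Message-ID) and rebuilds it."""

    def __init__(self):
        self._pieces = {}        # start position (1-based) -> chunk bytes
        self._last_seen = False  # True once a chunk flagged "$" arrives

    def add_chunk(self, start: int, data: bytes, flag: str) -> None:
        self._pieces[start] = data
        if flag == "$":
            self._last_seen = True

    def result(self):
        """Return the whole message, or None if something is still missing."""
        if not self._last_seen:
            return None
        expected = 1
        out = bytearray()
        for start in sorted(self._pieces):
            if start != expected:
                return None      # gap: an earlier chunk has not arrived
            out += self._pieces[start]
            expected = start + len(self._pieces[start])
        return bytes(out)


assembler = MessageAssembler()
assembler.add_chunk(1, b"A" * 137, "+")
assembler.add_chunk(138, b"B" * 11, "$")
print(len(assembler.result()))  # 148
```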

But sometimes, the sender may not know the size of the entire message in advance. For example, imagine that the sender is capturing video from a webcam. It would be inefficient to wait until the entire video is captured to start sending. If you don’t know the full message length, you can put a “*” in the total size field. For example:

Byte-Range: 4097-6144/*

Chunking is very useful, but it adds overhead. You really want to avoid chunking a message unless you have a good reason. Chunking was envisioned for situations where an endpoint needs to send a high priority message when a large message is already in the process of being sent. For example, the peer device might send you a message that requires a timely response in the form of an MSRP REPORT request–but you’ve already got a large transfer in progress. You can’t wait for the large transfer to complete first. The problem is, you can’t know in advance that this will happen, so how do you decide whether to break the large message into chunks? MSRP allows you to interrupt an in-process SEND request. 
The last line of a SEND request is known as the end-line. The end-line is made up of seven hyphens, followed by a transaction identifier and a continuation flag. The transaction identifier is a random sequence initially written into the start line (look closely at the second field in the start lines of the examples above–they repeat in the end-line). The continuation flag tells you whether there are more chunks in the message. If it contains a “+”, then the recipient knows to expect more chunks. If it contains a “$”, then it’s the last chunk in the message. It can also contain a “#” to indicate the sender is abandoning the message.
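To make the end-line layout concrete, here is a small Python sketch that checks a candidate line against the transaction identifier from the start line and returns the continuation flag. The function name is made up for this example. The flag is read from the last character because a “+” could plausibly appear inside an identifier as well.

```python
FLAGS = {
    "$": "last chunk of the message",
    "+": "more chunks follow",
    "#": "sender abandoned the message",
}


def parse_end_line(line: str, transaction_id: str) -> str:
    """Return the continuation flag if `line` closes the given transaction."""
    line = line.rstrip("\r\n")
    prefix = "-" * 7
    if not line.startswith(prefix) or len(line) < len(prefix) + 2:
        raise ValueError("not an end-line")
    found_id, flag = line[len(prefix):-1], line[-1]
    if flag not in FLAGS:
        raise ValueError(f"unknown continuation flag {flag!r}")
    if found_id != transaction_id:
        raise ValueError("transaction identifier does not match the start line")
    return flag


flag = parse_end_line("-------d93kswow+", "d93kswow")
print(flag, "->", FLAGS[flag])  # + -> more chunks follow
```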
You can make a SEND request interruptible by putting a “*” in the end-range field in the Byte-Range, like so:

Byte-Range: 4097-*/65535

In this example, the sender could start sending the entire 65535 octet message in a single SEND request. But if it needs to interrupt the message part way through, it simply writes out the end-line, and puts a “+” in the continuation field. It then sends whatever high-priority message that forced the interruption, then resumes the original large message in a new chunk. The receiver cannot tell the actual end-range from the Byte-Range header. Instead, it has to calculate the end-range by counting the actual number of octets before the end-line, adding that count to the start-range value, and subtracting one (the range is inclusive).
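The following Python sketch shows one way a receiver might parse a Byte-Range value that can contain “*” in the end or total position, and work out the real end of an interruptible chunk from the octets it actually received. Both function names are assumptions for illustration. Because the range is inclusive, the last byte is the start value plus the octet count, minus one.

```python
def parse_byte_range(value: str):
    """Parse 'start-end/total'; a '*' field comes back as None."""
    span, total = value.strip().split("/")
    start, end = span.split("-")
    return (
        int(start),
        None if end == "*" else int(end),
        None if total == "*" else int(total),
    )


def actual_end(start: int, body: bytes) -> int:
    """Last byte position covered by a chunk that began at `start`."""
    return start + len(body) - 1


start, end, total = parse_byte_range("4097-*/65535")
if end is None:
    # Interruptible chunk: count what arrived before the end-line.
    end = actual_end(start, b"x" * 2048)
print(start, end, total)  # 4097 6144 65535
```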
A common misconception about MSRP is that the specification says that any message larger than 2048 octets must be broken into chunks. It really says that any message larger than that must be interruptible. It’s far more efficient to send a large message in one interruptible chunk than in a bunch of small chunks. That way you don’t incur the chunking overhead unless you really need it.
There’s another aspect of MSRP chunking that can make the implementer’s life interesting. If there’s an MSRP relay in the path of a session, that relay can break a chunk into smaller chunks. It can also reassemble several smaller chunks into a bigger chunk. This is transparent to the sender, except for one thing.
MSRP allows the recipient, or an intervening relay, to optionally send delivery reports in the form of REPORT requests. A REPORT request also contains a Byte-Range header field. In this case, the Byte-Range header field tells you the range of the original message to which the report applies. Since an intervening relay can change the chunking for a given message, the sender cannot assume that REPORT requests will line up with the chunks it actually sent. For example, you could send 3 chunks for the byte ranges of 1-33, 34-66, and 67-100, but get back REPORT requests for the ranges of 1-50, and 51-100. 
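Since REPORT ranges need not match the chunks that were sent, a sender that wants to know whether the whole message has been reported on has to track byte coverage instead of chunk counts. A minimal Python sketch of that bookkeeping, with a hypothetical function name and ranges given as inclusive (start, end) pairs:

```python
def fully_reported(reports, total: int) -> bool:
    """True if the (start, end) ranges from REPORTs cover bytes 1..total."""
    next_needed = 1
    for start, end in sorted(reports):
        if start > next_needed:
            return False              # a gap nobody has reported on yet
        next_needed = max(next_needed, end + 1)
    return next_needed > total


# Sent as 1-33, 34-66, 67-100 but reported as 1-50 and 51-100
print(fully_reported([(1, 50), (51, 100)], 100))  # True
print(fully_reported([(1, 50)], 100))             # False
```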
I think that’s enough for now. Next time, I will cover more details on how those REPORT requests work. But until then, here’s a forewarning: The REPORT mechanism is more complicated than you probably expect. 

New MSRP Standards Work: The Alternate Connection Model

March 17th, 2010 by Ben Campbell under SIP

Even though the SIMPLE working group published the MSRP and MSRP Relay specifications some time ago, it’s not quite time to say we’re finished and go out for a beer. Like any new protocol, it has some rough edges that need to be polished out. SIMPLE is currently working on a bit of polish known as the MSRP Alternate Connection Model, or MSRP-ACM for short.

MSRP runs over reliable, connection-oriented protocols such as TCP. One important aspect of these protocols is that when two devices want to talk, one of them must act as a client and the other a server. That’s not as big a deal as it sounds. It merely means that the client opens the transport connection towards the server, and the server listens for and hopefully accepts the client’s connection. For truly client-server protocols such as HTTP, this makes perfect sense–your web browser opens a TCP connection to the web server. It rarely makes sense for the server to open the connection towards the browser.

Since so many application protocols work this way, the client-server assumption has become intrinsic to the way people build access networks. Chances are, there’s a firewall between your web browser and the server for this blog. There’s likely even a Network Address Translator (NAT). Both of these devices are commonly configured with policies that let clients open outbound connections, but severely restrict who can open or receive inbound connections.

But the client-server assumption falls down for many real-time communication applications. These applications are peer-to-peer at their cores, i.e. they are designed to allow any device to connect to any other device. It’s hard to deploy peer-to-peer applications on networks that were built for client-server applications. That mismatch has provided grist for many of the posts on this blog. I’m sure it has caused headaches for many of our readers.

As I mentioned in my last post, MSRP assumes that the peer that sent the SDP offer always acts as the TCP client. That is, it opens the transport connection towards the peer that sent the answer. For this reason, we often refer to the offerer as the “active party.”

The active party also must immediately send an MSRP SEND request to its peer. This is because when the listening device (aka the “passive party”) receives a connection, it doesn’t know for sure who’s really trying to connect. When it receives the SEND request, it can compare an MSRP URI in the request to the one it sent in the SDP answer. You can think of that URI as a party invitation the active party must present to get in the door.

But making the offerer into the active party does not work in all possible scenarios. Let’s go back to the usual suspects, Alice and Bob. Alice sends the SDP offer to Bob, making Alice into the active party and Bob into the passive party. But Bob is behind a NAT that doesn’t allow inbound connections. They need an MSRP relay, or some other kind of media relay, to talk at all.

But what if Alice was not behind such a NAT? They would have been able to talk just fine if Bob sent the offer to Alice. It seems a shame to require the overhead of a relay, if Alice and Bob could have solved the problem by reversing roles.

COMEDIA describes a set of SDP extensions for negotiating which peer becomes the active party for connection oriented media. The MSRP-ACM draft describes how to apply COMEDIA to MSRP sessions, instead of just using the default assumption that the offerer is always the active party.

So if Alice and Bob had both supported the alternate connection model, Alice would have declared in the SDP offer that she could act as either the active or the passive party, since she was not behind a NAT.  Bob would respond in his SDP answer that he could only be the active party. Bob would then take over the role of active party, even though he was not the offerer. He would then open the TCP connection to Alice, and send the initial SEND request.
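The role negotiation can be sketched in a few lines of Python. The post does not spell out the SDP syntax, so this example assumes the COMEDIA “setup” attribute values from RFC 4145 (“active”, “passive”, “actpass”) and ignores “holdconn”. The function name is hypothetical, and the fallback to the offerer when the attribute is absent simply mirrors the default behavior described earlier.

```python
# Answers that make sense for each offered value (holdconn left out).
ALLOWED_ANSWERS = {
    "active": {"passive"},
    "passive": {"active"},
    "actpass": {"active", "passive"},
}


def tcp_client(offer_setup=None, answer_setup=None) -> str:
    """Return 'offerer' or 'answerer': whichever side opens the connection."""
    if offer_setup is None or answer_setup is None:
        return "offerer"  # no negotiation: offerer is the active party
    if answer_setup not in ALLOWED_ANSWERS.get(offer_setup, set()):
        raise ValueError(f"answer {answer_setup!r} does not fit offer {offer_setup!r}")
    return "answerer" if answer_setup == "active" else "offerer"


# Alice offers either role; Bob, behind a NAT, can only be active.
print(tcp_client("actpass", "active"))  # answerer
```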

This situation may not occur very often for sessions between end users. It’s far more likely that both parties have NATs or firewalls getting in the way, and you still need at least one party to use an MSRP relay or other NAT traversal technology. But MSRP-ACM can be extremely useful if one party is an application server, such as a conference bridge, offline message server, etc. It’s much more common for such server-class devices to be able to accept inbound TCP connections. But if they use MSRP without MSRP-ACM, then they would need a relay in order to initiate a session to an end-user. With the alternate connection model, that’s no longer necessary.

The SIMPLE working group is almost finished with the MSRP-ACM draft. The draft has completed working group last call (WGLC). The draft authors are working to resolve some WGLC comments, after which the group will submit the draft to the Internet Engineering Steering Group (IESG) for final evaluation before it becomes an RFC.

 

MSRP Target Path

February 3rd, 2010 by Ben Campbell under SIP

My last post described how MSRP endpoints use SDP to set up sessions. Today, we’ll discuss how the MSRP protocol uses the results of the SDP offer/answer exchange.

Each endpoint builds a “target path” that it will use for all MSRP communication with its peer. If an endpoint does not use a relay, then the target path is exactly the same as the SDP path attribute value it received from its peer. On the other hand, if the endpoint does use a relay, then it forms the target path by prepending the URI that it got from each relay to the peer’s SDP path attribute value. In both cases, the target path forms a roadmap of how to get an MSRP request to the peer, i.e. a list of MSRP URIs showing each hop to visit on the way, ending with the URI of the destination device.

If you recall from last time, Alice sent Bob an SDP offer containing the following:


a=path:msrps://alice.example.com:7654/asfd34;tcp

Bob responded with an SDP answer containing this:


a=path:msrps://relay.example.net:8211/asfioef;tcp msrps://bob.example.net:6581/asfd34;tcp

Since Alice didn’t introduce a relay (Bob’s relay doesn’t count here), she’s got life easy. Her target path is exactly what Bob sent her:


msrps://relay.example.net:8211/asfioef;tcp msrps://bob.example.net:6581/asfd34;tcp

 
Okay, I lied a little bit about Bob’s relay not counting. The relay clearly exists in the path, because Bob put it there. But since Alice did not introduce a relay on her own behalf, she uses Bob’s path value as-is, without worrying about relays in general.
 
On the other hand, Bob did introduce a relay. So to get his target path, he takes the path Alice sent him, and prepends his own relay, and gets this:
 


msrps://relay.example.net:8211/asfioef;tcp msrps://alice.example.com:7654/asfd34;tcp
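Both cases can be expressed as one small function. This Python sketch (the function name is an assumption) takes the path value received from the peer and the URIs of any relays the local endpoint introduced, assumed to be listed nearest relay first, and returns the target path as a list of hops.

```python
def build_target_path(peer_path: str, own_relay_uris=()):
    """Return the list of hops to reach the peer.

    peer_path is the peer's SDP path value, with or without 'a=path:'.
    """
    if peer_path.startswith("a=path:"):
        peer_path = peer_path[len("a=path:"):]
    return list(own_relay_uris) + peer_path.split()


RELAY = "msrps://relay.example.net:8211/asfioef;tcp"
ALICE = "msrps://alice.example.com:7654/asfd34;tcp"
BOB = "msrps://bob.example.net:6581/asfd34;tcp"

# Alice uses no relay of her own, so she takes Bob's path as-is.
print(build_target_path(f"a=path:{RELAY} {BOB}"))
# Bob prepends his relay to the path Alice sent.
print(build_target_path(f"a=path:{ALICE}", [RELAY]))
```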

 

Alice and Bob are now almost ready to exchange MSRP messages. But there’s one more step that must happen first. Alice must open a TCP connection towards Bob. Note that RFC 4975 says the offerer always opens the connection towards the answerer. There’s work afoot to allow MSRP endpoints to negotiate the connection direction using COMEDIA, but for now let’s assume Alice and Bob are using RFC 4975 as-is.

Alice connects to the first device in her target path. In this case, that’s Bob’s relay. She uses the DNS to get an IP address for “relay.example.net” and opens a TCP connection to port 8211, and starts sending messages. That’s assuming she doesn’t already have such a connection–for example, she might already have an MSRP session in progress with someone else that uses the same relay as Bob. In that case, she just uses the connection she already has.

Now, let’s pretend for just a moment that things were reversed, and Bob had sent the original offer. The first device in his target path is also “relay.example.net”. But since he already set up a connection to that relay when he authenticated with it, he doesn’t need to set up a new one. He just reuses the one he already has. The relay would establish a connection to the next hop (Alice, in this case) on demand.

When either endpoint wants to send a message to the other, it constructs a SEND request with the message content in its payload. The endpoint puts its target path in a To-Path header field, and its own URI in the From-Path header field. It then sends the request to the first device in the To-Path. If that device is the peer, then that’s pretty much all there is to it. If the first device is a relay, the relay removes its own URI from the To-Path and prepends it to the From-Path, then relays the request downstream.
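The relay's header rewriting is easy to show in code. In this Python sketch, paths are plain lists of URIs and the function name is invented for the example. The symbols “Alice”, “Bob”, and “Relay” stand in for real MSRP URIs.

```python
def relay_forward(to_path, from_path, relay_uri):
    """Return the (To-Path, From-Path) a relay would send downstream."""
    if not to_path or to_path[0] != relay_uri:
        raise ValueError("request is not addressed to this relay")
    new_to = to_path[1:]                  # drop our own URI from the front
    new_from = [relay_uri] + from_path    # record that we were on the way
    return new_to, new_from


# Alice sends to Bob through Bob's relay.
to_path, from_path = relay_forward(["Relay", "Bob"], ["Alice"], "Relay")
print(to_path, from_path)  # ['Bob'] ['Relay', 'Alice']
```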

Here’s a picture for Alice and Bob. I’ve replaced the actual URIs with the symbols “Alice”, “Bob”, and “Relay” to try to keep it readable.

[Figure: To-Path and From-Path]

At this point, you are probably (and rightly) wondering what the point is of a relay moving its URI to the From-Path header field when it relays a request. This allows a downstream device to send a response to a SEND request, in the form of a REPORT request. Don’t confuse this with a message from the human Bob in response to a message from the human Alice. That would simply be another SEND request in the opposite direction. Instead, REPORT requests carry delivery information about the original request.

The peer device, and any relays in between, can originate a REPORT request back to the endpoint that sent a SEND request. They do this by inserting the From-Path that they observed in the SEND request into the To-Path of the REPORT request. Here’s a picture showing a REPORT request sent by Bob, and another by Bob’s relay.

[Figure: REPORT Paths]
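In code, building the paths for a REPORT amounts to reusing the From-Path seen on the SEND request. A minimal Python sketch, again using the symbols “Alice”, “Bob”, and “Relay” in place of URIs and a hypothetical function name:

```python
def report_paths(send_from_path, own_uri):
    """Paths for a REPORT answering a SEND whose From-Path was observed."""
    return {
        "To-Path": list(send_from_path),  # retrace the way the SEND came
        "From-Path": [own_uri],
    }


# Bob saw the SEND after the relay had added itself to the From-Path.
print(report_paths(["Relay", "Alice"], "Bob"))
# The relay saw the SEND as Alice sent it.
print(report_paths(["Alice"], "Relay"))
```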

That’s enough for now. Next time, I’ll talk about how MSRP messages can be broken into “chunks” in order to multiplex multiple sessions across the same connection. 

 

 

 

MSRP SDP Extensions with Relays

December 22nd, 2009 by Ben Campbell under SIP

My last SIP Sessions post discussed the SDP offer/answer extensions used by MSRP in the peer-to-peer scenario. Today, we will look at how this changes when you introduce MSRP Relays into the mix.

RFC 4976 defines the MSRP relay extension. There’s quite a bit to talk about with MSRP Relays. Today we’re going to focus just on the parts that impact the offer/answer process. I’ll cover more about relays in a future post.

An MSRP client that needs to use an MSRP relay must first authenticate to the relay and request an MSRP URI that represents the session at that relay. It does this using an MSRP extension method called “AUTH”. We will dive into the gory details of AUTH after we discuss the general MSRP transaction model–also in future posts (Are you starting to see the pattern here?). But we need to understand it conceptually in order to explore how relays affect the offer/answer model.

The client sends the AUTH request to the relay over a TLS connection. The relay authenticates the client using a form of digest authentication much like that from HTTP and SIP. The client uses the TLS association to authenticate the relay.

Once the authentication completes the relay generates an MSRP URI that resolves to the relay itself. The relay puts this URI in a “Use-Path” header field in the 200 OK response that it sends back to the client in response to the AUTH request. The client then uses the relay’s URI in the session negotiation.

This is conceptually similar to how some other relay-based NAT traversal mechanisms work. For example, SOCKS and TURN each allow a client to request that a relay device allocate a port on its behalf.

Once the client gets a “Use-Path” header value from the relay, it can then build the SDP path attribute by appending its local URI to the relay URI. For example, assume the client’s URI is “msrps://client.example.com:2855/asfd34;tcp” and the relay returned a Use-Path value of “msrps://relay.example.com:7212/d3asdf43;tcp”. The path attribute would now look like the following:

a=path:msrps://relay.example.com:7212/d3asdf43;tcp msrps://client.example.com:2855/asfd34;tcp

You’re probably wondering why “Use-Path” is not called “Use-URI”. The reason is that, just like the SDP path attribute, “Use-Path” can contain more than one URI. There are a few cases where a relay might need to return more than one URI. (You guessed it–we’ll talk about those in a future post.) But regardless of why it might happen, the relay-using client would build the SDP path attribute by taking the entire contents of “Use-Path”, reversing it, then adding its own URI to the end. Thus, the SDP path attribute becomes an assertion: “To get to me, follow this path from left to right. My local URI is on the end.”
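That rule translates directly into a short Python sketch. The function name is an assumption, and the URIs in the usage lines are the ones from the example above.

```python
def sdp_path_attribute(use_path: str, local_uri: str) -> str:
    """Build the a=path line from a Use-Path value and the client's own URI."""
    relay_uris = use_path.split()
    relay_uris.reverse()              # reverse the Use-Path contents
    return "a=path:" + " ".join(relay_uris + [local_uri])


print(sdp_path_attribute(
    "msrps://relay.example.com:7212/d3asdf43;tcp",
    "msrps://client.example.com:2855/asfd34;tcp",
))
```

With a single relay URI the reversal is a no-op, which is why the simple description (append the local URI to the relay URI) gives the same result.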

Let’s look at a more complete example from RFC 4976. Alice invites Bob to an MSRP session. Alice does not use a relay, but Bob does. Alice’s offer looks something like the following:

v=0
o=alice 2890844526 2890844526 IN IP4 alice.example.com
s= 
c=IN IP4 alice.example.com
t=0 0
m=message 7654 TCP/TLS/MSRP *
a=accept-types:text/plain
a=path:msrps://alice.example.com:7654/asfd34;tcp

When Bob sees the offer, he connects to his relay (relay.example.net), and performs an AUTH transaction. He gets back a 200 OK response containing, among other things, the following:

 Use-Path: msrps://relay.example.net:8211/asfioef;tcp

Bob’s SDP answer then looks something like this:

v=0
o=bob 2890844542 2890844542 IN IP4 bob.example.net
s= 
c=IN IP4 bob.example.net
t=0 0
m=message 6581 TCP/TLS/MSRP *
a=accept-types:text/plain
a=path:msrps://relay.example.net:8211/asfioef;tcp msrps://bob.example.net:6581/asfd34;tcp

Note that in this case, Alice’s client does not have to implement RFC 4976 at all. It won’t know how to use the AUTH method, but even basic RFC 4975 clients can still talk to relay-using peers.

That’s enough for now. Next time, we will talk about how these SDP path attributes get used inside MSRP proper.

SDP Extensions for MSRP

November 11th, 2009 by Ben Campbell under SIP

This is my second in a series of posts about MSRP, or the Message Session Relay Protocol. My previous entry gave an overview of MSRP.
Like most other types of media that one can negotiate using SIP, you use the Session Description Protocol (SDP) Offer/Answer model to negotiate MSRP sessions. But MSRP differs from RTP in several ways, and these differences require some different approaches in SDP.
First, MSRP allows multiple sessions to use the same TCP connection. This means you can’t identify a session by IP address and port alone like with RTP. To get around this problem, MSRP defines its own URL scheme. Here’s an example:

msrp://host.example.com:2855/asfd34;tcp

In this example, “host.example.com” identifies the host. In this case, the host was identified by name–this could just as easily have been an IP address. The port is “2855”, and the session identifier is “asfd34”. The unvalued “tcp” parameter merely means that this session uses TCP. (Right now, all MSRP sessions use TCP. It could work with other reliable stream oriented transports, such as SCTP, but the IETF has not yet defined bindings for any but TCP.) Just like what the IP address and port do for RTP, the MSRP URI defines where a device wants to receive media. A given session has a separate URI for each endpoint.
You can also specify the use of Transport Layer Security (TLS) by using a URI scheme of “msrps”.
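To show how the parts of the URI break down, here is a simplified Python parser. It is only a sketch: the names are made up, and the pattern assumes the shape used in this post (scheme, host, explicit port, session identifier, one transport parameter). It does not try to cover everything the RFC grammar allows, such as IPv6 literals.

```python
import re
from typing import NamedTuple


class MsrpUri(NamedTuple):
    secure: bool       # True for the "msrps" scheme (TLS)
    host: str
    port: int
    session_id: str
    transport: str


_URI = re.compile(r"^(msrps?)://([^:/;\s]+):(\d+)/([^;\s]+);(\w+)$")


def parse_msrp_uri(uri: str) -> MsrpUri:
    match = _URI.match(uri)
    if match is None:
        raise ValueError(f"not an MSRP URI this sketch understands: {uri!r}")
    scheme, host, port, session_id, transport = match.groups()
    return MsrpUri(scheme == "msrps", host, int(port), session_id, transport)


print(parse_msrp_uri("msrp://host.example.com:2855/asfd34;tcp"))
```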
Since the SDP m-line syntax does not support the transfer of URIs, MSRP defines an SDP media-level attribute called “path”. A path attribute for our example would look like the following:

a=path:msrp://host.example.com:2855/asfd34;tcp

It’s called “path” because it can actually carry more than one URI. This is useful when an MSRP session crosses one or more relays. We’ll talk more about that when we cover MSRP relays in a later post.
We do not ignore the m-line and c-line completely. MSRP endpoints copy the host from the URI into the c-line, and the port into the m-line. The peer doesn’t use those fields–it looks at the path attribute instead. The fields are copied just in case something in the middle cares about them, and doesn’t understand the MSRP specific extensions. Finally, the m-line media field is set to “message”, the proto field to “TCP/MSRP”, and the fmt list to “*”. The first two identify the session as MSRP. Here’s an example:
 

c=IN IP4 host.example.com
m=message 7654 TCP/MSRP *

 
The m-line fmt field is ignored because MSRP has another extension attribute to describe allowable content formats: accept-types. The accept-types attribute carries a list of MIME format types that an endpoint understands, in order of preference. It can also include a “*” entry, meaning all types are acceptable. The following example indicates an endpoint is willing to accept any type, but prefers plain text or HTML:

a=accept-types:text/plain text/html *
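A sender could check a content type against the peer's accept-types value along these lines. This Python sketch uses a hypothetical function name and only handles exact matches and the “*” entry described above. Preference order is not modeled.

```python
def is_acceptable(content_type: str, accept_types: str) -> bool:
    """accept_types is the attribute value, e.g. 'text/plain text/html *'."""
    offered = [entry.lower() for entry in accept_types.split()]
    return "*" in offered or content_type.lower() in offered


print(is_acceptable("image/png", "text/plain text/html *"))  # True
print(is_acceptable("image/png", "text/plain"))              # False
```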

There’s also an “accept-wrapped-types” attribute, which is useful when you want to require the use of some envelope type such as “message/cpim”, but still negotiate the formats allowed inside that envelope. Use of “accept-wrapped-types” can get a bit complicated for this blog posting. If you’re interested, please see the RFC.
Here’s an example to tie it all together. Notice that the lines I didn’t mention are treated the same as for any other media type.

v=0
o=alice 2890844526 2890844526 IN IP4 host.example.com
s= 
c=IN IP4 host.example.com
t=0 0
m=message 7654 TCP/MSRP *
a=accept-types:text/plain
a=path:msrp://host.example.com:7654/asfd34;tcp
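The MSRP-specific lines in that example all derive from the endpoint's URI and its list of acceptable types. The Python sketch below generates them. The function name is invented, it only covers the plain “msrp” scheme, and it assumes the host can be written into an “IN IP4” c-line as in the example.

```python
def msrp_media_lines(path_uri: str, accept_types):
    """Return the c=, m=, a=accept-types and a=path lines for one endpoint."""
    scheme, rest = path_uri.split("://", 1)
    if scheme != "msrp":
        raise ValueError("this sketch only covers the plain 'msrp' scheme")
    authority = rest.split("/", 1)[0]        # "host:port"
    host, port = authority.rsplit(":", 1)
    return [
        f"c=IN IP4 {host}",                  # host copied from the URI
        f"m=message {port} TCP/MSRP *",      # port copied from the URI
        "a=accept-types:" + " ".join(accept_types),
        f"a=path:{path_uri}",
    ]


for line in msrp_media_lines("msrp://host.example.com:7654/asfd34;tcp", ["text/plain"]):
    print(line)
```

The remaining lines (v=, o=, s=, t=) would be filled in as for any other media type.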

   
That’s enough for today. Next time, we’ll discuss how the negotiation works with relays involved.

Overview of the Message Session Relay Protocol

September 29th, 2009 by Ben Campbell under SIP

(This is the first in a series of posts about MSRP, or the Message Session Relay Protocol.)

You’ve probably noticed several mentions of a messaging protocol known as MSRP in this blog. MSRP stands for the Message Session Relay Protocol (not manufacturer’s suggested retail price). MSRP was developed in the IETF by the SIMPLE working group. It’s documented in RFC 4975 and RFC 4976. The former describes the base protocol, and the latter describes the use of relays.

(Full Disclosure: I was the editor of RFC 4975. I am also a co-chair of SIMPLE–but not at the time the RFC was published.) 

SIP also has a built-in mechanism for sending instant messages: the MESSAGE method. With the MESSAGE method, the content of an instant message is carried as the payload in a SIP message. That is, message content is carried as part of the signaling path, much like with SMS. Also like SMS, the MESSAGE method provides a “pager” like user experience, where there’s no inherent connection between one message and another. Client devices may simulate conversation threads, but there’s no concept of conversations in the protocol itself. And even more like SMS, the MESSAGE method also has some pretty strict limitations on the size of the content in a single message.

MSRP is a fundamentally richer approach, and differs from the MESSAGE method in several ways. The biggest difference is that MSRP uses a media session that is separate from the SIP signaling, similarly to how RTP is used for audio, video, or other real-time media sessions.  You use a SIP INVITE transaction, carrying an SDP Offer/Answer exchange to establish an MSRP session, and a SIP BYE request to terminate the session.

The explicit start and stop of an MSRP session makes it easier to provide “chat-room” style user experiences. A user enters a chat room with an INVITE request, and leaves it with a BYE. And the fact MSRP is treated much like any other media makes it easier to mix messaging sessions with other media streams. For example, you might have a video conference stream with an associated text conference stream. Floor control features could apply to both streams.

MSRP is inherently multi-media. It can carry any arbitrary type of content, and has no built-in size limitations. Even though it was originally designed with messaging in mind, it’s really a generic content transfer mechanism. For example, I’ve seen demonstrations of an MSRP-based mechanism for sending a photo to the person on the end of a pre-existing VoIP call.

MSRP clients can communicate directly, just like RTP clients. They can also use intermediaries called MSRP Relays. (Remember, that’s the “R” in MSRP.) The main point of MSRP relays is for firewall and NAT traversal. The relays can also be used for policy enforcement and for content-logging. MSRP allows devices to multiplex many MSRP sessions over a single TCP session, which allows a pair of adjacent relays between peering partners to minimize the number of TCP connections.

MSRP is a young protocol, as protocols go. It has not seen a lot of deployment yet–but that is starting to change. We’re starting to see some MSRP deployment by operators, along with some very innovative applications–one of which Adam Roach recently pointed out.

I’ve only scratched the surface of MSRP in this post. Next time, I’ll delve into how MSRP works with the SDP Offer/Answer model, and some SDP extensions that were required to make it work.

But before I sign off for today, I’ve got to get a rant in. (Surely the reader has come to expect a rant from me, yes?) Way back when we were putting the finishing touches on the MSRP RFCs, I was working in a shared office environment. The building had very thin walls (fortunately I’m in a much nicer space now). I overheard a rather loud person on the phone, talking about MSRP with quite a bit of energy. Wow, I thought–people were already using this! Only later did I realize he was talking about MSRP as the list price of something or another.

The moral of this story is, when you choose a name for a new protocol, go do a web search. If you find hundreds of unrelated hits, it’s a bad name choice. In the case of MSRP, I’m not to blame–I didn’t name it. But what can you expect from a work group named “SIMPLE”? :-)

The Identity vs SBC Smack down

August 19th, 2009 by Ben Campbell under SIP

This is my third, and probably last for a while, post on RFC 4474 and the barriers to deploying strong identity solutions.

RFC 4474 does more than just provide identity. It also provides cryptographic protection of several fields in a SIP request as well as the request body. Each of these data elements is protected for one reason or another. For now, let’s focus on the message body.

In a typical INVITE request the message body contains an SDP payload that describes, among other things, the IP address and port at which the sender wishes to receive media. If that request was signed by an RFC 4474 identity service, the SDP payload is digitally signed. If an attacker tampers with the content en route, the recipient will be unable to verify the signature, and will know something fishy is going on.

Why would you care if someone tampered with your (or your peer’s) SDP? Imagine for a minute that RFC 4474 did not protect the SDP content. Alice sends an INVITE request to Bob, but Mallory performs a MITM attack. Mallory passes the request on with the identity information intact, but changes the IP address in the SDP body. Bob verifies the signature on the identity information, and has good reason to believe the request really came from Alice–but when Bob sends RTP media, he’s talking to Mallory instead.

This scenario could be worse than if no identity service was used at all, as Bob might divulge confidential information that he would not consider saying over an “unprotected” call. Therefore, RFC 4474 requires protection of the request body.

Enter the Session Border Controller, or SBC. One feature that most SBCs have in common is the ability to pin media sessions so that all media packets are sent through the SBC. The SBC sits on the signaling path (as a SIP B2BUA) and modifies SDP offers and answers so that the endpoints send all media packets to the SBC, which then forwards the packets on to their original destination.

There are a number of reasons to do this, many of which are in support of business requirements. (I will avoid the temptation to discuss whether all those requirements are reasonable–suffice it to say that many network operators require them). For example, many operators use SBC-based media pinning to deal with NAT traversal requirements.

But this common SBC technique is exactly the same as the MITM attack technique that Mallory used against Alice and Bob a couple of paragraphs back. The endpoints can’t tell the difference between Mallory’s malicious attack and the SBC’s attempt to be helpful. Or to state it more strongly, from the endpoint perspective, the SBC’s modification of SIP message bodies is indistinguishable from an attack.

The world of business requirements is a messy one. It’s common for requirements to conflict with one another, and service providers must make their own priority decisions. But as a customer of such services, I would be willing to pay more to get a strong identity service. I doubt I’m the only one.

More on RFC 4474

July 7th, 2009 by Ben Campbell under SIP

In my previous post, I refuted the PKI criticisms against the SIP identity mechanism described in RFC 4474. This week, I will describe a real issue.

One very real issue is the venerable phone number. Much of the industry has forgotten that SIP is not about phone numbers. The SIP URI takes a form more like an email address. It has one critical component that phone numbers do not have–that is, the authority. My email address (ben.campbell@tekelec.com) tells you two things. The obvious one is that my email user name is “ben.campbell”. But more importantly, it tells you that “tekelec.com” is the only legitimate authority on what the “ben.campbell” part means. There might be a “ben.campbell” at some other domain. That person is probably not me.

The same is true of SIP URIs. Imagine the URI of “sip:foo@bar.baz”. Only the domain of “bar.baz” knows what “foo” means. Furthermore, using RFC 4474, only “bar.baz” can authoritatively tell you that a SIP INVITE request really came from “sip:foo@bar.baz”. More concretely, RFC 4474 requires an entity owning a certificate bound to “bar.baz” to digitally sign a hash across several important bits of the INVITE request.

So let’s look at another example. Imagine you receive an INVITE from “tel:+1234567890”. Who is the authority now? RFC 4474 doesn’t handle this scenario. From a SIP purist perspective, this is a corner case. (Remember, SIP is not about phone numbers.) But from a real deployment perspective, phone numbers are the rule, not the exception.
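The difference between the two URI forms is easy to see in a few lines of Python. This is only a sketch: the helper name `signing_authority` is invented for illustration, and the parsing is deliberately naive (it ignores IPv6 literals and other corners of the SIP URI grammar). It simply pulls out the domain that could act as the authority, if there is one.

```python
def signing_authority(uri):
    """Return the domain that could vouch for this URI, or None.

    Naive parsing for illustration; not a full SIP or tel URI parser.
    """
    scheme, _, rest = uri.partition(":")
    if scheme.lower() not in ("sip", "sips"):
        # A tel: URI is just a number; it names no domain at all.
        return None
    hostport = rest.rsplit("@", 1)[-1]       # drop the user part
    hostport = hostport.split(";", 1)[0]     # drop URI parameters
    hostport = hostport.split("?", 1)[0]     # drop URI headers
    return hostport.split(":", 1)[0].lower() # drop the port

print(signing_authority("sip:foo@bar.baz"))
print(signing_authority("tel:+1234567890"))
```

For “sip:foo@bar.baz” the helper returns “bar.baz”. For the “tel:” URI it returns nothing, because there is no domain in the URI that could do the signing.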

One way around this is to have the calling domain use a SIP URI of the form “sip:+1234567890@bar.baz;user=phone” rather than using a “tel:” URL. Now we’ve got a bona fide authority that can sign things. But what does that signature mean? In the “sip:foo@bar.baz” case, the authority is saying something to the effect of “I promise on my reputation that foo is calling you. If you trust me, you can be sure of it.”

But the most common scenario where you get a SIP INVITE from a phone number is when the call originated on the PSTN and crossed a PSTN-to-SIP gateway. In this case, an RFC 4474 signature would mean something more like “I received a PSTN call with a calling party ID of +1234567890. It’s probably legit, but you’ve heard about all those Caller ID spoofing attacks, right?”

There’s been a lot of discussion about this issue in the Real-time Applications and Infrastructure (RAI) area of the IETF. I believe this can be solved pretty easily–we just have to agree on how. The issue is pretty esoteric–and considering that one of the major applications of SIP Identity is to provide the VoIP equivalent of Caller ID, we can probably live with it in the short run. Hopefully we can do better in the long run.

Next up from me: The Identity vs SBC Smackdown.

What’s Wrong with RFC 4474 Anyway?

May 27th, 2009 by Ben Campbell under SIP

In my most recent post, I mentioned that RFC 4474 is promising for end-to-end identity, but that there have been some issues deploying it. Let’s explore some of these issues. 

First, some background. As I mentioned, SIP does not come, out of the box, with a clean way to tell the recipient of an offer who sent the offer in the first place. Then came the P-Asserted-Identity (P-AID) extension.

P-AID gave us a new header field for carrying the sender identity–sort of a Caller-ID for SIP. But P-AID offered no integrity protection of the identity. It was useful for carrying identity inside a private network, but was useless if you crossed a trust boundary. There was no way to detect if someone forged or tampered with the identity information before it got to the destination domain.

RFC 4474 gives us the Identity and Identity-Info header fields. An abstract “authentication service” is responsible for authenticating the sender, and checking that the From value contains an address-of-record (AoR) that the sender is allowed to use. The authentication service inserts a signed hash of the From value (among other things) into the Identity header field, thereby providing integrity protection of the sender’s identity. If anyone tampers with the From value en route, the recipient can find out by checking the signature.

Sounds great, right? So, what’s the problem?

The argument I hear the most is that RFC 4474 requires a Public Key Infrastructure (PKI). The argument goes that, for one reason or another, we’ve not had success deploying large-scale PKIs in general, and it’s not feasible to issue a certificate to every user agent.

This argument neglects the fact that there is at least one very successful large-scale PKI deployment. That is, HTTPS. You probably use HTTPS every day to access your bank’s web page, make online purchases, etc. As it’s usually used, HTTPS doesn’t require your web browser to have a certificate–only the server has one. The various Certificate Authorities are quite happy to sell server certificates to be used with HTTPS.

The simplest way to deploy RFC 4474 is to make the same proxy/registrar that already authenticates user agents (typically using digest authentication) act as the authentication service. The only certificate needed is a server certificate for the proxy/registrar. Client certificates are completely unnecessary. This is exactly analogous to the certificate use in HTTPS.

In case you haven’t noticed, I don’t buy the PKI argument against RFC 4474. Not even a little bit.

In all fairness, there are other arguments against RFC 4474. Probably the biggest ones are the fact that it has problems working across session border controllers (SBCs), and it’s not clear what it means when the sender AoR is a PSTN number. But that’s enough for now–I will discuss the other issues in future posts.
