
Archive for June, 2010

FAQ: Why do SIP proxies sometimes absorb responses?

June 29th, 2010 by Robert Sparks under SIP

SIP proxies play a key role in realizing SIP’s rendezvous service – helping entities that want to communicate find each other by forwarding SIP requests to the places they can best be served.

 

A proxy can try the request at more than one place, either one at a time, a few at a time, or all at once. This behavior is called forking.

A request may be forked by more than one proxy on the way to its destination.

Each of the places receiving the request is supposed to generate a response. The proxy is responsible for choosing the “best” response to forward back to the requester. Except for one special case, the proxy will only return one final response to the request it receives. The final responses from the other branches of the fork are dropped at the proxy. RFC 3261 section 16.7 discusses how the proxy chooses the best response.
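The selection rule in RFC 3261 section 16.7 can be sketched roughly as follows. This is a simplified illustration, not a proxy implementation: the function name `choose_best_response` and the plain list of status codes are assumptions made for the example, and it covers only the case where no 2xx has arrived (a 2xx is forwarded as soon as it is seen).

```python
# Status codes that help the requester resubmit; RFC 3261 section 16.7
# suggests preferring these when the 4xx class is chosen.
PREFERRED_4XX = (401, 407, 415, 420, 484)


def choose_best_response(final_codes):
    """Pick the one final response a forking proxy forwards upstream.

    `final_codes` holds the non-2xx final status codes collected from
    all branches (a hypothetical input shape for this sketch).
    """
    if not final_codes:
        raise ValueError("no final responses collected")
    if any(code < 300 or code > 699 for code in final_codes):
        raise ValueError("expected only 3xx-6xx final responses")

    # A 6xx from any branch wins outright.
    global_failures = [c for c in final_codes if c >= 600]
    if global_failures:
        return global_failures[0]

    # Otherwise choose from the lowest response class present.
    lowest_class = min(c // 100 for c in final_codes)
    candidates = [c for c in final_codes if c // 100 == lowest_class]

    # Within 4xx, prefer responses that affect resubmission.
    if lowest_class == 4:
        for code in candidates:
            if code in PREFERRED_4XX:
                return code
    return candidates[0]


responses = [486, 480, 407]
best = choose_best_response(responses)
absorbed = [c for c in responses if c != best]
print(best, absorbed)
```

Everything in `absorbed` is what the requester never gets to see, which is the root of the problem discussed next.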

SIP proxies were originally designed this way to allow the endpoints to have the same behavior whether there were proxies in the path of a request or not. (Other design decisions forced different behavior at the endpoints anyhow.) Dropping the other responses was a tradeoff, and it introduced a problem known in the specifications as HERFP (the Heterogeneous Error Response Forking Problem). When there is a mix of error responses from the various fork branches, only one is returned to the requester, but that requester might have been able to do something useful with the other responses. In the example above, the rightmost phone’s response could have included information about when to try the request again. The original requester would have learned that a future call there had a reasonable probability of being accepted.

Though the condition is known as HERFP, it also applies to non-error responses for non-INVITE requests like REGISTER or SUBSCRIBE. For any non-INVITE request, a proxy will only return one final response, whether it’s a success or error response. This is why the SIP Events extension to SIP requires that elements accepting a subscription MUST send an immediate NOTIFY.

 
 

There is one special case where a proxy might return more than one final response to a request. When a proxy sees a 200 OK to an INVITE, it is required to forward that to the requester, even if it has already forwarded a 200 OK from another branch. This exception was added to the protocol to allow the calling user to choose which person to talk to if more than one endpoint answers a call. Other protocol rules try to make this condition unlikely. When the proxy sees the first 200 OK to the INVITE, it will send a CANCEL request to all the other branches. A second 200 OK could only be received from one of those branches if it crossed that CANCEL on the wire (or in the processing at the endpoint). Unfortunately, it’s not hard to encounter that race condition in practice. It’s up to the endpoint to decide what to do if it finds itself in multiple calls after sending an INVITE. Many deployed endpoints today send an immediate BYE to peers beyond the first accepting their call.
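
The common endpoint behavior described above can be sketched as follows. This is a minimal illustration under stated assumptions: the class `ForkedInviteCaller`, its method name, and the use of the To tag to tell answering endpoints apart are choices made for the example, not taken from any particular SIP stack. It records the requests a caller would send rather than sending anything.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class ForkedInviteCaller:
    """Tracks 200 OK responses to one INVITE that may have forked."""

    kept_tag: Optional[str] = None
    sent: List[Tuple[str, str]] = field(default_factory=list)

    def on_200_ok(self, to_tag: str) -> None:
        # Every 200 OK to an INVITE is acknowledged, including
        # retransmissions and answers from extra branches.
        self.sent.append(("ACK", to_tag))
        if self.kept_tag is None:
            # First endpoint to answer: keep this call.
            self.kept_tag = to_tag
        elif to_tag != self.kept_tag and ("BYE", to_tag) not in self.sent:
            # A second endpoint answered before its CANCEL took effect:
            # hang up on it immediately (policy assumed for this sketch).
            self.sent.append(("BYE", to_tag))


caller = ForkedInviteCaller()
caller.on_200_ok("tag-desk-phone")
caller.on_200_ok("tag-mobile")      # lost the race with CANCEL
caller.on_200_ok("tag-desk-phone")  # retransmitted 200 OK
print(caller.sent)
```

An endpoint that wanted to let the user choose between answering parties would replace the immediate BYE with a prompt; the bookkeeping stays the same.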

There have been a few proposals in the past for changing or extending the protocol to avoid HERFP, allowing the endpoint to learn about all the final responses that currently get absorbed at proxies. None of them achieved consensus. So far, it’s an open problem.

Are Cell Phones Going to Claim the Identity Assertion Role?

June 17th, 2010 by Jiri Kuthan under SIP

Last week, we had a distinguished professor, Andrew Odlyzko, as an invited guest in our “Brilliance in Innovation” lecture series. The critique Andrew articulated about state-of-the-art security systems was that raising the security level to perfection is expensive and hard to achieve. At the same time, establishing secure communication over multiple, though simple, channels can achieve reasonable robustness without the curse of perfection. An example of this is confirming online banking transactions by sending a text message from your cell phone.

What I find interesting about this example is the role of a cell phone – an extremely useful and at the same time underutilized instrument to establish one’s identity.

Clearly, identity is a key notion in our society. We know many forms of it, and life without them would be less convenient. Think of a credit card number for payments, a car license plate to identify traffic offenders, an online auction history to establish the reputation of buyers and sellers. None of these identity types is as watertight as DNA, but still, how comfortable would you feel communicating, socializing, trading and living without them?

Nowadays thinking and sharing out loud over the Internet, as manifested by Facebook, seems to prevail in users’ desires. I believe, though, that within a few years a desire for a privacy refuge will strongly emerge. In other words, users (not just businesses but consumers too) will want to be in control of what they share with whom. The technical prerequisite is establishing who is who, and the question to busy ourselves with is HOW? The answer must be an instrument that is workable on a worldwide basis and must have a price tag; other forms of identity would simply be worthless. Could cell phones do this for us?

My quick answer is yes. I’m betting on cell phones to be THE identity vehicle within the next five years. Cell phone companies have established immense coverage of the global population, have close relationships with their users, and are therefore in a fairly good position to assert their identity. Mobile Internet’s share is increasing rapidly, and cell phone-based identity can be used on a global basis for any application. The applications may include not just online services, such as online banking, but also confidentiality by encryption, or payment.

This all still seems certain to me: stronger identity, confidentiality, and payments are all tied to cell phones. What remains to be seen, though, is who is going to prevail in putting this “new non-anonymous world” together. Mobile providers would apparently be the obvious example. However, the most widely deployed applications, like SMS-verified online banking, didn’t want to wait and have carefully avoided dependence on mobile operators and phones. They work simply and efficiently in-band.

There are, however, scenarios with higher demands for security: think of confidentiality and how inconvenient it would be to run an encryption handshake over text messages. Applications on smartphones could ask third parties for identity assertion. Conversely, the applications could ask service providers for the same assertion.

So the final question to myself is: will it be the Verizons or VeriSigns of this world who will add the notion of identity to the sharing culture?

The MSRP Report Model

June 9th, 2010 by Ben Campbell under SIP

MSRP offers a highly configurable reporting model. The model offers two mechanisms for an endpoint to learn the status of the messages it sends.
 
The first, and simplest, mechanism is the transaction response to a SEND request. Here’s an example from RFC 4975:

MSRP a786hjs2 SEND
To-Path: msrp://biloxi.example.com:12763/kjhd37s2s20w2a;tcp
From-Path: msrp://atlanta.example.com:7654/jshA7weztas;tcp
Message-ID: 87652491
Byte-Range: 1-25/25
Content-Type: text/plain

Hey Bob, are you there?
-------a786hjs2$

MSRP a786hjs2 200 OK
To-Path: msrp://atlanta.example.com:7654/jshA7weztas;tcp
From-Path: msrp://biloxi.example.com:12763/kjhd37s2s20w2a;tcp
-------a786hjs2$

A SEND transaction response contains a status code similar to the status in a SIP response. Keep in mind, though, that the semantics of each status code are not quite identical to those for SIP. In this case, the “200” status indicates successful delivery. Since the To-Path and From-Path header fields each contain a single URI, we know this transaction was sent peer-to-peer. That is, there are no MSRP relays involved in this transaction. Also, notice that the To-Path and From-Path values are reversed in the response. Unlike SIP, which copies the From and To values from the request into the response without switching them, the To-Path and From-Path header fields in an MSRP response indicate the actual routing of the response.

Responses to SEND requests are sent hop-by-hop, rather than end-to-end. This is due, among other things, to the way that MSRP relays can re-chunk messages, as I discussed in a previous post. Responses for non-SEND methods are sent end-to-end.

If the sender doesn’t get the response within 30 seconds, it assumes the request failed.
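
To make the path reversal concrete, here is a rough sketch that reads the SEND request from the example above and builds the matching 200 OK. The helper names (`parse_start_line_and_headers`, `build_200_ok`) are invented for this illustration, and taking the first URI of each path is an assumption that fits the single-URI, peer-to-peer example; it is not a complete MSRP parser.

```python
CRLF = "\r\n"


def parse_start_line_and_headers(raw):
    """Return (transaction_id, method, headers) for an MSRP request."""
    lines = raw.split(CRLF)
    tag, transaction_id, method = lines[0].split(" ", 2)
    if tag != "MSRP":
        raise ValueError("not an MSRP message")
    headers = {}
    for line in lines[1:]:
        # Headers stop at the blank line before the body or at the end-line.
        if line == "" or line.startswith("-------"):
            break
        name, _, value = line.partition(": ")
        headers[name] = value
    return transaction_id, method, headers


def build_200_ok(raw_request):
    """Build the hop-by-hop 200 OK for a SEND request."""
    transaction_id, method, headers = parse_start_line_and_headers(raw_request)
    if method != "SEND":
        raise ValueError("this sketch only answers SEND requests")
    # The response travels back one hop, so the two paths swap roles.
    to_path = headers["From-Path"].split(" ")[0]
    from_path = headers["To-Path"].split(" ")[0]
    return CRLF.join([
        f"MSRP {transaction_id} 200 OK",
        f"To-Path: {to_path}",
        f"From-Path: {from_path}",
        f"-------{transaction_id}$",
    ]) + CRLF


request = CRLF.join([
    "MSRP a786hjs2 SEND",
    "To-Path: msrp://biloxi.example.com:12763/kjhd37s2s20w2a;tcp",
    "From-Path: msrp://atlanta.example.com:7654/jshA7weztas;tcp",
    "Message-ID: 87652491",
    "Byte-Range: 1-25/25",
    "Content-Type: text/plain",
    "",
    "Hey Bob, are you there?",
    "-------a786hjs2$",
]) + CRLF

print(build_200_ok(request))
```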
  
The hop-by-hop nature of SEND responses can be insufficient if MSRP relays are in the session path. How can the sender learn about the success or failure of a message once a relay has forwarded it downstream? For example, what if the relay gets a failure response from its next hop? For this scenario, the relay can send an MSRP REPORT request. Here’s an example:

MSRP dkei38sd REPORT
To-Path: msrp://alicepc.example.com:7777/iau39soe2843z;tcp
From-Path: msrp://bob.example.com:8888/9di4eae923wzd;tcp
Message-ID: 12339sdqwer
Byte-Range: 1-106/106
Status: 000 200 OK
-------dkei38sd$

A report request carries very similar information to the transaction response for a SEND request. Notice the “Status” header field, which carries the same sort of status code that you might see in a transaction response. The “000” prefix indicates the status code is one of the ones defined in the base specification. This is an extension hook, where other specifications could create status code “name spaces” with different prefixes. Remember from my last post that a REPORT request reports status for a range of bytes, which may or may not line up with the range in any particular SEND request that the sending endpoint actually sent.
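
As a small illustration, the two fields a sender cares about most in a REPORT, Status and Byte-Range, can be pulled apart like this. The function names are made up for the sketch, and it assumes well-formed header values like the ones in the example above.

```python
def parse_status(value):
    """Split a Status header value into (namespace, code, comment)."""
    parts = value.split(" ", 2)
    if len(parts) < 2:
        raise ValueError("Status needs a namespace and a status code")
    namespace, code = parts[0], int(parts[1])
    comment = parts[2] if len(parts) == 3 else ""
    return namespace, code, comment


def _int_or_none(text):
    # "*" marks an unknown value in a Byte-Range.
    return None if text == "*" else int(text)


def parse_byte_range(value):
    """Split a Byte-Range value into (start, end, total)."""
    span, _, total = value.partition("/")
    start, _, end = span.partition("-")
    return int(start), _int_or_none(end), _int_or_none(total)


namespace, code, comment = parse_status("000 200 OK")
start, end, total = parse_byte_range("1-106/106")
print(namespace, code, comment, start, end, total)
```
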
A REPORT request is a bona fide MSRP request. However, MSRP devices do not send transaction responses for REPORT requests. They should never send REPORT requests in response to other REPORT requests. REPORT requests that include a failure code, aka “Failure Reports”, are sent from the reporting element all the way back along the session path to the sender of the original request. On the other hand, “Success Reports” are always sent end-to-end, since only the endpoint can know for sure that a message was delivered successfully.
This is all complicated by the fact that MSRP allows a great deal of configuration in the reporting model. When the sender creates a SEND request, it can independently select whether it wants to receive success reports and failure reports, on a per message basis. It does this by including the optional Success-Report and Failure-Report header fields. 
Success-Report can take a value of “yes” or “no.” Failure-Report can take those same values, plus the value of “partial.” The defaults are “no” for Success-Report and “yes” for Failure-Report.
If the sender wants delivery confirmation, it sets Success-Report to “yes”. The default is “no”, so if the header is not inserted, success reports won’t be sent.
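
The defaults can be captured in a few lines. This is only a sketch: `report_preferences` and the plain dict of headers are assumptions made for the example.

```python
SUCCESS_VALUES = ("yes", "no")
FAILURE_VALUES = ("yes", "no", "partial")


def report_preferences(headers):
    """Return (success_report, failure_report) for a SEND request.

    `headers` is a plain dict of header names to values (an assumed
    shape for this sketch). Missing headers fall back to the defaults.
    """
    success = headers.get("Success-Report", "no")
    failure = headers.get("Failure-Report", "yes")
    if success not in SUCCESS_VALUES:
        raise ValueError(f"bad Success-Report value: {success!r}")
    if failure not in FAILURE_VALUES:
        raise ValueError(f"bad Failure-Report value: {failure!r}")
    return success, failure


print(report_preferences({}))
print(report_preferences({"Success-Report": "yes"}))
```
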
Along the same lines, if the sender wants to suppress failure reports, it can set Failure-Report to “no.” But there’s a catch here: if Failure-Report is “no,” that also suppresses transaction responses to the request. That means there’s really no failure detection at all, other than what might be detected by TCP. That may be counter-intuitive, but it makes sense for some applications. Examples include system messages sent by an administrator or broadcast messages sent by an emergency services agency. It also may make sense for high-volume applications where it would be too heavyweight to send all those responses, and the TCP layer provides sufficient reliability. Keep in mind that Failure-Report and Success-Report can be set independently, so you can suppress failure reports and transaction responses, but still request success reports.
Then there’s the really strange-sounding mode, where Failure-Report is set to “partial.” This mode suppresses successful (200) transaction responses, just as “no” does, but still allows failure responses and failure reports. This lets MSRP elements opportunistically report any errors they learn about through other means, for example, transport-layer errors detected by downstream devices.
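
Putting the REPORT rules from this post together, a device’s decision about whether it may send a report could look roughly like the following. The function `may_send_report` and its parameters are hypothetical names for this sketch; it covers only REPORT requests and deliberately leaves transaction responses out.

```python
def may_send_report(kind, success_report="no", failure_report="yes",
                    is_receiving_endpoint=False, about_method="SEND"):
    """Decide whether an MSRP element may send a REPORT.

    kind: "success" or "failure".
    success_report / failure_report: header values from the original
    SEND request, with the defaults applied.
    about_method: method of the request the report would be about.
    """
    # Never answer a REPORT with another REPORT.
    if about_method == "REPORT":
        return False
    if kind == "success":
        # Only the endpoint that received the message knows it arrived.
        return is_receiving_endpoint and success_report == "yes"
    if kind == "failure":
        # Any element on the path may report a failure unless the
        # sender turned failure reports off.
        return failure_report in ("yes", "partial")
    raise ValueError(f"unknown report kind: {kind!r}")


print(may_send_report("failure", failure_report="partial"))
print(may_send_report("success", success_report="yes"))
```
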
I think this will be the last of my MSRP-related posts for a while, unless readers have specific questions. Please feel free to ask questions in the comments section. Otherwise, I will move on to some new and different topic in my next post.

SIP and “Secure” Communication: What does it mean?

June 1st, 2010 by Adam Roach under SIP

One of the recurring topics in the discussion of SIP security is how you give users the information they need to make informed decisions. In most of these conversations, a parallel is drawn between web browser security and SIP security – usually, in terms of  “why can’t SIP terminals have a simple lock icon that tells the user the call is secure?” And all major web browsers do have a simple visual indicator, like these two from Internet Explorer and Firefox:

[Screenshots: the lock icons shown by Internet Explorer and Firefox]

Unfortunately, the issue with SIP is significantly more difficult than that. With web browsers, you really need to ensure only two things: that the website you’re connecting to is the website you think you’re connecting to (authentication), and that no one other than you and the website can see the information you’re sending and receiving (confidentiality). For the web, this is easy to do because TLS (used by https) provides both of these properties.
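
For comparison, here is roughly how little a web client has to ask for to get both properties. This sketch uses Python’s standard `ssl` module and only inspects the default settings; the commented-out connection code and the host name in it are placeholders.

```python
import ssl

# The default client context turns on both checks a browser relies on.
context = ssl.create_default_context()

# Authentication: the server must present a certificate that chains to
# a trusted authority and matches the host name we asked for.
print(context.verify_mode == ssl.CERT_REQUIRED)
print(context.check_hostname)

# Confidentiality comes from wrapping the TCP connection, for example:
#
#   import socket
#   with socket.create_connection(("www.example.com", 443)) as sock:
#       with context.wrap_socket(sock, server_hostname="www.example.com") as tls:
#           ...  # everything sent over `tls` is encrypted
```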

With SIP, you have at least five different major problems to solve – and possibly more, depending on how you account for them: Caller-ID, Called Party Identity, Media Privacy, Media Authentication, and Signaling Confidentiality.

Caller ID and Called Party Identity

First, when a call arrives, the user is going to want to know who is calling, similar to Caller-ID on today’s PSTN. Jiri did a series of posts (1, 2, 3) detailing the need for identity in the SIP network. (While this is a good treatment of the need for identity, I think its conclusion – that we should use the same spam-prevention mechanisms as email – is a bit naïve; as Ben later points out, 94% of all email is spam, and I think we need to do better than that.)

While some techniques can be employed to “spoof” caller ID information on the PSTN, it’s difficult to do, so people generally can and do trust what their phone says when it rings. On the other hand, since SIP signaling flows all the way out to the edge of the network, this kind of identity is much easier to fake in a SIP network. Some deployment architectures have developed specialized “transitive trust” models that get you pretty close to what the PSTN provides today, but they don’t work across the general Internet, or when you transition from one architecture to another.

A more robust means of conveying identity is provided by RFC 4474, which uses cryptography to let a proxy on the call path make an assertion about the calling party’s identity. Unfortunately, RFC 4474 does suffer from some deployment difficulties, such as perceived deficiencies in key distribution, the difficulty in asserting ownership of phone numbers, and bad interactions with SBCs. And while there are good answers to each of those issues, they have still slowed down acceptance of RFC 4474 as a solution.

A related issue is validation that the person you’re trying to reach is the person you’ve actually reached. For example, if Alice is trying to reach Bob but really reaches Charlie, she needs to know this to make an informed decision. This is even more important when Alice is trying to reach, for example, her bank. There are fairly benign reasons that the called party might not be who the caller was trying to reach – a call-forwarding service, for example – but it also may indicate something more nefarious. To fill this niche, RFC 4916 defines a mechanism for conveying called party identity back to a calling party. It shares RFC 4474’s strengths (cryptographic assertions, leveraging the web’s public key infrastructure), but suffers from the same drawbacks as well.

One interesting twist to the behavior of RFCs 4474 and 4916 is that they only protect the caller and called parties’ addresses, not their names. To protect things like caller names, it becomes necessary to use a mechanism like cryptographic certificates with S/MIME.

Media Privacy and Authentication

Another user expectation of “secure calls” is a guarantee that third parties cannot intercept their call.  This is especially important when users make calls on a shared network, such as a public WiFi network, a hotel network, or certain types of cable networks. Unless the media itself is encrypted, anyone on the same network can use any one of a variety of easy-to-use call interception tools, including some very sophisticated, free ones, and record any call or calls they want to.

The other issue with media is ensuring that the media you receive is coming from the person you think it is. The ability to insert new media into a call can be highly damaging for certain types of calls.

Unfortunately, this area has historically suffered from too many solutions, as opposed to not enough. Luckily, the IETF finally winnowed the solution space down to a single approach for SIP media encryption: RFC 5763. There is also a competing solution in ZRTP. This approach has some interesting properties that Jiri discussed in a previous posting, but it also suffers some non-technical drawbacks (see my response at the end of that article) that are likely to limit its deployment outside of the open source and hobbyist communities. And, while ZRTP provides encryption, it requires an onerous manual step to ensure that you’re talking to the person you think you’re talking to (and, without this protection, your call can be listened to by a sophisticated attacker in the middle of the network).

Hopefully, with the recent publication of RFC 5763, we’ll start seeing more vendor support for media privacy and authentication.

Signaling Confidentiality

A final aspect of SIP security that needs to be addressed is confidentiality of the signaling information itself. For voice calls, access to the signaling allows you to figure out who called whom and when. And, while the privacy implications of exposing that kind of information are evident enough, things get much worse once you start mixing in features like instant messaging and presence: eavesdroppers on this information can learn highly sensitive information, such as the contents of instant message conversations.

Support of TLS to protect information as it passes between network entities (say, from a phone to its proxy) is required by the baseline SIP protocol, and is fairly widely implemented (on average, approximately 50% of the implementations at the SIPit interop event have had TLS support over the past few years). That’s a really good way to ensure that arbitrary third parties can’t eavesdrop on the information being sent.

But TLS doesn’t protect information from being intercepted by servers on the call path.

And while I might be happy to get my SIP service from bobs-discount-voip.com, I may be a bit more reluctant to trust them with things I send and receive via instant messages, such as my banking information. And that brings us back to the use of S/MIME certificates, which can be used to hide this kind of information from proxies on the path (while still providing them enough information to route messages correctly).

Summary

So, back to the original question: if you wanted to have a simple, visual indicator to indicate that a call is secure… what would it mean? Is it a promise that the phone number on the caller ID is correct? How about the name? Does it mean that the media is encrypted? And, if it is, can you be sure it’s coming from where you think it’s coming from? Is the signaling protected? And, if so, is it protected from everyone, or can proxies along the call path read it? There are so many degrees of freedom here that there’s no good way to render them all to the user in a sensible fashion. And an all-or-nothing indicator (like a single lock icon) is completely nonsensical – as you’ve seen, SIP security is just about as far from “all-or-nothing” as you can get.

At this point, sadly, it’s mostly a moot point anyway – just about all SIP service providers employ exactly none of these techniques. But as user expectations around identity and privacy start colliding with the reality of service providers’ carelessness, we’re going to run into a few challenges making sure that users can be given the information they need to make informed decisions.

