Archive

Posts Tagged ‘Standards’

SIMPLE Working Group Update

March 1st, 2011 by Ben Campbell under SIP

As I and others on this blog have mentioned on several occasions, the SIMPLE (or the more formal and rather awkward: “SIP for Instant Messaging and Presence Leveraging Extensions”) working group of the IETF has been responsible for defining how to do Presence and Instant Messaging applications using SIP and related protocols. The SIMPLE working group has existed for some time; in fact, it’s one of the oldest ongoing working groups in the Real-time Applications and Infrastructure (RAI) area of the IETF. I am currently a co-chair for SIMPLE.

I write to tell you that SIMPLE’s work is almost done. We are finally seeing the light at the end of this long tunnel. Of the four remaining work items, one is in the AUTH48 state. (This means that the RFC editor has presented a candidate for the final RFC version back to the authors for any last-minute edits and approval.) One entered Working Group Last Call (WGLC) last week. There are only two work items that may still see controversy, and one of those is in IESG review.

These drafts are, respectively, draft-ietf-simple-msrp-acm, draft-ietf-simple-simple, draft-ietf-simple-msrp-sessmatch, and draft-ietf-simple-chat.

The first draft extends the MSRP protocol to allow the endpoints to negotiate which one will open a TCP connection to its peer. I blogged about this draft some time ago. We should see publication of the resulting RFC any day now. In fact, it’s already been assigned a number: RFC 6135. [Update: RFC 6135 was officially published on February 28.]

The second, draft-ietf-simple-simple (aka “SIMPLE made Simple”), is an informational draft that acts as a road-map and secret-decoder-ring for the various specifications produced by the SIMPLE working group. (Keep in mind that there is no one protocol known as SIMPLE. But we still tend to use the term SIMPLE informally to refer to the resulting suite of protocols and architecture.) The fact that this draft is in WGLC means the author believes that this draft is essentially ready to be sent to the IESG for final review and publication. It’s possible that the last call review could uncover some controversial point that would require more work. But given the nature of this draft, I expect that any WGLC feedback is more likely to be clarifications and editorial comments.

We do know in advance, however, that draft-ietf-simple-simple may require minor editing to reflect the final disposition of the last two drafts below. This means that, regardless of its current completion state, draft-ietf-simple-simple will be the last draft to be published by SIMPLE.

Draft-ietf-simple-msrp-sessmatch describes an extension to MSRP to make it more friendly to Session Border Controllers (SBCs). The way that MSRP devices match TCP connections to message sessions means that, if an MSRP session traverses an SBC, that SBC has to re-write the To-Path and From-Path header fields in a manner similar to an MSRP Relay. Some working group participants expressed concern that this requirement could impact SBC performance. The sessmatch draft would allow supporting endpoints to work across SBCs that do not change MSRP messages en route. However, there are still ongoing discussions concerning the impact on security and interoperability.

Assuming that the sessmatch draft has not become a moot point by then, I plan to go into considerably more detail on it and the surrounding controversy in my next blog entry.

Then, finally, there’s draft-ietf-simple-chat. This draft defines how to create MSRP “chatrooms” with conference servers. There’s still some controversy over how this draft interacts with some similar work from the XCON working group.

Hopefully, we will resolve the issues around these last two drafts soon–at which time I hope to be able to entitle a blog entry as “SIMPLE Finally Done!”

SIP and Network Data: Simplified at Last

August 24th, 2010 by Adam Roach under SIP

Many proposed deployments of SIP are seeing an increasing number of components based on HTTP – for storing information such as feature settings, user-provisioned data, instant message archives, and user agent configuration information.

Unfortunately, while HTTP’s ubiquity makes it a good candidate for storing and retrieving this information, it doesn’t serve as a particularly good substrate for finding out as soon as the information changes. In a real-time system, finding out about these changes immediately becomes important.

There have been some efforts to make HTTP more responsive to these kinds of changes – using, for example, Comet-style approaches. In fact, the IETF has even begun work on a standardized mechanism in the HYBI working group. But Comet stretches the HTTP request-response architecture well beyond its original design goals, resulting in a less-efficient and less-scalable model than can be achieved by a purpose-built solution. And HYBI is still a very young working group, unlikely to yield usable results in the short-term.

Now, the SIP community did recognize the need to store information in the network and discover changes to such information pretty early on. It is for this exact purpose that XCAP (RFC 4825) was developed. Technically, XCAP only provides the mechanism for storing and retrieving information in the network – it is used in conjunction with two companion specifications – RFC 5874 and RFC 5875 – to find out about changes to the information in real time.

This XCAP approach is really very robust, scalable, and well-designed. Unfortunately, at 122 pages spread across three documents, it also ended up being rather labyrinthine. It also requires the data to be stored not just as XML, but as XML with certain special restrictions that make addressing individual elements in the document easier. As a consequence, the implementation community, industry fora, and other standards bodies – and IETF working groups, for that matter – have been somewhat loath to use XCAP.

Clearly, we need a simpler mechanism.

The forthcoming RFC 5989 defines exactly this simpler mechanism. Rather than trying to define a large, complicated framework, RFC 5989 defines a fairly minimal SIP event package. Interested clients can use this event package to request notification whenever a specified HTTP resource changes. Here’s how it works.

When a client gets an HTTP resource, it also receives what is called a “link relation.” A link relation is simply a URI that is related to the resource in some way. Link relations can be carried in the HTTP response header, in HTML bodies (using the <link> element), and in Atom bodies (using the <atom:link> element). The client also receives a unique identifier (typically, an ETag) that corresponds to the current contents of the HTTP resource. So, if the resource changes, this unique identifier changes as well.

RFC 5989 defines a new link relation type that contains a SIP URI. Clients who want to know when the resource changes subscribe to this SIP URI using the SIP SUBSCRIBE method. Whenever the HTTP resource changes, the clients receive a new SIP NOTIFY message containing a new unique identifier for the changed HTTP resource. They can then compare this tag against the tag for their local copy of the resource and, if the two differ, download a new copy.
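
The compare-and-refetch step on the client side can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from RFC 5989: the names `CachedResource`, `handle_notify`, and `fake_fetch` are made up for this example, the URL is a placeholder, and a dictionary stands in for the real HTTP GET. The SIP SUBSCRIBE/NOTIFY exchange itself is not modeled; only the tag carried by the notification is.

```python
from dataclasses import dataclass


@dataclass
class CachedResource:
    url: str
    etag: str
    body: bytes


def handle_notify(cache, notified_etag, fetch):
    """Return an up-to-date copy, refetching only when the notified tag differs."""
    if notified_etag == cache.etag:
        return cache  # local copy is already current; no HTTP request needed
    etag, body = fetch(cache.url)
    return CachedResource(cache.url, etag, body)


# Hypothetical stand-in for an HTTP GET: maps URL -> (ETag, body).
server = {"http://example.com/settings": ("v2", b"ring=loud")}
fetch_calls = []


def fake_fetch(url):
    fetch_calls.append(url)
    return server[url]


cache = CachedResource("http://example.com/settings", "v1", b"ring=soft")
unchanged = handle_notify(cache, "v1", fake_fetch)  # tag matches: keep the cache
updated = handle_notify(cache, "v2", fake_fetch)    # tag differs: download again
```

The point of the design is that the notification carries only the tag, so the client touches the HTTP server just once per actual change.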

By using this approach, clients can maintain a completely up-to-date view of the value of an HTTP resource without constantly polling the HTTP server, resorting to the long-poll approaches of Comet, or burdening the data with the restrictions and complications of XCAP.

Enabling Location-Based Services while Protecting Privacy

August 19th, 2010 by Robert Sparks under SIP

An increasing number of portable devices (such as cell-phones) are becoming location-aware. Services such as restaurant finders, turn-by-turn navigation tools, and social networking sites are already leveraging any location information these devices provide. So far, these services primarily use custom, proprietary mechanisms to convey location information from the devices to the application.

Standard mechanisms for representing and conveying location information have been defined. These standards recognize that carrying a simple geospatial (latitude-longitude) coordinate or a civic address isn’t sufficient. It’s also important to indicate how this location may be used. The IETF’s GEOPRIV working group has specified a Location Object format that addresses that concern, as well as a rich policy language that allows a user to control who can see their location, and with what precision it is exposed. There are many challenges related to privacy-protection in location systems, and providing this control over location precision is one of the tougher ones. It is difficult to design a system that doesn’t expose more information than intended.

Let’s look at a couple of examples where applications are using the location of a given user. These applications will query (or subscribe to) the user’s location service, which could be a network-hosted service that communicates with the user’s location-aware devices, or could be one of the actual devices.

Suppose Mary, a user in the United States, chooses an “only expose what state I’m in” policy. The simplest implementation of that policy would be to tell any asking application what state Mary’s in whenever it asks. Mary’s expectations of privacy are easily met as long as she primarily moves around within one state. But with that simple implementation, when Mary crosses a border into another state, applications learn much more than what state she’s in – they know which border she’s near. In some situations, that may be enough to deduce her location with under a mile’s worth of uncertainty. For instance, if she were to travel along US 160 from Arizona into Colorado, applications would see her location transition from Arizona to New Mexico, and then to Colorado. The interval between those transitions gives the application a good estimate of her speed, and knowing she’s traveling at highway speeds, the applications can be fairly confident she’s on 160 (there are no other roads that would allow a transition between those states with that timing).

One way for the location service to respect Mary’s privacy requirements in this case would be to obfuscate the transitions between the states in time, perhaps not exposing the brief transition through New Mexico at all.

Bob might choose a seemingly easier policy to implement – “show where I am, but only to within 100 meters”. A simple implementation of that policy would be to expose Bob’s location as a circle with a radius of 100 meters, covering Bob’s current location, but randomly centered somewhere around Bob’s actual location.

The location service would return that circle as long as Bob’s actual location is inside it.

When Bob leaves the circle, the location service generates a new covering circle.

Unfortunately, if the application knows this is how the location server implements Bob’s privacy requirement, it just learned Bob’s location much more precisely than to within 100 meters. The problem is the application knows that Bob just left the old circle, so he is somewhere close to the edge of it, and is within the new circle, so the application has a very good idea of where Bob actually is.
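
A small simulation makes the leak concrete. It is a sketch under stated assumptions rather than the behavior of any real location server: Bob walks due east in fixed 5-meter steps, the server picks each covering circle’s center uniformly at random within 100 meters of Bob, and the names `covering_circle`, `simulate`, and `overshoots` are invented for this illustration.

```python
import math
import random

RADIUS = 100.0  # meters, Bob's requested precision
STEP = 5.0      # meters Bob moves between location updates (assumed)


def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])


def covering_circle(pos, rng):
    """Pick a center uniformly at random within RADIUS of pos, so the circle covers pos."""
    r = RADIUS * math.sqrt(rng.random())
    theta = rng.uniform(0.0, 2.0 * math.pi)
    return (pos[0] + r * math.cos(theta), pos[1] + r * math.sin(theta))


def simulate(steps, rng):
    """Walk Bob east; record (old_center, new_center, true_position) at each circle change."""
    pos = (0.0, 0.0)
    center = covering_circle(pos, rng)
    changes = []
    for _ in range(steps):
        pos = (pos[0] + STEP, pos[1])
        if dist(pos, center) > RADIUS:  # Bob left the circle: the naive server re-draws
            new_center = covering_circle(pos, rng)
            changes.append((center, new_center, pos))
            center = new_center
    return changes


def overshoots(changes):
    """How far outside the OLD circle Bob was at the moment each new circle appeared."""
    return [dist(true_pos, old) - RADIUS for old, _new, true_pos in changes]


changes = simulate(steps=500, rng=random.Random(1))
gaps = overshoots(changes)
```

Every value in `gaps` lies between zero and one step length, and Bob is also inside the new circle. An application that sees the circle change can therefore place him in a thin sliver along the old circle’s edge, rather than anywhere in a 100-meter disc.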

So, this simple implementation is insufficient to respect Bob’s privacy requirement – a more intricate algorithm will be required. While different location servers do not need to use the same method, a well-known algorithm with good privacy preserving properties would be very valuable. Discussions of a standard algorithm to satisfy this kind of requirement are underway in the GEOPRIV working group.

SIP and “Secure” Communication: What does it mean?

June 1st, 2010 by Adam Roach under SIP

One of the recurring topics in the discussion of SIP security is how you give users the information they need to make informed decisions. In most of these conversations, a parallel is drawn between web browser security and SIP security – usually, in terms of “why can’t SIP terminals have a simple lock icon that tells the user the call is secure?” And all major web browsers do have a simple visual indicator, like these two from Internet Explorer and Firefox:

Unfortunately, the issue with SIP is significantly more difficult than that. With web browsers, you really need to ensure only two things: that the website you’re connecting to is the website you think you’re connecting to (authentication), and that no one other than you and the website can see the information you’re sending and receiving (confidentiality). For the web, this is easy to do because TLS (used by https) provides both of these properties.

With SIP, you have at least five different major problems to solve – and possibly more, depending on how you account for them: Caller-ID, Called Party Identity, Media Privacy, Media Authentication, and Signaling Confidentiality.

Caller ID and Called Party Identity

First, when a call arrives, the user is going to want to know who is calling, similar to Caller-ID on today’s PSTN. Jiri did a series of posts (1, 2, 3) detailing the need for identity in the SIP network. (While this is a good treatment of the need for identity, I think its conclusion – that we should use the same spam-prevention mechanisms as email – is a bit naïve; as Ben later points out, 94% of all email is spam, and I think we need to do better than that.)

While some techniques can be employed to “spoof” caller ID information on the PSTN, it’s difficult to do, so people generally can and do trust what their phone says when it rings. On the other hand, since SIP signaling flows all the way out to the edge of the network, this kind of identity is much easier to fake in a SIP network. Some deployment architectures have developed specialized “transitive trust” models that get you pretty close to what the PSTN provides today, but they don’t work across the general Internet, or when you transition from one architecture to another.

A more bulletproof means of conveying identity can be performed with RFC 4474, which uses cryptography to let a proxy on the call path make an assertion about the calling party’s identity. Unfortunately, RFC 4474 does suffer from some deployment difficulties, such as perceived deficiencies in key distribution, the difficulty in asserting ownership of phone numbers, and bad interactions with SBCs. And while there are good answers to each of those issues, they still have slowed down acceptance of RFC 4474 as a solution.

A related issue is validation that the person you’re trying to reach is the person you’ve actually reached. For example, if Alice is trying to reach Bob but really reaches Charlie, she needs to know this to make an informed decision. This is even more important when Alice is trying to reach, for example, her bank. There are fairly benign reasons that the called party might not be who the caller was trying to reach – a call-forwarding service, for example – but it also may indicate something more nefarious. To fill this niche, RFC 4916 defines a mechanism for conveying called party identity back to a calling party. It shares RFC 4474’s strengths (cryptographic assertions, leveraging the web’s public key infrastructure), but suffers from the same drawbacks as well.

One interesting twist to the behavior of RFCs 4474 and 4916 is that they only protect the caller and called parties’ addresses, not their names. To protect things like caller names, it becomes necessary to use a mechanism like cryptographic certificates with S/MIME.

Media Privacy and Authentication

Another user expectation of “secure calls” is a guarantee that third parties cannot intercept their call. This is especially important when users make calls on a shared network, such as a public WiFi network, a hotel network, or certain types of cable networks. Unless the media itself is encrypted, anyone on the same network can use any one of a variety of easy-to-use call interception tools, including some very sophisticated, free ones, and record any call or calls they want to.

The other issue with media is ensuring that the media you receive is coming from the person you think it is. The ability to insert new media into a call can be highly damaging for certain types of calls.

Unfortunately, this area has historically suffered from too many solutions, as opposed to not enough. Luckily, the IETF finally winnowed the solution space down to a single approach for SIP media encryption: RFC 5763. There is also a competing solution in ZRTP. This approach has some interesting properties that Jiri discussed in a previous posting – but it also suffers some non-technical drawbacks (see my response at the end of that article) that are likely to limit its deployment outside of the open-source and hobbyist communities. And, while ZRTP provides encryption, it requires an onerous manual step to ensure that you’re talking to the person you think you’re talking to (and, without this protection, your call can be listened to by a sophisticated attacker in the middle of the network).

Hopefully, with the recent publication of RFC 5763, we’ll start seeing more vendor support for media privacy and authentication.

Signaling Confidentiality

A final aspect of SIP security that needs to be addressed is confidentiality of the signaling information itself. For voice calls, access to the signaling allows you to figure out who called whom and when. And, while the privacy implications of exposing that kind of information are evident enough, things get much worse once you start mixing in features like instant messaging and presence: eavesdroppers on this information can learn highly sensitive information, such as the contents of instant message conversations.

Support of TLS to protect information as it passes between network entities (say, from a phone to its proxy) is required by the baseline SIP protocol, and has fairly good implementation (on average, approximately 50% of the implementations at the SIPit interop event have had TLS support over the past few years). That’s a really good way to ensure that arbitrary third parties can’t eavesdrop on the information being sent.

But TLS doesn’t protect information from being intercepted by servers on the call path.

And while I might be happy to get my SIP service from bobs-discount-voip.com, I may be a bit more reticent to trust them with things I send and receive via instant messages – things like my banking information. And that brings us back to the use of S/MIME certificates, which can be used to hide this kind of information from proxies on the path (while still providing them enough information to route messages correctly).

Summary

So, back to the original question: if you wanted to have a simple, visual indicator to indicate that a call is secure… what would it mean? Is it a promise that the phone number on the caller ID is correct? How about the name? Does it mean that the media is encrypted? And, if it is, can you be sure it’s coming from where you think it’s coming from? Is the signaling protected? And, if so, is it protected from everyone, or can proxies along the call path read it? There are so many degrees of freedom here that there’s no good way to render them all to the user in a sensible fashion. And an all-or-nothing indicator (like a single lock icon) is completely nonsensical – as you’ve seen, SIP security is just about as far from “all-or-nothing” as you can get.

At this point, sadly, it’s mostly a moot point anyway – just about all SIP service providers employ exactly none of these techniques. But as user expectations around identity and privacy start colliding with the reality of service providers’ carelessness, we’re going to run into a few challenges making sure that users can be given the information they need to make informed decisions.


What’s Wrong with RFC 4474 Anyway?

May 27th, 2009 by Ben Campbell under SIP

In my most recent post, I mentioned that RFC 4474 is promising for end-to-end identity, but that there have been some issues deploying it. Let’s explore some of these issues. 

First, some background. As I mentioned, SIP does not come, out of the box, with a clean way to tell the recipient of an offer who sent the offer in the first place. Then came the P-Asserted-Identity (P-AID) extension.

P-AID gave us a new header field for carrying the sender identity–sort of a Caller-ID for SIP. But P-AID offered no integrity protection of the identity. It was useful for carrying identity inside a private network, but was useless if you crossed a trust boundary. There was no way to detect if someone forged or tampered with the identity information before it got to the destination domain.

RFC 4474 gives us the Identity and Identity-Info header fields. An abstract “authentication service” is responsible for authenticating the sender, and checking that the From value contains an address-of-record (AoR) that the sender is allowed to use. The authentication service inserts a signed hash of the From value (among other things) into the Identity header field, thereby providing integrity protection of the sender’s identity. If anyone tampers with the From value en route, the recipient can find out by checking the signature.
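
The sign-then-verify idea can be sketched in a few lines. Note the substitution: RFC 4474 uses a certificate-based public-key signature, whereas this sketch uses an HMAC with a shared key from the Python standard library purely as a stand-in, so it demonstrates tamper detection only and not the real key model. The list of protected header fields, the function names, and the sample values are likewise assumptions made for illustration.

```python
import hashlib
import hmac

# Illustrative subset of header fields; not the exact list the RFC protects.
PROTECTED = ("From", "To", "Call-ID", "Date")

# Stand-in secret. The real mechanism signs with the private key that
# matches the authentication service's certificate.
KEY = b"demo-key-not-a-real-credential"


def canonical_string(headers):
    """Join the protected header values in a fixed order."""
    return "|".join(headers[name] for name in PROTECTED)


def sign(headers, key):
    """Authentication service: compute the value placed in the Identity header."""
    return hmac.new(key, canonical_string(headers).encode(), hashlib.sha256).hexdigest()


def verify(headers, identity, key):
    """Recipient: recompute over the received headers and compare."""
    return hmac.compare_digest(sign(headers, key), identity)


headers = {
    "From": "sip:alice@example.com",
    "To": "sip:bob@example.org",
    "Call-ID": "a84b4c76e66710",
    "Date": "Wed, 27 May 2009 13:02:03 GMT",
}
identity = sign(headers, KEY)

# Someone rewrites the From value after the authentication service signed it.
tampered = dict(headers, From="sip:mallory@example.com")
```

Here `verify(headers, identity, KEY)` succeeds, while `verify(tampered, identity, KEY)` fails because the recomputed value no longer matches the one carried in the message.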

Sounds great, right? So, what’s the problem?

The argument I hear the most is that RFC 4474 requires a Public Key Infrastructure (PKI). The argument goes that, for one reason or another, we’ve not had success deploying large scale PKIs in general, and it’s not feasible to issue a certificate to every user agent.

This argument neglects the fact that there is at least one very successful large-scale PKI deployment. That is, HTTPS. You probably use HTTPS every day to access your bank’s web page, make online purchases, etc. As it’s usually used, HTTPS doesn’t require your web browser to have a certificate–only the server has one. The various Certificate Authorities are quite happy to sell server certificates to be used with HTTPS.

The simplest way to deploy RFC 4474 is to make the same proxy/registrar that already authenticates user agents (typically using digest authentication) act as the authentication service. The only certificate needed is a server certificate for the proxy/registrar. Client certificates are completely unnecessary. This is exactly analogous to the certificate use in HTTPS.

In case you haven’t noticed, I don’t buy the PKI argument against RFC 4474. Not even a little bit.

In all fairness, there are other arguments against RFC 4474. Probably the biggest ones are the fact that it has problems working across session border controllers (SBCs), and it’s not clear what it means when the sender AoR is a PSTN number. But that’s enough for now–I will discuss the other issues in future posts.

FAQ: What are SIP-I and SIP-T?

May 19th, 2009 by Adam Roach under SIP

SIP-I and SIP-T refer to two very similar approaches for interworking ISUP networks with SIP networks. In particular, they provide the means for conveying ISUP-specific parameters through a SIP network so that calls that originate and terminate on the ISUP network can transit a SIP network with no loss of information.

SIP-T was developed by the IETF — the same body that developed the SIP protocol itself — around the same time the most recent version of SIP was being developed (mid-2002). It is defined by RFC 3372, RFC 3398, RFC 3578, and RFC 3204.

SIP-I was developed by the ITU in 2004, and made use of most of the constructs defined in the IETF SIP-T effort. It is defined by ITU-T Q.1912.5.

SIP-I and SIP-T both define the mapping of messages, parameters, and error codes between SIP and ISUP. Both of them are fully interoperable with compliant SIP network components on the SIP network.

The key differences between SIP-I and SIP-T are:

  1. SIP-I defines a mapping from SIP to BICC (in addition to ISUP), while SIP-T addresses only the ISUP case, and
  2. SIP-T is inherently designed for interoperation with native SIP terminals, while SIP-I is restricted for use between PSTN gateways only.

SIP-I and SIP-T also define somewhat different mappings of information between the protocols, mostly in terms of converting from SIP error codes to ISUP cause codes.

The way SIP-I and SIP-T allow transparent transit of ISUP parameters through a SIP network is by attaching a literal copy of the original ISUP message to the SIP message at the ingress PSTN gateway; this ISUP message appears as another body on the SIP message (typically, a peer to an SDP body).

The SIP network ignores the extra ISUP body, processing the SIP message as it normally would. After the SIP service network performs any necessary modifications to the SIP message, it arrives at the PSTN egress gateway. This egress gateway uses the attached ISUP message as the basis for the ISUP message it will send; however, it first makes modifications necessary to match changes made to the SIP message during its traversal of the SIP network.
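
As a rough illustration of the encapsulation, the sketch below builds a two-part message body by hand (SDP plus an ISUP part) and then pulls the ISUP part back out, as an egress gateway would before adjusting it. Everything specific here is an assumption for the example: the boundary string, the `version` parameter value, the function names, and the ISUP bytes, which are placeholder octets rather than a valid ISUP message. The simple splitter also assumes the boundary string never occurs inside a payload.

```python
BOUNDARY = b"sipt-example-boundary"  # hypothetical boundary string

SDP = (
    "v=0\r\n"
    "o=- 0 0 IN IP4 192.0.2.1\r\n"
    "s=-\r\n"
    "c=IN IP4 192.0.2.1\r\n"
    "t=0 0\r\n"
    "m=audio 49170 RTP/AVP 0\r\n"
)
ISUP = b"\x01\x10\x0d\x0a\x00"  # placeholder octets, NOT a real ISUP message


def build_body(sdp, isup):
    """Ingress gateway: attach the original ISUP message as a peer of the SDP body."""
    parts = [
        (b"Content-Type: application/sdp", sdp.encode("ascii")),
        (b"Content-Type: application/ISUP; version=example\r\n"
         b"Content-Disposition: signal; handling=optional", isup),
    ]
    out = b""
    for part_headers, payload in parts:
        out += b"--" + BOUNDARY + b"\r\n" + part_headers + b"\r\n\r\n" + payload + b"\r\n"
    return out + b"--" + BOUNDARY + b"--\r\n"


def extract_part(body, content_type):
    """Egress gateway: return the payload of the first part with the given type."""
    for chunk in body.split(b"--" + BOUNDARY)[1:]:
        if chunk.startswith(b"--"):  # closing delimiter reached
            break
        # Drop the CRLF after the delimiter, then split headers from payload.
        head, _, payload = chunk[2:].partition(b"\r\n\r\n")
        if head.lower().startswith(b"content-type: " + content_type.lower()):
            return payload[:-2]  # drop the CRLF that precedes the next delimiter
    raise LookupError(content_type.decode())


body = build_body(SDP, ISUP)
recovered_isup = extract_part(body, b"application/ISUP")
recovered_sdp = extract_part(body, b"application/sdp")
```

A SIP element that only understands SDP can read its part and skip the other, which is what lets the ISUP payload ride through the SIP network untouched.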

 

As mentioned before, with SIP-T, the messages may also terminate on the native SIP terminals in the network, which will ignore the extra ISUP body. Additionally, messages may originate from these SIP phones and terminate on the PSTN gateways, which will then generate a new ISUP message for the PSTN.

Putting this together in a call flow, a typical successful call setup from a PSTN terminal to another PSTN terminal through a SIP network can look something like this:

 

In Defense of Standards

April 7th, 2009 by Adam Roach under SIP

Most communication technologies follow a fairly predictable trend: first, forward-thinking researchers develop a prototype of the technology. Then, pilot deployments demonstrate the viability of the technology. After the utility of the technology is proven, the protocols for multiple interoperable implementations are standardized. If the technology is successful, the early pilot deployments begin to gateway to each other (and a broader community) using the standardized protocols. Eventually, the proprietary islands disintegrate, leaving behind a fully standardized solution.

And that’s when the exciting stuff starts to happen.

One of the clearest examples of this is email – the earliest days of email began with dial-in bulletin board systems and isolated corporate email systems. The IETF standardized the Simple Mail Transfer Protocol (SMTP) in 1982; and, gradually, the proprietary systems gatewayed to the standardized protocols. Eventually, the islands faded away – today, just about all email in the world is exchanged using SMTP. Long gone are the days when you had to worry about having an account on the same email service to be able to communicate with your buddies; when someone says they have “email,” you assume that you can communicate with them using whatever email system(s) you happen to use.

We’ve seen similar migrations in the World Wide Web (distributed hypertext systems trace back to Douglas Engelbart’s work from 1968), with standardization coming about a decade later than email. And we’re in the same curve (albeit much earlier) for Presence and Instant Messaging (think ICQ, AOL, and Yahoo followed by XMPP and SIMPLE), and realtime Voice and Video over IP.

What’s exciting is how quickly we’re getting to the stages of gatewaying the proprietary islands to standardized protocols: last week, Skype announced its plans to gateway its own (very large) voice-over-IP island to the rest of the world.

However, unlike the other technologies I mention above, I have noticed a worrying trend in deployments of SIP to add proprietary extensions to the protocol – most often, in ways that interfere with proper interoperation with actual standard clients. The examples are manifold.

A major operating system vendor has incorporated what they claim to be “SIP” into a number of products, but uses a format for presence that is completely proprietary, and cannot interoperate with standard clients. These products also employ a proprietary federation mechanism that is fundamentally incompatible with the SIP inter-domain routing model.

Another major vendor who sells “SIP” PBXes abuses the SIP events mechanism quite egregiously by sending message waiting indications to every registered phone, regardless of whether those indications are supported by the device (and, conversely, sells phones that never subscribe to receive message waiting indicators). The same vendor uses the same kind of unsolicited (and unauthenticated) notification to cause the phones to reboot – meaning one person with a 10-line Perl script can knock an entire enterprise’s voice network over whenever they want to. And I have personally had to modify clients to deal with the fact that this vendor’s products send certain unwanted (and largely useless) types of media packets, even when the SIP negotiation has indicated they should not be sent.

The story repeats itself many times in the SIP world – current deployments have multiple, non-interoperable ways of sending DTMF (TouchTones®) with SIP; bridged-line appearance; directed call pickup; and several other features.

Why, you may ask, is this any worse than the proprietary islands? Well, back when these vendors had their own proprietary systems, no one expected things to work together. You bought an entire system from a single vendor, and never expected things to work with other vendors’ equipment. But now, when vendors decide to implement non-standard protocols and call them “SIP,” they raise expectations that their equipment will work with systems from other vendors who also claim to implement “SIP.” And when things go off the rails, everyone starts getting the impression that “SIP doesn’t work.”

As someone who has spent well over a decade making sure that SIP does work, I find this very frustrating. Basically, you have vendors implementing proprietary protocols inspired, to varying degrees, by the SIP standard. And these implementations get labeled “SIP,” when they’re not. And we know they don’t work (or, even worse, sometimes work and sometimes don’t) with actual SIP devices.

My point is: standards are pretty much an all-or-nothing proposition. If you’re going to claim to follow a standard, you’ll do everyone a favor by following that standard. If you’re going to do something non-standard that is somehow based on SIP, then what you have is a proprietary protocol inspired by a standard. And that’s okay, as long as you don’t hold it out to be anything other than a proprietary protocol.

Standardized Tools, Infinite Services

January 7th, 2009 by Adam Roach under SIP

In last week’s post, Ben touched on some of the aspects of SIP network deployment that stifle innovation. I’d like to follow up by taking you behind the scenes and discussing a bit about what we do at the Internet Engineering Task Force (IETF) to help facilitate innovation.

Probably the most important work the IETF does to foster innovation lies not in what we choose to standardize, but in what we choose to leave unstandardized. With rare exception, the IETF does not standardize how to implement services in SIP. One of the key philosophies that has been central to the development of the SIP protocol is that the IETF defines a rich and powerful set of protocol tools, and allows application developers to implement services using these tools.

To illustrate the difference between tools and services, we’ll consider use cases that involve the interaction between three parties in a single call experience.

SIP includes a number of tools that allow for fairly powerful manipulation of call legs (including the REFER method, and the Join: and Replaces: header fields). Depending on the desired use cases, user experience, and security properties, endpoints can use these tools in a wide variety of different ways to establish calls involving three parties.

Here is a very basic call-flow that has properties that are effectively identical to the PSTN named service known as “three-way calling”:

However, there are other message flows that would produce the same user experience, albeit with some different properties. For example, using a network-based server to provide media mixing can reduce the amount of voice traffic that is sent and received over Bob’s network connection, and reduce the CPU load on Bob’s phone. These properties may be desirable, for example, if Bob is on a wireless mobile device. Such a call flow may look something like this:

There are a large number of other call flows that can be used to achieve the exact same user experience, but with different technical properties.

What is key, however, is that Bob’s phone (as the initiator of the three-party call) can make the decision about which tools to employ, and how to employ them, unilaterally. This doesn’t require any support from Alice’s device or from Charlie’s device, other than use of the standardized call-leg manipulation tools. From the point of view of Alice and Charlie, the user experience is identical between the two call flows I discuss above. And that is exactly what the IETF hoped to achieve with its philosophy of standardizing protocol tools instead of named services. Only Bob’s device knows whether it is better off performing local mixing or farming mixing off to a network-based media server.

All of that said, the key motivator for defining tools instead of services in the SIP protocol is this: tools can be combined in unique and innovative ways to invent novel services that are limited only by the creativity of the companies producing products in this space. If the IETF were to standardize named services, then the abilities of SIP would be constrained to those named services and nothing else. It would be the end of innovation.

 

Postel’s Maxim – Improving Interoperability

December 11th, 2008 by Robert Sparks under SIP

Standards like SIP define a set of rules for how elements behave, focusing on
how messages are constructed and sent, and how other elements react when they
are received. The specifications focus on describing behavior when the received
messages adhere to the protocol. While they can provide general guidance on
dealing with erroneous input, there are a large number of ways (frequently
unbounded) to say or do something wrong at any given protocol step. It would be
counterproductive, if not impossible, to call each of them out explicitly in
the specification.  

So, implementers have to choose what to do when the peer they are talking to
does something out of spec.  One approach would be to strictly reject any
inputs that contain such errors. While sometimes tempting, this approach
ultimately leads to systems that are easy to break and expensive to maintain.

Jon Postel captured a better approach in RFC 793 (the definition of TCP at
the time): “TCP implementations will follow a general principle of robustness:
be conservative in what you do, be liberal in what you accept from others.”
Adhering to this simple approach to achieving robustness differentiates systems
that thrive in real deployments from those that are quickly removed for causing
operational pain.

Let’s look at an example. Twice a year, we bring a large number of SIP
implementations together for a week-long test event known as the SIPit.  We’ve
done 23 of these so far, and we’re still seeing new implementations at each
event. In the early days, new implementations tended to be overly strict in
enforcing grammar rules. One of the classic problems involved SIP’s Date header
field. One of these in a SIP message might look like this:

    Date: Sat, 13 Nov 2010 23:29:00 GMT

The specification defines the syntax of the header such that the timezone field
on that line can only contain “GMT”. Use of any other timezone string, even
“UTC”, is invalid according to the grammar rules. Some of these early
implementations would reject a SIP request (or even pretend they never saw it)
if the sender used UTC or some other timezone, even if they did NOTHING with
the value in the Date header field later in their processing. Calls failed
because of bits the recipient wasn’t going to use anyway. These implementers
quickly changed their code so that calls worked, but if they had taken Postel’s
maxim into account in the first place, they would have succeeded on their first
try.
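
To make the Date example concrete, here is a minimal Python sketch of the two
halves of the maxim. It is not taken from any real SIP stack: the function
names, the header dictionary, and the set of tolerated timezone labels are all
assumptions made for illustration. The strict check mirrors the behavior that
caused calls to fail; the lenient parser returns a usable time when it can and
quietly gives up when it cannot; the formatter always emits “GMT”.

```python
import re
from datetime import datetime, timezone
from typing import Optional

# Hypothetical strict check: only the literal "GMT" is allowed, as in the grammar.
STRICT_DATE = re.compile(
    r"^[A-Z][a-z]{2}, \d{2} [A-Z][a-z]{2} \d{4} \d{2}:\d{2}:\d{2} GMT$"
)

# Assumption: labels we choose to treat as equivalent to GMT when receiving.
TOLERATED_ZONES = {"GMT", "UTC", "UT", "Z"}


def strict_accepts(headers: dict) -> bool:
    """Brittle approach: reject the whole request over a malformed Date."""
    date = headers.get("Date")
    return date is None or STRICT_DATE.match(date) is not None


def lenient_date(headers: dict) -> Optional[datetime]:
    """Liberal in what we accept: return a UTC time if we can, else None.

    A None result means "no usable Date", not "reject the request".
    """
    date = headers.get("Date")
    if date is None:
        return None
    parts = date.strip().rsplit(" ", 1)
    if len(parts) != 2 or parts[1].upper() not in TOLERATED_ZONES:
        return None
    try:
        # Assumes the default C/English locale for day and month names.
        parsed = datetime.strptime(parts[0], "%a, %d %b %Y %H:%M:%S")
    except ValueError:
        return None
    return parsed.replace(tzinfo=timezone.utc)


def format_sip_date(moment: datetime) -> str:
    """Conservative in what we send: always emit the literal "GMT".

    Expects a timezone-aware datetime.
    """
    return moment.astimezone(timezone.utc).strftime("%a, %d %b %Y %H:%M:%S GMT")
```

With this sketch, a request carrying “UTC” fails `strict_accepts` but still
yields a usable time from `lenient_date`, and a recipient that never looks at
the Date header has no reason to call either function at all.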

I chose that example because it’s easy to see how Postel’s maxim applies without
diving too deep into the details of SIP. At recent SIPits, few implementations
have fallen into that well-known trap. But many similar pitfalls exist (from
simple syntactic things like handling To-tags in registration and dealing with
spurious quotes in Digest authentication challenges, to semantic violations of
the extensibility mechanisms or the offer/answer exchange).  The
implementations that follow the approach of strictly adhering to the
specification in the messages they send, but ignoring others’ mistakes when
they don’t matter, interoperate with many more implementations than the ones
that take a brittle “protocol-police” path.  Being robust, they’re the ones
that work (and get to remain) in real deployments.

In future posts we’ll explore a bit more about how the approach of being
conservative in what you do, but liberal in what you accept from others can be
applied to applications in addition to protocols.

Welcome again to our new blog. We hope you’ve enjoyed this post and we look
forward to visiting with you going forward.
