Archive for March, 2010

LTE Handsets: Status Check

March 30th, 2010 by admin under SIP

In my last post I discussed the return of IMS as part of the roll-out of voice over LTE (VoLTE).  Of course an essential component of VoLTE will be handsets that support LTE and have a SIP user agent (UA).  Conventional wisdom is that LTE deployments will initially be data only and handsets will follow some time later.  Over the last several weeks I’ve been trying to understand just when LTE handsets will be available. I just returned from the CTIA wireless show here in the US and one of my objectives was to learn a bit more about the roadmap for LTE handsets with voice support. 

For the most part the people I talked to were unwilling to reveal too much detail about their specific plans, but they did generally agree that LTE deployments will start with data cards and that handsets will follow.  Estimates were that handsets would be available as early as 2011, and Verizon Wireless recently announced they would have handsets in 2011.  On the other hand, a colleague who attended some of the keynote speeches indicated that a couple of the carriers seemed to imply that VoLTE in the 2012 timeframe was a reasonable assumption.

Surprisingly, I also found that Samsung announced, in a press release, that it would have an LTE handset in 2010.

The phone, a Samsung SCH-r900, will be an LTE-enabled multi-mode handset that also supports CDMA. It will be available in the second half of 2010 for MetroPCS’ LTE deployment in Las Vegas. While few details were made available, it is conceivable that this phone will support voice over LTE – but it is also possible that it will use LTE for data only.  The announcement wasn’t definitive on this point.

There are of course many issues and challenges to solve with respect to VoLTE handsets – especially in CDMA networks.  The complexities include radio issues, the implementation of SIP UAs, and SIP compression (SigComp).  But just as important are the issues associated with voice call handover – in particular handover between different radio access technologies, e.g. CDMA and LTE.  This creates challenges in both the network and the handset.

Another interesting challenge will be the design of the SIP UA in the handset.  Our experiences in early IMS trials showed that many handsets had a separate SIP stack for every application. This could mean one SIP stack for voice calls and another for messaging applications or RCS (Rich Communication Suite). It would also mean a separate registration with the network for each application.  This is probably not the ideal situation from the carrier perspective – but presumably things have evolved in the 5 or so years since the early IMS trials.

In conclusion, it would appear as though momentum is continuing to grow for LTE and the availability of handsets is exceeding my initial expectations – but it is still early.  LTE deployment will take years (if not a couple of decades) to reach a large percentage of subscribers. We should remember that at the end of 2009, 3G subscribers were estimated to comprise only 17% of the total worldwide mobile subscriber base, according to Portio Research. However, the growth of mobile data is pushing many carriers in developed economies to take a close look at deploying LTE sooner as opposed to later.

Innovation in Speech Encoding Gets Huge Momentum

March 28th, 2010 by Jiri Kuthan under SIP

One of the most medieval-looking aspects of VoIP today is speech encoding. As a matter of fact, the vast majority of phone calls are made using the G.711 codec, a standard published in the early seventies. That’s even before the very first scientific publications on the subject, voice over packet networks, began to appear.

Over the past decade, innovation began to make headway. The Internet Low Bit Rate Codec (iLBC) is a great codec that delivers good-quality voice even over fairly poor links. It was published as RFC 3951 in 2004. Another popular codec ready for the Internet age is Speex (http://www.speex.org/), with an impressive compression rate. New codecs introduce various improvements: better quality over ‘lossy’ networks, high compression, low latency, or superb-quality audio. A frequently mentioned humorous application of the latter is ‘voice for dogs’, which can capture a broader audio spectrum than humans can hear.

One may observe, though, that despite how good the recent codecs are, their adoption is still lagging. I believe this has begun to change for good. In the IETF several codecs are being standardized. This week the IETF meets in California, and its busy agenda includes several codecs, among them Broadcom’s BroadVoice (16 or 32 kbps, royalty-free and open-source), CELT (http://www.celt-codec.org/, low-latency), and Skype’s SILK (royalty-free, open-source).

I believe the key aspect of the work on new codecs is the tendency to release them under patent-free and/or royalty-free conditions. That really creates space for innovation, and for massive adoption by both small and big players. I can’t wait to see a man listening to his dog’s voice.

New MSRP Standards Work: The Alternate Connection Model

March 17th, 2010 by Ben Campbell under SIP

Even though the SIMPLE working group published the MSRP and MSRP Relay specifications some time ago, it’s not quite time to say we’re finished and go out for a beer. Like any new protocol, it has some rough edges that need to be polished out. SIMPLE is currently working on a bit of polish known as the MSRP Alternate Connection Model, or MSRP-ACM for short.

MSRP runs over reliable, connection-oriented protocols such as TCP. One important aspect of these protocols is that when two devices want to talk, one of them must act as a client and the other a server. That’s not as big a deal as it sounds. It merely means that the client opens the transport connection towards the server, and the server listens for and hopefully accepts the client’s connection. For truly client-server protocols such as HTTP, this makes perfect sense–your web browser opens a TCP connection to the web server. It rarely makes sense for the server to open the connection towards the browser.

Since so many application protocols work this way, the client-server assumption has become intrinsic to the way people build access networks. Chances are, there’s a firewall between your web browser and the server for this blog. There’s likely even a Network Address Translator (NAT). Both of these devices are commonly configured with policies that let clients open outbound connections, but severely restrict who can open or receive inbound connections.

But the client-server assumption falls down for many real-time communication applications. These applications are peer-to-peer at their cores, i.e. they are designed to allow any device to connect to any other device. It’s hard to deploy peer-to-peer applications on networks that were built for client-server applications. That mismatch has provided grist for many of the posts on this blog. I’m sure it has caused headaches for many of our readers.

As I mentioned in my last post, MSRP assumes that the peer that sent the SDP offer always acts as the TCP client. That is, it opens the transport connection towards the peer that sent the answer. For this reason, we often refer to the offerer as the “active party.”

The active party also must immediately send an MSRP SEND request to its peer. This is because when the listening device (aka the “passive party”) receives a connection, it doesn’t know for sure who’s really trying to connect. When it receives the SEND request, it can compare an MSRP URI in the request to the one it sent in the SDP answer. You can think of that URI as a party invitation the active party must present to get in the door.
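The check made by the passive party can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the whole SEND request is already in memory, no relays are involved (so To-Path holds a single URI), and the URIs are compared as plain strings, whereas the MSRP specification defines more careful comparison rules. The function names and the example URIs are hypothetical.

```python
def parse_msrp_headers(request: str) -> dict:
    """Collect the headers of an MSRP request into a dict (hypothetical helper)."""
    headers = {}
    for line in request.split("\r\n")[1:]:   # skip the "MSRP <id> SEND" start line
        if not line or line.startswith("-------"):
            break                            # blank line or end-line: headers are over
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return headers


def is_expected_peer(request: str, uri_from_sdp_answer: str) -> bool:
    """Return True if the first request is a SEND addressed to the URI we advertised."""
    start_line = request.split("\r\n", 1)[0].split()
    if len(start_line) != 3 or start_line[0] != "MSRP" or start_line[2] != "SEND":
        return False
    to_path = parse_msrp_headers(request).get("To-Path", "")
    # Simplification: exact string match instead of the full URI comparison rules.
    return to_path == uri_from_sdp_answer


# Example with made-up URIs: Bob advertised BOB_URI in his SDP answer.
BOB_URI = "msrp://bob.example.com:12763/kjhd37s2s20w2a;tcp"
EXAMPLE_SEND = (
    "MSRP a786hjs2 SEND\r\n"
    "To-Path: " + BOB_URI + "\r\n"
    "From-Path: msrp://alice.example.com:7654/jshA7weztas;tcp\r\n"
    "Message-ID: 87652491\r\n"
    "Byte-Range: 1-0/0\r\n"
    "-------a786hjs2$\r\n"
)

if __name__ == "__main__":
    print(is_expected_peer(EXAMPLE_SEND, BOB_URI))
```

If the URI does not match, the passive party has no reason to trust the connection and can simply close it.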

But making the offerer into the active party does not work in all possible scenarios. Let’s go back to the usual suspects, Alice and Bob. Alice sends the SDP offer to Bob, making Alice into the active party and Bob into the passive party. But Bob is behind a NAT that doesn’t allow inbound connections. They need an MSRP relay, or some other kind of media relay, to talk at all.

But what if Alice was not behind such a NAT? They would have been able to talk just fine if Bob sent the offer to Alice. It seems a shame to require the overhead of a relay, if Alice and Bob could have solved the problem by reversing roles.

COMEDIA describes a set of SDP extensions for negotiating which peer becomes the active party for connection oriented media. The MSRP-ACM draft describes how to apply COMEDIA to MSRP sessions, instead of just using the default assumption that the offerer is always the active party.

So if Alice and Bob had both supported the alternate connection model, Alice would have declared in the SDP offer that she could act as either the active or the passive party, since she was not behind a NAT.  Bob would respond in his SDP answer that he could only be the active party. Bob would then take over the role of active party, even though he was not the offerer. He would then open the TCP connection to Alice, and send the initial SEND request.
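The role negotiation can be sketched as a small decision function. The attribute values (active, passive, actpass, holdconn) are those of the COMEDIA "a=setup" SDP attribute; the function name and return values are my own, and the sketch leaves out everything else an MSRP endpoint would need to do.

```python
from typing import Optional

# Answers that COMEDIA allows for each value of a=setup in the offer.
ALLOWED_ANSWERS = {
    "active": {"passive", "holdconn"},
    "passive": {"active", "holdconn"},
    "actpass": {"active", "passive", "holdconn"},
    "holdconn": {"holdconn"},
}


def tcp_client(offer_setup: str, answer_setup: str) -> Optional[str]:
    """Return which side opens the TCP connection: 'offerer', 'answerer' or None.

    None means that no connection is set up for now (holdconn).
    """
    if offer_setup not in ALLOWED_ANSWERS:
        raise ValueError("unknown setup value in offer: " + offer_setup)
    if answer_setup not in ALLOWED_ANSWERS[offer_setup]:
        raise ValueError("answer %r is not valid for offer %r" % (answer_setup, offer_setup))
    if answer_setup == "holdconn":
        return None
    # The answer states the answerer's own role.
    return "answerer" if answer_setup == "active" else "offerer"


if __name__ == "__main__":
    # Alice offers actpass, Bob (behind a NAT) answers active: Bob opens the connection.
    print(tcp_client("actpass", "active"))
```

In the Alice and Bob example, the offer carries actpass and the answer carries active, so the function returns "answerer": Bob opens the connection and sends the first SEND request.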

This situation may not occur very often for sessions between end users. It’s far more likely that both parties have NATs or firewalls getting in the way, and you still need at least one party to use an MSRP relay or other NAT traversal technology. But MSRP-ACM can be extremely useful if one party is an application server, such as a conference bridge, offline message server, etc. It’s much more common for such server-class devices to be able to accept inbound TCP connections. But if they use MSRP without MSRP-ACM, then they would need a relay in order to initiate a session to an end-user. With the alternate connection model, that’s no longer necessary.

The SIMPLE working group is almost finished with the MSRP-ACM draft. The draft has completed working group last call (WGLC). The draft authors are working to resolve some WGLC comments, after which the group will submit the draft to the Internet Engineering Steering Group (IESG) for final evaluation before it becomes an RFC.

 

SIP and Security: New Developments

March 9th, 2010 by Adam Roach under SIP






One of the abilities that SIP has had since its very earliest days is the ability to use X.509 certificates to sign and encrypt messages being sent across the network. If you haven’t heard of X.509 before, don’t worry – you’re hardly alone.

X.509 itself is based on a really elegant class of algorithms called “Public Key Cryptography.” At a high level, here’s how public key cryptography works: there are mathematical functions you can apply to a large random number to generate two linked cryptographic keys. These keys have the very interesting property that something that has been encrypted by one key can be decrypted by the other, and vice versa.  In Public Key Encryption, you generally designate one of these keys “public,” and make it available to anyone who wants it. You designate the other key “private,” and keep it secret.

Once you’ve done this, there are a couple of very interesting things you can do with these keys: first, other people can use your public key to encrypt messages they want to send to you. But the only way to decrypt them is using your private key, which guarantees that no one else can read the messages (not even the person who encrypted them!). This is shown in Figure 1.

Figure 1: Public Key Encryption and Decryption

The other thing you can do is sign messages with your private key (proving that you generated the message), and anyone with your public key can verify that the message hasn’t changed since you created it. This is shown in Figure 2.

Figure 2: Public Key Signing and Signature Verification
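To make the two operations concrete, here is a toy sketch using textbook RSA in Python. It is purely illustrative and not secure: the primes are tiny, well-known values chosen for readability, there is no padding, and the "signature" is just the private-key operation applied to a truncated SHA-256 hash. Real X.509 deployments use vetted cryptographic libraries. All names below are my own.

```python
import hashlib

# Toy key generation: two well-known Mersenne primes (far too small for real use).
P = 2**31 - 1
Q = 2**61 - 1
N = P * Q
E = 65537                                  # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))          # private exponent (needs Python 3.8+)

PUBLIC_KEY = (E, N)
PRIVATE_KEY = (D, N)


def apply_key(value: int, key: tuple) -> int:
    """The core operation: whatever one key does, the other key undoes."""
    exponent, modulus = key
    return pow(value, exponent, modulus)


def encrypt(message: bytes, public_key: tuple) -> int:
    number = int.from_bytes(message, "big")
    if number >= public_key[1]:
        raise ValueError("message too long for this toy key")
    return apply_key(number, public_key)


def decrypt(ciphertext: int, private_key: tuple) -> bytes:
    number = apply_key(ciphertext, private_key)
    return number.to_bytes((number.bit_length() + 7) // 8, "big")


def digest(message: bytes) -> int:
    # Hash reduced modulo N so it fits the toy key; real schemes use proper padding.
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes, private_key: tuple) -> int:
    return apply_key(digest(message), private_key)


def verify(message: bytes, signature: int, public_key: tuple) -> bool:
    return apply_key(signature, public_key) == digest(message)


if __name__ == "__main__":
    secret = encrypt(b"hello Bob", PUBLIC_KEY)        # anyone can do this
    print(decrypt(secret, PRIVATE_KEY))               # only the key owner can do this
    signature = sign(b"signed note", PRIVATE_KEY)     # only the key owner can do this
    print(verify(b"signed note", signature, PUBLIC_KEY))   # anyone can check it
```

Encryption uses the public key and decryption the private key, while signing uses the private key and verification the public key, which is exactly the symmetry shown in the two figures.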

 

So, that’s all very nice from a security perspective – so why haven’t we seen more of this? Well, it turns out that the hard part of this isn’t the cryptography (in fact, every modern email client has the ability to do this); the hard part is getting your public keys to everyone. There have been a number of designs for public key infrastructures (PKIs) that are supposed to address this problem, but they haven’t been deployed for a number of reasons.

There’s one notable exception, though: web sites. Because e-commerce depends so heavily on the ability of customers to verify that a web site is who it claims to be (through the signing operation), and to send information like credit card numbers in an encrypted form (using the encryption operation), web sites pretty much have this all figured out.

That gives us just enough of a PKI to pull SIP up by the web’s bootstraps, and that’s exactly what RFC 4474 does. At a high level here’s how that works. Let’s say Alice wants to call Bob, and Bob wants to make sure it’s actually Alice calling before he answers the phone. So, Alice sends a SIP INVITE message to her proxy. The proxy makes sure that the caller is actually Alice, usually by asking Alice’s SIP device to prove it has a password of some kind. Alice’s proxy then adds a signature to the SIP INVITE, which says that the proxy has verified that the caller is, in fact, Alice. It also includes the address of its web server in this INVITE message.

So, when Bob gets the INVITE, he can use this web server address to ask Alice’s Proxy’s web server for the public key for Alice’s proxy. He knows he can trust the web server, because we already have a working PKI for web servers. So, Bob can then use the public key he got back from the trusted web server to verify that the proxy is who they claim to be, and that they’re in a position to vouch for Alice’s identity. The overall information flow for this set of operations is demonstrated by Figure 3.

Figure 3: RFC 4474 Identity Verification
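For a flavor of what the proxy actually signs: as I read RFC 4474, the signature is computed over a string built from a handful of fields of the request, joined with "|" characters. The sketch below only assembles that string; the signing itself, and the headers that carry the signature and the certificate URL, are left out. The function name and the example values are illustrative, and the exact field rules should be checked against the RFC.

```python
def identity_digest_string(from_uri: str, to_uri: str, call_id: str,
                           cseq: str, date: str, contact_uri: str,
                           body: str) -> str:
    """Join the signed fields in order, separated by '|' (sketch of RFC 4474's digest-string)."""
    return "|".join([from_uri, to_uri, call_id, cseq, date, contact_uri, body])


# Illustrative values for a call from Alice to Bob.
EXAMPLE = identity_digest_string(
    from_uri="sip:alice@atlanta.example.com",
    to_uri="sip:bob@biloxi.example.org",
    call_id="a84b4c76e66710",
    cseq="314159 INVITE",
    date="Thu, 21 Feb 2002 13:02:03 GMT",
    contact_uri="sip:alice@pc33.atlanta.example.com",
    body="v=0\r\n",
)

if __name__ == "__main__":
    print(EXAMPLE)
```

Because the caller's address, the Call-ID, the date and the message body are all covered, Bob can detect an attacker who copies a valid signature onto a different request.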

 

And that, by itself, gets us a lot of the way to where we need to be. Bob now has cryptographic proof that the person calling him is, in fact, Alice.

But this still doesn’t get us all the way to a fully-functioning X.509 service. For example, it doesn’t let Alice sign messages herself, and it doesn’t let her encrypt the actual SIP messages she sends to Bob. Sure, she can send them over TLS, but that only makes things encrypted between proxies – each proxy is still decrypting and re-encrypting the message at every hop. And those proxies can read any part of the message they want.

We’re just finishing up work in the IETF that’s about to change that. The document is called “Certificate Management Service for the Session Initiation Protocol” (or “SIP Certs” for short), and it uses RFC 4474 to do something even more clever.

What’s really neat about RFC 4474 is that it can be used for any kind of SIP request, not just INVITE requests. The SIP Certs work leverages this even further by counting on the ability of RFC 4474 to certify the identity of the entity sending NOTIFY requests. Here’s how we take advantage of that.

Let’s imagine that Alice wants to encrypt something to Bob using his own X.509 certificate. With the SIP Certs framework, Bob will use the SIP PUBLISH method to send his public key to what is called a Certificate Server. The certificate server makes sure the person sending the certificate is actually Bob, and then stores the public key so that anyone who comes along asking for it can get a copy. Later on, when Alice wants to encrypt something that only Bob can read, she asks Bob’s certificate server for Bob’s public key, and uses this to encrypt her message. This flow is shown in Figure 4.

Figure 4: SIP Certificate-Based Encryption and Decryption

Now, the step of asking Bob’s certificate server for Bob’s public key is actually a bit tricky, because Alice wants to make sure that Bob’s certificate server is actually Bob’s certificate server. And that’s where RFC 4474 comes back into the picture: just like the INVITE in the example above, the NOTIFY that contains Bob’s public key contains a signature from Bob’s certificate server, and a pointer to a web site that Alice can go to in order to get a certificate to verify that signature. Which is all a very clever way to get back to leveraging the web Public Key Infrastructure to get a usable public key all the way out to Alice, in a way that lets her trust that the key is actually Bob’s public key.

And once Alice knows that she has Bob’s actual public key, she can encrypt a message for Bob and send it to him.

What’s really cool about this mechanism is that it can be used to sign things as well, as shown by the information flow in Figure 5. Alice simply publishes her own public key to her Certificate server, and then uses her private key to sign a message. Bob can verify Alice’s signature by grabbing her public key from her Certificate server, and using it to validate the message he received.

Figure 5: SIP Certificate-Based Signing and Signature Verification

 


SIP and Transport Protocols: Do’s and Don’ts

March 2nd, 2010 by admin under SIP

Session Initiation Protocol (SIP) messages can be transported over several different protocols, including UDP, TCP, and SCTP, each of which has advantages and disadvantages. What follows is an overview of them.

SIP over User Datagram Protocol (UDP)

UDP is the simplest way of transmitting chunks of data from one host to another in an IP network. Provided that the amount of data to be sent at once is not too big, UDP will do its best to accomplish the task. Pretty fast too. If the programmers did their job correctly, multi-process or multi-threaded applications do not require extra synchronization delay since the read/write operations are atomic when it comes to UDP sockets.

What you get is maximum throughput. However, this comes at a cost – it may trigger congestion.  Congestion means basically that the infrastructure cannot support the amount of traffic that is sent/received through it. Congestion can show up in different parts and layers of the infrastructure:

  • it can happen on the way to the remote host – network congestion; in this case the network cannot route and transport at the expected rate.
  • it can happen on the remote host itself – application congestion; the end host (a SIP proxy for example) cannot process the packets as fast as they are received.

Since UDP and SIP do not provide any explicit mechanism for overcoming congestion, it has to be taken care of in the signaling application.

Here is an example of application congestion that can take place during a failover in an active/standby configuration of SIP proxies. The active proxy fails and the standby one takes over after several seconds; due to the SIP retransmissions, the newly active proxy will experience traffic spikes which persist for some seconds after the service functions again. If the traffic rate both before and after the failover is close to the engineered calls-per-second (the limit supported by the proxy), these spikes may lead to application congestion on the proxy which has just become active.

Another drawback of UDP is that it does not provide either acknowledgment of received datagrams or retransmission mechanisms; SIP takes care of this at application level by using a simple retransmission algorithm.
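As an illustration of that retransmission algorithm, the sketch below computes when a client would resend an unanswered INVITE over UDP, assuming the RFC 3261 defaults: the first retransmission after T1 (500 ms), the interval doubling each time, and the transaction giving up after 64*T1. The function name is my own, and non-INVITE requests (whose interval is capped at T2) are not covered.

```python
def invite_retransmit_times(t1: float = 0.5) -> list:
    """Seconds after the first send at which an unanswered INVITE is retransmitted."""
    timeout = 64 * t1            # the transaction is abandoned at this point
    times = []
    interval = t1
    elapsed = interval
    while elapsed < timeout:
        times.append(elapsed)
        interval *= 2            # back off: double the wait before the next retransmission
        elapsed += interval
    return times


if __name__ == "__main__":
    print(invite_retransmit_times())
```

With the default T1 this gives retransmissions at 0.5, 1.5, 3.5, 7.5, 15.5 and 31.5 seconds. It also shows why a standby proxy that takes over a few seconds after a failure is hit by a burst: the retransmissions of every pending request land on it in addition to the new traffic.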

SIP over Transmission Control Protocol (TCP)

TCP offers a lot more than UDP when it comes to congestion, retransmission and error control. However, TCP is a stream-oriented protocol which was designed for reliably transferring data from host A to host B. Signaling with real-time constraints was not one of the design requirements for TCP.

Conceptually, a TCP-based application sees received or sent data as a continuous flow, which suits applications that copy files from remote hosts. For protocols like SIP, which send delimited messages over the same TCP connection, things get more complicated. The reads and writes on the TCP socket have to be serialized, and reading a SIP message from the stream is more complicated than in the case of UDP, since the message may arrive in several TCP segments whose payloads are not delivered all at once to the user-space socket.
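A minimal sketch of that extra work: the reader has to buffer whatever the socket delivers, look for the blank line that ends the headers, and then use Content-Length to know where the body stops. The class and helper names are my own, and the sketch ignores details a real stack must handle, such as folded header lines, keep-alive CRLFs between messages, and malformed input.

```python
def _content_length(head: str) -> int:
    """Return the Content-Length from a SIP header block, or 0 if absent."""
    for line in head.split("\r\n")[1:]:          # skip the request/status line
        name, _, value = line.partition(":")
        if name.strip().lower() in ("content-length", "l"):   # "l" is the compact form
            return int(value.strip())
    return 0


class SipStreamFramer:
    """Turn an arbitrary sequence of TCP reads into whole SIP messages."""

    def __init__(self) -> None:
        self._buffer = b""

    def feed(self, data: bytes) -> list:
        """Add bytes read from the socket; return the messages completed so far."""
        self._buffer += data
        messages = []
        while True:
            head_end = self._buffer.find(b"\r\n\r\n")
            if head_end == -1:
                break                            # headers not complete yet
            head = self._buffer[:head_end].decode("utf-8", errors="replace")
            total = head_end + 4 + _content_length(head)
            if len(self._buffer) < total:
                break                            # body not complete yet
            messages.append(self._buffer[:total])
            self._buffer = self._buffer[total:]
        return messages


if __name__ == "__main__":
    msg = b"MESSAGE sip:bob@example.com SIP/2.0\r\nContent-Length: 5\r\n\r\nhello"
    framer = SipStreamFramer()
    print(framer.feed(msg[:20]))                 # partial message: nothing yet
    print(framer.feed(msg[20:]))                 # the rest arrives: one full message
```

Over UDP none of this is needed, because each datagram read returns exactly one message.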

Standard TCP implementations do not allow configuration of internal timers. Timing is important for SIP based applications though. For example, in the telecom world you need to be able to tell pretty fast whether your peer is still there or not.

TCP subsystems on modern operating systems offer some support for that: keep-alives. These are basically empty TCP segments with the ACK flag set, sent periodically to monitored peers during idle intervals. If the peer is still running it answers with an ACK; otherwise there is no answer and the local application knows that either the remote peer has crashed or there are problems at lower layers. Again, modern operating systems like Linux make it possible to configure the timeouts for the keep-alive mechanism:

  • for how long a connection should be idle before sending keep-alives; one drawback here is that the minimum value for this parameter is 1 second
  • how often the keep-alives should be sent; again the minimum value is 1 second
  • how many keep-alives are sent before the peer is declared dead
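On Linux these three parameters can be set per socket. The sketch below uses Python's standard socket module; the option names TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT are the Linux ones and are looked up defensively, since other platforms may not provide them. The function names and default values are my own choices for illustration.

```python
import socket


def enable_keepalive(sock: socket.socket, idle_s: int = 1,
                     interval_s: int = 1, probes: int = 3) -> None:
    """Turn on TCP keep-alives and tune them where the platform allows it."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    settings = (
        ("TCP_KEEPIDLE", idle_s),        # idle time before the first keep-alive
        ("TCP_KEEPINTVL", interval_s),   # time between keep-alives
        ("TCP_KEEPCNT", probes),         # unanswered keep-alives before giving up
    )
    for name, value in settings:
        option = getattr(socket, name, None)   # not defined on every platform
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)


def detection_time(idle_s: int, interval_s: int, probes: int) -> int:
    """Rough time, in seconds, before a silent peer is declared dead."""
    return idle_s + interval_s * probes


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_keepalive(s)
    print(detection_time(1, 1, 3))
    s.close()
```

Even with every parameter at its 1-second minimum and three probes, a dead peer is noticed only after roughly four seconds, which illustrates the drawback for signaling applications that need to react quickly.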

SIP over Stream Control Transmission Protocol (SCTP)

SCTP can be considered the Swiss army knife of transport protocols. It basically offers combined features of both UDP and TCP. UDP-like features are: message boundary preservation, unordered message delivery, one-to-many sockets at the application level. Among TCP-like features: positive (selective) acknowledgment, retransmission of lost data, windowed flow control, congestion control, one-to-one sockets at the application level. What makes SCTP unique are some features which do not appear in other transport protocols:

  • multihoming
  • multiple streams per connection
  • built-in heartbeats
  • much more flexible configuration of certain parameters – especially those controlling timing
  • asynchronous exposure of its internal states to the application level through the use of notifications

How much of this is useful for SIP? Message boundary preservation as in case of UDP will make reading/parsing of the SIP message easier; whereas unordered message delivery can help in case of head of line blocking.

What is really helpful for real time oriented applications is that SCTP sockets offer fine tuning of timer values and more details about what has happened on a certain association. The parameters that control the timers for association setup, retransmissions and heartbeats, are configurable per system and per socket. Transport layer failure detection can work fast when appropriate values for SCTP heartbeats are used.

Things get even simpler from the application layer programming perspective. The SCTP notifications provide the means of monitoring what happens with a certain SCTP socket and are standardized by the SCTP socket API. A broad range of asynchronous notifications are sent on the SCTP socket: association start-up, association setup attempt failure, transport-level events, remote operational errors, undeliverable messages.

There are of course pitfalls – SCTP is a relative newcomer in the transport protocol ecosystem. The SCTP socket API is a moving target still under development. Due to this novelty, the level of complexity of some of the SCTP stack implementations is inversely proportional to the time spent on testing them; sometimes their throughput is not on a par with that offered by TCP.
