SIP Load Balancing != IP-Based Load Balancing
When it is time to scale up a SIP infrastructure, the network planner will most likely ask: since DNS alone is not a sufficient solution, would a simple IP load balancer be enough?
A simple IP load balancer would act as a front end for the SIP cluster, and all traffic going to the SIP cluster would pass through it. This can be achieved by having a DNS entry that maps the domain name of the cluster to an IP address served by the load balancer. The IP load balancer would then distribute the incoming SIP traffic using some load distribution mechanism, such as round-robin or a hash of the source IP address.
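The two distribution mechanisms mentioned above can be sketched in a few lines. This is only an illustration: the node addresses and the function names `pick_round_robin` and `pick_by_source_hash` are assumptions made for the example, and a real IP load balancer would apply the same idea at the packet level.

```python
import hashlib
from itertools import cycle

# Hypothetical addresses of the SIP nodes behind the load balancer.
NODES = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]

_rotation = cycle(NODES)


def pick_round_robin() -> str:
    """Return the next node in strict rotation, ignoring the packet content."""
    return next(_rotation)


def pick_by_source_hash(source_ip: str, nodes=NODES) -> str:
    """Map a source IP address to a node via a stable hash modulo the cluster size."""
    digest = hashlib.sha256(source_ip.encode()).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]
```

Neither function looks at the SIP message itself, which is exactly the limitation discussed next.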
Such an approach might be sufficient when the SIP nodes in the cluster are transaction-stateless SIP proxies. In all other cases, this simple approach does not work:
- Responses and requests for the same transaction should traverse the same nodes. Hence, the load balancer should at least be able to route responses based on the Via header; otherwise a response will reach a SIP node that knows nothing about the transaction and will most likely drop it or generate an error. This means that the load balancer needs to act as a transaction-stateless proxy and parse at least the Via headers.
- If all requests that belong to the same dialog are expected to be processed by the same server in the cluster, then neither round-robin nor a hash of the source IP address will work. This is the case, for example, if the SIP server collects and generates CDRs or is an IVR. Why round-robin is not an option should be clear. Using a hash of the source IP address to determine the SIP node could work in a perfect world. However, a SIP client might change its IP address during a dialog, or the size of the cluster might change: if a server is added to or removed from the cluster, the hash will map requests of an ongoing dialog to a different node.
- In some scenarios, such as clusters of PSTN gateways, the nodes of the cluster might generate calls themselves. In this case the load balancer needs to route the incoming responses to the right nodes, which requires it to process the SIP headers and route the responses using the Via headers.
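The problem with source-IP hashing when the cluster changes size can be illustrated with a small experiment. The sketch below assumes a plain "hash modulo number of nodes" scheme; the client addresses, node names, and helper functions are made up for the example.

```python
import hashlib
from typing import Sequence


def node_for(source_ip: str, nodes: Sequence[str]) -> str:
    """Pick a node with a stable hash of the source IP modulo the cluster size."""
    digest = hashlib.sha256(source_ip.encode()).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


def count_remapped(clients: Sequence[str], before: Sequence[str], after: Sequence[str]) -> int:
    """Count the clients whose node changes between two cluster configurations."""
    return sum(1 for ip in clients if node_for(ip, before) != node_for(ip, after))


# Hypothetical clients and cluster layouts.
clients = [f"192.0.2.{i}" for i in range(1, 201)]
three_nodes = ["node-a", "node-b", "node-c"]
four_nodes = three_nodes + ["node-d"]

moved = count_remapped(clients, three_nodes, four_nodes)
print(f"{moved} of {len(clients)} clients now hash to a different node")
```

Any client that is remapped in the middle of a dialog would have its next in-dialog request delivered to a server that has never seen that dialog.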
So, in short, a load balancer for a cluster of SIP nodes must have some SIP logic. How much SIP logic it needs depends on the usage scenario, the type of servers in the cluster, and the expectations of the operator.
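As a rough idea of the minimum SIP logic involved, the following sketch extracts the host and port from the topmost Via header of a message so that a response can be sent to the node that issued the request. It is a simplification under several assumptions: the function name `top_via_sent_by` is invented for the example, the topmost Via is assumed to belong to the cluster node, IPv6 literals and comma-separated Via values are not handled, and the port defaults to 5060 regardless of transport.

```python
import re

# Matches "Via:" (or the compact form "v:") at the start of a header line.
VIA_RE = re.compile(
    r"^(?:Via|v)\s*:\s*SIP/2\.0/(?P<transport>\w+)\s+"
    r"(?P<host>[^;:\s,]+)(?::(?P<port>\d+))?",
    re.IGNORECASE | re.MULTILINE,
)


def top_via_sent_by(message: str):
    """Return (host, port) from the first Via header, or None if there is none."""
    match = VIA_RE.search(message)
    if match is None:
        return None
    port = int(match.group("port")) if match.group("port") else 5060
    return match.group("host"), port


# Hypothetical response arriving at the load balancer.
response = (
    "SIP/2.0 200 OK\r\n"
    "Via: SIP/2.0/UDP 10.0.0.12:5060;branch=z9hG4bK776asdhds\r\n"
    "Via: SIP/2.0/UDP 198.51.100.7;branch=z9hG4bKnashds8\r\n"
    "Call-ID: a84b4c76e66710\r\n"
    "CSeq: 1 INVITE\r\n"
    "\r\n"
)
target = top_via_sent_by(response)
```

A production load balancer would use a full SIP parser instead of a regular expression, but the routing decision itself is no more than this lookup.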
In general one can implement a SIP load balancer in one of two ways:
- Transparent: The existence of the load balancer is transparent to both the clients and the servers. Clients send their traffic to the load balancer, which forwards it to the servers without adding any SIP headers. The servers use the load balancer as a kind of router to send their responses back to the clients. The Via and Record-Route headers in the SIP messages leaving the load balancer will include the IP address of the load balancer. This can be achieved either by configuring the nodes in the cluster to use the IP address of the load balancer when adding a Via or Record-Route header, or by having the load balancer rewrite the messages leaving the cluster and replace the IP addresses they contain with its own address.
- Non-Transparent: The load balancer acts as an outbound proxy that receives traffic from clients, adds a Via header and possibly a Record-Route header, and forwards the traffic to one of the servers.
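The header handling in the non-transparent mode can be sketched as follows. The function name `add_via` and the host name `lb.example.com` are assumptions for illustration. The branch parameter starts with the magic cookie `z9hG4bK` required by RFC 3261; it is generated randomly here, which fits a transaction-stateful proxy, whereas a stateless proxy would derive it deterministically from the request.

```python
import secrets


def add_via(request: str, lb_host: str, lb_port: int = 5060, transport: str = "UDP") -> str:
    """Insert the load balancer's own Via header directly after the request line."""
    request_line, rest = request.split("\r\n", 1)
    # RFC 3261 requires the branch to start with the magic cookie "z9hG4bK".
    branch = "z9hG4bK" + secrets.token_hex(8)
    via = f"Via: SIP/2.0/{transport} {lb_host}:{lb_port};branch={branch}"
    return "\r\n".join([request_line, via, rest])


# Hypothetical request received from a client.
incoming = (
    "INVITE sip:bob@example.com SIP/2.0\r\n"
    "Via: SIP/2.0/UDP 198.51.100.7:5060;branch=z9hG4bKnashds8\r\n"
    "Call-ID: a84b4c76e66710\r\n"
    "CSeq: 1 INVITE\r\n"
    "\r\n"
)
outgoing = add_via(incoming, "lb.example.com")
```

Because the load balancer's Via is now the topmost one, the server sends its response back to the load balancer, which removes its own Via and forwards the response according to the next one.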
The transparent mode has the advantage that the addresses of the nodes in the cluster are hidden from the clients, which provides topology hiding. Also, when the servers in the cluster support NAT traversal, clients behind symmetric NATs expect incoming calls to be routed through the same SIP server that handles their registrations and outgoing calls. With the non-transparent approach the load balancer would have to deal with NAT traversal itself. With the transparent approach each server in the cluster is responsible for a subset of the clients, which keeps the complexity of the load balancer low and its capacity high.
A major advantage of the non-transparent approach is that the load balancer acts as a SIP proxy and can, for example, reroute a request that was rejected by an overloaded server to another server.
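A minimal sketch of such rerouting is shown below. It assumes that an overloaded server signals its state with a 503 Service Unavailable response, and that a caller-supplied `send` function (hypothetical, not part of any real library) forwards the request to a node and returns the final status code.

```python
from typing import Callable, Sequence


def forward_with_failover(
    request: str,
    nodes: Sequence[str],
    send: Callable[[str, str], int],
) -> tuple:
    """Try the nodes in order and move on whenever one answers 503."""
    for node in nodes:
        status = send(node, request)
        if status != 503:
            return node, status
    # Every node rejected the request, so the 503 is passed on to the client.
    return None, 503


def fake_send(node: str, request: str) -> int:
    """Stand-in transport for the example: node-a is overloaded, others accept."""
    return 503 if node == "node-a" else 200


chosen, status = forward_with_failover("INVITE ...", ["node-a", "node-b"], fake_send)
```

A transparent load balancer cannot do this, because it does not take part in the transaction and therefore never acts on the rejection itself.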




