Introduction

A load balancer setup can share load between multiple FS hosts according to load. A single ingress IP can be presented to the customer for the entire cluster simplifying the customer configuration (this can be combined with DNS SRV to give failover between data centres). The load balancer will measure load on each server in the cluster and send calls to the least loaded server. This ensures that the call load is spread among servers evenly, so no server

SIP signalling goes via the load balancer server, but media goes direct between the customer endpoint and the FS host in the cluster. This is similar to the bypass media functionality of FS. Since media does not go through the OpenSIPS server, the hardware requirements are far smaller that for the FS hosts in the cluster.

For the purposes of the example of this page there are 2 load balancer servers lb1 (100.100.100.101) & lb2 (100.100.100.102) and 4 FreeSWITCH servers (fs1-fs4, 100.100.100.103 - 100.100.100.106). All are on the same subnet (255.255.255.0). The virtual IP is 100.100.100.100.

Loadbalancer Configuration

For redundancy you should have two servers that act as load balancers. These will be in an active-passive configuration, i.e. all traffic goes via the active server the passive server is promoted to active if the active server fails or is taken down for maintenance. They each have their own IP so that they can be accessed independantly (e.g. via SSH for maintenance) and share a virtual IP which is assigned to the current active server. SIP traffic is accepted on the virtual IP.

It is recommended that lb1 and lb2 have multiple ways to communicate, to prevent a split brain situation where both lb1 and lb2 think the other is offline and both start listening on the virtual IP.

Heartbeat

Heartbeat is responsible for assigning the virtual IP to a server and detecting when the other server fails.

You will need to create 3 files in /etc/heartbeat: authkeys, ha.cf, haresources. These files must be identical on both lb1 and lb2.

authkeys

This file stores one or more authentication methods, and selects which to use with the auth statement. For example:

auth 2
1 crc
2 sha1 thisismysecretkey
3 md5 thisismysecretkey

The key here is used to authenticate communication between nodes in the cluster and prevents malicious servers being able to trick a node into going offline. As a result it should be kept secret and very secure. CRC should never be used in production as it provides no authentication, just error checking.

ha.cf

This is only an example. Timings can be adjusted to suit your needs. This uses a serial cable and network broadcast to check the other node but 1 or more of serial, unicast, multicast and broadcast can be used and none is required (apart from there must be at least one communication method).

haresources

This file controls what resources are configured on the load balancer servers, this is where the virtual IP is configured. The format is:

preferred_hostname IPaddr2::ip_address/netmask(/interface)

Interface is optional, but for our example there are two NICs (eth0 and eth1) bonded to form bond0 for network redundancy.

For example:

lb1 IPaddr2::100.100.100.100/255.255.255.0/bond0

Sysctl

OpenSIPS will bind to the virtual IP. This will result in an error on the passive server which prevents OpenSIPS starting up since it will be attempting to listen on an IP that does not belong to it. This can be avoided by setting the Linux option net.ipv4.ip_nonlocal_bind=1.

On debian this can be achieved by creating the file /etc/sysctl.d/linuxha.conf:

# allow services to bind to the virtual ip even when this server is the passive machine
net.ipv4.ip_nonlocal_bind = 1

OpenSIPS

This example listens for both UDP and TCP SIP calls on the virtual IP.

The dispatcher is used distribute REGISTER requests among the FreeSWITCH hosts. It uses the OPTIONS packet to detect when a FreeSWITCH host is offline and stop sending requests to that host. Dispatcher uses weights and doesn't take server load into account; this is ok for REGISTER requests but unsuitable for actual calls.

The load_balancer module is used to distribute INVITE requests (calls). It also uses OPTIONS to automatically stop sending traffic to a host if it goes offline. Unlike dispatcher, it stores the number of channels available on a host and in memory keeps a count of how many calls are currently in progress to that host, sending the INVITE to the least loaded host.

load_balance() takes an algorithm as the 3rd option - either 0 (absolute) or 1 (relative). The example here uses relative, so the "least loaded" server is the one with the lowest percentage load (current_channels/max_channels) while absolute would send to the one with the most free channels (maximum_channels-current_channels).

MySQL Cluster

For calls to failover without dropping, the dialogs module must share its information in a shared database. This then means that on failover UDP calls will not be dropped since OpenSIPS will still know where to forward any SIP packets passing through the passive (now active) server as they'll be in its database. TCP calls cannot failover as gracefully in this way because the TCP connection will not be known to Linux/OpenSIPS on the passive server.

The database also provides a source for the list of hosts and weights/channels for the dispatcher and load_balancer modules. These must be told to reload the data from the database via a MI module if the database is updated.

There are multiple methods for setting up a shared database. You can have a single database, but this provides a point of failure. You can also have a pair of database servers with master-master, but this can cause problems since the replication is asynchronous. The MySQL Cluster solution provides a nice solution.

For a fully redundant setup, you should have at least 2 management servers, 2 data nodes and 2 sql nodes. All can be on the same hardware or on dedicated hardware. For this example they'll all run on lb1 and lb2, but in real usage it might be advisable to move them onto separate servers to reduce the load on the load balancer servers. This could be a LAN which lb1 & lb2 are also on, to remove them from the public Internet (this is advisable since MySQL Cluster uses no authentication/encryption).

FreeSWITCH Configuration

FreeSWITCH needs very little additional configuration to function with the load balancer setup. Your existing setup will mostly work just fine. There are a couple of areas that will need attention though.

Authentication

User authentication would just fine, as the 401 Not Authorized and authentication headers are forwarded by the SIP proxy.

ACLs might be trickier however, since the requests will all come from the virtual IP.

TODO: is it possible to get FreeSWITCH to authenticate against an ACL from a value in a header if the request comes from an IP matching a proxy ACL? If so OpenSIPS could add an IP header

SIP Registrations

REGISTER requests will go to a host in FreeSWITCH cluster. For the other hosts to know about the registration they will need to store the mod_sofia data in a shared database (single database, master-master replication or MySQL cluster). All hosts must also be configured to handle calls for the same domain, and the registration must be to this domain.

CDRs

The network IP logged in the CDRs will be that of the load balancer not the client's IP, this should be beared in mind.