A load balancer (LB) is a device that acts as a reverse proxy and distributes network or application traffic across a number of servers. LBs are used to increase capacity (concurrent users) and reliability of applications. They improve the overall performance of applications by decreasing the burden on servers associated with managing and maintaining application and network sessions, as well as by performing application-specific tasks.
This article covers several types of load balancers:
- general L7 (application layer) load balancers
- DNS load balancers, a special case at L7
- L4 (transport layer) load balancers
- general L3 (network layer) load balancers
- Anycast load balancers
We’ll skip most of the jargon to keep the required networking background to a minimum, except for the basic idea of the OSI 7-layer model. The OSI model separates the responsibilities of devices in a network, with each layer providing a standard, transparent interface to the layers above and below it. From highest to lowest, the layers are:
- Application Layer (Layer 7). High-level APIs, including resource sharing, remote file access
- Presentation Layer. Translation of data between a networking service and an application; including character encoding, data compression, and encryption/decryption
- Session Layer. Managing communication sessions, i.e. continuous exchange of information in the form of multiple back-and-forth transmissions between two nodes
- Transport Layer. Reliable transmission of data segments between points on a network, including segmentation, and multiplexing
- Network Layer. Structuring and managing a multi-node network, including addressing, routing and traffic control
- Data Link Layer. Reliable transmission of data frames between two nodes connected by a physical layer
- Physical Layer (Layer 1). Transmission and reception of raw bit stream over a physical medium
Each layer treats the layers above and below it (where they exist) as transparent, meaning it has no insight into their contents. This is a very important concept in the OSI model, and the rest of this article relies on it.
Types of LBs
L7, application layer
An L7 load balancer operates at the highest layer of the OSI model. This doesn’t mean it has insight into every layer; it sees only L7. The content at L7 is the actual application data, carried by protocols such as HTTP, DNS, and FTP. This gives an L7 LB the ability to terminate traffic and make decisions based on the content of messages, such as the URL or a cookie. It then forwards the traffic to the actual servers behind it.
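As a rough illustration of a content-based decision, the sketch below picks a backend pool from the URL path and a cookie. The pool names, addresses, the `pool` cookie, and the `choose_pool` function are all hypothetical and exist only for this example; a real L7 LB would apply rules like these after parsing the HTTP request.

```python
from typing import Mapping

# Hypothetical backend pools; names and addresses are placeholders.
POOLS = {
    "api": ["10.0.1.10", "10.0.1.11"],
    "static": ["10.0.2.10"],
    "default": ["10.0.3.10", "10.0.3.11"],
}


def choose_pool(path: str, cookies: Mapping[str, str]) -> str:
    """Pick a pool name using L7 content: the URL path and a cookie."""
    # A cookie (assumed to be called "pool") can pin a client to a pool.
    if cookies.get("pool") in POOLS:
        return cookies["pool"]
    # Otherwise route on the URL path prefix.
    if path.startswith("/api/"):
        return "api"
    if path.startswith("/static/"):
        return "static"
    return "default"
```

Calling `choose_pool("/api/users", {})` selects the `api` pool, while a request carrying the cookie `pool=static` goes to the `static` pool regardless of its path.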
DNS was mentioned above as an L7 protocol, so a DNS LB should operate the same way as a general L7 LB, right? Not quite. Applications usually query DNS to find the IP address of a domain and then use the response to reach the destination. A DNS LB is not a load-balanced group of DNS servers; it is a DNS server (or a set of them) that answers from a pre-defined set of DNS records and varies the answer on each query, typically by round robin. Since the response changes every time (ignoring caching), applications are spread across a group of servers.
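A toy model of this behaviour is sketched below. It assumes the server returns the full record list and rotates its order on each query, which is one common way to vary the answer. The `RoundRobinDNS` class is hypothetical, and the addresses come from the documentation range 192.0.2.0/24.

```python
from collections import deque


class RoundRobinDNS:
    """Toy resolver that rotates its record list after every query."""

    def __init__(self, records):
        # One deque of addresses per domain name.
        self._records = {name: deque(ips) for name, ips in records.items()}

    def resolve(self, name):
        ips = self._records[name]
        answer = list(ips)  # snapshot of the current order
        ips.rotate(-1)      # the next query starts with the next address
        return answer


dns = RoundRobinDNS({"example.com": ["192.0.2.1", "192.0.2.2", "192.0.2.3"]})
first = dns.resolve("example.com")   # starts with 192.0.2.1
second = dns.resolve("example.com")  # starts with 192.0.2.2
```

Clients that simply use the first address in each answer end up spread across the three servers.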
L4, TCP/UDP LB
An L4 load balancer comes into consideration when traffic volume grows. L4 covers TCP and UDP. An L4 load balancer simply forwards packets based on their IP address and port number pairs, or on port numbers alone. This makes it possible to load balance specific flows across pools of servers. As mentioned above, it cannot inspect anything above L4, so it can’t make routing decisions based on message content.
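One way to keep every packet of a flow on the same backend is to hash the address and port fields, as in the sketch below. Hashing is an assumption of this example rather than something every L4 LB does, and the `BACKENDS` pool and `pick_backend` function are hypothetical.

```python
import hashlib

# Hypothetical backend pool.
BACKENDS = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]


def pick_backend(src_ip, src_port, dst_ip, dst_port, proto="tcp"):
    """Map a flow to a backend using only L3/L4 header fields."""
    key = f"{proto}|{src_ip}|{src_port}|{dst_ip}|{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    # Use the first 8 bytes of the digest as an integer index into the pool.
    index = int.from_bytes(digest[:8], "big") % len(BACKENDS)
    return BACKENDS[index]
```

Because the result depends only on header fields, the same flow always maps to the same backend, and no message content is ever read.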
L3, network layer
Layer 3 load balancers are typically based on the IP protocol. An L3 load balancer usually sees only IP information and makes its decision solely on the IP address (source and/or destination).
Most of the internet uses unicast, where routing devices find the best (usually shortest) path to a single destination IP. Anycast is a technique in which multiple network devices share the same IP address. Routers treat them as the same destination reachable via paths with different costs and choose the best path according to their policies.
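The effect can be sketched by giving two routers their own view of the path costs to the same anycast address. The site names, the cost values, and the lowest-cost-wins policy are assumptions made for this example; real routers apply richer policies.

```python
def best_route(path_costs):
    """Return the site with the lowest path cost, as one router sees it."""
    return min(path_costs, key=path_costs.get)


# Hypothetical costs from two routers to three sites sharing one IP address.
tokyo_router = {"tokyo-site": 2, "frankfurt-site": 9, "virginia-site": 7}
berlin_router = {"tokyo-site": 9, "frankfurt-site": 1, "virginia-site": 6}

tokyo_choice = best_route(tokyo_router)    # "tokyo-site"
berlin_choice = best_route(berlin_router)  # "frankfurt-site"
```

Each router sends traffic to the nearest site. If a site stops announcing the address, its entry disappears from the table and the next-best site takes over.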
Common algorithms and related techniques
Round Robin (RR)
This is the simplest algorithm used in load balancing. Imagine an LB that returns one answer per request, with instances A, B, and C behind it. The answers follow this order:
A -> B -> C -> A -> B -> C -> A -> …
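A minimal sketch of the same rotation, using the standard library's `itertools.cycle` and the instance names from the example above:

```python
from itertools import cycle, islice

# Endless iterator that walks the server list in order and wraps around.
servers = cycle(["A", "B", "C"])

# Take the first seven picks.
first_seven = list(islice(servers, 7))
print(" -> ".join(first_seven))  # A -> B -> C -> A -> B -> C -> A
```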
Weighted Round Robin (WRR)
RR is robust because of its simplicity. If the servers differ in performance and we want to distribute traffic fairly rather than equally, WRR fits well. It operates the same way as RR, but a server with a higher weight receives more traffic in each round. Taking the same example but giving server B a weight of 2, the answers follow this order, or something similar:
A -> B -> B -> C -> A -> B -> B -> C -> A -> …
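The sequence above can be reproduced with a naive sketch that repeats each server as many times as its weight and then cycles through the expanded list. The `weighted_round_robin` helper is hypothetical, and other implementations may interleave the picks differently, hence "or something similar".

```python
from itertools import cycle, islice


def weighted_round_robin(weights):
    """Cycle through servers, repeating each one `weight` times per round."""
    # Relies on dicts preserving insertion order (Python 3.7+).
    schedule = [name for name, weight in weights.items() for _ in range(weight)]
    return cycle(schedule)


picks = list(islice(weighted_round_robin({"A": 1, "B": 2, "C": 1}), 9))
print(" -> ".join(picks))  # A -> B -> B -> C -> A -> B -> B -> C -> A
```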
If an LB is capable of tracking the state of traffic, more algorithms become available. Methods such as least connections and least response time need to track connection state. They can help in some cases, but the state tracking can become a bottleneck as traffic grows.
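To show what that state looks like, here is a minimal least-connections sketch. The `LeastConnections` class is hypothetical. It keeps one counter per server, which must be updated every time a connection opens or closes; that bookkeeping is the per-connection state described above.

```python
class LeastConnections:
    """Pick the server with the fewest active connections."""

    def __init__(self, servers):
        # Active connection count per server.
        self.active = {server: 0 for server in servers}

    def acquire(self):
        """Choose a server for a new connection and record it."""
        # Ties go to the server listed first.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        """Record that a connection to `server` has closed."""
        self.active[server] -= 1


lb = LeastConnections(["A", "B", "C"])
opened = [lb.acquire() for _ in range(3)]  # one connection on each server
lb.release("B")                            # B now has the fewest
next_pick = lb.acquire()                   # "B"
```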
Health checking isn’t strictly mandatory, but it is almost a requirement when using an LB. It works as its name suggests: the LB regularly checks whether a service is still alive to decide whether subsequent traffic should be directed to it.
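A simple form of health check is to try opening a TCP connection to each backend. The sketch below assumes that a successful connection means the service is alive. The `tcp_check` and `healthy_backends` helpers are hypothetical, and a real LB would run the check on a timer and often probe at the application level as well.

```python
import socket


def tcp_check(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def healthy_backends(backends, check=tcp_check):
    """Keep only the (host, port) pairs that pass the health check."""
    return [(host, port) for host, port in backends if check(host, port)]
```

The LB then applies its balancing algorithm only to the list returned by `healthy_backends`, so a dead service stops receiving traffic until it passes the check again.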