Load balancing

When Fastly cannot satisfy end-user requests directly, we usually forward the request on to a backend. If you wish, you can spread the traffic from Fastly across multiple backends to balance the load, protect against backend failure, and avoid overwhelming any one server.

HINT: This page deals with dividing traffic between backends to balance load. If you want multiple backends on your service in order to support a service-oriented or microservices architecture, you will want to write your own logic for selecting backends based on URL paths. Refer to selecting backends and a code example for microservices instead of the information below.

In VCL services, support for load balancing is built-in via directors. In Compute services it can be performed explicitly by writing load balancing strategies in your own edge code.

Directors

Directors are entities in VCL services that group together a list of backends, or other directors, with a policy (also known as a 'strategy') for choosing one. These can be created automatically for you by using automatic load balancing, or defined explicitly in VCL code using the director declaration, for example:

director my_director random {
    { .backend = F_backend1; .weight = 2; }
    { .backend = F_backend2; .weight = 1; }
}
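
To illustrate what the weights in a declaration like this mean, here is a minimal JavaScript sketch of weighted random selection. It is a model for intuition only, not Fastly's implementation of the random policy; the `pickWeighted` function and its injectable `rand` parameter are assumptions of this sketch.

```javascript
// Illustrative model only: not Fastly's implementation of the random policy.
// Members mirror the my_director declaration above.
const members = [
  { backend: "F_backend1", weight: 2 },
  { backend: "F_backend2", weight: 1 },
];

// Pick a member with probability weight / totalWeight.
// `rand` is a number in [0, 1); it is injectable so the choice can be tested.
function pickWeighted(members, rand = Math.random()) {
  const total = members.reduce((sum, m) => sum + m.weight, 0);
  let threshold = rand * total;
  for (const m of members) {
    threshold -= m.weight;
    if (threshold < 0) return m.backend;
  }
  // Guard against floating-point rounding at the upper edge.
  return members[members.length - 1].backend;
}
```

With weights of 2 and 1, roughly two thirds of selections in this model go to F_backend1 and one third to F_backend2.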

Ultimately, all load balancing in VCL uses directors defined similarly to the one above, but there are several ways to create these configurations, with different trade-offs:

Method	How to create	How to use	Works with shielding
Automatic load balancing	Web interface or API	Automatic	Yes
Custom directors	Custom VCL or API	Custom VCL	Not by default
Dynamic server pools	API	Custom VCL	No

Regardless of the method you use, to provide redundancy and failover within any load balancing group, ensure that all backends that participate in load balancing have a health check configured. Directors will redistribute traffic to other backends in the group if one backend is unhealthy. Learn more about failover and redundancy.

Automatic load balancing

The simplest way to perform load balancing is to set the auto_loadbalance property of each backend to true. This can be done via the API, CLI, or in the web interface.

Auto load balancing generates a director with the random policy and equal weights for each backend, and groups together backends that share identical conditions. The order of priority for selecting backends for a request is:

Evaluate conditions that are attached to backends, in order of increasing priority. If a condition is matched, do one of the following:
- if the condition is associated with only one backend, select that backend
- if the condition is associated with multiple backends and at least two of them have auto_loadbalance enabled, choose one of the load balanced ones at random
- (rarely) if the condition is associated with multiple backends and no more than one has auto_loadbalance enabled, ignore the load balancing setting and choose the backend that was added to the service first.
If no conditions match, determine whether there are two or more backends that have auto load balancing enabled and do not have any conditions attached.
- If so, choose one of those backends randomly.
Otherwise, consider backends without attached conditions and with auto load balancing disabled, and select the one that was created earliest.

HINT: For example, if you want to operate 5 backends (one for requests starting with /account, two for all other requests originating from North America, and two for all other traffic), a potential solution using auto load balancing would be:

Set auto load balancing to true on all 5 backends.
Create a condition with the expression req.path ~ "/account(?:/.*)?\z" and attach it to the Accounts microservice backend with a priority of 10.
Create a condition with the expression client.geo.continent_code == "NA" and attach it to two of the backends with a priority of 20.
Leave the remaining two backends without a condition.
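
The selection order described earlier can be modelled as a small JavaScript function and applied to this five-backend setup. This is a sketch for reasoning about the rules, not Fastly's generated VCL: conditions are represented as JavaScript predicates, and the backend names, the `auto` flag, and the `req.continent` field are hypothetical stand-ins.

```javascript
// Illustrative model of the selection order; not Fastly's generated VCL.
// Conditions are modelled as predicates (an assumption of this sketch).
const accountPath = {
  priority: 10,
  test: (req) => req.path === "/account" || req.path.startsWith("/account/"),
};
const northAmerica = { priority: 20, test: (req) => req.continent === "NA" };

// Backends in the order they were added to the service (names are hypothetical).
const backends = [
  { name: "accounts", auto: true, condition: accountPath },
  { name: "na1", auto: true, condition: northAmerica },
  { name: "na2", auto: true, condition: northAmerica },
  { name: "other1", auto: true, condition: null },
  { name: "other2", auto: true, condition: null },
];

// `rand` is a number in [0, 1).
const pickRandom = (list, rand) => list[Math.floor(rand * list.length)];

function selectBackend(backends, req, rand = Math.random()) {
  // Step 1: evaluate attached conditions in order of increasing priority.
  const conditions = [...new Set(backends.map((b) => b.condition).filter(Boolean))]
    .sort((a, b) => a.priority - b.priority);
  for (const condition of conditions) {
    if (!condition.test(req)) continue;
    const members = backends.filter((b) => b.condition === condition);
    if (members.length === 1) return members[0].name;
    const balanced = members.filter((b) => b.auto);
    // Two or more load-balanced members: random choice; otherwise first added.
    return balanced.length >= 2 ? pickRandom(balanced, rand).name : members[0].name;
  }
  // Step 2: no condition matched, so look at backends without conditions.
  const unconditioned = backends.filter((b) => !b.condition);
  const balanced = unconditioned.filter((b) => b.auto);
  if (balanced.length >= 2) return pickRandom(balanced, rand).name;
  // Step 3: earliest-created unconditioned backend with load balancing disabled.
  const fallback = unconditioned.find((b) => !b.auto);
  return fallback ? fallback.name : null;
}
```

In this model, a request for /account/settings always resolves to the accounts backend, a request for / from North America resolves to na1 or na2, and any other request resolves to other1 or other2.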

Auto load balancing is simple but has some limitations:

If your service uses shielding, the backends participating in auto load balancing should all be configured with the same shield POP. If they are not, the shield that is used will depend on the order in which the backends were added to your service.
All backends will have equal weight, i.e. roughly equal amounts of traffic will go to each backend. To configure unequal shares, use a custom director.
The director will have the random policy. For other policies, use a custom director.
The director will have a quorum of 1, so the director itself will be healthy as long as there is at least one healthy member. To configure a higher quorum, use a custom director.

Custom directors

Explicitly defined directors allow customization of director policies, and complex load balancing logic. They may be defined via API or in custom VCL and require custom VCL to invoke. They are not by default compatible with shielding.

To define a director:

use a director declaration in an 'init' VCL snippet or in the init space of a custom VCL file (i.e. outside of any VCL subroutines); or
use the directors API to have Fastly generate the VCL code for you.

To select a custom director, set the req.backend variable in the vcl_recv subroutine of a custom VCL file:

sub vcl_recv {
    #FASTLY RECV
    set req.backend = my_director;
}

HINT: Your desired value of req.backend needs to take precedence over Fastly's automatic backend selection code, so ensure that you insert this code after the #FASTLY RECV line. This can only be done in custom VCL, not VCL snippets.

Custom directors support configurable policies and quorum. The random, hash, client, or chash policies are good choices for load balancing and will also automatically adapt to the failure of individual backends as long as all backends in the director have a health check configured. The fallback policy creates a director whose only purpose is failover.
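
As a rough illustration of how the fallback policy differs from the load balancing policies, the following JavaScript sketch picks the first healthy member in declaration order. It is a model for intuition, not Fastly's implementation: the `pickFallback` function is hypothetical, and health is supplied as a plain lookup table, whereas Fastly derives it from each backend's health check.

```javascript
// Illustrative model only; not Fastly's implementation of the fallback policy.
// fallback: walk the members in declaration order and return the first one
// that is currently healthy.
function pickFallback(members, health) {
  for (const name of members) {
    if (health[name]) return name;
  }
  return null; // no healthy member is available
}

// Hypothetical health state: the first member is failing its health check.
const health = { F_backend1: false, F_backend2: true };
const chosen = pickFallback(["F_backend1", "F_backend2"], health);
```

Because the first member always wins while it is healthy, later members only receive traffic during a failure, which is why this policy suits failover rather than load distribution.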

A director can be used in any context in which a BACKEND type is expected, including as a member of another director.

Combining with shielding

Directors don't work easily with shielding because they require you to set req.backend explicitly in VCL, which will override Fastly's shielding logic and therefore never forward a request to a shield POP. You can work around this:

Create a placeholder backend just for shielding.
- You can name it anything you like, e.g. shielding_iad.
- Use a hostname that won't route outside of Fastly, e.g. 127.0.0.1.
- Configure it to shield at your desired POP, e.g. Ashburn, Virginia (IAD).
- If you want some requests to shield in a different place, add more shielding placeholder backends, e.g. shielding_fra.
Create the 'real' backends you want to use in a director.
- You can name them anything you like, e.g. prod, prod_backup, eu1, eu2.
- Don't enable shielding or load balancing on these backends.
Add a condition to all backends except one.
- On all 'real' backends, add a condition which is always false. The conditional expression can simply be the word false.
- If you have more than one shielding placeholder backend, add conditions to all but one to differentiate when they should be used, e.g., client.geo.country_code == "US".
- You should end up with one backend (a shielding placeholder) that is not subject to any conditions.
Write VCL at the top of your custom VCL file (or in an "init" VCL snippet), to define a director that contains your real backends, for example:
```
director origin_director fallback {
  { .backend = F_prod; }
  { .backend = F_prod_backup; }
}
```
Write VCL in vcl_miss and vcl_pass to choose the director if the current backend is the shielding placeholder:
```
if (req.backend == F_shielding_placeholder) {
  set req.backend = origin_director;
}
```

The effect is that when a request is received, Fastly's automatic backend allocation logic assigns the shielding placeholder backend, because your real backends are disabled by the false condition. If the POP currently processing the request is not the designated shield, the backend is then set to the shield POP instead. In vcl_miss or vcl_pass, if the backend is still set to the shielding placeholder, you know that you are on the shield POP and about to make a fetch to origin, so you can swap out the placeholder backend for the director that you want to use.
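
The same reasoning can be traced in a few lines of JavaScript. This is only a model of the decision flow, assuming a single shield POP; the `resolveFetchTarget` function and the `shield:<POP>` label are made up for illustration and do not correspond to real Fastly identifiers.

```javascript
// Illustrative model only. "F_shielding_placeholder" and "origin_director"
// follow the VCL above; the "shield:<POP>" label is a made-up stand-in for
// whatever Fastly's shielding logic assigns internally.
function resolveFetchTarget(currentPop, shieldPop) {
  // vcl_recv: automatic backend selection assigns the placeholder, because
  // the real backends are disabled by their always-false conditions.
  let backend = "F_shielding_placeholder";

  // Shielding logic: on any POP other than the shield, route to the shield.
  if (currentPop !== shieldPop) {
    backend = `shield:${shieldPop}`;
  }

  // vcl_miss / vcl_pass: only on the shield POP is the placeholder still set,
  // so only there is it swapped for the director.
  if (backend === "F_shielding_placeholder") {
    backend = "origin_director";
  }
  return backend;
}
```

For example, `resolveFetchTarget("LHR", "IAD")` yields the shield label, while `resolveFetchTarget("IAD", "IAD")` yields `origin_director`.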

Dynamic server pools

WARNING: Dynamic server pools are generally not the best solution. This feature remains supported in Limited Availability, but we recommend considering a custom director or automatic load balancing instead.

Creating or editing custom directors, whether created via the API or explicitly in VCL, requires you to create a new version of your service - even to add or remove backends in the director. If you prefer, Fastly can regenerate the list of backends within the director dynamically, without generating a new version of your service.

To use this capability, you must define the director and the constituent backends using the pools API. Also note that:

To select the dynamic server pool director, use custom VCL in exactly the same way as for a custom director.
Dynamic server pools, like custom directors, are not compatible with shielding.
Backends in a dynamic server pool are not listed in the web interface and do not appear in the service's backends list via the backends API.
Dynamic server pools are not listed in the web interface.

Fastly Compute

The Compute platform does not support directors, but load balancing strategies can be written in edge code directly. Here are some examples for implementing load balancing logic in Compute programs.

A simple random load balancer could be written like so:
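
A minimal sketch, assuming three backends with the hypothetical names `origin_0`, `origin_1` and `origin_2` already defined on the service. The selection is a pure function so it can be tested outside Compute; the `pickBackend` name and its injectable `rand` parameter are assumptions of this sketch.

```javascript
// Hypothetical backend names; they must match backends defined on the service.
const BACKENDS = ["origin_0", "origin_1", "origin_2"];

// Choose one backend uniformly at random.
// `rand` is a number in [0, 1); it is injectable so the choice can be tested.
function pickBackend(backends, rand = Math.random()) {
  return backends[Math.floor(rand * backends.length)];
}

// In a Compute request handler, the result would be passed to fetch, e.g.:
//   const response = await fetch(request, { backend: pickBackend(BACKENDS) });
```

Each backend receives roughly an equal share of requests, with no affinity between a given request and a given backend.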

For more complex load balancing, you can make use of consistent hashing to map requests to backends based on certain request properties, such as the request's path or a user ID. The implementation for this approach is roughly the same regardless of which property of the request you use as the input to the hash function.

WARNING: When designing load balancing logic in Compute programs, be aware of how requests will be redistributed if you change the number of backends. Libraries for performing consistent hashing exist in most programming languages, and can be used to implement more stable mappings between requests and backends.

The following JavaScript examples use the consistent-hash library to implement hashing.

In this example, the requests are mapped to backends using the paths and queries of the requests:
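
A minimal sketch of the idea. To stay self-contained it does not use the consistent-hash library; instead it hand-rolls a small hash ring with an FNV-1a style hash. The backend names and all function names here are hypothetical.

```javascript
// 32-bit FNV-1a style hash of a string, returned as an unsigned integer.
function fnv1a(str) {
  let h = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h;
}

// Build a ring: each backend is placed at several points ("virtual nodes")
// so that keys spread more evenly between backends.
function buildRing(backends, replicas = 100) {
  const ring = [];
  for (const backend of backends) {
    for (let i = 0; i < replicas; i++) {
      ring.push({ point: fnv1a(`${backend}#${i}`), backend });
    }
  }
  return ring.sort((a, b) => a.point - b.point);
}

// Walk clockwise: take the first point at or after hash(key), wrapping to
// the start of the ring. A linear scan keeps the sketch short.
function lookup(ring, key) {
  const h = fnv1a(key);
  const entry = ring.find((e) => e.point >= h) || ring[0];
  return entry.backend;
}

// Hypothetical backend names; they must match backends defined on the service.
const ring = buildRing(["origin_0", "origin_1", "origin_2"]);

// Use the path plus query string as the hash key.
function backendForUrl(ring, url) {
  const { pathname, search } = new URL(url);
  return lookup(ring, pathname + search);
}

// In a Compute request handler, this might be used as:
//   fetch(request, { backend: backendForUrl(ring, request.url) });
```

With a ring like this, removing one backend only remaps the keys that were assigned to it; keys assigned to the remaining backends keep their mapping.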

Here, requests are mapped using a cookie containing a user ID:
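
A minimal sketch of this variant. It does not use the consistent-hash library; for brevity it swaps in rendezvous (highest-random-weight) hashing, a compact alternative to a hash ring that gives a similarly stable mapping. The `user_id` cookie name, the backend names, and all function names are hypothetical.

```javascript
// 32-bit FNV-1a style hash of a string, returned as an unsigned integer.
function fnv1a(str) {
  let h = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h;
}

// Rendezvous hashing: score every backend against the key and pick the
// backend with the highest score.
function pickByKey(backends, key) {
  let best = null;
  let bestScore = -1;
  for (const backend of backends) {
    const score = fnv1a(`${backend}:${key}`);
    if (score > bestScore) {
      best = backend;
      bestScore = score;
    }
  }
  return best;
}

// Extract a cookie value from a Cookie header; returns null if absent.
function getCookie(header, name) {
  for (const part of (header || "").split(";")) {
    const [k, ...v] = part.trim().split("=");
    if (k === name) return v.join("=");
  }
  return null;
}

// `user_id` is a hypothetical cookie name.
function backendForUser(backends, cookieHeader) {
  const userId = getCookie(cookieHeader, "user_id");
  // Without a user ID there is nothing to pin on, so pick at random.
  if (userId === null) {
    return backends[Math.floor(Math.random() * backends.length)];
  }
  return pickByKey(backends, userId);
}

// In a Compute request handler, this might be used as:
//   const cookie = request.headers.get("cookie");
//   fetch(request, { backend: backendForUser(["origin_0", "origin_1"], cookie) });
```

Requests carrying the same user ID always map to the same backend in this sketch, so per-user state or caches on that backend stay warm.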

The health status of backends is exposed in some Compute SDKs and allows for the creation of custom failover logic in Compute programs.
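
As a rough sketch of such failover logic, the function below filters out unhealthy backends before choosing one at random. How health is obtained depends on the SDK, so `isHealthy` is a hypothetical callback standing in for that lookup; falling back to the full list when nothing reports healthy is a design choice of this sketch, not a platform behavior.

```javascript
// `isHealthy` is a hypothetical callback standing in for whatever health
// lookup your Compute SDK exposes. `rand` is a number in [0, 1).
function pickHealthy(backends, isHealthy, rand = Math.random()) {
  const healthy = backends.filter(isHealthy);
  // If nothing reports healthy, try the full list rather than failing outright.
  const pool = healthy.length > 0 ? healthy : backends;
  return pool[Math.floor(rand * pool.length)];
}

// Example with a hard-coded health table (hypothetical backend names).
const status = { origin_0: false, origin_1: true, origin_2: true };
const target = pickHealthy(Object.keys(status), (name) => status[name], 0);
```

Here `origin_0` is skipped because it reports unhealthy, so the selection is made between `origin_1` and `origin_2` only.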
