Requests targeting a specific portion of the data are consistently routed to the same server, but requests targeting a different portion of the data might be sent to a different server.

Load spreading is useful for deployments in which:

  • The directory information tree (DIT) contains a large number of branches below a common parent.
  • Most operations (including search operations, as indicated by the search base DN) only target entries at least one level below that common parent.

For example, load spreading can be useful for a multi-tenant deployment in which all of the entries for a given tenant are within their own branch, and all of the tenant branches reside below a common parent.

Load spreading is configured with the load-spreading-base-dn property. The value or values of this property are the base DN or DNs below which the tenant entries reside.

In a deployment with a DIT as in the following example, the load-spreading-base-dn value would be set to ou=customers,dc=example,dc=com.

dc=example,dc=com
ou=customers,dc=example,dc=com
ou=Customer 1,ou=customers,dc=example,dc=com
ou=Customer 2,ou=customers,dc=example,dc=com
ou=Customer 3,ou=customers,dc=example,dc=com
...

If the load-spreading-base-dn property is not configured, the failover load-balancing algorithm uses the default behavior. If the property is configured with one or more values, but a client requests an operation that targets an entry that is not below any of the configured base DNs, then that operation is handled using the default behavior.

When the load-spreading-base-dn property is configured with one or more values, the load-balancing algorithm continues to generate the same list of lists, but the order of the servers within each list is determined using the following algorithm:

  1. If the list is empty or contains only a single item, then leave it unchanged and skip the remaining steps.
  2. Identify the Relative Distinguished Name (RDN) component from the target entry DN that is exactly one level below one of the load-spreading-base-dn values. If the targeted entry is not below any of the configured load-spreading-base-dn values, the order of servers in each of those lists is based only on the order in which they appear in the load-balancing algorithm’s backend-server property. The remaining steps are skipped.
  3. Compute a SHA-1 digest from the normalized string representation of the identified RDN component. SHA-1 is notably faster than more secure digest algorithms, and it does a good job of distributing values across the entire range of the 160 bits that it generates.
  4. Create a non-negative integer from the last 31 bits of the computed SHA-1 digest.
  5. Compute a modulus using the integer value as the dividend and the number of servers in the current list as the divisor. This yields an integer value that is between 0 and (list.size() - 1), inclusive.
    Note: (list.size() - 1) is the number of servers minus 1.

  6. If the modulus computed is equal to 0, no further action is necessary. If not, move a number of servers equal to the computed modulus from the beginning of the list to the end of the list. The order of the elements that are moved should be preserved.
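The steps above can be sketched in Python as follows. This is only an illustration, not the server's implementation: the function names (split_dn, find_spreading_rdn, order_servers) are hypothetical, the DN parsing naively splits on commas and does not handle escaped or quoted commas, and normalization is approximated by trimming and lowercasing, whereas real normalization depends on the attribute's matching rules.

```python
import hashlib


def split_dn(dn):
    """Naively split a DN into trimmed, lowercased RDN strings.

    Assumption: the DN contains no escaped or quoted commas.
    """
    return [rdn.strip().lower() for rdn in dn.split(",") if rdn.strip()]


def find_spreading_rdn(target_dn, base_dns):
    """Return the RDN exactly one level below a matching base DN, or None."""
    target = split_dn(target_dn)
    for base_dn in base_dns:
        base = split_dn(base_dn)
        offset = len(target) - len(base)
        # The target must be strictly below the base DN.
        if offset >= 1 and target[offset:] == base:
            return target[offset - 1]
    return None


def order_servers(servers, target_dn, base_dns):
    """Return the servers in the order described by steps 1 through 6."""
    servers = list(servers)
    # Step 1: nothing to reorder for an empty or single-item list.
    if len(servers) <= 1:
        return servers
    # Step 2: find the RDN one level below a configured base DN.
    rdn = find_spreading_rdn(target_dn, base_dns)
    if rdn is None:
        return servers  # default order is kept
    # Step 3: SHA-1 digest of the (approximately) normalized RDN.
    digest = hashlib.sha1(rdn.encode("utf-8")).digest()
    # Step 4: non-negative integer from the last 31 bits of the digest.
    value = int.from_bytes(digest[-4:], "big") & 0x7FFFFFFF
    # Step 5: modulus by the number of servers in the list.
    shift = value % len(servers)
    # Step 6: move the first `shift` servers to the end, preserving order.
    return servers[shift:] + servers[:shift]
```

Because the shift depends only on the RDN one level below the base DN, every request for entries in the same branch produces the same server order, regardless of how deep the target entry is within that branch.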

For example, consider a load-spreading-base-dn value of "ou=customers,dc=example,dc=com", a list that contains three servers (ds1, ds2, and ds3, in that order), and a modify request that targets the entry with DN "uid=jdoe,ou=People,ou=Acme,ou=customers,dc=example,dc=com". The RDN component immediately below the load-spreading-base-dn is "ou=Acme". The normalized string representation of that RDN component is "ou=acme", and the hexadecimal representation of its SHA-1 digest is "f0c69713535daf8816038f1bceab70380c92b83e". The last 31 bits of that SHA-1 digest are 0c92b83e hex, which is 210942014 in decimal. Because 210942014 modulo 3 is 2, the first two servers are moved from the beginning of the list to the end of the list, resulting in an order of ds3, ds1, ds2.
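The arithmetic in this example can be checked with a few lines of Python. The sketch starts from the digest value quoted above rather than recomputing it, so it only verifies steps 4 through 6. The variable names are illustrative.

```python
# Digest of "ou=acme" as quoted in the example above.
digest = bytes.fromhex("f0c69713535daf8816038f1bceab70380c92b83e")

# Step 4: take the last four bytes and mask off the top bit to keep 31 bits.
value = int.from_bytes(digest[-4:], "big") & 0x7FFFFFFF

# Step 5: modulus by the number of servers in the list.
servers = ["ds1", "ds2", "ds3"]
shift = value % len(servers)

# Step 6: move the first `shift` servers to the end of the list.
rotated = servers[shift:] + servers[:shift]

print(value, shift, rotated)  # 210942014 2 ['ds3', 'ds1', 'ds2']
```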

While this algorithm spreads the load across multiple backend servers, it does not guarantee an even distribution of the load across all of those servers. The load-balancing algorithm still prioritizes based on location and health check state, so the load is generally spread only across the available servers in the same location as the PingDirectoryProxy server. Assuming that the entries that are immediate children of a load-spreading-base-dn are the tops of the branches that define tenants, some tenants are targeted more heavily than others because they have more entries or because their entries are accessed more frequently. In addition, the modulo operation might not assign tenants evenly across those servers.
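One way to see how uneven the assignment can be is to count how many tenant RDNs map to each rotation offset. The following Python sketch does this for a made-up set of tenant names. The helper names (offset_for, first_choice_counts) are hypothetical, normalization is approximated by lowercasing, and the counts say nothing about request volume per tenant, which is typically the larger source of imbalance.

```python
import hashlib
from collections import Counter


def offset_for(rdn, server_count):
    """Rotation offset for a tenant RDN (lowercasing stands in for normalization)."""
    digest = hashlib.sha1(rdn.lower().encode("utf-8")).digest()
    return (int.from_bytes(digest[-4:], "big") & 0x7FFFFFFF) % server_count


def first_choice_counts(tenant_rdns, server_count):
    """Count tenants per offset; offset i puts the server at index i first."""
    counts = Counter({i: 0 for i in range(server_count)})
    for rdn in tenant_rdns:
        counts[offset_for(rdn, server_count)] += 1
    return counts


# Hypothetical tenant branches, modeled on the DIT example earlier in this section.
tenants = [f"ou=Customer {n}" for n in range(1, 11)]
counts = first_choice_counts(tenants, 3)
for offset in range(3):
    print(f"offset {offset}: {counts[offset]} tenants")
```

With a small number of tenants, the counts per offset can differ noticeably. Running a check like this against the actual tenant RDNs gives a rough picture of which server each tenant prefers.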