Directory Services 7.3.5

Performance tuning

Performance requirements

Your key performance requirement is to satisfy your users or customers with the resources available to you. Before you can solve potential performance problems, define what those users or customers expect. Determine which resources you will have to satisfy their expectations.

Service level objectives

A service level objective (SLO) is a target for a directory service level that you can measure quantitatively. If possible, base SLOs on what your key users expect from the service in terms of performance.

Define SLOs for at least the following areas:

  • Directory service response times

    Directory service response times range from less than a millisecond on average, across a low latency connection on the same network, to however long it takes your network to deliver the response.

    More important than average or best response times is the response time distribution, because applications set timeouts based on worst case scenarios.

    An example response time performance requirement is: "Directory response times must average less than 10 milliseconds for all operations except searches returning more than 10 entries, with 99.9% of response times under 40 milliseconds."

  • Directory service throughput

    Directories can serve many thousands of operations per second. In fact, there is no fixed upper limit on throughput for read operations such as searches, because only write operations must be replicated. To increase read throughput, add replicas.

    More important than average throughput is peak throughput. You might have peak write throughput in the middle of the night when batch jobs update entries in bulk, and peak binds for a special event or first thing Monday morning.

    An example throughput performance requirement is: "The directory service must sustain a mix of 5,000 operations per second made up of 70% reads, 25% modifies, 3% adds, and 2% deletes."

    Ideally, you mimic the behavior of key operations during performance testing, so that you understand the patterns of operations in the throughput you need to provide.

  • Directory service availability

    DS software is designed to let you build directory services that are basically available, including during maintenance and even upgrade of individual servers.

    To reach very high levels of availability, you must also ensure that your operations execute in a way that preserves availability.

    Availability requirements can be as lax as a best effort, or as stringent as 99.999% or more uptime.

    Replication is the DS feature that allows you to build a highly available directory service.

  • Directory service administrative support

    Be sure to understand how you support your users when they run into trouble.

    While directory services can help you turn password management into a self-service visit to a web site, some users still need to know what they can expect if they need your help.

Creating an SLO, even if your first version consists of guesses, helps you reduce performance tuning from an open-ended project to a clear set of measurable goals for a manageable project with a definite outcome.
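
As an illustration of turning a response time SLO into a measurable check, the following sketch tests a set of measurements against the example requirement above (average under 10 milliseconds, 99.9% of responses under 40 milliseconds). The check_slo function name, its arguments, and the sample values are assumptions for illustration, not part of DS. It expects one response time in milliseconds per line on standard input, however you collect them, and leaves it to you to filter out operations the SLO excludes:

```shell
# check_slo MAX_AVG_MS MAX_TAIL_MS MIN_FRACTION
# Reads one response time in milliseconds per line on standard input.
# Prints the mean, the fraction of samples under MAX_TAIL_MS, and PASS or FAIL.
check_slo() {
  awk -v max_avg="$1" -v max_tail="$2" -v min_frac="$3" '
    { sum += $1; n++; if ($1 < max_tail) under++ }
    END {
      if (n == 0) { print "NO DATA"; exit 1 }
      avg = sum / n
      frac = under / n
      verdict = (avg < max_avg && frac >= min_frac) ? "PASS" : "FAIL"
      printf "avg_ms=%.2f fraction_under_tail=%.4f %s\n", avg, frac, verdict
    }'
}

# Sample run with made-up measurements (hypothetical values).
printf '%s\n' 3 4 5 4 3 | check_slo 10 40 0.999
```

Checking the fraction of responses under the tail limit, rather than only the average, reflects the point that the response time distribution matters more than the average.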

Resource constraints

With your SLOs in hand, inventory the server, networks, storage, people, and other resources at your disposal. Now is the time to estimate whether it is possible to meet the requirements at all.

If, for example, you are expected to serve more throughput than the network can transfer, maintain high availability with only one physical machine, store 100 GB of backups on a 50 GB partition, or provide 24/7 support all alone, no amount of tuning will fix the problem.

When checking that the resources you have at least theoretically suffice to meet your requirements, do not forget that high availability in particular requires at least two of everything to avoid single points of failure. Be sure to list the resources you expect to have, when and how long you expect to have them, and why you need them. Make note of what is missing and why.

Server hardware

DS servers are pure Java applications, making them very portable. DS servers tend to perform best on single-board, x86 systems due to low memory latency.

Storage

High-performance storage is essential for handling high write throughput. When the database stays fully cached in memory, directory read operations do not result in disk I/O. Only writes result in disk I/O. You can further improve write performance by using solid-state disks for storage or file system cache.

DS directory servers are designed to work with local storage for database backends. Do not use network file systems, such as NFS, where there is no guarantee that a single process has access to files.

Storage area networks (SANs) and attached storage are fine for use with DS directory servers.

Regarding database size on disk, sustained write traffic can cause the database to grow to more than twice its initial size on disk. This is normal behavior. The size on disk does not impact the DB cache size requirements.

Linux file systems

Write barriers and journaling mode for Linux file systems help avoid directory database file corruption. They make sure writes to the file system are ordered even after a crash or power failure. Make sure these features are enabled.

Some Linux distributions permanently enable write barriers. There is no administrative action to take.

Other Linux systems leave the decision to you. If your Linux system lets you configure write barriers and journaling mode for the file system, refer to the options for your file system in the mount command manual page for details on enabling them.

Performance tests

Even if you do not need high availability, you still need two of everything, because your test environment needs to mimic your production environment as closely as possible.

In your test environment, set up DS servers just as you do in production. Conduct experiments to determine how to best meet your SLOs.

The following command-line tools help with basic performance testing:

  • The makeldif command generates sample data with great flexibility.

  • The addrate command measures add and delete throughput and response time.

  • The authrate command measures bind throughput and response time.

  • The modrate command measures modification throughput and response time.

  • The searchrate command measures search throughput and response time.

All *rate commands display response time distribution measurements, and support testing at specified levels of throughput.
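
When planning tests at a specified level of throughput, it can help to split an overall target into per-operation targets first. The following sketch does this for the example throughput requirement quoted earlier (5,000 operations per second made up of 70% reads, 25% modifies, 3% adds, and 2% deletes). The split_throughput function is a hypothetical helper, not a DS tool, and how you pass the resulting numbers to each *rate command is left to the tool documentation:

```shell
# split_throughput TOTAL_OPS READ_PCT MODIFY_PCT ADD_PCT DELETE_PCT
# Prints the per-operation targets in operations per second.
split_throughput() {
  total=$1
  if [ $(( $2 + $3 + $4 + $5 )) -ne 100 ]; then
    echo "percentages must add up to 100" >&2
    return 1
  fi
  echo "reads=$(( total * $2 / 100 ))"
  echo "modifies=$(( total * $3 / 100 ))"
  echo "adds=$(( total * $4 / 100 ))"
  echo "deletes=$(( total * $5 / 100 ))"
}

# The example mix: prints reads=3500, modifies=1250, adds=150, deletes=100.
split_throughput 5000 70 25 3 2
```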

For additional precision when evaluating response times, use the global configuration setting etime-resolution. To change elapsed processing time resolution from milliseconds (default) to nanoseconds:

$ dsconfig \
 set-global-configuration-prop \
 --hostname localhost \
 --port 4444 \
 --bindDN uid=admin \
 --bindPassword password \
 --set etime-resolution:nanoseconds \
 --usePkcs12TrustStore /path/to/opendj/config/keystore \
 --trustStorePassword:file /path/to/opendj/config/keystore.pin \
 --no-prompt

The etime, recorded in the server access log, indicates the elapsed time to process the request. The etime starts when the decoded operation is available to be processed by a worker thread.

Test performance with your production-ready configuration. If, however, you simply want to demonstrate top performance, take the following points into account:

  • Incorrect JVM tuning slows down server and tool performance.

    Make sure the JVM is tuned for best performance. For details, refer to Java settings.

  • Unfiltered access logs record messages for each client request. Turn off full access logging.

    For example, set enabled:false for the Json File-Based Access Logger log publisher, and any other unfiltered log publishers that are enabled.

  • Secure connections are recommended, and they can be costly.

    Set require-secure-authentication:false in the password policies governing the bind entries, and bind using insecure connections.

Performance settings

Use the following suggestions when your tests show that DS performance is lacking, even though you have the right underlying network, hardware, storage, and system resources in place.

Maximum open files

DS servers must open many file descriptors when handling thousands of client connections.

Linux systems often set a limit of 1024 per user. That setting is too low to accept thousands of client connections.

Make sure the server can use at least 64K (65536) file descriptors. For example, when running the server as user opendj on a Linux system that uses /etc/security/limits.conf to set user level limits, set soft and hard limits by adding these lines to the file:

opendj soft nofile 65536
opendj hard nofile 131072

The example above assumes the system has enough file descriptors available overall. Check the Linux system overall maximum as follows:

$ cat /proc/sys/fs/file-max
204252
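
To confirm that the limit actually applies to the account running the server, a quick check like the following sketch may help. The fd_limit_ok function is a hypothetical helper, and the 65536 default mirrors the minimum recommended above. Run it in a shell started as the server user, because ulimit reports the limits of the current shell:

```shell
# fd_limit_ok LIMIT [MINIMUM]
# Succeeds when LIMIT (a number, or "unlimited") is at least MINIMUM.
fd_limit_ok() {
  limit=$1
  minimum=${2:-65536}
  [ "$limit" = "unlimited" ] && return 0
  [ "$limit" -ge "$minimum" ]
}

# Check the soft limit on open files for the current shell.
if fd_limit_ok "$(ulimit -n)"; then
  echo "open files limit is sufficient"
else
  echo "open files limit is below 65536"
fi
```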

Linux page caching

Default Linux virtual memory settings cause significant buildup of dirty data pages before flushing them. When the kernel finally flushes the pages to disk, the operation can exhaust the disk I/O for up to several seconds. Application operations waiting on the file system to synchronize to disk are blocked.

The default virtual memory settings can therefore cause DS server operations to block for seconds at a time. Symptoms include high outlier etimes, even for very low average etimes. For sustained high loads, such as import operations, the server has to maintain thousands of open file descriptors.

To avoid these problems, tune Linux page caching. As a starting point for testing and tuning, set vm.dirty_background_bytes to one quarter of the disk I/O per second, and vm.dirty_expire_centisecs to 1000 (10 seconds) using the sysctl command. This causes the kernel to flush more often, and limits the pauses to a maximum of 250 milliseconds.

For example, if the disk I/O is 80 MB/second for writes, the following example shows an appropriate starting point. It updates the /etc/sysctl.conf file to change the setting permanently, and uses the sysctl -p command to reload the settings:

$ echo vm.dirty_background_bytes=20971520 | sudo tee -a /etc/sysctl.conf
[sudo] password for admin:

$ echo vm.dirty_expire_centisecs=1000 | sudo tee -a /etc/sysctl.conf

$ sudo sysctl -p
vm.dirty_background_bytes = 20971520
vm.dirty_expire_centisecs = 1000

Be sure to test and adjust the settings for your deployment.
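
The arithmetic behind the starting value is one quarter of the bytes the disk can write per second, which is why the pauses are limited to about 250 milliseconds. A minimal sketch, assuming you have measured write throughput in MB per second and that 1 MB is counted as 1024 x 1024 bytes (the dirty_background_bytes function name is hypothetical):

```shell
# dirty_background_bytes WRITE_MB_PER_SECOND
# Prints one quarter of the per-second write throughput, in bytes.
dirty_background_bytes() {
  echo $(( $1 * 1024 * 1024 / 4 ))
}

# For 80 MB/second, prints 20971520, the value used in the example above.
dirty_background_bytes 80
```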

For additional details, refer to the Oracle documentation on Linux Page Cache Tuning, and the Linux sysctl command virtual memory kernel reference.

Java settings

Default Java settings let you evaluate DS servers using limited system resources. For production systems, test and run with a tuned JVM.

To apply JVM settings, either:

  • Set the OPENDJ_JAVA_ARGS environment variable and restart the server:

    export OPENDJ_JAVA_ARGS="-Xmx<size> -XX:MaxTenuringThreshold=1 -Djava.security.egd=file:/dev/urandom"

  • Edit config/java.properties to update start-ds.java-args and restart the server:

    start-ds.java-args=-server -Xmx<size> -XX:MaxTenuringThreshold=1 -Djava.security.egd=file:/dev/urandom

    As the name indicates, the start-ds.java-args settings apply only to the start-ds command. To set JVM options for offline LDIF import, edit import-ldif.offline.java-args, for example.

  1. Use the most recent supported Java environment.

    Refer to the release notes section on Java for details.

  2. Set the maximum heap size with -Xmx<size>.

    Use at least a 2 GB heap unless your data set is small.

    For additional details, refer to Cache internal nodes.

    For Java 17, there is no need to set the minimum heap size. If you use Java 11, set the minimum heap size to the same value as the maximum heap size.

  3. Use the default garbage collection (GC) settings, equivalent to -XX:+UseG1GC.

  4. Set the maximum tenuring threshold to reduce unnecessary copying with -XX:MaxTenuringThreshold=1.

    This option sets the maximum number of GC cycles an object stays in survivor spaces before it is promoted into the old generation space. The recommended setting reduces the new generation GC frequency and duration. The JVM quickly promotes long-lived objects to the old generation space, rather than letting them accumulate in new generation survivor spaces, copying them for each GC cycle.

  5. (Optional) Review the following additional details for specific use cases:

    -XX:+DisableExplicitGC

    When using JMX, add this option to the list of start-ds.java-args arguments to avoid periodic full GC events.

    JMX is based on RMI, which uses references to objects. By default, the JMX client and server perform a full GC periodically to clean up stale references. As a result, with the default settings, JMX triggers a full GC every hour.

    Avoid using this argument with import-ldif.offline.java-args or when using the import-ldif command. The import process uses GC to manage memory and references to memory-mapped files.

    -Xlog:gc=level:file

    When diagnosing JVM tuning problems, log GC messages. You can turn the option off when everything is running smoothly.

    Always specify the output file for the GC log. Otherwise, the JVM logs the messages to the opendj/logs/server.out file, mixing them with other messages, such as stack traces from the supportextract command.

    For example, -Xlog:gc=info:file=/path/to/gc.log logs informational GC messages to the file, /path/to/gc.log.

    For details, use the java -Xlog:help command.

    -XX:TieredStopAtLevel=1

    When trying to reduce startup time for short-lived client tools, such as the ldapsearch command, use this setting as shown.

ForgeRock does not recommend using ZGC or huge pages at this time.

Data storage settings

By default, DS servers compress attribute descriptions and object class sets to reduce data size. This is called compact encoding.

By default, DS servers do not compress entries stored in their backend databases. If your entries hold values that compress well, such as text, you can gain space. Set the backend property entries-compressed:true, and reimport the data from LDIF. The DS server compresses entries before writing them to the database:

$ dsconfig \
 set-backend-prop \
 --hostname localhost \
 --port 4444 \
 --bindDN uid=admin \
 --bindPassword password \
 --backend-name dsEvaluation \
 --set entries-compressed:true \
 --usePkcs12TrustStore /path/to/opendj/config/keystore \
 --trustStorePassword:file /path/to/opendj/config/keystore.pin \
 --no-prompt

$ import-ldif \
 --hostname localhost \
 --port 4444 \
 --bindDN uid=admin \
 --bindPassword password \
 --ldifFile backup.ldif \
 --backendID dsEvaluation \
 --includeBranch dc=example,dc=com \
 --usePkcs12TrustStore /path/to/opendj/config/keystore \
 --trustStorePassword:file /path/to/opendj/config/keystore.pin

DS directory servers do not proactively rewrite all entries after you change the settings. To force the DS server to compress all entries, you must import the data from LDIF.

LDIF import settings

By default, the temporary directory used for scratch files is opendj/import-tmp. Use the import-ldif --tmpDirectory option to set this directory to a tmpfs file system, such as /tmp.

If you are certain your LDIF contains only valid entries with correct syntax, you can skip schema validation. Use the import-ldif --skipSchemaValidation option.

Database cache settings

By default, DS directory servers:

  • Use shared cache for all JE database backends.

    The recommended setting is to leave the global property, je-backend-shared-cache-enabled, set to true.

    If you have more than one JE database backend, before you change this setting to false, you must set either db-cache-percent or db-cache-size appropriately for each JE backend. By default, db-cache-percent is 50% for each backend. If you have multiple backends, including backends created with setup profiles, the default settings can prevent the server from starting if you first disable the shared cache.

  • Cache JE database internal and leaf nodes to achieve best performance.

    The recommended setting is to leave this advanced property, db-cache-mode, set to cache-ln.

    In very large directory deployments, monitor the server and minimize critical evictions. For details, refer to Cache internal nodes.

If you require fine-grained control over JE backend cache settings, you can configure the amount of memory requested for database cache per database backend:

  1. Configure db-cache-percent or db-cache-size for each JE backend.

    db-cache-percent

    Percentage of JVM memory to allocate to the database cache for the backend.

    If the directory server has multiple database backends, the total percent of JVM heap used must remain less than 100 (percent), and must leave space for other uses.

    Default: 50 (percent)

    db-cache-size

    JVM memory to allocate to the database cache.

    This is an alternative to db-cache-percent. If you set its value larger than 0, then it takes precedence over db-cache-percent.

    Default: 0 MB

  2. Set the global property je-backend-shared-cache-enabled:false.

  3. Restart the server for the changes to take effect.
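
Before disabling the shared cache, it can be worth adding up the db-cache-percent values you plan to use. The following sketch shows why leaving two backends at the default of 50 is a problem: the total reaches 100, which is not less than 100 and leaves no heap for other uses. The cache_percent_total function is a hypothetical helper, and the check only mirrors the rule stated above rather than the server's own validation:

```shell
# cache_percent_total PERCENT...
# Prints the sum of the db-cache-percent values given as arguments.
cache_percent_total() {
  total=0
  for percent in "$@"; do
    total=$(( total + percent ))
  done
  echo "$total"
}

# Two JE backends left at the default db-cache-percent of 50.
planned=$(cache_percent_total 50 50)
if [ "$planned" -lt 100 ]; then
  echo "planned DB cache: ${planned}% of JVM heap"
else
  echo "planned DB cache (${planned}%) leaves no heap for other uses; lower the values"
fi
```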

Cache internal nodes

A JE backend has a B-tree data structure. A B-tree consists of nodes that can have children. Nodes with children are internal nodes. Nodes without children are leaf nodes.

The directory stores data in key-value pairs. Internal nodes hold the keys and can hold small values. Leaf nodes hold the values. One internal node usually holds keys to values in many leaf nodes. A B-tree has many more leaf nodes than internal nodes.

To read a value by its key, the backend traverses all internal nodes on the branch from the B-tree root to the leaf node holding the value. The closer a node is to the B-tree root, the more likely the backend must access it to get to the value. In other words, the backend accesses internal nodes more often than leaf nodes.

When a backend accesses a node, it loads the node into the DB cache. Loading a node because it wasn’t in cache is a cache miss. When you first start DS, all requests result in cache misses until the server loads active nodes.

As the DB cache fills, the backend makes space to load nodes by evicting nodes from the cache. The backend evicts leaf nodes, then least recently used internal nodes. As a last resort, the backend evicts even recently used internal nodes with changes not yet synced to storage.

The next time the backend accesses an evicted node, it must load the node from storage. Storage may mean the file system cache, or it may mean a disk. Reading from memory can be orders of magnitude faster than reading from disk. For the best DB performance, cache the nodes the DB accesses most often, which are the internal nodes.

Once DS has run for some time, it has "warmed up" and the active nodes are in the cache. From then on, watch the cache misses for internal nodes. The number of evicted internal nodes should remain constant. When the cache size is right, and no sudden changes occur in access patterns, the number of cache misses for internal nodes should stop growing:

$ ldapsearch \
 --hostname localhost \
 --port 1636 \
 --useSsl \
 --usePkcs12TrustStore /path/to/opendj/config/keystore \
 --trustStorePassword:file /path/to/opendj/config/keystore.pin \
 --bindDN uid=admin \
 --bindPassword password \
 --baseDN cn=monitor \
 "(objectClass=ds-monitor-backend-db)" \
 ds-mon-db-cache-evict-internal-nodes-count \
 ds-mon-db-cache-misses-internal-nodes
dn: ds-cfg-backend-id=dsEvaluation,cn=backends,cn=monitor
ds-mon-db-cache-evict-internal-nodes-count: <number>
ds-mon-db-cache-misses-internal-nodes: <number>

If ds-mon-db-cache-evict-internal-nodes-count is greater than 0 and growing, or ds-mon-db-cache-misses-internal-nodes continues to grow even after DS has warmed up, DS is evicting internal nodes from the DB cache.

If you can rule out big changes in access patterns, such as large unindexed searches, DS does not have enough space for the DB cache. Increase the DB cache size and add more RAM to your system if necessary. If adding RAM isn’t an option, increase the maximum heap size (-Xmx) to optimize RAM allocation.
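
To tell whether a counter is still growing, compare two samples taken some time apart after warm-up. The following sketch extracts a monitoring attribute from LDIF in the format shown above and compares two values. The monitor_value and counter_growing functions are hypothetical helpers, and the sample values stand in for real ldapsearch output:

```shell
# monitor_value ATTRIBUTE
# Reads LDIF on standard input and prints the value of the first matching attribute.
monitor_value() {
  awk -F': ' -v attr="$1" '$1 == attr { print $2; exit }'
}

# counter_growing BEFORE AFTER
# Succeeds when the later sample is larger than the earlier one.
counter_growing() {
  [ "$2" -gt "$1" ]
}

# Made-up samples standing in for two ldapsearch results taken some time apart.
first='ds-mon-db-cache-misses-internal-nodes: 1200'
second='ds-mon-db-cache-misses-internal-nodes: 1350'
before=$(printf '%s\n' "$first" | monitor_value ds-mon-db-cache-misses-internal-nodes)
after=$(printf '%s\n' "$second" | monitor_value ds-mon-db-cache-misses-internal-nodes)

if counter_growing "$before" "$after"; then
  echo "internal node cache misses are still growing: $before -> $after"
fi
```

In practice, pipe the output of the ldapsearch command shown above into monitor_value instead of using fixed strings.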

Estimate minimum DB cache size

This section explains how to estimate the minimum DB cache size.

The examples below use a directory server with a 10 million entry dsEvaluation backend. The backend holds entries generated using the --set ds-evaluation/generatedUsers:10,000,000 setup option.

Before estimating DB cache size for your deployment:

  • Configure the servers with production replication and indexing settings.

  • Import realistic data.

    If you can’t find test data matching production data, generate realistic data.

    Immediately after import, a JE backend has the minimum number of internal nodes the data requires.

  • Simulate realistic traffic to your service.

    Even better, learn about real loads from analysis of production access logs, and build custom test clients to match the access patterns of your applications.

    The backend appends updates to its database log and cleans the database log in the background. Over time, as more updates occur, the number of internal nodes grows, tracking backend growth.

After simulating realistic traffic for some time, stop the server. Use the output from the backendstat command and the JE DbCacheSize tool together to estimate the required DB cache size:

# Stop the server before using backendstat:
$ stop-ds

$ backendstat list-raw-dbs --backendId dsEvaluation
Raw DB Name                                                                                      Total Keys  Keys Size  Values Size  Total Size
-----------------------------------------------------------------------------------------------------------------------------------------------
/compressed_schema/compressed_attributes                                                         54                 54          837         891
/compressed_schema/compressed_object_classes                                                     18                 18          938         956
/dc=com,dc=example/aci.presence                                                                  1                   1            3           4
/dc=com,dc=example/cn.caseIgnoreMatch                                                            10000165    139242470     47887210   187129680
/dc=com,dc=example/cn.caseIgnoreSubstringsMatch:6                                                858657        5106079    204936276   210042355
/dc=com,dc=example/counter.objectClass.big.objectIdentifierMatch                                 2                  34            2          36
/dc=com,dc=example/counter.userPassword.big.passwordStorageSchemeEqualityMatch                   0                   0            0           0
/dc=com,dc=example/dn2id                                                                         10000181    268892913     80001448   348894361
/dc=com,dc=example/ds-certificate-fingerprint.caseIgnoreMatch                                    0                   0            0           0
/dc=com,dc=example/ds-certificate-subject-dn.distinguishedNameMatch                              1                  18            3          21
/dc=com,dc=example/ds-sync-conflict.distinguishedNameMatch                                       0                   0            0           0
/dc=com,dc=example/ds-sync-hist.changeSequenceNumberOrderingMatch                                0                   0            0           0
/dc=com,dc=example/entryUUID.uuidMatch                                                           9988518      39954072     47871653    87825725
/dc=com,dc=example/givenName.caseIgnoreMatch                                                     8614            51690     20017387    20069077
/dc=com,dc=example/givenName.caseIgnoreSubstringsMatch:6                                         19651           97664     48312525    48410189
/dc=com,dc=example/id2childrencount                                                              8                  26           14          40
/dc=com,dc=example/id2entry                                                                      10000181     80001448   5379599451  5459600899
/dc=com,dc=example/json.caseIgnoreJsonQueryMatch                                                 4                  56            8          64
/dc=com,dc=example/jsonToken.extensibleJsonEqualityMatch:caseIgnoreStrings:ignoreWhiteSpace:/id  2                  34            4          38
/dc=com,dc=example/mail.caseIgnoreMatch                                                          10000152    238891751     47887168   286778919
/dc=com,dc=example/mail.caseIgnoreSubstringsMatch:6                                              1222798       7336758    112365097   119701855
/dc=com,dc=example/member.distinguishedNameMatch                                                 1                  40            2          42
/dc=com,dc=example/oauth2Token.caseIgnoreOAuth2TokenQueryMatch                                   4                  74           10          84
/dc=com,dc=example/objectClass.big.objectIdentifierMatch                                         6                 156            0         156
/dc=com,dc=example/objectClass.objectIdentifierMatch                                             24                396          395         791
/dc=com,dc=example/referral                                                                      0                   0            0           0
/dc=com,dc=example/sn.caseIgnoreMatch                                                            13457           92943     20027045    20119988
/dc=com,dc=example/sn.caseIgnoreSubstringsMatch:6                                                41585          219522     73713958    73933480
/dc=com,dc=example/state                                                                         26               1335           25        1360
/dc=com,dc=example/telephoneNumber.telephoneNumberMatch                                          9989952     109889472     47873522   157762994
/dc=com,dc=example/telephoneNumber.telephoneNumberSubstringsMatch:6                              1111110       6543210    221281590   227824800
/dc=com,dc=example/uid.caseIgnoreMatch                                                           10000152    118889928     47887168   166777096
/dc=com,dc=example/uniqueMember.uniqueMemberMatch                                                10                406           21         427
/dc=com,dc=example/userPassword.big.passwordStorageSchemeEqualityMatch                           0                   0            0           0

Total: 34

# Calculate the sum of total keys, the average key size, and the average value size.
# Sum of total keys: 73255334
# Average key size: sum of key sizes/sum of total keys = 1015212568 / 73255334 ~= 13.86
# Average value size: sum of values sizes/sum of total keys = 6399663760 / 73255334 ~= 87.36

# Use the results rounded to the nearest integer as arguments to the DbCacheSize tool:
$ java -cp /path/to/opendj/lib/opendj.jar com.sleepycat.je.util.DbCacheSize \
 -records 73255334 -key 14 -data 87

=== Environment Cache Overhead ===

3,158,773 minimum bytes

To account for JE daemon operation, record locks, HA network connections, etc,
a larger amount is needed in practice.

=== Database Cache Size ===

   Number of Bytes  Description
   ---------------  -----------
     2,953,929,424  Internal nodes only
    12,778,379,408  Internal nodes and leaf nodes

For further information see the DbCacheSize javadoc.

The resulting recommendation for caching Internal nodes only is 2,953,929,424 bytes (~3 GB) in this example. This setting for DB cache includes space for all internal nodes, including those with keys and data. Caching all DB data (Internal nodes and leaf nodes) would require 12,778,379,408 bytes (~13 GB).

Round up when configuring backend settings for db-cache-percent or db-cache-size. If the system in this example has 8 GB available memory, use the default setting of db-cache-percent: 50. (50% of 8 GB is 4 GB, which is larger than the minimum estimate.)
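
The calculation in this section can be scripted once you have the three sums from the backendstat output. The following sketch reproduces the rounding used for the DbCacheSize arguments and the final comparison. The rounded_average and cache_fits functions are hypothetical helpers, and the sketch assumes the 8 GB in the example is the JVM heap that db-cache-percent applies to:

```shell
# rounded_average SUM COUNT
# Prints SUM / COUNT rounded to the nearest integer.
rounded_average() {
  awk -v sum="$1" -v count="$2" 'BEGIN { printf "%.0f\n", sum / count }'
}

# cache_fits HEAP_BYTES CACHE_PERCENT REQUIRED_BYTES
# Succeeds when CACHE_PERCENT of the heap is at least REQUIRED_BYTES.
cache_fits() {
  [ $(( $1 / 100 * $2 )) -ge "$3" ]
}

# Sums taken from the backendstat output above.
records=73255334
key_size=$(rounded_average 1015212568 "$records")    # 14
data_size=$(rounded_average 6399663760 "$records")   # 87
echo "DbCacheSize arguments: -records $records -key $key_size -data $data_size"

# Compare the "Internal nodes only" estimate with 50% of an 8 GB heap.
heap_bytes=$(( 8 * 1024 * 1024 * 1024 ))
if cache_fits "$heap_bytes" 50 2953929424; then
  echo "db-cache-percent: 50 covers the internal nodes estimate"
else
  echo "increase db-cache-percent, db-cache-size, or the heap"
fi
```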

Database log file settings

With default settings, if the database has more than 200 files on disk, then the JE backend must start closing one log file in order to open another. This has a serious impact on performance when the file cache starts to thrash.

Having the JE backend open and close log files from time to time is okay. Changing the settings is only necessary if the JE backend has to open and close the files very frequently.

A JE backend stores data on disk in append-only log files. The maximum size of each log file is configurable. A JE backend keeps a configurable maximum number of log files open, caching file handles to the log files. The relevant JE backend settings are the following:

db-log-file-max

Maximum size of a database log file.

Default: 1 GB

db-log-filecache-size

File handle cache size for database log files.

Default: 200

With these defaults, if the size of the database reaches 200 GB on disk (1 GB x 200 files), the JE backend must close one log file to open another. To avoid this situation, increase db-log-filecache-size until the JE backend can cache file handles to all its log files. When changing the settings, make sure the maximum number of open files is sufficient.
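
As a rough way to size db-log-filecache-size, divide the expected database size on disk by db-log-file-max and round up. The following sketch does this for a hypothetical 350 GB database with the default 1 GB log files. The log_files_needed function is an illustrative helper, and the result is an estimate only, since it assumes every log file is full:

```shell
# log_files_needed DB_SIZE_GB LOG_FILE_MAX_GB
# Prints the estimated number of log files, rounding up.
log_files_needed() {
  echo $(( ($1 + $2 - 1) / $2 ))
}

# Hypothetical 350 GB database with the default 1 GB db-log-file-max.
files=$(log_files_needed 350 1)
if [ "$files" -gt 200 ]; then
  echo "consider raising db-log-filecache-size to at least $files"
else
  echo "the default db-log-filecache-size of 200 is enough"
fi
```

Whatever value you choose counts against the open file limit for the server, alongside client connections.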

Log settings

Debug logs trace the internal workings of DS servers, and should be used sparingly. Be particularly careful when activating debug logging in high-performance deployments.

In general, leave other logs active for production environments to help troubleshoot any issues that arise.

For servers handling 100,000 operations per second or more, the access log can be a performance bottleneck. Each client request results in at least one access log message. Test whether disabling the access log improves performance in such cases.

The following command disables the JSON-based LDAP access logger:

$ dsconfig \
 set-log-publisher-prop \
 --hostname localhost \
 --port 4444 \
 --bindDN uid=admin \
 --bindPassword password \
 --publisher-name "Json File-Based Access Logger" \
 --set enabled:false \
 --usePkcs12TrustStore /path/to/opendj/config/keystore \
 --trustStorePassword:file /path/to/opendj/config/keystore.pin \
 --no-prompt

The following command disables the HTTP access logger:

$ dsconfig \
 set-log-publisher-prop \
 --hostname localhost \
 --port 4444 \
 --bindDN uid=admin \
 --bindPassword password \
 --publisher-name "File-Based HTTP Access Logger" \
 --set enabled:false \
 --usePkcs12TrustStore /path/to/opendj/config/keystore \
 --trustStorePassword:file /path/to/opendj/config/keystore.pin \
 --no-prompt

Changelog settings

By default, a replication server indexes change numbers for replicated user data. This allows legacy applications to get update notifications by change number, as described in Align draft change numbers. Indexing change numbers requires additional CPU, disk accesses and storage, so it should not be used unless change number-based browsing is required.

Disable change number indexing if it is not needed. For details, refer to Disable change number indexing.