PingDS 7.5.1

Troubleshooting

Define the problem

To save time, define your problem clearly before you try to solve it. A problem statement contrasts observed behavior with expected behavior:

  • What exactly is the problem?

    What is the behavior you expected?

    What is the behavior you observed?

  • How do you reproduce the problem?

  • When did the problem begin?

    Under similar circumstances, when does the problem not occur?

  • Is the problem permanent?

    Intermittent?

    Is it getting worse? Getting better? Staying the same?

Performance

Before troubleshooting performance, make sure:

When directory operations take too long, meaning request latency is high, fix the problem first in your test or staging environment. Perform these steps in order and stop when you find a fix:

  1. Check for unindexed searches and prevent them when possible.

    Unindexed searches are expensive operations, particularly for large directories. When unindexed searches consume the server’s resources, performance suffers for concurrent operations and for later operations if an unindexed search causes widespread changes to database and file system caches.

  2. Check performance settings for the server including JVM heap size and DB cache size.

    Try adding more RAM if memory seems low.

  3. Read the request queue monitoring statistics over LDAP or over HTTP.

    If many requests are in the queue, the troubleshooting steps differ for read and write operations. Review the request statistics, available over LDAP or over HTTP, to determine which kind of request is pending.

    If you persistently have many:

    • Pending read requests, such as unindexed searches or big searches, try adding CPUs.

    • Pending write requests, try adding IOPS, such as faster or higher throughput disks.

Installation problems

Use the logs

Installation and upgrade procedures result in a log file tracing the operation. The command output shows a message like the following:

See opendj-setup-profile-*.log for a detailed log of the failed operation.

Antivirus interference

Prevent antivirus and intrusion detection systems from interfering with DS software.

Before using DS software with antivirus or intrusion detection software, consider the following potential problems:

Interference with normal file access

Antivirus and intrusion detection systems that perform virus scanning, sweep scanning, or deep file inspection are not compatible with DS file access, particularly write access.

Antivirus and intrusion detection software has incorrectly flagged DS files as potentially infected, because it misinterprets normal DS processing.

Prevent antivirus and intrusion detection systems from scanning DS files, except these folders:

Folder                      Contents
C:\path\to\opendj\bat\      Windows command-line tools
/path/to/opendj/bin/        Linux command-line tools
/path/to/opendj/extlib/     Optional .jar files used by custom plugins
/path/to/opendj/lib/        Scripts and libraries shipped with DS servers

Port blocking

Antivirus and intrusion detection software can block ports that DS uses to provide directory services.

Make sure that your software does not block the ports that DS software uses. For details, refer to Administrative access.

Negative performance impact

Antivirus software consumes system resources, reducing resources available to other services including DS servers.

Running antivirus software can therefore have a significant negative impact on DS server performance. Make sure that you test and account for the performance impact of running antivirus software before deploying DS software on the same systems.

JE initialization

When starting a directory server on a Linux system, make sure the server user can watch enough files. If the server user cannot watch enough files, you might read an error message in the server log like this:

InitializationException: The database environment could not be opened:
com.sleepycat.je.EnvironmentFailureException: (JE version) /path/to/opendj/db/userData
or its sub-directories to WatchService.
UNEXPECTED_EXCEPTION: Unexpected internal Exception, may have side effects.
Environment is invalid and must be closed.

File notification

A directory server backend database monitors file events. On Linux systems, backend databases use the inotify API for this purpose. The kernel tunable fs.inotify.max_user_watches indicates the maximum number of files a user can watch with the inotify API.

Make sure this tunable is set to at least 512K:

$ sysctl fs.inotify.max_user_watches

fs.inotify.max_user_watches = 524288

If this tunable is set lower than that, update the /etc/sysctl.conf file to change the setting permanently, and use the sysctl -p command to reload the settings:

$ echo fs.inotify.max_user_watches=524288 | sudo tee -a /etc/sysctl.conf
[sudo] password for admin:

$ sudo sysctl -p
fs.inotify.max_user_watches = 524288
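
The same check can be scripted, for example to run before starting the server. The following is a minimal sketch, not part of DS: the helper name watches_ok is hypothetical, and the script assumes the current value is readable from /proc/sys/fs/inotify/max_user_watches, where Linux exposes this tunable:

```shell
# Minimum value recommended above (512K watches).
MIN_WATCHES=524288

# Hypothetical helper: succeeds when the given value meets the minimum.
watches_ok() {
  [ "$1" -ge "$MIN_WATCHES" ]
}

# Read the current value; fall back to 0 when the file is not available.
current=$(cat /proc/sys/fs/inotify/max_user_watches 2>/dev/null || echo 0)

if watches_ok "$current"; then
  echo "OK: fs.inotify.max_user_watches = $current"
else
  echo "Too low: fs.inotify.max_user_watches = $current (need at least $MIN_WATCHES)"
fi
```

Run the script as the user who starts the DS server.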

NoSuchAlgorithmException

When running the dskeymgr create-deployment-id or setup command on an operating system with no support for the PBKDF2WithHmacSHA256 SecretKeyFactory algorithm, the command displays this error:

NoSuchAlgorithmException: PBKDF2WithHmacSHA256 SecretKeyFactory not available

This can occur on operating systems where the default settings limit the available algorithms.

To fix the issue, enable support for the algorithm and run the command again.

Forgotten superuser password

By default, DS servers store the entry for the directory superuser in an LDIF backend. Edit the file to reset the password:

  1. Generate the encoded version of the new password:

    $ encode-password --storageScheme PBKDF2-HMAC-SHA256 --clearPassword password
    
    {PBKDF2-HMAC-SHA256}10<hash>

  2. Stop the server while you edit the LDIF file for the backend:

    $ stop-ds

  3. Replace the existing password with the encoded version.

    In the db/rootUser/rootUser.ldif file, carefully replace the userPassword value with the new, encoded password:

    dn: uid=admin
    ...
    uid: admin
    userPassword: {PBKDF2-HMAC-SHA256}10<hash>

    Trailing whitespace is significant in LDIF. Take care not to add any trailing whitespace at the end of the line.

  4. Restart the server:

    $ start-ds

  5. Verify that you can use the directory superuser account with the new password:

    $ status \
     --bindDN uid=admin \
     --bindPassword password \
     --hostname localhost \
     --port 4444 \
     --usePkcs12TrustStore /path/to/opendj/config/keystore \
     --trustStorePassword:file /path/to/opendj/config/keystore.pin \
     --script-friendly
    ...
    "isRunning" : true,

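The edit in step 3 can also be scripted instead of done by hand. The following is only a sketch under stated assumptions: the helper names replace_user_password and no_trailing_whitespace are hypothetical, the file is assumed to hold the single uid=admin entry shown above, and the server must still be stopped first. Keep a backup copy of the file before changing it:

```shell
# Hypothetical helper: replace the userPassword value in an LDIF file.
# Usage: replace_user_password /path/to/rootUser.ldif ENCODED_PASSWORD
replace_user_password() {
  ldif_file=$1
  # Pass the value through the environment so awk does not interpret it.
  NEW_PW=$2 awk '
    /^userPassword:/ {
      print "userPassword: " ENVIRON["NEW_PW"]   # no trailing whitespace
      skipping = 1
      next
    }
    skipping && /^ / { next }                    # drop folded lines of the old value
    { skipping = 0; print }
  ' "$ldif_file" > "$ldif_file.new" && mv "$ldif_file.new" "$ldif_file"
}

# Hypothetical helper: succeeds when no line ends with whitespace.
no_trailing_whitespace() {
  ! grep -q '[[:space:]]$' "$1"
}
```

Call replace_user_password with the path to db/rootUser/rootUser.ldif and the output of encode-password, then run no_trailing_whitespace on the file before restarting the server.
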
Debug-level logging

DS error log message severity levels are:

  • ERROR (highest severity)

  • WARNING

  • NOTICE

  • INFO

  • DEBUG (lowest severity)

By default, DS error log severity levels are set as follows:

  • Log ERROR, WARNING, NOTICE, and INFO replication (SYNC) messages.

  • Log ERROR, WARNING, and NOTICE messages for other message categories.

You can change these settings when necessary to log debug-level messages.

DS debug-level logging can generate a high volume of messages. Use debug-level logging very sparingly on production systems.

  1. Choose the category you want to debug:

    Category                  Description
    BACKEND                   Server backends
    BACKUP                    Backup procedures
    CONFIG                    Configuration management
    CORE                      Core server operations
    DEFAULT                   Messages with no specific category
    EXTENSIONS                Reserved for custom extensions
    EXTERNAL                  External libraries
    JVM                       Java virtual machine information
    LOGGING                   Server log publishers
    PLUGIN                    Server plugins
    PROTOCOL                  Server protocols
    PROTOCOL.ASN1             ASN.1 encoding
    PROTOCOL.HTTP             HTTP
    PROTOCOL.JMX              JMX
    PROTOCOL.LDAP             LDAP
    PROTOCOL.LDAP_CLIENT      LDAP SDK client features
    PROTOCOL.LDAP_SERVER      LDAP SDK server features
    PROTOCOL.LDIF             LDIF
    PROTOCOL.SASL             SASL
    PROTOCOL.SMTP             SMTP
    PROTOCOL.SSL              SSL and TLS
    SCHEMA                    LDAP schema
    SECURITY                  Security features
    SECURITY.AUTHENTICATION   Authentication
    SECURITY.AUTHORIZATION    Access control and privileges
    SERVICE_DISCOVERY         Service discovery
    SYNC                      Replication
    SYNC.CHANGELOG            Replication changelog
    SYNC.CHANGENUMBER         Replication change number and change number index
    SYNC.CONNECTIONS          Replication connections
    SYNC.HEARTBEAT            Replication heartbeat checks
    SYNC.LIFECYCLE            Replication lifecycle
    SYNC.PROTOCOL_MSGS        Replication protocol messages excluding updates and heartbeat checks
    SYNC.PURGE                Replication changelog and historical data purge events
    SYNC.REPLAY               Replication replays and conflicts
    SYNC.STATE                Replication state changes including generation ID
    SYNC.TOPOLOGY             Replication topology
    SYNC.UPDATE_MSGS          Replication update messages
    TASK                      Server tasks
    TOOLS                     Command-line tools

  2. Override the error log level specifically for the category or categories of interest.

    The following example enables debug-level logging for the replication lifecycle. As debug-level logging is of lower severity than the defaults, all the default log levels remain in effect:

    $ dsconfig \
     set-log-publisher-prop \
     --add override-severity:SYNC.LIFECYCLE=DEBUG \
     --hostname localhost \
     --port 4444 \
     --bindDN uid=admin \
     --bindPassword password \
     --publisher-name "File-Based Error Logger" \
     --usePkcs12TrustStore /path/to/opendj/config/keystore \
     --trustStorePassword:file /path/to/opendj/config/keystore.pin \
     --no-prompt

    The server immediately begins to write additional messages to the error log.

  3. Read the messages:

    $ tail -f /path/to/opendj/logs/errors

  4. Restore the default settings as soon as debug-level logging is no longer required:

    $ dsconfig \
     set-log-publisher-prop \
     --remove override-severity:SYNC.LIFECYCLE=DEBUG \
     --hostname localhost \
     --port 4444 \
     --bindDN uid=admin \
     --bindPassword password \
     --publisher-name "File-Based Error Logger" \
     --usePkcs12TrustStore /path/to/opendj/config/keystore \
     --trustStorePassword:file /path/to/opendj/config/keystore.pin \
     --no-prompt
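
While debug-level logging is enabled, the error log can be noisy, so it can help to filter it. The following is a minimal sketch, assuming each message carries space-separated category=... and severity=... fields as in the example messages on this page. The helper name filter_errors is hypothetical, and you should pass whichever category value actually appears in your log:

```shell
# Hypothetical helper: print error log lines for one category and severity.
# Usage: filter_errors CATEGORY SEVERITY [LOGFILE]
filter_errors() {
  category=$1
  severity=$2
  log=${3:-/path/to/opendj/logs/errors}
  # Match fixed strings; the trailing space avoids matching longer names.
  grep -F "category=$category " "$log" | grep -F "severity=$severity "
}
```

For example, filter_errors SYNC ERROR prints only replication messages logged at ERROR severity.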

Lockdown mode

Misconfiguration can put the DS server in a state where you must prevent users and applications from accessing the directory until you have fixed the problem.

DS servers support lockdown mode. Lockdown mode permits connections only on the loopback address, and permits only operations requested by superusers, such as uid=admin.

To put the DS server into lockdown mode, the server must be running. You cause the server to enter lockdown mode by starting a task. Notice that the modify operation is performed over the loopback address (accessing the DS server on the local host):

$ ldapmodify \
 --hostname localhost \
 --port 1636 \
 --useSsl \
 --usePkcs12TrustStore /path/to/opendj/config/keystore \
 --trustStorePassword:file /path/to/opendj/config/keystore.pin \
 --bindDN uid=admin \
 --bindPassword password << EOF
dn: ds-task-id=Enter Lockdown Mode,cn=Scheduled Tasks,cn=tasks
objectClass: top
objectClass: ds-task
ds-task-id: Enter Lockdown Mode
ds-task-class-name: org.opends.server.tasks.EnterLockdownModeTask
EOF

The DS server logs a notice message in logs/errors when lockdown mode takes effect:

...msg=Lockdown task Enter Lockdown Mode finished execution in the state Completed successfully

Client applications that request operations get a message concerning lockdown mode:

$ ldapsearch \
 --hostname localhost \
 --port 1636 \
 --useSsl \
 --usePkcs12TrustStore /path/to/opendj/config/keystore \
 --trustStorePassword:file /path/to/opendj/config/keystore.pin \
 --bindDN uid=kvaughan,ou=People,dc=example,dc=com \
 --bindPassword bribery \
 --baseDN "" \
 --searchScope base \
 "(objectclass=*)" \
 +

The LDAP bind request failed: 49 (Invalid Credentials)

Leave lockdown mode by starting a task:

$ ldapmodify \
 --hostname localhost \
 --port 1636 \
 --useSsl \
 --usePkcs12TrustStore /path/to/opendj/config/keystore \
 --trustStorePassword:file /path/to/opendj/config/keystore.pin \
 --bindDN uid=admin \
 --bindPassword password << EOF
dn: ds-task-id=Leave Lockdown Mode,cn=Scheduled Tasks,cn=tasks
objectClass: top
objectClass: ds-task
ds-task-id: Leave Lockdown Mode
ds-task-class-name: org.opends.server.tasks.LeaveLockdownModeTask
EOF

The DS server logs a notice message when leaving lockdown mode:

...msg=Leave Lockdown task Leave Lockdown Mode finished execution in the state Completed successfully
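
To check from a script which of the two task messages appeared last, you can scan the error log. The following is a minimal sketch, assuming the message texts match the examples above. The helper name lockdown_state is hypothetical, and the result reflects only these task messages, not any other event that affects the server state:

```shell
# Hypothetical helper: report the most recent lockdown task message in a log.
# Usage: lockdown_state /path/to/opendj/logs/errors
lockdown_state() {
  awk '
    /Leave Lockdown task .* finished execution in the state Completed successfully/ {
      state = "left"; next
    }
    /Lockdown task .* finished execution in the state Completed successfully/ {
      state = "entered"
    }
    END { print (state == "" ? "unknown" : state) }
  ' "$1"
}
```

The helper tests for the leave message first because its text also contains the words "Lockdown task".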

LDIF import

  • By default, DS directory servers check that entries you import match the LDAP schema.

    You can temporarily bypass this check with the import-ldif --skipSchemaValidation option.

  • By default, DS servers ensure that entries have only one structural object class.

    You can relax this behavior with the advanced global configuration property, single-structural-objectclass-behavior.

    This can be useful when importing data exported from Sun Directory Server.

    For example, warn when entries have more than one structural object class, rather than rejecting them:

    $ dsconfig \
     set-global-configuration-prop \
     --hostname localhost \
     --port 4444 \
     --bindDN uid=admin \
     --bindPassword password \
     --set single-structural-objectclass-behavior:warn \
     --usePkcs12TrustStore /path/to/opendj/config/keystore \
     --trustStorePassword:file /path/to/opendj/config/keystore.pin \
     --no-prompt

  • By default, DS servers check syntax for several attribute types. Relax this behavior using the advanced global configuration property, invalid-attribute-syntax-behavior.

  • Use the import-ldif -R rejectFile --countRejects options to log rejected entries and to return the number of rejected entries as the command’s exit code.

Once you resolve the issues, reinstate the default behavior to avoid importing bad data.
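
Because --countRejects reports the number of rejected entries through the exit code, a script can act on the result. The following is a minimal sketch: the wrapper name report_rejects is hypothetical, and it runs whatever command you pass to it, so supply your own import-ldif options alongside -R and --countRejects. A nonzero exit code might also signal a failure unrelated to rejected entries, so check the command output as well:

```shell
# Hypothetical wrapper: run an import command and report its exit code.
# Usage: report_rejects import-ldif <your options> -R rejectFile --countRejects
report_rejects() {
  "$@"
  rejects=$?
  if [ "$rejects" -eq 0 ]; then
    echo "No entries rejected."
  else
    echo "Exit code $rejects: check the reject file and the command output."
  fi
  return "$rejects"
}
```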

Security problems

Incompatible Java versions

Due to a change in Java APIs, the same DS deployment ID generates different CA key pairs with Java 11 and Java 17 and later. When running the dskeymgr and setup commands, use the same Java environment everywhere in the deployment.

When you run the commands with a Java version that doesn’t match the deployment ID, DS displays a message such as the following:

The specified deployment ID with version '0' will cause interoperability problems with servers
running Java versions less than 17 if the deployment uses deployment ID-based PKI.
Follow the steps in the troubleshooting section of the documentation to resolve compatibility issues
with deployment IDs generated using a Java version prior to 17.

Using different Java versions is a problem if you use deployment ID-based CA certificates. Replication breaks, for example, when you use the setup command for a new server with a more recent version of Java than was used to set up existing servers. The error log includes a message such as the following:

...category=SYNC severity=ERROR msgID=119 msg=Directory server DS(server_id)
encountered an unexpected error while connecting to replication server host:port for domain "base_dn":
ValidatorException: PKIX path validation failed: java.security.cert.CertPathValidatorException:
signature check failed

To work around the issue, follow these steps:

  1. Update all DS servers to use the same Java version.

    Make sure you have a required Java environment installed on the system.

    If your default Java environment is not appropriate, use one of the following solutions:

    • Edit the default.java-home setting in the opendj/config/java.properties file.

    • Set OPENDJ_JAVA_HOME to the path to the correct Java environment.

    • Set OPENDJ_JAVA_BIN to the absolute path of the java command.

  2. Export CA certificates generated with the different Java versions.

    1. Export the CA certificate from an old server:

      $ keytool \
       -exportcert \
       -alias ca-cert \
       -keystore /path/to/old-server/config/keystore \
       -storepass:file /path/to/old-server/config/keystore.pin \
       -file java11-ca-cert.pem

    2. Export the CA certificate from a new server:

      $ keytool \
       -exportcert \
       -alias ca-cert \
       -keystore /path/to/new-server/config/keystore \
       -storepass:file /path/to/new-server/config/keystore.pin \
       -file java17-ca-cert.pem

  3. On all existing DS servers, import the new CA certificate:

    $ keytool \
     -importcert \
     -trustcacerts \
     -alias alt-ca-cert \
     -keystore /path/to/old-server/config/keystore \
     -storepass:file /path/to/old-server/config/keystore.pin \
     -file java17-ca-cert.pem \
     -noprompt

  4. On all new DS servers, import the old CA certificate:

    $ keytool \
     -importcert \
     -trustcacerts \
     -alias alt-ca-cert \
     -keystore /path/to/new-server/config/keystore \
     -storepass:file /path/to/new-server/config/keystore.pin \
     -file java11-ca-cert.pem \
     -noprompt

The servers reload their keystores dynamically and replication works as expected.
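
When working through step 1, it can help to confirm which Java major version each server uses. The following is a minimal sketch, assuming the usual java -version banner where the version string appears in double quotes. The helper name java_major is hypothetical:

```shell
# Hypothetical helper: extract the Java major version from a version banner.
java_major() {
  # Take the quoted version string, for example 17.0.8.
  version=$(printf '%s\n' "$1" | awk -F'"' '/version/ { print $2; exit }')
  major=${version%%.*}
  # Legacy banners such as 1.8.0_292 carry the major version in second place.
  if [ "$major" = "1" ]; then
    rest=${version#1.}
    major=${rest%%.*}
  fi
  printf '%s\n' "$major"
}

# Usage on each server; java -version writes its banner to standard error:
#   java_major "$(java -version 2>&1)"
```

Run the check with the same java command the DS server uses, which depends on the settings listed in step 1.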

Certificate-based authentication

Replication uses TLS to protect directory data on the network. Misconfiguration can cause replicas to fail to connect due to handshake errors. This leads to repeated error log messages such as the following:

...msg=Replication server accepted a connection from address
 to local address address but the SSL handshake failed.
 This is probably benign, but may indicate a transient network outage
 or a misconfigured client application connecting to this replication server.
 The error was: Received fatal alert: certificate_unknown

You can collect debug trace messages to help determine the problem. To display the TLS debug messages, start the server with javax.net.debug set:

$ OPENDJ_JAVA_ARGS="-Djavax.net.debug=all" start-ds

These debug settings produce a very large number of messages. To find the cause of the problem, review the server startup output, looking in particular for handshake errors.

If the chain of trust for your PKI is broken somehow, consider renewing or replacing keys, as described in Key management. Make sure that trusted CA certificates are configured as expected.

FIPS and key wrapping

DS servers use shared asymmetric keys to protect shared symmetric secret keys for data encryption.

By default, DS uses direct encryption to protect the secret keys.

When using a FIPS-compliant security provider that doesn’t allow direct encryption, such as Bouncy Castle, change the Crypto Manager configuration to set the advanced property, key-wrapping-mode: WRAP. With this setting, DS uses wrap mode to protect the secret keys in a compliant way.

Compromised keys

How you handle the problem depends on which key was compromised:

  • For keys generated by the server, or with a deployment ID and password, refer to Retire secret keys.

  • For a private key whose certificate was signed by a CA, contact the CA for help. The CA might choose to publish a certificate revocation list (CRL) that identifies the certificate of the compromised key.

    Replace the key pair that has the compromised private key.

  • For a private key whose certificate was self-signed, replace the key pair that has the compromised private key.

    Make sure the clients remove the compromised certificate from their truststores. They must replace the certificate of the compromised key with the new certificate.

Client problems

Use the logs

By default, DS servers record messages for LDAP client operations in the logs/ldap-access.audit.json log file.

In the access log, each message is a JSON object. This example formats each message to make it easier to read:

{
  "eventName": "DJ-LDAP",
  "client": {
    "ip": "<clientIp>",
    "port": 12345
  },
  "server": {
    "ip": "<serverIp>",
    "port": 1636
  },
  "request": {
    "protocol": "LDAPS",
    "operation": "BIND",
    "connId": 3,
    "msgId": 1,
    "version": "3",
    "dn": "uid=kvaughan,ou=people,dc=example,dc=com",
    "authType": "SIMPLE"
  },
  "transactionId": "<uuid>",
  "response": {
    "status": "SUCCESSFUL",
    "statusCode": "0",
    "elapsedTime": 1,
    "elapsedQueueingTime": 0,
    "elapsedProcessingTime": 1,
    "elapsedTimeUnits": "MILLISECONDS",
    "additionalItems": {
      "ssf": 128
    }
  },
  "userId": "uid=kvaughan,ou=People,dc=example,dc=com",
  "timestamp": "<timestamp>",
  "_id": "<uuid>"
}
{
  "eventName": "DJ-LDAP",
  "client": {
    "ip": "<clientIp>",
    "port": 12345
  },
  "server": {
    "ip": "<serverIp>",
    "port": 1636
  },
  "request": {
    "protocol": "LDAPS",
    "operation": "SEARCH",
    "connId": 3,
    "msgId": 2,
    "dn": "dc=example,dc=com",
    "scope": "sub",
    "filter": "(uid=bjensen)",
    "attrs": ["cn"]
  },
  "transactionId": "<uuid>",
  "response": {
    "status": "SUCCESSFUL",
    "statusCode": "0",
    "elapsedTime": 3,
    "elapsedQueueingTime": 0,
    "elapsedProcessingTime": 3,
    "elapsedTimeUnits": "MILLISECONDS",
    "nentries": 1,
    "entrySize": 591
  },
  "userId": "uid=kvaughan,ou=People,dc=example,dc=com",
  "timestamp": "<timestamp>",
  "_id": "<uuid>"
}
{
  "eventName": "DJ-LDAP",
  "client": {
    "ip": "<clientIp>",
    "port": 12345
  },
  "server": {
    "ip": "<serverIp>",
    "port": 1636
  },
  "request": {
    "protocol": "LDAPS",
    "operation": "UNBIND",
    "connId": 3,
    "msgId": 3
  },
  "transactionId": "<uuid>",
  "timestamp": "<timestamp>",
  "_id": "<uuid>"
}
{
  "eventName": "DJ-LDAP",
  "client": {
    "ip": "<clientIp>",
    "port": 12345
  },
  "server": {
    "ip": "<serverIp>",
    "port": 1636
  },
  "request": {
    "protocol": "LDAPS",
    "operation": "DISCONNECT",
    "connId": 3
  },
  "transactionId": "0",
  "response": {
    "status": "SUCCESSFUL",
    "statusCode": "0",
    "elapsedTime": 0,
    "elapsedTimeUnits": "MILLISECONDS",
    "reason": "Client Unbind"
  },
  "timestamp": "<timestamp>",
  "_id": "<uuid>"
}

For details about the message format, refer to Access log format.
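
When latency is the concern, you can pick out slow operations from the access log. The following is a minimal sketch, assuming the log file holds one unformatted JSON object per line and that elapsedTime is in milliseconds, as in the example messages above. The helper name slow_ops is hypothetical:

```shell
# Hypothetical helper: print access log messages at or above a latency threshold.
# Usage: slow_ops MIN_ELAPSED_TIME LOGFILE
slow_ops() {
  awk -v min="$1" '
    # "elapsedTime": with its closing quote does not match "elapsedTimeUnits".
    match($0, /"elapsedTime":[ ]*[0-9]+/) {
      value = substr($0, RSTART, RLENGTH)
      sub(/^[^0-9]*/, "", value)        # keep only the digits
      if (value + 0 >= min + 0) print
    }
  ' "$2"
}
```

For example, slow_ops 100 /path/to/opendj/logs/ldap-access.audit.json prints the messages for operations that took at least 100 ms. Messages without a response, such as UNBIND, are skipped.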

By default, the server does not log internal LDAP operations corresponding to HTTP requests. To match HTTP client operations to internal LDAP operations:

  1. Prevent the server from suppressing log messages for internal operations.

    Set suppress-internal-operations:false on the LDAP access log publisher.

  2. Match the request/connId field in the HTTP access log with the same field in the LDAP access log.
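
Step 2 can be sketched as a search for one connection ID across both access logs. This assumes each log holds one unformatted JSON object per line. The helper name ops_for_conn is hypothetical, and you supply the paths to your own LDAP and HTTP access log files:

```shell
# Hypothetical helper: print log messages for one connection ID.
# Usage: ops_for_conn CONN_ID LOGFILE...
ops_for_conn() {
  conn_id=$1
  shift
  # Require a non-digit after the ID so that 3 does not match 30.
  grep -E "\"connId\":[ ]*${conn_id}([^0-9]|\$)" "$@"
}
```

When you pass more than one file, grep prefixes each matching line with the file name, which shows where each message came from.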

Client access

To help diagnose client errors due to access permissions, refer to Effective rights.

Simple paged results

On some versions of Linux, you might read a message such as the following in the DS access logs:

The request control with Object Identifier (OID) "1.2.840.113556.1.4.319"
cannot be used due to insufficient access rights

This message means clients are trying to use the simple paged results control without authenticating. By default, a global ACI allows only authenticated users to use the control.

To grant anonymous (unauthenticated) user access to the control, add a global ACI for anonymous use of the simple paged results control:

$ dsconfig \
 set-access-control-handler-prop \
 --hostname localhost \
 --port 4444 \
 --bindDN uid=admin \
 --bindPassword "password" \
 --add global-aci:"(targetcontrol=\"SimplePagedResults\") \
 (version 3.0; acl \"Anonymous simple paged results access\"; allow(read) \
 userdn=\"ldap:///anyone\";)" \
 --usePkcs12TrustStore /path/to/opendj/config/keystore \
 --trustStorePassword:file /path/to/opendj/config/keystore.pin \
 --no-prompt

Replication problems

Replicas do not connect

If you set up servers with different deployment IDs, they cannot share encrypted data. By default, they also cannot trust each other’s secure connections. You may read messages like the following in the logs/errors log file:

msg=Replication server accepted a connection from /address:port
to local address /address:port but the SSL handshake failed.

Unless the servers use your own CA, make sure their keys are generated with the same deployment ID/password. Either set up the servers again with the same deployment ID, or refer to Replace deployment IDs.

Temporary delays

Replication can generally recover from conflicts and transient issues. Temporary delays are normal and expected while replicas converge, especially when the write load is heavy. This is a feature of eventual convergence, not a bug.

Persistently long replication delays can be a problem for client applications. A client application gets an unexpectedly old view of the data when reading from a very delayed replica. Monitor replication delay and take action when you observe persistently long delays. For example, make sure the network connections between DS servers are functioning normally. Make sure the DS server systems are sized appropriately.

For detailed suggestions about monitoring replication delays, refer to either of the following sections:

Use the logs

By default, replication records messages in the log file, logs/errors. Replication messages have category=SYNC.

Messages have the following form. This example is folded for readability:

...msg=Replication server accepted a connection from 10.10.0.10/10.10.0.10:52859
 to local address 0.0.0.0/0.0.0.0:8989 but the SSL handshake failed.
 This is probably benign, but may indicate a transient network outage
 or a misconfigured client application connecting to this replication server.
 The error was: Remote host closed connection during handshake

Stale data

DS replicas maintain historical information to bring replicas up to date and to resolve conflicts. To prevent historical information from growing without limit, DS replicas purge historical information after the replication-purge-delay (default: 3 days).

A replica becomes irrevocably out of sync when, for example:

  • You restore it from backup files older than the purge delay.

  • You stop it for longer than the purge delay.

  • The replica stays out of contact with other DS servers for longer than the purge delay.

When this happens to a single replica, reinitialize the replica.
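
The comparison against the purge delay is simple arithmetic. The following is a minimal sketch, assuming you know when the replica was last up to date (for example, when the backup was taken) as seconds since the epoch. The helper name exceeds_purge_delay is hypothetical, and the default of 3 days matches the default replication-purge-delay mentioned above:

```shell
# Hypothetical helper: succeeds when the gap exceeds the purge delay.
# Usage: exceeds_purge_delay LAST_GOOD_EPOCH NOW_EPOCH [PURGE_DELAY_DAYS]
exceeds_purge_delay() {
  last=$1
  now=$2
  delay_days=${3:-3}                      # default purge delay: 3 days
  delay_seconds=$((delay_days * 24 * 60 * 60))
  [ $((now - last)) -gt "$delay_seconds" ]
}

# Usage: compare a backup taken at a known time with the current time.
#   if exceeds_purge_delay "$backup_epoch" "$(date +%s)"; then
#     echo "Backup is older than the purge delay; reinitialize the replica."
#   fi
```

If you have changed replication-purge-delay, pass your value in days as the third argument.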

For detailed suggestions about troubleshooting stale replicas, refer to either of the following sections:

Change number indexing

DS replication servers maintain a changelog database to record updates to directory data. The changelog database serves to:

  • Replicate changes, synchronizing data between replicas.

  • Let client applications get change notifications.

DS replication servers purge historical changelog data after the replication-purge-delay in the same way replicas purge their historical data.

Client applications can get changelog notifications using cookies (recommended) or change numbers.

To support change numbers, the servers maintain a change number index to the replicated changes. A replication server maintains the index when its configuration properties include changelog-enabled:enabled. (Cookie-based notifications do not require a change number index.)

The change number indexer must not be interrupted for long. Interruptions can arise when, for example, a DS server:

  • Stays out of contact, not sending any updates or heartbeats.

  • Gets removed without being shut down cleanly.

  • Gets lost in a system crash.

Interruptions prevent the change number indexer from advancing. When a change number indexer cannot advance for almost as long as the purge delay, it may be unable to recover as the servers purge historical data needed to determine globally consistent change numbers.

For detailed suggestions about monitoring changelog indexing, refer to either of the following sections:

Take action based on the situation:

Situation                                                            Actions to take
The time since last indexing is much smaller than the purge delay.   No action required.
The time since last indexing is approaching the purge delay.         Begin by determining why. The fix depends on the exact symptoms.
A DS server was removed without a clean shutdown.                    Rebuild an identical server and shut it down cleanly before removing it.
A DS server disappeared in a crash.                                  Rebuild an identical server.

Incorrect configuration

When replication is configured incorrectly, fixing the problem can involve adjustments on multiple servers. For example, adding or removing a bootstrap replication server means updating the bootstrap-replication-server settings in the synchronization provider configuration of other servers. (The settings can be hard-coded in the configuration, or read from the environment at startup time, as described in Property value substitution. In either case, changing them involves at least restarting the other servers.)

For details, refer to Replication and the related pages.

Support

Sometimes you cannot resolve a problem yourself, and must ask for help or technical support. In such cases, identify the problem and how you reproduce it, and the version where you observe the problem:

$ status --offline --version

ForgeRock Directory Services 7.5.1-20240808151121-d43482b24ad96bf845412d3dbb2eba542c4e22ec
Build <datestamp>

Be prepared to provide the following additional information:

  • The Java home set in config/java.properties.

  • Access and error logs showing what the server was doing when the problem started occurring.

  • A copy of the server configuration file, config/config.ldif, in use when the problem started occurring.

  • Other relevant logs or output, such as those from client applications experiencing the problem.

  • A description of the environment where the server is running, including system characteristics, hostnames, IP addresses, Java versions, storage characteristics, and network characteristics. This information helps when interpreting the logs and other data.

  • The .zip file generated using the supportextract command.

    For an example showing how to use the command, refer to supportextract.