Troubleshooting replication

This section covers information on troubleshooting your replication deployment.

Discovering obsolete replicas

About this task

To avoid entering lockdown mode when upgrading servers in a replicated topology, before upgrading, check the replicationChanges database for any obsolete replicas. To perform this check, run the check-replication-domains tool, which scans changelogDb for all known replication domains and identifies any obsolete replicas still listed as part of a topology.

Steps

Run check-replication-domains --serverRoot <serverRootDirectory>.

You can use the --serverRoot argument to specify the root directory where the server containing the replication data is installed. If you don’t supply this argument, check-replication-domains uses the default value of the server where you run the tool.

Review the output for any replica IDs marked as OBSOLETE.

Example:

The following is an example output from the check-replication-domains tool:

Server topo-1
[pinguser@topo-1 ~]$ PingDirectory/bin/check-replication-domains --serverRoot PingDirectory/

SERVER           DOMAIN DN                ID
---------------- ------------------------ ------
topo-1           cn=schema                20693 (local)
topo-1           dc=example,dc=com        23135 (local)
topo-2           cn=schema                8371
topo-2           dc=example,dc=com        19233

Server topo-2
[pinguser@topo-2 ~]$ PingDirectory/bin/check-replication-domains --serverRoot PingDirectory/

SERVER           DOMAIN DN                ID
---------------- ------------------------ ------
<unknown>        dc=example,dc=com        7403  OBSOLETE 
topo-1           cn=schema                20693
topo-1           dc=example,dc=com        23135
topo-2           cn=schema                8371 (local)
topo-2           dc=example,dc=com        19233 (local)
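
In a large topology, the OBSOLETE rows can be filtered out of the tool output mechanically. The following is a minimal sketch, not part of PingDirectory: the function name list_obsolete_replicas is hypothetical, and it assumes the column layout shown in the example output above (server, domain DN, replica ID, then an optional marker) and domain DNs that contain no spaces.

```shell
# Hypothetical helper: print "<replica ID> <domain DN>" for each row that
# check-replication-domains marks as OBSOLETE. Reads the tool output on stdin.
# Assumes the column layout of the example output and DNs without spaces.
list_obsolete_replicas() {
  awk '$NF == "OBSOLETE" { print $(NF - 1), $(NF - 2) }'
}

# Usage (tool path taken from the example above):
#   PingDirectory/bin/check-replication-domains --serverRoot PingDirectory/ \
#     | list_obsolete_replicas
```

An empty result means that no row carried the OBSOLETE marker on the server where the tool was run. Because each server reports its own view, as the two example outputs show, the check is worth repeating on every server.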

Next steps

If you identified any obsolete replicas, purge the obsolete replicas.

Recovering a replica with missed changes

If a server has been offline for a period of time longer than the replication purge delay, you must run the dsreplication initialize command to bring the replica into sync with the topology.

Any missed changes are detected at the time of server startup. A missed change is a change that the replica detects it needs, but the change is not found within any other replication server’s replicationChanges backend stored in the /changelogDb server root path.

If missed changes are detected, the server enters lockdown mode, where only privileged clients can make requests. Any other server that is not missing changes can be used as a source for dsreplication initialize.
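
The purge-delay rule above is simple arithmetic, so it can be checked before restarting a replica. The following sketch is illustrative only: the function name and its three inputs are hypothetical, the caller must supply the real purge delay from the server configuration, and the server itself still performs the authoritative missed-change detection at startup.

```shell
# Hypothetical helper: succeed (exit status 0) when a replica has been offline
# for longer than the replication purge delay. All arguments are in seconds.
#   $1  epoch time at which the replica went offline
#   $2  current epoch time
#   $3  replication purge delay, as configured on the servers
offline_exceeds_purge_delay() {
  went_offline=$1
  now=$2
  purge_delay=$3
  [ $(( now - went_offline )) -gt "$purge_delay" ]
}

# Usage with illustrative values (86400 seconds is one day, not a default):
#   if offline_exceeds_purge_delay 1700000000 "$(date +%s)" 86400; then
#     echo "Replica was offline too long; plan to run dsreplication initialize."
#   fi
```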

If the server requires a manual backup and restore, perform the following steps, which are equivalent to dsreplication initialize.

Performing a manual initialization

About this task

The PingDirectory server provides the tools necessary for backing up and restoring backends, which can be used to manually initialize a replica.

As detailed in the following procedure, you use <server-root>/bin/backup to create a backup of the backend containing the replicated base DN. If encryption is enabled for the backend containing the replicated base DN, then you must also make a backup of the encryption-settings backend.

When initializing a server that has been offline longer than the replication-purge-delay, you must also make backups of the replicationChanges and schema backends.

You then need to transfer all backup files to the target server(s) and restore them individually using <server-root>/bin/restore.

To preserve existing encryption settings, <server-root>/bin/restore appends to the encryption-settings database as opposed to replacing it.
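
The rules above for choosing which backends to back up can be sketched as a small decision function. This is an illustration under stated assumptions rather than a PingDirectory tool: the function name and its yes/no arguments are hypothetical, and the backend names are the ones used in the text above.

```shell
# Hypothetical helper: print the backends to back up for a manual
# initialization, one per line, following the rules described above.
#   $1  backend ID that holds the replicated base DN (for example, userRoot)
#   $2  "yes" if encryption is enabled for that backend
#   $3  "yes" if the target was offline longer than the replication purge delay
backends_to_back_up() {
  echo "$1"
  if [ "$2" = "yes" ]; then
    echo "encryption-settings"
  fi
  if [ "$3" = "yes" ]; then
    echo "replicationChanges"
    echo "schema"
  fi
}

# Usage:
#   backends_to_back_up userRoot yes yes
```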

To manually initialize a server when an online initialization isn’t possible:

Steps

From another server in the replication topology, back up the userRoot, schema, changelog, and replicationChanges backends to the <server-root>/bak directory.

If data encryption is enabled, export the encryption-settings backend because you might need to import one or more encryption settings IDs into the new replica.

Example:

$  <source-server-root>/bin/backup --backendID userRoot --backupDirectory \
   bak/userRoot
$  <source-server-root>/bin/backup --backendID schema --backupDirectory \
   bak/schema
$  <source-server-root>/bin/backup --backendID changelog --backupDirectory \
   bak/changelog
$  <source-server-root>/bin/backup --backendID replicationChanges \
   --backupDirectory bak/replicationChanges
$  <source-server-root>/bin/encryption-settings export --id <id> \
   --output-file bak/exported-key

Copy the bak directory to the new replica.

Example:

$ scp -r  <source-server-root>/bak/* \
    <user>@<destination-server>:<destination-server-root>/bak

Stop the server on the new replica.

Restore the userRoot, schema, changelog, and replicationChanges backends.

If the encryption-settings backend was exported, import it before restoring any of the backends.

Example:

$  <destination-server-root>/bin/encryption-settings import --input-file \
   bak/exported-key --set-preferred
Enter the PIN used to encrypt the definition:
$  <destination-server-root>/bin/restore --backupDirectory bak/userRoot
$  <destination-server-root>/bin/restore --backupDirectory bak/schema
$  <destination-server-root>/bin/restore --backupDirectory \
   bak/changelog
$  <destination-server-root>/bin/restore --backupDirectory \
   bak/replicationChanges

Start the server using bin/start-server.
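
The per-backend backup and restore commands in the steps above differ only in the backend name, so they can be generated in a loop. The following dry-run sketch prints the commands instead of running them so that they can be reviewed first. The function names and the BACKENDS variable are hypothetical; the tool paths and arguments are the ones shown in the examples above. The encryption-settings export and import are left out because the import prompts for a PIN.

```shell
# Backends named in the steps above.
BACKENDS="userRoot schema changelog replicationChanges"

# Print (do not run) the backup commands for the source server.
#   $1  source server root
print_backup_commands() {
  for backend in $BACKENDS; do
    echo "$1/bin/backup --backendID $backend --backupDirectory bak/$backend"
  done
}

# Print (do not run) the restore commands for the destination server.
#   $1  destination server root
print_restore_commands() {
  for backend in $BACKENDS; do
    echo "$1/bin/restore --backupDirectory bak/$backend"
  done
}

# Usage (the path /opt/PingDirectory is a placeholder):
#   print_backup_commands /opt/PingDirectory
#   print_restore_commands /opt/PingDirectory
```

After the printed commands have been checked, they can be run by hand or piped to sh from the directory that contains bak.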

Fixing replication conflicts

Replication conflicts occur when an incompatible change to an entry is made on two replicas at the same time. The change processes on one replica and then replicates to the other replica, which causes the conflict. While most conflicts resolve automatically, some require manual action.

To fix replication conflicts, initialize the replica containing the conflicts with the data from another replica that does not have conflicts. If the database is large and the number of conflicts is small, you can instead run ldapmodify against the server with the conflict and include the Replication Repair Control, specified by OID value 1.3.6.1.4.1.30221.1.5.2, in the command. The Replication Repair Control prevents the change from replicating and enables changing operational attribute values, which are not normally writable.

The following tasks use the Replication Repair Control to fix replication conflicts and apply changes only to the server with the conflict. Two examples are provided: one for fixing a modify conflict using the ldap-diff tool and one for fixing a naming conflict.

Fixing a modify conflict

Steps

To isolate conflicting entries between two replicas, use the bin/ldap-diff tool.

Set the sourceHost value to the server that needs the adjustment.

Example:

The following example uses the tool to search across the entire base distinguished name (DN) for any difference in user attributes and reports the difference in difference.ldif.

$ bin/ldap-diff \
  --sourceHost austin02.example.com --sourcePort 1389 \
  --sourceBindDN "cn=Directory Manager" --sourceBindPassword pass \
  --targetHost austin01.example.com --targetPort 1389 \
  --targetBindDN "cn=Directory Manager" --targetBindPassword pass \
  --baseDN "dc=example,dc=com" --outputLDIF difference.ldif \
  --searchFilter "(objectclass=*)" --numPasses 3 "*" \
  "^userPassword"

To apply changes to the server that contains conflicts, use the difference.ldif file in a format compatible with ldapmodify.

Run the ldap-diff command with the sourceHost value set to the server with conflicts.

Example:

The following is an example of the contents of the difference.ldif file.

dn: uid=user.1,ou=people,dc=example,dc=com
changetype: modify
add: mobile
mobile: +1 568 232 6789
-
delete: mobile
mobile: +1 568 591 7372
-

To correct the entries on the sole server with conflicts, run bin/ldapmodify.

Example:

$ bin/ldapmodify --bindPassword <password> -J "1.3.6.1.4.1.30221.1.5.2" \
  --filename difference.ldif
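
Because changes made with the Replication Repair Control are not replicated, it helps to review which entries difference.ldif touches before applying it. The following sketch prints one line per change record. The function name summarize_ldif_changes is hypothetical, and the sketch assumes that dn: lines are neither wrapped nor base64-encoded.

```shell
# Hypothetical helper: summarize an LDIF change file read from stdin as
# "<changetype> <DN>", one line per change record.
# Assumes unwrapped, plain-text "dn: " lines.
summarize_ldif_changes() {
  awk '
    /^dn: /         { dn = substr($0, 5) }       # text after "dn: "
    /^changetype: / { print substr($0, 13), dn } # text after "changetype: "
  '
}

# Usage:
#   summarize_ldif_changes < difference.ldif
```

For the example file shown above, this prints a single modify record for uid=user.1.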

Fixing a naming conflict

About this task

In this example, a naming conflict was encountered when the replica attempted to replay an ADD of uid=user.200,ou=people,dc=example,dc=com. Because of this conflict, the server returns a replication conflict message. See the following example message.

[18/Feb/2010:14:53:12 -0600] category=EXTENSIONS severity=SEVERE_ERROR
msgID=1880359005 msg="Administrative alert type=replication-unresolved-conflict
id=bbd2cbaf-90a4-42af-94a8-c1a42df32fc6
class=com.unboundid.directory.server.replication.plugin.ReplicationDomain
msg='An unresolved conflict was detected for DN uid=user.200,ou=People,dc=example,dc=com.
The conflicting entry has been renamed to
entryuuid=69807e3d-ab27-43a3-8759-ec0d8d6b3107+uid=user.200,ou=People,dc=example,dc=com'"

The PingDirectory server prepends the entryUUID to the DN of the conflicting entry and adds a ds-sync-conflict-entry auxiliary object class to the entry to aid in search.

To resolve the conflict:

Steps

Search for any entry that has the ds-sync-conflict-entry object class, and return only the DNs that match the filter.

Example:
```
$ bin/ldapsearch --baseDN dc=example,dc=com --searchScope sub \
  "(objectclass=ds-sync-conflict-entry)" "1.1"
```
Result:

The search results display the conflicting entry for uid=user.200.
```
dn: entryuuid=69807e3d-ab27-43a3-8759-ec0d8d6b3107+uid=user.200,ou=People,dc=example,dc=com

dn: entryuuid=523c430e-a870-4ebe-90f8-9cd811946420+uid=user.200,ou=People,dc=example,dc=com
```
Conflict entries are not returned unless objectclass=ds-sync-conflict-entry is present in the search filter.

Compare the conflict entry with the target entry.

Apply the difference in one of the following ways:

Choose from:

Use the ldapmodify tool with the Replication Repair Control.

You can also delete the conflict entry using this command.

Run bin/ldapmodify with the Replication Repair Control to make the fix.

When making changes using the Replication Repair Control, the updates are not propagated through replication. Examine each replica individually, and apply the necessary modifications using the request control.

Example:

$ bin/ldapmodify -J "1.3.6.1.4.1.30221.1.5.2" \
  --filename difference.ldif
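
To compare a conflict entry with its target entry, you need the DN that the entry had before the server renamed it. The following sketch derives that DN by removing the leading entryuuid component from the search results. The function name original_dn_of_conflict is hypothetical, and the sketch assumes that the attribute name appears in lowercase, as in the results above, and that DN lines are not wrapped.

```shell
# Hypothetical helper: read "dn:" lines for conflict entries from stdin and
# print the DN each entry had before it was renamed, by stripping the leading
# "entryuuid=<uuid>+" component. Lines that do not match are ignored.
original_dn_of_conflict() {
  sed -n 's/^dn: entryuuid=[0-9a-fA-F-]*+//p'
}

# Usage (search command taken from the step above):
#   bin/ldapsearch --baseDN dc=example,dc=com --searchScope sub \
#     "(objectclass=ds-sync-conflict-entry)" "1.1" | original_dn_of_conflict
```

Both conflict entries in the example results map to the same target DN, uid=user.200,ou=People,dc=example,dc=com.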

Fixing mismatched generation IDs

About this task

If you receive a warning message that multiple generation IDs were detected for a specific suffix, you must re-initialize one or more replicas. If the warning is presented from a server after an initialization, it could be that the post-external-initialization command was not run as part of a global change in data.

Try the following fixes as needed.

Steps

To re-initialize replicas as part of a global change in data, run the post-external-initialization command.
To fix mismatched generation IDs, run the dsreplication command.
To check whether any generation IDs differ across the topology, run the dsreplication tool with the status command, which warns about any mismatch.
