PingDirectory

Troubleshooting replication

This section covers information on troubleshooting your replication deployment.

When troubleshooting, check the log file associated with the subcommand that is producing the error. Learn more about Replication subcommand logs.

For replication issues related to certificate trust, see Repairing broken listener certificate trust in replication.

Discovering obsolete replicas

About this task

To avoid entering lockdown mode when upgrading servers in a replicated topology, before upgrading, check the replicationChanges database for any obsolete replicas. To perform this check, run the check-replication-domains tool, which scans changelogDb for all known replication domains and identifies any obsolete replicas still listed as part of a topology.

As of PingDirectory version 10.2, the risk of lockdown due to obsolete replicas is minimal. If you have already upgraded all servers in a replicated topology to version 10.2 or later, these steps are optional.

Steps

  1. Run check-replication-domains --serverRoot <serverRootDirectory>.

    You can use the --serverRoot argument to specify the root directory where the server containing the replication data is installed. If you don’t supply this argument, check-replication-domains uses the default value of the server where you run the tool.

  2. Review the output for any replica IDs marked OBSOLETE.

    Example:

    The following is an example output from the check-replication-domains tool:

    Server topo-1
    [pinguser@topo-1 ~]$ PingDirectory/bin/check-replication-domains --serverRoot PingDirectory/
    
    SERVER           DOMAIN DN                ID
    ---------------- ------------------------ ------
    topo-1           cn=schema                20693 (local)
    topo-1           dc=example,dc=com        23135 (local)
    topo-2           cn=schema                8371
    topo-2           dc=example,dc=com        19233
    
    Server topo-2
    [pinguser@topo-2 ~]$ PingDirectory/bin/check-replication-domains --serverRoot PingDirectory/
    
    SERVER           DOMAIN DN                ID
    ---------------- ------------------------ ------
    <unknown>        dc=example,dc=com        7403  OBSOLETE 
    <unknown>        dc=example,dc=com        7406  DELETED 
    topo-1           cn=schema                20693
    topo-1           dc=example,dc=com        23135
    topo-2           cn=schema                8371 (local)
    topo-2           dc=example,dc=com        19233 (local)

    Any replica marked DELETED has been deleted from the topology but is not yet obsolete.
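
    The review in step 2 can be scripted when many domains are listed. The following is a minimal sketch, assuming the tool's output keeps the column layout of the example above, with the status keyword as the last whitespace-separated field and no spaces inside the domain DN. The function name list_obsolete_replicas is hypothetical and is not part of the product.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with the server): read
# check-replication-domains output on stdin and print
# "<replica ID> <domain DN>" for every row flagged OBSOLETE.
# Assumes the status keyword is the last field on the row and
# that the domain DN contains no spaces, as in the sample output.
list_obsolete_replicas() {
  awk '$NF == "OBSOLETE" { print $(NF-1), $(NF-2) }'
}

# Usage (paths as in the example above):
#   PingDirectory/bin/check-replication-domains --serverRoot PingDirectory/ \
#     | list_obsolete_replicas
```

    Rows marked DELETED or (local), and rows with no status, are ignored, so an empty result suggests that no obsolete replicas remain on that server.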

Next steps

If you identified any obsolete replicas, purge the obsolete replicas.

Recovering a replica with missed changes

If a server has been offline for a period of time longer than the replication purge delay, you must run the dsreplication initialize command to bring the replica into sync with the topology.

Any missed changes are detected at the time of server startup. A missed change is a change that the replica detects it needs, but the change is not found within any other replication server’s replicationChanges backend stored in the /changelogDb server root path.

If missed changes are detected, the server enters lockdown mode, where only privileged clients can make requests. Any other server that is not missing changes can be used as a source for dsreplication initialize.
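
As an illustration of the online initialization described above, the following sketch wraps dsreplication initialize in a small function. The function name, the argument order, and the host and port values in the usage comment are hypothetical. The option names are assumptions that are not shown in this section, so confirm them with dsreplication initialize --help before use. Credentials are deliberately left out so that the tool can prompt for them.

```shell
#!/bin/sh
# Hypothetical wrapper: initialize a destination replica from a source
# replica that is not missing changes.
# Arguments: <path to dsreplication> <source host> <source port>
#            <destination host> <destination port> <base DN>
# The option names are assumptions; verify them with --help.
initialize_replica() {
  dsreplication_cmd=$1
  "$dsreplication_cmd" initialize \
    --hostSource "$2" --portSource "$3" \
    --hostDestination "$4" --portDestination "$5" \
    --baseDN "$6"
}

# Usage (placeholder hosts and ports):
#   initialize_replica <server-root>/bin/dsreplication \
#     topo-1 1389 topo-2 1389 "dc=example,dc=com"
```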

If the server requires a manual backup and restore, perform the following steps, which are equivalent to dsreplication initialize.

Performing a manual initialization

About this task

The PingDirectory server provides the tools necessary for backing up and restoring backends, which can be used to manually initialize a replica.

As detailed in the following procedure, you use <server-root>/bin/backup to create a backup of the backend containing the replicated base DN. If encryption is enabled for the backend containing the replicated base DN, then you must also make a backup of the encryption-settings backend.

When initializing a server that has been offline longer than the replication-purge-delay, you must also make backups of the replicationChanges and schema backends.

You then need to transfer all backup files to the target server(s) and restore them individually using <server-root>/bin/restore.

To preserve existing encryption settings, <server-root>/bin/restore appends to the encryption-settings database as opposed to replacing it.

To manually initialize a server when an online initialization isn’t possible:

Steps

  1. From another server in the replication topology, back up the userRoot, schema, changelog, and replicationChanges backends to the <server-root>/bak directory.

    If data encryption is enabled, export the encryption-settings backend, because you might need to import one or more encryption settings IDs into the new replica.

    Example:

    $  <source-server-root>/bin/backup --backendID userRoot --backupDirectory \
       bak/userRoot
    $  <source-server-root>/bin/backup --backendID schema --backupDirectory \
       bak/schema
    $  <source-server-root>/bin/backup --backendID changelog --backupDirectory \
       bak/changelog
    $  <source-server-root>/bin/backup --backendID replicationChanges \
       --backupDirectory bak/replicationChanges
    $  <source-server-root>/bin/encryption-settings export --id <id> \
       --output-file bak/exported-key
  2. Copy the bak directory to the new replica.

    Example:

    $ scp -r  <source-server-root>/bak/* \
        <user>@<destination-server>:<destination-server-root>/bak
  3. Stop the server.

  4. Restore the userRoot, schema, changelog, and replicationChanges backends.

    If the encryption-settings backend was exported, import it before restoring any of the backends.

    Example:

    $  <destination-server-root>/bin/encryption-settings import --input-file \
       bak/exported-key --set-preferred
    Enter the PIN used to encrypt the definition:
    $  <destination-server-root>/bin/restore --backupDirectory bak/userRoot
    $  <destination-server-root>/bin/restore --backupDirectory bak/schema
    $  <destination-server-root>/bin/restore --backupDirectory \
       bak/changelog
    $  <destination-server-root>/bin/restore --backupDirectory \
       bak/replicationChanges
  5. Start the server using bin/start-server.
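
The repeated backup commands in step 1 can be collapsed into a loop. The following is a minimal sketch that uses only the --backendID and --backupDirectory options shown above. The function name backup_backends is hypothetical, and the sketch assumes it runs from the source server root so that the relative bak/ paths resolve as in the example.

```shell
#!/bin/sh
# Hypothetical helper: back up each named backend into bak/<backendID>.
# First argument: path to the backup tool (<source-server-root>/bin/backup).
# Remaining arguments: backend IDs to back up.
# Stops at the first failed backup so a partial set is not copied by mistake.
backup_backends() {
  backup_cmd=$1
  shift
  for backend in "$@"; do
    "$backup_cmd" --backendID "$backend" \
      --backupDirectory "bak/$backend" || return 1
  done
}

# Usage:
#   backup_backends <source-server-root>/bin/backup \
#     userRoot schema changelog replicationChanges
```

The encryption-settings export is left as a separate manual command because it takes an ID and an output file rather than a backend ID.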

Fixing replication conflicts

Replication conflicts occur when an incompatible change to an entry is made on two replicas at the same time. The change processes on one replica and then replicates to the other replica, which causes the conflict. While most conflicts resolve automatically, some require manual action.

To fix replication conflicts, initialize the replica containing the conflicts with the data from another replica that does not have conflicts. If the database is large and the number of conflicts is small, you can instead run ldapmodify against the server with the conflict, including the Replication Repair Control (OID 1.3.6.1.4.1.30221.1.5.2) in the command. The Replication Repair Control prevents the change from replicating and enables changing operational attribute values, which are not normally writable.

The following tasks use the Replication Repair Control to fix replication conflicts and apply changes only to the server with the conflict. Two examples are provided: one fixes a modify conflict using the ldap-diff tool, and the other fixes a naming conflict.

Fixing a modify conflict

Steps

  1. To isolate conflicting entries between two replicas, use the bin/ldap-diff tool.

    Replace the sourceHost value with the server that needs the adjustment.

    Example:

    The following example uses the tool to search across the entire base distinguished name (DN) for any difference in user attributes and reports the difference in difference.ldif.

    $ bin/ldap-diff --sourceHost austin02.example.com --sourcePort 1389 \
        --sourceBindDN "cn=Directory Manager" --sourceBindPassword pass \
        --targetHost austin01.example.com --targetPort 1389 \
        --targetBindDN "cn=Directory Manager" --targetBindPassword pass \
        --baseDN "dc=example,dc=com" --outputLDIF difference.ldif \
        --searchFilter "(objectclass=*)" --numPasses 3 "*" \
        "^userPassword"
  2. To apply changes to the server that contains conflicts, use the difference.ldif file, which is written in a format compatible with ldapmodify.

    Run the ldap-diff command with the sourceHost value set to the server with conflicts.

    Example:

    The following is an example of the contents of the difference.ldif file.

    dn: uid=user.1,ou=people,dc=example,dc=com
    changetype: modify
    add: mobile
    mobile: +1 568 232 6789
    -
    delete: mobile
    mobile: +1 568 591 7372
    -
  3. To correct the entries on the sole server with conflicts, run bin/ldapmodify.

    Example:

    $ bin/ldapmodify --bindPassword password -J "1.3.6.1.4.1.30221.1.5.2" \
        --filename difference.ldif
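
Before running ldapmodify in step 3, it can help to confirm how many entries the difference file touches, because a large count suggests that re-initializing the replica is the better fix. The following is a minimal sketch. The function name count_ldif_records is hypothetical, and the sketch assumes each change record begins with a dn: line at the start of a line, as in the example file above.

```shell
#!/bin/sh
# Hypothetical helper: count the change records in an LDIF file by
# counting lines that start with "dn:" (this also matches base64
# "dn::" lines). Prints 0 for a file with no records.
count_ldif_records() {
  grep -c -i '^dn:' "$1"
}

# Usage:
#   count_ldif_records difference.ldif
```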

Fixing a naming conflict

About this task

In this example, a naming conflict was encountered when the replica attempted to replay an ADD of uid=user.200,ou=people,dc=example,dc=com. Because of this conflict, the server returns a replication conflict message. See the following example message.

[18/Feb/2010:14:53:12 -0600] category=EXTENSIONS severity=SEVERE_ERROR
msgID=1880359005 msg="Administrative alert type=replication-unresolved-conflict
id=bbd2cbaf-90a4-42af-94a8-c1a42df32fc6
class=com.unboundid.directory.server.replication.plugin.ReplicationDomain
msg='An unresolved conflict was detected for DN uid=user.200,ou=People,dc=example,dc=com.
The conflicting entry has been renamed to
entryuuid=69807e3d-ab27-43a3-8759-ec0d8d6b3107+uid=user.200,ou=People,dc=example,dc=com'"

The PingDirectory server prepends the entryUUID to the DN of the conflicting entry and adds a ds-sync-conflict-entry auxiliary object class to the entry to aid in search.

To resolve the conflict:

Steps

  1. Search for any entry that has the ds-sync-conflict-entry object class, and return only the DNs that match the filter.

    Example:

    $ bin/ldapsearch --baseDN dc=example,dc=com --searchScope sub \
      "(objectclass=ds-sync-conflict-entry)" "1.1"

    Result:

    The search results display the conflicting entry for uid=user.200.

    dn: entryuuid=69807e3d-ab27-43a3-8759-ec0d8d6b3107+uid=user.200,ou=People,dc=example,dc=com
    
    dn: entryuuid=523c430e-a870-4ebe-90f8-9cd811946420+uid=user.200,ou=People,dc=example,dc=com

    Conflict entries are not returned unless the objectclass=ds-sync-conflict-entry is present in the search filter.

  2. Compare the conflict entry with the target entry.

  3. Apply the difference in one of the following ways:

    Choose from:

    • Use the ldapmodify tool with the Replication Repair Control.

      You can also delete the conflict entry using this command.

    • Run bin/ldapmodify with the Replication Repair Control to make the fix.

      When making changes using the Replication Repair Control, the updates are not propagated through replication. Examine each replica individually, and apply the necessary modifications using the request control.

      Example:

      $ bin/ldapmodify -J "1.3.6.1.4.1.30221.1.5.2" \
        --filename difference.ldif
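
To compare a conflict entry with its target entry in step 2, you need the target DN, which is the conflict DN without the leading entryuuid component. The following is a minimal sketch, assuming conflict DNs always take the entryuuid=<uuid>+<original RDN> form shown in the search results above. The function name target_dn_from_conflict_dn is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: derive the target DN from a conflict DN by
# removing the leading "entryuuid=<uuid>+" component of the first RDN.
# DNs that do not start with an entryuuid component are printed unchanged.
target_dn_from_conflict_dn() {
  printf '%s\n' "$1" |
    sed -E 's/^[Ee]ntry[Uu][Uu][Ii][Dd]=[0-9A-Fa-f-]+\+//'
}

# Usage:
#   target_dn_from_conflict_dn \
#     "entryuuid=69807e3d-ab27-43a3-8759-ec0d8d6b3107+uid=user.200,ou=People,dc=example,dc=com"
```

The resulting DN can then be used as the base DN of a second ldapsearch so that both entries can be compared attribute by attribute.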

Fixing mismatched generation IDs

About this task

If you receive a warning message that multiple generation IDs were detected for a specific suffix, you must re-initialize one or more replicas. If the warning is presented from a server after an initialization, it could be that the post-external-initialization command was not run as part of a global change in data.

Try the following fixes as needed.

Steps

  • To re-initialize replicas as part of a global change in data, run the post-external-initialization command.

  • To fix mismatched generation IDs, run the dsreplication command.

  • To check for generation IDs that differ across the topology, run the dsreplication tool with the status subcommand, which warns when it detects a mismatch.