PingDirectory

Reverting or replaying changes

The PingDirectory server provides support for an audit logger that records information about the changes to data within the server.

The data is formatted as LDIF, and it can be replayed with tools such as ldapmodify or parallel-update. The data includes information encoded as comments that provide additional context about the changes. By default, the log records the changes as requested by clients, but it can log the changes in reversible form so that they can be undone.

This audit logger can be useful for the following scenarios:

  • If one or more undesirable changes have been made, such as by a malicious or defective client, the audit log can be used to obtain the changes needed to revert those operations.

  • If a catastrophic event affects every server in the topology, such as concurrent database corruption across all instances, but leaves an audit log with newer data than any backup or LDIF export, the audit log can be used to recover changes that might not otherwise be available.

  • If changes made in one topology need to be replayed into another, such as replaying production changes into an isolated server or topology for testing or to reproduce a problem, the audit log can be used to automate the process of identifying those changes.

  • Analytics and reporting purposes.

To assist with these and other uses, the LDAP SDK for Java provides an API for consuming, parsing, and reverting audit log messages. This API can be used for analytics and reporting. The extract-data-recovery-log-changes tool is also available. It extracts audit log changes that match a specified set of criteria so that they can be replayed, either as they were originally processed or in a reversible form that makes it possible to revert those changes.

The data-recovery log

The setup tool automatically creates an audit logger for data recovery purposes in logs/data-recovery. The log is always compressed, and it is encrypted if data encryption is enabled within the server. The logger has the following properties:

  • Log files are written into the logs/data-recovery directory so that they are isolated from other log files. The active log file is named data-recovery.gz.encrypted while rotated files are named data-recovery.{timestamp}.gz.encrypted.

  • The log files are gzip-compressed. If data encryption is enabled, they are encrypted with a key obtained from the server’s preferred encryption settings definition.

  • Each log file contains no more than 10 MB of data and is rotated after 24 hours. Keeping the log files small ensures that the entire contents of a log file fit easily into the memory of the extract-data-recovery-log-changes tool.

  • The server retains rotated data recovery log files for no more than one week. However, as a safeguard against consuming too much disk space in periods of extremely heavy and prolonged write activity, the server also retains no more than 1,000 data recovery log files for a maximum of 500 MB of disk space.

  • Changes are logged in reversible form and include the authentication and authorization identities of the requester, as well as the requester's IP address. If the request includes an intermediate client request control, the log message includes its details, which can provide information about the downstream client.
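As a rough way to see how close a server is to the retention limits described above, the following sketch counts the rotated files in a data recovery log directory and totals their size. The function name, the output format, and the reading of 500 MB as 500 × 1024 × 1024 bytes are assumptions made for illustration; none of this is part of PingDirectory.

```shell
#!/bin/sh
# Sketch only: check_data_recovery_usage is a hypothetical helper, not a
# PingDirectory tool. It compares rotated data recovery log files against
# the documented caps of 1,000 files and 500 MB.
check_data_recovery_usage() {
  log_dir=$1
  max_files=1000                     # documented file-count cap
  max_bytes=$((500 * 1024 * 1024))   # documented 500 MB cap (assumed MiB)

  # Rotated files carry a timestamp between "data-recovery." and ".gz".
  # The active file (data-recovery.gz.encrypted) does not match this pattern.
  file_count=$(find "$log_dir" -type f -name 'data-recovery.*.gz*' \
    | wc -l | tr -d '[:space:]')

  # Sum the on-disk (compressed) bytes of the rotated files.
  total_bytes=$(find "$log_dir" -type f -name 'data-recovery.*.gz*' \
    -exec cat {} + | wc -c | tr -d '[:space:]')

  status=ok
  if [ "$file_count" -ge "$max_files" ] || [ "$total_bytes" -ge "$max_bytes" ]; then
    status=at-limit
  fi
  echo "files=$file_count bytes=$total_bytes status=$status"
}
```

Calling `check_data_recovery_usage logs/data-recovery` from the server root prints a single summary line. A status of `at-limit` suggests that the server might be discarding rotated files before they are a week old.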

The extract-data-recovery-log-changes tool

The extract-data-recovery-log-changes tool creates an LDIF file (compressed and encrypted by default) that contains a specified subset of changes from the server’s data recovery log. That LDIF file can then be applied to the server using either ldapmodify or parallel-update. Before applying the changes, you can decrypt and examine the output file to ensure that the changes it contains look correct. This tool can be useful for disaster recovery.

The extract-data-recovery-log-changes tool provides arguments for input and output of the extracted changes, including encryption settings, location, and compression.

The tool also lets you choose the direction in which changes are extracted: forward mode or reverse mode. In forward mode (replay), the audit log messages are traversed from oldest to newest, and extracted changes are presented as they were originally requested. In reverse mode (revert), the audit log messages are traversed from newest to oldest, and extracted changes are converted to a form that reverts the original changes. Regardless of the direction chosen, additional arguments enable identifying the changes to extract by time, requester address or DN, connection ID, origin, content type, or alterations.

The following sample command extracts, in a form that reverts them, all changes made by user uid=malicious,ou=People,dc=example,dc=com between noon and 2 p.m. on October 15, 2018:

$ bin/extract-data-recovery-log-changes \
  --auditLogFile logs/data-recovery/data-recovery.201810161234.567.gz.encrypted \
  --outputFile revert-malicious-user-changes.ldif \
  --direction revert \
  --startTime 201810151200.000 \
  --endTime 201810151359.999 \
  --includeAuthorizationDN "uid=malicious,ou=People,dc=example,dc=com"
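The same filters can be paired with either direction. The sketch below is a dry-run helper that only prints a command line for review and does not run the tool. It uses the arguments from the sample command above. The helper name `build_extract_command` is hypothetical, and the `replay` value for `--direction` is an assumption based on the description of forward mode, so confirm it against the tool's usage information before relying on it.

```shell
#!/bin/sh
# Dry-run sketch: prints an extract-data-recovery-log-changes command line.
# build_extract_command is a hypothetical helper, not a PingDirectory tool.
build_extract_command() {
  direction=$1
  log_file=$2
  output_file=$3
  start_time=$4
  end_time=$5
  auth_dn=$6

  # "revert" appears in the sample command. "replay" is assumed from the
  # forward-mode description and should be verified against the tool's help.
  case "$direction" in
    replay|revert) ;;
    *) echo "direction must be replay or revert" >&2; return 1 ;;
  esac

  # Print the command on one line so it can be inspected before running it.
  printf 'bin/extract-data-recovery-log-changes --auditLogFile %s --outputFile %s --direction %s --startTime %s --endTime %s --includeAuthorizationDN "%s"\n' \
    "$log_file" "$output_file" "$direction" "$start_time" "$end_time" "$auth_dn"
}
```

For example, passing `replay` as the first argument with the log file, time window, and DN from the sample above prints the forward-mode counterpart of that command, which could be used to replay the same changes into a test topology.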