Backup and restore

Backup archives are not guaranteed to be compatible across major and minor server releases. Restore backups only on directory servers of the same major or minor version.

To share data between servers of different versions, either use replication, or use LDAP Data Interchange Format (LDIF).

DS servers use cryptographic keys to sign and verify the integrity of backup files, and to encrypt data. Servers protect these keys by encrypting them with the shared master key for a deployment. For portability, servers store the encrypted keys in the backup files.

Any server can therefore restore a backup taken with the same server version, as long as it holds a copy of the shared master key used to encrypt the keys.

How backup works

DS directory servers store data in backends. The amount of data in a backend varies depending on your deployment. It can range from very small to very large. A JE backend can hold billions of LDAP entries, for example.

Backup process

A JE backend stores data on disk using append-only log files with names like number.jdb. The JE backend writes updates to the highest-numbered log file. The log files grow until they reach a specified size (default: 1 GB). When the current log file reaches the specified size, the JE backend creates a new log file.

To avoid an endless increase in database size on disk, JE backends clean their log files in the background. A cleaner thread copies active records to new log files. Log files that no longer contain active records are deleted.

The DS backup process takes advantage of this log file structure. Together, a set of log files represents a backend at a point in time. The backup process essentially copies the log files to the backup directory. DS also protects the data and adds metadata to keep track of the log files it needs to restore a JE backend to the state it had when the backup task completed.
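
As a rough illustration of this layout, the following shell sketch prints the highest-numbered log file in a backend directory, which is the file the backend currently writes updates to. It assumes the file names are zero-padded to a fixed width, so that a plain sort matches numeric order. The function name and the example path are hypothetical.

```shell
# Print the highest-numbered .jdb log file in a directory.
# Assumption: file names are fixed-width and zero-padded, so sort orders them numerically.
latest_jdb() {
  dir=$1
  # List only names ending in .jdb, sort them, and keep the last one.
  ls "$dir" | grep '\.jdb$' | sort | tail -n 1
}

# Usage (hypothetical path): latest_jdb /path/to/opendj/db/dsEvaluation
```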

Cumulative backups

DS backups are cumulative in nature. Backups reuse the JE files that did not change since the last backup operation. They only copy the JE files the backend created or changed. Files that did not change are shared between backups.

A set of backup files is fully standalone.

Purge old backups

Backup tasks keep JE files until you purge them.

The backup purge operation prevents an endless increase in the size of the backup folder on disk. The purge operation does not happen automatically; you choose to run it. When you run a purge operation, it removes the files for old or selected backups. The purge does not impact the integrity of the backups DS keeps. It only removes log files that do not belong to any remaining backups.
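
The purge rule can be pictured with a small model. The sketch below is not how DS implements purge. It assumes a hypothetical layout in which each backup has a manifest file listing the log files it needs, and it deletes only the log files that no remaining manifest references.

```shell
# Model of a purge: remove log files that no remaining backup references.
# Hypothetical layout (not the DS on-disk format):
#   $1/manifests/<backup-id>   one needed file name per line
#   $1/files/<name>            the stored log files
purge_unreferenced() {
  root=$1
  for path in "$root"/files/*; do
    [ -e "$path" ] || continue
    name=${path##*/}
    # Keep the file if any remaining manifest lists exactly this name.
    if cat "$root"/manifests/* 2>/dev/null | grep -qxF "$name"; then
      continue
    fi
    rm -f "$path"
  done
}
```

In this model, deleting the manifest of an old backup and then calling purge_unreferenced removes the files only that backup used, while files shared with newer backups stay in place.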

Back up

When you set up a directory server, the process creates a /path/to/opendj/bak/ directory. You can use this for backups if you have enough local disk space, and when developing or testing backup processes. In deployment, store backups remotely to avoid losing your data and backups in the same crash.

Back up data (server task)

When you schedule a backup as a server task, the DS server manages task completion. The server must be running when you schedule the task, and when the task runs:

Schedule the task on a running server, binding as a user with the backend-backup administrative privilege.

The following example schedules an immediate backup task for the dsEvaluation backend:
```
$ dsbackup \
 create \
 --hostname localhost \
 --port 4444 \
 --bindDN uid=admin \
 --bindPassword password \
 --trustStorePath /path/to/opendj/config/keystore \
 --trustStoreType PKCS12 \
 --trustStorePassword:file /path/to/opendj/config/keystore.pin \
 --backupLocation bak \
 --backendName dsEvaluation
```
To back up all backends, omit the --backendName option.

To back up more than one backend, specify the --backendName option multiple times.

For details, refer to dsbackup.

Back up data (scheduled task)

When you schedule a backup as a server task, the DS server manages task completion. The server must be running when you schedule the task, and when the task runs:

Schedule backups using the crontab format with the --recurringTask option.

The following example schedules nightly online backup of all user data at 2 AM, notifying diradmin@example.com when finished, or on error:

```
$ dsbackup \
 create \
 --hostname localhost \
 --port 4444 \
 --bindDN uid=admin \
 --bindPassword password \
 --trustStorePath /path/to/opendj/config/keystore \
 --trustStoreType PKCS12 \
 --trustStorePassword:file /path/to/opendj/config/keystore.pin \
 --backupLocation bak \
 --recurringTask "00 02 * * *" \
 --description "Nightly backup at 2 AM" \
 --taskId NightlyBackup \
 --completionNotify diradmin@example.com \
 --errorNotify diradmin@example.com
```

For details, refer to dsbackup.

Use the manage-tasks command to manage scheduled tasks. For background, read Server tasks. For an example command, refer to Status and tasks.
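
The --recurringTask value in the example follows the usual five-field crontab layout: minute, hour, day of month, month, and day of week. As a reading aid, the following sketch splits a schedule into those fields. The function name is hypothetical, and the sketch only labels the fields; it does not validate them.

```shell
# Label the five fields of a crontab-style schedule.
describe_schedule() {
  # read splits on whitespace without glob-expanding the asterisks.
  read -r minute hour dom month dow <<EOF
$1
EOF
  printf 'minute=%s hour=%s day-of-month=%s month=%s day-of-week=%s\n' \
    "$minute" "$hour" "$dom" "$month" "$dow"
}

describe_schedule "00 02 * * *"
# prints: minute=00 hour=02 day-of-month=* month=* day-of-week=*
```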

Back up data (external command)

When you back up data without contacting the server, the dsbackup create command runs as an external command, independent of the server process. It backs up the data whether the server is running or not.

When you back up LDIF-based backends with this method, the command does not lock the files. To avoid corrupting the backup files, do not run the dsbackup create --offline command on an LDIF backend simultaneously with any changes to the backend.

This applies to LDIF backends, schema files, and the task backend, for example.

Use this method to schedule backup with a third-party tool, such as the cron command:

Back up data without contacting the server process, and use the --offline option.

The following example backs up the dsEvaluation backend immediately:
```
$ dsbackup \
 create \
 --offline \
 --backupLocation bak \
 --backendName dsEvaluation
```
To back up all backends, omit the --backendName option.

To back up more than one backend, specify the --backendName option multiple times.

For details, refer to dsbackup.
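
To make the repeated --backendName option concrete, the following sketch prints, without running it, an offline backup command for several backends. The helper name and the second backend name are hypothetical; only options shown in the examples above are used.

```shell
# Print (not run) a dsbackup command that backs up several backends offline.
# Usage: offline_backup_cmd BACKUP_LOCATION BACKEND...
offline_backup_cmd() {
  cmd="dsbackup create --offline --backupLocation $1"
  shift
  for backend in "$@"; do
    # One --backendName option per backend.
    cmd="$cmd --backendName $backend"
  done
  printf '%s\n' "$cmd"
}

# "otherBackend" is a hypothetical second backend name.
offline_backup_cmd bak dsEvaluation otherBackend
# prints: dsbackup create --offline --backupLocation bak --backendName dsEvaluation --backendName otherBackend
```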

Back up configuration files

When you back up directory data using the dsbackup command, you do not back up server configuration files. The server stores configuration files under the /path/to/opendj/config/ directory.

The server records snapshots of its configuration under the /path/to/opendj/var/ directory. You can use snapshots to recover from misconfiguration performed with the dsconfig command. Snapshots only reflect the main configuration file, config.ldif.

Stop the server:
```
$ stop-ds
```
Back up the configuration files:
```
$ tar -zcvf backup-config-$(date +%s).tar.gz config
```
By default, this backup includes the server keystore, so store it securely.

Start the server:
```
$ start-ds
```
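
The tar command above appears to assume that you run it from the server installation directory. A hedged variant that works from any directory, and reads the archive back as a quick sanity check, might look like the following. The function name is hypothetical; stop the server first, as in the steps above.

```shell
# Archive the config directory of a DS installation and check the archive is readable.
# Usage: archive_config INSTALL_DIR OUTPUT_DIR
archive_config() {
  archive="$2/backup-config-$(date +%s).tar.gz"
  # -C switches to the installation directory, so archived paths start with config/.
  tar -zcf "$archive" -C "$1" config || return 1
  # Listing the archive confirms that tar can read it back.
  tar -ztf "$archive" > /dev/null || return 1
  # Print the archive path so the caller can store it securely.
  printf '%s\n' "$archive"
}

# Usage: archive_config /path/to/opendj /path/to/secure/storage
```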

Back up using snapshots

Use the dsbackup command when possible for backup and restore operations. You can use snapshot technology as an alternative to the dsbackup command, but you must be careful how you use it.

While DS directory servers are running, database backend cleanup operations write data even when there are no pending client or replication operations. An ongoing file system backup operation may record database log files that are not in sync with each other.

Successful recovery after restore is only guaranteed under certain conditions.

The snapshots must:

Be atomic, capturing the state of all files at exactly the same time.

If you are not sure that the snapshot technology is atomic, do not use it. Use the dsbackup command instead.

For example, Kubernetes deployments can use volume snapshots when the underlying storage supports atomic snapshots. Learn more in Backup and restore using volume snapshots.

In contrast, do not use VMware snapshots to back up a running DS server.

Capture the state of all data (db/) and changelog (changelogDb/) files together.

When using a file system-level snapshot feature, for example, keep at least all data and changelog files on the same file system. This is the case in a default server setup.

Be paired with a specific server configuration.

A snapshot of all files includes configuration files that may be specific to one DS server, and cannot be restored safely on another DS server with a different configuration. If you restore all system files, this principle applies to system configuration as well.

For details on making DS configuration files as generic as possible, refer to Property value substitution.

If snapshots in your deployment do not meet these criteria, you must stop the DS server before taking the snapshot. You must also take care not to restore incompatible configuration files.
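
If you must stop the server to take a snapshot, a small wrapper can make sure the server is restarted even when the snapshot step fails. The following is a minimal sketch under that assumption. The cold_snapshot helper is hypothetical, and the snapshot command you pass to it depends entirely on your platform.

```shell
STOP_DS="${STOP_DS:-stop-ds}"
START_DS="${START_DS:-start-ds}"

# Stop DS, run the given snapshot command, then start DS again.
# Usage: cold_snapshot SNAPSHOT_COMMAND [ARG...]
cold_snapshot() {
  "$STOP_DS" --quiet || return 1
  "$@"
  status=$?
  # Restart the server whether or not the snapshot succeeded.
  "$START_DS" --quiet || return 1
  # Report the result of the snapshot command itself.
  return "$status"
}
```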

Backup and restore options

| | `dsbackup` commands | Snapshots |
| --- | --- | --- |
| What is backed up | DS backend data only | Potentially everything; at minimum DS backends, changelogs |
| Incremental backups | Yes | Depends on the snapshot tools |
| Portability | Yes; restore backend data on any DS of the same major/minor version | Depends; potentially limited to the same environment as with Kubernetes volume snapshots |
| Disaster recovery | Optimal; restore data and delete old changelog | Potentially restores changelog only to clear it during recovery |
| Recover single server | Potentially slower while rebuilding the local changelog; impacts the change number index (if enabled) | Optimal; restores everything to the previous state |
| Choice of what to restore | Good; you choose which backends to restore | Bad; you restore the file system, potentially rolling back multiple backends at once |
| Ease of use | Medium; you must understand `dsbackup` commands and choose what to restore | Medium; you must understand platform tools and impact of restoring everything at once |

Restore

After you restore a replicated backend, replication brings it up to date with changes newer than the backup. Replication uses internal change log records to determine which changes to apply. This process happens even if you only have a single server that you configured for replication at setup time (by setting the replication port with the --replicationPort port option). To prevent replication from replaying changes newer than the backup you restore, refer to Disaster recovery.

Replication purges internal change log records, however, to prevent the change log from growing indefinitely. Replication can only bring the backend up to date if the change log still includes the last change backed up.

For this reason, when you restore a replicated backend from backup, the backup must be newer than the last purge of the replication change log (default: 3 days).

If no backups are newer than the replication purge delay, do not restore from a backup. Initialize the replica instead, without using a backup. For details, refer to Manual initialization.
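
The timing rule above reduces to simple arithmetic. The following sketch is a conservative check, assuming you know when the backup was taken as a Unix timestamp: it treats a backup as usable only if its age is less than the purge delay. The function name is hypothetical, and the three-day default mirrors the default purge delay mentioned above.

```shell
# Succeed if a backup is newer than the replication purge delay.
# Usage: within_purge_delay BACKUP_EPOCH NOW_EPOCH [PURGE_DELAY_DAYS]
within_purge_delay() {
  delay_days=${3:-3}            # default purge delay: 3 days
  age=$(( $2 - $1 ))            # backup age in seconds
  [ "$age" -lt $(( delay_days * 86400 )) ]
}

# Usage: within_purge_delay "$backup_epoch" "$(date +%s)" || echo "Initialize the replica instead"
```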

Restore data (server task)

Verify the backup you intend to restore.

The following example verifies the most recent backup of the dsEvaluation backend:
```
$ dsbackup \
 list \
 --backupLocation bak \
 --backendName dsEvaluation \
 --last \
 --verify
```
Schedule the restore operation as a task, binding as a user with the backend-restore administrative privilege.

The following example schedules an immediate restore task for the dsEvaluation backend:
```
$ dsbackup \
 restore \
 --hostname localhost \
 --port 4444 \
 --bindDN uid=admin \
 --bindPassword password \
 --trustStorePath /path/to/opendj/config/keystore \
 --trustStoreType PKCS12 \
 --trustStorePassword:file /path/to/opendj/config/keystore.pin \
 --backupLocation bak \
 --backendName dsEvaluation
```
To restore the latest backups of more than one backend, specify the --backendName option multiple times.

To restore a specific backup, specify the --backupId option. To restore multiple specific backups of different backends, specify the --backupId option multiple times.

To list backup information without performing verification, use the dsbackup list command without the --verify option. The output includes backup IDs for use with the --backupId option.

For details, refer to dsbackup.

Restore data (external command)

Stop the server if it is running:
```
$ stop-ds --quiet
```
Verify the backup you intend to restore.

The following example verifies the most recent backup of the dsEvaluation backend:
```
$ dsbackup \
 list \
 --backupLocation bak \
 --backendName dsEvaluation \
 --last \
 --verify
```
Restore using the --offline option.

The following example restores the dsEvaluation backend:
```
$ dsbackup \
 restore \
 --offline \
 --backupLocation bak \
 --backendName dsEvaluation
```
To restore the latest backups of more than one backend, specify the --backendName option multiple times.

To restore a specific backup, specify the --backupId option. To restore multiple specific backups of different backends, specify the --backupId option multiple times.

To list backup information without performing verification, use the dsbackup list command without the --verify option. The output includes backup IDs for use with the --backupId option.

For details, refer to dsbackup.

Start the server:
```
$ start-ds --quiet
```

Restore configuration files

Stop the server:
```
$ stop-ds --quiet
```
Restore the configuration files from the backup, overwriting existing files:
```
$ tar -zxvf backup-config-<date>.tar.gz
```
Start the server:
```
$ start-ds --quiet
```

Restore from a snapshot

Use the dsbackup command when possible for backup and restore operations.

You can use snapshot technology as an alternative to the dsbackup command, but you must be careful how you use it. For details, refer to Back up using snapshots.

Take the following points into account before restoring a snapshot:

When you restore files for a replicated backend, the snapshot must be newer than the last purge of the replication change log (default: 3 days).

Stop the DS server before you restore the files.

The DS configuration files in the snapshot must match the configuration where you restore the snapshot.

If the configuration uses expressions, define their values for the current server before starting DS.

When using snapshot files to initialize replication, only restore the data (db/) files for the target backend.

Depending on the snapshot technology, you might need to restore the files separately, and then move only the target backend files from the restored snapshot.

When using snapshot files to restore replicated data to a known state, stop all affected servers before you restore.

Purge old files

Periodically purge old backup files with the dsbackup purge command. The following example removes all backup files older than the default replication purge delay:

```
$ dsbackup \
 purge \
 --offline \
 --backupLocation bak \
 --olderThan 3d
```

This example runs the external command without contacting the server process. You can also purge backups by ID, or by backend name, and you can specify the number of backups to keep. For details, refer to dsbackup.

To purge files as a server task, use the task options, such as --recurringTask. The user must have the backend-backup administrative privilege to schedule a purge task.
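
When you drive backup and purge from a third-party scheduler, one reasonable pattern is to purge only after a new backup has succeeded, so a failed backup never shrinks the set of usable backups. The following sketch assumes that approach. The function name is hypothetical, and the DSBACKUP variable only exists so that you can point the sketch at the full path to the command.

```shell
DSBACKUP="${DSBACKUP:-dsbackup}"

# Back up all backends offline, then purge old backups only if the backup succeeded.
# Usage: backup_then_purge BACKUP_LOCATION MAX_AGE   (for example: bak 3d)
backup_then_purge() {
  # Omitting --backendName backs up all backends.
  "$DSBACKUP" create --offline --backupLocation "$1" || return 1
  "$DSBACKUP" purge --offline --backupLocation "$1" --olderThan "$2"
}
```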

Cloud storage

You can push backup files to cloud storage and restore them from cloud storage.

Mount the cloud storage as a local filesystem and use the mount point as a local backup location. This approach works with the same commands and procedures as for local backups.

To mount the cloud storage as a local filesystem, use a third-party tool suited to your cloud storage provider.

Ping Identity supports the DS backup and restore commands, not the third-party cloud storage filesystem tools.

Efficiently store backup files

DS backups are collections of files in a backup directory. To restore from backup, DS requires a coherent collection of backup files.

You can use the dsbackup command to purge stale backup files from a backup directory. When you purge stale backup files, the command leaves a coherent collection of files you can use to restore data.

You should also store copies of backup files remotely to guard against the loss of data in a disaster.

Remote storage

Perform the following steps to store copies of backup files remotely in an efficient way. These steps address backup of directory data, which is potentially very large, not backup of configuration data, which is almost always small:

Choose a local directory or local network directory to hold backup files.

Alternatively, you can back up to cloud storage.
Schedule a regular backup task to back up files to the directory you chose.

Make sure that the backup task runs more often than the replication purge delay. For example, schedule the backup task to run every three hours for a default purge delay of three days. Each time the task runs, it backs up only new directory backend files.

For details, refer to the steps for backing up directory data.
Store copies of the local backup files at a remote location for safekeeping:
1. Purge old files in the local backup directory.
  
  As described in How backup works, DS backups are cumulative in nature; DS reuses common data that has not changed from previous backup operations when backing up data again. The set of backup files is fully standalone.
  
  The purge removes stale files without impacting the integrity of newer backups, reducing the volume of backup files to store when you copy files remotely.
2. Regularly copy the backup directory and all the files it holds to a remote location.
  
  For example, copy all local backup files every day to a remote directory called bak-date:
  ```
  $ ssh user@remote-storage mkdir /path/to/bak-date
  $ scp -r /path/to/bak/* user@remote-storage:/path/to/bak-date/
  ```
Remove old bak-date directories from remote storage in accordance with the backup policy for the deployment.
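
As one way to derive the bak-date name in the copy step, the following sketch prints the two commands for a dated remote copy without running them. The helper name and the YYYY-MM-DD date format are assumptions; adapt them to the backup policy for the deployment.

```shell
# Print the commands for one dated remote copy, without running them.
# Usage: remote_copy_cmds USER@HOST REMOTE_PARENT [DATE]
remote_copy_cmds() {
  # Default to today's date when no date is given.
  stamp=${3:-$(date +%Y-%m-%d)}
  dest="$2/bak-$stamp"
  printf 'ssh %s mkdir -p %s\n' "$1" "$dest"
  # scp -r copies directories recursively.
  printf 'scp -r /path/to/bak/* %s:%s/\n' "$1" "$dest"
}

# Usage: remote_copy_cmds user@remote-storage /path/to
```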

Restore from remote backup

For each DS directory server to restore:

Install DS using the same cryptographic keys and deployment ID.

Backup files are protected using keys derived from the DS deployment ID and password. You must use the same ones when recovering from a disaster.
Restore configuration files.
Restore directory data from the latest remote backup folder.

After restoring all directory servers, validate that the restore procedure was a success.
