Importing data
The PingDirectory server provides initialization mechanisms to import database files.
The import-ldif command line tool imports data from an LDIF file. You can import all of the entries in the LDIF file or only a portion of them: a subset of the entries, a subset of the attributes within entries, or both. The command also supports importing data that has been compressed, encrypted, digitally signed, or any combination of these.
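For example, an offline import of only part of a data set might look like the following sketch. The --ldifFile option matches the validate-ldif example later in this section, but the --backendID, --includeBranch, and --excludeAttribute option names are assumptions here, so confirm them with import-ldif --help:
$ bin/import-ldif --backendID userRoot \
    --ldifFile /path/to/data.ldif \
    --includeBranch "ou=People,dc=example,dc=com" \
    --excludeAttribute userPassword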
You can run import-ldif with the server offline or online. If the server is online, administrators can initiate the import from a local or remote client. The LDIF file that contains the import data must exist on the server system. During an online import, the target database repository, or backend, is removed from service, and the data held in that backend is not available to clients.
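For an online import, the tool runs the operation as an administrative task against the running server. The following is a sketch only; the connection and task arguments shown (--task, --hostname, --port, --bindDN, --bindPassword) are assumptions to verify against import-ldif --help for your version:
$ bin/import-ldif --task \
    --hostname ds1.example.com --port 389 \
    --bindDN "cn=Directory Manager" --bindPassword password \
    --backendID userRoot \
    --ldifFile /path/to/data.ldif
Remember that the LDIF file path refers to the server's file system, not the client's.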
The tool also rejects any entries that contain duplicate values within the same attribute.
Validating an LDIF file
Before importing data, you can validate an import file using the PingDirectory server’s validate-ldif tool.
About this task
The tool binds to the PingDirectory server locally or remotely and validates the LDIF file to determine whether it violates the server’s schema. Elements that do not conform to the schema are rejected and written to standard output. You can specify the output file path to which the rejected entries and the reasons for their rejection are written. The validate-ldif tool works with regular non-compressed LDIF files or gzip-compressed LDIF files.
Steps
- To validate an LDIF file, run the validate-ldif tool. Make sure the server is online before running this command.
To process large files faster, you can set the number of threads for validation (see the example after these steps). The tool also provides options to skip specified schema elements if you are only validating certain items, such as attributes only.
Example:
$ bin/validate-ldif --ldifFile /path/to/data.ldif \
    --rejectFile rejectedEntries
- Optional: To view the arguments, use the --help option.
Result:
1 of 200 entries (0 percent) were found to be invalid. 1 undefined attributes were encountered. Undefined attribute departmentname was encountered 1 times.
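For large files, a multi-threaded validation run might look like the following sketch. The --numThreads option name is an assumption based on the thread-count note above; confirm it with validate-ldif --help:
$ bin/validate-ldif --ldifFile /path/to/data.ldif \
    --rejectFile rejectedEntries \
    --numThreads 8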
About the database cache estimate
After successful completion of an import, the import-ldif command lists detailed information about the database cache usage characteristics of the imported data set.
To guide decisions about changing the Java virtual machine (JVM) size and the database-cache-percent setting for the backend, the estimate considers the current server configuration along with the capabilities of the underlying hardware.
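As a sketch of acting on that guidance, the backend cache allocation can typically be adjusted with the dsconfig tool. The set-backend-prop subcommand and the db-cache-percent property name are assumptions here (the text above refers to it as database-cache-percent), so verify both against dsconfig --help for your backend type:
$ bin/dsconfig set-backend-prop \
    --backend-name userRoot \
    --set db-cache-percent:35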
The /logs/tools directory contains additional files that describe the database cache characteristics in more detail.
Tracking skipped and rejected entries
During an import, entries are skipped if they do not belong in the specified backend or if they are part of an excluded base distinguished name (DN) or filter.
Steps
- To write skipped entries to a specified file, use the --skipFile {path} option on the command line. You can add a comment indicating why the entries were skipped.
- To write information about rejected entries and the reasons for rejection to a specified file, use the --rejectFile {path} option (see the combined example after this list). An entry can be rejected if:
  - It violates the server’s schema constraints.
  - Its parent entry does not exist.
  - Another entry already exists with the same DN.
  - It was rejected by a plugin.
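A combined sketch that captures both skipped and rejected entries during an offline import follows; the option names mirror the steps above, but confirm them with import-ldif --help:
$ bin/import-ldif --backendID userRoot \
    --ldifFile /path/to/data.ldif \
    --skipFile /path/to/skipped-entries.ldif \
    --rejectFile /path/to/rejected-entries.ldif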