The Directory Server transform-ldif tool provides backward compatibility with the former scramble-ldif tool, with additional functionality for configuring input and output files. The transform-ldif tool reads data from one or more source LDIF files and writes the transformed data to a single output file.

Scrambling data with this tool obscures the values of selected attributes so that the original values in the source data are difficult to determine, while preserving the characteristics of the associated attribute syntax. Apart from Boolean values, which are covered in the guidelines that follow, the process is repeatable: if the same value appears multiple times, it yields the same scrambled representation each time. Scrambling can be applied to both LDIF entries and LDIF change records.
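The repeatability property can be illustrated with a minimal Python sketch. It is not the tool's implementation: the function name scramble_generic, the use of SHA-256, and the way the seed is combined with the value are all assumptions chosen for illustration. The sketch replaces each letter or digit with a random character of the same class and derives the random generator from the seed and the value, so that equal inputs produce equal outputs.

```python
import hashlib
import random
import string


def scramble_generic(value: str, seed: int = 0) -> str:
    """Replace ASCII letters and digits with random ones of the same class."""
    # Assumed mechanism: seed the generator from the global seed plus the
    # value itself, so the same input always yields the same output.
    digest = hashlib.sha256(f"{seed}:{value}".encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest, "big"))

    out = []
    for ch in value:
        if ch in string.ascii_lowercase:
            out.append(rng.choice(string.ascii_lowercase))
        elif ch in string.ascii_uppercase:
            out.append(rng.choice(string.ascii_uppercase))
        elif ch in string.digits:
            out.append(rng.choice(string.digits))
        else:
            out.append(ch)  # punctuation and other characters are kept
    return "".join(out)


if __name__ == "__main__":
    first = scramble_generic("John.Doe-42")
    second = scramble_generic("John.Doe-42")
    print(first, second, first == second)
```

Because the output depends only on the seed and the input value, two occurrences of the same value in different entries scramble identically, which keeps cross-references between entries consistent.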

The process of scrambling data is not the same as encryption. It should only be used to provide simple obfuscation of data. The following are general guidelines for scrambling attributes:
  • If the attribute is userPassword and its value starts with a scheme name surrounded by curly braces, such as "{SSHA256}XrgyNdl3fid7KYdhd/Ju47KJQ5PYZqlUlyzxQ28f/QXUnNd9fupj9g==", the scheme name will be left unchanged and the rest of the value will be treated like a generic string.
  • If the attribute is authPassword and its value contains at least two dollar signs, such as "SHA256$QGbHtDCi1i4=$8/X7XRGaFCovC5mn7ATPDYlkVoocDD06Zy3lbD4AoO4=", the portion up to the first dollar sign (which represents the name of the encoding scheme) is preserved and the remainder of the value is treated like a generic string.
  • If an attribute has a Boolean syntax, the scrambled value will be either TRUE or FALSE. The choice between TRUE and FALSE is random, so scrambling Boolean values is not repeatable. Randomizing Boolean values preserves the syntax while still obscuring the original value.
  • If an attribute has a distinguished name syntax (or a related syntax, such as a name and optional UID), scrambling is applied to the values of RDN components for any attributes to be scrambled. For example, if the tool is configured to scramble both the member and uid attributes, and an entry has a member attribute with a value of "uid=john.doe,ou=People,dc=example,dc=com", that member value will be scrambled in a way that only obscures the "john.doe" portion but leaves the attribute names and all values of non-scrambled attributes intact.
  • If an attribute has a generalized time syntax, that value is replaced with a randomized timestamp using the same format (the same number of digits and the same time zone indicator). The randomization will be over a time range that is double the difference between the time the transform-ldif tool was launched and the timestamp to be scrambled. For values where that time difference is less than one day, one day will be added to the difference before it is doubled.
  • If an attribute has an integer, numeric string, or telephone number syntax, scrambling is only applied to numeric digits while all other characters are left intact. If there are multiple digits, then the first digit will be nonzero.
  • If an attribute has an octet string syntax, it is scrambled as follows:
    • Each byte that represents a lowercase ASCII letter is replaced with a randomly-selected lowercase ASCII letter.
    • Each byte that represents an uppercase ASCII letter is replaced with a randomly-selected uppercase ASCII letter.
    • Each byte that represents an ASCII digit is replaced with a randomly-selected ASCII digit.
    • Each byte that represents a printable ASCII symbol is replaced with a randomly-selected printable ASCII symbol.
    • Each byte that represents an ASCII control character is replaced with a randomly-selected ASCII letter, digit, or symbol.
    • Each non-ASCII byte will be replaced with a randomly-selected non-ASCII byte.
  • If an attribute has a value that represents a valid JSON object, the resulting value will also be a JSON object. All field names will be left intact, and only the values of those fields may be scrambled. If the --scrambleJSONField argument is provided, only the specified fields will have values scrambled. Otherwise, the values of all fields will be scrambled. Field values are scrambled as follows:
    • Null values are not scrambled.
    • Boolean values are replaced with randomly-selected Boolean values. As with attributes with a Boolean syntax, these values are non-repeatable.
    • Number values will have only their digits replaced with randomly-selected digits and all other characters (minus sign, decimal point, exponentiation indicator) are left unchanged.
    • String values will be replaced with a randomly-selected generic string.
    • Array values have scrambling applied as appropriate for each value in the array. If the array field itself should be scrambled, then all values in the array are scrambled. Otherwise, only JSON objects contained inside the array have scrambling applied to appropriate fields.
    • JSON object values have scrambling applied as appropriate for their fields.
  • If an attribute does not match any of the previous criteria, it is scrambled as follows:
    • Each lowercase ASCII letter is replaced with a randomly-selected lowercase ASCII letter.
    • Each uppercase ASCII letter is replaced with a randomly-selected uppercase ASCII letter.
    • Each ASCII digit is replaced with a randomly-selected ASCII digit.
    • All other characters are left unchanged.
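
Several of the guidelines above can be sketched in Python. This is an illustrative approximation under stated assumptions, not the tool's actual code: every function name is hypothetical, the random generator is passed in explicitly, and taking the absolute value of the time difference (to cover timestamps later than the launch time) is an assumption that the guidelines do not address.

```python
import random
import string
from datetime import datetime, timedelta

DIGITS = "0123456789"


def scramble_generic(value: str, rng: random.Random) -> str:
    """Generic string rule: swap letters and digits within their own class."""
    classes = (string.ascii_lowercase, string.ascii_uppercase, DIGITS)
    out = []
    for ch in value:
        pool = next((c for c in classes if ch in c), None)
        out.append(rng.choice(pool) if pool else ch)
    return "".join(out)


def scramble_digits(value: str, rng: random.Random) -> str:
    """Integer, numeric string, and telephone number rule: digits only."""
    multiple = sum(ch in DIGITS for ch in value) > 1
    out, first = [], True
    for ch in value:
        if ch in DIGITS:
            # With more than one digit, the first digit must be nonzero.
            pool = "123456789" if (first and multiple) else DIGITS
            out.append(rng.choice(pool))
            first = False
        else:
            out.append(ch)  # spaces, plus signs, dashes are left intact
    return "".join(out)


def scramble_user_password(value: str, rng: random.Random) -> str:
    """userPassword rule: keep a leading {SCHEME} name, scramble the rest."""
    if value.startswith("{") and "}" in value:
        end = value.index("}") + 1
        return value[:end] + scramble_generic(value[end:], rng)
    return scramble_generic(value, rng)


def scramble_auth_password(value: str, rng: random.Random) -> str:
    """authPassword rule: keep the text up to the first dollar sign."""
    if value.count("$") >= 2:
        scheme, rest = value.split("$", 1)
        return scheme + "$" + scramble_generic(rest, rng)
    return scramble_generic(value, rng)


def time_range_width(launched: datetime, original: datetime) -> timedelta:
    """Width of the randomization range for a generalized time value."""
    diff = abs(launched - original)  # abs() is an assumption
    if diff < timedelta(days=1):
        diff += timedelta(days=1)  # pad short differences before doubling
    return 2 * diff


if __name__ == "__main__":
    rng = random.Random(0)
    print(scramble_digits("+1 512 555 0100", rng))
    print(scramble_user_password("{SSHA256}AbC123==", rng))
    print(time_range_width(datetime(2024, 1, 10), datetime(2024, 1, 9, 12)))
```

In the last call the timestamps are 12 hours apart, so one day is added to give 36 hours, and doubling yields a 72-hour range. Where that range is positioned relative to the original timestamp is not stated in the guidelines, so the sketch computes only its width.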
The following example reads from an LDIF file named original.ldif, scrambles the values of the telephoneNumber, mobile, and homeTelephoneNumber attributes, and writes the results to scrambled.ldif:
$ bin/transform-ldif --sourceLDIF original.ldif \
  --targetLDIF scrambled.ldif \
  --scrambleAttribute telephoneNumber \
  --scrambleAttribute mobile \
  --scrambleAttribute homeTelephoneNumber \
  --randomSeed 0