PingDirectory

Log sanitization options

There are six options you can use for sanitizing access log fields as they are being written.

Option Description

Preserve

Preserves the original value without attempting to obscure it.

Some sanitization might be performed, such as truncating very long values and escaping special characters, but this is more for usability than privacy.

Omit

Excludes the field completely from the log. Neither the field name nor its value will be present in the log.

Redact Entire Value

Includes the field name in log messages, but replaces the entire value with a fixed string that does not reveal anything about the actual value for the field.

To attempt to preserve the original syntax of a field value, the actual string that is used for the redacted value of a field depends on the syntax of that field. The redacted strings include:

String, string list, and Boolean fields

These fields will have a redacted value of {REDACTED}.

Because Boolean values can only be true or false, redacted and tokenized representations of Boolean values will not conform to the Boolean syntax. If there is concern about leaking sensitive information through a Boolean field, you should omit the field.

Integer fields

These fields will have a redacted value of -999999999999999999.

Floating-point fields

These fields will have a redacted value of -999999.999999.

Distinguished name (DN) fields

These fields will have a redacted value of redacted={REDACTED}.

Filter fields

These fields will have a redacted value of (redacted={REDACTED}).

JSON object fields

These fields will have a redacted value of { "redacted":"{REDACTED}" }.

Generalized time fields

These fields will have a redacted value of 99990101000000.000Z.

RFC 3339 timestamp fields

These fields will have a redacted value of 9999-01-01T00:00:00.000Z.
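The placeholder values above amount to a lookup keyed by field syntax. The following Python sketch is only an illustration of that idea: the dictionary keys and the `redact_entire_value` helper are hypothetical names chosen for this example, not server identifiers. The placeholder strings themselves are copied from the list above.

```python
# Placeholder strings copied from the list above. The dictionary keys are
# hypothetical labels for this sketch, not PingDirectory identifiers.
REDACTED_PLACEHOLDERS = {
    "string": "{REDACTED}",
    "string-list": "{REDACTED}",
    "boolean": "{REDACTED}",
    "integer": "-999999999999999999",
    "floating-point": "-999999.999999",
    "dn": "redacted={REDACTED}",
    "filter": "(redacted={REDACTED})",
    "json-object": '{ "redacted":"{REDACTED}" }',
    "generalized-time": "99990101000000.000Z",
    "rfc-3339-timestamp": "9999-01-01T00:00:00.000Z",
}


def redact_entire_value(syntax: str, value: str) -> str:
    """Return the fixed placeholder for a syntax; the value is never used."""
    return REDACTED_PLACEHOLDERS[syntax]
```

Because the original value is ignored, two different values of the same syntax always produce the same output, which is why a fully redacted field reveals nothing about the actual value.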

Redact Value Components

Includes the field name in log messages but redacts only certain components of the value rather than the entire value.

This preserves some usability of the original value without exposing sensitive information.

Syntaxes that support redacted value components include:

String list

Each element in the list will be replaced with {REDACTED}, so a list with three elements will be represented as {REDACTED},{REDACTED},{REDACTED}.

DN

Attribute names will be preserved, but attribute values will be redacted. For example, the DN uid=jdoe,ou=People,dc=example,dc=com will be redacted as uid={REDACTED},ou={REDACTED},dc={REDACTED},dc={REDACTED}.

Filter

As with DNs, attribute names will be preserved, but attribute values will be redacted. For example, the filter (&(givenName=John)(sn=Doe)) would be redacted as (&(givenName={REDACTED})(sn={REDACTED})).

You can configure the DN and filter log field syntaxes to include or exclude specified attributes from redaction while preserving the other attribute values:

  • To specify a set of included sensitive attributes so that only those attributes will have their values redacted, and to preserve all other attribute values, use the included-sensitive-attribute property.

  • To specify a set of excluded sensitive attributes so that only the values of the excluded attributes will be preserved, and to redact all other attribute values, use the excluded-sensitive-attribute property.
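The effect of the two properties on a DN can be sketched in Python as follows. This is a minimal illustration, not the server's implementation: `redact_dn_components` and its parameters are hypothetical names, the parsing is a naive split on `,` and `=` that ignores escaped characters and multivalued RDNs, and giving the included set precedence when both sets are supplied is an assumption of this sketch.

```python
from typing import Optional, Set

REDACTED = "{REDACTED}"


def redact_dn_components(dn: str,
                         included: Optional[Set[str]] = None,
                         excluded: Optional[Set[str]] = None) -> str:
    """Redact attribute values in a DN while keeping attribute names.

    Naive parsing: escaped commas and multivalued RDNs are not handled.
    """
    included_lc = {a.lower() for a in included} if included else None
    excluded_lc = {a.lower() for a in excluded} if excluded else None
    parts = []
    for rdn in dn.split(","):
        name, _, value = rdn.partition("=")
        key = name.strip().lower()
        if included_lc is not None:
            # Only the listed attributes are treated as sensitive.
            sensitive = key in included_lc
        elif excluded_lc is not None:
            # Everything except the listed attributes is sensitive.
            sensitive = key not in excluded_lc
        else:
            # No configuration: every attribute value is redacted.
            sensitive = True
        parts.append(name + "=" + (REDACTED if sensitive else value))
    return ",".join(parts)
```

With no sets supplied, `uid=jdoe,ou=People,dc=example,dc=com` becomes `uid={REDACTED},ou={REDACTED},dc={REDACTED},dc={REDACTED}`, matching the DN example above. Passing `included={"uid"}` redacts only the `uid` value, while `excluded={"dc"}` preserves only the `dc` values.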

JSON objects

Field names will be preserved but field values will be replaced with the string {REDACTED}.

You can configure the JSON log field syntax to include or exclude specified fields from redaction while preserving the other field values:

  • To specify a set of included sensitive fields so that only those fields will have their values redacted, and to preserve all other field values, use the included-sensitive-field property.

  • To specify a set of excluded sensitive fields so that only the values of the excluded fields will be preserved, and to redact the values of all other fields, use the excluded-sensitive-field property.
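The same include and exclude logic applied to a JSON object might look like the sketch below. It is illustrative only: `redact_json_fields` is a hypothetical helper, it considers top-level fields only, it matches field names case-sensitively, and it lets the included set take precedence if both sets are given. None of these choices is confirmed by this page.

```python
import json
from typing import Optional, Set

REDACTED = "{REDACTED}"


def redact_json_fields(text: str,
                       included: Optional[Set[str]] = None,
                       excluded: Optional[Set[str]] = None) -> str:
    """Redact top-level field values of a JSON object, keeping field names."""
    obj = json.loads(text)
    result = {}
    for name, value in obj.items():
        if included:
            # Only the listed fields are treated as sensitive.
            sensitive = name in included
        elif excluded:
            # Every field except the listed ones is sensitive.
            sensitive = name not in excluded
        else:
            # No configuration: every field value is redacted.
            sensitive = True
        result[name] = REDACTED if sensitive else value
    return json.dumps(result)
```

For example, calling the helper on `{"name": "John", "ssn": "123"}` with `included={"ssn"}` keeps the `name` value and replaces only the `ssn` value with `{REDACTED}`.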

For syntaxes whose values do not have components, the entire value is redacted.

Tokenize Entire Value

Includes the field name in log messages but replaces the entire value with a tokenized representation that is based on the original value but does not reveal it.

This is similar to redacting the entire value, but because the tokenized value is based on the original value, and the logic used to perform the tokenization is repeatable, you can identify cases in which the same value appears multiple times in the log even if you do not know the original value.

  • String, string list, and Boolean fields will have a tokenized value of {TOKENIZED:<token-value>}, where <token-value> is the actual value for the token generated from the original value (for example, {TOKENIZED:NcMNHJxmCwhETFIe}).

  • Integer fields will have a tokenized value that starts with -999999999 and is followed by nine more digits generated from the original value (for example, -999999999600205901).

  • Floating-point fields will have a tokenized value that starts with -999999. and is followed by six digits generated from the original value (for example, -999999.695201).

  • DN fields will have a tokenized value of tokenized={TOKENIZED:<token-value>}, where <token-value> is the value generated from the normalized string representation of the original DN.

  • Filter fields will have a tokenized value of (tokenized={TOKENIZED:<token-value>}), where <token-value> is the value generated from the normalized string representation of the original filter.

  • JSON object fields will have a tokenized value of { "tokenized":"{TOKENIZED:<token-value>}" }, where <token-value> is the value generated from the normalized string representation of the original JSON object.

  • Generalized time and RFC 3339 timestamp fields will have a tokenized value with a year of 8888, with the month, day, hour, minute, second, and sub-second elements generated from the original timestamp.
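The key property of tokenization is repeatability: the same input always yields the same token. This page does not describe the algorithm the server uses, so the Python sketch below substitutes a keyed HMAC-SHA-256 digest purely to illustrate the idea. The function names, the key, and the 16-character token length are assumptions of this example. Only the output shapes (`{TOKENIZED:...}`, the `-999999999` prefix with nine digits, and the `-999999.` prefix with six digits) come from the list above.

```python
import base64
import hashlib
import hmac


def _digest(value: str, key: bytes) -> bytes:
    # Stand-in for the server's tokenization logic, which is not documented
    # here. A keyed hash is repeatable but does not reveal the input.
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()


def tokenize_string(value: str, key: bytes) -> str:
    token = base64.b64encode(_digest(value, key)).decode("ascii")[:16]
    return "{TOKENIZED:" + token + "}"


def tokenize_integer(value: int, key: bytes) -> str:
    # Fixed prefix followed by nine digits derived from the original value.
    digits = int.from_bytes(_digest(str(value), key)[:8], "big") % 10**9
    return "-999999999" + format(digits, "09d")


def tokenize_float(value: float, key: bytes) -> str:
    # Fixed prefix followed by six digits derived from the original value.
    digits = int.from_bytes(_digest(repr(value), key)[:8], "big") % 10**6
    return "-999999." + format(digits, "06d")
```

Because each function is deterministic for a given key, two log messages that contain the same original value carry the same token, which lets you correlate them without knowing the value itself.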

Tokenize Value Components

Behaves in the same way as Redact Value Components, but replaces each sensitive component with a tokenized string instead of the fixed {REDACTED} string.

Syntaxes that support tokenized value components include:

String list

Each element in the list will be tokenized individually.

DN

Attribute names will be preserved, but attribute values will be tokenized.

Filter

Attribute names will be preserved, but attribute values will be tokenized.

You can configure the DN and filter log field syntaxes to include or exclude specified attributes from tokenization while preserving the other attribute values:

  • To specify a set of included sensitive attributes so that only those attributes will have their values tokenized, and to preserve all other attribute values, use the included-sensitive-attribute property.

  • To specify a set of excluded sensitive attributes so that only the values of the excluded attributes will be preserved, and to tokenize all other attribute values, use the excluded-sensitive-attribute property.
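Tokenizing components rather than the whole value means the same attribute value can be recognized across different DNs. The sketch below illustrates this under stated assumptions: `tokenize_dn_components` is a hypothetical helper, HMAC-SHA-256 stands in for the server's undocumented tokenization logic, the DN parsing is a naive split that ignores escaping and multivalued RDNs, and only the included-attribute case is modeled.

```python
import base64
import hashlib
import hmac
from typing import Optional, Set


def tokenize_dn_components(dn: str, key: bytes,
                           included: Optional[Set[str]] = None) -> str:
    """Tokenize attribute values in a DN while keeping attribute names.

    If `included` is given, only those attributes are tokenized; otherwise
    every attribute value is tokenized.
    """
    included_lc = {a.lower() for a in included} if included else None
    parts = []
    for rdn in dn.split(","):
        name, _, value = rdn.partition("=")
        if included_lc is None or name.strip().lower() in included_lc:
            digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()
            # URL-safe Base64 avoids '+', which is significant in DNs.
            token = base64.urlsafe_b64encode(digest).decode("ascii")[:16]
            value = "{TOKENIZED:" + token + "}"
        parts.append(name + "=" + value)
    return ",".join(parts)
```

With `included={"uid"}`, the DNs `uid=jdoe,ou=People,dc=example,dc=com` and `uid=jdoe,ou=Admins,dc=example,dc=com` produce the same tokenized `uid` component while their remaining components stay readable.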

JSON objects

Field names will be preserved, but field values will be tokenized.

You can configure the JSON log field syntax to include or exclude specified fields from tokenization while preserving the other field values:

  • To specify a set of included sensitive fields so that only those fields will have their values tokenized, and to preserve all other field values, use the included-sensitive-field property.

  • To specify a set of excluded sensitive fields so that only the values of the excluded fields will be preserved, and to tokenize the values of all other fields, use the excluded-sensitive-field property.

After you have identified the approach you want to take for sanitizing your log content as it is written, see Customizing log field syntaxes or Customizing log field behaviors.