Data Preparation

Once you have deployed PingOne Autonomous Identity, you can prepare your dataset in a format that meets the PingOne Autonomous Identity schema.

The initial step is to obtain the data as agreed upon between ForgeRock and your company. The files contain a subset of user attributes from the HR database and entitlement metadata required for the analysis. Only the attributes necessary for analysis are used.

Several steps must be carried out before your production entitlement data is loaded into PingOne Autonomous Identity. These steps are summarized below:

Data collection

Typically, the raw client data is not in a form that meets the PingOne Autonomous Identity schema. For example, a unique user identifier can have multiple names, such as user_id, account_id, user_key, or key. Similarly, entitlement columns can have several names, such as access_point, privilege_name, or entitlement.
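As a rough illustration of the naming problem described above, the following Python sketch groups raw column names by the role they probably play. The alias lists come from the examples in this section. The function name, the role labels, and the `department` column are hypothetical and are not part of PingOne Autonomous Identity.

```python
# Alias lists taken from the examples in this section; extend them for your own data.
ROLE_ALIASES = {
    "user identifier": {"user_id", "account_id", "user_key", "key"},
    "entitlement": {"access_point", "privilege_name", "entitlement"},
}


def classify_columns(header):
    """Map each raw column name to a likely role, or None if it is not recognized."""
    roles = {}
    for name in header:
        normalized = name.strip().lower()
        roles[name] = next(
            (role for role, aliases in ROLE_ALIASES.items() if normalized in aliases),
            None,
        )
    return roles


result = classify_columns(["Account_ID", "privilege_name", "department"])
print(result)
# {'Account_ID': 'user identifier', 'privilege_name': 'entitlement', 'department': None}
```

Columns that map to None are not covered by the alias lists and need a manual decision.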

To get the correct format, here are some general rules:

Submit the raw client data in .csv file format. The data can be in a single file or multiple files. Data includes application attributes, entitlement assignments, entitlement descriptions, and identities data.
Remove duplicate values.
Add optional columns for additional training attributes, for example, MANAGERS_MANAGER and MANAGER_FLAG. You can add these additional attributes to the schema using the PingOne Autonomous Identity UI. For more information, refer to Set Entity Definitions.
Make a note of those attributes that differ from the PingOne Autonomous Identity schema, which is presented below. This is crucial for setting up your attribute mappings. For more information, refer to Set Attribute Mappings.
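The duplicate-removal rule above could be applied with a short script before the files are submitted. The sketch below is a minimal example that assumes "duplicate values" means data rows that are identical in every column. The function name and the sample data are hypothetical, and the sketch works on CSV text in memory rather than on files.

```python
import csv
import io


def dedupe_csv(text):
    """Return CSV text with exact duplicate data rows removed, keeping first occurrences."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader, None)
    if header is None:
        return ""  # empty input: nothing to clean
    seen = set()
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    for row in reader:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            writer.writerow(row)
    return out.getvalue()


# Illustrative data: the second data row repeats the first.
raw = "user_id,ent_id\nu1,e1\nu1,e1\nu2,e1\n"
cleaned = dedupe_csv(raw)
print(cleaned)
```

The first occurrence of each row is kept and the original row order is preserved, so the cleaned file can be compared against the raw file line by line.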

CSV files and schema

The required attributes for the schema are as follows:

CSV Files Schema
Files	Schema
applications.csv	This file depends on the attributes that the client wants to include. Here are some required columns: app_id. Specifies the application’s unique ID. app_name. Specifies the application’s name. app_owner_id. Specifies the ID of the application’s owner.
assignments.csv	user_id. Specifies the unique user ID to which the entitlement is assigned. ent_id. Specifies the entitlement’s unique ID.
entitlements.csv	ent_id. Specifies the entitlement’s unique ID. ent_name. Specifies the entitlement’s name. ent_owner_id. Specifies the entitlement’s owner. app_id. Specifies the application’s unique ID.
identities.csv	usr_id. Specifies the user’s unique ID. user_name. Specifies a human-readable username. For example, `John Smith`. usr_manager_id. Specifies the user’s manager ID.
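Before setting up attribute mappings, it can help to compare each file header against the table above. The following Python sketch reports the required columns that are missing and any additional columns that are present. The column names are copied as they appear in the table. The function name, the sample data, and the `risk_level` column are hypothetical, and the check itself is not a PingOne Autonomous Identity feature.

```python
import csv
import io

# Required columns per file, copied from the schema table above.
REQUIRED = {
    "applications.csv": ["app_id", "app_name", "app_owner_id"],
    "assignments.csv": ["user_id", "ent_id"],
    "entitlements.csv": ["ent_id", "ent_name", "ent_owner_id", "app_id"],
    "identities.csv": ["usr_id", "user_name", "usr_manager_id"],
}


def check_header(filename, text):
    """Return (missing, extra) column lists for the CSV text of the named file."""
    header = next(csv.reader(io.StringIO(text)), [])
    present = [name.strip() for name in header]
    missing = [col for col in REQUIRED[filename] if col not in present]
    extra = [col for col in present if col not in REQUIRED[filename]]
    return missing, extra


# Illustrative entitlements data without ent_owner_id and with one extra column.
sample = "ent_id,ent_name,app_id,risk_level\ne1,Read,a1,high\n"
missing, extra = check_header("entitlements.csv", sample)
print(missing)  # ['ent_owner_id']
print(extra)    # ['risk_level']
```

Missing columns point to attributes that need a mapping or are absent from the raw data. Extra columns are candidates for the optional training attributes described earlier.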