How to run PingOne Recognize Agent

Infrastructure requirements

PingOne Recognize Agent requires the following minimum versions and infrastructure:

Kubernetes 1.32+
Helm (only if you install using the Helm chart)

A single instance of PingOne Recognize Agent with 2000 mCPU (2 CPU cores) and 8000 Mi of RAM typically delivers the following performance:

5 requests in parallel
3 seconds per image processed
Horizontal scaling ready (online mode only)

Pull the Docker image

The Docker image is available on the PingOne Recognize Quay repository.

Run Docker login:

```
docker login -u="keyless_technologies+<PROVIDED_TENANT_NAME>" -p="<PROVIDED_PASSWORD>" quay.io
```

Pull the container:
```
docker pull quay.io/keyless_technologies/keyless-agent:v3.0.0
```
You can also install the image through the official Helm chart, which is available on request from your technical services contact.

Running PingOne Recognize Agent

You can run PingOne Recognize Agent as a Docker container or deploy it with the Helm chart.

After the Docker image is set up, configure the service using the following environment variables. If no default is specified, the variable is required.

NUM_OF_CIRC - Number of circuits to create during enrollment (default: 25)

Enable online enrollment

The following environment variables are required for online enrollment. If they aren’t set, online enrollment is disabled:

AUTH_SERVER_URL - URL of the auth server (same as host passed to SDK)
AUTH_SERVER_API_KEY - API key for the auth server (same as API key passed to SDK)

Configure logs

LOG_FORMAT - Log format: json or human (default: human)

Configure the HTTP server

PORT - Port the HTTP server listens on (default: 80)

Configure concurrency

MAX_CONCURRENCY - Maximum number of concurrent biometric sessions (default: number of CPUs)
MAX_WAIT - Maximum number of requests waiting for an available biometric session (default: 1)
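
As an illustration, the variables above can be collected in an env file and passed to the container. This is a sketch rather than a reference configuration: the file name agent.env, the placeholder values in angle brackets, and the values chosen for LOG_FORMAT, MAX_CONCURRENCY, and MAX_WAIT are assumptions to adapt to your deployment.

```shell
# Hypothetical file name; export ENV_FILE beforehand to write elsewhere.
ENV_FILE="${ENV_FILE:-agent.env}"

# Quoted heredoc delimiter: the content is written literally, with no expansion.
cat > "$ENV_FILE" <<'EOF'
# Circuits created during enrollment (documented default)
NUM_OF_CIRC=25
# Needed only for online enrollment; replace the placeholders with your values
AUTH_SERVER_URL=<YOUR_AUTH_SERVER_URL>
AUTH_SERVER_API_KEY=<YOUR_AUTH_SERVER_API_KEY>
# json or human
LOG_FORMAT=json
# Port the HTTP server listens on inside the container (documented default)
PORT=80
# Example values only; size these from your own benchmarks
MAX_CONCURRENCY=2
MAX_WAIT=1
EOF
```

The file could then be passed to the container with `docker run --env-file agent.env -p 8080:80 quay.io/keyless_technologies/keyless-agent:v3.0.0`, where host port 8080 is an arbitrary choice mapped to the default PORT of 80.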

Maximum concurrency overview

Each enrollment request runs a biometric session to extract embeddings. The number of biometric sessions that can run concurrently is capped by MAX_CONCURRENCY. When all sessions are busy, additional requests are queued and processed as sessions become available. Requests that would push the queue beyond MAX_WAIT are rejected with HTTP 429 Too Many Requests.
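
The admission behavior described above can be modeled with a small decision function. This is a toy model for reasoning about capacity, not the agent's implementation; the function name admit and its arguments are hypothetical, and it assumes that up to MAX_WAIT requests may wait at once.

```shell
# admit ACTIVE WAITING MAX_CONCURRENCY MAX_WAIT
# Prints what would happen to a newly arriving request:
#   run   - a biometric session is free, so the request starts immediately
#   queue - all sessions are busy, but the wait queue still has room
#   429   - sessions and queue are both full, so the request is rejected
admit() {
  active=$1
  waiting=$2
  max_concurrency=$3
  max_wait=$4

  if [ "$active" -lt "$max_concurrency" ]; then
    echo "run"
  elif [ "$waiting" -lt "$max_wait" ]; then
    echo "queue"
  else
    echo "429"
  fi
}
```

For example, with MAX_CONCURRENCY=2 and MAX_WAIT=1, `admit 2 0 2 1` prints queue and `admit 2 1 2 1` prints 429.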

Memory management

Memory usage is primarily controlled by MAX_CONCURRENCY, since it determines how many biometric sessions can run at the same time. Each session consumes memory to load biometric models and run extraction. More concurrent sessions consume more memory.

Other, less significant factors are:

Number of circuits created during enrollment
Number of waiting requests
Resolution of photos used for enrollment

Approximate memory consumption for offline enrollment with 25 circuits:

MAX_CONCURRENCY=2: Approximately 0.5 GB
MAX_CONCURRENCY=4: Approximately 1 GB
MAX_CONCURRENCY=7: Approximately 1.5 GB
MAX_CONCURRENCY=9: Approximately 2 GB
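
For quick sizing, the figures above can be turned into a conservative lookup. The sketch below is only a reading aid under stated assumptions: the function name approx_memory_gb is hypothetical, it rounds up to the next measured data point instead of interpolating, and it covers only offline enrollment with 25 circuits. Values above 9 are outside the measured range and are reported as such.

```shell
# approx_memory_gb MAX_CONCURRENCY
# Prints the approximate memory figure (in GB) of the nearest measured
# data point at or above the requested concurrency.
approx_memory_gb() {
  c=$1
  if [ "$c" -le 2 ]; then
    echo "0.5"
  elif [ "$c" -le 4 ]; then
    echo "1"
  elif [ "$c" -le 7 ]; then
    echo "1.5"
  elif [ "$c" -le 9 ]; then
    echo "2"
  else
    # No measurement is listed beyond MAX_CONCURRENCY=9.
    echo "unmeasured"
    return 1
  fi
}
```

Because it rounds up, `approx_memory_gb 3` prints 1, the figure listed for MAX_CONCURRENCY=4.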

Performance and throughput

At peak performance, generating an offline enrollment with 25 circuits takes approximately 1 second. Biometric extraction accounts for approximately 0.5 seconds of that (depending on CPU power), with the remaining time spent on network transfer and request processing.

Max concurrency	CPU cores	Requests per second
1	1	0.6
1	2	1.7
1	4	2.2
1	8	2.2
2	2	0.7
2	4	2.3
2	8	4.5

MAX_WAIT controls how many requests can wait for an available biometric session and therefore how the service handles traffic spikes. If requests exceed MAX_WAIT, they’re rejected with HTTP 429 Too Many Requests.

Service throughput is mostly determined by the number of concurrent biometric sessions and the number of available CPU cores. The benchmarks above demonstrate the relationship between max concurrency, CPU cores, and performance.

Always benchmark on your own hardware for the most accurate performance estimates.
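
As a rough planning aid, a per-instance figure from the benchmark table can be converted into an instance count for a target load. The sketch assumes that throughput scales linearly across instances, which the source doesn't state and which you should verify with your own benchmarks. The function name replicas_needed is hypothetical.

```shell
# replicas_needed TARGET_RPS PER_INSTANCE_RPS
# Prints the smallest whole number of instances whose combined throughput
# reaches TARGET_RPS, assuming linear scaling across instances.
replicas_needed() {
  awk -v target="$1" -v per_instance="$2" 'BEGIN {
    n = target / per_instance
    r = int(n)          # truncate toward zero
    if (r < n) r++      # round up when there is a remainder
    print r
  }'
}
```

For example, with the 4.5 requests per second measured for max concurrency 2 on 8 CPU cores, `replicas_needed 20 4.5` prints 5.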

Understanding circuits for authentication

Circuits are a key concept in the PingOne Recognize platform. They ensure the solution remains 100% privacy-preserving by making the cryptographic transformation unique for every user, on every device, and for every authentication, which helps prevent reverse engineering attempts.

Circuits are generated on the client side (for example, the IDV Bridge) and sent to the server. Each authentication request consumes one circuit, and a successful authentication replenishes the circuit supply. If a user repeatedly fails authentication, circuits can become exhausted and the user must re-enroll.

The maximum number of circuits determines how many consecutive failed authentication attempts are allowed before the account is locked and re-enrollment is required. Setting this maximum too high can degrade performance.

The default is 25. Changing it without a clear reason isn’t recommended.
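
The consumption and replenishment rule described above can be illustrated with a toy simulation. It is not the platform's implementation: the function name simulate_circuits is hypothetical, and it assumes that a successful authentication refills the supply to its maximum, which the source doesn't specify.

```shell
# simulate_circuits MAX OUTCOME...
# Each OUTCOME is "ok" or "fail". Prints the number of circuits left after
# the attempts, or "locked" once the supply is exhausted.
simulate_circuits() {
  max=$1
  shift
  left=$max

  for outcome in "$@"; do
    if [ "$left" -le 0 ]; then
      break                    # no circuits left: further attempts are impossible
    fi
    left=$((left - 1))         # every attempt consumes one circuit
    if [ "$outcome" = "ok" ]; then
      left=$max                # assumption: success refills to the maximum
    fi
  done

  if [ "$left" -le 0 ]; then
    echo "locked"
  else
    echo "$left"
  fi
}
```

With a maximum of 3, `simulate_circuits 3 fail fail` prints 1, while `simulate_circuits 3 fail fail fail` prints locked, matching the rule that the maximum bounds the number of consecutive failures.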