How to run PingOne Recognize Agent
Infrastructure requirements
PingOne Recognize Agent requires the following minimum versions and infrastructure:
- Kubernetes 1.32+
- Helm (if your chosen installation method is Helm)
In terms of performance, a single instance of PingOne Recognize Agent typically performs as follows (2000 mCPUs, 8000 Mi RAM):
- 5 requests in parallel
- 3 seconds per image processed
- Horizontal scaling ready (online mode only)
Pull the Docker image
The Docker image is available on the PingOne Recognize Quay repository.
- Run Docker login:
  docker login -u="keyless_technologies+<PROVIDED_TENANT_NAME>" -p="<PROVIDED_PASSWORD>" quay.io
- Pull the container:
  docker pull quay.io/keyless_technologies/keyless-agent:v3.0.0

The image can also be installed through the official Helm chart, which can be provided on request through your technical services contact.
Running PingOne Recognize Agent
You can run PingOne Recognize Agent as a Docker container or deploy it with the Helm chart.
After the Docker image is set up, configure the service using the following environment variables. If no default is specified, the variable is required.
- NUM_OF_CIRC: Number of circuits to create during enrollment (default: 25)
Enable online enrollment
The following environment variables are required for online enrollment. If they aren't set, online enrollment is disabled:
- AUTH_SERVER_URL: URL of the auth server (same as host passed to SDK)
- AUTH_SERVER_API_KEY: API key for the auth server (same as API key passed to SDK)
Configure concurrency
- MAX_CONCURRENCY: Maximum number of concurrent biometric sessions (default: number of CPUs).
- MAX_WAIT: Maximum number of requests waiting for an available biometric session (default: 1).
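As an illustration of how the variables described so far fit together, the following Python sketch assembles a `docker run` command line that passes each variable with Docker's `-e` flag. The helper name `docker_run_command`, the `--rm` flag, and the example values are assumptions made for this sketch. Port mappings and other runtime flags are left out because they aren't specified here.

```python
import shlex

# Image reference taken from the pull command in this guide.
IMAGE = "quay.io/keyless_technologies/keyless-agent:v3.0.0"


def docker_run_command(env: dict[str, str], image: str = IMAGE) -> list[str]:
    """Build a `docker run` argument list that passes each variable with -e.

    Hypothetical helper for illustration; it only builds the command and
    does not start a container.
    """
    cmd = ["docker", "run", "--rm"]  # --rm is an assumption, not a requirement
    for name, value in env.items():
        cmd += ["-e", f"{name}={value}"]
    cmd.append(image)
    return cmd


# Placeholder values: replace them with the settings for your deployment.
env = {
    "NUM_OF_CIRC": "25",
    "AUTH_SERVER_URL": "<AUTH_SERVER_URL>",
    "AUTH_SERVER_API_KEY": "<AUTH_SERVER_API_KEY>",
    "MAX_CONCURRENCY": "2",
    "MAX_WAIT": "1",
}

# Print a shell-quoted command that you can review before running it.
print(shlex.join(docker_run_command(env)))
```

Leaving out AUTH_SERVER_URL and AUTH_SERVER_API_KEY from the dictionary produces a command for a deployment with online enrollment disabled.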
Maximum concurrency overview
Each enrollment request runs a biometric session to extract embeddings. At most MAX_CONCURRENCY biometric sessions can run at the same time. Requests beyond that limit are queued and processed when a session becomes available. If the number of queued requests would exceed MAX_WAIT, additional requests are rejected with HTTP 429 Too Many Requests.
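The admission rules above can be sketched as a small counting model. This is a toy illustration, not the agent's actual implementation: the class name `AdmissionGate` and its methods are hypothetical, and it assumes that a queued request takes over a session as soon as one finishes.

```python
class AdmissionGate:
    """Toy model of the session limit and wait queue (not the agent's code)."""

    def __init__(self, max_concurrency: int, max_wait: int) -> None:
        self.max_concurrency = max_concurrency
        self.max_wait = max_wait
        self.running = 0  # biometric sessions currently in use
        self.waiting = 0  # requests queued for a free session

    def arrive(self) -> str:
        """Classify a new request as running, queued, or rejected (HTTP 429)."""
        if self.running < self.max_concurrency:
            self.running += 1
            return "running"
        if self.waiting < self.max_wait:
            self.waiting += 1
            return "queued"
        return "rejected"

    def finish(self) -> None:
        """A session completes; a queued request, if any, takes its place."""
        if self.running == 0:
            raise RuntimeError("no session is running")
        if self.waiting > 0:
            self.waiting -= 1  # the promoted request keeps the session busy
        else:
            self.running -= 1


gate = AdmissionGate(max_concurrency=2, max_wait=1)
print([gate.arrive() for _ in range(4)])
# ['running', 'running', 'queued', 'rejected']

gate.finish()         # one session completes and the queued request starts
print(gate.arrive())  # queued
```

With MAX_CONCURRENCY=2 and MAX_WAIT=1, the model accepts three requests at once (two running, one queued) and rejects the fourth.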
Memory management
Memory usage is primarily controlled by MAX_CONCURRENCY, since it determines how many biometric sessions can run at the same time. Each session consumes memory to load biometric models and run extraction. More concurrent sessions consume more memory.
Other less significant factors are:
- Number of circuits created during enrollment
- Number of waiting requests
- Resolution of photos used for enrollment
Approximate memory consumption for offline enrollment with 25 circuits:
- MAX_CONCURRENCY=2: Approximately 0.5 GB
- MAX_CONCURRENCY=4: Approximately 1 GB
- MAX_CONCURRENCY=7: Approximately 1.5 GB
- MAX_CONCURRENCY=9: Approximately 2 GB
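For values between the listed data points, a rough estimate can be obtained by linear interpolation. The sketch below assumes memory grows roughly linearly between neighboring points, which the figures above suggest but don't guarantee. The function name `estimate_memory_gb` is hypothetical, and the function refuses to extrapolate outside the measured range.

```python
# (MAX_CONCURRENCY, approximate GB) pairs copied from the list above
# (offline enrollment with 25 circuits).
MEMORY_POINTS = [(2, 0.5), (4, 1.0), (7, 1.5), (9, 2.0)]


def estimate_memory_gb(max_concurrency: int) -> float:
    """Linearly interpolate memory use between the published data points."""
    lowest, highest = MEMORY_POINTS[0][0], MEMORY_POINTS[-1][0]
    if not lowest <= max_concurrency <= highest:
        raise ValueError(
            f"no data outside MAX_CONCURRENCY {lowest}..{highest}; "
            "measure on your own hardware instead"
        )
    for (x0, y0), (x1, y1) in zip(MEMORY_POINTS, MEMORY_POINTS[1:]):
        if x0 <= max_concurrency <= x1:
            return y0 + (y1 - y0) * (max_concurrency - x0) / (x1 - x0)
    raise AssertionError("unreachable: range was checked above")


print(estimate_memory_gb(3))  # 0.75
print(estimate_memory_gb(8))  # 1.75
```

Treat the result as a starting point for sizing only, because the number of circuits, the number of waiting requests, and photo resolution also affect memory use.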
Performance and throughput
Peak performance is approximately 1 second to generate offline enrollment with 25 circuits. Biometric extraction is approximately 0.5 seconds (depending on CPU power), with the remaining time spent on network transfer and request processing.
| Max concurrency | CPU cores | Requests per second |
|---|---|---|
| 1 | 1 | 0.6 |
| 1 | 2 | 1.7 |
| 1 | 4 | 2.2 |
| 1 | 8 | 2.2 |
| 2 | 2 | 0.7 |
| 2 | 4 | 2.3 |
| 2 | 8 | 4.5 |
MAX_WAIT controls how many requests can wait for an available biometric session and therefore how the service handles traffic spikes. If requests exceed MAX_WAIT, they’re rejected with HTTP 429 Too Many Requests.
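Because rejected requests receive HTTP 429, callers may want to retry after a short delay. The following client-side sketch shows one common approach, exponential backoff. It is not part of the agent or the SDK: the function `send_with_backoff`, its parameters, and the delay values are assumptions, and `send` stands in for whatever function issues your enrollment request and returns the HTTP status code.

```python
import time
from typing import Callable


def send_with_backoff(
    send: Callable[[], int],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call send() until it returns a status other than 429 or attempts run out.

    The delay doubles after each rejected attempt: base_delay, 2x, 4x, ...
    Returns the last status code received.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    status = 429
    for attempt in range(max_attempts):
        status = send()
        if status != 429:
            return status
        if attempt < max_attempts - 1:  # no need to sleep after the last try
            sleep(base_delay * 2 ** attempt)
    return status


# Demonstration with a fake sender that is rejected twice, then succeeds.
responses = iter([429, 429, 200])
delays: list[float] = []
status = send_with_backoff(lambda: next(responses), sleep=delays.append)
print(status, delays)  # 200 [0.5, 1.0]
```

Passing `sleep` as a parameter keeps the example testable without real waiting; in production code you would leave it at the default.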
Service throughput is mostly determined by the number of concurrent biometric sessions and the number of available CPU cores. The benchmarks above demonstrate the relationship between max concurrency, CPU cores, and performance. Always benchmark on your own hardware for the most accurate performance estimates.
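For capacity planning, the benchmark figures can be turned into a first estimate of how many instances a target load needs. The sketch below assumes that throughput scales linearly with the number of instances and that the benchmark numbers carry over to your hardware; neither is guaranteed. The function name `replicas_needed` is hypothetical.

```python
import math

# (MAX_CONCURRENCY, CPU cores) -> requests per second, copied from the
# benchmark table above.
BENCHMARK_RPS = {
    (1, 1): 0.6,
    (1, 2): 1.7,
    (1, 4): 2.2,
    (1, 8): 2.2,
    (2, 2): 0.7,
    (2, 4): 2.3,
    (2, 8): 4.5,
}


def replicas_needed(target_rps: float, max_concurrency: int, cpu_cores: int) -> int:
    """Estimate the instance count for a target load, assuming linear scaling."""
    per_instance = BENCHMARK_RPS[(max_concurrency, cpu_cores)]
    return math.ceil(target_rps / per_instance)


print(replicas_needed(10, max_concurrency=2, cpu_cores=8))  # 3
print(replicas_needed(10, max_concurrency=1, cpu_cores=4))  # 5
```

Use the result as a lower bound and confirm it with a load test in your own environment.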
Understanding circuits for authentication
Circuits are a key concept in the PingOne Recognize platform to ensure the solution remains 100% privacy-preserving. They ensure the cryptographic transformation is unique for every user, on every device, and for every authentication, which helps prevent reverse engineering attempts. Circuits are generated on the client side (for example, the IDV Bridge) and sent to the server. Each authentication request consumes one circuit, and a successful authentication replenishes the circuit supply. If a user repeatedly fails authentication, circuits can become exhausted and the user must re-enroll. The maximum number of circuits determines the maximum number of consecutive failed authentication attempts allowed before the account is locked and re-enrollment is required. Increasing this maximum too much can affect performance. The default is