Distributed tracing
In a distributed system, requests pass through multiple services hosted on multiple servers. Without telemetry data, it can be difficult to identify the root cause of performance issues or errors.
Distributed tracing provides visibility into the full path a request takes in a distributed system. PingDirectory supports the OpenTelemetry framework for collecting distributed tracing data. You can send traces collected by PingDirectory to a backend service, such as Jaeger, for aggregation, storage, and visualization.
This feature is provided as a Preview, which means that it isn’t supported and should not be used in production environments. Learn more in Feature status. Additionally, distributed tracing is only available for the PingDirectory server.
Why use distributed tracing?
Diagnosing escalated production issues can take hours to days, involving multiple subject matter experts trying to correlate fragmented logs and understand what happened, often yielding a lot of noise and little clarity. The more services and instances involved, the more challenging troubleshooting becomes.
Distributed tracing addresses these challenges by supporting end-to-end request visibility and data correlation across multiple services and servers. As a result, you can troubleshoot performance issues and errors more quickly and effectively. You can also use distributed tracing to optimize system performance by identifying bottlenecks and inefficiencies in service interactions.
What is distributed tracing?
Distributed tracing shows you how an incoming request was processed across all servers and services in a distributed system, including:
- Which servers and services the request went through.
- How much time each service took to process its part of the request.
- How the services are connected.
- What the failure point was in case of a request failure.
A distributed trace provides a visual representation of a request’s journey. Spans show when an operation started, when it ended, and its duration. When one service calls another, these calls are linked within the trace, showing the flow and time spent in each service. The PingDirectory server uses the OpenTelemetry framework to create and manage these spans and traces.
- Traces: A trace represents the path of a request through an application. A trace is made up of one or more spans. Learn more about traces in the OpenTelemetry documentation.
- Spans: A span is a segment of a request journey. It represents a unit of work or an operation within a service. Each span includes the following elements:
  - traceId represents the trace that the span is a part of.
  - spanId is a unique ID for the span.
  - parentSpanId is the ID of the originating request.
  Servers add span attributes following the semantic conventions, with LDAP-specific attributes based on HTTP conventions.
- Root span: The root span indicates the start and end of an entire operation. The parentSpanId of the root span is null because the root span isn’t part of an existing trace. Subsequent spans in the trace have their own unique spanId. Their traceId is the same as that of the root span, and their parentSpanId matches the spanId of the root span.
- OpenTelemetry: OpenTelemetry is an open-source observability framework for instrumenting, generating, collecting, and exporting telemetry data. It provides a standardized way to capture distributed traces across different services and platforms. It doesn’t provide a backend for storing or analyzing telemetry data. Learn more in the OpenTelemetry documentation.
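The relationship between traceId, spanId, and parentSpanId can be sketched in a few lines of Python. This is a minimal illustration only: the `Span` class, the helper functions, and the operation names are hypothetical and are not PingDirectory or OpenTelemetry APIs. The ID sizes (16 bytes for a trace ID, 8 bytes for a span ID) follow the OpenTelemetry convention.

```python
import secrets
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Span:
    """Illustrative span record (hypothetical, not a real OpenTelemetry class)."""
    name: str
    trace_id: str                  # shared by every span in the same trace
    span_id: str                   # unique to this span
    parent_span_id: Optional[str]  # None for the root span


def new_root_span(name: str) -> Span:
    # A root span starts a new trace, so it has no parent.
    return Span(name, secrets.token_hex(16), secrets.token_hex(8), None)


def new_child_span(name: str, parent: Span) -> Span:
    # A child keeps the parent's trace ID and records the parent's span ID.
    return Span(name, parent.trace_id, secrets.token_hex(8), parent.span_id)


# Hypothetical operation names, chosen only for the example.
root = new_root_span("ldap.search")
child = new_child_span("backend.lookup", root)

print(root.parent_span_id)                   # the root span has no parent
print(child.trace_id == root.trace_id)       # both spans belong to one trace
print(child.parent_span_id == root.span_id)  # the child links back to the root
```

A tracing backend uses exactly these links to rebuild the tree of operations for a request.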
Which requests are traced?
All incoming LDAP requests, including those from PingFederate, are supported. To propagate trace information from the calling service, a request must include the W3C trace context LDAP request control.
The W3C Trace Context standard allows for consistent correlation IDs and metadata across systems that support it. If a request doesn’t include the W3C trace context control, a new trace starts for that request.
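For reference, the W3C Trace Context standard carries the correlation IDs in a `traceparent` value with four dash-separated fields: version, trace ID, parent span ID, and flags. The following Python sketch parses that string form. It is a simplified illustration: it accepts only the four-field layout, the function and class names are hypothetical, and it says nothing about how PingDirectory encodes the value inside the LDAP request control.

```python
import re
from typing import NamedTuple, Optional

# version (2 hex) - trace ID (32 hex) - parent span ID (16 hex) - flags (2 hex)
_TRACEPARENT = re.compile(
    r"([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})"
)


class TraceContext(NamedTuple):
    trace_id: str
    parent_span_id: str
    sampled: bool


def parse_traceparent(value: str) -> Optional[TraceContext]:
    """Return the parsed context, or None if the value is not usable."""
    match = _TRACEPARENT.fullmatch(value.strip())
    if match is None:
        return None
    version, trace_id, parent_id, flags = match.groups()
    # The standard forbids version "ff" and all-zero IDs.
    if version == "ff" or trace_id == "0" * 32 or parent_id == "0" * 16:
        return None
    # Bit 0 of the flags field marks the trace as sampled by the caller.
    return TraceContext(trace_id, parent_id, bool(int(flags, 16) & 0x01))


incoming = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
context = parse_traceparent(incoming)
if context is None:
    print("no usable context: a new trace would start")
else:
    print("continuing trace", context.trace_id)
```

When parsing fails or no value is present, a receiver falls back to starting a new trace, which matches the behavior described above for requests without the control.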
Enable and configure tracing
Distributed tracing is disabled by default. To turn it on, enable the OpenTelemetry plugin:
bin/dsconfig set-plugin-prop \
--plugin-name OpenTelemetry \
--set enabled:true
Supply the following properties to configure how spans are sampled and where telemetry data gets exported:
| Property | Description | Values |
|---|---|---|
| | The key manager provider to use if the OTLP/HTTP collector requires a client certificate. | For example, |
| | The trust manager provider used to validate the certificate presented by the OTLP/HTTP collector. | For example, |
| | The nickname in the associated key store for the certificate to present to the OTLP/HTTP collector. You can leave this undefined if no key manager provider is configured or if the JVM should select a certificate automatically. | For example, |
| tracer-exporter-otlp-endpoint | Sets the OTLP/HTTP endpoint where the server exports sampled spans. | The endpoint must start with either http:// or https:// and include the full HTTP path. |
| tracer-sampler | Selects the sampling strategy used when new spans are created. | |
| | Specifies the sampling percentage used by ratio-based samplers. Higher values result in more spans being sampled but could impact performance. When the sampling strategy is either | The default value is |
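To make the effect of a ratio-based sampler concrete, the following Python sketch shows one common way such a decision can be made: compare part of the trace ID against a threshold derived from the ratio. This is an assumption-laden illustration, not the plugin's implementation. The function name is hypothetical, and the ratio is expressed here as a fraction between 0.0 and 1.0, which might not match the units the plugin property expects.

```python
def should_sample(trace_id_hex: str, ratio: float) -> bool:
    """Decide deterministically whether a trace is sampled (illustrative only)."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0.0 and 1.0")
    # Treat the low 64 bits of the trace ID as a uniformly distributed number.
    low_bits = int(trace_id_hex[-16:], 16)
    # Sample the trace when that number falls below the ratio's share of the range.
    return low_bits < int(ratio * (1 << 64))


trace_id = "4bf92f3577b34da6a3ce929d0e0e4736"
for ratio in (0.0, 0.25, 1.0):
    print(ratio, should_sample(trace_id, ratio))
```

Because the decision depends only on the trace ID, every service that applies the same ratio reaches the same verdict for a given trace, so traces are kept or dropped as a whole instead of losing individual spans.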
The following example enables the plugin, samples every span, and exports the traces to http://localhost:4318/v1/traces:
bin/dsconfig set-plugin-prop \
--plugin-name OpenTelemetry \
--set enabled:true \
--set tracer-sampler:always-on \
--set tracer-exporter-otlp-endpoint:http://localhost:4318/v1/traces \
--hostname localhost \
--port 4444 \
--bindDN uid=admin \
--bindPassword password \
--no-prompt
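Before applying an endpoint value, it can help to check it against the rule that the endpoint starts with http:// or https:// and includes the full HTTP path. The following Python sketch is a hypothetical pre-flight check, not part of PingDirectory or dsconfig, and the server might apply different or stricter validation.

```python
from typing import List
from urllib.parse import urlparse


def check_otlp_endpoint(endpoint: str) -> List[str]:
    """Return a list of problems found in the endpoint (empty if none)."""
    problems = []
    parsed = urlparse(endpoint)
    if parsed.scheme not in ("http", "https"):
        problems.append("endpoint must start with http:// or https://")
    if not parsed.netloc:
        problems.append("endpoint must include a host")
    if parsed.path in ("", "/"):
        problems.append("endpoint must include the full HTTP path, such as /v1/traces")
    return problems


for candidate in (
    "http://localhost:4318/v1/traces",  # the value used in the example above
    "localhost:4318/v1/traces",         # missing scheme
    "https://localhost:4318",           # missing path
):
    print(candidate, check_otlp_endpoint(candidate) or "looks valid")
```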
How to view traces
PingDirectory can push traces to an OpenTelemetry Protocol (OTLP) endpoint over HTTP. Any backend that supports OTLP/HTTP can be used to collect and visualize the traces.
Try the Jaeger tracing All-in-one Docker image to capture exported spans. By default, Jaeger stores the spans in memory, but you can configure Jaeger to send the spans to various persistent datastores external to the Docker image.