The PingDataMetrics Server stores its data in a PostgreSQL DBMS that is included in the installation. PostgreSQL is a traditional, table-based DBMS, best suited for tabular data.
The PingDataMetrics Server interacts with the DBMS in four ways:
- Data import
- Import places a steady write load on the DBMS and accounts for 80% of the writes. This single-threaded interaction puts a lock on the target table. A PingDataMetrics Server that monitors 20 servers keeps a single 10K RPM disk 70% busy with this interaction alone.
- Data aggregation
- Data aggregation places a less frequent read/write load on the DBMS. This interaction aggregates the data samples from one time resolution to the next, reading from one set of tables and writing to another. Sample aggregation uses no table-level locks, and the ratio of records read to records written is between 60:1 and 24:1.
- Data sample age-out
- Sample age-out occurs at regular intervals and results in a table being dropped or added. Age-out occurs every 30 minutes, though some intervals might drop or add more than one table.
- Data query
- Sample queries occur when clients request metric samples from the public API. The API can aggregate multiple dimensions and multiple servers in a single request. A single request might fetch several million rows from the DBMS, though it returns only a few hundred data points to the client. The PingDataMetrics Server caches samples from previous queries, but an initial query for a given metric might take several seconds and result in a large amount of disk read activity.
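The read:write ratios quoted for data aggregation can be illustrated with a small sketch. It assumes, purely for illustration, that one aggregation step rolls 60 finer-grained samples into one coarser sample and that another rolls 24 into one; the actual time resolutions are not stated here, and the `aggregate` function and the use of a mean are hypothetical stand-ins, not the server's implementation.

```python
from statistics import mean


def aggregate(samples, factor):
    """Collapse every `factor` consecutive samples into one averaged sample."""
    return [mean(samples[i:i + factor]) for i in range(0, len(samples), factor)]


# Assumed resolutions (not taken from the product): 1-minute -> 1-hour -> 1-day.
minute_samples = [float(i % 60) for i in range(48 * 60)]  # 48 hours of 1-minute samples
hour_samples = aggregate(minute_samples, 60)              # 60 records read per record written
day_samples = aggregate(hour_samples, 24)                 # 24 records read per record written

minute_to_hour_ratio = len(minute_samples) // len(hour_samples)
hour_to_day_ratio = len(hour_samples) // len(day_samples)
```

Under these assumptions, each aggregation pass reads many more records than it writes, which is why its write load is far lighter than that of data import.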
Over time, the storage of samples in the data tables is optimized to match the access patterns of the queries. However, the public API supports queries where the results are the aggregate of thousands of different dimension sets, and each dimension set might have thousands of samples within the time range of the query.
For example, a query about the throughput of all PingDirectory and PingDirectoryProxy Servers for all applications and all LDAP operations over the last 72 hours might result in four to six million DBMS records being read into memory, aggregated, and finally reduced to 100 data values. The results from each query are cached so that a subsequent request for the same data results in less DBMS activity. Both disk seek time and rotational delay affect the performance of a first-time query, so disks with higher RPM provide a measurable improvement for first-time queries.
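The arithmetic behind this example can be sketched as follows. The figures of 1,000 dimension sets and one sample per minute are assumptions chosen only to land in the four to six million range; `estimate_rows`, `reduce_to_points`, and `query` are hypothetical names, and the in-process `lru_cache` merely stands in for the server's own query cache.

```python
from functools import lru_cache


def estimate_rows(dimension_sets, samples_per_set):
    """Rows a query must read: one row per sample per dimension set."""
    return dimension_sets * samples_per_set


def reduce_to_points(values, points):
    """Average `values` into `points` contiguous buckets of near-equal size."""
    sums = [0.0] * points
    counts = [0] * points
    n = len(values)
    for i, value in enumerate(values):
        bucket = i * points // n
        sums[bucket] += value
        counts[bucket] += 1
    return [s / c for s, c in zip(sums, counts) if c]


@lru_cache(maxsize=128)
def query(metric, hours, points=100):
    """Stand-in for a first-time query: read samples, then reduce them."""
    # Synthetic data replaces the DBMS read (assumed one sample per minute).
    values = [float(minute % 60) for minute in range(hours * 60)]
    return tuple(reduce_to_points(values, points))


# Assumed: 1,000 dimension sets, 72 hours at one sample per minute.
rows_read = estimate_rows(1000, 72 * 60)  # 4,320,000 rows
first = query("throughput", 72)           # computed from the samples
second = query("throughput", 72)          # served from the cache
```

The second call returns the cached result without touching the samples again, which mirrors why only first-time queries are sensitive to disk seek time and rotational delay.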