Metric Categories
A metric has a dot-separated name, which places it within a namespace hierarchy, or tree. Metric names can only include letters, numbers, underscores, and dots. Database Performance Monitor (DPM) metrics take one of two forms: a gauge or a derivative.
A gauge represents a scalar value recorded at a specific time, such as the current motherboard temperature or the number of users logged in. A DPM example is mysql.innodb.pending_log_writes.
Some values are accumulated counters that always increase, such as the number of network bytes received on an interface. We transform these into derivatives by subtracting the previously seen value from each new sample and dividing by the time between samples. A DPM example is os.cpu.idle_us. These numbers often represent throughput or time, and (as in the suffix table below) many derivative metrics are suffixed with tput or _us.
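The derivative calculation described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the agent's actual code; the Sample type and the derivative function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One observation of an ever-increasing counter (hypothetical type)."""
    timestamp: float  # seconds since some epoch
    value: float      # cumulative counter value at that time


def derivative(previous: Sample, current: Sample) -> float:
    """Return the per-second rate of change between two counter samples."""
    elapsed = current.timestamp - previous.timestamp
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    # Subtract the previously seen value, then divide by the time between samples.
    return (current.value - previous.value) / elapsed


# A counter that grew by 3000 over 10 seconds yields a rate of 300 per second.
rate = derivative(Sample(timestamp=0, value=1000), Sample(timestamp=10, value=4000))
```

The sketch does not handle counters that reset (for example after a restart); how DPM treats that case is not covered here.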
All metrics are per-second. When hovering over a chart to view its value, the number shown will be some value per-second. When viewing a long time frame, individual seconds are averaged together. Because of this averaging, peaks in data may be shown as less than their maximum value if you are viewing the chart at a resolution other than per-second. For example, if you are viewing a long time frame and each point on the chart represents 15 minutes, the value displayed is the per-second average for that 15 minutes.
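To see why averaging hides peaks, consider this minimal Python sketch. The downsample function is a hypothetical illustration, not how DPM charts are actually rendered: it averages per-second values into fixed-size buckets.

```python
def downsample(per_second: list[float], bucket_seconds: int) -> list[float]:
    """Average consecutive per-second values into buckets of bucket_seconds."""
    averages = []
    for start in range(0, len(per_second), bucket_seconds):
        bucket = per_second[start:start + bucket_seconds]
        averages.append(sum(bucket) / len(bucket))
    return averages


# 15 minutes (900 seconds) at a steady 10 per second, with a one-second spike to 910.
values = [10.0] * 899 + [910.0]
points = downsample(values, 900)
```

Here the single chart point is 11 per second, even though the true one-second maximum was 910.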
Metrics are not displayed with units; the metric name typically includes some indication of the units the metric represents, such as bytes, packets, count, etc. When viewing metrics, you may see a letter suffix, such as 10G or 35m. These suffixes scale the value by a power of ten. We use the following standard SI values:
Symbol | Decimal | Description |
---|---|---|
T | 1 000 000 000 000 | trillion |
G | 1 000 000 000 | billion |
M | 1 000 000 | million |
k | 1 000 | thousand |
m | 0.001 | thousandth |
μ | 0.000 001 | millionth |
n | 0.000 000 001 | billionth |
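As a rough illustration of how a raw value maps onto these suffixes, the following Python sketch picks the largest prefix that does not exceed the value. The format_si function and its rounding behavior are assumptions for this example; the DPM interface may round or choose prefixes differently.

```python
# (factor, symbol) pairs from the table above, largest first.
SI_PREFIXES = [
    (1e12, "T"), (1e9, "G"), (1e6, "M"), (1e3, "k"),
    (1.0, ""), (1e-3, "m"), (1e-6, "μ"), (1e-9, "n"),
]


def format_si(value: float) -> str:
    """Format a number with an SI suffix, e.g. 10000000000 as '10G'."""
    if value == 0:
        return "0"
    magnitude = abs(value)
    for factor, symbol in SI_PREFIXES:
        if magnitude >= factor:
            # ':g' keeps up to six significant digits and drops trailing zeros.
            return f"{value / factor:g}{symbol}"
    # Smaller than the smallest prefix: fall back to the smallest one.
    factor, symbol = SI_PREFIXES[-1]
    return f"{value / factor:g}{symbol}"
```

With this sketch, format_si(10_000_000_000) gives "10G" and format_si(0.035) gives "35m".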
We use some standard naming suffixes for common types of metrics:
Concept | Suffix |
---|---|
Throughput or arrivals per second | tput |
Read or write operations | reads / writes |
Timing | *_us (microseconds) |
Timing | *_s (seconds) |
Utilization | util (integer from 1 to 100) |
Count | count |
These are the default top-level metric namespaces, with a brief description of each:
Name | Description |
---|---|
agents | Internal diagnostic metrics of agent behavior. |
aws | Metrics retrieved from Amazon CloudWatch. |
host | Database-agnostic data, such as user/connection/client statistics and query metrics. |
mongo | Metrics from MongoDB sources, such as serverStatus. |
mssql | Metrics from SQL Server sources, such as sys.dm_exec_connections. |
mysql | Metrics from MySQL sources, such as PERFORMANCE_SCHEMA. |
os | Disk, memory, CPU, ps, and networking information. |
pgsql | Metrics from PostgreSQL sources, such as pg_stat_statements. |
redis | Metrics from Redis sources, such as INFO ALL. |
Below is more information about each of the above metric categories. The complete
list of metrics is extensive, and in many cases the metrics come directly from the output of
built-in database commands (such as SHOW PROCESSLIST); we have linked to the
relevant database-specific documentation where applicable. You can use the categories
below to search for metrics on the Metrics page of DPM.
agents
Metric Information
Metrics in the agents family contain diagnostic information about the DPM agent. Each plugin records different metrics about itself, and the metrics are organized by plugin name; for example, agents.vc_mysql_metrics.*.
aws
Metric Information
Metrics in the aws.* family come from CloudWatch, Amazon’s monitoring
service, for RDS and Aurora. To enable CloudWatch for your Amazon
instances, follow the instructions here. We collect the metrics in the table on this page. The metric names are lowercased, with words separated by underscores, to match our standard. For example, the AWS metric BinLogDiskUsage becomes aws.aws_rds.bin_log_disk_usage.*. We collect the count, sum, min, and max values as provided directly from Amazon.
Metric | Description |
---|---|
aws.aws_rds.bin_log_disk_usage.* | The amount of disk space occupied by binary logs on the master, in bytes. |
aws.aws_rds.cpu_credit_balance.* | The number of earned CPU credits that an instance has accrued since it was launched or started. For T2 Standard, the CPUCreditBalance also includes the number of launch credits that have been accrued. |
aws.aws_rds.cpu_credit_usage.* | The number of CPU credits spent by the instance for CPU utilization. |
aws.aws_rds.cpu_utilization.* | The percentage of CPU utilization. |
aws.aws_rds.database_connections.* | The number of database connections in use. |
aws.aws_rds.disk_queue_depth.* | The number of outstanding IOs (read/write requests) waiting to access the disk. |
aws.aws_rds.free_storage_space.* | The amount of available storage space. |
aws.aws_rds.freeable_memory.* | The amount of available random access memory. |
aws.aws_rds.network_receive_throughput.* | The incoming (Receive) network traffic on the DB instance, including both customer database traffic and Amazon RDS traffic used for monitoring and replication. |
aws.aws_rds.network_transmit_throughput.* | The outgoing (Transmit) network traffic on the DB instance, including both customer database traffic and Amazon RDS traffic used for monitoring and replication. |
aws.aws_rds.oldest_replication_slot_lag.* | The lagging size of the replica lagging the most in terms of WAL data received. Applies to PostgreSQL. |
aws.aws_rds.read_iops.* | The average number of disk read I/O operations per second. |
aws.aws_rds.read_latency.* | The average amount of time taken per disk I/O operation. |
aws.aws_rds.read_throughput.* | The average number of bytes read from disk per second. |
aws.aws_rds.swap_usage.* | The amount of swap space used on the DB instance. |
aws.aws_rds.transaction_logs_disk_usage.* | The disk space used by transaction logs. Applies to PostgreSQL. |
aws.aws_rds.transaction_logs_generation.* | The size of transaction logs generated per second. Applies to PostgreSQL. |
aws.aws_rds.write_iops.* | The average number of disk write I/O operations per second. |
aws.aws_rds.write_latency.* | The average amount of time taken per disk I/O operation. |
aws.aws_rds.write_throughput.* | The average number of bytes written to disk per second. |
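The renaming from CloudWatch metric names to the lowercase, underscore-separated names in the table can be approximated with a short Python sketch. The function name and the regular expression are assumptions made for illustration, and the aws.aws_rds prefix is taken from the table above; this is not the agent's actual conversion code.

```python
import re

# Insert a break before a capital that follows a lowercase letter or digit,
# and before the last capital of an acronym that starts a new word.
_WORD_BOUNDARY = re.compile(r"(?<=[a-z0-9])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])")


def cloudwatch_to_metric_prefix(name: str, namespace: str = "aws.aws_rds") -> str:
    """Convert a CloudWatch name such as 'BinLogDiskUsage' to a dotted prefix."""
    snake = _WORD_BOUNDARY.sub("_", name).lower()
    return f"{namespace}.{snake}"


prefix = cloudwatch_to_metric_prefix("BinLogDiskUsage")
```

Under these assumptions, BinLogDiskUsage maps to aws.aws_rds.bin_log_disk_usage and CPUUtilization maps to aws.aws_rds.cpu_utilization, matching the rows in the table.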
host
Metric Information
Children under host, and their descriptions, are as follows:
Metric | Description |
---|---|
host.auth | DPM database login attempt metrics: failure, blocked, and other. |
host.callers | Metrics related to queries, grouped by connected client. IP address bytes are underscore-separated, as the dot is a reserved character in metric names. Comes from the protocol decoder. This data is used by the Profiler to allow ranking of ‘Hosts.’ These metrics are only available with the On-Host configuration. |
host.connections | Number of connections throughput and connection time throughput. These metrics are only available with the On-Host configuration. |
host.dbs | Metrics by database, such as affected_rows, data_length, index_total, row_count, total_length, etc. This data is used by the Profiler to allow ranking of ‘Databases.’ |
host.queries | Query data, grouped by q, p, e, or c (for query/prepare/execute/close) and query digest. Each metric is suffixed with .tput (except time_us and tput itself) as they are per-second derivative values. This information comes from the protocol decoder, in addition to PERFORMANCE_SCHEMA and PG_STAT_STATEMENTS. This data is used by the Profiler to allow ranking of ‘Queries.’ host.queries.tagged.*, specifically, is used by the Profiler to allow ranking of ‘Query Tags.’ Query tags are only available with the On-Host configuration. More information about query tags is available in our Query Tags Documentation. |
host.samples | Contains the count of failed_rules per query digest. You can read more about the rules we apply to query samples here. |
host.status | Total bytes_sent and bytes_received. |
host.tables | Metrics by table: data_length, index_length, total_length, data_free, and row_count. Comes from INFORMATION_SCHEMA in MySQL and the pg_statio_user_tables view in PostgreSQL. This data is used by the Profiler to allow ranking of ‘Tables.’ host.tables metrics are disabled by default, as the capture of these metrics can be expensive. |
host.totals | Total metrics for all queries combined, for an entire host, regardless of database type. host.totals.queries.* includes totals for all query metrics, including affected_rows, errors, latency, rows_examined, etc. |
host.totals.queries.latency.*.count | Count of queries whose latency was in this latency band. |
host.totals.queries.p99_latency_us | Total P99 latency for the selected host. 99% of query executions were faster than this latency. |
host.totals.queries.time_us | The total execution time of all queries which finished in a given second. |
host.totals.queries.tput | The number of queries which finished executing in a given second. |
host.users | Metrics by user across databases and tables: affected_rows, errors.<code>, errors.no_good_index, no_index, slow, time_us, and tput. Each of these is suffixed with .tput (except time_us and tput itself) as they are per-second derivative values. Comes from the protocol decoder. This data is used by the Profiler to allow ranking of ‘Users.’ These metrics are only available with the On-Host configuration. |
host.verbs | Metrics by query verb (ALTER, SELECT, etc.), such as affected_rows, no_index, rows_examined, slow, etc. This data is used by the Profiler to allow ranking of ‘Query Verbs.’ |
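The note on host.callers about underscore-separated IP addresses follows from the naming rule that dots separate levels of the hierarchy. The Python sketch below illustrates the idea; the helper names are hypothetical, and the exact layout of names under host.callers (here, the address followed by tput) is an assumption made for the example.

```python
import re

# Allowed characters per the naming rules: letters, numbers, underscores, and dots.
METRIC_NAME = re.compile(r"[A-Za-z0-9_]+(\.[A-Za-z0-9_]+)*")


def caller_component(ip_address: str) -> str:
    """Make an IPv4 address usable as a single level of a metric name."""
    return ip_address.replace(".", "_")


def is_valid_metric_name(name: str) -> bool:
    """Check a name against the allowed character set and dot placement."""
    return METRIC_NAME.fullmatch(name) is not None


# Assumed layout for illustration only.
name = f"host.callers.{caller_component('10.1.2.3')}.tput"
levels = name.split(".")
```

Here name is host.callers.10_1_2_3.tput, which has four levels. Leaving the dots in the address would have produced seven levels and scattered one client across the hierarchy.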
mongo
Metric Information
Metrics in the mongo.* family of metrics are captured by the vc-mongo-metrics plugin.
We capture nearly all of the metrics provided by the MongoDB command serverStatus.
We also capture a number of metrics from the connPoolStats command.
DPM metric names follow the MongoDB hierarchy, normalized to our metric style of lowercase with underscores.
Metric | Description |
---|---|
mongo.connpool | Information about the open outgoing connections from the current database instance. |
mongo.status.asserts | Assertions raised since the MongoDB process started. |
mongo.status.background_flushing | mongod process’s periodic writes to disk. |
mongo.status.connections | Status of the connections. |
mongo.status.dur | mongod instance’s journaling-related operations and performance. |
mongo.status.extra_info | Additional information regarding the underlying system. |
mongo.status.global_lock | Reports on the database’s lock state. |
mongo.status.locks | For each lock <type>, data on lock <modes>. |
mongo.status.mem | System architecture of the mongod and current memory use. |
mongo.status.metrics | Various statistics that reflect the current use and state of a running mongod instance. |
mongo.status.network | MongoDB’s network use. |
mongo.status.opcounters | Operations by type since the mongod instance last started. |
mongo.status.opcounters_repl | Database replication operations by type since the mongod instance last started. |
mongo.status.wired_tiger | Metrics about the Wired Tiger storage engine. |
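One way to picture how a nested serverStatus document maps onto dotted metric names is the following Python sketch. It is an illustration under stated assumptions, not the vc-mongo-metrics implementation: the flatten and to_snake_case helpers are hypothetical, only numeric leaves are kept, and keys containing spaces or other special characters are not handled.

```python
import re


def to_snake_case(key: str) -> str:
    """Convert a camelCase key such as 'globalLock' to 'global_lock'."""
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", key).lower()


def flatten(document: dict, prefix: str = "mongo.status") -> dict[str, float]:
    """Walk a nested document and emit one dotted name per numeric leaf."""
    metrics: dict[str, float] = {}
    for key, value in document.items():
        name = f"{prefix}.{to_snake_case(key)}"
        if isinstance(value, dict):
            metrics.update(flatten(value, name))
        elif isinstance(value, (int, float)) and not isinstance(value, bool):
            metrics[name] = float(value)
        # Strings and other non-numeric values are skipped in this sketch.
    return metrics


# A tiny hand-written stand-in for real serverStatus output.
sample = {
    "globalLock": {"currentQueue": {"total": 3}},
    "opcountersRepl": {"insert": 12},
    "host": "db1",
}
metrics = flatten(sample)
```

For the sample document, this yields mongo.status.global_lock.current_queue.total and mongo.status.opcounters_repl.insert.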
mssql
Metric Information
DPM retrieves database metrics for SQL Server from dynamic management views (DMVs), such as data about running threads (e.g. sys.dm_exec_requests) and performance (e.g. sys.dm_os_performance_counters).
Metrics generally have the same name as they appear in the SQL Server documentation, normalized to our convention (e.g. spaces are replaced with underscores).
Metric | Description |
---|---|
mssql.perf_counters.memory_broker_clerks.* | Provides metrics for statistics related to memory broker clerks. SQL Server Documentation Link |
mssql.perf_counters.memory_node.* | Provides metrics to monitor server memory usage on NUMA nodes. SQL Server Documentation Link |
mssql.perf_counters.databases.* | Provides metrics to monitor bulk copy operations, backup and restore throughput, and transaction log activities. SQL Server Documentation Link |
mssql.perf_counters.general_statistics.* | Provides metrics to monitor general server-wide activity, such as the number of current connections and the number of users connecting and disconnecting per second from computers running an instance of SQL Server. SQL Server Documentation Link |
mssql.perf_counters.transactions.* | Provides metrics to monitor the number of transactions active in an instance of the Database Engine, and the effects of those transactions on resources such as the snapshot isolation row version store in tempdb. SQL Server Documentation Link |
mssql.perf_counters.wait_statistics.* | Provides metrics that report information about broad categorizations of waits. SQL Server Documentation Link |
mssql.processlist.* | Information regarding query-family specific execution information, which comes from dm_exec_requests, dm_exec_connections, and dm_exec_sessions. |
mssql.waiting_tasks.* | Aggregated metrics about all the waits encountered by threads that executed. SQL Server Documentation Link |
mssql.status.* | Metrics covering a miscellaneous set of useful information about the computer, and about the resources available to and consumed by SQL Server. SQL Server Documentation Link |
mysql
Metric Information
Metrics recorded in the mysql.* family of metrics are captured by the vc-mysql-metrics plugin.
For information on what settings and permissions are required to capture the metrics listed,
please see here.
Children under mysql, and their descriptions, are as follows:
Metric | Description |
---|---|
mysql.innodb | InnoDB engine information, from selected portions of the return from SHOW ENGINE INNODB STATUS. For more information on this statement, see here. |
mysql.innodb.pending_reads | The number of InnoDB buffer pool pages waiting to be read in to the buffer pool. |
mysql.innodb.trx.state.* | Transaction state information, such as active, notstarted, and prepared, expressed as total time and count. |
mysql.mutex.innodb | Information regarding InnoDB mutex wait times. This data comes from the MySQL Performance Schema, and is available for versions 5.7 and up. For more information about InnoDB mutex wait instruments, which must be enabled to capture this data in DPM, see here. This data is used by the Profiler to allow ranking of InnoDB Mutexes. |
mysql.processlist | Process information from performance_schema.threads, INFORMATION_SCHEMA.processlist, or SHOW PROCESSLIST, in that order, depending on which we are able to query. More information about each child under mysql.processlist, such as callers, command, and query, can be found here. This data is used by the Profiler to allow ranking of ‘MySQL Processlist’ metrics. |
mysql.processlist.callers.* | Total time and count for all threads, grouped by client IP address. |
mysql.processlist.users.* | Total time and count for all threads, grouped by user. |
mysql.processlist.query.* | Total time and count for all threads, grouped by query. The metric name contains the DPM query ID, which can be used to search for other metrics for that query. |
mysql.processlist.state.* | Total time and count for all threads, grouped by state. More information about thread states can be found in the MySQL documentation. |
mysql.status | Metrics built from MySQL’s server status variables, retrieved from SHOW GLOBAL STATUS. More information about each variable is here. We keep the same variable name; for example, MySQL’s aborted_connects status variable becomes mysql.status.aborted_connects. The specific metric mysql.status.replication_delay.us is provided by SHOW SLAVE STATUS. |
mysql.status.binlog_* | Binlog statistics from SHOW STATUS. |
mysql.status.com_* | Indicates the number of times each statement type has been executed. There is one status variable for each type of statement. For example, com_delete and com_update count DELETE and UPDATE statements, respectively. |
mysql.status.com_stmt_* | These metrics are for prepared statement commands. Their names refer to the COM_xxx command set used in the network layer. |
mysql.status.connection_errors_* | These metrics provide information about errors that occur during the client connection process. They represent error counts aggregated across all connections. For more information on each error, see the MySQL documentation. |
mysql.status.created_tmp_* | Metrics about MySQL’s use of temporary files and tables. If created_tmp_disk_tables is large, you may want to increase the tmp_table_size or max_heap_table_size value to lessen the likelihood that internal temporary tables in memory will be converted to on-disk tables. You can compare the number of internal on-disk temporary tables created to the total number of internal temporary tables created by comparing the values of the created_tmp_disk_tables and created_tmp_tables variables. |
mysql.status.handler_read_* | Data about how MySQL is reading data from tables. Large values of handler_read_rnd and handler_read_rnd_next indicate that your tables are poorly indexed or that queries are not written to use the indexes you have. |
mysql.status.innodb_adaptive_hash_* | Activity against the adaptive hash index. A high number of searches and activity usually indicates the index is effective. |
mysql.status.innodb_row_lock_* | InnoDB row lock information, including average and total wait time, the number of row locks being waited for, and the number of total operations which waited for a lock. |
mysql.status.performance_schema_*_lost | These variables provide information about instrumentation that could not be loaded or created due to memory constraints. If you are monitoring using the performance_schema (“off-host”) and “performance_schema_digest_lost” is not 0, some queries are not being tracked. Increase the value of performance_schema_digests_size. |
mysql.status.select_* | How MySQL performed SELECTs. If select_full_join and select_range_check are not 0, you should carefully check the indexes of your tables. select_range is normally not a critical issue even if the value is quite large. |
mysql.status.sort_* | Indicates the number of times a query executed using each sort type. For example, sort_merge_passes indicates the number of merge passes that the sort algorithm has had to do. |
mysql.status.ssl_* | Indicates the number of SSL-related events that have occurred. For example, ssl_accepts indicates the number of accepted SSL connections. |
mysql.status.table_open_* | Hits and misses of table cache lookups. Overflows is the number of times, after a table is opened or closed, a cache instance has an unused entry and the size of the instance is larger than table_open_cache / table_open_cache_instances. |
mysql.tables | Metrics about the non-temporary tables that are open in the table cache, provided by SHOW OPEN TABLES. For more information about this command, see here. |
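The comparison suggested for mysql.status.created_tmp_* can be expressed as a simple ratio. The Python sketch below is only an illustration with made-up numbers; the function name is hypothetical, and what counts as a "large" share is a judgment call rather than a documented threshold.

```python
def tmp_disk_table_ratio(created_tmp_disk_tables: float, created_tmp_tables: float) -> float:
    """Return the share of internal temporary tables that were created on disk."""
    if created_tmp_tables == 0:
        return 0.0  # no temporary tables were created at all
    return created_tmp_disk_tables / created_tmp_tables


# Example: 25 on-disk temporary tables out of 100 created in total.
ratio = tmp_disk_table_ratio(25, 100)
```

A ratio that stays high over time may suggest reviewing tmp_table_size or max_heap_table_size, as described in the table.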
os
Metric Information
To obtain the information contained within the os.* category of metrics, the vc-os-metrics plugin inspects the contents of the /proc virtual filesystem. The agent does not execute any commands, such as ps. For more information about /proc, see here.
Note that when installed using the Off-Host configuration, these metrics will be for the host where the agent is installed, not the database itself. For system metrics when monitoring RDS or Aurora, use the CloudWatch metrics.
Children under os, and their descriptions, are as follows:
Metric | Description |
---|---|
os.cpu.* | CPU statistics retrieved from /proc/stat. Metric names generally correspond to the meanings of the columns contained within /proc/stat, suffixed with _us as they are microsecond time values (see below). More information about /proc/stat and the data it contains is available here. If you would like to trigger an alert on CPU utilization, see the table here. |
os.cpu.user_us | Time spent in user mode |
os.cpu.nice_us | Time spent in user mode with low priority |
os.cpu.system_us | Time spent in system mode |
os.cpu.idle_us | Time spent in the idle task |
os.cpu.io_wait_us | Time waiting for I/O to complete |
os.cpu.irq_us | Time servicing interrupts |
os.cpu.softirq_us | Time servicing softirqs |
os.cpu.steal_us | Stolen time, which is the time spent in other operating systems when running in a virtualized environment |
os.cpu.guest_us | Time spent running a virtual CPU for guest operating systems under the control of the Linux kernel |
os.cpu.guest_nice_us | Time spent running a niced guest (virtual CPU for guest operating systems under the control of the Linux kernel) |
os.cpu.processes | Number of forks since boot |
os.cpu.intr | Count of interrupts serviced since boot time |
os.cpu.ctxt | The number of context switches that the system underwent |
os.cpu.util_pml | DPM-computed processor utilization, per mille (out of 1000) |
os.cpu.cores.count | Number of cores |
os.cpu.freq_mhz | Speed of the processor; retrieved from /proc/cpuinfo |
os.cpu.procs_blocked | Number of processes blocked waiting for I/O to complete |
os.cpu.procs_running | Number of processes in runnable state |
os.cpu.loadavg | Retrieved from /proc/loadavg |
os.disk.* | Disk statistics retrieved from /proc/diskstats. There are metrics for each device individually (<volume_name>, below) as well as aggregate statistics. |
os.disk.<volume_name>.total_io_us | The total number of microseconds spent doing IO. |
os.disk.<volume_name>.weighted_io_us | Weighted number of microseconds spent doing IO. |
os.disk.<volume_name>.ios_in_progress | Number of operations currently in progress. |
os.disk.<volume_name>.avg_ios_in_progress | Weighted IO time / total IO time |
os.disk.<volume_name>.tput | This is the total number of reads and writes, combined. |
os.disk.<volume_name>.read_sectors | This is the total number of sectors read successfully. |
os.disk.<volume_name>.write_sectors | This is the total number of sectors written successfully. |
os.disk.<volume_name>.read_us | This is the total number of microseconds spent by all reads. |
os.disk.<volume_name>.write_us | This is the total number of microseconds spent by all writes. |
os.disk.<volume_name>.read_tput | This is the total number of reads completed successfully. |
os.disk.<volume_name>.write_tput | This is the total number of writes completed successfully. |
os.disk.<volume_name>.read_merges | Reads and writes which are adjacent to each other may be merged for efficiency. Thus two 4K reads may become one 8K read before it is ultimately handed to the disk, and so it will be counted (and queued) as only one I/O. This field lets you know how often this was done. |
os.disk.<volume_name>.write_merges | Reads and writes which are adjacent to each other may be merged for efficiency. Thus two 4K reads may become one 8K read before it is ultimately handed to the disk, and so it will be counted (and queued) as only one I/O. This field lets you know how often this was done. |
os.mem.* | Memory statistics retrieved from /proc/meminfo and /proc/vmstat. All metrics coming from /proc/meminfo begin with bytes_. Metrics corresponding to bytes paged in/out or swapped in/out begin with pages_. |
os.mem.bytes_active | The total amount of buffer or page cache memory, in bytes, that is in active use. This is memory that has been recently used and is usually not reclaimed for other purposes. |
os.mem.bytes_active_anon | Anonymous memory, in bytes, that has been used more recently and is usually not swapped out. |
os.mem.bytes_active_file | Pagecache memory, in bytes, that has been used more recently and is usually not reclaimed until needed. |
os.mem.bytes_anonpages | Size, in bytes, of non-file backed pages mapped into userspace page tables. |
os.mem.bytes_buffers | The amount of physical RAM, in bytes, used for file buffers. |
os.mem.bytes_cached | The amount of physical RAM, in bytes, used as cache memory. |
os.mem.bytes_commited_as | The total amount of memory, in bytes, estimated to complete the workload. This value represents the worst case scenario value, and also includes swap memory. |
os.mem.bytes_commitlimit | Based on the overcommit ratio (vm.overcommit_ratio), this is the total amount of memory, in bytes, currently available to be allocated on the system. |
os.mem.bytes_dirty | The total amount of memory, in bytes, waiting to be written back to the disk. |
os.mem.bytes_free | Total free memory, in bytes. |
os.mem.bytes_inactive | The total amount of buffer or page cache memory, in bytes, that is free and available. This is memory that has not been recently used and can be reclaimed for other purposes. |
os.mem.bytes_inactive_anon | Anonymous memory, in bytes, that has not been used recently and can be swapped out. |
os.mem.bytes_inactive_file | Pagecache memory, in bytes, that can be reclaimed without huge performance impact. |
os.mem.bytes_kernelstack | The memory, in bytes, the kernel stack uses. This is not reclaimable. |
os.mem.bytes_mapped | The total amount of memory, in bytes, that has been used to map devices, files, or libraries using the mmap command. |
os.mem.bytes_mlocked | Size, in bytes, of pages locked to memory using the mlock() system call. Mlocked pages are also Unevictable. |
os.mem.bytes_pagetables | Amount of memory, in bytes, dedicated to the lowest level of page tables. This can increase to a high value if a lot of processes are attached to the same shared memory segment. |
os.mem.bytes_shmem | Memory, in bytes, allocated as small pages shared memory. |
os.mem.bytes_slab | The total amount of memory, in bytes, used by the kernel to cache data structures for its own use. |
os.mem.bytes_sreclaimable | The part of the Slab that might be reclaimed, such as caches. |
os.mem.bytes_sunreclaim | The part of the Slab that can’t be reclaimed under memory pressure. |
os.mem.bytes_total | Total usable memory, in bytes. |
os.mem.bytes_unevictable | Size of unevictable pages, in bytes, that can’t be swapped out for a variety of reasons. |
os.mem.bytes_used | Total memory currently used, in bytes. |
os.mem.bytes_writeback | The total amount of memory, in bytes, actively being written back to the disk. |
os.mem.compact_stalls | Incremented every time a process stalls to run memory compaction so that a huge page is free for use. |
os.mem.major_page_faults | Number of major page faults. |
os.mem.page_faults | Number of page faults. |
os.mem.pages_free | Number of pages of free memory. |
os.mem.pages_in | Number of pages paged into memory, per second. |
os.mem.pages_out | Number of pages paged out of memory, per second. |
os.mem.percent_available | Calculated % of free memory. |
os.mem.percent_used | Calculated % of used memory. |
os.net | Networking statistics retrieved from /proc/net/dev. Metrics are the expected column names prefixed with rx_ and tx_ for receive and send, respectively. There are statistics for each kernel-identified network interface individually as well as aggregate statistics. This data is used by the Profiler to allow ranking of ‘Network Socket’ metrics. |
os.netstat | Networking statistics retrieved from /proc/net/snmp/ (for metric names ip, tcp, and udp) and /proc/net/netstat (for metric names ipext and tcpext). |
os.ps | Process-specific metrics retrieved by examining /proc/$PID/stat and /proc/$PID/io for each process. This data is used by the Profiler to allow ranking by ‘Process.’ |
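To show how the os.cpu.*_us names line up with the columns of /proc/stat, here is a minimal Python sketch that parses the aggregate cpu line from a string. The kernel reports these columns as cumulative clock ticks (USER_HZ); the sketch assumes 100 ticks per second, which is common on Linux but not guaranteed, and the function name is hypothetical. How the agent itself performs the conversion is not documented here.

```python
# Column order of the aggregate "cpu" line in /proc/stat, using the DPM names.
CPU_FIELDS = [
    "user", "nice", "system", "idle", "io_wait",
    "irq", "softirq", "steal", "guest", "guest_nice",
]


def parse_cpu_line(line: str, user_hz: int = 100) -> dict[str, int]:
    """Convert the 'cpu' line of /proc/stat to cumulative microsecond values."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    us_per_tick = 1_000_000 // user_hz  # assumption: USER_HZ is 100 by default
    return {
        f"os.cpu.{name}_us": int(ticks) * us_per_tick
        for name, ticks in zip(CPU_FIELDS, parts[1:])
    }


# A made-up sample line; on a real host you would read /proc/stat instead.
cumulative = parse_cpu_line("cpu  4705 356 584 3699 23 23 0 0 0 0")
```

These values are cumulative since boot. Because the os.cpu time metrics are derivatives, the charted number is the per-second change between two such readings.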
pgsql
Metric Information
Metrics recorded in the pgsql.* category of metrics are captured by the vc-pgsql-metrics plugin.
Metric | Description |
---|---|
pgsql.locks | Metrics for each PostgreSQL locktype, grouped by held or awaited. This data comes from the pg_locks table. For more information, see here. |
pgsql.processlist.state | count and time_us for each PostgreSQL state. This data comes from pg_stat_activity. For more information, see here. |
pgsql.processlist.users | count and time_us by user. This data comes from pg_stat_activity. For more information, see here. |
pgsql.processlist.query | count and time_us by query ID. This data comes from pg_stat_activity. For more information, see here. |
pgsql.status.* | Metric names are the column names documented here. |
pgsql.status.blk_read_time_us | Time spent reading data file blocks by backends, in microseconds |
pgsql.status.blk_write_time_us | Time spent writing data file blocks by backends, in microseconds |
pgsql.status.blks_hit | Number of times disk blocks were found already in the buffer cache, so that a read was not necessary (this only includes hits in the PostgreSQL buffer cache, not the operating system’s file system cache) |
pgsql.status.blks_read | Number of blocks read from disk. |
pgsql.status.buffers_alloc | Number of buffers allocated. |
pgsql.status.buffers_backend | Number of buffers written directly by a backend. |
pgsql.status.buffers_backend_fsync | Number of times a backend had to execute its own fsync call (normally the background writer handles those even when the backend does its own write). |
pgsql.status.buffers_checkpoint | Number of buffers written during checkpoints. |
pgsql.status.buffers_clean | Number of buffers written by the background writer. |
pgsql.status.checkpoint_sync_time | Total amount of time that has been spent in the portion of checkpoint processing where files are synchronized to disk, in milliseconds. |
pgsql.status.checkpoint_write_time | Total amount of time that has been spent in the portion of checkpoint processing where files are written to disk, in milliseconds. |
pgsql.status.checkpoints_req | Number of requested checkpoints that have been performed. |
pgsql.status.checkpoints_timed | Number of scheduled checkpoints that have been performed. |
pgsql.status.conflicts | Number of queries canceled due to conflicts with recovery in this database. |
pgsql.status.numbackends | Number of backends currently connected. |
pgsql.status.replication_delay_us | Replication delay, in microseconds. |
pgsql.status.temp_bytes | Total amount of data written to temporary files by queries. All temporary files are counted, regardless of why the temporary file was created, and regardless of the log_temp_files setting. |
pgsql.status.temp_files | Number of temporary files created by queries. All temporary files are counted, regardless of why the temporary file was created (e.g., sorting or hashing), and regardless of the log_temp_files setting. |
pgsql.status.tup_deleted | Number of rows deleted by queries. |
pgsql.status.tup_fetched | Number of rows fetched by queries. |
pgsql.status.tup_inserted | Number of rows inserted by queries. |
pgsql.status.tup_returned | Number of rows returned by queries. |
pgsql.status.tup_updated | Number of rows updated by queries. |
pgsql.status.xact_commit | Number of transactions that have been committed. |
pgsql.status.xact_rollback | Number of transactions that have been rolled back. |
pgsql.totals | count and time_us totals for state, users, and query. |
redis
Metric Information
Metrics recorded in the redis.* family of metrics are captured by the vc-redis-metrics plugin by using INFO ALL. We record most of the metrics from the clients, memory, persistence, stats, replication, cpu, and cluster output sections. For more information, see here.
Metric | Description |
---|---|
redis.status.aof_delayed_fsync | Delayed fsync counter |
redis.status.aof_enabled | Flag indicating AOF logging is activated |
redis.status.aof_pending_bio_fsync | Number of fsync pending jobs in background I/O queue |
redis.status.aof_pending_rewrite | Flag indicating an AOF rewrite operation will be scheduled once the on-going RDB save is complete. |
redis.status.aof_rewrite_in_progress | Flag indicating an AOF rewrite operation is on-going |
redis.status.aof_rewrite_scheduled | Flag indicating an AOF rewrite operation will be scheduled once the on-going RDB save is complete. |
redis.status.blocked_clients | Number of clients pending on a blocking call (BLPOP, BRPOP, BRPOPLPUSH) |
redis.status.client_biggest_input_buf | Biggest input buffer among current client connections |
redis.status.client_longest_output_list | Longest output list among current client connections |
redis.status.cluster_enabled | Flag indicating Redis cluster is enabled |
redis.status.connected_clients | Number of client connections (excluding connections from replicas) |
redis.status.connected_slaves | Number of connected replicas |
redis.status.evicted_keys | Number of evicted keys due to maxmemory limit |
redis.status.expired_keys | Total number of key expiration events |
redis.status.instantaneous_ops_per_sec | Number of commands processed per second |
redis.status.keyspace_hits | Number of successful lookups of keys in the main dictionary |
redis.status.keyspace_misses | Number of failed lookups of keys in the main dictionary |
redis.status.latest_fork_usec | Duration of the latest fork operation in microseconds |
redis.status.loading_start_time | Epoch-based timestamp of the start of the load operation |
redis.status.loading | Flag indicating if the load of a dump file is on-going |
redis.status.loading_loaded_perc | Amount of the dump file already loaded, expressed as a percentage |
redis.status.mem_fragmentation_ratio | Ratio between used_memory_rss and used_memory |
redis.status.pubsub_channels | Global number of pub/sub channels with client subscriptions |
redis.status.pubsub_patterns | Global number of pub/sub patterns with client subscriptions |
redis.status.rdb_bgsave_in_progress | Flag indicating an RDB save is on-going |
redis.status.rdb_changes_since_last_save | Number of changes since the last dump |
redis.status.rejected_connections | Number of connections rejected because of the maxclients limit |
redis.status.repl_backlog_active | Flag indicating replication backlog is active |
redis.status.total_commands_processed | Total number of commands processed by the server |
redis.status.total_connections_received | Total number of connections accepted by the server |
redis.status.uptime_in_seconds | Number of seconds since Redis server start |
redis.status.used_cpu_sys_children | System CPU consumed by the background processes |
redis.status.used_cpu_sys | System CPU consumed by the Redis server |
redis.status.used_cpu_user_children | User CPU consumed by the background processes |
redis.status.used_cpu_user | User CPU consumed by the Redis server |
redis.status.used_memory_lua | Number of bytes used by the Lua engine |
redis.status.used_memory_peak | Peak memory consumed by Redis (in bytes) |
redis.status.used_memory_rss | Number of bytes that Redis allocated as seen by the operating system. |
redis.status.used_memory | Total number of bytes allocated by Redis using its allocator. |
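The INFO command returns plain text made of "# Section" header lines and "key:value" lines. The Python sketch below shows one way such text could be turned into redis.status.* names; it is an illustration, not the vc-redis-metrics implementation. The parse_info helper is hypothetical, and non-numeric values are simply skipped.

```python
def parse_info(text: str) -> dict[str, float]:
    """Map numeric 'key:value' lines of INFO output to redis.status.* names."""
    metrics: dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and section headers
        key, _, raw_value = line.partition(":")
        try:
            metrics[f"redis.status.{key}"] = float(raw_value)
        except ValueError:
            continue  # non-numeric fields such as redis_version
    return metrics


# A short hand-written excerpt standing in for real INFO output.
sample = """# Server
redis_version:7.0.0
# Memory
used_memory:1048576
used_memory_rss:2097152
mem_fragmentation_ratio:2.00
"""
metrics = parse_info(sample)
ratio = metrics["redis.status.used_memory_rss"] / metrics["redis.status.used_memory"]
```

In this excerpt the computed ratio is 2.0, which agrees with the mem_fragmentation_ratio field, defined in the table as the ratio between used_memory_rss and used_memory.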