Metric Categories

A metric is some value that a resource exposes. An observation is a sample of a metric’s value at a point in time from a given resource. For example, a network interface has a bytes_received metric, and if we sample its value at 12:00 PM we have one observation. A metric has a dot-separated name, which defines it within a namespace hierarchy or tree. Metric names can only include letters, numbers, underscores, and dots.
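The naming rule above can be sketched as a small validator. This is only an illustration: the function names are hypothetical, and rejecting empty segments (such as a leading dot or two dots in a row) is an assumption the rule itself does not state.

```python
import re
from typing import List

# Assumption: a name is one or more non-empty segments separated by dots,
# and each segment contains only letters, digits, and underscores.
_METRIC_NAME = re.compile(r"[A-Za-z0-9_]+(?:\.[A-Za-z0-9_]+)*")


def is_valid_metric_name(name: str) -> bool:
    """Return True if the name uses only the permitted characters."""
    return _METRIC_NAME.fullmatch(name) is not None


def namespace_path(name: str) -> List[str]:
    """Split a metric name into its position in the namespace tree."""
    if not is_valid_metric_name(name):
        raise ValueError(f"invalid metric name: {name!r}")
    return name.split(".")


print(namespace_path("mysql.innodb.pending_log_writes"))
```

Each element of the returned list is one level of the tree, so metrics that share a prefix share a parent namespace.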

Metrics at VividCortex take one of two forms, which we refer to as a gauge and a derivative.

  • A gauge represents a scalar value, recorded at a specific time, such as the current value of the motherboard temperature or the number of users logged in. A VividCortex example would be mysql.innodb.pending_log_writes.

  • A derivative represents the rate of change of an accumulated value, such as the number of network bytes received on an interface. We compute it by subtracting the previously seen value from each new sample and dividing by the time between samples. A VividCortex example would be os.cpu.idle_us. These numbers often represent throughput or time, and many derivative metrics are suffixed with tput or _us (see the table below).
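The subtraction and division described for derivatives can be sketched as follows. This is a minimal illustration, not the agent's implementation: the Sample type and the function name are hypothetical, and returning None when the counter decreases (for example after a restart) is an assumption.

```python
from typing import Optional, Tuple

# Hypothetical sample type: (timestamp in seconds, accumulated counter value).
Sample = Tuple[float, float]


def derivative(previous: Sample, current: Sample) -> Optional[float]:
    """Per-second rate of change between two samples of an accumulated value."""
    prev_ts, prev_value = previous
    cur_ts, cur_value = current
    elapsed = cur_ts - prev_ts
    if elapsed <= 0:
        return None  # samples out of order or taken at the same instant
    delta = cur_value - prev_value
    if delta < 0:
        return None  # assumption: treat a decreasing counter as a reset
    return delta / elapsed


# A counter that grows from 1000 to 6000 over 10 seconds changes at 500 per second.
print(derivative((0.0, 1000.0), (10.0, 6000.0)))
```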

We use some standard naming suffixes for common types of metrics:

Concept                              Suffix
Throughput or arrivals per second    tput
Read or write operations             reads / writes
Timing                               *_us (microseconds)
Timing                               *_s (seconds)
Utilization                          util (integer from 1 to 100)
Count                                count
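As a rough illustration of how these suffixes can be read off a metric name, the sketch below looks at the last dot-separated segment and reports the matching concept. The lookup table and function name are hypothetical, and a suffix match is only a hint about what a metric measures.

```python
from typing import Optional

# Suffixes from the table above, paired with the concept each one signals.
SUFFIX_CONCEPTS = [
    ("tput", "throughput or arrivals per second"),
    ("reads", "read operations"),
    ("writes", "write operations"),
    ("_us", "timing (microseconds)"),
    ("_s", "timing (seconds)"),
    ("util", "utilization"),
    ("count", "count"),
]


def concept_for(metric_name: str) -> Optional[str]:
    """Return the concept suggested by the last segment's suffix, if any."""
    leaf = metric_name.rsplit(".", 1)[-1]
    for suffix, concept in SUFFIX_CONCEPTS:
        if leaf.endswith(suffix):
            return concept
    return None


print(concept_for("os.cpu.idle_us"))
```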

The default top-level metric namespaces, with a brief description of each, are:

Name      Description
agents    Internal diagnostic metrics of agent behavior.
aws       Metrics retrieved from Amazon CloudWatch.
cstar     Metrics from Cassandra sources, such as org.apache.cassandra.metrics.
host      Database-agnostic data, such as user/connection/client statistics and query metrics.
mongo     Metrics from MongoDB sources, such as serverStatus.
mysql     Metrics from MySQL sources, such as PERFORMANCE_SCHEMA.
os        Disk, memory, CPU, ps, and networking information.
pgsql     Metrics from PostgreSQL sources, such as pg_stat_statements.
redis     Metrics from Redis sources, such as INFO ALL.

Below is more information about each of these metric categories. The complete list of metrics is extensive, and in many cases the metrics come directly from the output of built-in database commands (such as SHOW PROCESSLIST); we have linked to the relevant database-specific documentation where applicable. You can use the categories below to search for metrics on the Metrics page of VividCortex.


agents Metric Information

Metrics in the agents.* family contain diagnostic information about the VividCortex agent. Each plugin records different metrics about itself, and the metrics are organized by plugin name; for example, agents.vc_mysql_metrics.*.

aws Metric Information

Metrics in the aws.* family come from CloudWatch, Amazon’s monitoring service, for MySQL RDS and PostgreSQL RDS. To enable CloudWatch for your Amazon instances, follow the instructions here.

We collect the count, sum, min, and max values as provided directly from Amazon.

Category Description
aws.aws_rds We collect the metrics on this page. The metric names are lowercased, with words separated by underscores, to match our standard. For example, the AWS metric BinLogDiskUsage becomes aws.bin_log_disk_usage.*
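The renaming described above (lowercase, words separated by underscores) can be sketched with two regular-expression passes. This is an illustration only: the function names are hypothetical, and the handling of runs of capitals (acronyms) is a guess, since the text documents only the BinLogDiskUsage example.

```python
import re


def to_snake_case(aws_name: str) -> str:
    """Convert a CamelCase CloudWatch metric name to lowercase_with_underscores."""
    # Insert an underscore before a capitalized word that follows any character.
    step1 = re.sub(r"(.)([A-Z][a-z]+)", r"\1_\2", aws_name)
    # Insert an underscore between a lowercase letter or digit and a capital.
    step2 = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", step1)
    return step2.lower()


def aws_metric_prefix(aws_name: str) -> str:
    """Build the metric prefix in the form shown in the example above."""
    return "aws." + to_snake_case(aws_name)


print(aws_metric_prefix("BinLogDiskUsage"))  # aws.bin_log_disk_usage
```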


cstar Metric Information

Most metrics in the cstar.* family come from here, except where noted.

Category Description
cstar.status.cache KeyCache, RowCache, and CounterCache metrics.
cstar.status.clientrequest read, write, and rangeslice metrics including latency and timeouts.
cstar.status.commitlog Commitlog completedtasks, pendingtasks, and size metrics.
cstar.status.compaction Compaction-related pendingtasks, tasks, completed, and compacted metrics.
cstar.status.connection Throughput of connection timeouts.
cstar.status.droppedmessage Throughput of dropped messages per droppable verb, such as hint, mutation, or read.
cstar.status.storage hintsinprogress, hints, exceptions, and load metrics.
cstar.status.threadpools request and internal thread pool metrics, with metrics for pendingtasks, activetasks and completedtasks.
cstar.status.cql Counts of prepared and non-prepared statements, from here.


host Metric Information

Children under host, and their descriptions, are as follows:

Category Description
host.auth VividCortex database login attempt metrics: failure, blocked, and other.
host.callers Metrics related to queries, grouped by connected client. The octets of a client IP address are separated by underscores, because the dot is reserved as the separator in metric names. This data comes from the protocol decoder.

This data is used by the Profiler to allow ranking of ‘Hosts.’

These metrics are only available with the On-Host configuration.
host.connections Throughput of connections and of connection time.

These metrics are only available with the On-Host configuration.
host.dbs Metrics by database, such as affected_rows, data_length, index_total, row_count, total_length, etc.

This data is used by the Profiler to allow ranking of ‘Databases.’

These metrics are only available with the On-Host configuration.
host.queries Query data, grouped by q, p, e, or c (for query/prepare/execute/close) and query digest. Each metric is suffixed with .tput (except time_us and tput itself), as these are per-second derivative values. This information comes from the protocol decoder, in addition to PERFORMANCE_SCHEMA and PG_STAT_STATEMENTS.

This data is used by the Profiler to allow ranking of ‘Queries.’

host.queries.tagged.*, specifically, is used by the Profiler to allow ranking of ‘Query Tags.’ Query tags are only available with the On-Host configuration. More information about query tags is available in our Query Tags Documentation.
host.samples Contains the count of failed_rules per query digest. You can read more about the rules we apply to query samples here.
host.status Total bytes_sent and bytes_received.
host.tables Metrics by table: data_length, index_length, total_length, data_free, and row_count. This data comes from INFORMATION_SCHEMA in MySQL and from the pg_statio_user_tables view in PostgreSQL.

This data is used by the Profiler to allow ranking of ‘Tables.’

host.tables metrics are disabled by default, as the capture of these metrics can be expensive.
host.totals Total metrics for all queries combined, for an entire host, regardless of database type. host.totals.queries.* includes totals for all query metrics, including affected_rows, errors, latency, rows_examined, etc.
host.users Metrics by user across databases and tables: affected_rows, errors.<code>, errors.no_good_index, no_index, slow, time_us, and tput. Each of these is suffixed with .tput (except time_us and tput itself), as they are per-second derivative values. This data comes from the protocol decoder.

This data is used by the Profiler to allow ranking of ‘Users.’

These metrics are only available with the On-Host configuration.
host.verbs Metrics by query verb (ALTER, SELECT, etc.), such as affected_rows, no_index, rows_examined, slow, etc.

This data is used by the Profiler to allow ranking of ‘Query Verbs.’
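The host.callers description above notes that IP addresses are written with underscores, because a dot would otherwise start a new namespace level. A minimal sketch of that encoding follows. The function name is hypothetical, and the sketch covers only IPv4, since the text does not say how other address forms are handled.

```python
import ipaddress


def ip_to_metric_segment(address: str) -> str:
    """Encode an IPv4 address as a single metric-name segment."""
    # IPv4Address raises ValueError for anything that is not a valid IPv4 address.
    parsed = ipaddress.IPv4Address(address)
    return str(parsed).replace(".", "_")


print(ip_to_metric_segment("192.168.1.20"))  # 192_168_1_20
```

Without this step, the four octets would be read as four nested namespaces rather than one client.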


mongo Metric Information

Metrics in the mongo.* family of metrics are captured by the vc-mongo-metrics plugin.

We capture nearly all of the metrics provided by the MongoDB command serverStatus, as well as a number of metrics from the connPoolStats command.

VividCortex metric names follow the MongoDB hierarchy, normalized to our metric style of lowercase with underscores.

Category Description
mongo.connpool Information about the open outgoing connections from the current database instance.
mongo.status.asserts Assertions raised since the MongoDB process started.
mongo.status.background_flushing mongod process’s periodic writes to disk.
mongo.status.connections Status of the connections.
mongo.status.dur mongod instance’s journaling-related operations and performance.
mongo.status.extra_info Additional information regarding the underlying system.
mongo.status.global_lock Reports on the database’s lock state.
mongo.status.locks For each lock <type>, data on lock <modes>.
mongo.status.mem System architecture of the mongod and current memory use.
mongo.status.metrics Various statistics that reflect the current use and state of a running mongod instance.
mongo.status.network MongoDB’s network use.
mongo.status.opcounters Operations by type since the mongod instance last started.
mongo.status.opcounters_repl Database replication operations by type since the mongod instance last started.
mongo.status.wired_tiger Metrics about the WiredTiger storage engine.


mysql Metric Information

Metrics recorded in the mysql.* family of metrics are captured by the vc-mysql-metrics plugin. For information on what settings and permissions are required to capture the metrics listed, please see here.

Children under mysql, and their descriptions, are as follows:

Category Description
mysql.innodb InnoDB engine information, from selected portions of the return from SHOW ENGINE INNODB STATUS. For more information on this statement, see here.
mysql.mutex.innodb Information regarding InnoDB mutex wait times. This data comes from the MySQL Performance Schema. For more information about InnoDB mutex wait instruments, which must be enabled to capture this data in VividCortex, see here.
This data is used by the Profiler to allow ranking of InnoDB Mutexes.
mysql.processlist Process information from performance_schema.threads, INFORMATION_SCHEMA.processlist, or SHOW PROCESSLIST, in that order, depending on which we are able to query. More information about each child under mysql.processlist, such as callers, command, and query, can be found here.
This data is used by the Profiler to allow ranking of ‘MySQL Processlist’ metrics.
mysql.status Metrics built from MySQL’s server status variables, retrieved from SHOW GLOBAL STATUS. More information about each variable is here.
The specific metric mysql.status.replication_delay.us is provided by SHOW SLAVE STATUS.
mysql.tables Metrics about the non-temporary tables that are open in the table cache, provided by SHOW OPEN TABLES. For more information about this command, see here.


os Metric Information

To obtain the information contained within the os.* category of metrics, the vc-os-metrics plugin inspects the contents of the /proc virtual filesystem. The agent does not execute any commands, such as ps. For more information about /proc, see here. These metrics are also available for non-Linux installs; descriptions of where the data is derived from will be provided in the near future.

Note that all of these metrics are only available with the On-Host configuration.

Children under os, and their descriptions, are as follows:

Category Description
os.cpu CPU statistics retrieved from /proc/stat. Metric names generally correspond to the columns of /proc/stat, suffixed with _us because they are time values; ctxt, processes, procs_blocked, and procs_running behave as expected.
os.cpu.freq_mhz is retrieved from /proc/cpuinfo. os.cpu.loadavg is retrieved from /proc/loadavg.
os.disk Disk statistics retrieved from /proc/diskstats. There are metrics for each disk individually as well as aggregate statistics.
os.mem Memory statistics retrieved from /proc/meminfo and /proc/vmstat. All metrics coming from /proc/meminfo begin with bytes_. Metrics corresponding to bytes paged in/out or swapped in/out begin with pages_.
os.net Networking statistics retrieved from /proc/net/dev. Metrics are the expected column names prefixed with rx_ and tx_ for receive and send, respectively. There are statistics for each kernel-identified network interface individually as well as aggregate statistics.
This data is used by the Profiler to allow ranking of ‘Network Socket’ metrics.
os.netstat Networking statistics retrieved from /proc/net/snmp (for metric names ip, tcp, and udp) and /proc/net/netstat (for metric names ipext and tcpext).
os.ps Process-specific metrics retrieved by examining /proc/$PID/stat and /proc/$PID/io for each process.
This data is used by the Profiler to allow ranking by ‘Process.’
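To make the os.cpu row above more concrete, the sketch below parses the aggregate cpu line of /proc/stat and converts each time column to microseconds. It is an illustration under stated assumptions, not the plugin's code: the function name and the sample text are made up, the metric names other than os.cpu.idle_us are inferred from the column names, and the tick rate (USER_HZ) is assumed to be 100, which is typical on Linux but should be checked with os.sysconf("SC_CLK_TCK") on a real host.

```python
from typing import Dict

# Time columns of the aggregate "cpu" line in /proc/stat, in order.
CPU_COLUMNS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(stat_text: str, user_hz: int = 100) -> Dict[str, int]:
    """Convert the aggregate cpu line from clock ticks to microseconds."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            ticks = [int(value) for value in fields[1:1 + len(CPU_COLUMNS)]]
            return {
                f"os.cpu.{column}_us": tick * 1_000_000 // user_hz
                for column, tick in zip(CPU_COLUMNS, ticks)
            }
    raise ValueError("no aggregate 'cpu' line found")


# Made-up sample in the /proc/stat layout.
SAMPLE = "cpu  4705 150 1120 16250 520 0 30 0 0 0\ncpu0 4705 150 1120 16250 520 0 30 0 0 0\n"
print(parse_cpu_line(SAMPLE)["os.cpu.idle_us"])  # 162500000
```

These values are accumulated since boot, so a per-second rate comes from the difference between two successive readings divided by the time between them.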


pgsql Metric Information

Metrics recorded in the pgsql.* category of metrics are captured by the vc-pgsql-metrics plugin.

Category Description
pgsql.locks Metrics for each PostgreSQL locktype, grouped by held or awaited. This data comes from the pg_locks view. For more information, see here.
pgsql.processlist.state count and time_us for each PostgreSQL state. This data comes from pg_stat_activity. For more information, see here.
pgsql.processlist.users count and time_us by user. This data comes from pg_stat_activity. For more information, see here.
pgsql.processlist.query count and time_us by query ID. This data comes from pg_stat_activity. For more information, see here.
pgsql.status Data from the pg_stat_database view. Metric names are the column names documented here; for example, pgsql.status.blks_read.

Also includes data from the pg_stat_bgwriter view, which is one row about the background writer process. Metric names are the column names documented here; for example, pgsql.status.buffers_clean.
pgsql.totals count and time_us totals for state, users, and query.


redis Metric Information

Metrics recorded in the redis.* family of metrics are captured by the vc-redis-metrics plugin.

There is one child under redis:

Category Description
redis.status Data captured from issuing INFO ALL. We record most of the metrics from the clients, memory, persistence, stats, replication, cpu, and cluster output sections. For more information, see here.