Built-In Event Types
Database Performance Monitor comes pre-packaged with a number of automatically-detected events. You can also send your own custom events by following the instructions here. This is an alphabetized list of the events which you may see in the DPM application. You can set up an alert to trigger when any of these events has occurred using the Alerts page.
Approaching Max MySQL Connections
- The number of connections to the database has reached 95% of the database’s current limit. This is configurable using the max-db-conns-threshold database metrics agent configuration file setting; for example, "max-db-conns-threshold":"90" will trigger an event when 90% of connections are used.
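The threshold arithmetic can be sketched as follows. This is only an illustration of the percentage check described above, not the agent's actual implementation: the function name and parameters are hypothetical, and treating the comparison as inclusive ("at or above") is an assumption.

```python
def approaching_max_connections(used: int, limit: int, threshold_pct: float = 95.0) -> bool:
    """Return True when `used` connections are at or above `threshold_pct` percent of `limit`.

    Hypothetical helper: `threshold_pct` stands in for the max-db-conns-threshold
    setting (default 95), and the inclusive comparison is an assumption.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    # Compare used/limit against threshold_pct/100 without dividing.
    return used * 100 >= threshold_pct * limit


if __name__ == "__main__":
    # With a limit of 151 connections, the default 95% threshold is crossed at 144.
    print(approaching_max_connections(143, 151))
    print(approaching_max_connections(144, 151))
    # Lowering the threshold to 90 makes the check fire earlier.
    print(approaching_max_connections(136, 151, threshold_pct=90))
```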
Balancer Round Complete
- The MongoDB balancer has finished rebalancing data between the shards. More information about the MongoDB balancer is available in the MongoDB documentation.
Database Configuration Change
- The DPM agent detected a change in the configuration of the database being monitored. The agent will detect runtime configuration changes as well as changes applied after a database is restarted. The agent must be running at the time of the change in order to detect it.
Database Server Restart
- The database being monitored by DPM has been restarted.
Disk Device Almost Full
- The disk device has 10% free space. This is configurable using the disk-full-threshold-warn OS metrics agent configuration file setting; for example, "disk-full-threshold-warn":"15" will trigger an event when the disk is down to 15% free space.
Disk Device Full
- The disk device has 5% free space. This is configurable using the disk-full-threshold-crit OS metrics agent configuration file setting; for example, "disk-full-threshold-crit":"10" will trigger an event when the disk is down to 10% free space.
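How the two disk thresholds relate can be sketched as a small classifier. This is a hypothetical illustration, not the agent's code: the function name is invented, the defaults mirror the 10% and 5% values described above, and both the inclusive comparison and reporting only the more severe event are assumptions.

```python
from typing import Optional


def disk_event(free_pct: float, warn_pct: float = 10.0, crit_pct: float = 5.0) -> Optional[str]:
    """Map a free-space percentage to the event name it would correspond to.

    `warn_pct` and `crit_pct` stand in for disk-full-threshold-warn and
    disk-full-threshold-crit. Returning only the more severe event is an assumption.
    """
    if free_pct <= crit_pct:
        return "Disk Device Full"
    if free_pct <= warn_pct:
        return "Disk Device Almost Full"
    return None


if __name__ == "__main__":
    for pct in (12.0, 10.0, 4.0):
        print(pct, disk_event(pct))
    # With the example overrides from the text (warn 15, crit 10):
    print(12.0, disk_event(12.0, warn_pct=15.0, crit_pct=10.0))
```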
High Swap Activity
- Swap activity on this disk device has exceeded 100 pages per second.
Long-Running Autovacuum
- You can configure DPM to detect when the PostgreSQL autovacuum process takes longer than a defined threshold. To enable, edit your host’s /etc/vividcortex/vc-pgsql-metrics.conf configuration file to include the pg-vacuum-events setting. Its value pairs an event level with a time threshold. Acceptable event levels are info, warn, and crit; the time threshold is written as Nm, where N is a number of minutes. For example, if you wanted to generate an info level event for a vacuum taking more than 1 minute and a warning after 5 minutes, you would include "pg-vacuum-events":"info:1m,warn:5m" in your configuration file.
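To make the value format concrete, here is a minimal sketch of parsing a pg-vacuum-events string such as info:1m,warn:5m. The parser and the helper that picks a level are hypothetical and are not part of DPM. The grammar is inferred only from the example above, and reporting just the most severe level exceeded is an assumption.

```python
import re
from typing import Dict, Optional

_VALID_LEVELS = ("info", "warn", "crit")   # ordered from least to most severe
_ENTRY = re.compile(r"^(\w+):(\d+)m$")     # e.g. "warn:5m"


def parse_vacuum_events(value: str) -> Dict[str, int]:
    """Parse 'level:Nm' pairs separated by commas into {level: minutes}."""
    thresholds: Dict[str, int] = {}
    for entry in value.split(","):
        match = _ENTRY.match(entry.strip())
        if match is None:
            raise ValueError(f"malformed entry: {entry!r}")
        level, minutes = match.group(1), int(match.group(2))
        if level not in _VALID_LEVELS:
            raise ValueError(f"unknown event level: {level!r}")
        thresholds[level] = minutes
    return thresholds


def vacuum_event_level(duration_minutes: float, thresholds: Dict[str, int]) -> Optional[str]:
    """Return the most severe level whose threshold the duration exceeds, if any."""
    result = None
    for level in _VALID_LEVELS:
        limit = thresholds.get(level)
        if limit is not None and duration_minutes > limit:
            result = level
    return result


if __name__ == "__main__":
    cfg = parse_vacuum_events("info:1m,warn:5m")
    print(cfg)
    print(vacuum_event_level(3, cfg))
    print(vacuum_event_level(6, cfg))
```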
Long-Running Query
- You can configure DPM to detect when a query takes longer than a defined threshold to execute. This feature is available for MySQL, PostgreSQL, and MongoDB. To enable long-running query detection, select an Environment, then go to Settings. In Settings, click Preferences and then select “Enable Long Running Query Detection” under Events Settings.
When configured, we will generate an Event for each query that is detected. You can then configure an Event-based Alert to notify you.
Max MySQL Connections Reached
- The database has reached its maximum number of connections.
MongoDB Server Stall
- This event, along with corresponding types for other supported databases, represents a Fault that DPM detected. Faults are short stalls that are usually caused by a bottleneck in the database’s workload. Refer to the Faults Page for details on how faults work.
MySQL Replication Started
- The DPM agent has detected that replication has started running for a replica database. The event will specify whether START SLAVE was seen when replication started.
MySQL Replication Stopped
- The DPM agent has detected that replication has stopped running for a replica database. The event will specify whether STOP SLAVE was seen when replication stopped.
MySQL Server Stall
- This event, along with corresponding types for other supported databases, represents a Fault that DPM detected. Faults are short stalls that are usually caused by a bottleneck in the database’s workload. Refer to the Faults Page for details on how faults work.
New Important Query
- A previously unseen query has appeared among the top 10 queries in the environment. Similar to the New Important Process type.
New Important Process
- This event is similar to the New Important Query type - it indicates that a previously unseen process has appeared among the top 10 processes in the environment.
Out of Memory Killer
- The Linux Out of Memory (OOM) Killer has killed a process. The process will be named in the event.
PostgreSQL Replication Started
- The DPM agent has detected that replication has started running for a replica database.
PostgreSQL Replication Stopped
- The DPM agent has detected that replication has stopped running for a replica database.
PostgreSQL Server Stall
- This event, along with corresponding types for other supported databases, represents a Fault that DPM detected. Faults are short stalls that are usually caused by a bottleneck in the database’s workload. Refer to the Faults Page for details on how faults work.
Replica Set State Change
- The MongoDB node in a replica set has changed state; for example, it may now be a primary or a secondary. A complete list of replica states can be found in the MongoDB documentation.
Replica Sync Completed
- The MongoDB node has finished fetching data from the primary.
Replica Sync Started
- The MongoDB node is out of sync and fetching data from the primary.
SQL Injection
- DPM has detected a possible SQL injection attempt. You should investigate the query sample to verify. Note that this feature requires that capturing query sample text be enabled, and that the samples are being sent to DPM.
DPM Diagnostic Events
- In addition to the events above, which provide information about your operating system(s) or database(s), the following events provide information about the DPM service. For more information about these events or if you would like help with a resolution, please contact Support.
Agent Shutdown
- This event indicates a DPM agent has stopped running. The event reports the specific agent that stopped, and also reports the manner in which it was stopped (e.g. graceful shutdown or a process kill). Agents restart periodically, such as during automatic upgrades, and an agent shutdown or startup should be considered normal behavior.
Agent Startup
- This event indicates a DPM agent has started running. The event reports the specific agent that started, along with its version. Agents restart periodically, such as during automatic upgrades, and an agent shutdown or startup should be considered normal behavior.
Database Connection Error
- The monitoring agent is no longer able to connect to a previously-monitored database. For example, this can be because the database is refusing connections or is down.
High Distinct Query Rate
- DPM has received a very high number of unique query digests. Too many unique query digests prevent data from being aggregated in a meaningful way. To continue providing useful data, we automatically adjust the digestion algorithm to be more aggressive, which allows the system to better detect queries that are similar to one another and differ only in small ways that affect digestion but not query behavior.
Host Registration
- The agent attempted to register a new database host and failed. This can be because there are no licenses available for the new database, or because of a failure in the API.
Host stopped sending data
- The API has not received any new metrics from a registered host in the last 15 minutes. This can be caused by connection problems or by a host being shut down.
Intermittent Connectivity to Datacenter!
- The DPM agent experienced a connection timeout while attempting to report metrics to the DPM API. If this event shows up in your environment, it could indicate a networking problem that will lead to missing data in your environment if not addressed.
MySQL Open Tables Tracking Stopped
- A database has more open tables than the agent is configured to track, so table tracking has been disabled to prevent any potential performance problems. By default the agent will capture metrics on the open tables that MySQL is using, but since doing so is expensive, this monitoring is automatically disabled if too many tables are open. If you would like these metrics enabled, contact Support for assistance in changing the limit.
No TCP Captured
- The agent has not detected any TCP traffic on any of the ports it is configured to monitor. The agent will restart itself in an attempt to correct for any errors, but if the problem continues you should check that traffic is being sent to the database on the correct IP/port, or contact Support.
Performance Schema Configuration
- When monitoring MySQL using the performance_schema, we fetch data from events_statements_summary_by_digest. This table has a maximum size, configured at server start with the performance_schema_digests_size system variable. This alert indicates that more than 50% of a server’s query execution count is for queries that performance_schema could not record because the table is full. Please see this FAQ for information about how to fix this.
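The 50% condition can be illustrated with a short sketch. It relies on MySQL's documented behavior that, once events_statements_summary_by_digest is full, further statements are folded into a single row whose DIGEST is NULL. How DPM itself computes the ratio is not described here, so the function below is a hypothetical approximation and the (digest, COUNT_STAR) row format is an assumption.

```python
from typing import Iterable, Optional, Tuple


def unrecorded_share(rows: Iterable[Tuple[Optional[str], int]]) -> float:
    """Fraction of executions that fell into the NULL-digest overflow row.

    `rows` are hypothetical (DIGEST, COUNT_STAR) pairs as they might be read
    from events_statements_summary_by_digest; a digest of None marks the
    catch-all row MySQL uses when the table is full.
    """
    total = 0
    overflow = 0
    for digest, count_star in rows:
        total += count_star
        if digest is None:
            overflow += count_star
    return overflow / total if total else 0.0


if __name__ == "__main__":
    sample = [("digest_a", 30), ("digest_b", 10), (None, 60)]
    share = unrecorded_share(sample)
    # The event described above corresponds to this share exceeding 0.5.
    print(share, share > 0.5)
```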
Performance Schema Unavailable
- The agent is not able to collect metrics from the database because the performance_schema is not enabled.
PostgreSQL User Lacks Privileges
- The user assigned to DPM for monitoring your database does not have the necessary privileges in order to capture data. Please refer to our privileges documentation for information about what PostgreSQL privileges are required.
Query Samples Unavailable
- A database host’s configuration does not allow the DPM agent to collect samples of queries that ran on the host. This is only applicable for hosts that are using the off-host monitoring configuration.
Restart Loop
- One of the agents is caught in a restart loop and cannot finish loading. Your database may or may not be monitored. This can happen if the agent does not have sufficient file system privileges or if it cannot download a necessary update. You can refer to the vc-agent-007 log file for more information, or contact Support.
Size Metrics Aborted
- DPM has automatically disabled trying to fetch database and table sizes on this host because the database has taken too long to respond. This is done in order to prevent adding more load to a database possibly under heavy load.
Stale Action
- One of our monitoring agent’s queries took longer than expected. This is typically caused by high load on the database, a server stall, or some other database performance issue. When this happens, it can correspond to some metrics missing for a few seconds on the affected host.