Query Samples & Configuration

We obfuscate all queries we observe by normalizing them into digests, removing all literal text from the query. This obfuscation takes place locally and not within the API, and we transmit the digested query to the application.

In addition, we periodically capture individual samples of queries. These query samples include the literal text of the query, data about the execution such as latency, and (if possible) an execution plan. When a raw query sample is available for examination, we display a small circle on the query details sparkplot, one for each sample.

Here’s a screenshot:

Top Queries

Query samples are retained for 30 days, and then purged.

Database Performance Monitor also stores the checksum (first 64 bits of MD5) for each query, which is used throughout the application to assist with analysis (such as through metric names which include the query checksum).

This table illustrates how a query is digested and checksummed:

Item Example Usage
raw query SELECT * FROM db1.tbl1 WHERE SSN = '123-45-6789' Sampled on a periodic basis and displayed in the UI to facilitate inspection of queries.
digested query select * from db?.tbl? where ssn = ? Used for aggregate metrics.
checksum 51c37c16602e541c Used as a key for metrics about queries that share the same digest (or query shape), e.g. host.queries.51c37c16602e541c.latency_us

Data Contained in a Sample

A full query sample contains the following data:

  • The raw text of the query, as well as any query comments
  • Metadata about the execution itself; this can include latency, user, origin, database, connection ID, and error codes
  • An execution plan provided by the database (if possible)

Because raw query text may contain Personally Identifiable Information (PII) or other sensitive information, what data we collect with a sample (or whether we capture samples at all) is configurable.

Configuring Capture of Query Samples

Except for disabling execution plans, there are two ways to enable these settings:

  • in the application, in an Environment’s Settings (Settings -> Query Data)
  • on each individual host, using a configuration file

Please note: If you want to disable query text but still collect execution plans, you currently must configure this setting using a vc-{database}-metric.conf, not through the UI or the global.conf.

Enable only one of these settings on a host at a time.

Option Configuration File Flag Description
Disable all sampling "disable-sampling":"true" Do not collect text, metadata, execution plans, or query comments.
Capture comments & metadata "query-comment-pattern":"*" Captures query comments, metadata, and execution plans, but not raw query text.
Capture only metadata "disable-sampling-text":"true" Captures metadata and execution plans, but not raw text or comments.
Whitelist a set of queries to sample "query-whitelist-pattern":"REGEXP" Will collect sample information if the query matches the provided regular expression.
Blacklist a set of queries to not sample "query-blacklist-pattern":"REGEXP" Will collect samples that do not match the provided regular expression.

If you would like to disable the capture of execution plans, that can be done separately (and must be done in a configuration file):

Option Configuration File Flag Description
Do not collect execution plans "enable-explains":"false" Do not collect execution plans. This can be used with one of the above options.

Enabling

All of the above options (except disabling the capture of execution plans) can be set through the Settings UI. Alternatively, you can use a DPM configuration file on each host. Any option set in a configuration file takes precedence over the Settings UI.

Create a configuration file for your database type (if it does not exist already):

Database Filename
MySQL /etc/vividcortex/vc-mysql-metrics.conf
PostgreSQL /etc/vividcortex/vc-pgsql-metrics.conf
MongoDB /etc/vividcortex/vc-mongo-metrics.conf
Redis /etc/vividcortex/vc-redis-metrics.conf

Do not use the global.conf file or any vc-*-query.conf file to enable these options.

To the file, add the appropriate setting. The files are JSON formatted. As an example, if you wished to disable the capture of query text for PostgreSQL, you would edit the file /etc/vividcortex/vc-pgsql-metrics.conf to include a line with "disable-sampling-text":"true":

{
"disable-sampling-text":"true"
}

If you wished to capture MySQL query comments and execution metadata, but not raw text or explain plans:

{
"query-comment-pattern":"*",
"enable-explains":"false"
}

Deleting Query Samples

You can delete all query samples which have been captured in an environment. See our Environment Preferences documentation for more information.