Query Samples

We obfuscate all queries we observe by normalizing them into digests (the “query shape” for MongoDB), stripping out variable portions of the query including literal values that may contain Personally Identifiable Information (PII) and other sensitive data. This obfuscation takes place locally and not within the API, and we transmit the digested query to the application.

For the most part, we simply generate metrics about these digested queries. But to make this data much more useful, we probabilistically capture individual raw samples of queries, including literal values, so you can examine them. When a raw query sample is available for examination, we display a small circle on the query details sparkplot, one for each sample. Here’s a screenshot:

Top Queries

VividCortex also stores the checksum (first 64 bits of MD5) for each query, which is used throughout the application to assist with analysis (such as through metric names which include the query checksum).

This table illustrates how a query is digested and checksummed:

Item Example Usage
raw query SELECT * FROM db1.tbl1 WHERE SSN = '123-45-6789' Sampled on a periodic basis and displayed in the UI to facilitate inspection of queries.
digested query select * from db?.tbl? where ssn = ? Transmitted to the VividCortex API the first time the software see it. Used for aggregate metrics.
checksum 51c37c16602e541c Used as a key for metrics about queries that share the same digest (or query shape), e.g. host.queries.51c37c16602e541c.latency_us

Because we are capturing raw query samples and sending them to our application for your analysis, there is the potential for Personally Identifiable Information or other sensitive data to be transmitted to VividCortex. The options below allow you to limit or completely eliminate the transmission of query samples.

Configuring Capture of Query Samples

In order to prevent the transmission of PII or other sensitive data to VividCortex, you can make use of a number of specific settings to customize your configuration. This can be done either with configuration files using the flags defined below or through the Settings UI. Any options set in a configuration file take precedence over settings configured in-app; for example, you can disable sampling for an entire environment and re-enable sample collection on one host using a config file.

Disable all sampling

Sampling can be completely disabled.

Disabling sampling entirely prevents the agent from taking any query samples: no query text nor any information related to individual samples (such as connection ID, query latency, and EXPLAIN plans) will be captured.

Flag: "disable-sampling":"true"

During installation, this option is “No Samples” when asked “Configure capture of query samples.”

Disable query text

Samples can be captured, but the raw query text itself is discarded.

This option allows us to collect query metadata for individual queries, such as connection ID, query latency, and EXPLAIN plans, but does not transmit raw query samples (such as the “raw query” example in the table above). It is helpful to have samples even if the raw query is deleted because many of the sample details are stil useful. When using this option, we also take steps to remove PII which may be in the EXPLAIN plan. For MySQL, we do not capture the JSON representation of the EXPLAIN plan in order to avoid capturing the query text. For PostgreSQL, we filter out the Filter, Hash Cond, Index Cond, Merge Cond, and Recheck Cond fields.

Flag: "disable-sampling-text":"true"

During installation, this option is “Samples without Query Text” when asked “Configure capture of query samples.”

Disable explain plans

The execution of EXPLAIN plans can be fully disabled

If it is believed that the EXPLAIN plan might still contain PII, EXPLAIN functionality can be disabled entirely. This is most often used in conjunction with Disable query text(above), when it is valuable to capture query metadata but the EXPLAIN plan, like the raw query itself, may contain PII or other sensitive data.

Flag: "enable-explains":"false"

Safelist/Blocklist queries

Queries can be safelisted or blocklisted by query text or regular expression.

This option allow us to:

  • collect all queries except for those matching a blocklist or
  • discard all queries except those matching a safelist.


"query-blacklist-pattern":"REGEXP" This (REGEXP) is a regular expression describing the queries that must not be sampled. These queries will be treated as if disable-sampling was enabled. Do not also enable disable-sampling or disable-sampling-text in conjunction with this option.

"query-whitelist-pattern":"REGEXP" This (REGEXP) is a regular expression describing the queries that may be sampled. If you provide a whitelist, all queries NOT matching the whitelist will not be sampled. Do not also enable disable-sampling or disable-sampling-text in conjunction with this option.

(Note: The syntax available for use in a regular expression can be found here).

Enabling settings

You can enable the above settings on a per-host basis using a VividCortex configuration file containing the option(s) you wish to enable. Alternatively, you can enable and disable sample collection, as well as configure a safelist or blocklist, through the Settings UI. Options configured in a file take precedence over options configured through the Settings UI.

Add the option to the metrics plugin configuration file; they are named based on database technology. As an example, if you wished to disable the capture of query text for PostgreSQL, you would edit the file /etc/vividcortex/vc-pgsql-metrics.conf to include a line with "disable-sampling-text":"true":


You will need to create this file in /etc/vividcortex/ if it does not exist.

Please contact VividCortex Customer Support if you have any questions or concerns about query samples or the available options to configure this functionality.

Deleting Query Samples

You can delete all query samples which have been captured in an environment. See our Environment Preferences documentation for more information.