VividCortex Performance Impact

Our agents are written by world-renowned performance experts and are designed to cause negligible performance impact on the systems they measure. All agents also have adjustable resource limits, and the supervisor agent will kill and restart any agent that exceeds its limits.

Regardless of how busy your servers are, the agents capture time-series metrics at 1-second granularity. Please feel free to evaluate them on high-traffic servers, for example under load-test conditions; doing so will not place any undue burden on our servers.

What language are the agents written in?

Our agents are written in Google’s Go language. It is compiled to native machine code, and is high-performance, memory-efficient, and CPU-efficient. It is comparable to Java in this regard, and in many cases comparable to C and C++, but requires no external dependencies, so no specific libraries or runtime are required on your servers.

How many resources do agents consume?

Our agents use minimal resources.

  • Disk impact is essentially zero. Apart from writing their own log files to /var/log/vividcortex, our agents don’t use the disk, and they don’t parse log files or perform other intrusive operations. They do not otherwise accumulate data locally, in memory or on disk, so they should not exhaust system resources.
  • Agents rotate and cap their log files. By default, they each keep 5 logs of 50 MB in size, but this is configurable if needed.
  • The agents use a trivial amount of CPU. The TCP sniffer agent can use more CPU on very busy servers. It typically uses 4% of a single CPU on our own (quite busy) servers, but on some servers it can use more than that. Note that if you test our agents on a server where you’re running a benchmark like Sysbench, you are typically creating a worst case – many very small queries – so you can treat the result as an upper bound on CPU consumption.
  • The agents typically use 20-40 MB of memory. Please do not be concerned about the virtual memory size (VSZ) of the agents. The resident set size (RSS) is the true memory usage. Please see the Go FAQ entry for more details on this topic.
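
To compare VSZ and RSS for a running process yourself, the sketch below reads both values from Linux's /proc/<pid>/status file (the VmSize and VmRSS fields). It is an illustration only, not part of the agents: the memFields helper is a hypothetical name, and the program assumes a Linux host with /proc mounted. Pass a PID as the first argument, or omit it to inspect the program itself.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// memFields extracts VmSize (virtual size, VSZ) and VmRSS (resident set
// size, RSS), both in kB, from text in the /proc/<pid>/status format.
// Hypothetical helper for illustration.
func memFields(status string) (vszKB, rssKB int64, err error) {
	found := 0
	for _, line := range strings.Split(status, "\n") {
		f := strings.Fields(line)
		if len(f) < 2 {
			continue
		}
		var dst *int64
		switch f[0] {
		case "VmSize:":
			dst = &vszKB
		case "VmRSS:":
			dst = &rssKB
		default:
			continue
		}
		if *dst, err = strconv.ParseInt(f[1], 10, 64); err != nil {
			return 0, 0, err
		}
		found++
	}
	if found != 2 {
		return 0, 0, fmt.Errorf("VmSize or VmRSS not found")
	}
	return vszKB, rssKB, nil
}

func main() {
	path := "/proc/self/status"
	if len(os.Args) > 1 {
		path = "/proc/" + os.Args[1] + "/status"
	}
	data, err := os.ReadFile(path)
	if err != nil {
		// Not fatal for a demo: /proc is absent on non-Linux systems.
		fmt.Fprintln(os.Stderr, "cannot read", path+":", err)
		return
	}
	vsz, rss, err := memFields(string(data))
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot parse", path+":", err)
		return
	}
	fmt.Printf("VSZ: %d kB, RSS: %d kB\n", vsz, rss)
}
```

For a Go program, the VSZ figure is usually far larger than the RSS figure, which is why RSS is the number to watch.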

Bottom line, the agents should essentially be “free.” Unless you run your servers at 100% CPU utilization, they will not deprive anything else of needed resources. There are also measures we can take to reduce resource utilization further, for example by capturing only a fraction of query traffic; please contact us for help with this.

Do the agents impact query performance?

No. See above about resource impact, which is essentially zero. In addition to that, the agents do not intercept or delay queries or data in any way. The agents are not configured as a man-in-the-middle for network traffic or system calls. They are passive observers. You can think of them as similar to a person standing on the side of the road counting the cars without interacting with them.

How much load is added to MySQL?

We don’t do expensive things inside your MySQL server. For example, we don’t poll SHOW FULL PROCESSLIST, which locks the server momentarily while it runs. (Tools that do this can severely impact your MySQL server.) We also don’t use intrusive commands such as SHOW TABLE STATUS or SHOW MUTEX STATUS. Some commands have safeguards; for example, we won’t run SHOW OPEN TABLES if your server’s table cache is large. VividCortex was designed and built by experts in MySQL performance, so we won’t cause problems on your critical servers.

How much load is added to PostgreSQL?

As with MySQL, we merely retrieve data from commonly available status views such as pg_stat_activity. When running in an off-host configuration, where we cannot sniff TCP traffic, we measure query activity from pg_stat_statements, provided the extension is available.

How much load is added to Redis?

Virtually none. We only call INFO and CONFIG GET * once a second.
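
The once-a-second polling pattern can be sketched as follows. This is a minimal illustration in Go, not the agent's actual code: poll, parseInfo, and the fetch callback are hypothetical names, and the sample reply is invented to match the "key:value" line format that the Redis INFO command returns. A real poller would send INFO over a Redis connection and use a one-second interval; the demo uses a canned reply and a short interval so that it finishes quickly.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// parseInfo turns an INFO-style reply into a map. Section headers
// ("# Memory") and blank lines are skipped; every other line is
// expected to be "key:value".
func parseInfo(reply string) map[string]string {
	out := make(map[string]string)
	for _, line := range strings.Split(reply, "\n") {
		line = strings.TrimSpace(line) // also strips the trailing "\r"
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		kv := strings.SplitN(line, ":", 2)
		if len(kv) != 2 {
			continue
		}
		out[kv[0]] = kv[1]
	}
	return out
}

// poll calls fetch once per interval, n times, and hands each parsed
// reply to handle. Failed fetches are skipped rather than retried, so
// the polling rate never exceeds one call per interval.
func poll(fetch func() (string, error), interval time.Duration, n int, handle func(map[string]string)) {
	t := time.NewTicker(interval)
	defer t.Stop()
	for i := 0; i < n; i++ {
		<-t.C
		reply, err := fetch()
		if err != nil {
			continue
		}
		handle(parseInfo(reply))
	}
}

func main() {
	// Stand-in for sending INFO to a real server (assumption for the demo).
	fetch := func() (string, error) {
		return "# Memory\r\nused_memory:1024\r\n\r\n# Clients\r\nconnected_clients:7\r\n", nil
	}
	poll(fetch, 10*time.Millisecond, 3, func(m map[string]string) {
		fmt.Println("used_memory =", m["used_memory"], "connected_clients =", m["connected_clients"])
	})
}
```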

How much load is added to MongoDB?

We don’t run expensive things in your MongoDB server. The agent executes a serverStatus() call once a second, plus an occasional buildInfo() call when a server restart is detected. Although we fetch explain plans for other databases, we don’t even attempt to do so for MongoDB queries, because MongoDB may end up running the query in order to return its plan (see here).

Does fault detection add load?

No. Our fault detection algorithm is super-efficient and uses metrics that are already captured by the agents in the normal course of their operation. Each fault detection operation is just a few CPU instructions and operates in a few bytes of memory, which are reused. We could run fault detection tens of thousands of times per second and you’d never notice an increase in system load.
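
To give a feel for how a check over already-collected metrics can run in constant time and constant memory, here is a hypothetical sketch. It is not VividCortex's fault detection algorithm, which this page does not describe; it is a generic exponentially weighted mean and variance tracker, and the detector type, the alpha smoothing factor, and the k threshold are all assumptions made for illustration.

```go
package main

import "fmt"

// detector keeps a running, exponentially weighted mean and variance.
// Its state is three float64 values and a flag, reused for every sample.
// Hypothetical illustration; not the agents' actual algorithm.
type detector struct {
	alpha    float64 // smoothing factor in (0, 1]
	mean     float64
	variance float64
	primed   bool
}

// observe reports whether x deviates from the running mean by more than
// k standard deviations, then folds x into the running statistics.
// Each call is a handful of arithmetic operations and allocates nothing.
func (d *detector) observe(x, k float64) bool {
	if !d.primed {
		d.mean, d.primed = x, true
		return false
	}
	diff := x - d.mean
	anomalous := d.variance > 0 && diff*diff > k*k*d.variance
	d.mean += d.alpha * diff
	d.variance = (1 - d.alpha) * (d.variance + d.alpha*diff*diff)
	return anomalous
}

func main() {
	d := &detector{alpha: 0.2}
	samples := []float64{10, 11, 10, 11, 10, 11, 100}
	for i, x := range samples {
		if d.observe(x, 3) {
			fmt.Printf("sample %d (%.0f) looks anomalous\n", i, x)
		}
	}
}
```

With these sample values, only the final spike is flagged; the alternating 10 and 11 readings stay within three standard deviations of the running mean.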

Can agents fall behind?

It’s possible for the TCP sniffer agent to fall behind in decoding network traffic. If this happens, it harmlessly drops network packets and degrades gracefully by capturing less data. We have spent a great deal of time and effort to ensure that our agents sniff TCP traffic as cheaply as possible and handle partial or corrupted data as well as possible.
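
The "drop rather than fall further behind" behavior can be illustrated with a bounded queue and a non-blocking send. The sketch below is an assumption about how such a policy might look in Go, not the sniffer's real implementation; the capture type and its offer method are hypothetical names.

```go
package main

import "fmt"

// capture is a bounded queue between a packet source and a decoder.
// Hypothetical illustration of a drop-when-full policy.
type capture struct {
	queue   chan []byte
	dropped uint64 // updated only by the single producer goroutine
}

func newCapture(size int) *capture {
	return &capture{queue: make(chan []byte, size)}
}

// offer enqueues pkt if there is room and never blocks. When the decoder
// has fallen behind and the queue is full, the packet is dropped and
// counted, so the producer keeps up with the network at the cost of
// capturing less data.
func (c *capture) offer(pkt []byte) bool {
	select {
	case c.queue <- pkt:
		return true
	default:
		c.dropped++
		return false
	}
}

func main() {
	c := newCapture(3)
	accepted := 0
	// No consumer is running, which simulates a decoder that has stalled.
	for i := 0; i < 5; i++ {
		if c.offer([]byte{byte(i)}) {
			accepted++
		}
	}
	fmt.Printf("accepted %d, dropped %d\n", accepted, c.dropped)
	// prints "accepted 3, dropped 2"
}
```

The design choice here is that the producer side is never slowed down by the consumer: memory use is capped by the queue size, and overload shows up only as a rising drop counter.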