Jake Luciani

To learn more about the DataStax open-source project <a href="https://github.com/datastax/metric-collector-for-apache-cassandra">Metric Collector for Apache Cassandra</a> and to try a demo, visit us on <a href="https://github.com/datastax/metric-collector-for-apache-cassandra">GitHub</a>.
Apache Cassandra is a resilient system for users to build applications on, but many operators see Cassandra as a bit of a black box. It&rsquo;s not that Cassandra lacks metrics: it has <a href="https://cassandra.apache.org/doc/latest/operating/metrics.html">hundreds of metrics to consume</a> (over 300 metric series per table!). The problem is that getting a unified, visual view of the cluster, combined with OS-level and application metrics, is not easy for Cassandra users to set up.
<h3>What is the Metric Collector for Apache Cassandra?</h3>
To help solve this problem, DataStax released a new open source project called the <a href="https://github.com/datastax/metric-collector-for-apache-cassandra">Metric Collector for Apache Cassandra</a> (MCAC for short). This project provides a drop-in solution that closes this monitoring gap for Apache Cassandra. Here&rsquo;s how it works.
MCAC is built on the widely used <a href="https://collectd.org/">collectd</a> agent, but with a novel twist. Collectd is a well-adopted metric collection agent that integrates with all kinds of external metrics systems, such as <a href="https://collectd.org/wiki/index.php/Plugin:Write_Prometheus">Prometheus</a>, <a href="https://collectd.org/wiki/index.php/Plugin:Write_Graphite">Graphite</a>, <a href="https://collectd.org/wiki/index.php/Plugin:Write_Stackdriver">Stackdriver</a>, and <a href="https://collectd.org/wiki/index.php/Plugin:Write_HTTP">others</a>. While collectd can scrape JMX metrics out of the box, <a href="https://github.com/prometheus/jmx_exporter/issues/246#issuecomment-367573931">JMX scraping can be quite slow</a> and works best with only a subset of metrics. On top of that, many people don&rsquo;t want to maintain and configure a metric agent on every node.
MCAC powers the health tab in <a href="https://www.datastax.com/products/datastax-astra">Astra</a> and is bundled with our <a href="https://github.com/datastax/cass-operator">Kubernetes operator for Apache Cassandra</a>.
<h3>Why MCAC is different</h3>
To solve this problem, MCAC comes as a single bundle that combines our Java agent with a portable Linux build of collectd. Just add the agent to cassandra-env.sh; it brings up collectd and ships every Cassandra metric to collectd via a Unix socket. It works on all Apache Cassandra versions from 2.2 through 4.0.
Because metrics are shipped this way, MCAC can export hundreds of thousands of series per node with little to no impact on C* performance.
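To make the idea of pushing metrics over a local Unix socket more concrete, here is a minimal Python sketch. It is not MCAC's code or wire format: the one-line-per-sample text encoding and the names <code>format_metric</code> and <code>ship</code> are assumptions made up for illustration. It only shows the general pattern of an in-process agent writing samples to a local socket that a collector reads, rather than the collector polling over JMX.

```python
import socket


def format_metric(name: str, value: float, timestamp: int) -> bytes:
    """Encode one sample as a newline-terminated text line.

    Hypothetical format for illustration only, not MCAC's protocol.
    """
    return f"{name} {value} {timestamp}\n".encode("utf-8")


def ship(samples):
    """Push samples through a Unix domain socket pair and return what the
    receiving side parsed.

    Everything is sent before anything is read, which is fine for a handful
    of samples; a real agent and collector would run concurrently.
    """
    agent_side, collector_side = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        # Leaving the `with` block closes the agent side, so the reader sees EOF.
        with agent_side:
            for name, value, timestamp in samples:
                agent_side.sendall(format_metric(name, value, timestamp))

        received = []
        with collector_side.makefile("r", encoding="utf-8") as reader:
            for line in reader:
                name, value, timestamp = line.split()
                received.append((name, float(value), int(timestamp)))
        return received
    finally:
        collector_side.close()
```

Calling <code>ship([("table.read_latency", 1.5, 1700000000)])</code> returns the same sample as parsed by the receiving side. The metric name here is a placeholder, not a real Cassandra metric name.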
MCAC does more than send the metrics: it is designed to work well with Prometheus out of the box. For example, <a href="https://www.robustperception.io/how-does-a-prometheus-histogram-work">histograms are tailored for aggregation</a> by Prometheus, and labels are automatically converted on ingest. This means you can slice and dice metrics across DCs and racks, right down to individual tables.
The Cassandra metrics are only one part of the equation. With collectd we can also gather and expose all the OS-level metrics, like context switches and disk/network performance.
MCAC also creates a historical log on each node of metric and non-metric diagnostic events related to activity on that node. Non-metric events include details on flushes, compactions, exceptions, GC, and so on. This DataLog can be used to help analyze performance or other issues impacting the cluster. If you need help diagnosing problems with this log, our SRE team is available at <a href="https://www.datastax.com/keepcalm">https://www.datastax.com/keepcalm</a>, and if you have any questions we're here to help at <a href="https://community.datastax.com/">https://community.datastax.com/</a>.
Finally, what good are all these metrics without a way to visualize them? To tie it all together, MCAC comes with pre-built Grafana dashboards, which give operators the best Cassandra monitoring solution out there. These dashboards will change over time to focus on specific aspects of the system and make it easier to drill into the cluster.
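Why does the histogram format matter for aggregation? Prometheus-style histograms are cumulative bucket counters, so the buckets from several nodes can be summed before a quantile is estimated. The Python sketch below mimics that idea with linear interpolation inside a bucket, in the spirit of Prometheus's <code>histogram_quantile</code>. The label names (<code>dc</code>, <code>rack</code>), the bucket bounds, and the sample counts are all invented for illustration and are not taken from MCAC.

```python
import math
from collections import defaultdict

# Hypothetical per-node latency histograms: (labels, {upper_bound_ms: cumulative_count}).
SERIES = [
    ({"dc": "dc1", "rack": "r1"}, {1.0: 10, 5.0: 50, 10.0: 90, math.inf: 100}),
    ({"dc": "dc1", "rack": "r2"}, {1.0: 30, 5.0: 70, 10.0: 100, math.inf: 100}),
    ({"dc": "dc2", "rack": "r1"}, {1.0: 0, 5.0: 0, 10.0: 50, math.inf: 100}),
]


def sum_buckets(series, match):
    """Sum cumulative bucket counts over every series whose labels include `match`."""
    totals = defaultdict(float)
    for labels, buckets in series:
        if all(labels.get(key) == value for key, value in match.items()):
            for upper_bound, count in buckets.items():
                totals[upper_bound] += count
    return dict(totals)


def quantile(q, buckets):
    """Estimate the q-quantile (0 < q <= 1) from cumulative buckets."""
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    bounds = sorted(buckets)
    total = buckets[bounds[-1]]  # the +Inf bucket holds the total count
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound in bounds:
        count = buckets[bound]
        if count >= rank:
            if math.isinf(bound):
                # Fall back to the highest finite bound when the rank lands in +Inf.
                return prev_bound
            # Linear interpolation within the bucket that contains the rank.
            return prev_bound + (bound - prev_bound) * (rank - prev_count) / (count - prev_count)
        prev_bound, prev_count = bound, count
    return prev_bound
```

With this sample data, <code>quantile(0.5, sum_buckets(SERIES, {"dc": "dc1"}))</code> evaluates to 4.0, a median computed across both dc1 nodes at once. Per-node averages or pre-computed percentiles could not be combined this way, which is the point of exporting buckets.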
<img src="https://www.datastax.com/sites/default/files/inline-images/MCAC1_0.png" alt="Grafana" data-entity-type="file" data-entity-uuid="3352a498-c70f-43a8-adb4-cf1086877b2b" />
<img src="https://www.datastax.com/sites/default/files/inline-images/MCAC3.png" alt="mcac" data-entity-type="file" data-entity-uuid="7a740a4f-874c-4520-b95d-a549adb48be6" />
<img src="https://www.datastax.com/sites/default/files/inline-images/MCAC2_0.png" alt="mcac2" data-entity-type="file" data-entity-uuid="f6e23d85-a306-4141-a3b1-35b2c92ba846" />

Monitoring Apache Cassandra™ Made Simple
