May 2nd, 2013

By

Alexis Lê-Quôc: CTO at Datadog

Brady Gentile: Community Manager at DataStax

 

Brady: What does Datadog do?

 

Alexis: Datadog is an infrastructure monitoring service that aggregates performance metrics from more than 50 different applications in one place to let development and ops teams understand, in real time, how their systems are behaving.

 

Brady: How are you using Apache Cassandra (C*)?

 

Alexis: The vast bulk of the monitoring data we process is represented internally as time series. Monitoring data is by and large consumed in real time, either to generate alerts or to be displayed on interactive graphs; we use Cassandra to store this time series data.  Cassandra gives us a nice mix of low latency for durable writes, scalable storage and simple management.

 

Brady: What made you choose C*?

 

Alexis:  Before Datadog we were doing a lot with large SQL databases and we knew that for storing large amounts of binary data (such as time series), SQL databases were not optimal. When we started our research, we looked at a number of alternatives; C* had come out from Facebook just before we started doing our research and, compared to the alternatives, it was easier to configure and operate.

 

We had some issues with the earlier 0.7.x series a few years ago. Since then it’s been doing its job without requiring constant care, which is how we like our data stores.

 

Brady: What tips do you have for someone getting started with C*?

 

Alexis:  Think carefully about how the data will be accessed and spend some time understanding the storage model; large amounts of data are difficult to reshape.  Also, join the community. It’s vibrant and contributes in no small part to the success of C*.
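To illustrate why the access pattern should drive the storage model, here is a minimal Python sketch of a time-bucketed layout for time series. It is not Datadog's actual schema: the one-day bucket size, the `TimeSeriesStore` class, and the helper names are all hypothetical, and an in-memory dictionary stands in for a Cassandra table. The point is only that the partition key chosen at write time decides which reads are cheap later.

```python
from collections import defaultdict

# Hypothetical choice: one partition per metric per day (in seconds).
BUCKET_SECONDS = 86400


def partition_key(metric, ts):
    """Return (metric, bucket_start) for a Unix timestamp in seconds."""
    return (metric, ts - ts % BUCKET_SECONDS)


def keys_for_range(metric, start, end):
    """List the partition keys a read over [start, end] has to touch."""
    first = start - start % BUCKET_SECONDS
    return [(metric, bucket) for bucket in range(first, end + 1, BUCKET_SECONDS)]


class TimeSeriesStore:
    """In-memory stand-in for a wide-row table: partition -> {ts: value}."""

    def __init__(self):
        self.partitions = defaultdict(dict)

    def write(self, metric, ts, value):
        # Each point lands in the partition for its metric and day.
        self.partitions[partition_key(metric, ts)][ts] = value

    def read(self, metric, start, end):
        # A range read only visits the partitions that overlap the range.
        points = []
        for key in keys_for_range(metric, start, end):
            row = self.partitions.get(key, {})
            points.extend(
                (ts, value)
                for ts, value in sorted(row.items())
                if start <= ts <= end
            )
        return points
```

With this layout, reading one metric over a time window touches a handful of partitions, while a query such as "all metrics at a given instant" would have to scan every partition. Changing the key later means rewriting the stored data, which is why the layout is worth settling early.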

 

Brady: Are you running C* in the cloud or your own data center?

 

Alexis:  We are running C* in the cloud across multiple data centers.

 

Brady: Could you share some metrics with us (# of nodes, read/write speeds, etc.)?

 

Alexis:  We run several clusters in the tens of nodes backed by hard drives or SSDs depending on how much read speed we need. We store trillions of performance measurements in Cassandra.

 

Brady: What are your thoughts on the Apache Cassandra community?

Alexis: Given the number of options available for large-scale storage, the community is ultimately what makes or breaks a project. The Cassandra community is open and inclusive, which allows newcomers to get started without apprehension.

 

Brady: Anything else you’d like to add?

 

Alexis:  Keep up the good work!