October 8th, 2013


Gary Dusbabek: Principal Engineer at Rackspace

Matt Pfeil: Founder at DataStax


Hello, everyone.  This is Matt Pfeil.  Today for our Apache Cassandra use case interview I’m joined by Gary Dusbabek, Principal Engineer at Rackspace.  Gary, how are you doing today?

Doing great.  Glad to be here.


Thanks for taking some time.  You’ve been a longtime member of the Cassandra community.  Why don’t you quickly introduce yourself and tell our readers and listeners how you got started with the project and what you guys are working on today?

Sure.  I first got started with Cassandra when I came to Rackspace in late 2009 and started working on little things.  As I got more familiar with the project, and more involved, we latched on to some of the bigger features that were coming into Cassandra post 0.6.  We did that for about a year and a half, and then Rackspace’s involvement wasn’t as great as it had been up to that point.  I haven’t been focusing on Cassandra too much since then, but I’ve been using it quite a bit.


What’s the use case for how you are using it today?

We use it in two different ways.  In cloud monitoring, since we have a lot of Cassandra experience on our team, we’ve just used it as our basic data store, which sometimes isn’t a really great fit, but we have enough operational experience that we just made it work.


Where we really use Cassandra a lot, and where it shines, is in our data cluster.  As we run our checks on different entities’ servers, we generate a ton of data that all goes into Cassandra, and then we do different kinds of analysis on it.


Can you share some more insight into how much data you’re collecting?

Our cluster is coordinating about 35 million writes per hour right now. This represents about 180 million samples per hour.  Obviously, we’re batching them together.
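
For a rough sense of scale, the two figures quoted above can be turned into per-second rates and an average batch size.  The following is a back-of-the-envelope sketch in Python, assuming the quoted numbers are steady hourly averages; the function and variable names are illustrative and do not come from Rackspace’s code.

```python
# Figures quoted in the interview (approximate hourly averages).
WRITES_PER_HOUR = 35_000_000
SAMPLES_PER_HOUR = 180_000_000
SECONDS_PER_HOUR = 3600


def throughput(writes_per_hour, samples_per_hour):
    """Derive average batch size and per-second rates from hourly totals."""
    return {
        # Average number of samples carried by each batched write.
        "samples_per_write": samples_per_hour / writes_per_hour,
        # Average write and sample rates, assuming an even load over the hour.
        "writes_per_second": writes_per_hour / SECONDS_PER_HOUR,
        "samples_per_second": samples_per_hour / SECONDS_PER_HOUR,
    }


if __name__ == "__main__":
    for name, value in throughput(WRITES_PER_HOUR, SAMPLES_PER_HOUR).items():
        print(f"{name}: {value:,.1f}")
```

Under that assumption, each write carries a little over five samples on average, and the cluster handles roughly 9,700 writes and 50,000 samples per second.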


It’s a bunch of time series data then, correct?



Is it for basically monitoring every server in Rackspace’s deployment?

Not every server.  It’s every server that chooses to configure it.  It’s an end user option.  They can choose to have Rackspace monitor new servers, or they can install an agent that runs on their servers and reports back, which basically does the same or similar kinds of things.


That’s very cool.  Do you know anything about the size of the cluster or anything like that?

Sure, we run two.  Our largest cluster is a 32-node cluster, and it’s getting ready for an upgrade; it’s actually a pretty old data cluster.  It’s running Cassandra 1.0.  They’re upgrading it this week to Cassandra 1.1, and hopefully at some point in the future we’ll get on to 1.2.


That’s very cool.  What’s the motivation for moving to the newer version other than general performance and stability?

Mainly that.  We just feel a little bit of pain whenever we discover a bug that’s been fixed in a version of Cassandra that’s ahead of what we’re using.  We’ve been really motivated that way to get on to newer versions of Cassandra so that we’re not living in the Stone Age of Cassandra, I guess.


I like the Stone Age.  It’s, what, a whopping two or three years ago?

That’s right.


You’ve obviously been involved with it since the days when you and I actually worked at Rackspace together.  What’s the thing that you’re most proud of in Cassandra over the last few years?

As far as my involvement goes, in the early days one of the biggest pain points was online schema changes.  Online schema updates was one of the features that I put in, so I’m kind of proud of that.  Lately, I really do appreciate virtual nodes and also CQL.  CQL has made Cassandra a lot easier to use than just using the raw Thrift interface, which people have gotten used to.



CQL has had a lot of adoption from our angle as well, so I completely agree with both of those things.  I will say I remember the days of 0.6 to 0.7, when you had to shut down the whole cluster for the upgrade, and I’m very glad we’re past that several years later.  Very cool.  I appreciate your time today.