August 14th, 2013

Mike Heffner: Software Architect at Librato

Matt Pfeil: Co-Founder at DataStax


Matt: Hi Planet Cassandra, this is Matt Pfeil, here today with Mike Heffner, software engineer at Librato. Mike, thanks for joining us.


Mike: Of course.


Matt: Just to kick this off, why don’t you tell everyone a little bit about what Librato does?


Mike: Librato is a time-series metrics platform that we run as a hosted service. Basically, we provide a really simple REST API that customers use to push any type of time-series data to us. We store that, provide you with dashboards, and do threshold alerting on that data as it comes in. Because it’s developed as a generic API, we support systems monitoring, application performance metrics, business metrics, all of that, and we present it all in one product.
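
As an illustration of threshold alerting on incoming measurements, here is a minimal sketch in Python. It is not Librato's implementation or API: the `Measurement` type, the `check_threshold` function, and the metric names are all hypothetical, and a real service would evaluate measurements as a stream rather than as a list.

```python
from dataclasses import dataclass


@dataclass
class Measurement:
    name: str     # metric name (hypothetical examples below)
    time: int     # Unix timestamp in seconds
    value: float


def check_threshold(measurements, name, limit):
    """Return the measurements for `name` whose value exceeds `limit`."""
    return [m for m in measurements if m.name == name and m.value > limit]


# Hypothetical incoming data: two response-time samples and one unrelated metric.
samples = [
    Measurement("api.response_ms", 1376438400, 120.0),
    Measurement("api.response_ms", 1376438460, 540.0),
    Measurement("db.queue_depth", 1376438460, 900.0),
]

# Only the response-time sample above 500 ms should trigger an alert.
alerts = check_threshold(samples, "api.response_ms", 500.0)
```

The check filters by metric name first, so a high value on an unrelated metric does not trigger the alert.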


Matt: Very cool. What’s one real-world use case for a customer using you guys?


Mike: One real-world use case is application performance metrics, typically for users running apps on Heroku. We have a Ruby on Rails gem that you can embed in your Rails project; it pushes time-series metrics to us, and you can watch your performance metrics through their routing layer, your response times, and tie that into what your customers are seeing from your site.


Another use case that we ourselves use heavily is our JMX Java monitoring, specifically for monitoring Cassandra rings. Using our JMX Taps Ruby gem, you can monitor all of the JMX metric attributes that Cassandra exports and use them to build a ring monitoring dashboard.


Matt: Very cool. What’s the use case for Cassandra inside the application?


Mike: We actually built the product from the ground up around Cassandra. All of our at-rest time-series data is pumped into Cassandra and we read directly from it, so that’s our primary data store for all of our time-series data. We also do some amount of object caching in Cassandra, but it’s primarily for the at-rest time-series data.


Matt: Time-series data and Cassandra go well together because of the data model.  Was that your primary motivation when you were evaluating technologies?


Mike: Yes, we analyzed it from that perspective. We knew we were going to have to scale the time-series data, so we wanted a scalable solution from the get-go. It was the best combination of performance, stability, scalability, and community. Those were our primary reasons for selecting Cassandra as a solution.
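
To make the fit between time-series data and Cassandra's data model concrete, here is a rough Python sketch of the common time-bucketed wide-row pattern: one partition per metric per time bucket, with samples ordered by timestamp inside it. This is a general illustration, not Librato's schema. The one-day bucket size, the function names, and the in-memory dict standing in for a Cassandra table are all assumptions.

```python
BUCKET_SECONDS = 86400  # assumption: one partition per metric per day


def partition_key(metric_id, timestamp):
    """Group samples for one metric into fixed-width time buckets."""
    bucket_start = timestamp - (timestamp % BUCKET_SECONDS)
    return (metric_id, bucket_start)


def store(table, metric_id, timestamp, value):
    """Write one sample; `table` is a dict standing in for a Cassandra table."""
    row = table.setdefault(partition_key(metric_id, timestamp), {})
    row[timestamp] = value


def read_range(table, metric_id, start, end):
    """Yield (timestamp, value) pairs with start <= timestamp < end, in time order."""
    bucket = start - (start % BUCKET_SECONDS)
    while bucket < end:
        row = table.get((metric_id, bucket), {})
        for ts in sorted(row):
            if start <= ts < end:
                yield ts, row[ts]
        bucket += BUCKET_SECONDS
```

A range read touches only the buckets that overlap the requested window, which is what keeps reads of recent data cheap as history grows.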


Matt: What’s your experience with the community been like?


Mike: It’s been superb.  Several times we have popped onto IRC or the mailing list and put questions up there and have gotten really thoughtful responses. IRC has also come in handy when we had some crisis moments and wanted to gut-check some of our assumptions.


Matt: I’m glad to hear that you’ve had a good experience. Can you share a little information about what your deployment looks like in terms of the infrastructure, number of machines, where it’s hosted, things like that?


Mike: Sure. We run everything on EC2. We’re split across three of their availability zones, all in the U.S. East Region. We have about three to four rings that we shard to based on users and on the resolution of rollups that we do for our historical data. We do, I think, several hundred thousand writes per second across those rings in aggregate.
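
The idea of sharding across rings by user and by rollup resolution can be sketched roughly as follows. This is a hypothetical illustration in Python, not a description of Librato's actual routing. The ring names, the two resolutions, the use of CRC32 for user hashing, and averaging as the rollup function are all assumptions.

```python
import zlib
from collections import defaultdict

# Hypothetical ring names, keyed by rollup resolution in seconds.
RINGS = {
    60: ["raw-a", "raw-b"],
    3600: ["hourly-a"],
}


def pick_ring(user_id, resolution):
    """Choose a ring from the resolution first, then from a hash of the user."""
    rings = RINGS[resolution]
    # crc32 is stable across processes, unlike Python's built-in hash() for str.
    return rings[zlib.crc32(user_id.encode("utf-8")) % len(rings)]


def rollup(points, resolution):
    """Average (timestamp, value) points into buckets of `resolution` seconds."""
    buckets = defaultdict(list)
    for ts, value in points:
        buckets[ts - ts % resolution].append(value)
    return [(start, sum(vs) / len(vs)) for start, vs in sorted(buckets.items())]
```

Routing on resolution first keeps coarse historical rollups on separate rings from high-volume raw writes, so each ring can be sized for its own workload.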


Matt: That’s very cool.  So you’re doing some serious traffic?


Mike: Definitely.


Matt: Very cool. Any advice for someone who’s just getting started with Cassandra?


Mike: Don’t be afraid to get in there and really understand how C* works. I think Aaron Morton talked about this at the conference: don’t be afraid of digging into how Cassandra stores data and how the read and write paths work. For us, optimizing our data model and access patterns over the years to match how the data moves around on disk and is stored has really helped us tune to how Cassandra behaves.


Secondly, I would say track as many of the metrics from Cassandra as possible. We are a metrics company, but for us, tracking the Cassandra metrics when we’re trying to theorize about a performance issue allows us to analyze the metrics to confirm or reject our hypotheses. This has been tremendously useful in the past.


With the community, don’t be afraid to ask questions and read the docs.


Matt: Great. Mike, I want to thank you for your time today.


Mike: Thanks, Matt.