October 3rd, 2013

By 

 

Tim Moreton: Founder & CTO at Acunu

Matt Pfeil: Founder at DataStax

 

Hello everyone. This is Matt Pfeil, and I’m here with Tim Moreton from Acunu. Tim, thanks for joining us. Why don’t we start things off by telling the audience what Acunu does and what your role is there?

Hi Matt, good afternoon and great to be here. I am the CTO and founder at Acunu. Acunu provides real-time analytics: an engine for getting instant insights out of streaming data, and a visualization tool so you can make sense of that. It all sits on top of Cassandra. We’re big fans of Cassandra, we’ve been using it for a long time, and it’s a key part of that platform.

 

What’s a real-time use case?

That’s a good question. Our use cases are varied. We work with startups who are doing web and visitor analytics, so they’re collecting clickstreams. We work with investment banks who are collecting financial market data, and we’re helping them understand the pattern of activity in their trading environments, spotting anomalies and doing risk analysis there. And we work with organizations who are collecting telemetry data, such as call detail records for telcos or meter readings for smart grid infrastructures.

 

That’s very cool. What attracted you to Cassandra in the first place?

We’ve been using Cassandra for a long time now. I guess the key thing that really makes Cassandra a great fit for us and for real-time analytics is that it’s very high performance; it’s optimized in particular for writes. As you can imagine, when you’re doing analytics, writes dramatically outnumber reads. We’re sometimes working with customers who are collecting tens of billions of events a day, and to be able to handle that load, you really need a system that’s optimized for that sort of thing.

 

I think the biggest difference that Cassandra presents, in terms of features that no other system really offers, is great multi-data center support. It’s an architecture where you have a single cluster, and that cluster is just aware of potentially multiple racks and multiple data centers.  That’s something that I think is really unique to Cassandra.

 

From a data model perspective, are most of the objects that you’re tracking stored on a per-row basis, or can you share some insight into how that’s laid out?

You can think of Acunu Analytics really as a data-modeling tool for Cassandra. From the perspective of a business user in analytics, one of the things that I think you don’t want to get caught up in is the detailed design of building a data model. That’s really what Analytics as a tool does for you. You specify a very high-level schema describing what sort of dashboards you want. For example: I want a line graph that displays the rate of change of activity of some particular element in my system over time, and I want to be able to group that by potentially other dimensions as well; Analytics takes care of that.

 

We actually build data models programmatically inside Cassandra, and we do that pretty much by leveraging the fact that data stored in a particular row is co-located, while data stored in different rows may be stored far apart. I guess that’s the key insight that we worked from. We’re also big users of counters in Cassandra, counters obviously being a great building block for quantitative analytics.
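To make the idea concrete, here is a minimal Python sketch of a counter-based roll-up that relies on row co-location. It is not Acunu's implementation and does not talk to Cassandra: the in-memory `rows` dictionary, the `record_event` and `series` helpers, and the (metric, dimension, day) row-key layout are all hypothetical, chosen only to illustrate how one row can hold every counter needed for one line on a graph.

```python
from collections import defaultdict
from datetime import datetime, timezone

# Toy in-memory stand-in for a counter table: row key -> column name -> count.
# The point being illustrated: columns of one row are stored together, so
# everything needed for one graph line should live in a single row.
rows = defaultdict(lambda: defaultdict(int))


def record_event(metric, dimension, ts):
    """Increment the per-minute counter for one event (hypothetical layout)."""
    day = ts.strftime("%Y-%m-%d")
    minute = ts.strftime("%H:%M")
    row_key = (metric, dimension, day)  # one row per metric, dimension value, and day
    rows[row_key][minute] += 1          # one counter column per minute


def series(metric, dimension, day):
    """Return the time-ordered (minute, count) pairs stored in one row."""
    return sorted(rows[(metric, dimension, day)].items())


# Made-up sample events, for illustration only.
events = [
    ("page_view", "uk", datetime(2013, 10, 3, 9, 0, 5, tzinfo=timezone.utc)),
    ("page_view", "uk", datetime(2013, 10, 3, 9, 0, 40, tzinfo=timezone.utc)),
    ("page_view", "uk", datetime(2013, 10, 3, 9, 1, 10, tzinfo=timezone.utc)),
    ("page_view", "us", datetime(2013, 10, 3, 9, 0, 20, tzinfo=timezone.utc)),
]
for metric, dimension, ts in events:
    record_event(metric, dimension, ts)

print(series("page_view", "uk", "2013-10-03"))
# → [('09:00', 2), ('09:01', 1)]
```

In this sketch each write is a single counter increment and each graph line is read back from a single row, which is the kind of write-heavy, read-one-row pattern the answer above describes.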

 

That’s great. What’s your experience with the Cassandra community been like?

The Cassandra community is superb, very open and supportive. Jonathan Ellis and the other committers do a great job of ensuring there’s a single, cohesive Cassandra product. One of the things that can often happen in open-source databases and in other open-source projects is that you end up with a fractured, vendor-specific project organization, and that hasn’t happened in Cassandra.

 

I think what you’ve really got is a very thriving community, both of users and of developers, with many contributors from different vendors, different industries, and from home users, as well. I think that’s something of a real achievement for the Cassandra community. 

 

Also, we’ll be presenting at the DataStax Cassandra SF Users meetup group on Wednesday, October 23rd. Nicholas Favre-Felix and I will be presenting on understanding and (not) managing vnodes (virtual nodes) in Cassandra, as well as taking a look under the hood at Acunu Analytics.

 

Tim, thanks for your time today. Is there anything you’d like to add that Acunu is doing in the near future that’s exciting?

We’ll be announcing some interesting new products at Cassandra Summit EU and I’ll also be presenting, so definitely be on the lookout!

 
