September 16th, 2013


“Super fast and highly available relational database.”

-Chris Smith, Engineer at Chill


 

TL;DR: Chill is in the business of mobilizing communities for entertainment properties; they help filmmakers build and activate their audiences, and distribute their work for them.

 

Cassandra was initially of interest simply because of its scaling capabilities. Chill uses Cassandra for storing all of the information about a project, all of the promotional material in the project as well as any of the metadata about the films. 

 

Chill is 100% in the cloud, running on AWS instances; they have a set of nodes with three-replica copies on 8GB stacks.

 

Hi Planet Cassandra viewers. This is Matt Pfeil. Today I’m here with Chris Smith from Chill. Chris, thanks for taking some time to chat with us today. To kick this thing off, why don’t you tell everyone what Chill does?

Sure. Thank you very much, Matt. Chill is in the business of mobilizing communities for entertainment properties.  We help filmmakers build and activate their audiences, and distribute their work for them.  That involves selling, promoting, the whole nine yards.

 

Awesome. So in other words, you guys are sort of a marketplace where video creators can put their work up for sale and distribution.

Yes, and we also help them promote their works before they sell them.

 

That’s great. How do you guys use Cassandra at Chill?

Cassandra was initially of interest simply because of its scaling capabilities. We’re starting to draw a very large audience and we wanted to make sure that our system could scale cleanly. We’re actually most of the way through the process of migrating to it as our primary data store for most of our data. We use it like a super fast and highly available relational database.

 

That’s great. What kind of information are you storing in there? Metadata about films? Are you storing actual movie content itself? Can you help share that with the audience?

Sure. We use it for storing all of the information about a project, all of the promotional material in the project, as well as any of the metadata about the films. We also store all of the transactional data about who’s bought what media, who’s interested in which project, things like that.

 

More importantly, we also use it to store all of the information about interactions that we have with our community. What things people clicked on. What they looked at. The sort of stuff that we can use to better evaluate what features are working and what aren’t; also, the things that are most engaging to our audience.

 

That’s great. It’s not only like a marketplace, but it’s also like you’ve got a miniature social network in there as well.

That’s correct.

 

Very cool use, guys. What can you tell us about your setup?

We’re 100% in the cloud and we run on AWS instances; we have a set of nodes with three-replica copies on 8GB stacks. We originally used PostgreSQL as our primary data store for most of our work. We have an infrastructure set up so that we can store all of our data in both PostgreSQL and Cassandra, or in just one or the other.

 

I would say that the vast majority of our entities are stored in both systems for now, but most of our new entities are going just to Cassandra, because that’s the direction we’re going in at this point.
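A dual-store setup like the one Chris describes can be sketched roughly as follows. This is a minimal illustration of the pattern, not Chill’s actual code: the store classes are in-memory stand-ins (in production they would wrap PostgreSQL and Cassandra clients), and all names and entity shapes are hypothetical.

```python
# Sketch of a dual-write layer: every save goes to both backends, so
# either one can serve as the system of record during a migration.

class InMemoryStore:
    """Stand-in for a real PostgreSQL or Cassandra backend."""
    def __init__(self, name):
        self.name = name
        self.rows = {}

    def save(self, key, entity):
        self.rows[key] = dict(entity)

    def load(self, key):
        return self.rows.get(key)


class DualWriteRepository:
    """Writes every entity to both stores; reads prefer the migration target."""
    def __init__(self, primary, secondary):
        self.primary = primary      # e.g. Cassandra (migration target)
        self.secondary = secondary  # e.g. PostgreSQL (legacy store)

    def save(self, key, entity):
        self.primary.save(key, entity)
        self.secondary.save(key, entity)

    def load(self, key):
        # Fall back to the legacy store for entities not yet migrated.
        return self.primary.load(key) or self.secondary.load(key)


cassandra = InMemoryStore("cassandra")
postgres = InMemoryStore("postgresql")
repo = DualWriteRepository(cassandra, postgres)

repo.save("project:42", {"title": "Indie Film", "status": "promoting"})
print(repo.load("project:42")["title"])  # prints "Indie Film"
```

Writing every entity to both systems is what makes a gradual cutover safe: new entity types can go “just to Cassandra” while older ones keep a PostgreSQL copy until the migration finishes.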

 

Most of our web stack is Python. The primary way we talk to Cassandra is through Python.  The one exception is that we use Storm for doing real-time data mining on our data. That stuff obviously uses Cassandra in real time and it tends to be written primarily in Java.

 

There’s a whole different structure for doing payment processing, but that doesn’t touch any of this.

 

That makes sense. Are you running out of multiple availability zones, or just one?

It’s funny you mention that. We were just talking about moving to multiple regions today. We run out of multiple availability zones to guarantee that we can survive an individual availability zone failure without having to take a step back and restart, or put the site in a 404, but we don’t currently run multi-region. We’re starting to talk about moving to a multi-region setup.
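Surviving the loss of a single availability zone with three replicas typically comes down to the keyspace’s replication strategy. A sketch of what such a keyspace definition might look like (the keyspace name and datacenter name here are hypothetical, assuming the cluster uses Cassandra’s Ec2Snitch, which maps each AWS region to a datacenter and each availability zone to a rack):

```sql
-- With the Ec2Snitch, NetworkTopologyStrategy tries to place the three
-- replicas on distinct racks (i.e. distinct availability zones), so one
-- zone failure still leaves two live copies of every row.
CREATE KEYSPACE chill_data
  WITH replication = {
    'class': 'NetworkTopologyStrategy',
    'us-east': 3
  };
```

A multi-region move would then mostly be a matter of adding a second datacenter entry to this replication map and running the cluster with a multi-region-aware snitch.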

 

That’s awesome. Chris, I want to thank you for your time today. For anyone who’s in the LA area, Chris will be speaking at a Meetup in the near future. Look for that on Planet Cassandra. Chris, thanks again.
