August 22nd, 2013



Brady Gentile: Community Manager at DataStax

Glen Ford: Chief Architect at Zeebox


Hello, Planet Cassandra users. Today we have Glen Ford of Zeebox with us, and he’ll be sharing the Zeebox Cassandra use case. Thanks for joining us, Glen. To start things off, what does Zeebox do?

What we’re trying to do is to make TV even better than it is. We’re in the second screen space, and we’re trying to enhance people’s viewing by providing an immersive second screen application that they can share with their friends or simply enjoy themselves.


It allows you to browse what’s on television, to book what you want to watch, to share what you’re watching with your friends, to interact with your friends while you’re watching, as well as receive information about things that are on screen. It might be background information on the program that’s currently on or about an actor or all the things you might be interested in while you’re watching television.


That’s really cool. It sounds like a really engaging and social way of watching television. And how is Zeebox currently using Cassandra?

We’re using it in a few places. We’re using it to store our user data, which includes things like activities and feeds of data, things that people are doing, things that they’re interested in. We also use it to store broadcast data, so we can stream lots of broadcast feeds about what’s on television.  


We break this data down into consumable chunks for our clients and we use Cassandra as a very good way of storing that data. We’re also gathering analytics about what’s happening in the system and what’s happening with our users.


Interesting, and how much data do you have stored in Cassandra at Zeebox? 

It’s hard to say at the moment; it’s growing all the time. I think it’s about 30 GB of user data and similar activities. For broadcast data, we’re storing several GB a day.


I’ve heard that you switched from another database offering to Cassandra. Could you tell us a little bit about that?

We’re a startup, so we got things moving very quickly and leveraged Amazon SimpleDB to get up and running, but it wasn’t long before we started to hit issues with SimpleDB. Specifically, we hit performance issues; it wasn’t behaving the way we needed it to. This was just over 12 months ago, before Amazon came out with Dynamo, so we were looking at different technologies that we could use to store our data. The main feature we were looking for was the ability to scale in a linear fashion, and Cassandra really fit that need. We were able to transition from SimpleDB to Cassandra seamlessly. Our users never even noticed the transition.


Do you have any tips or tricks for someone who’s looking to transition from SimpleDB to Cassandra?

When you’re doing the transition, you really have to do a bit of extra work, and it’s worth doing that extra work to make sure that your users don’t experience any sort of outages. We copied data across in the background and actually kept both data stores in sync for several days, until we could gradually turn off our usage of SimpleDB. I think it’s really about putting a bit of extra thought into how you do that transition.
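
Editor’s note: the pattern Glen describes (copy data across in the background, keep both stores in sync, then switch off the old one) can be sketched roughly as below. This is only an illustration, not Zeebox’s actual code. Plain Python dictionaries stand in for SimpleDB and Cassandra, and the `DualWriteStore` class and its method names are hypothetical.

```python
class DualWriteStore:
    """Hypothetical wrapper that keeps an old and a new store in sync.

    Both stores are plain dicts here; a real migration would wrap the
    SimpleDB and Cassandra clients behind the same small interface.
    """

    def __init__(self, old_store, new_store):
        self.old = old_store
        self.new = new_store
        self.read_from_new = False  # flipped once the migration is verified

    def put(self, key, value):
        # Every live write goes to both stores so they cannot drift apart.
        self.old[key] = value
        self.new[key] = value

    def get(self, key):
        # Reads stay on the old store until cut-over.
        store = self.new if self.read_from_new else self.old
        return store.get(key)

    def backfill(self):
        """Copy pre-existing records across; returns how many were copied."""
        copied = 0
        for key, value in list(self.old.items()):
            # Skip keys already present: a dual write has put fresher data there.
            if key not in self.new:
                self.new[key] = value
                copied += 1
        return copied

    def in_sync(self):
        return self.old == self.new

    def cut_over(self):
        # Only switch reads once both stores hold the same data.
        if not self.in_sync():
            raise RuntimeError("stores differ; run backfill() before cutting over")
        self.read_from_new = True
```

In this sketch, users keep reading from the old store while `backfill()` runs, so a slow or interrupted copy never affects them. The old store is only retired after `cut_over()` succeeds.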


That makes sense. For a new user that’s starting out with Cassandra, would you have any tips or tricks for them getting started?

Sure, we’ve obviously made mistakes; we did things wrong and learned from those mistakes, so really understanding how Cassandra works is important. I think a lot of people who come into the NoSQL space expect to have all of the things that their traditional relational database gave them and they aren’t used to thinking at the level that Cassandra and other NoSQL data stores require you to think.


It’s worth learning some of the background and reading some of the books and, of course, there are a lot of great blog posts. Netflix has some great blog posts about how they’ve done stuff with Cassandra, trying out different client drivers, etc. Be prepared to spike and test this stuff out before you dive into building it.


Last question for you. Are you storing your data on your own servers or in the cloud?

We’re fully based in Amazon. We don’t use EBS (Elastic Block Store) at this stage for storing our Cassandra data; we store it all on disk.


Excellent. Well, thank you, Glen; that’s all the questions I have for you today. Thanks so much for joining us.

Thanks very much, Brady.