August 5th, 2013

By 

 

Hisham Mardam Bey: CTO at Mate1

Brady Gentile: Community Manager at DataStax

 

Brady: Welcome, Planet Cassandra users; joining us today we have Hisham Mardam Bey, CTO at Mate1. To start things off, Hisham, could you tell us a little bit about what Mate1 does?

 

Hisham: At Mate1 we develop an on-line dating website with some social networking features. We’ve been around for almost 9 years now.

 

Brady: Excellent, and how are you using Cassandra?

 

Hisham: A few years ago we wanted to give our users a news / activity feed that updated frequently and could be kept around. After evaluating other possibilities (HBase, MySQL and Redis are the ones that come to mind) we ended up using Cassandra. We store events that belong to a user’s feed in 3 wide rows in Cassandra and have a fourth row that represents the rolled-up state of the feed. We also make use of several counters for each user representing feed-specific numbers (category counts, read / unread counts, etc.).

 

We started with version 0.7 of Cassandra and later switched on compression, which was quite rewarding.
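
As a rough illustration of the layout Hisham describes (three wide rows of events, a fourth rolled-up row, and per-user counters), the following Python sketch models one user's feed in memory. It is not Mate1's schema and it does not use a Cassandra client: the row names, the event fields, and the choice to roll up "latest event per category" are all assumptions made for the example.

```python
from bisect import insort
from collections import Counter

# Hypothetical names for the three event rows; the interview does not name them.
EVENT_ROWS = ("row_a", "row_b", "row_c")


class UserFeed:
    """In-memory stand-in for one user's feed (not Cassandra client code)."""

    def __init__(self):
        # Each wide row keeps its columns sorted by (timestamp, event_id),
        # mimicking how a wide row orders columns by name.
        self.rows = {name: [] for name in EVENT_ROWS}
        # Fourth row: rolled-up state. Assumed here to be the latest
        # event per category; the real roll-up is not described.
        self.rollup = {}
        # Per-user counters (category counts, read / unread counts).
        self.counters = Counter()

    def add_event(self, row, timestamp, event_id, category):
        """Append an event to one wide row and update roll-up and counters."""
        insort(self.rows[row], (timestamp, event_id, category))
        latest = self.rollup.get(category)
        if latest is None or timestamp >= latest[0]:
            self.rollup[category] = (timestamp, event_id)
        self.counters["category:" + category] += 1
        self.counters["unread"] += 1

    def mark_read(self, count=1):
        """Move up to `count` events from the unread to the read counter."""
        n = min(count, self.counters["unread"])
        self.counters["unread"] -= n
        self.counters["read"] += n

    def latest(self, row, limit=10):
        """Return the newest `limit` events of a row, newest first."""
        return list(reversed(self.rows[row][-limit:]))


feed = UserFeed()
feed.add_event("row_a", 1375660800, "e1", "message")
feed.add_event("row_a", 1375660900, "e2", "profile_view")
feed.mark_read()
print(feed.latest("row_a", limit=1))
print(dict(feed.counters))
```

In a real cluster the sorted lists would be wide rows keyed by user, and the counters would be Cassandra counter columns rather than a local `Counter`.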

 

Brady: What was the motivation for using Cassandra and what other technologies was it evaluated against?

 

Hisham: When we started thinking about implementing the feeds we evaluated MySQL and Redis for a while, then thought about looking into HBase and Cassandra. There was no elegant way to model the data in MySQL without running into performance bottlenecks as the data grew, and we did not want to shard that data ourselves. HBase was interesting, as was Cassandra. At the time we were still learning Hadoop and were not ready to set up and maintain HBase. Cassandra’s setup was very straightforward and simpler than HBase’s, and it offered interesting possibilities both in terms of modelling the data and in terms of scalability. We built a very quick prototype, ran some load tests, and were very happy with the results and with how well Cassandra’s data model suited our needs.

 

Brady: Can you share some insight on what your deployment looks like?

 

Hisham: We are hosted primarily in our own DC here in Montreal. We have experimented with AWS but we don’t rely on it at the moment. Our servers are mainly provided by Dell or SuperMicro, and the Cassandra cluster is running on SuperMicro. We run on spinning disks using RAID0. We have 2 Cassandra clusters consisting of 4 nodes each (production and testing). We have 64GB of RAM per machine and around 4TB of data in production.

 

Brady: What’s your favorite part about Apache Cassandra?

 

Hisham: We absolutely love the data model and scalability that Cassandra offers us. That, coupled with ease of use and the ability to get it up and running quickly in development, means that we can experiment quickly and have code flow from development into production in a short amount of time. Tools like OpsCenter have also given us visibility that’s been very helpful across releases.

 

Brady: What would you like to see out of Apache Cassandra in future versions?

 

Hisham: Cassandra works great for us; we’re eager to experiment with triggers… not to mention all the new enhancements and features!

 

Brady: Thanks for joining us today, Hisham, and best of luck to both you and Mate1.

 

Hisham: Thank you.