December 28th, 2013

“We found out that in all of our test scenarios Cassandra was performing consistently better.”

-Hannu Kröger, Software Architect at Onesto Services

What does Onesto do and what is your role there?

Onesto is a design and IT consulting company. We make information technology do pleasurable things for people. Our services come in handy at every stage in a product lifecycle. We can be there right from the beginning, the ideation stage, to the end of a product lifetime. Our main domain expertise is in the telco sector and retail.

I work as a software architect, and my work covers everything from architecture design and data model design to software development and system administration.

How are you using Apache Cassandra?

We are a Finnish partner of DataStax, and we started our first Cassandra project last spring. We currently use Cassandra in production with one customer, and we are looking to expand its use with our existing and new customers. We have invested heavily in Cassandra and are pretty excited about it.

Currently we are using DataStax Enterprise 3.1 in production and Cassandra 2.0 in some tests.

What was the motivation for using Cassandra and what other technologies was it evaluated against?

We had a customer case where we wanted to evaluate different new technologies for an enterprise middleware data storage solution. We evaluated Cassandra, MongoDB, HBase, MySQL Galera and Hypertable in the Amazon cloud. We found that in all of our test scenarios Cassandra performed consistently as well as or better than the others. It wasn’t the absolute best performer in every scenario, but in none of the cases did it have the performance or stability issues that some of the others did. The simplicity of the cluster topology also attracted us. It is pretty cool that you have one type of node, and if you need to scale out, you just add another node and that’s about it.

Can you share some insight on what your deployment looks like?

The current production deployment in our customer case is a 4-node cluster in one data center (DC). It holds about 1TB of data with a replication factor of 3 (RF=3), so about 2.5-3TB in total across the cluster for now.
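The sizing arithmetic behind those figures can be sketched roughly as follows. This is only an illustration: the function names are hypothetical, and it assumes data is spread evenly across nodes and ignores compression and compaction overhead, which is one reason a real total can land somewhat below the simple product.

```python
def replicated_size_tb(unique_data_tb: float, replication_factor: int) -> float:
    """Approximate total on-disk size: every row is stored RF times."""
    return unique_data_tb * replication_factor


def per_node_size_tb(unique_data_tb: float, replication_factor: int, nodes: int) -> float:
    """Approximate share per node, assuming an even spread across the cluster."""
    return replicated_size_tb(unique_data_tb, replication_factor) / nodes


# Figures from the interview: about 1TB of data, RF=3, 4 nodes.
total_tb = replicated_size_tb(1.0, 3)
node_tb = per_node_size_tb(1.0, 3, 4)
print(f"total: ~{total_tb} TB, per node: ~{node_tb} TB")
```

With the interview's numbers this gives an upper estimate of about 3TB in total, or well under 1TB per node.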

What advice do you have for those just getting started with Cassandra?

Get familiar with Cassandra anti-patterns and data modelling topics. Design the data model from your usage point of view: you should know how you are going to use the data before you start putting it in there. If possible, test your data model with a lot of data before you deploy. When you get those things in place, it will be fast.

What’s your experience with the Apache Cassandra community?

The community is active. I regularly follow discussions on the Cassandra user mailing list, and I am happy to see that there are a lot of experts who help out with the problems users are encountering. I also hang out in the #cassandra IRC channel on the freenode network. I met a lot of nice people at the Cassandra Summit and DataStax trainings, and it’s nice to see that I’m not the only one excited about this technology.

Anything else that you’d like to add?

I hope more and more people and companies get on board with Cassandra. If you need help with Cassandra or some other parts of your system, give us a call.

 
