March 3rd, 2014

By 

 

 

Whisk
 

“MongoDB, I’d say, only performed well while the data fit in RAM, but satisfying this requirement was quite expensive for us. Now, with Cassandra, we don’t feel limited by the amount of storage.”

- Viktor Taranenko, Senior Engineer at Whisk

 


 

 

Whisk
Whisk is about creating grocery lists from recipes, allowing people to buy ingredients in a very straightforward way. By applying various techniques, we try to “understand” recipes and the ingredients required to cook them. We work with companies like FoodNetwork and a number of recipe publishers and product brands. I’m a senior engineer here, looking after Whisk’s back-end services and infrastructure.

 

MongoDB to Cassandra

Recently, with a growing number of recipes, we were struggling to keep our internal processing fast. So we migrated and refined some parts that previously relied on MongoDB. With its great concurrency and fast writes, Cassandra helped us a lot. MongoDB, I’d say, only performed well while the data fit in RAM, but satisfying this requirement was quite expensive for us. Now, with Cassandra, we don’t feel limited by the amount of storage.

Moving things from MongoDB to Cassandra is still in progress. We’ve already fixed the most critical parts, and moving to Cassandra has improved our finances.

 

Horizontal scaling

The horizontal scalability of Cassandra is just great. Cassandra’s wide-column data model allows us to design the schema for very efficient queries and updates. It’s so easy to get a Cassandra cluster deployed, configured and running. Thanks to DataStax for providing good Cassandra documentation, which helped a lot.

 

Whisk’s deployment

Currently it’s a rather small cluster with six machines. Our first Cassandra use case was storing precomputed product data without any limits. No matter how much we store, response times are now predictable. One of our recent use cases is our ingredient graph. We decided to build it with Titan running on top of Cassandra and ElasticSearch. The main reason was the easy replication and horizontal scalability that come from Cassandra.

 

Whisk’s ingredient graph

Currently we expect our graph database to let us maintain core data easily and reuse it on other nodes. Wine recommendation is a very good example: we can attach wine data to base ingredients and share it between subtypes. For example, when “Orange” has a lot of metadata attached to it, like wine and nutritional information, we can easily reuse that metadata on related ingredients like “Orange Juice”. We found the graph database very good at representing the relations between our ingredients and at making that information quick to reach.
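
The reuse idea described above can be sketched in a few lines of plain Python. This is only an illustration, not Whisk’s actual schema or the Titan API: the `Ingredient` class, the `derived_from` edge name, the `lookup` method and the metadata values are all hypothetical placeholders, and the graph is assumed to contain no cycles.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Ingredient:
    """A node in a toy ingredient graph (hypothetical model)."""
    name: str
    # Metadata stored directly on this node.
    metadata: Dict[str, Any] = field(default_factory=dict)
    # Outgoing "derived_from" edges to base ingredients (assumed acyclic).
    derived_from: List["Ingredient"] = field(default_factory=list)

    def lookup(self, key: str) -> Optional[Any]:
        """Return the value for `key`, falling back to base ingredients."""
        if key in self.metadata:
            return self.metadata[key]
        # Not set on this node: walk the edges to the base ingredients.
        for base in self.derived_from:
            value = base.lookup(key)
            if value is not None:
                return value
        return None


# Placeholder metadata, stored once on the base ingredient.
orange = Ingredient(
    "Orange",
    metadata={"wine": "placeholder wine", "nutrition": "placeholder nutrition"},
)
# The subtype only stores what is specific to it.
orange_juice = Ingredient(
    "Orange Juice", metadata={"form": "liquid"}, derived_from=[orange]
)

print(orange_juice.lookup("wine"))  # value comes from "Orange"
print(orange_juice.lookup("form"))  # value comes from "Orange Juice" itself
```

The point of the design is that wine and nutrition data live in one place, so updating “Orange” is immediately visible from every ingredient that links back to it.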

The community

Thank you to everyone at DataStax for a great product. DataStax has put a lot of effort into making the documentation great and into organizing conferences and webinars. I recently went to the Cassandra Summit in Europe, which took place in London, and that was more or less my starting point for working with Cassandra. It was a great time to share experience, to see a lot of use cases in the presentations and, finally, to learn some useful details at the second-day workshop.
