October 23rd, 2013

“Our product catalogs are stored in Cassandra, and its strength in indexing lends itself well to facilitating the online browsing, categorizing, searching, and sorting of products.”

-Ron Siemens, Lead Engineer, Catalog Systems at CharityUSA.com

For today’s Apache Cassandra Use Case we have Ron Siemens, Lead Engineer at CharityUSA.com. Ron, thanks for joining us today. What does CharityUSA.com do?

CharityUSA.com is the owner and operator of the GreaterGood Network: a family of online activism sites using simple online ways to protect the health and well-being of people, animals and the planet.  Since 1999, more than $29 million has been given to non-profit charities around the world.

How are you using Apache Cassandra?

The Apache Cassandra non-relational database is one of the primary architectural pieces of the GreaterGood Network retail websites (TheHungerSite.com, TheRainForestSite.com, TheAnimalRescueSite.com, and others).  Our product catalogs are stored in Cassandra, and its strength in indexing lends itself well to facilitating the online browsing, categorizing, searching, and sorting of products.

We found we were also able to leverage Cassandra for our recommendations engine.  It is a collaborative filtering design with tens of millions of entities in the Titan graph framework, which runs on top of Cassandra.

What was the motivation for using Cassandra and what other technologies was it evaluated against?

We were looking at replacing legacy proprietary solutions that run on top of our traditional relational database systems.  Maintenance, scalability, and contention were the primary areas we were aiming to improve.  Cassandra was only at version 0.7 at the time, but given its track record with other big internet players and the clear momentum building in its use, we decided to give it a try.  After prototyping some of our use cases, it was clear that Cassandra had a natural suitability for indexing, which was one of our primary requirements.

Can you share some insight on what your deployment looks like?  

We use virtualization technology on higher-end servers to provide commodity-styled VMs in our own data centers.  These are also used for our Cassandra nodes: we started with a 3-node cluster, and so far that suffices for our needs.  It is hosting on the order of 100 million entities and relationships.

What would you like to see out of Apache Cassandra in future versions?

Having adopted the technology somewhat early, there were a few things on our wish list that have since been standardized.  We have our own proprietary indexing and expression language built on top of Cassandra.  It’s nice to see these have become standard features, and we are evaluating incorporating these standards.

What’s your experience with the Apache Cassandra community?

The community is active, and it’s nice to see the lively developments, with numerous higher-level APIs available to extend Cassandra’s usability.  When looking to add graph technology into our infrastructure, rather than support another new framework, we found the community had already found a way to leverage Cassandra in this way.  We’re now successfully using the Titan graph framework on top of Cassandra.
