In the Cassandra community over the last year, we have seen rapid adoption of Apache Spark. In many ways, we have Evan Chan (@evanfchan), previously of Ooyala, to thank for this.
So, why is the Cassandra community adopting Spark for analytics? Well, as Brian O’Neill (@boneill42) of Health Market Science puts it, “Sure, you could go grab Hadoop, and be locked into articulating analytics/transformations as MapReduce constructs. But that just makes people sad. Instead, I’d recommend Spark. It makes people happy”.
The Cassandra and Spark communities are going to be even happier with today’s news: DataStax and Databricks, the company driving Apache Spark, have announced a partnership to make it easier to integrate Cassandra and Spark. The resulting code will be contributed back to the open source community.
Chanan Braunstein of Pearson Education sums up the benefits of such a partnership nicely: “The new Spark/Shark functionality on Cassandra is giving our users a scalable and high-performance way to quickly analyze our constantly growing data set. By moving from a relational database, this new functionality will allow us to deliver real-time data analytics where before our users relied on time delayed reports”.
If you’re interested in learning more about Cassandra and Spark together, be sure to attend Spark Summit 2014 from June 30th to July 2nd and Cassandra Summit 2014 on September 10th and 11th, both hosted in San Francisco, CA.