October 20th, 2014

Cassandra Day Denver 2014, sponsored by DataStax, Planet Cassandra & Pearson eCollege, was a huge hit with over 250 attendees, 2 speaking tracks (beginner and advanced) rotating across 4 rooms on the Pearson campus, and 13 technical sessions. Attendees had a chance to learn how Cassandra is being used across a variety of companies and projects, such as Pearson’s eCollege platform or Project EPIC at the University of Colorado Boulder, and to follow a full getting-started track that took them from “zero to hero” with Apache Cassandra.

Below are slides and videos from Cassandra Day Denver 2014 that we’re excited to share with the Apache Cassandra community. We hope you can attend the next Cassandra Day in person! Ask us to come to your city by emailing info@PlanetCassandra.org with your city’s name.


Beginner Track

Getting Started with Apache Cassandra — From Zero to Hero

Speakers: Jon Haddad, Technical Evangelist for Apache Cassandra at DataStax & Luke Tillman, Language Evangelist for Apache Cassandra at DataStax

This is a crash course introduction to Cassandra. You’ll walk away understanding how to use this distributed database to achieve high availability across multiple data centers, scale out as your needs grow, and not be woken up at 3am just because a server failed. We’ll cover the basics of data modeling with CQL and look at how that data is stored on disk. We’ll wrap things up by setting up Cassandra locally, so bring your laptops!


Python & Cassandra Best Friends

Speaker: Jon Haddad, Technical Evangelist for Apache Cassandra at DataStax

We’ll be doing a deep dive into working with Cassandra using Python. We’ll cover a wide range of tools, from the native driver to cqlengine. We’ll also look at alternatives to the cqlsh REPL for quickly exploring our databases.

Building Java Applications with Apache Cassandra

Speaker: Tim Berglund, Global Director of Training at DataStax

So you’re a JVM developer, you understand Cassandra’s architecture, and you’re on your way to knowing its data model well enough to build descriptive data models that perform well. What you need now is to know the Java Driver.

What seems like an inconsequential library that proxies your application’s queries to your Cassandra cluster is actually a sophisticated piece of code that solves a lot of problems for you that early Cassandra developers had to code by hand. Come to this session to see features you might be missing and examples of how to use the Java driver in real applications.


Advanced Track

Lessons Learned Implementing Cassandra at Pearson eCollege

Speaker: Codey Whitt — Senior Software Engineer, Pearson eCollege

This talk discusses what to weigh when evaluating Cassandra, seen through a Pearson team’s recent Cassandra adoption after coming from a .NET/SQL world. Topics covered include data model design, operationalization of a cluster, and other best practices, along with what happens when they aren’t followed.


Cassandra Anti-Pattern Jeopardy

Speaker: Rachel Pedreschi, Lead Sales Engineer at DataStax

Don’t put your Cassandra application in “jeopardy”! Come learn from real examples from the field on how NOT to do Cassandra. Prizes might be involved!


Setting up a DataStax Enterprise Instance on Microsoft Azure

Speaker: Joey Filichia, BI Consultant at Filichiacom

There are many options for cloud providers, but according to the Gartner Magic Quadrant 2014 for IaaS Solutions, Amazon AWS and Microsoft Azure are both leaders and visionaries. DataStax provides instructions for provisioning an Amazon Machine Image. This discussion will provide guidance on setting up a single-node DataStax Enterprise cluster on Ubuntu Server 14.04 running in a Windows Azure Virtual Machine. Using the DataStax Enterprise production installation in text mode, we will install DSE end to end during the presentation.


Reading Cassandra SSTables Directly for Offline Data Analysis

Speaker: Ben Vanberg, Software Engineer at FullContact

Here at FullContact we have lots and lots of contact data. In particular we have more than a billion profiles over which we would like to perform ad hoc data analysis. Much of this data resides in Cassandra, and we have many analytics MapReduce jobs that require us to iterate across terabytes of Cassandra data. To solve this problem we’ve implemented our own splittable input format which allows us to quickly process large SSTables for downstream analytics.


Feelin’ the Flow: Analyzing Data with Spark and Cassandra

Speaker: Rich Beaudoin, Senior Software Engineer at Pearson eCollege

In the world of Big Data it’s crucial that your data is accessible. Cassandra provides us with a means to reliably store our data, but how can we keep it flowing? That’s where Spark steps up to provide a powerful one-two punch with Cassandra to get your data flowing in all the right directions.


A Cassandra Data Model for Serving up Cat Videos

Speaker: Luke Tillman, Language Evangelist for Apache Cassandra at DataStax

Keyboard Cat, Nyan Cat, and of course the world-famous Grumpy Cat: it seems like the Internet can’t get enough cat videos. If you were building an application to let users share and consume their fill of videos, how would you go about it? In this talk, we’ll take a look at the data model for KillrVideo, a sample video sharing application similar to YouTube where users can share videos, comment, rate them, and more. You’ll get a practical introduction to Cassandra data modeling, querying with CQL, how the application drives the data model, and how to shift your thinking from the relational world you probably have experience with.


Transitioning to Cassandra for an Already Giant Product

Speaker: Andrew Kuttig, Director of Software Engineering at SpotXchange, Inc.

Data modeling, cluster sizing, and planning can be difficult when transitioning an existing product to Cassandra, especially when the new Cassandra deployment needs to handle millions of operations per second on day one! In this talk I’ll discuss our strategy for data modeling, cluster sizing, and our novel approach to data replication across data centers.

Using Cassandra to Support Crisis Informatics Research

Speaker: Ken Anderson, Associate Professor, Department of Computer Science at The University of Colorado Boulder

Crisis Informatics is an area of research that investigates how members of the public make use of social media during times of crisis. The amount of social media data generated by a single event is significant: millions of tweets and status updates accompanied by gigabytes of photos and video. Investigating the types of digital behaviors that occur around these events requires a significant investment in designing, developing, and deploying large-scale software infrastructure for both data collection and analysis. Project EPIC at the University of Colorado has been using Cassandra since Spring 2012 as a solid foundation for its data collection and analysis activities. Project EPIC has collected terabytes of social media data associated with hundreds of disaster events that must be stored, processed, analyzed, and visualized. This talk will cover how Project EPIC makes use of Cassandra and discuss some of the architectural, modeling, and analysis challenges encountered while developing the Project EPIC software infrastructure.