This post was featured on the DataStax blog in their series of quick-hit interviews with companies using Apache Cassandra and/or DataStax Enterprise (DSE) for key parts of their business.
We are a smart grid software company and we help utilities develop smart customer engagement applications. We use sensors to collect each customer’s energy usage and then present an analysis of that usage through customer-facing dashboards.
We’ve been in business for about three years now and have around 2,000 customers.
We run partly in the cloud and partly on premises. Right now we manage about 6TB of data between the two.
We collect a reading every 1-5 seconds on average, so that’s a lot of data points coming in from each customer. We started with MySQL three years ago, but that was much too slow. Then we turned to HBase, but that also proved too slow for both reads and writes. We then turned to Cassandra, which was much better and handled time series data very well.
We also use a variety of development languages and were happy to receive a full stable of certified drivers from DataStax, which is a big deal for us. We currently use the Node.js, C++, and Java drivers along with the Spark connector.
Information from the Cassandra database powers the web-based dashboards that our customers interact with. All information is shown in real time, along with historical information and alerts.
DataStax Enterprise makes our life easier in a number of ways. First, it helps automate performance monitoring, tuning, and backup tasks. We use OpsCenter to manage a lot of these things.
Next, we use Spark for our analytics and have a separate Spark cluster running. In addition, we had another, distinct Elasticsearch cluster running for search tasks. With DataStax Enterprise, we get Spark built into the database cluster as well as search capabilities via DataStax Enterprise Search, which removes the need for separate Spark and Elasticsearch clusters. We’re using DataStax Enterprise Search today and will be migrating our separate Spark cluster to DataStax Enterprise soon.
We’ve been able to stop worrying about things like performance and security. DataStax Enterprise has provided us with everything we need and has given us great peace of mind.
The key is understanding how data modeling works in Cassandra. This is the first thing to tackle. I would also recommend using a solution like DataStax Enterprise over open source Cassandra because then you can focus on things like data modeling and not have to worry about managing and maintaining your clusters.
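To make the data modeling point a little more concrete, here is a minimal sketch of one common way to lay out time series data in Cassandra: partition by sensor and day, and cluster by timestamp. This is not the schema described in the interview. The table name `readings`, the columns `sensor_id`, `day`, `ts`, and `watts`, and the helper functions are all hypothetical, chosen only for illustration. The arithmetic uses the 1-5 second collection interval mentioned earlier to estimate how many rows a single day's partition could hold.

```python
from datetime import datetime, timezone
from typing import Tuple

# Hypothetical CQL schema: every name here is an assumption made for illustration.
# The composite partition key (sensor_id, day) keeps each partition bounded to
# one sensor's readings for one day; ts orders the rows inside the partition.
SCHEMA = """
CREATE TABLE IF NOT EXISTS readings (
    sensor_id text,
    day       date,
    ts        timestamp,
    watts     double,
    PRIMARY KEY ((sensor_id, day), ts)
) WITH CLUSTERING ORDER BY (ts DESC);
"""

SECONDS_PER_DAY = 86_400


def partition_key(sensor_id: str, ts: datetime) -> Tuple[str, str]:
    """Return the (sensor_id, day) partition for a reading, using the UTC date.

    Pass a timezone-aware datetime; a naive one is interpreted as local time.
    """
    day = ts.astimezone(timezone.utc).date().isoformat()
    return (sensor_id, day)


def max_rows_per_partition(interval_seconds: int) -> int:
    """Upper bound on rows in one daily partition for a given sampling interval."""
    return SECONDS_PER_DAY // interval_seconds


if __name__ == "__main__":
    reading_time = datetime(2015, 6, 1, 23, 59, 58, tzinfo=timezone.utc)
    print(partition_key("meter-42", reading_time))
    # One reading per second versus one reading every five seconds.
    print(max_rows_per_partition(1), max_rows_per_partition(5))
```

Under these assumptions, a sensor reporting once per second produces at most 86,400 rows per daily partition, and one reporting every five seconds produces at most 17,280. Whether a day is the right bucket size depends on the row width and query patterns, which is the kind of question data modeling has to answer up front.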