December 3rd, 2013

By Cliff Ford: CTO at CodeMettle

TL;DR: CodeMettle provides a software framework for customers in the satellite broadcast industry who need network management, monitoring, and control capabilities. Customers use the framework to monitor and control all of their back-end broadcast equipment.

When CodeMettle evaluated CouchDB, MongoDB, and Cassandra, scalability was its biggest motivator, along with clustering capability. Making the database elastic and expanding it over time was always difficult with a relational database, and customers wanted zero downtime because they’ve become very dependent on these systems. So the company decided to pursue a NoSQL database that could quickly insert, read, and write terabytes of data without impacting performance, and then expand as needed.

Cassandra’s Apache backing, along with a robust community, made the choice a “no-brainer”. CodeMettle stores more than 1 terabyte of data in Cassandra. One of CodeMettle’s largest systems has 200+ million records in its log database; that volume would have toppled any RDBMS the company might have implemented, but with DataStax Enterprise it has performed just fine.

Cliff, how does CodeMettle serve its customers?

CodeMettle provides a software framework for customers in the satellite broadcast industry that need network management, monitoring and control capabilities. Customers utilize our software framework to monitor and control all of their back-end broadcast equipment. There are different phases of their transmission, from video collection, to back-end transmission broadcasting across their network, to up-linking to the satellite. We provide end-to-end solutions for them to monitor and control everything that’s business critical for them.

 

A major customer of ours, one of the largest cable providers in the U.S., has our software sitting in their Customer Service Center. So when a customer calls one of their call centers, the support agent on the other end uses our software to bring up a live video stream and control a box that is exactly the same as the one being used by the customer. It gives the agent a virtual remote control so they can operate their local box and emulate what the customer is experiencing. We’ve put our software across their regional call centers, and we collect and aggregate data on every action the service agent performs.

Every button an agent pushes during the troubleshooting process is recorded so it can be analyzed later, and our client can determine whether there are issues that consistently crop up. They take all of that data, analyze it, report on it, and figure out what they need to do to fix common problems.

Tell me a little bit about your technical infrastructure and the software you use to make all that happen. Is it on-premise or in the cloud?

Our customers really decide whether to operate it on-premise or in the cloud. Most customers in this industry like to have control over everything, at least at this point, because that’s how they’ve operated traditionally.

 

Most of our customer installations are on-premise, and slowly but surely, we’re migrating more and more to cloud services. From a technical infrastructure standpoint, we’ve written a framework that allows information collection from all of these physical devices or other applications.

 

From our Enterprise Service Bus, we perform real-time analytics of things that are occurring in the networks so that people can be informed of potential problems that may be occurring. From there, the bus takes it and we push it into the database.
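As a rough illustration of that flow (not CodeMettle’s actual code), the sketch below checks each incoming reading against a threshold and then hands it off for storage. The `process` function, the reading shape, the threshold rule, and the `persist` callback are all assumptions made for this example; a real deployment would write to the database rather than to a Python list.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical reading shape: (item name, numeric value).
Reading = Tuple[str, float]


def process(
    readings: Iterable[Reading],
    thresholds: Dict[str, float],
    persist: Callable[[Reading], None],
) -> List[str]:
    """Check each reading against its threshold, then pass it on for storage."""
    alerts: List[str] = []
    for item, value in readings:
        limit = thresholds.get(item)
        # Flag the reading before it is stored, so operators hear about it promptly.
        if limit is not None and value > limit:
            alerts.append(f"{item} exceeded {limit}: {value}")
        # Every reading is stored, whether or not it raised an alert.
        persist((item, value))
    return alerts


# Example: an in-memory list stands in for the database.
stored: List[Reading] = []
alerts = process(
    [("modem-7/temp_c", 71.5), ("modem-7/temp_c", 64.0)],
    {"modem-7/temp_c": 70.0},
    stored.append,
)
```

In this sketch the analysis happens before the write, so an alert does not have to wait on storage; whether the real bus orders things this way is not stated in the interview.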

 

Traditional network management systems might use a relational database, but we’ve found that in those situations it’s harder to store information for a long period of time and the information becomes less useful. We were traditionally archiving and pulling data out of the database every 30 to 60 days, depending on how much equipment we were monitoring. Being able to conduct historical analysis was often difficult with a relational database, so for our new product we decided to look at NoSQL databases.

Can you elaborate a little more on why you decided to pursue a NoSQL database?

Scalability was our biggest motivator – that and the clustering capability. Being able to make the database elastic and expand more and more was always difficult with a relational database, and customers wanted zero downtime because they’ve become very dependent on these systems. So we decided to pursue a NoSQL database that could quickly insert, read and write terabytes of data without impacting performance, and then expand as needed.

 

We hear that a lot from customers – that scalability drove their interest in NoSQL. And you’re also talking about the need for continuous availability, keeping your performance high, regardless of the data volumes.

Right, it solved a lot of problems that we had been experiencing.

 

Do you primarily use Solr as your interface? Does data come in through the Solr APIs, or are you inserting data into Cassandra that’s then replicated to Solr?

We have three different Solr schemas, and we segment our database in three different ways:

  1. First, our “config database” is really sort of a static configuration with core information.

  2. Second, our “current database” is the current values of any information we’re collecting. So it’s a single snapshot in an instant that shows the latest value of any particular item.

  3. Finally, our “log database” stores our historical collection of information, going back forever, really, because we don’t archive at this point.

When our services fire up or need to get information, they’ll hit the database directly and say, “What are the latest values? Give me all of my configuration information,” and then pull it from the appropriate database.
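The three-way split described above can be sketched with a small in-memory model. This is only an illustration under stated assumptions: the `SegmentedStore` class and its field and method names are hypothetical, and plain Python containers stand in for the Cassandra and Solr schemas, which the interview does not detail.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass
class SegmentedStore:
    """In-memory stand-in for the config, current, and log segments (hypothetical)."""

    # "config": mostly static core settings.
    config: Dict[str, Any] = field(default_factory=dict)
    # "current": one (timestamp, value) snapshot per monitored item.
    current: Dict[str, Tuple[float, Any]] = field(default_factory=dict)
    # "log": every reading ever recorded, never archived.
    log: List[Tuple[str, float, Any]] = field(default_factory=list)

    def record(self, item: str, timestamp: float, value: Any) -> None:
        # The history always grows.
        self.log.append((item, timestamp, value))
        # The snapshot is overwritten only by a reading that is not older.
        latest = self.current.get(item)
        if latest is None or timestamp >= latest[0]:
            self.current[item] = (timestamp, value)

    def latest(self, item: str) -> Any:
        return self.current[item][1]

    def history(self, item: str) -> List[Tuple[float, Any]]:
        return [(ts, v) for (name, ts, v) in self.log if name == item]


store = SegmentedStore(config={"poll_interval_s": 30})
store.record("encoder-1/bitrate", 1.0, 4800)
store.record("encoder-1/bitrate", 2.0, 5100)
```

Keeping the snapshot separate from the history means a service that only needs the latest value never has to scan the much larger log.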

 

Can you give me an idea about your configuration? For example, how many nodes do you operate, and how much data volume do you think exists in that cluster?

Volume-wise, we store more than a terabyte. Our configuration itself varies because our customers are doing on-premise stuff and don’t require hundreds of nodes, so we typically go with a three-node cluster in those cases.

 

One of our largest systems has 200+ million records in its log database. That definitely would have toppled any RDBMS we would have implemented, but with DataStax Enterprise it has performed just fine.

Are the applications you support external facing? Are your customers interacting with the data that’s stored in DataStax Enterprise, or is it more back-end stuff that you’re doing for analysis?

It’s both. A lot of our customers want complete control. They want the systems installed locally, so we install them, set them up, and they manage them, they maintain them, they do whatever they want with them. We’ll provide the special services to get them up and running, and they’ll pay us for maintenance and that type of stuff for ongoing support, but it’s basically however they want to implement the use.

 

How is your experience with DataStax Enterprise from a maintenance and administration standpoint?

It’s definitely “set it and forget it” for our customers. They know very little about the back-end because it is robust and big. If they run into a problem, it’s more likely a problem with some other portion of the software than it is with DataStax Enterprise.

 

When you first looked at NoSQL technology, which platforms did you evaluate?

We looked at Couch, Mongo and Cassandra. It was a significant switch for us because when the founders of the company sat down, we decided to go for open source. When it came right down to it, Cassandra had Apache support behind it along with a robust community, and that made it a no-brainer. Add the fact that DataStax is a solid company standing behind the product and offering Solr integration, and it was easy for us to select DataStax Enterprise.

In the end, how has DataStax Enterprise helped you achieve success?

DataStax Enterprise lets us significantly reduce our application development time and gives us the ability to do the clustering and elasticity of data storage.

 

Previously, we had to become experts at setting up clusters and configuring systems to add a new node. It was always a nightmare to add more nodes to a specific cluster with our relational database. With DSE it’s very simple and easy to use. That let us spend more time on our domain system technology in developing our application. We didn’t have to worry about, “How are we going to figure out how to cluster this thing?”

What advice would you give someone migrating from a relational system to NoSQL? We often hear people cite the changing mindset of their data model.

I think we had already reset our mindset anyway, so I don’t really recall it being a problem for us. But yes, going from a typical relational database to a flattened NoSQL, where you have to define your relationships, is definitely a changed mindset. But it wasn’t a big leap for us.

 

To learn more about DataStax Enterprise versus DataStax Community Edition, visit:

http://www.datastax.com/download/dse-vs-dsc
