This article is one in a series of quick-hit interviews with companies using Apache Cassandra and/or DataStax Enterprise for key parts of their business. For this interview, we spoke with Gary Ogasawara, who is VP of Engineering for Cloudian.
DataStax: Gary, we know you’re busy and we appreciate you taking the time to tell us what you guys are doing at Cloudian.
Gary: Sure. Cloudian is an S3 compatible cloud storage solution. Cloudian is ideal for enterprises and service providers looking to offer Amazon S3-compatible Storage as a Service (StaaS) and/or provide secondary storage systems for their cloud compute platforms, such as Citrix Cloud Platform, Apache CloudStack, or OpenStack.
We are excited about our new, free Cloudian Community Edition, which will support up to 100 TB of useable storage and offer forum support, opening up the scalability, reliability and power of cloud object storage to anyone building an Amazon S3-compatible cloud, be it public, private or hybrid.
DataStax: Have you guys always done this type of work?
Gary: We started in 2001 in mobile Internet optimization software, and then moved on to messaging and mail, which drove the need for storage and gave us a footprint with top mobile operators worldwide. The company evolved with the market and over the last three years we have been working on our Cloudian object storage platform, targeting a much broader market of enterprises and service providers. Cloudian was introduced over a year ago and we’ve already built strong partnerships and a good reference account base in all major markets.
DataStax: Tell us how you happened onto Cassandra.
Gary: Our messaging solution absolutely requires scale for big data. Because we store hundreds of terabytes and petabytes of data for customers, we needed a high-performance storage technology to power everything. Initially, we built our own NoSQL database to tackle things.
But then we started to notice the momentum behind a number of open source databases like Cassandra. We took the top NoSQL products and performed a competitive bake-off, and in the end we found the Cassandra product and its community the strongest of the pack. Making the decision to choose Cassandra was very easy, and it was extremely important for our Cloudian cloud storage solution, which is the core of our business today.
DataStax: What were the technical reasons that caused you to go with Cassandra?
Gary: If you’re offering a cloud storage platform, you want it to be scalable and elastic and have it scale reliably for customers when it needs to. But just as important, you want to provide absolute reliability and protection of data. That’s key – you can’t lose any customer data.
The key advantages that Cassandra offers to us are its absolute reliability, support for replication between multiple data centers, no single point of failure, and its performance in the scalability tests we’ve run.
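Editor's note: for readers unfamiliar with how replication between data centers is configured in Cassandra, the sketch below builds a CQL `CREATE KEYSPACE` statement that uses `NetworkTopologyStrategy`, which takes a replica count per data center. The keyspace name, the data center names, and the helper function are hypothetical and are not taken from Cloudian's deployment; the code only generates the statement text and does not connect to a cluster.

```python
def create_keyspace_cql(keyspace, replicas_per_dc):
    """Build a CQL CREATE KEYSPACE statement using NetworkTopologyStrategy.

    replicas_per_dc maps a data center name to its replica count,
    for example {"dc_east": 3, "dc_west": 3} (hypothetical names).
    """
    if not replicas_per_dc:
        raise ValueError("at least one data center is required")

    # The strategy class comes first, followed by one entry per data center.
    parts = ["'class': 'NetworkTopologyStrategy'"]
    for dc_name, replica_count in sorted(replicas_per_dc.items()):
        parts.append("'%s': %d" % (dc_name, replica_count))

    return "CREATE KEYSPACE IF NOT EXISTS %s WITH replication = {%s};" % (
        keyspace,
        ", ".join(parts),
    )


if __name__ == "__main__":
    # Three replicas in each of two hypothetical data centers.
    print(create_keyspace_cql("object_meta", {"dc_east": 3, "dc_west": 3}))
```

With several replicas in each data center, losing a single node or an entire site still leaves copies of the data elsewhere, which is the property Gary describes.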
DataStax: Anything else?
Gary: Something else important to us is having a database platform that is easy to deploy, manage, and operate. In addition, we wanted something that was cost efficient. Cassandra gives us all those things. Lastly, there was a smattering of other features such as time-to-live (TTL) columns, secondary indexes, and counters.
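Editor's note: the three features Gary mentions each map to a short piece of CQL. The sketch below collects one illustrative statement per feature. The table names, column names, and values are hypothetical and do not reflect Cloudian's actual schema; the script only prints the statements, and running them against a cluster would require a Cassandra driver, which is not shown.

```python
# Hypothetical schema for illustration only; not Cloudian's actual data model.
STATEMENTS = {
    # A table holding per-object metadata, keyed by bucket and object key.
    "metadata_table": (
        "CREATE TABLE IF NOT EXISTS object_metadata ("
        "bucket text, object_key text, owner text, size_bytes bigint, "
        "PRIMARY KEY (bucket, object_key));"
    ),
    # Secondary index: allows lookups by a non-key column (owner).
    "secondary_index": (
        "CREATE INDEX IF NOT EXISTS object_metadata_owner_idx "
        "ON object_metadata (owner);"
    ),
    # Time to live: the inserted columns expire after 86400 seconds (one day).
    "ttl_insert": (
        "INSERT INTO object_metadata (bucket, object_key, owner, size_bytes) "
        "VALUES ('photos', 'tmp/upload-1', 'alice', 1024) USING TTL 86400;"
    ),
    # Counters live in their own table and are changed with UPDATE.
    "counter_table": (
        "CREATE TABLE IF NOT EXISTS bucket_usage ("
        "bucket text PRIMARY KEY, bytes_stored counter);"
    ),
    "counter_update": (
        "UPDATE bucket_usage SET bytes_stored = bytes_stored + 1024 "
        "WHERE bucket = 'photos';"
    ),
}

if __name__ == "__main__":
    for name, statement in STATEMENTS.items():
        print("-- %s" % name)
        print(statement)
```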
DataStax: Is Cassandra the only database you use?
Gary: Our platform is a combination of a file system for objects and Cassandra, which handles many different things such as storing each object’s metadata, reporting, and logging. We also use Redis for very small, specialized data needs.
DataStax: How do you manage everything?
Gary: We have our own management console that’s used along with other tools like Puppet that handle the installation and management activities of scaling out over hundreds of nodes.
DataStax: If someone brand new to Cassandra came to you for advice on the best ways to get started with the database, what words of wisdom would you pass along?
Gary: It’s important to understand the underlying design of Cassandra. How data is stored, how I/O works, what happens when a node goes down, etc. Don’t get hung up on a bunch of competitive comparison tables with checkmarks, and go with the one with the most checkmarks. Instead, you really need to know what your application needs and if the database has only two checkmarks – but those checkmarks are exactly what you need – then that can be enough.
DataStax: Gary, good info to have. Thanks for the time today.
Gary: Glad to do it.