Julien Anguenot, Director of Software Engineering at iland
iland internet solutions, founded in 1995, is a pure IaaS player providing enterprise cloud infrastructure and services with several datacenters in North America, Europe and Asia.
iland platform with Cassandra
Cassandra is the sole database used by the iland platform. It is distributed across iland’s datacenters and is the foundation of the customer-facing iland ECS2 portal.
The platform stores time series: real-time samples (collected from vSphere every 20 seconds) and historical rollups (1 minute, 1 hour, 1 day, 1 week and 1 month) for dozens of performance counters of virtual machines and their corresponding resource pools and networks.
Cassandra also stores usage data and the corresponding real-time and historical billing information, as well as infrastructure configuration, user information, etc. The platform also provides predictive analytics that help companies monitor performance, achieve consistency and anticipate growth requirements.
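As an illustration of how 20-second samples can be folded into a coarser rollup, here is a minimal Python sketch. It assumes that a rollup is a simple average per fixed-width time bucket; the source does not say which aggregation iland actually uses, and the function name `rollup` and the sample data are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def rollup(samples, bucket_seconds=60):
    """Average (timestamp, value) samples into fixed-width time buckets.

    Averaging is an assumption made for illustration; min, max or sum
    rollups would follow the same bucketing logic.
    """
    buckets = defaultdict(list)
    for ts, value in samples:
        # Align each timestamp to the start of its bucket.
        bucket_start = ts - (ts % bucket_seconds)
        buckets[bucket_start].append(value)
    return {start: mean(values) for start, values in sorted(buckets.items())}


# Six hypothetical 20-second samples covering two minutes.
samples = [(0, 10.0), (20, 20.0), (40, 30.0), (60, 40.0), (80, 50.0), (100, 60.0)]
print(rollup(samples))  # {0: 20.0, 60: 50.0}
```

The same function produces the coarser rollups by changing `bucket_seconds` (for example 3600 for the 1h rollup).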
The iland portal is essentially an easy-to-use, easy-to-understand front end (web and mobile) for the iland platform solutions. It covers a wealth of functionality, including visibility into resource consumption, billing, performance, the impact of change and other key areas. It also provides usage- and billing-based alerts as well as cloud management features.
Evaluating MongoDB and Cassandra
iland chose Apache Cassandra over MongoDB because Cassandra provides constant-time writes no matter how big the data set grows, and because of its distributed nature and its “massive” scalability, reliability, performance, availability, consistency and simplicity.
Constant-time writes, no matter how big the data set grows, are a must for our real-time performance counter collection, since the number of virtual machines to collect from will increase to tens of thousands, along with the number of workers concurrently performing operations at the application level.
We use a Cassandra 2.0.x cluster distributed across 5 datacenters (Los Angeles, CA – Reston, VA – London, UK – Manchester, UK – Singapore).
iland uses the 20x Debian deb packages hosted by the Apache foundation on Ubuntu 12.04 LTS.
We currently use CQL3 over Thrift, through Astyanax, but we plan to switch to the DataStax CQL Java driver when Astyanax 2.0 is released.
Each datacenter has at least one rack of 3 nodes and all data is replicated across all nodes in the cluster.
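A multi-datacenter layout like this is typically expressed in Cassandra as a keyspace using NetworkTopologyStrategy with a replication factor per datacenter. The Python sketch below only builds such a CQL statement as a string. The keyspace name `metrics` and the datacenter names are hypothetical (real names depend on the cluster's snitch configuration), and the factor of 3 per datacenter is an assumption based on the RF = 3 setting mentioned in this interview, not iland's actual schema.

```python
def create_keyspace_cql(keyspace, replication_by_dc):
    """Build a CREATE KEYSPACE statement using NetworkTopologyStrategy.

    replication_by_dc maps a datacenter name to its replication factor.
    """
    options = {"class": "NetworkTopologyStrategy"}
    options.update(replication_by_dc)
    # repr() renders strings with single quotes and integers bare,
    # which matches CQL map literal syntax for these simple values.
    body = ", ".join(f"'{key}': {value!r}" for key, value in options.items())
    return f"CREATE KEYSPACE IF NOT EXISTS {keyspace} WITH replication = {{{body}}};"


# Hypothetical datacenter names, each holding 3 replicas of every row.
datacenters = {"los_angeles": 3, "reston": 3, "london": 3, "manchester": 3, "singapore": 3}
print(create_keyspace_cql("metrics", datacenters))
```

Keeping the per-datacenter factors in a plain dictionary makes it easy to add a datacenter later without touching the statement-building code.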
To date, the cluster has 18 nodes in total and we are getting close to 1TB of data (the application was deployed empty in September 2013 with the brand-new iland ECS2 offering, so there was no legacy data to migrate to Cassandra).
(RF = replication factor)
RF = 3, W – LOCAL_QUORUM (2 nodes), R – LOCAL_QUORUM (2 nodes)
This configuration allows a single node to fail while still serving both reads and writes. It comes at the cost of a larger data footprint in terms of storage size. It lets us use Cassandra’s tunable consistency to our advantage: all reads are consistent, while availability stays as high as possible when running on 3 nodes.
Cassandra nodes run on Ubuntu-powered virtual machines (in vSphere). Each node has 16GB of RAM and 8 vCPUs.
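The arithmetic behind these numbers can be checked with a short Python sketch. It uses the standard quorum rule (a quorum is a majority of the replicas, i.e. floor(RF / 2) + 1) and the usual overlap condition W + R > RF for reads to see the latest acknowledged write. The function names are hypothetical and exist only for illustration.

```python
def quorum(rf):
    """Number of replicas that form a majority for a replication factor rf."""
    return rf // 2 + 1


def tolerated_failures(rf):
    """Replicas that can be down while a quorum is still reachable."""
    return rf - quorum(rf)


def reads_see_latest_write(rf, write_replicas, read_replicas):
    """True when every read set overlaps every write set (W + R > RF)."""
    return write_replicas + read_replicas > rf


rf = 3
q = quorum(rf)
print(q)                                 # 2
print(tolerated_failures(rf))            # 1
print(reads_see_latest_write(rf, q, q))  # True
```

With RF = 3, writing and reading at quorum touches 2 replicas each, so any read overlaps the most recent write on at least one replica, and one replica can be lost without affecting either operation.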
My advice would be to start with a single-node instance, to avoid clustering-related concerns initially, and to use CQL3 (vs. Thrift) from the start.
The documentation is great, the issue tracker and mailing list are great sources of information, upgrading and maintaining Cassandra is painless, and drivers such as the DataStax CQL drivers for Python and Java, as well as Netflix’s Astyanax, have been working just great for us.