Outbrain, a content recommendation platform, undertook the challenging task of migrating its large-scale production clusters from DataStax Enterprise (DSE) 4.8.x to Apache Cassandra 3.11. The company uses Cassandra extensively, with around 30 production clusters of varying sizes across three data centers. The primary motivation behind the upgrade was to achieve better performance and efficiency by leveraging the newer features of Apache Cassandra 3.11.
The Decision-Making Process
Outbrain considered two options: upgrading to the newer commercial distribution provided by DataStax, or migrating to the Apache Cassandra distribution. They opted for Apache Cassandra 3.11 because they did not use any of the DataStax Enterprise features and were satisfied with the core functionality offered by Apache Cassandra.
The Proof of Concept (POC)
Outbrain conducted a POC to validate whether Apache Cassandra 3.11 would meet their needs. They built a new cluster and aimed to achieve acceptable read and write latencies with fewer nodes. Through a series of tests and problem-solving, they found that they could benefit from storage space savings and a faster read path, and that clusters using LeveledCompactionStrategy (LCS) could potentially store more data per node.
The Migration Plan
The migration plan included several steps:
- Upgrade to the latest DSE 4.x version (4.8.14)
- Upgrade to DSE 5.0.14 (Cassandra 3.0)
- Upgrade sstables
- Upgrade to DSE 5.1.4 (Cassandra 3.11)
- Replace DSE with Apache Cassandra 3.11.1
Outbrain then proceeded to upgrade their clusters, carefully following the migration plan to ensure minimal downtime and impact on performance.
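The ordering of the plan above can be sketched in Python. This is a minimal illustration, not Outbrain's actual tooling: it assumes a rolling upgrade in which every node completes a step before any node begins the next one, and the `Step` class, `rolling_plan` function, and node names are hypothetical. The only real commands referenced are `nodetool drain`, `nodetool upgradesstables`, and `nodetool status`; the stop, install, and start actions are left as plain descriptions because the source does not specify them.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Step:
    description: str
    # True for the step that only rewrites on-disk data and needs no restart.
    rewrites_sstables: bool = False


# Order taken from the migration plan listed above.
MIGRATION_STEPS = [
    Step("Upgrade to DSE 4.8.14"),
    Step("Upgrade to DSE 5.0.14"),
    Step("Upgrade sstables", rewrites_sstables=True),
    Step("Upgrade to DSE 5.1.4"),
    Step("Replace DSE with Apache Cassandra 3.11.1"),
]


def rolling_plan(nodes: List[str]) -> List[str]:
    """Return per-node actions, finishing each step on all nodes
    before moving to the next step (an assumed rolling strategy)."""
    actions = []
    for step in MIGRATION_STEPS:
        for node in nodes:
            if step.rewrites_sstables:
                # Rewrite existing sstables into the new on-disk format.
                actions.append(f"{node}: nodetool upgradesstables")
            else:
                # Flush memtables and stop accepting writes before the restart.
                actions.append(f"{node}: nodetool drain")
                actions.append(f"{node}: stop service, then: {step.description}")
                actions.append(f"{node}: start service, wait for UN in nodetool status")
    return actions


if __name__ == "__main__":
    for line in rolling_plan(["node1", "node2"]):
        print(line)
```

Keeping the sstable rewrite as its own step mirrors the plan: the data files are converted to the new format after the first major-version jump and before the next upgrade.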
The Results
The migration from DSE 4.8.x to Apache Cassandra 3.11 was successful, bringing significant performance improvements and major storage savings. The new storage format also resulted in lower JVM pressure during reads. As a result, load tests performed before and after the migration showed lower read latencies across all three data centers.
Conclusion
Outbrain’s successful in-place migration from DSE 4.8.x to Apache Cassandra 3.11 shows that such an upgrade can be carried out on running production clusters without compromising performance or causing downtime. With careful planning, thorough testing, and a well-executed migration strategy, companies can gain the benefits of Apache Cassandra 3.11 while minimizing disruption to their operations.