
Why We’re Moving to a Source Available License

TL;DR

ScyllaDB has decided to focus on a single release stream – ScyllaDB Enterprise. Starting with the ScyllaDB Enterprise 2025.1 release (ETA February 2025):

- ScyllaDB Enterprise will change from closed source to source available.
- ScyllaDB OSS AGPL 6.2 will stand as the final OSS AGPL release.
- A free tier of the full-featured ScyllaDB Enterprise will be available to the community. This includes all the performance, efficiency, and security features previously reserved for ScyllaDB Enterprise.
- For convenience, the existing ScyllaDB Enterprise 2024.2 will gain the new source available license starting from our next patch release (in December), allowing easy migration of older releases.
- The source available Scylla Manager will move to AGPL and the closed source Kubernetes multi-region operator will be merged with the main Apache-licensed ScyllaDB Kubernetes operator.
- Other ScyllaDB components (e.g., Seastar, Kubernetes operator, drivers) will keep their current licenses.

Why are we doing this?

ScyllaDB’s team has always been extremely passionate about open source, low-level optimizations, and the delivery of groundbreaking core technologies – from hypervisors (KVM, Xen), to operating systems (Linux, OSv), and the ScyllaDB database. Over our 12 years of existence, we developed an OS, pivoted to the database space, developed Seastar (the open source standalone core engine of ScyllaDB), and developed ScyllaDB itself. Dozens of open source projects were created: drivers, a Kubernetes operator, test harnesses, and various tools.

Open source is an outstanding way to share innovation. It is a straightforward choice for projects that are not your core business. However, it is a constant challenge for vendors whose core product is open source. For almost a decade, we have been maintaining two separate release streams: one for the open source database and one for the enterprise product. Balancing the free vs. paid offerings is a never-ending challenge that involves engineering, product, marketing, and constant sales discussions.

Unlike other projects that decided to switch to source available or BSL to protect themselves from “free ride” competition, we were comfortable with AGPL. We took different paths, from the initial reimplementation of the Apache Cassandra API, to an open source implementation of a DynamoDB-compatible API. Beyond the license, we followed the whole approach of ‘open source first.’ Almost every line of code – from a new feature, to a bug fix – went to the open source branch first.

We were developing two product lines that competed with one another, and we had to make one of them dramatically better. It’s hard enough to develop a single database and support Docker, Kubernetes, virtual and physical machines, and offer a database-as-a-service. The value of developing two separate database products, along with their release trains, ultimately does not justify the massive overhead and incremental costs required. To give you some idea of what’s involved, we have had nearly 60 public releases throughout 2024.

Moreover, we have been the single significant contributor of the source code. Our ecosystem tools have received a healthy amount of contributions, but not the core database. That makes sense. The ScyllaDB internal implementation is a C++, shard-per-core, future-promise code base that is extremely hard to understand and requires full-time devotion. Thus, in terms of the code, we operated as a fully open-source-first project. However, in reality, we benefitted from this no more than a source-available project would.

“Behind the curtain” tradeoffs of free vs paid

Balancing our requirements (of open source first, efficient development, no crippling of our OSS, and differentiation between the two branches) has been challenging, to say the least. Our open source first culture drove us to develop new core features in the open.
Our engineers released these features before we were prepared to decide what was appropriate for open source and what was best for the enterprise paid offering. For example, Tablets, our recent architectural shift, was all developed in the open – and 99% of its end user value is available in the OSS release.

As the Enterprise version branched out of the OSS branch, it was helpful to keep a unified base for reuse and efficiency. However, it reduced our paid version differentiation since all features were open by default (unless flagged).

For a while, we thought that the OSS release would be the latest and greatest and have a short lifecycle as a differentiation and a means of efficiency. Although maintaining this process required a lot of effort on our side, this could have been a nice mitigation option, a replacement for a feature/functionality gap between free and paid. However, the OSS users didn’t really use the latest and didn’t always upgrade. Instead, most users preferred to stick to old, end-of-life releases. The result was a lose-lose situation (for users and for us).

Another approach we used was to differentiate by using peripheral tools – such as Scylla Manager, which helps to operate ScyllaDB (e.g., running backup/restore and managing repairs) – and having a usage limit on them. Our Kubernetes operator is open source and we added a separate closed source repository for multi-region support for Kubernetes. This is a complicated path for development and also for our paying users.

The factor that eventually pushed us over the line is that our new architecture – with Raft, tablets, and native S3 – moves peripheral functionality into the core database:

- Our backup and restore implementation moves from an agent and external manager into the core database. S3 I/O access for backup and restore (and, in the future, for tiered storage) is handled directly by the core database. The I/O operations are controlled by our schedulers, allowing full prioritization and bandwidth control. Later on, “point in time recovery” will be provided. This is a large unifying overhaul that eliminates complexity while improving control.
- Repair becomes automatic. Repair is a full-scan, backend process that merges inconsistent replica data. Previously, it was controlled by the external Scylla Manager. The new generation core database runs its own automatic repair with tablet awareness. As a result, there is no need for an external peripheral tool; repair will become transparent to the user, like compaction is today.

These changes are leading to a more complete core product, with better manageability and functionality. However, they eat into the differentiators for our paid offerings.

As you can see, the combination of architecture consolidation and the effort of maintaining multiple release streams has made our lives extremely complicated and slowed down our progress.

Going forward

After a tremendous amount of thought and discussion on these points, we decided to unify the two release streams as described at the start of this post. This license shift will allow us to better serve our customers as well as provide increased free tier value to the community.
The new model opens up access to previously-restricted capabilities that:

- Achieve up to 50% higher throughput and 33% lower latency via profile-guided optimization
- Speed up node addition/removal by 30X via file-based streaming
- Balance multiple workloads with different performance needs on a single cluster via workload prioritization
- Reduce network costs with ZSTD-based network compression (with a shard dictionary) for intra-node RPC
- Combine the best of Leveled Compaction Strategy and Size-tiered Compaction Strategy with Incremental Compaction Strategy – resulting in 35% better storage utilization
- Use encryption at rest, LDAP integration, and all of the other benefits of the previous closed source Enterprise version
- Provide a single (all open source) Kubernetes operator for ScyllaDB
- Enable users to enjoy a longer product life cycle

This was a difficult decision for us, and we know it might not be well-received by some of our OSS users running large ScyllaDB clusters. We appreciate your journey and we hope you will continue working with ScyllaDB. After 10 years, we believe this change is the right move for our company, our database, our customers, and our early adopters.

With this shift, our team will be able to move faster, better respond to your needs, and continue making progress towards the major milestones on our roadmap: Raft for data, optimized tablet elasticity, and tiered (S3) storage.

Read the FAQ
