
Unexpected Behavior with TimeWindowCompactionStrategy in ScyllaDB 6.1 Open Source

I’m using ScyllaDB 6.1 Open Source and have a table configured to store 30 days of data with the following compaction strategy:

compaction = { 'class': 'TimeWindowCompactionStrategy', 'compaction_window_size': '3', 'compaction_window_unit': 'DAYS', 'max_threshold': '32', 'min_threshold': '4' }

Observations

  1. Pre-7 Jan Behavior: Until January 7, the table had SSTables grouped in 3-day windows, such as (Dec 23, Dec 25, Dec 28, Dec 31, Jan 3). This aligned with the expected behavior of the configured compaction strategy.
  2. On 7 Jan: After an autocompaction was triggered, a new SSTable was created for January 7 only, deviating from the 3-day window grouping behavior.
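
One way to sanity-check the grouping above is to compute which window a given timestamp would land in. The following is a minimal sketch, not ScyllaDB's actual code. It assumes windows are aligned to the Unix epoch (the timestamp rounded down to a multiple of the window size, as Apache Cassandra's TWCS does; that ScyllaDB 6.1 behaves identically is an assumption), and it assumes the dates are December 2024 and January 2025, since the year is not stated here. The name `window_lower_bound` is made up for illustration.

```python
from datetime import datetime, timezone

WINDOW_DAYS = 3  # compaction_window_size, with compaction_window_unit = DAYS
WINDOW_SECONDS = WINDOW_DAYS * 24 * 3600


def window_lower_bound(ts: datetime) -> datetime:
    """Start of the epoch-aligned window containing ts (alignment is an assumption)."""
    seconds = int(ts.timestamp())
    return datetime.fromtimestamp(seconds - seconds % WINDOW_SECONDS, tz=timezone.utc)


# Assumed year: only day and month are known for the observed SSTables.
samples = [
    datetime(2024, 12, 25, tzinfo=timezone.utc),
    datetime(2025, 1, 3, tzinfo=timezone.utc),
    datetime(2025, 1, 7, 12, tzinfo=timezone.utc),
]
for ts in samples:
    print(ts.date(), "->", window_lower_bound(ts).date())
```

Under these assumptions, Dec 25 and Jan 3 are themselves window starts, while Jan 7 falls in a window that starts on Jan 6. An SSTable dated January 7 could therefore be the first SSTable of a new 3-day window rather than a break in the pattern. This is only a consistency check, not a confirmed explanation.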

Additional Issue

Upon further investigation, I noticed that within the same 3-day window, there are multiple small SSTables instead of one large SSTable. These smaller SSTables are not being compacted into a single SSTable, even though the compaction strategy specifies min_threshold = 4 and max_threshold = 32.

Questions

  1. Why did the compaction on January 7 result in a new SSTable for just that day instead of following the 3-day grouping?
  2. Why are the smaller SSTables within the same 3-day window not being compacted into a single large SSTable as expected?
  3. Are there specific conditions under which TimeWindowCompactionStrategy skips compaction or behaves differently for insert-only workloads?
  4. Could this behavior be linked to autocompaction triggering mechanisms or internal thresholds not accounted for in the current configuration?

I’d appreciate any insights or suggestions for troubleshooting and resolving this issue.

Thank you in advance!

I configured TimeWindowCompactionStrategy with a 3-day window, expecting SSTables to compact into larger ones within each window. However, after autocompaction on January 7, a new SSTable was created for just that day, and multiple small SSTables remained instead of being compacted into a single larger one.
