PLEASE READ: MAXIMUM TTL EXPIRATION DATE NOTICE (CASSANDRA-14092)
------------------------------------------------------------------
(General upgrading instructions are available in the next section)

The maximum expiration timestamp that can be represented by the storage engine is
2038-01-19T03:14:06+00:00, which means that inserts with TTLs that expire after this date
are not currently supported. By default, INSERTS with TTL exceeding the maximum supported
date are rejected, but it's possible to choose a different expiration overflow policy.
See CASSANDRA-14092.txt for more details.

Prior to 3.0.16 (3.0.X) and 3.11.2 (3.11.x) there was no protection against INSERTS with TTL
expiring after the maximum supported date, causing the expiration time field to overflow
and the records to expire immediately. Clusters in the 2.X and lower series are not subject
to this when assertions are enabled. Backed up SSTables can be potentially recovered and
recovery instructions can be found in the CASSANDRA-14092.txt file.

If you use or plan to use very large TTLS (10 to 20 years), read CASSANDRA-14092.txt for
more information.

PLEASE READ: CVE-2021-44521 SCRIPTED UDF SYSTEM ACCESS (CASSANDRA-17352)
------------------------------------------------------------------------
If you have enabled scripted UDFs and run without UDF threads in cassandra.yaml:

    enable_user_defined_functions_threads: false

an attacker could access java.lang.System methods and execute arbitrary code on the machine.
Disabling UDF threads is still considered insecure and not recommended. To continue running
without UDF threads you will need to set:

    allow_insecure_udfs: true

and if you need access to java.lang.System for existing UDFs, set:

    allow_extra_insecure_udfs: true

GENERAL UPGRADING ADVICE FOR ANY VERSION
========================================

Snapshotting is fast (especially if you have JNA installed) and takes
effectively zero disk space until you start compacting the live data
files again.
Thus, best practice is to ALWAYS snapshot before any upgrade, just in case you need to roll
back to the previous version. (Cassandra version X + 1 will always be able to read data files
created by version X, but the inverse is not necessarily the case.)

When upgrading major versions of Cassandra, you will be unable to restore snapshots created
with the previous major version using the 'sstableloader' tool. You can upgrade the file
format of your snapshots using the provided 'sstableupgrade' tool.

4.1.1
=====

G1GC Recommended
----------------
    - The G1 settings in jvm8-server.options and jvm11-server.options are updated according to
      broad feedback and testing. The G1 settings remain commented out by default in 4.1.x. It is
      recommended to switch to G1 for performance and for simpler GC tuning. CMS is already
      deprecated in JDK9, and the next major release of Cassandra makes G1 the default
      configuration.

Upgrading
---------
    - All previous versions of 4.x contained a mistake in the implementation of the old CQL
      native protocol v3. That mistake produced issues when paging over tables with compact
      storage and a single clustering column during rolling upgrades involving 3.x and 4.x
      nodes. As a consequence of the fix, the issue can now appear during rolling upgrades
      from 4.1.0 or 4.0.0-4.0.7. If that is your case, please use protocol v4 or higher in
      your driver. See CASSANDRA-17507 for further details.

4.1
===

New features
------------
    - Added API for alternative memtable implementations. For details, see
      src/java/org/apache/cassandra/db/memtable/
    - Added a new guardrails framework that allows defining soft/hard limits for different user
      actions, such as limiting the number of tables, columns per table or the size of
      collections. These guardrails are only applied to regular user queries, and superusers
      and internal queries are excluded. Reaching the soft limit raises a client warning,
      whereas reaching the hard limit aborts the query.
      In both cases a log message and a diagnostic event are emitted. Additionally, some
      guardrails are not linked to specific user queries due to technical limitations, such as
      detecting the size of large collections during compaction or periodically monitoring the
      disk usage. These guardrails only emit the proper logs and diagnostic events when
      triggered, without aborting any processes. Guardrails config is defined through
      cassandra.yaml properties, and they can be dynamically updated through the JMX MBean
      `org.apache.cassandra.db:type=Guardrails`. There are guardrails for:
        - Number of user keyspaces.
        - Number of user tables.
        - Number of columns per table.
        - Number of secondary indexes per table.
        - Number of materialized views per table.
        - Number of fields per user-defined type.
        - Number of items in a collection.
        - Number of partition keys selected by an IN restriction.
        - Number of partition keys selected by the cartesian product of multiple IN restrictions.
        - Allowed table properties.
        - Allowed read consistency levels.
        - Allowed write consistency levels.
        - Collections size.
        - Query page size.
        - Minimum replication factor.
        - Data disk usage, defined either as a percentage or as an absolute size.
        - Whether user-defined timestamps are allowed.
        - Whether GROUP BY queries are allowed.
        - Whether the creation of secondary indexes is allowed.
        - Whether the creation of uncompressed tables is allowed.
        - Whether querying with ALLOW FILTERING is allowed.
        - Whether DROP or TRUNCATE TABLE commands are allowed.
    - Add support for the use of pure monotonic functions on the last attribute of the GROUP BY
      clause.
    - Add floor functions that can be used to group by time range.
    - Support for native transport rate limiting via native_transport_rate_limiting_enabled and
      native_transport_max_requests_per_second in cassandra.yaml.
    - Support for pre hashing passwords on CQL DCL commands
    - Expose all client options via system_views.clients and nodetool clientstats --client-options.
    - Add new nodetool compactionstats --vtable option to match the sstable_tasks vtable.
    - Support for String concatenation has been added through the + operator.
    - New configuration max_hints_size_per_host to limit the size of local hints files per host
      in mebibytes. Setting it to a non-positive value disables the limit, which is the default
      behavior. Setting it to a positive value ensures that the total size of the hints files
      per host does not exceed the limit.
    - Added ability to configure auth caches through corresponding `nodetool` commands.
    - CDC data flushing can now be configured to be non-blocking with the configuration
      cdc_block_writes. When set to true, any writes to the CDC-enabled tables are blocked once
      the limit for CDC data on disk is reached, which is the existing and default behavior.
      When set to false, writes to the CDC-enabled tables are accepted and the oldest CDC data
      on disk is deleted to stay within the size constraint.
    - Top partitions based on partition size or tombstone count are now tracked per table. These
      partitions are stored in a new system.top_partitions table and exposed via JMX and
      nodetool tablestats. The partitions are tracked during full or validation repairs but not
      incremental ones since those don't include all sstables and the partition size/tombstone
      count would not be correct.
    - New native functions to convert unix time values into C* native types: toDate(bigint),
      toTimestamp(bigint), mintimeuuid(bigint) and maxtimeuuid(bigint)
    - Support for multiple permissions in a single GRANT/REVOKE/LIST statement has been added.
      It allows granting, revoking or listing multiple permissions using a single statement by
      providing a list of comma-separated permissions.
    - A new ALL TABLES IN KEYSPACE resource has been added. It allows granting permissions for
      all tables and user types in a keyspace while preventing the user from using those
      permissions on the keyspace itself.
    - Added support for type casting in the WHERE clause components and in the values of INSERT
      and UPDATE statements.
    - A new implementation of Paxos (named v2) has been included that improves the safety and
      performance of LWT operations. Importantly, v2 guarantees linearizability across safe
      range movements, so users are encouraged to enable v2. v2 also halves the number of WAN
      messages required to be exchanged if used in conjunction with the new Paxos Repair
      mechanism (see below) and with some minor modifications to applications using LWTs. The
      new implementation may be enabled at any time by setting paxos_variant: v2, and disabled
      by setting it to v1, and this alone will reduce the number of WAN round-trips by between
      one and two for reads, and one for writes.
    - A new Paxos Repair mechanism has been introduced as part of Repair, that permits further
      reducing the number of WAN round-trips for write LWTs. This process may be manually
      executed for v1 and is run automatically alongside normal repairs for v2. Once users are
      running regular repairs that include paxos repairs they are encouraged to set
      paxos_state_purging: repaired. Once this has been set across the cluster, users are
      encouraged to set their applications to supply a Commit consistency level of ANY with
      their LWT write operations, saving one additional WAN round-trip. See upgrade notes below.
    - Warn/fail thresholds added to read queries notifying clients when these thresholds trigger
      (by emitting a client warning or failing the query). This feature is disabled by default,
      scheduled to be enabled in 4.2; it is controlled with the configuration
      read_thresholds_enabled, setting to true will enable this feature.
      Each check has its own warn/fail thresholds, currently tombstones
      (tombstone_warn_threshold, and tombstone_failure_threshold), coordinator result set
      materialized size (coordinator_read_size_warn_threshold and
      coordinator_read_size_fail_threshold), local read materialized heap size
      (local_read_size_warn_threshold and local_read_size_fail_threshold), and RowIndexEntry
      estimated memory size (row_index_read_size_warn_threshold and
      row_index_read_size_fail_threshold) are supported; more checks will be added over time.
    - Prior to this version, the hint system was storing a window of hints as defined by
      configuration property max_hint_window_in_ms, however this window was not persistent
      across restarts. For example, if a node is restarted, it will be still eligible for a hint
      to be sent to it because it was down less than max_hint_window_in_ms. Hence if that node
      continues restarting without hint delivery completing, hints will be sent to that node
      indefinitely which would occupy more and more disk space. This behaviour was changed in
      CASSANDRA-14309. From now on, by default, if a node is not down longer than
      max_hint_window_in_ms, there is an additional check to see if there is a hint to be
      delivered which is older than max_hint_window_in_ms. If there is, a hint is not persisted.
      If there is not, it is. The behaviour of previous versions can be restored by setting the
      property hint_window_persistent_enabled to false. This property is by default set to true.
    - Added a new feature to allow denylisting access (i.e. configurably blocking reads, writes,
      or range reads) to partition keys in configured keyspaces and tables. See
      doc/operating/denylisting_partitions.rst for details on using this new feature. Also see
      CASSANDRA-12106.
    - Information about pending hints is now available through `nodetool listpendinghints` and
      `pending_hints` virtual table.
    - Added ability to invalidate auth caches through corresponding `nodetool` commands and
      virtual tables.
    - DCL statements in audit logs will now obscure only the password if they don't fail to parse.
    - Starting from 4.1 sstables support UUID based generation identifiers. They are globally
      unique and thus they let the node create sstables without any prior knowledge about the
      existing sstables in the data directory. The feature is disabled by default in
      cassandra.yaml because once enabled, there is no easy way to downgrade. When the node is
      restarted with UUID based generation identifiers enabled, each newly created sstable will
      have a UUID based generation identifier and such files are not readable by previous
      Cassandra versions. In the future those new identifiers will become enabled by default.
    - Resetting schema behavior has changed in 4.1 so that: 1) resetting schema is prohibited
      when there is no live node where the schema could be fetched from, and 2) truncating
      local schema keyspace is postponed to the moment when the node receives schema from some
      other node.

Upgrading
---------
    - `cache_load_timeout_seconds` being negative for disabled is equivalent to
      `cache_load_timeout` = 0 for disabled.
    - `sstable_preemptive_open_interval_in_mb` being negative for disabled is equivalent to
      `sstable_preemptive_open_interval` being null again. In the JMX MBean
      `org.apache.cassandra.db:type=StorageService`, the setter method
      `setSSTablePreemptiveOpenIntervalInMB` still takes `intervalInMB` negative numbers for
      disabled.
    - `enable_uuid_sstable_identifiers` parameter from 4.1 alpha1 was renamed to
      `uuid_sstable_identifiers_enabled`.
    - `index_summary_resize_interval_in_minutes = -1` is equivalent to
      index_summary_resize_interval being set to `null` or disabled. In the JMX MBean
      `org.apache.cassandra.db:type=IndexSummaryManager`, the setter method
      `setResizeIntervalInMinutes` still takes `resizeIntervalInMinutes = -1` for disabled.
    - min_tracked_partition_size_bytes parameter from 4.1 alpha1 was renamed to
      min_tracked_partition_size.
    - Parameters of type data storage, duration and data rate cannot be set to Long.MAX_VALUE
      (former parameters of long type) and Integer.MAX_VALUE (former parameters of int type).
      Those numbers are used during conversion between units to prevent an overflow from
      happening. (CASSANDRA-17571)
    - We added new JMX methods `setStreamThroughputMbitPerSec`, `getStreamThroughputMbitPerSec`,
      `setInterDCStreamThroughputMbitPerSec`, `getInterDCStreamThroughputMbitPerSec` to the JMX
      MBean `org.apache.cassandra.db:type=StorageService`. They replace the now deprecated
      methods `setStreamThroughputMbPerSec`, `getStreamThroughputMbPerSec`,
      `setInterDCStreamThroughputMbPerSec`, and `getInterDCStreamThroughputMbPerSec`, which
      will be removed in a future major release.
    - The config property `repair_session_space_in_mb` was wrongly advertised in previous
      versions as being set in megabytes, when it is interpreted internally in mebibytes. To
      reduce the confusion we added two new JMX methods
      `setRepairSessionSpaceInMebibytes(int sizeInMebibytes)` and
      `getRepairSessionSpaceInMebibytes`. They replace the now deprecated methods
      `setRepairSessionSpaceInMegabytes(int sizeInMegabytes)` and
      `getRepairSessionSpaceInMegabytes`, which will be removed in a future major release.
    - There is a new cassandra.yaml version 2. Units suffixes should be provided for all rates
      (B/s|KiB/s|MiB/s), memory (B|KiB|MiB|GiB) and duration (d|h|m|s|ms|us|µs|ns) parameters.
      List of changed parameters and details to consider during configuration setup can be
      found at (CASSANDRA-15234). Backward compatibility with the old cassandra.yaml file will
      be in place until at least the next major version. By default we refuse starting
      Cassandra with a config containing both old and new config keys for the same parameter.
      Start Cassandra with -Dcassandra.allow_new_old_config_keys=true to override.
      For historical reasons duplicate config keys in cassandra.yaml are allowed by default,
      start Cassandra with -Dcassandra.allow_duplicate_config_keys=false to disallow this.
    - Many cassandra.yaml parameters' names have been changed. Full list and details to consider
      during configuration setup when installing/upgrading Cassandra can be found at
      (CASSANDRA-15234)
    - Negative values cannot be used for parameters of type data rate, duration and data storage
      with both old and new cassandra.yaml version. Only exception is if you use old
      cassandra.yaml, pre-CASSANDRA-15234 - then -1 or other negative values which were
      advertised as an option to disable config parameters in the old cassandra.yaml are still
      used. Those are probably converted to null value with the new cassandra.yaml, as written
      in the new cassandra.yaml version and docs.
    - Before you upgrade, if you are using `cassandra.auth_bcrypt_gensalt_log2_rounds` property,
      confirm it is set to a value lower than 31, otherwise Cassandra will fail to start. See
      CASSANDRA-9384 for further details. You also need to regenerate passwords for users for
      whom the password was created while the above property was set to be more than 30,
      otherwise they will not be able to log in.
    - JNA library was updated from 5.6.0 to 5.9.0. In version 5.7.0, Darwin support for M1
      devices was fixed but the prebuilt native library for Darwin x86 (32bit Java on Mac OS)
      was removed.
    - The config properties for setting the streaming throughput
      `stream_throughput_outbound_megabits_per_sec` and
      `inter_dc_stream_throughput_outbound_megabits_per_sec` were incorrectly interpreted as
      mebibits. This has been fixed by CASSANDRA-17243, so the values for these properties will
      now indicate a throughput ~4.6% lower than what was actually applied in previous versions.
      This also affects the setters and getters for these properties in the JMX MBean
      `org.apache.cassandra.db:type=StorageService` and the nodetool commands
      `set/getstreamthroughput` and `set/getinterdcstreamthroughput`.
    - Steps for upgrading Paxos
        - Set paxos_variant: v2 across the cluster. This may be set via JMX, but should also be
          written persistently to any yaml.
        - Ensure paxos repairs are running regularly, either as part of normal incremental
          repair workflows or on their own separate schedule. These operations are cheap and
          better to run frequently (e.g. once per hour)
        - Set paxos_state_purging: repaired across the cluster. This may be set via JMX, but
          should also be written persistently to any yaml. NOTE: once this has been set, you
          must not restore paxos_state_purging: legacy. If this setting must be disabled you
          must instead set paxos_state_purging: gc_grace. This may be necessary if paxos repairs
          must be disabled for some reason on an extended basis, but in this case your
          applications must restore default commit consistency to ensure correctness.
        - Applications may now safely be updated to use ANY commit consistency level (or
          LOCAL_QUORUM, as preferred). Uncontended writes should now take 2 round-trips, and
          uncontended reads should typically take one round-trip.
    - A required [f|force] flag has been added to both "nodetool verify" and the standalone
      "sstableverify" tools. These tools have some subtleties and should not be used unless the
      operator is familiar with what they do and do not do, as well as the edge cases associated
      with their use.
      NOTE: ANY SCRIPTS THAT RELY ON sstableverify OR nodetool verify WILL STOP WORKING UNTIL
      MODIFIED. Please see CASSANDRA-17017 for details:
    - `MutationExceededMaxSizeException` thrown when a mutation exceeds `max_mutation_size`
      inherits from `InvalidRequestException` instead of `RuntimeException`. See
      CASSANDRA-17456 for details.
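
The "~4.6% lower" figure quoted above for the streaming throughput properties follows from
the ratio between a megabit (10^6 bits) and a mebibit (2^20 bits). The sketch below is
illustrative arithmetic only, not Cassandra code; the function names are hypothetical and
exist purely for this example.

```python
# Illustrative arithmetic only; these helpers are hypothetical and not part of Cassandra.
MEGABIT = 1_000_000  # bits in one megabit (SI prefix)
MEBIBIT = 1_048_576  # bits in one mebibit (2**20)


def applied_before_fix_megabits(configured: float) -> float:
    """Throughput in megabits/s that was effectively applied before the fix,
    when the configured number was interpreted as mebibits/s."""
    return configured * MEBIBIT / MEGABIT


def relative_drop() -> float:
    """Fraction by which the applied throughput decreases once the same
    configured number is interpreted as megabits/s instead of mebibits/s."""
    return 1 - MEGABIT / MEBIBIT


if __name__ == "__main__":
    print(f"{relative_drop() * 100:.2f}%")  # prints 4.63%
    print(applied_before_fix_megabits(200))
```

Under these assumptions, an operator who wants to keep the previously applied rate would
raise the configured value by the factor 1.048576.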

Deprecation
-----------
    - In the command line options for ``: deprecate the `-t`, `--throttle`, `-idct`, and
      `--inter-dc-throttle` options for setting the throttle and inter-datacenter throttle
      options in Mbps. Instead, users are instructed to use the `--throttle-mib`, and
      `--inter-dc-throttle-mib` for setting the throttling options in MiB/s. Additionally, in
      the loader options builder `$Builder`: deprecate the `throttle(int)`,
      `interDcThrottle(int)`, `entireSSTableThrottle(int)`, and the
      `entireSSTableInterDcThrottle(int)` methods.
    - In the JMX MBean `org.apache.cassandra.db:type=StorageService`:
      deprecate getter method `getStreamThroughputMbitPerSec` in favor of getter method
      `getStreamThroughputMbitPerSecAsDouble`;
      deprecate getter method `getStreamThroughputMbPerSec` in favor of getter methods
      `getStreamThroughputMebibytesPerSec` and `getStreamThroughputMebibytesPerSecAsDouble`;
      deprecate getter method `getInterDCStreamThroughputMbitPerSec` in favor of getter method
      `getInterDCStreamThroughputMbitPerSecAsDouble`;
      deprecate getter method `getInterDCStreamThroughputMbPerSec` in favor of getter methods
      `getInterDCStreamThroughputMebibytesPerSecAsDouble`;
      deprecate getter method `getCompactionThroughputMbPerSec` in favor of getter methods
      `getCompactionThroughtputMibPerSecAsDouble` and `getCompactionThroughtputBytesPerSec`;
      deprecate setter methods `setStreamThroughputMbPerSec` and `setStreamThroughputMbitPerSec`
      in favor of `setStreamThroughputMebibytesPerSec`;
      deprecate setter methods `setInterDCStreamThroughputMbitPerSec` and
      `setInterDCStreamThroughputMbPerSec` in favor of
      `setInterDCStreamThroughputMebibytesPerSec`.
      The deprecated JMX methods may return a rounded value so if precision is important, you
      want to use the new getters.
      While those deprecated JMX getters will return a rounded number, the nodetool commands
      `getstreamthroughput` and `getinterdcstreamthroughput` will throw Runtime Exceptions
      advising to use the new -d flag in case an integer cannot be returned. See
      CASSANDRA-17725 for further details.
    - Deprecate public method `setRate(final double throughputMbPerSec)` in `CompactionManager`
      in favor of `setRateInBytes(final double throughputBytesPerSec)`
    - `withBufferSizeInMB(int size)` in `StressCQLSSTableWriter.Builder` class is deprecated in
      favor of `withBufferSizeInMiB(int size)`. No change of functionality in the new one, only
      name change for clarity in regards to units and to follow naming standardization.
    - `withBufferSizeInMB(int size)` in `CQLSSTableWriter.Builder` class is deprecated in favor
      of `withBufferSizeInMiB(int size)`. No change of functionality in the new one, only name
      change for clarity in regards to units and to follow naming standardization.
    - The properties `keyspace_count_warn_threshold` and `table_count_warn_threshold` in
      cassandra.yaml have been deprecated in favour of the new `guardrails.keyspaces` and
      `guardrails.tables` properties and will be removed in a subsequent major version. This
      also affects the setters and getters for those properties in the JMX MBean
      `org.apache.cassandra.db:type=StorageService`, which are equally deprecated in favour of
      the analogous methods in the JMX MBean `org.apache.cassandra.db:type=Guardrails`. See
      CASSANDRA-17195 for further details.
    - The functionality behind the property `windows_timer_interval` was removed as part of
      CASSANDRA-16956. The property is still present but it is deprecated and it is just a
      place-holder to prevent breaking upgrades. This property is expected to be fully removed
      in the next major release of Cassandra.

4.0
===

New features
------------
    - Full support for Java 11, it is not experimental anymore.
    - The data of the system keyspaces using a local strategy (with the exception of the
      system.batches, system.paxos, system.compaction_history, system.prepared_statements and
      tables) is now stored by default in the first data directory, instead of being
      distributed among all the data directories. This approach will allow the server to
      tolerate the failure of the other disks. To ensure that a disk failure will not bring a
      node down, it is possible to use the system_data_file_directory yaml property to store
      the local system keyspaces data on a directory that provides redundancy. On node startup
      the local system keyspaces data will be automatically migrated if needed to the correct
      location.
    - Nodes will now bootstrap all intra-cluster connections at startup by default and wait 10
      seconds for all but one node in the local data center to be connected and marked UP in
      gossip. This prevents nodes from coordinating requests and failing because they aren't
      able to connect to the cluster fast enough. block_for_peers_timeout_in_secs in
      cassandra.yaml can be used to configure how long to wait (or whether to wait at all) and
      block_for_peers_in_remote_dcs can be used to also block on all but one node in each
      remote DC as well. See CASSANDRA-14297 and CASSANDRA-13993 for more information.
    - *Experimental* support for Transient Replication and Cheap Quorums introduced by
      CASSANDRA-14404. The intended audience for this functionality is expert users of
      Cassandra who are prepared to validate every aspect of the database for their application
      and deployment practices. Future releases of Cassandra will make this feature suitable
      for a wider audience.
    - *Experimental* support for Java 11 has been added. JVM options that differ between or are
      specific for Java 8 and 11 have been moved from jvm.options into jvm8.options and
      jvm11.options. IMPORTANT: Running C* on Java 11 is *experimental*; do so at your own risk.
    - LCS now respects the max_threshold parameter when compacting - this was hard coded to 32
      before, but now it is possible to do bigger compactions when compacting from L0 to L1.
      This also applies to STCS-compactions in L0 - if there are more than 32 sstables in L0 we
      will compact at most max_threshold sstables in an L0 STCS compaction. See CASSANDRA-14388
      for more information.
    - There is now an option to automatically upgrade sstables after Cassandra upgrade, enable
      either in `cassandra.yaml:automatic_sstable_upgrade` or via JMX during runtime. See
      CASSANDRA-14197.
    - `nodetool refresh` has been deprecated in favour of `nodetool import` - see
      CASSANDRA-6719 for details
    - An experimental option to compare all merkle trees together has been added - for example,
      in a 3 node cluster with 2 replicas identical and 1 out-of-date, with this option enabled,
      the out-of-date replica will only stream a single copy from up-to-date replica. Enable it
      by adding "-os" to nodetool repair. See CASSANDRA-3200.
    - The currentTimestamp, currentDate, currentTime and currentTimeUUID functions have been
      added. See CASSANDRA-13132
    - Support for arithmetic operations between `timestamp`/`date` and `duration` has been
      added. See CASSANDRA-11936
    - Support for arithmetic operations on numbers has been added. See CASSANDRA-11935
    - Preview expected streaming required for a repair (nodetool repair --preview), and validate
      the consistency of repaired data between nodes (nodetool repair --validate). See
      CASSANDRA-13257
    - Support for selecting Map values and Set elements has been added for SELECT queries. See
      CASSANDRA-7396
    - Change-Data-Capture has been modified to make CommitLogSegments available immediately upon
      creation via hard-linking the files. This means that incomplete segments will be
      available in cdc_raw rather than fully flushed. See documentation and CASSANDRA-12148 for
      more detail.
    - The initial build of materialized views can be parallelized.
      The number of concurrent builder threads is specified by the property
      `cassandra.yaml:concurrent_materialized_view_builders`. This property can be modified at
      runtime through both JMX and the new `setconcurrentviewbuilders` and
      `getconcurrentviewbuilders` nodetool commands. See CASSANDRA-12245 for more details.
    - There is now a binary full query log based on Chronicle Queue that can be controlled using
      nodetool enablefullquerylog, disablefullquerylog, and resetfullquerylog. The log contains
      all queries invoked, approximate time they were invoked, any parameters necessary to bind
      wildcard values, and all query options. A human readable version of the log can be dumped
      or tailed using the new bin/fqltool utility. The full query log is designed to be safe to
      use in production and limits utilization of heap memory and disk space with limits you
      can specify when enabling the log. See nodetool and fqltool help text for more
      information.
    - SSTableDump now supports the -l option to output each partition as its own JSON object.
      See CASSANDRA-13848 for more detail
    - Metric for coordinator writes per table has been added. See CASSANDRA-14232
    - Nodetool cfstats now has options to sort by various metrics as well as limit results.
    - Operators can restrict login user activity to one or more datacenters. See
      `network_authorizer` in cassandra.yaml, and the docs for create and alter role
      statements. CASSANDRA-13985
    - Roles altered from login=true to login=false will prevent existing connections from
      executing any statements after the cache has been refreshed. CASSANDRA-13985
    - Support for audit logging of database activity. If enabled, logs every incoming CQL
      command request, Authentication (successful as well as unsuccessful login) to a node.
    - Faster streaming of entire SSTables using ZeroCopy APIs. If enabled, Cassandra will
      stream entire SSTables, significantly speeding up transfers. Any streaming related
      operations will see corresponding improvement. See CASSANDRA-14556.
    - NetworkTopologyStrategy now supports auto-expanding the replication_factor option into all
      available datacenters at CREATE or ALTER time. For example, specifying
      replication_factor: 3 translates to three replicas in every datacenter. This
      auto-expansion will _only add_ datacenters for safety. See CASSANDRA-14303 for more
      details.
    - Added Python 3 support so cqlsh and cqlshlib are now compatible with Python 2.7 and Python
      3.6. Added --python option to cqlsh so users can specify the path to their chosen Python
      interpreter. See CASSANDRA-10190 for details.
    - Support for server side DESCRIBE statements has been added. See CASSANDRA-14825
    - It is now possible to rate limit snapshot creation/clearing. See CASSANDRA-13019
    - Authentication reads and writes have been changed from a mix of ONE, LOCAL_ONE, and QUORUM
      to LOCAL_QUORUM on reads and EACH_QUORUM on writes. This is configurable via
      cassandra.yaml with auth_read_consistency_level and auth_write_consistency_level
      respectively. See CASSANDRA-12988.

Upgrading
---------
    - If you were on 4.0.1 - 4.0.5 and if you haven't set the compaction_throughput_mb_per_sec
      in your 4.0 cassandra.yaml file but you relied on the internal default value, then
      compaction_throughput_mb_per_sec was equal to an old default value of 16MiB/s in
      Cassandra 4.0. After CASSANDRA-17790 this is changed to 64MiB/s to match the default
      value in cassandra.yaml. If you prefer the old one of 16MiB/s, you need to set it
      explicitly in your cassandra.yaml file.
    - otc_coalescing_strategy, otc_coalescing_window_us,
      otc_coalescing_enough_coalesced_messages, otc_backlog_expiration_interval_ms are
      deprecated and will be removed at the earliest with the next major release.
      otc_coalescing_strategy is disabled since 3.11.
    - As part of the Internode Messaging improvement work in CASSANDRA-15066,
      internode_send_buff_size_in_bytes and internode_recv_buff_size_in_bytes were renamed to
      internode_socket_send_buffer_size_in_bytes and
      internode_socket_receive_buffer_size_in_bytes.
      To support upgrades pre-4.0, we add backward compatibility and currently both old and new
      names should work. Cassandra 4.0.0 and Cassandra 4.0.1 work ONLY with the new names (they
      weren't updated in cassandra.yaml though).
    - DESCRIBE|DESC was moved to server side in Cassandra 4.0. As a consequence DESCRIBE|DESC
      will not work in cqlsh 6.0.0 being connected to earlier major Cassandra versions where
      DESCRIBE does not exist server side.
    - cqlsh shell startup script now prefers 'python3' before 'python' when identifying a
      runtime.
    - As part of the Internode Messaging improvement work in CASSANDRA-15066, matching response
      verbs for every request verb were introduced and verbs were renamed.
      DroppedMessageMetrics pre-4.0 are now available with _REQ suffix. As part of
      CASSANDRA-16083, we added DroppedMessageMetrics backward compatibility layer which
      exposes the metrics with their old names too. Only the value for verbs READ and
      RANGE_SLICE will differ from the same metrics in 3.11 as it no longer includes the
      responses dropped, only the requests. After being deprecated in 3.11 PAGED_RANGE was
      fully removed in 4.0. ConditionNotMet metric has been moved under scope
      CASClientWriteRequestMetrics but as part of CASSANDRA-16083, backward compatibility layer
      was added so it can be still exposed under the old 3.11 scope.
    - Native protocol v5 is promoted from beta in this release. The wire format has changed
      significantly and users should take care to ensure client drivers are upgraded to a
      version with support for the final v5 format, if currently connecting over v5-beta.
      (CASSANDRA-15299, CASSANDRA-14973)
    - Cassandra removed support for the OldNetworkTopologyStrategy. Before upgrading you will
      need to change the replication strategy for the keyspaces using this strategy to the
      NetworkTopologyStrategy. (CASSANDRA-13990)
    - Sstables for tables using a frozen UDT written by C* 3.0 appear as corrupted.
  Background: The serialization-header in the -Statistics.db sstable component contains the type information of the table columns. C* 3.0 writes incorrect type information for frozen UDTs by omitting the "frozen" information. Non-frozen UDTs were introduced by CASSANDRA-7423 in C* 3.6. Since then, the missing "frozen" information leads to deserialization issues that result in CorruptSSTableExceptions, and potentially other exceptions as well.
  As a mitigation, the sstable serialization-headers are rewritten to contain the missing "frozen" information for UDTs once, when an upgrade from C* 3.0 is detected. This migration does not touch snapshots or backups.
  The sstablescrub tool now performs a check of the sstable serialization-header against the schema. A mismatch of the types in the serialization-header and the schema will cause sstablescrub to error out and stop by default. See the new `-e` option. `-e off` disables the new validation code. `-e fix` or `-e fix-only`, e.g. `sstablescrub -e fix keyspace table`, will validate the serialization-header, rewrite the non-frozen UDTs in the serialization-header to frozen UDTs, if that matches the schema, and continue with scrub. See `sstablescrub -h`. (CASSANDRA-15035)
- CASSANDRA-13241 lowered the default chunk_length_in_kb for compressed tables from 64kb to 16kb. For highly compressible data this can have a noticeable impact on space utilization. You may want to consider manually specifying this value.
- Additional columns have been added to system_distributed.repair_history, system_traces.sessions and As a result select queries against these tables - including queries against tracing tables performed automatically by the drivers and cqlsh - will fail and generate an error in the log during upgrade when the cluster is mixed version. On 3.x side this will also lead to broken internode connections and lost messages.
  Cassandra versions 3.0.20 and 3.11.6 pre-add these columns (see CASSANDRA-15385), so please make sure to upgrade to those versions or higher before upgrading to 4.0 so that query tracing does not cause any issues during the upgrade to 4.0.
- Timestamp ties between values resolve differently: if either value has a TTL, this value always wins. This is to provide consistent reconciliation before and after the value expires into a tombstone.
- Support for legacy auth tables in the system_auth keyspace (users, permissions, credentials) and the migration code has been removed. Migration of these legacy auth tables must have been completed before the upgrade to 4.0 and the legacy tables must have been removed. See the 'Upgrading' section for version 2.2 for migration instructions.
- Cassandra 4.0 removed support for the deprecated Thrift interface. Amongst other things, this implies the removal of all yaml options related to thrift ('start_rpc', rpc_port, ...).
- Cassandra 4.0 removed support for any pre-3.0 format. This means you cannot upgrade from a 2.x version to 4.0 directly; you have to upgrade to a 3.0.x/3.x version first (and run upgradesstables). In particular, this means Cassandra 4.0 cannot load or read pre-3.0 sstables in any way: you will need to upgrade those sstables in 3.0.x/3.x first.
- Upgrades from 3.0.x or 3.x are supported since 3.0.13 or 3.11.0; previous versions will cause issues during rolling upgrades (CASSANDRA-13274).
- Cassandra will no longer allow invalid keyspace replication options, such as invalid datacenter names for NetworkTopologyStrategy. Operators MUST add new nodes to a datacenter before they can set ALTER or CREATE keyspace replication policies using that datacenter. Existing keyspaces will continue to operate, but CREATE and ALTER will validate that all datacenters specified exist in the cluster.
- Cassandra 4.0 fixes a problem with incremental repair which caused repaired data to be inconsistent between nodes.
  The fix changes the behavior of both full and incremental repairs. For full repairs, data is no longer marked repaired. For incremental repairs, anticompaction is run at the beginning of the repair, instead of at the end. If incremental repair was being used prior to upgrading, a full repair should be run after upgrading to resolve any inconsistencies.
- Config option index_interval has been removed (it was deprecated since 2.0)
- Deprecated repair JMX APIs are removed.
- The version of snappy-java has been upgraded to
- The minimum value for internode message timeouts is 10ms. Previously, any positive value was allowed. See cassandra.yaml entries like read_request_timeout_in_ms for more details.
- Cassandra 4.0 allows a single port to be used for both secure and insecure connections between cassandra nodes (CASSANDRA-10404). See the yaml for specific property changes, and see the security doc for full details.
- Due to the parallelization of the initial build of materialized views, the per token range view building status is stored in the new table `system.view_builds_in_progress`. The old table `system.views_builds_in_progress` is no longer used and can be removed. See CASSANDRA-12245 for more details.
- Config option commitlog_sync_batch_window_in_ms has been deprecated as its documentation has been incorrect and the setting itself near useless. Batch mode remains a valid commit log mode, however.
- There is a new commit log mode, group, which is similar to batch mode but blocks for up to a configurable number of milliseconds between disk flushes.
- nodetool clearsnapshot now requires the --all flag to remove all snapshots. Previous behavior would delete all snapshots by default.
- Nodes are now identified by a combination of IP and storage port. Existing JMX APIs, nodetool, and system tables continue to work and accept/return just an IP, but there is a new version of each that works with the full unambiguous identifier.
  You should prefer these over the deprecated ambiguous versions that only work with an IP. This was done to support multiple instances per IP. Additionally we are moving to only using a single port for encrypted and unencrypted traffic and if you want multiple instances per IP you must first switch encrypted traffic to the storage port and not a separate encrypted port. If you want to use multiple instances per IP with SSL you will need to use StartTLS on storage_port and set outgoing_encrypted_port_source to gossip so outbound connections know what port to connect to for each instance. Before changing storage port or native port on nodes you must first upgrade the entire cluster and clients to 4.0 so they can handle the port not being consistent across the cluster.
- Names of AWS regions/availability zones have been cleaned up to more correctly match the Amazon names. There is now a new option in conf/ that lets users enable the correct names for new clusters, or use the legacy names for existing clusters. See conf/ for details.
- Background repair has been removed. dclocal_read_repair_chance and read_repair_chance table options have been removed and are now rejected. See CASSANDRA-13910 for details.
- Internode TCP connections that do not ack segments for 30s will now be automatically detected and closed via the Linux TCP_USER_TIMEOUT socket option. This should be exceedingly rare, but AWS networks (and other stateful firewalls) apparently suffer from this issue. You can tune the timeouts on TCP connection and segment ack via the `cassandra.yaml:internode_tcp_connect_timeout_in_ms` and `cassandra.yaml:internode_tcp_user_timeout_in_ms` options respectively. See CASSANDRA-14358 for details.
- repair_session_space_in_mb setting has been added to cassandra.yaml to allow operators to reduce merkle tree size if repair is creating too much heap pressure. The repair_session_max_tree_depth setting added in 3.0.19 and 3.11.5 is deprecated in favor of this setting.
  See CASSANDRA-14096
- The flags 'enable_materialized_views' and 'enable_sasi_indexes' in cassandra.yaml have been set as false by default. Operators should modify them to allow the creation of new views and SASI indexes; the existing ones will continue working. See CASSANDRA-14866 for details.
- CASSANDRA-15216 - The flag 'cross_node_timeout' has been set as true by default. This change is done under the assumption that users have set up NTP on their clusters or otherwise synchronize their clocks, and that clocks are mostly in sync, since this is a requirement for general correctness of last write wins.
- CASSANDRA-15257 removed the joda time dependency. Any time formats passed will now need to conform to java.time.format.DateTimeFormatter. Most notably, days and months must be two digits, and years exceeding four digits need to be prefixed with a plus or minus sign.
- cqlsh now returns a non-zero code in case of errors. This is a backward incompatible change so it may break existing scripts that rely on the current behavior. See CASSANDRA-15623 for more details.
- Updated the default compaction_throughput_mb_per_sec to 64. The original default (16) was meant for spinning disk volumes. See CASSANDRA-14902 for details.
- Custom compaction strategies must now handle getting sstables added/removed notifications for sstables already added/removed - see CASSANDRA-14103 for details.
- Support for JNA with glibc 2.6 and earlier has been removed. Centos 5, Debian 4, and Ubuntu 7.10 operating systems must be first upgraded. See CASSANDRA-16212 for more.
- In cassandra.yaml, when using vnodes num_tokens must be defined if initial_token is defined. If it is not defined, or not equal to the number of tokens defined in initial_token, the node will not start. See CASSANDRA-14477 for details.
- CASSANDRA-13701 To give a better out of the box experience, the default 'num_tokens' value has been changed from 256 to 16 for reasons described in 'allocate_tokens_for_local_replication_factor' is also uncommented and set to 3. Please note when upgrading that if the 'num_tokens' value is different than what you have configured, the upgraded node will refuse to start. Also note that if a new node joining the cluster has a different value for 'num_tokens' than the rest of the datacenter, the new node will be responsible for a different amount of data than the rest of the datacenter.

Deprecation
-----------
- JavaScript user-defined functions have been deprecated. They are planned for removal in the next major release. (CASSANDRA-17280)
- The JMX MBean org.apache.cassandra.metrics:type=Streaming,name=ActiveOutboundStreams has been deprecated and will be removed in a subsequent major version. This metric has not been updated for several versions.
- The JMX MBean org.apache.cassandra.db:type=BlacklistedDirectories has been deprecated in favor of org.apache.cassandra.db:type=DisallowedDirectories and will be removed in a subsequent major version.
- cqlsh support of Python 2.7 is deprecated and cqlsh will warn when running with Python 2.7.

ALTER ... DROP COMPACT STORAGE
------------------------------
- Following a discussion regarding concerns about the safety of the 'ALTER ... DROP COMPACT STORAGE' statement, the C* development community does not recommend its use in production and considers it experimental (see
- An 'enable_drop_compact_storage' flag has been added to cassandra.yaml to allow operators to prevent its use.

Materialized Views
-------------------
- Following a discussion regarding concerns about the design and safety of Materialized Views, the C* development community no longer recommends them for production use, and considers them experimental. Warning messages will now be logged when they are created.
  (See
- An 'enable_materialized_views' flag has been added to cassandra.yaml to allow operators to prevent creation of views
- CREATE MATERIALIZED VIEW syntax has become stricter. Partition key columns are no longer implicitly considered to be NOT NULL, and no base primary key columns get automatically included in view definition. You have to specify them explicitly now.

Windows Support Removed
-----------------------
- Due to the lack of maintenance and testing, Windows support is removed from this version onward. Developers who use Windows 10 can still run Apache Cassandra locally using WSL2 (Windows Subsystem for Linux version 2), Docker for Windows, or a virtualization platform like Hyper-V or VirtualBox.

3.11.10
=======

Upgrading
---------
- This release fixes a correctness issue with SERIAL reads, and LWT writes that do not apply. Unfortunately, this fix has a performance impact on reads at the SERIAL or LOCAL_SERIAL consistency levels. For heavy users of such SERIAL reads, the performance impact may be noticeable and may also result in an increase in timeouts. For that reason, an opt-in system property has been added to disable the fix:
    -Dcassandra.unsafe.disable-serial-reads-linearizability=true
  Use this flag at your own risk as it reverts SERIAL reads to the incorrect behavior of previous versions. See CASSANDRA-12126 for details.
- SASI's `max_compaction_flush_memory_in_mb` setting was previously getting interpreted in bytes. From 3.11.8 it is correctly interpreted in megabytes, but prior to 3.11.10 previous configurations of this setting will lead to nodes running out of memory (OOM) during compaction. From 3.11.10 previous configurations will be detected as incorrect, logged, and the setting reverted to the default value of 1GB. It is up to the user to correct the setting after an upgrade, via dropping and recreating the index. See CASSANDRA-16071 for details.
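
To make the unit change in `max_compaction_flush_memory_in_mb` concrete, the Python sketch below compares how one configured number is sized under each interpretation. It is only an illustration: the function name, the sample value and the choice of 1 MB = 1024 * 1024 bytes are assumptions, and none of it is taken from Cassandra's code.

```python
MIB = 1024 * 1024  # assumption: one "megabyte" here means 1024 * 1024 bytes


def flush_memory_bytes(configured_value, interpreted_as):
    """Return the buffer size in bytes implied by a configured value.

    interpreted_as is "bytes" (the old, incorrect reading of the setting)
    or "megabytes" (the corrected reading).
    """
    if interpreted_as == "bytes":
        return configured_value
    if interpreted_as == "megabytes":
        return configured_value * MIB
    raise ValueError("interpreted_as must be 'bytes' or 'megabytes'")


# Hypothetical legacy setting: someone wanted 1 GiB while the value was read as bytes.
legacy_value = 1024 * MIB

old_size = flush_memory_bytes(legacy_value, "bytes")
new_size = flush_memory_bytes(legacy_value, "megabytes")

print("read as bytes:    ", old_size // MIB, "MiB")
print("read as megabytes:", new_size // MIB, "MiB")
```

The same number that once requested 1024 MiB requests over a billion MiB once it is read as megabytes, which is why an old value left in place can exhaust memory during compaction.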

3.11.6
======

Upgrading
---------
- Sstables for tables using a frozen UDT written by C* 3.0 appear as corrupted.
  Background: The serialization-header in the -Statistics.db sstable component contains the type information of the table columns. C* 3.0 writes incorrect type information for frozen UDTs by omitting the "frozen" information. Non-frozen UDTs were introduced by CASSANDRA-7423 in C* 3.6. Since then, the missing "frozen" information leads to deserialization issues that result in CorruptSSTableExceptions, and potentially other exceptions as well.
  As a mitigation, the sstable serialization-headers are rewritten to contain the missing "frozen" information for UDTs once, when an upgrade from C* 3.0 is detected. This migration does not touch snapshots or backups.
  The sstablescrub tool now performs a check of the sstable serialization-header against the schema. A mismatch of the types in the serialization-header and the schema will cause sstablescrub to error out and stop by default. See the new `-e` option. `-e off` disables the new validation code. `-e fix` or `-e fix-only`, e.g. `sstablescrub -e fix keyspace table`, will validate the serialization-header, rewrite the non-frozen UDTs in the serialization-header to frozen UDTs, if that matches the schema, and continue with scrub. See `sstablescrub -h`. (CASSANDRA-15035)
- repair_session_max_tree_depth setting has been added to cassandra.yaml to allow operators to reduce merkle tree size if repair is creating too much heap pressure. See CASSANDRA-14096 for details.

3.11.5
======

Experimental features
---------------------
- An 'enable_sasi_indexes' flag, true by default, has been added to cassandra.yaml to allow operators to prevent the creation of new SASI indexes, which are considered experimental and are not recommended for production use. (See
- The flags 'enable_sasi_indexes' and 'enable_materialized_views' have been grouped under an experimental features section in cassandra.yaml.
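
When auditing nodes after the change above, it can help to see which of the two experimental flags a given cassandra.yaml sets explicitly. The Python sketch below does this with a plain regular expression over the file text. It is a rough illustration, not a YAML parser: the function name and the sample excerpt are hypothetical, and it assumes the flags appear as uncommented top-level `name: true|false` lines.

```python
import re

# Matches uncommented top-level "flag: true|false" lines for the two flags.
FLAG_LINE = re.compile(
    r"^(enable_sasi_indexes|enable_materialized_views)[ \t]*:[ \t]*(true|false)[ \t]*(?:#.*)?$",
    re.MULTILINE,
)


def experimental_flags(yaml_text):
    """Return {flag_name: bool} for the flags set explicitly in yaml_text."""
    return {name: value == "true" for name, value in FLAG_LINE.findall(yaml_text)}


# Hypothetical excerpt, not the shipped cassandra.yaml.
sample = """\
enable_materialized_views: true
enable_sasi_indexes: false
# enable_sasi_indexes: true
"""

print(experimental_flags(sample))
```

Flags that are missing or commented out are simply absent from the result, so the caller can decide which default applies for the Cassandra version in use.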

3.11.4
======

Upgrading
---------
- The order of static columns in SELECT * has been fixed to match that of 2.0 and 2.1 - they are now sorted alphabetically again, by their name, just like regular columns are. If you use prepared statements and SELECT * queries, and have both simple and collection static columns in those tables, and are upgrading from an earlier 3.0 version, then you might be affected by this change. Please see CASSANDRA-14638 for details.

3.11.3
======

Upgrading
---------
- Materialized view users upgrading from 3.0.15 (3.0.X series) or 3.11.1 (3.11.X series) and later that have performed range movements (join, decommission, move, etc), should run repair on the base tables, and subsequently on the views to ensure data affected by CASSANDRA-14251 is correctly propagated to all replicas.
- Changes to bloom_filter_fp_chance will no longer take effect on existing sstables when the node is restarted. Only compactions/upgradesstables regenerate bloom filters and Summaries sstable components. See CASSANDRA-11163

3.11.2
======

Upgrading
---------
- See MAXIMUM TTL EXPIRATION DATE NOTICE above.
- Cassandra now relies on the JVM options to properly shut down on OutOfMemoryError. By default it will rely on the OnOutOfMemoryError option as the ExitOnOutOfMemoryError and CrashOnOutOfMemoryError options are not supported by the older 1.7 and 1.8 JVMs. A warning will be logged at startup if none of those JVM options are used. See CASSANDRA-13006 for more details
- Cassandra no longer logs a heap histogram on OutOfMemoryError by default. To enable that behavior set the 'cassandra.printHeapHistogramOnOutOfMemoryError' System property to 'true'. See CASSANDRA-13006 for more details.
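
The 3.11.2 notes above say a warning is logged at startup when none of the OutOfMemoryError JVM options is present. The Python sketch below shows the same kind of check applied to a list of JVM arguments, for example when reviewing a jvm.options file before an upgrade. The function names and sample argument lists are hypothetical, the option names are written in their usual HotSpot `-XX` spelling, and this is not how Cassandra itself performs the check.

```python
# HotSpot options named in the 3.11.2 note, in their usual -XX spelling.
OOM_OPTION_PREFIXES = (
    "-XX:OnOutOfMemoryError=",
    "-XX:+ExitOnOutOfMemoryError",
    "-XX:+CrashOnOutOfMemoryError",
)

# System property from the 3.11.2 note that re-enables the heap histogram.
HEAP_HISTOGRAM_OPTION = "-Dcassandra.printHeapHistogramOnOutOfMemoryError=true"


def has_oom_shutdown_option(jvm_options):
    """Return True if any option tells the JVM how to react to OutOfMemoryError."""
    return any(opt.startswith(OOM_OPTION_PREFIXES) for opt in jvm_options)


def heap_histogram_enabled(jvm_options):
    """Return True if the heap histogram system property is set to true."""
    return HEAP_HISTOGRAM_OPTION in jvm_options


# Hypothetical argument lists for illustration.
safe = ["-Xmx4G", "-XX:OnOutOfMemoryError=kill -9 %p"]
unsafe = ["-Xmx4G", "-XX:+UseG1GC"]

for name, options in (("safe", safe), ("unsafe", unsafe)):
    if not has_oom_shutdown_option(options):
        print(name + ": no OutOfMemoryError option set, expect a startup warning")
    else:
        print(name + ": OutOfMemoryError handling configured")
```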

3.11.1
======

Upgrading
---------
- Creating Materialized View with filtering on non-primary-key base column (added in CASSANDRA-10368) is disabled, because the liveness of a view row depends on multiple filtered base non-key columns and on the base non-key column used in the view primary key. This semantic cannot be supported without a storage format change, see CASSANDRA-13826. For append-only use case, you may still use this feature with a startup flag: ""

Compact Storage (only when upgrading from 3.X or any version lower than 3.0.15)
---------------
- Starting with version 4.0, Thrift is no longer supported. Starting with version 5.0, COMPACT STORAGE will no longer be supported. The 'ALTER ... DROP COMPACT STORAGE' statement makes Compact Tables CQL-compatible, exposing the internal structure of Thrift/Compact Tables. You can find more details on exposed internal structure under:
  For uninterrupted cluster upgrades, drivers now support the 'NO_COMPACT' startup option. Supplying this flag will have the same effect as 'DROP COMPACT STORAGE', but only for the current connection.
  In order to upgrade, clients supporting a non-compact schema view can be rolled out gradually. When all the clients are updated 'ALTER ... DROP COMPACT STORAGE' can be executed. After dropping compact storage, the 'NO_COMPACT' option will have no effect.

Materialized Views
-------------------

Materialized Views (only when upgrading from any version lower than 3.0.15 (3.0 series) or 3.11.1 (3.X series))
---------------------------------------------------------------------------------------
- Cassandra will no longer allow dropping columns on tables with Materialized Views.
- A change was made in the way the Materialized View timestamp is computed, which may cause an old deletion to a base column which is a view primary key (PK) column to not be reflected in the view when repairing the base table post-upgrade.
  This condition is only possible when a column deletion to an MV primary key (PK) column not present in the base table PK (via UPDATE base SET view_pk_col = null or DELETE view_pk_col FROM base) is missed before the upgrade and received by repair after the upgrade. If such column deletions are done on a view PK column which is not a base PK, it's advisable to run repair on the base table of all nodes prior to the upgrade. Alternatively it's possible to fix potential inconsistencies by running repair on the views after upgrade or by dropping and re-creating the views. See CASSANDRA-11500 for more details.
- Removal of columns not selected in the Materialized View (via UPDATE base SET unselected_column = null or DELETE unselected_column FROM base) may not be properly reflected in the view in some situations so we advise against doing deletions on base columns not selected in views until this is fixed on CASSANDRA-13826.

3.11.0
======

Upgrading
---------
- Creating Materialized View with filtering on non-primary-key base column (added in CASSANDRA-10368) is disabled, because the liveness of a view row depends on multiple filtered base non-key columns and on the base non-key column used in the view primary key. This semantic cannot be supported without a storage format change, see CASSANDRA-13826. For append-only use case, you may still use this feature with a startup flag: ""
- The NativeAccessMBean isAvailable method will only return true if the native library has been successfully linked. Previously it was returning true if JNA could be found but was not taking into account link failures.
- Primary ranges in the system.size_estimates table are now based on the keyspace replication settings and adjacent ranges are no longer merged (CASSANDRA-9639).
- In 2.1, the default for otc_coalescing_strategy was 'DISABLED'. In 2.2 and 3.0, it was changed to 'TIMEHORIZON', but that value was shown to be a performance regression. The default for 3.11.0 and newer has been reverted to 'DISABLED'.
  Users upgrading from Cassandra 2.2 or 3.0 should be aware that the default has changed.
- The StorageHook interface has been modified to allow retrieving read information from SSTableReader (CASSANDRA-13120).

3.10
====

New features
------------
- New `DurationType` (cql duration). See CASSANDRA-11873
- Runtime modification of concurrent_compactors is now available via nodetool
- Support for the assignment operators +=/-= has been added for update queries.
- An Index implementation may now provide a task which runs prior to joining the ring. See CASSANDRA-12039
- Filtering on partition key columns is now also supported for queries without secondary indexes.
- A slow query log has been added: slow queries will be logged at DEBUG level. For more details refer to CASSANDRA-12403 and slow_query_log_timeout_in_ms in cassandra.yaml.
- Support for GROUP BY queries has been added.
- A new compaction-stress tool has been added to test the throughput of compaction for any cassandra-stress user schema. See compaction-stress help for how to use it.
- Compaction can now take into account overlapping tables that don't take part in the compaction to look for deleted or overwritten data in the compacted tables. When such data is found, it can be safely discarded, which in turn should enable the removal of tombstones over that data.
  The behavior can be engaged in two ways:
    - as a "nodetool garbagecollect -g CELL/ROW" operation, which applies single-table compaction on all sstables to discard deleted data in one step.
    - as a "provide_overlapping_tombstones:CELL/ROW/NONE" compaction strategy flag, which uses overlapping tables as a source of deletions/overwrites during all compactions.
  The argument specifies the granularity at which deleted data is to be found:
    - If ROW is specified, only whole deleted rows (or sets of rows) will be discarded.
    - If CELL is specified, any columns whose value is overwritten or deleted will also be discarded.
    - NONE (default) specifies the old behavior, overlapping tables are not used to decide when to discard data.
  Which option to use depends on your workload, both ROW and CELL increase the disk load on compaction (especially with the size-tiered compaction strategy), with CELL being more resource-intensive. Both should lead to better read performance if deleting rows (resp. overwriting or deleting cells) is common.
- Prepared statements are now persisted in the table prepared_statements in the system keyspace. Upon startup, this table is used to preload all previously prepared statements - i.e. in many cases clients do not need to re-prepare statements against restarted nodes.
- cqlsh can now connect to older Cassandra versions by downgrading the native protocol version. Please note that this is currently not part of our release testing and, as a consequence, it is not guaranteed to work in all cases. See CASSANDRA-12150 for more details.
- Snapshots that are automatically taken before a table is dropped or truncated will have a "dropped" or "truncated" prefix on their snapshot tag name.
- Metrics are exposed for successful and failed authentication attempts. These can be located using the object names org.apache.cassandra.metrics:type=Client,name=AuthSuccess and org.apache.cassandra.metrics:type=Client,name=AuthFailure respectively.
- Add support to "unset" JSON fields in prepared statements by specifying DEFAULT UNSET. See CASSANDRA-11424 for details
- Allow TTL with null value on insert and update. It will be treated as equivalent to inserting a 0.
- Removed outboundBindAny configuration property. See CASSANDRA-12673 for details.

Upgrading
---------
- Support for alter types of already defined tables and of UDTs fields has been disabled. If it is necessary to return a different type, please use casting instead. See CASSANDRA-12443 for more details.
- Specifying the default_time_to_live option when creating or altering a materialized view was erroneously accepted (and ignored). It is now properly rejected.
- Only Java and JavaScript are now supported UDF languages. The sandbox in 3.0 already prevented the use of script languages except Java and JavaScript.
- Compaction now correctly drops sstables out of CompactionTask when there isn't enough disk space to perform the full compaction. This should reduce pending compaction tasks on systems with little remaining disk space.
- Request timeouts in cassandra.yaml (read_request_timeout_in_ms, etc) now apply to the "full" request time on the coordinator. Previously, they only covered the time from when the coordinator sent a message to a replica until the time that the replica responded. Additionally, the previous behavior was to reset the timeout when performing a read repair, making a second read to fix a short read, and when subranges were read as part of a range scan or secondary index query. In 3.10 and higher, the timeout is no longer reset for these "subqueries". The entire request must complete within the specified timeout. As a consequence, your timeouts may need to be adjusted to account for this. See CASSANDRA-12256 for more details.
- Logs written to stdout are now consistent with logs written to files. Time is now local (it was UTC on the console and local in files). Date, thread, file and line info were added to stdout. (see CASSANDRA-12004)
- The 'clientutil' jar, which has been somewhat broken on the 3.x branch, is no longer provided.
  The features provided by that jar are provided by any good java driver and we advise relying on drivers rather than on that jar, but if you need that jar for backward compatibility until you do so, you should use the version provided on a previous Cassandra branch, like the 3.0 branch (by design, the functionality provided by that jar is stable across versions so using the 3.0 jar for a client connecting to 3.x should work without issues).
- (Tools development) DatabaseDescriptor no longer implicitly starts up components/services like commit log replay. This may break existing 3rd party tools and clients. In order to start up a standalone tool or client application, use the DatabaseDescriptor.toolInitialization() or DatabaseDescriptor.clientInitialization() methods. Tool initialization sets up partitioner, snitch, encryption context. Client initialization just applies the configuration but does not set up anything. Instead of using Config.setClientMode() or Config.isClientMode(), which are deprecated now, use one of the appropriate new methods in DatabaseDescriptor.
- Application layer keep-alives were added to the streaming protocol to prevent idle incoming connections from timing out and failing the stream session (CASSANDRA-11839). This effectively deprecates the streaming_socket_timeout_in_ms property in favor of streaming_keep_alive_period_in_secs. See cassandra.yaml for more details about this property.
- Duration literals support the ISO 8601 format. As a consequence, identifiers matching that format (e.g. P2Y or P1MT6H) will not be supported anymore (CASSANDRA-11873).

3.8
===

New features
------------
- Shared pool threads are now named according to the stage they are executing tasks for. Thread names mentioned in traced queries change accordingly.
- A new option has been added to cassandra-stress "-rate fixed={number}/s" that forces a scheduled rate of operations/sec over time.
  Using this, stress can accurately account for coordinated omission from the stress process.
- The cassandra-stress "-rate limit=" option has been renamed to "-rate throttle="
- HDR histograms have been added to stress runs; their output can be saved to disk using the "-log hdrfile=" option. This histogram includes response/service/wait times when used with the fixed or throttle rate options. The histogram file can be plotted on
- TimeWindowCompactionStrategy has been added. This has proven to be a better approach to time series compaction and new tables should use this instead of DTCS. See CASSANDRA-9666 for details.
- Change-Data-Capture is now available. See cassandra.yaml and for cdc-specific flags and a brief explanation of on-disk locations for archived data in CommitLog form. This can be enabled via ALTER TABLE ... WITH cdc=true. Upon flush, CommitLogSegments containing data for CDC-enabled tables are moved to the data/cdc_raw directory until removed by the user and writes to CDC-enabled tables will be rejected with a WriteTimeoutException once cdc_total_space_in_mb is reached between unflushed CommitLogSegments and cdc_raw.
  NOTE: CDC is disabled by default in the .yaml file. Do not enable CDC on a mixed-version cluster as it will lead to exceptions which can interrupt traffic. Once all nodes have been upgraded to 3.8 it is safe to enable this feature and restart the cluster.

Upgrading
---------
- The ReversedType behaviour has been corrected for clustering columns of BYTES type containing empty value. Scrub should be run on the existing SSTables containing a descending clustering column of BYTES type to correct their ordering. See CASSANDRA-12127 for more details.
- Ec2MultiRegionSnitch will no longer automatically set broadcast_rpc_address to the public instance IP if this property is defined on cassandra.yaml.
- The names "json" and "distinct" are no longer valid as user-defined function names (they are still valid as column names, however).
  In the unlikely case where you had defined functions with such names, you will need to recreate those under a different name, change your code to use the new names and drop the old versions, and do this _before_ the upgrade (see CASSANDRA-10783 for more details).

Deprecation
-----------
- DateTieredCompactionStrategy has been deprecated - new tables should use TimeWindowCompactionStrategy. Note that migrating an existing DTCS-table to TWCS might cause increased compaction load for a while after the migration so make sure you run tests before migrating. Read CASSANDRA-9666 for background on this.

3.7
===

Upgrading
---------
- A maximum size for SSTables values has been introduced, to prevent out of memory exceptions when reading corrupt SSTables. This maximum size can be set via max_value_size_in_mb in cassandra.yaml. The default is 256MB, which matches the default value of native_transport_max_frame_size_in_mb. SSTables will be considered corrupt if they contain values whose size exceeds this limit. See CASSANDRA-9530 for more details.

3.6
===

New features
------------
- JMX connections can now use the same auth mechanisms as CQL clients. New options in cassandra-env.(sh|ps1) enable JMX authentication and authorization to be delegated to the IAuthenticator and IAuthorizer configured in cassandra.yaml. The default settings still only expose JMX locally, and use the JVM's own security mechanisms when remote connections are permitted. For more details on how to enable the new options, see the comments in A new class of IResource, JMXResource, is provided for the purposes of GRANT/REVOKE via CQL. See CASSANDRA-10091 for more details. Also, directly setting JMX remote port via the system property at startup is deprecated. See CASSANDRA-11725 for more details.
- JSON timestamps are now in UTC and contain the timezone information, see CASSANDRA-11137 for more details.
- Collision checks are performed when joining the token ring, regardless of whether the node should bootstrap.
Additionally, replace_address can legitimately be used without bootstrapping to help with recovery of nodes with partially failed disks. See CASSANDRA-10134 for more details. - Key cache will only hold indexed entries up to the size configured by column_index_cache_size_in_kb in cassandra.yaml in memory. Larger indexed entries will never go into memory. See CASSANDRA-11206 for more details. - For tables having a default_time_to_live, specifying a TTL of 0 will remove the TTL from the inserted or updated values. - Startup is now aborted if corrupted transaction log files are found. The details of the affected log files are now logged, allowing the operator to decide how to resolve the situation. - Filtering expressions are made more pluggable and can be added programmatically via a QueryHandler implementation. See CASSANDRA-11295 for more details. 3.4 === New features ------------ - Internal authentication now supports caching of encrypted credentials. Reference cassandra.yaml:credentials_validity_in_ms - Remote configuration of auth caches via JMX can be disabled using the system property cassandra.disable_auth_caches_remote_configuration - The sstabledump tool has been added as the 3.0 version of the former sstable2json. The tool only supports v3.0+ SSTables. See the tool's help for more detail. Upgrading --------- - Nothing specific to 3.4, but please see the upgrading sections of previous versions, especially if you are upgrading from 2.2. Deprecation ----------- - The mbean interfaces org.apache.cassandra.auth.PermissionsCacheMBean and org.apache.cassandra.auth.RolesCacheMBean are deprecated in favor of org.apache.cassandra.auth.AuthCacheMBean. This generalized interface is common across all caches in the auth subsystem. The specific mbean interfaces for each individual cache will be removed in a subsequent major version. 3.2 === New features ------------ - We now make sure that a token does not exist in several data directories.
This means that we run one compaction strategy per data_file_directory and we use one thread per directory to flush. Use nodetool relocatesstables to make sure your tokens are in the correct place, or just wait and compaction will handle it. See CASSANDRA-6696 for more details. - Maximum in-flight commit log replay mutation bytes are now bounded to 64 megabytes, tunable via cassandra.commitlog_max_outstanding_replay_bytes - Support for type casting has been added to the selection clause. - Hinted handoff now supports compression. Reference cassandra.yaml:hints_compression. Note: hints compression is currently disabled by default. Upgrading --------- - The compression ratio metrics computation has been modified to be more accurate. - Running Cassandra as root is prevented by default. - JVM options are moved from cassandra-env.(sh|ps1) to the jvm.options file Deprecation ----------- - The Thrift API is deprecated and will be removed in Cassandra 4.0. 3.1 === Upgrading --------- - The return value of SelectStatement::getLimit has been changed from DataLimits to int. - Custom index implementations should be aware that the method Indexer::indexes() has been removed as its contract was misleading and all custom implementations should almost surely have returned true unconditionally for that method. - GC logging is now enabled by default (you can disable it in the jvm.options file if you prefer). 3.0 === New features ------------ - EACH_QUORUM is now a supported consistency level for read requests. - Support for IN restrictions on any partition key component or clustering key as well as support for EQ and IN multicolumn restrictions has been added to UPDATE and DELETE statements. - Support for single-column and multi-column slice restrictions (>, >=, <= and <) has been added to DELETE statements - nodetool rebuild_index accepts the index argument without the redundant table name - Materialized Views, which allow for server-side denormalization, are now available.
Materialized views provide an alternative to secondary indexes for non-primary key queries, and perform much better for indexing high cardinality columns. See - Hinted handoff has been completely rewritten. Hints are now stored in flat files, with less overhead for storage and more efficient dispatch. See CASSANDRA-6230 for full details. - Option to not purge unrepaired tombstones. To avoid users having data resurrected if repair has not been run within gc_grace_seconds, an option has been added to only allow tombstones from repaired sstables to be purged. To enable, set the compaction option 'only_purge_repaired_tombstones':true but keep in mind that if you do not run repair for a long time, you will keep all tombstones around which can cause other problems. - Enabled warning on GC taking longer than 1000ms. See cassandra.yaml:gc_warn_threshold_in_ms Upgrading --------- - Clients must use the native protocol version 3 when upgrading from 2.2.X as the native protocol version 4 is not compatible between 2.2.X and 3.Y. See for details. - A new argument of type InetAddress has been added to IAuthenticator::newSaslNegotiator, representing the IP address of the client attempting authentication. It will be a breaking change for any custom implementations. - token-generator tool has been removed. - Upgrade to 3.0 is supported from Cassandra 2.1 versions greater than or equal to 2.1.9, or Cassandra 2.2 versions greater than or equal to 2.2.2. Upgrade from Cassandra 2.0 and older versions is not supported. - The 'memtable_allocation_type: offheap_objects' option has been removed. It should be re-introduced in a future release and you can follow CASSANDRA-9472 to know more. - Configuration parameter memory_allocator in cassandra.yaml has been removed. - The native protocol versions 1 and 2 are not supported anymore. - Max mutation size is now configurable via the max_mutation_size_in_kb setting in cassandra.yaml; the default is half of commitlog_segment_size_in_mb * 1024.
- 3.0 requires Java 8u40 or later. - Garbage collection options were moved from cassandra-env to the jvm.options file. - New transaction log files have been introduced to replace the compactions_in_progress system table, temporary file markers (tmp and tmplink) and sstable ancestors. Therefore, compaction metadata no longer contains ancestors. Transaction log files list sstable descriptors involved in compactions and other operations such as flushing and streaming. Use the sstableutil tool to list any sstable files currently involved in operations not yet completed, which previously would have been marked as temporary. A transaction log file contains one sstable per line, with the prefix "add:" or "remove:". They also contain a special line "commit", only inserted at the end when the transaction is committed. On startup we use these files to clean up any partial transactions that were in progress when the process exited. If the commit line is found, we keep new sstables (those with the "add" prefix) and delete the old sstables (those with the "remove" prefix), and vice versa if the commit line is missing. Should you lose or delete these log files, both old and new sstable files will be kept as live files, which will result in duplicated sstables. These files are protected by incremental checksums so you should not manually edit them. When restoring a full backup or moving sstable files, you should clean up any leftover transactions and their temporary files first. You can use this command: ===> sstableutil -c ks table See CASSANDRA-7066 for full details. - New write stages have been added for batchlog and materialized view mutations; you can set their size in cassandra.yaml - User defined functions are now executed in a sandbox. To use UDFs and UDAs, you have to enable them in cassandra.yaml. - New SSTable version 'la' with improved bloom-filter false-positive handling compared to previous version 'ka' used in 2.2 and 2.1.
Running sstableupgrade is not necessary but recommended. - Before upgrading to 3.0, make sure that your cluster is in complete agreement (schema versions outputted by `nodetool describecluster` are all the same). - Schema metadata is now stored in the new `system_schema` keyspace, and legacy `system.schema_*` tables are now gone; see CASSANDRA-6717 for details. - Pig's support has been removed. - Hadoop BulkOutputFormat and BulkRecordWriter have been removed; use CqlBulkOutputFormat and CqlBulkRecordWriter instead. - Hadoop ColumnFamilyInputFormat and ColumnFamilyOutputFormat have been removed; use CqlInputFormat and CqlOutputFormat instead. - Hadoop ColumnFamilyRecordReader and ColumnFamilyRecordWriter have been removed; use CqlRecordReader and CqlRecordWriter instead. - hinted_handoff_enabled in cassandra.yaml no longer supports a list of data centers. To specify a list of excluded data centers when hinted_handoff_enabled is set to true, use hinted_handoff_disabled_datacenters, see CASSANDRA-9035 for details. - The `sstable_compression` and `chunk_length_kb` compression options have been deprecated. The new options are `class` and `chunk_length_in_kb`. Disabling compression should now be done by setting the new option `enabled` to `false`. - The compression option `crc_check_chance` became a top-level table option, but is currently enforced only against tables with enabled compression. - Only map syntax is now allowed for caching options. ALL/NONE/KEYS_ONLY/ROWS_ONLY syntax has been deprecated since 2.1.0 and is being removed in 3.0.0. - The 'index_interval' option for 'CREATE TABLE' statements, which has been deprecated since 2.1 and replaced with the 'min_index_interval' and 'max_index_interval' options, has now been removed. - The 'replicate_on_write' and 'populate_io_cache_on_flush' options for 'CREATE TABLE' statements, which have been deprecated since 2.1, have also been removed. - Batchlog entries are now stored in a new table - system.batches. 
The old one has been deprecated. - JMX methods set/getCompactionStrategyClass have been removed, use set/getCompactionParameters or set/getCompactionParametersJson instead. - SizeTieredCompactionStrategy parameter cold_reads_to_omit has been removed. - The secondary index API has been comprehensively reworked. This will be a breaking change for any custom index implementations, which should now look to implement the new org.apache.cassandra.index.Index interface. New syntax has been added to create and query row-based indexes, which are not explicitly linked to a single column in the base table. 2.2.4 ===== Deprecation ----------- - Pig support has been deprecated, and will be removed in 3.0. Please see CASSANDRA-10542 for more details. - Configuration parameter memory_allocator in cassandra.yaml has been deprecated and will be removed in 3.0.0. As mentioned below for 2.2.0, jemalloc is automatically preloaded on Unix platforms. Operations ---------- - Switching data center or racks is no longer an allowed operation on a node which has data. Instead, the node will need to be decommissioned and rebootstrapped. If moving from the SimpleSnitch, make sure that the data center and rack containing all current nodes are named "datacenter1" and "rack1". To override this behaviour use -Dcassandra.ignore_rack=true and/or -Dcassandra.ignore_dc=true. - Reloading the configuration file of GossipingPropertyFileSnitch has been disabled. Upgrading --------- - The default for the inter-DC stream throughput setting (inter_dc_stream_throughput_outbound_megabits_per_sec in cassandra.yaml) is now the same as the intra-DC one (200Mbps) instead of being unlimited. Having it unlimited was never intended and was a bug. New features ------------ - Time windows in DTCS are now limited to 1 day by default to be able to handle bootstrap and repair in a better way. To get the old behaviour, increase max_window_size_seconds.
- DTCS option max_sstable_age_days is now deprecated and defaults to 1000 days. - Native protocol server now allows both SSL and non-SSL connections on the same port. 2.2.3 ===== Upgrading --------- - Nothing specific to this release, but please see 2.2 if you are upgrading from a previous version. 2.2.2 ===== Changed Defaults ---------------- - commitlog_total_space_in_mb will use the smaller of 8192 and 1/4 of the total space of the commitlog volume. (Before: always used 8192) - The following INFO logs were reduced to DEBUG level and will now show in debug.log instead of system.log: - Memtable flushing actions - Commit log replayed files - Compacted sstables - SSTable opening (SSTableReader) New features ------------ - Custom QueryHandlers can retrieve the column specifications for the bound variables from QueryOptions by using the hasColumnSpecifications() and getColumnSpecifications() methods. - A new default asynchronous log appender debug.log was created in addition to the system.log appender in order to provide more detailed log debugging. In order to disable debug logging, you must comment out the ASYNCDEBUGLOG appender in conf/logback.xml. See CASSANDRA-10241 for more information. 2.2.1 ===== New features ------------ - COUNT(*) and COUNT(1) can be selected with other columns or functions 2.2 === Upgrading --------- - The authentication & authorization subsystems have been redesigned to support role based access control (RBAC), resulting in a change to the schema of the system_auth keyspace. See below for more detail. For systems already using the internal auth implementations, the process for converting existing data during a rolling upgrade is straightforward. As each node is restarted, it will attempt to convert any data in the legacy tables into the new schema.
Until enough nodes to satisfy the replication strategy for the system_auth keyspace are upgraded and so have the new schema, this conversion will fail with the failure being reported in the system log. During the upgrade, Cassandra's internal auth classes will continue to use the legacy tables, so clients experience no disruption. Issuing DCL statements during an upgrade is not supported. Once all nodes are upgraded, an operator with superuser privileges should drop the legacy tables, system_auth.users, system_auth.credentials and system_auth.permissions. Doing so will prompt Cassandra to switch over to the new tables without requiring any further intervention. While the legacy tables are present a restarted node will re-run the data conversion and report the outcome so that operators can verify that it is safe to drop them. New features ------------ - The LIMIT clause now applies only to the number of rows returned to the user, not to the number of rows queried. As a consequence, queries using aggregates will not be impacted by the LIMIT clause anymore. - Very large batches will now be rejected (defaults to 50kb). This can be customized by modifying batch_size_fail_threshold_in_kb. - Selecting columns, scalar functions, UDT fields, writetime or ttl together with aggregates is now possible. The values returned for the columns, scalar functions, UDT fields, writetime and ttl will be the ones for the first row matching the query. - Windows is now a supported platform. Powershell execution for startup scripts is highly recommended and can be enabled via an administrator command-prompt with: 'powershell set-executionpolicy unrestricted' - It is now possible to do major compactions when using leveled compaction. Doing that will take all sstables and compact them out in levels. The levels will be non overlapping so doing this will still not be something you want to do very often since it might cause more compactions for a while.
It is also possible to split output when doing a major compaction with STCS - files will be split in sizes 50%, 25%, 12.5% etc of the total size. This might be a bit better than old major compactions which created one big file on disk. - A new tool, bin/sstableverify, has been added that checks for errors/bitrot in all sstables. Unlike scrub, this is a non-invasive tool. - Authentication & Authorization APIs have been updated to introduce roles. Roles and Permissions granted to them are inherited, supporting role based access control. The role concept supersedes that of users and CQL constructs such as CREATE USER are deprecated but retained for compatibility. The requirement to explicitly create Roles in Cassandra even when auth is handled by an external system has been removed, so authentication & authorization can be delegated to such systems in their entirety. - In addition to the above, Roles are also first class resources and can be the subject of permissions. Users (roles) can now be granted permissions on other roles, including CREATE, ALTER, DROP & AUTHORIZE, which removes the need for superuser privileges in order to perform user/role management operations. - Creators of database resources (Keyspaces, Tables, Roles) are now automatically granted all permissions on them (if the IAuthorizer implementation supports this). - SSTable file name is changed. Now you don't have Keyspace/CF name in file name. Also, secondary index has its own directory under parent's directory. - Support for user-defined functions and user-defined aggregates has been added to CQL. ************************************************************************ IMPORTANT NOTE: user-defined functions can be used to execute arbitrary and possibly evil code in Cassandra 2.2, and are therefore disabled by default. To enable UDFs edit cassandra.yaml and set enable_user_defined_functions to true. CASSANDRA-9402 will add a security manager for UDFs in Cassandra 3.0.
This will inherently be backwards-incompatible with any 2.2 UDF that performs insecure operations such as opening a socket or writing to the filesystem. ************************************************************************ - Row-cache is now fully off-heap. - jemalloc is now automatically preloaded and used on Linux and OS-X if installed. - Please ensure on Unix platforms that there is no installed which is accessible by Cassandra. Old versions of libjna packages (< 4.0.0) will cause problems - e.g. Debian Wheezy contains libjna version 3.2.x. - The node now stays up when streaming fails during bootstrapping. You can use the new `nodetool bootstrap resume` command to continue streaming after resolving an issue. - Protocol version 4 specifies that bind variables do not require having a value when executing a statement. Bind variables without a value are called 'unset'. The 'unset' bind variable is serialized as the int value '-2' without following bytes. In an EXECUTE or BATCH request an unset bind value does not modify the value and does not create a tombstone, an unset bind ttl is treated as 'unlimited', an unset bind timestamp is treated as 'now', an unset bind counter operation does not change the counter value. Unset tuple field, UDT field and map key are not allowed. In a QUERY request an unset limit is treated as 'unlimited'. Unset WHERE clauses with unset partition column, clustering column or index column are not allowed. - New `ByteType` (cql tinyint). 1-byte signed integer - New `ShortType` (cql smallint). 2-byte signed integer - New `SimpleDateType` (cql date). 4-byte unsigned integer - New `TimeType` (cql time). 8-byte long - The toDate(timeuuid), toTimestamp(timeuuid) and toUnixTimestamp(timeuuid) functions have been added to allow conversion from timeuuid into date type, timestamp type and bigint raw value. The functions unixTimestampOf(timeuuid) and dateOf(timeuuid) have been deprecated.
- The toDate(timestamp) and toUnixTimestamp(timestamp) functions have been added to allow conversion from timestamp into date type and bigint raw value. - The toTimestamp(date) and toUnixTimestamp(date) functions have been added to allow conversion from date into timestamp type and bigint raw value. - SizeTieredCompactionStrategy parameter cold_reads_to_omit has been removed. - The default JVM flag -XX:+PerfDisableSharedMem will cause the following JVM tools to stop working: jps, jstack, jinfo, jmc, jcmd as well as 3rd party tools like Jolokia. If you wish to use these tools you can comment this flag out in cassandra-env.{sh,ps1} Upgrading --------- - Thrift rpc is no longer being started by default. Set `start_rpc` parameter to `true` to enable it. - Pig's CqlStorage has been removed, use CqlNativeStorage instead - Pig's CassandraStorage has been deprecated. CassandraStorage should only be used against tables created via thrift. Use CqlNativeStorage for all other tables. - IAuthenticator has been updated to remove responsibility for user/role maintenance and is now solely responsible for validating credentials. This is primarily done via SASL, though an optional method exists for systems which need support for the Thrift login() method. - IRoleManager interface has been added which takes over the maintenance functions from IAuthenticator. IAuthorizer is mainly unchanged. Auth data in systems using the stock internal implementations PasswordAuthenticator & CassandraAuthorizer will be automatically converted during upgrade, with minimal operator intervention required. Custom implementations will require modification, though these can be used in conjunction with the stock CassandraRoleManager so providing an IRoleManager implementation should not usually be necessary. - Fat client support has been removed since we have push notifications to clients - cassandra-cli has been removed. Please use cqlsh instead.
- YamlFileNetworkTopologySnitch has been removed; switch to GossipingPropertyFileSnitch instead. - CQL2 has been removed entirely in this release (previously deprecated in 2.0.0). Please switch to CQL3 if you haven't already done so. - The results of CQL3 queries containing an IN restriction will be ordered in the normal order and no longer in the order in which the column values were specified in the IN restriction. - Some secondary index queries with restrictions on non-indexed clustering columns were not requiring ALLOW FILTERING as they should. This has been fixed, and those queries now require ALLOW FILTERING (see CASSANDRA-8418 for details). - The SSTableSimpleWriter and SSTableSimpleUnsortedWriter classes have been deprecated and will be removed in the next major Cassandra release. You should use the CQLSSTableWriter class instead. - The sstable2json and json2sstable tools have been deprecated and will be removed in the next major Cassandra release. See CASSANDRA-9618 for details. - nodetool enablehandoff will no longer support a list of data centers starting with the next major release. Two new commands will be added, enablehintsfordc and disablehintsfordc, to exclude data centers from using hinted handoff when the global status is enabled. In cassandra.yaml, hinted_handoff_enabled will no longer support a list of data centers starting with the next major release. A new setting will be added, hinted_handoff_disabled_datacenters, to exclude data centers when the global status is enabled, see CASSANDRA-9035 for details. 2.1.13 ====== New features ------------ - New options for cqlsh COPY FROM and COPY TO, see CASSANDRA-9303 for details. 2.1.10 ====== New features ------------ - The syntax TRUNCATE TABLE X is now accepted as an alias for TRUNCATE X 2.1.9 ===== Upgrading --------- - cqlsh will now display timestamps with a UTC timezone. Previously, timestamps were displayed with the local timezone.
- Commit log files are no longer recycled by default, due to negative performance implications. This can be enabled again with the commitlog_segment_recycling option in your cassandra.yaml - JMX methods set/getCompactionStrategyClass have been deprecated, use set/getCompactionParameters or set/getCompactionParametersJson instead 2.1.8 ===== Upgrading --------- - Nothing specific to this release, but please see 2.1 if you are upgrading from a previous version. 2.1.7 ===== 2.1.6 ===== Upgrading --------- - Nothing specific to this release, but please see 2.1 if you are upgrading from a previous version. 2.1.5 ===== Upgrading --------- - The option to omit cold sstables with size tiered compaction has been removed - it is almost always better to use date tiered compaction for workloads that have cold data. 2.1.4 ===== Upgrading --------- The default JMX config now listens to localhost only. You must enable the other JMX flags in manually. 2.1.3 ===== Upgrading --------- - Prepending a list to a list collection was erroneously resulting in the prepended list being reversed upon insertion. If you were depending on this buggy behavior, note that it has been corrected. - Incremental replacement of compacted SSTables has been disabled for this release. 2.1.2 ===== Upgrading --------- - Nothing specific to this release, but please see 2.1 if you are upgrading from a previous version. 2.1.1 ===== Upgrading --------- - Nothing specific to this release, but please see 2.1 if you are upgrading from a previous version. New features ------------ - Netty support for epoll on linux is now enabled. If for some reason you want to disable it, pass the following system property: -Dcassandra.native.epoll.enabled=false 2.1 === New features ------------ - Default data and log locations have changed.
If not set in cassandra.yaml, the data file directory, commitlog directory, and saved caches directory will default to $CASSANDRA_HOME/data/data, $CASSANDRA_HOME/data/commitlog, and $CASSANDRA_HOME/data/saved_caches, respectively. The log directory now defaults to $CASSANDRA_HOME/logs. If not set, $CASSANDRA_HOME defaults to the top-level directory of the installation. Note that this should only affect source checkouts and tarballs. Deb and RPM packages will continue to use /var/lib/cassandra and /var/log/cassandra in cassandra.yaml. - SSTable data directory name is slightly changed. Each directory will have a hex string appended after the CF name, e.g. ks/cf-5be396077b811e3a3ab9dc4b9ac088d/ This hex string part represents the unique ColumnFamily ID. Note that existing directories are used as is, so only directories newly created after upgrade have the new directory name format. - Saved key cache files also have ColumnFamily ID in their file name. - It is now possible to do incremental repairs. SSTables that have been repaired are marked with a timestamp and not included in the next repair session. Use nodetool repair -par -inc to use this feature. A tool to manually mark/unmark sstables as repaired is available in tools/bin/sstablerepairedset. This is particularly important when using LCS, or any data not repaired in your first incremental repair will be put back in L0. - Bootstrapping now ensures that range movements are consistent, meaning the data for the new node is taken from the node that is no longer responsible for that range of keys. If you want the old behavior (due to a lost node perhaps) you can set the following property (-Dcassandra.consistent.rangemovement=false) - It is now possible to use quoted identifiers in triggers' names. WARNING: if you previously used triggers with capital letters in their names, then you must quote them from now on.
- Improved stress tool - New incremental repair option - Incremental replacement of compacted SSTables - The row cache can now cache only the head of partitions - Off-heap memtables - CQL improvements and additions: User-defined types, tuple types, secondary indexing of collections, ... Upgrading --------- - commitlog_sync_batch_window_in_ms behavior has changed from the maximum time to wait between fsync to the minimum time. We are working on making this more user-friendly (see CASSANDRA-9533) but in the meantime, this means 2.1 needs a much smaller batch window to keep writer threads from starving. The suggested default is now 2ms. - Rolling upgrades from anything pre-2.0.7 are not supported. Furthermore pre-2.0 sstables are not supported. This means that before upgrading a node to 2.1, this node must be started on 2.0 and 'nodetool upgradesstables' must be run (and this even in the case of not-rolling upgrades). - For size-tiered compaction users, Cassandra now defaults to ignoring the coldest 5% of sstables. This can be customized with the cold_reads_to_omit compaction option; 0.0 omits nothing (the old behavior) and 1.0 omits everything. - Multithreaded compaction has been removed. - Counters implementation has been changed, replaced by a safer one with fewer caveats, but different performance characteristics. You might have to change your data model to accommodate the new implementation. (See and the blog post at for details). - (per-table) index_interval parameter has been replaced with min_index_interval and max_index_interval parameters. index_interval has been deprecated. - support for supercolumns has been removed from json2sstable 2.0.11 ====== Upgrading --------- - Nothing specific to this release, but refer to previous entries if you are upgrading from a previous version. New features ------------ - DateTieredCompactionStrategy added, optimized for time series data and groups data that is written closely in time (CASSANDRA-6602 for details).
Consider this experimental for now. 2.0.10 ====== New features ------------ - CqlPagingRecordReader and CqlPagingInputFormat have both been removed. Use CqlInputFormat instead. - If you are using Leveled Compaction, you can now disable doing size-tiered compaction in L0 by starting Cassandra with -Dcassandra.disable_stcs_in_l0 (see CASSANDRA-6621 for details). - Shuffle and taketoken have been removed. For clusters that choose to upgrade to vnodes, creating a new datacenter with vnodes and migrating is recommended. See for further information. 2.0.9 ===== Upgrading --------- - Default values for read_repair_chance and local_read_repair_chance have been swapped. Namely, default read_repair_chance is now set to 0.0, and default local_read_repair_chance to 0.1. - Queries selecting only CQL static columns were (mistakenly) not returning one result per row in the partition. This has been fixed and a SELECT DISTINCT can be used when only the static column of a partition needs to be fetched without fetching the whole partition. But if you use static columns, please make sure this won't affect you (see CASSANDRA-7305 for details). 2.0.8 ===== New features ------------ - New snitches have been added for users of Google Compute Engine and of Cloudstack. Upgrading --------- - Nothing specific to this release, but please see 2.0.7 if you are upgrading from a previous version. 2.0.7 ===== Upgrading --------- - Nothing specific to this release, but please see 2.0.6 if you are upgrading from a previous version. 2.0.6 ===== New features ------------ - CQL now supports static columns, allows batching multiple conditional updates and has a new syntax for slicing over multiple clustering columns. - Repair can be restricted to a set of nodes using the -hosts option in nodetool. - A new 'nodetool taketoken' command relocates tokens with vnodes.
- Hinted handoff can be enabled only for some data-centers (see hinted_handoff_enabled in cassandra.yaml) Upgrading --------- - Nothing specific to this release, but please see 2.0.5 if you are upgrading from a previous version. 2.0.5 ===== New features ------------ - Batchlog replay can now be throttled, and is throttled by default. See batchlog_replay_throttle_in_kb setting in cassandra.yaml. - Scrub can now optionally skip corrupt counter partitions. Please note that this will lead to the loss of all the counter updates in the skipped partition. See the --skip-corrupted option. Upgrading --------- - If your cluster began on a version before 1.2, check that your secondary index SSTables are on version 'ic' before upgrading. If not, run 'nodetool upgradesstables' if on 1.2.14 or later, or run 'nodetool upgradesstables ks cf' with the keyspace and secondary index named explicitly otherwise. If you don't do this and upgrade to 2.0.x and it refuses to start because of 'hf' version files in the secondary index, you will need to delete/move them out of the way and recreate the index when 2.0.x starts. 2.0.3 ===== New features ------------ - It's now possible to configure the maximum allowed size of the native protocol frames (native_transport_max_frame_size_in_mb in the yaml file). Upgrading --------- - NaN and Infinity are new valid floating point constants in CQL3 and are now reserved keywords. In the unlikely case you were using one of them as an identifier (for a column, a keyspace or a table), you will now have to double-quote them (see for "quoted identifiers"). - The IEndpointStateChangeSubscriber has a new method, beforeChange, that any custom implementations using the class will need to implement.
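The double-quoting requirement for NaN and Infinity in the 2.0.3 upgrade note above can be sketched on the client side as follows. This is only an illustration under stated assumptions: quote_identifier is a hypothetical helper (not part of Cassandra or any driver), it covers only the two keywords named in the note, and it assumes the identifier was originally created unquoted, in which case CQL3 stored it in lower case and the quoted form must use lower case as well.

```python
# Hypothetical helper for illustration; not part of Cassandra or any driver.
# Only the two keywords named in the 2.0.3 note are covered here.
NEWLY_RESERVED = {"nan", "infinity"}


def quote_identifier(name: str) -> str:
    """Return a CQL3 identifier, double-quoted if it is now reserved.

    Assumption: the identifier was created unquoted, so CQL3 folded it to
    lower case. Quoted identifiers are case-sensitive, so the quoted form
    must use the lower-case spelling to match the stored name.
    """
    if name.lower() in NEWLY_RESERVED:
        return '"' + name.lower() + '"'
    return name


# Build a statement that still parses after upgrading to 2.0.3.
column = quote_identifier("NaN")
statement = "SELECT " + column + " FROM readings"
print(statement)
```

If the identifier was created with quotes and mixed case in the first place, it already had to be quoted in every statement, so nothing changes for it.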
2.0.2 ===== New features ------------ - Speculative retry defaults to 99th percentile (See blog post at - Configurable metrics reporting (see conf/metrics-reporter-config-sample.yaml) - Compaction history and stats are now saved to system keyspace (system.compaction_history table). You can access the history via the new 'nodetool compactionhistory' command or CQL. Upgrading --------- - Nodetool defaults to Sequential mode for repair operations 2.0.1 ===== Upgrading --------- - The default memtable allocation has changed from 1/3 of heap to 1/4 of heap. Also, default (single-partition) read and write timeouts have been reduced from 10s to 5s and 2s, respectively. 2.0.0 ===== Upgrading --------- - Java 7 is now *required*! - Upgrading is ONLY supported from Cassandra 1.2.9 or later. This goes for sstable compatibility as well as network. When upgrading from an earlier release, upgrade to 1.2.9 first and run upgradesstables before proceeding to 2.0. - CAS and new features in CQL such as DROP COLUMN assume that cell timestamps are microseconds-since-epoch. Do not use these features if you are using client-specified timestamps with some other source. - Replication and strategy options do not accept unknown options anymore. This was already the case for CQL3 in 1.2 but this is now the case for thrift too. - auto_bootstrap of a single-token node with no initial_token will now pick a random token instead of bisecting an existing token range. We recommend upgrading to vnodes; failing that, we recommend specifying initial_token. - reduce_cache_sizes_at, reduce_cache_capacity_to, and flush_largest_memtables_at options have been removed from cassandra.yaml. - CacheServiceMBean.reduceCacheSizes() has been removed. Use CacheServiceMBean.set{Key,Row}CacheCapacityInMB() instead. - authority option in cassandra.yaml has been deprecated since 1.2.0, but it has been completely removed in 2.0. Please use 'authorizer' option. - ASSUME command has been removed from cqlsh.
Use CQL3 blobAsType() and typeAsBlob() conversion functions instead. See for details. - Inputting blobs as string constants is now fully deprecated in favor of blob constants. Make sure to update your applications to use the new syntax while you are still on 1.2 (which supports both string and blob constants for blob input) before upgrading to 2.0. - index_interval is now a ColumnFamily property. You can change its value with an ALTER TABLE ... WITH statement, and SSTables written after that will use the new value. When upgrading, Cassandra will pick up the value defined in cassandra.yaml as the default for existing ColumnFamilies, until you explicitly set the value for those. - The deprecated native_transport_min_threads option has been removed from cassandra.yaml. Operations ---------- - VNodes are enabled by default in cassandra.yaml. initial_token for non-vnode deployments has been removed from the example yaml, but is still respected if specified. - Major compactions, cleanup, scrub, and upgradesstables will interrupt any in-progress compactions (but not repair validations) when invoked. - Disabling autocompactions by setting min/max compaction threshold to 0 has been deprecated, instead, use the nodetool commands 'disableautocompaction' and 'enableautocompaction' or set the compaction strategy option enabled = false - ALTER TABLE DROP has been reenabled for CQL3 tables and has new semantics now. See and for details. - CAS uses gc_grace_seconds to determine how long to keep unused paxos state around for, or a minimum of three hours. - A new hints created metric is tracked per target, replacing countPendingHints - After performance testing for CASSANDRA-5727, the default LCS filesize has been changed from 5MB to 160MB. - cqlsh DESCRIBE SCHEMA no longer outputs the schema of system_* keyspaces; use DESCRIBE FULL SCHEMA if you need the schema of system_* keyspaces. - CQL2 has been deprecated, and will be removed entirely in 2.2. See CASSANDRA-5918 for details.
- Commit log archiver now assumes the client time stamp to be in microsecond precision, during restore. Please refer to Features -------- - Lightweight transactions ( - Alias support has been added to CQL3 SELECT statement. Refer to CQL3 documentation ( for details. - JEMalloc support (see memory_allocator in cassandra.yaml) - Experimental triggers support. See examples/ for how to use. "Experimental" means "tied closely to internal data structures; we plan to decouple this in the future, which will probably break triggers written against this initial API." - Numerous improvements to CQL3 and a new version of the native protocol. See for details. 1.2.11 ====== Features -------- - Added a new consistency level, LOCAL_ONE, that forces all CL.ONE operations to execute only in the local datacenter. - New replace_address to supplant the (now removed) replace_token and replace_node workflows to replace a dead node in place. Works like the old options, but takes the IP address of the node to be replaced. 1.2.9 ===== Features -------- - A history of executed nodetool commands is now captured. It can be found in ~/.cassandra/nodetool.history. Other tools output files (cli and cqlsh history, .cqlshrc) are now centralized in ~/.cassandra, as well. - A new sstablesplit utility allows to split large sstables offline. 1.2.8 ===== Upgrading --------- - Nothing specific to this release, but please see 1.2.7 if you are upgrading from a previous version. 1.2.7 ===== Upgrading --------- - If you have decommissioned a node in the past 72 hours, it is imperative that you not upgrade until such time has passed, or do a full cluster restart (not rolling) before beginning the upgrade. This only applies to decommission, not removetoken. 1.2.6 ===== Upgrading --------- - hinted_handoff_throttle_in_kb is now reduced by a factor proportional to the number of nodes in the cluster (see - CQL3 syntax for CREATE CUSTOM INDEX has been updated. See CQL3 documentation for details. 
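The 1.2.6 note on hinted_handoff_throttle_in_kb can be illustrated with a little arithmetic. The sketch below assumes that the configured rate is divided by the number of other nodes in the cluster (cluster size minus one), on the reasoning that every other node may be delivering hints to the same target at once. That divisor is an assumption for illustration, not something these notes state, and the function name per_thread_hint_throttle_kb is hypothetical.

```python
def per_thread_hint_throttle_kb(configured_kb: int, cluster_size: int) -> float:
    """Estimate the effective hint delivery rate for one delivery thread.

    ASSUMPTION: the configured throttle is divided by (cluster_size - 1).
    Check the comments in your cassandra.yaml for the authoritative rule.
    """
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    # With one or two nodes there is at most one sender, so the full rate applies.
    other_nodes = max(cluster_size - 1, 1)
    return configured_kb / other_nodes


if __name__ == "__main__":
    # 1024 is used here purely as an example value for the setting.
    for nodes in (2, 3, 5, 9):
        rate = per_thread_hint_throttle_kb(1024, nodes)
        print(f"{nodes} nodes: {rate} KB/s per delivery thread")
```

Under this assumption, growing a cluster lowers the per-thread hint rate, so a throttle value tuned on a small cluster may need to be raised as nodes are added.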
1.2.5 ===== Features -------- - Custom secondary index support has been added to CQL3. Refer to CQL3 documentation ( for details and examples. Upgrading --------- - The native CQL transport is enabled by default on port 9042. 1.2.4 ===== Upgrading --------- - 'nodetool upgradesstables' now only upgrades/rewrites sstables that are not on the current version (which is usually what you want). Use the new -a flag to recover the old behavior of rewriting all sstables. Features -------- - superuser setup delay (10 seconds) can now be overridden using 'cassandra.superuser_setup_delay_ms' property. 1.2.3 ===== Upgrading --------- - CQL3 used to be case-insensitive for property map keys in ALTER and CREATE statements. In other words: CREATE KEYSPACE test WITH replication = { 'CLASS' : 'SimpleStrategy', 'REPLICATION_FACTOR' : '1' } was allowed. However, this was not consistent with the fact that string literals are case sensitive everywhere else and, more importantly, it broke NetworkTopologyStrategy, for which DC names are case sensitive. Those property map keys are now case sensitive. So the statement above should be changed to: CREATE KEYSPACE test WITH replication = { 'class' : 'SimpleStrategy', 'replication_factor' : '1' } 1.2.2 ===== Upgrading --------- - CQL3 type validation for constants has been fixed, which may require fixing queries that were relying on the previous loose validation. Please refer to the CQL3 documentation ( and in particular the changelog section for more details. Please note in particular that inputting blobs as string constants is now deprecated (in favor of blob constants) and its support will be removed in a future version. Features -------- - Built-in CQL3-based implementations of IAuthenticator (PasswordAuthenticator) and IAuthorizer (CassandraAuthorizer) have been added. PasswordAuthenticator stores usernames and hashed passwords in system_auth.credentials table; CassandraAuthorizer stores permissions in system_auth.permissions table.
- system_auth keyspace is now alterable via ALTER KEYSPACE queries. The default is SimpleStrategy with replication_factor of 1, but it's advised to raise RF to at least 3 or 5, since CL.QUORUM is used for all auth-related queries. It's also possible to change the strategy to NTS. - Permissions caching with time-based expiration policy has been added to reduce performance impact of authorization. Permission validity can be configured using 'permissions_validity_in_ms' setting in cassandra.yaml. The default is 2000 (2 seconds). - SimpleAuthenticator and SimpleAuthorizer examples have been removed. Please look at CassandraAuthorizer/PasswordAuthenticator instead. 1.2.1 ===== Upgrading --------- - In CQL3, date string are no longer accepted as timeuuid value since a date string is not a correct representation of a timeuuid. Instead, new methods (minTimeuuid, maxTimeuuid, now, dateOf, unixTimestampOf) have been introduced to make working on timeuuid from date string easy. cqlsh also does not display timeuuid as date string (since this is a lossy representation), but the new dateOf method can be used instead. Please refer to the reference documentation ( for more detail. - For client implementors: CQL3 client using the thrift interface should use the new execute_cql3_query, prepare_cql3_query and execute_prepared_cql3_query since 1.2.0. However, Cassandra 1.2.0 was not complaining if CQL3 was set through set_cql_version but the now CQL2 only methods were used. This is now the case. - Queries that uses unrecognized or bad compaction or replication strategy options are now refused (instead of simply logging a warning). 1.2 === Upgrading --------- - IAuthenticator interface has been updated to support dynamic user creation, modification and removal. Users, even when stored externally, now have to be explicitly created using CREATE USER query first. 
AllowAllAuthenticator and SimpleAuthenticator have been updated for the new interface, but you'll have to update your old IAuthenticator implementations for 1.2. To ease this process, a new abstract LegacyAuthenticator class has been added - subclass it in your old IAuthenticator implementation and everything should just work (this only affects users who implemented custom authenticators). - IAuthority interface has been deprecated in favor of IAuthorizer. AllowAllAuthority and SimpleAuthority have been renamed to AllowAllAuthorizer and SimpleAuthorizer, respectively. In order to simplify the upgrade to the new interface, a new abstract LegacyAuthorizer has been added - you should subclass it in your old IAuthority implementation and everything should just work (this only affects users who implemented custom authorities). 'authority' setting in cassandra.yaml has been renamed to 'authorizer', 'authority' is no longer recognized. This affects all upgrading users. - 1.2 is NOT network-compatible with versions older than 1.0. That means if you want to do a rolling, zero-downtime upgrade, you'll need to upgrade first to 1.0.x or 1.1.x, and then to 1.2. 1.2 retains the ability to read data files from Cassandra versions at least back to 0.6, so a non-rolling upgrade remains possible with just one step. - The default partitioner for new clusters is Murmur3Partitioner, which is about 10% faster for index-intensive workloads. Partitioners cannot be changed once data is in the cluster, however, so if you are switching to the 1.2 cassandra.yaml, you should change this to RandomPartitioner or whatever your old partitioner was. - If you are using counters and upgrading from a version prior to 1.1.6, you should drain existing Cassandra nodes prior to the upgrade to prevent overcount during commitlog replay (see CASSANDRA-4782). For non-counter uses, drain is not required but is a good practice to minimize restart time.
- Tables using LeveledCompactionStrategy will default to not creating a row-level bloom filter. The default in older versions of Cassandra differs; you should manually set the false positive rate to 1.0 (to disable) or 0.01 (to enable, if you make many requests for rows that do not exist). - The hints schema was changed from 1.1 to 1.2. Cassandra automatically snapshots and then truncates the hints column family as part of starting up 1.2 for the first time. Additionally, upgraded nodes will not store new hints destined for older (pre-1.2) nodes. It is therefore recommended that you perform a cluster upgrade when all nodes are up. Because hints will be lost, a cluster-wide repair (with -pr) is recommended after upgrade of all nodes. - The `nodetool removetoken` command (and corresponding JMX operation) have been renamed to `nodetool removenode`. This function is incompatible with the earlier `nodetool removetoken`, and attempts to remove nodes in this way with a mixed 1.1 (or lower) / 1.2 cluster, is not supported. - The somewhat ill-conceived CollatingOrderPreservingPartitioner has been removed. Use Murmur3Partitioner (recommended) or ByteOrderedPartitioner instead. - Global option hinted_handoff_throttle_delay_in_ms has been removed. hinted_handoff_throttle_in_kb has been added instead. - The default bloom filter fp chance has been increased to 1%. This will save about 30% of the memory used by the old default. Existing columnfamilies will retain their old setting. - The default partitioner (for new clusters; the partitioner cannot be changed in existing clusters) was changed from RandomPartitioner to Murmur3Partitioner which provides faster hashing as well as improved performance with secondary indexes. - The default version of CQL (and cqlsh) is now CQL3. CQL2 is still available but you will have to use the thrift set_cql_version method (that is already supported in 1.1) to use CQL2. For cqlsh, you will need to use 'cqlsh -2'. 
- CQL3 is now considered final in this release. Compared to the beta version that is part of 1.1, this final version has a few additions (collections), but also some (incompatible) changes in the syntax for the options of the create/alter keyspace/table statements. Typically, the syntax to create a keyspace is now: CREATE KEYSPACE ks WITH replication = { 'class' : 'SimpleStrategy', 'replication_factor' : 2 }; Also, the consistency level cannot be set in the language anymore, but is at the protocol level. Please refer to the CQL3 documentation ( for details. - In CQL3, the DROP behavior from ALTER TABLE has currently been removed (because it was not correctly implemented). We hope to add it back soon (Cassandra 1.2.1 or 1.2.2) Features -------- - Cassandra can now handle concurrent CREATE TABLE schema changes as well as other updates - rpc_timeout has been split up to allow finer-grained control on timeouts for different operation types - num_tokens can now be specified in cassandra.yaml. This defines the number of tokens assigned to the host on the ring (default: 1). Also specifying initial_token will override any num_tokens setting. - disk_failure_policy allows blacklisting failed disks in JBOD configuration instead of erroring out indefinitely - event tracing can be configured per-connection ("trace_next_query") or globally/probabilistically ("nodetool settraceprobability") - Atomic batches are now supported server side, where Cassandra will guarantee that (at the price of pre-writing the batch to another node first), all mutations in the batch will be applied, even if the coordinator fails mid-batch. - new IAuthorizer interface has replaced the old IAuthority. IAuthorizer allows dynamic permission management via new CQL3 statements: GRANT, REVOKE, LIST PERMISSIONS. A native implementation storing the permissions in Cassandra is being worked on and we expect to include it in 1.2.1 or 1.2.2. 
- IAuthenticator interface has been updated to support dynamic user creation, modification and removal via new CQL3 statements: CREATE USER, ALTER USER, DROP USER, LIST USERS. A native implementation that stores users in Cassandra itself is being worked on and is expected to become part of 1.2.1 or 1.2.2. 1.1.5 ===== Upgrading --------- - Nothing specific to this release, but please see 1.1 if you are upgrading from a previous version. 1.1.4 ===== Upgrading --------- - Nothing specific to this release, but please see 1.1 if you are upgrading from a previous version. 1.1.3 ===== Upgrading --------- - Running "nodetool upgradesstables" after upgrading is recommended if you use Counter columnfamilies. Features -------- - the cqlsh COPY command can now export to CSV flat files - added a new tools/bin/token-generator to facilitate generating evenly distributed tokens 1.1.2 ===== Upgrading --------- - If you have column families using the LeveledCompactionStrategy, you should run scrub on those column families. Features -------- - cqlsh has a new COPY command to load data from CSV flat files 1.1.1 ===== Upgrading --------- - Nothing specific to this release, but please see 1.1 if you are upgrading from a previous version. Features -------- - Continuous commitlog archiving and point-in-time recovery. See conf/ - Incremental repair by token range, exposed over JMX 1.1 === Upgrading --------- - Compression is enabled by default on newly created ColumnFamilies (and unchanged for ColumnFamilies created prior to upgrading). - If you are running a multi datacenter setup, you should upgrade to the latest 1.0.x (or 0.8.x) release before upgrading. Versions 0.8.8 and 1.0.3-1.0.5 generate cross-dc forwarding that is incompatible with 1.1. - EACH_QUORUM ConsistencyLevel is only supported for writes and will now throw an InvalidRequestException when used for reads. (Previous versions would silently perform a LOCAL_QUORUM read instead.) 
- ANY ConsistencyLevel is only supported for writes and will now throw an InvalidRequestException when used for reads. (Previous versions would silently perform a ONE read for range queries; single-row and multiget reads already rejected ANY.) - The largest mutation batch accepted by the commitlog is now 128MB. (In practice, batches larger than ~10MB always caused poor performance due to load volatility and GC promotion failures.) Larger batches will continue to be accepted but will not be durable. Consider setting durable_writes=false if you really want to use such large batches. - Make sure that global settings: key_cache_{size_in_mb, save_period} and row_cache_{size_in_mb, save_period} in conf/cassandra.yaml are used instead of per-ColumnFamily options. - JMX methods no longer return custom Cassandra objects. Any such methods will now return standard Maps, Lists, etc. - Hadoop input and output details are now separated. If you were previously using methods such as getRpcPort you now need to use getInputRpcPort or getOutputRpcPort depending on the circumstance. - CQL changes: + Prior to 1.1, you could use KEY as the primary key name in some select statements, even if the PK was actually given a different name. In 1.1+ you must use the defined PK name. - The sliced_buffer_size_in_kb option has been removed from the cassandra.yaml config file (this option was a no-op since 1.0). Features -------- - Concurrent schema updates are now supported, with any conflicts automatically resolved. Please note that running 'CREATE COLUMN FAMILY' operations simultaneously on different nodes is not safe until version 1.2, due to the nature of ColumnFamily identifier generation; for more details see CASSANDRA-3794. - The CQL language has undergone a major revision, CQL3, the highlights of which are covered at [1]. CQL3 is not backwards-compatible with CQL2, so we've introduced a set_cql_version Thrift method to specify which version you want.
(The default remains CQL2 at least until Cassandra 1.2.) cqlsh adds a --cql3 flag to enable this. [1] - Row-level isolation: multi-column updates to a single row have always been *atomic* (either all will be applied, or none) thanks to the CommitLog, but until 1.1 they were not *isolated* -- a reader may see mixed old and new values while the update happens. - Finer-grained control over data directories, allowing a ColumnFamily to be pinned to a specific volume, e.g. one backed by SSD. - The bulk loader is no longer a fat client; it can be run from an existing machine in a cluster. - A new write survey mode has been added, similar to bootstrap (enabled via -Dcassandra.write_survey=true), but the node will not automatically join the cluster. This is useful for cases such as testing different compaction strategies with live traffic without affecting the cluster. - Key and row caches are now global, similar to the global memtable threshold. Manual tuning of cache sizes per-columnfamily is no longer required. - Off-heap caches no longer require JNA, and will work out of the box on Windows as well as Unix platforms. - Streaming is now multithreaded. - Compactions may now be aborted via JMX or nodetool. - The stress tool is not new in 1.1, but it is newly included in binary builds as well as the source tree - Hadoop: a new BulkOutputFormat is included which will directly write SSTables locally and then stream them into the cluster. YOU SHOULD USE BulkOutputFormat BY DEFAULT. ColumnFamilyOutputFormat is still around in case for some strange reason you want results trickling out over Thrift, but BulkOutputFormat is significantly more efficient. - Hadoop: KeyRange.filter is now supported with ColumnFamilyInputFormat, allowing index expressions to be evaluated server-side to reduce the amount of data sent to Hadoop.
- Hadoop: ColumnFamilyRecordReader has a wide-row mode, enabled via a boolean parameter to setInputColumnFamily, that pages through data column-at-a-time instead of row-at-a-time. - Pig: can use the wide-row Hadoop support, by setting PIG_WIDEROW_INPUT to true. This will produce each row's columns in a bag. 1.0.8 ===== Upgrading --------- - Nothing specific to 1.0.8 Other ----- - Allow configuring socket timeout for streaming 1.0.7 ===== Upgrading --------- - Nothing specific to 1.0.7, please refer to the instructions for 1.0.6 Other ----- - Adds new setstreamthroughput to nodetool to configure streaming throttling - Adds JMX property to get/set rpc_timeout_in_ms at runtime - Allow configuring (per-CF) bloom_filter_fp_chance 1.0.6 ===== Upgrading --------- - This release fixes an issue related to the chunk_length_kb option for compressed sstables. If you use compression on some column families, it is recommended after the upgrade to check the value for this option on these column families (the default value is 64). If the option is not set correctly, you should update the column family definition, setting the right value, and then run scrub on the column family. - Please refer to the instructions for 1.0.5 if coming from an older version. 1.0.5 ===== Upgrading --------- - 1.0.5 fixes two important regressions in 1.0.4. All information concerning 1.0.4 is therefore valid for this release, but please avoid upgrading to 1.0.4. 1.0.4 ===== Upgrading --------- - Nothing specific to 1.0.4 but please see the 1.0 upgrading section if upgrading from a version prior to 1.0.0 Features -------- - A new upgradesstables command has been added to nodetool. It is very similar to scrub but without the ability to discard corrupted rows (and as a consequence it does not snapshot automatically before). This new command is to be preferred to scrub in all cases where sstables should be rewritten to the current format for upgrade purposes.
JMX --- - The paths for the data, commit log and saved cache directories are now exposed through JMX - The in-memory bloom filter sizes are now exposed through JMX 1.0.3 ===== Upgrading --------- - Nothing specific to 1.0.3 but please see the 1.0 upgrading section if upgrading from a version prior to 1.0.0 Features -------- - For non compressed sstables (compressed sstables already include finer-grained checksums), a sha1 for the full sstable is now automatically created (in a file with suffix -Digest.sha1). It can be used to check the sstable integrity with sha1sum. 1.0.2 ===== Upgrading --------- - Nothing specific to 1.0.2 but please see the 1.0 upgrading section if upgrading from a version prior to 1.0.0 Features -------- - Cassandra CLI queries now have timing information 1.0.1 ===== Upgrading --------- - If upgrading from a version prior to 1.0.0, please see the 1.0 Upgrading section - For running on Windows as a Service, procrun is no longer distributed with Cassandra, see README.txt for more information on how to download it if necessary. - The names given to snapshot directories have been improved for human readability. If you had scripts relying on them, you may need to update them. 1.0 === Upgrading --------- - Upgrading from version 0.7.1+ or 0.8.2+ can be done with a rolling restart, one node at a time. (0.8.0 or 0.8.1 are NOT network-compatible with 1.0: upgrade to the most recent 0.8 release first.) You do not need to bring down the whole cluster at once. - After upgrading, run nodetool scrub against each node before running repair, moving nodes, or adding new ones. - CQL inserts/updates now generate microsecond resolution timestamps by default, instead of millisecond. THIS MEANS A ROLLING UPGRADE COULD MIX milliseconds and microseconds, with clients talking to servers generating milliseconds unable to overwrite the larger microsecond timestamps.
If you are using CQL and this is important for your application, you can either perform a non-rolling upgrade to 1.0, or update your application first to use explicit timestamps with the "USING timestamp=X" syntax. - The BinaryMemtable bulk-load interface has been removed (use the sstableloader tool instead). - The compaction_thread_priority setting has been removed from cassandra.yaml (use compaction_throughput_mb_per_sec to throttle compaction instead). - CQL types bytea and date were renamed to blob and timestamp, respectively, to conform with SQL norms. CQL type int is now a 4-byte int, not 8 (which is still available as bigint). - Cassandra 1.0 uses arena allocation to reduce old generation fragmentation. This means there is a minimum overhead of 1MB per ColumnFamily plus 1MB per index. - The SimpleAuthenticator and SimpleAuthority classes have been moved to the example directory (and are thus not available from the binary distribution). They never provided actual security and in their current state are only meant as examples. Features -------- - SSTable compression is supported through the 'compression_options' parameter when creating/updating a column family. For instance, you can create a column family Cf using compression (through the Snappy library) in the CLI with: create column family Cf with compression_options={sstable_compression: SnappyCompressor} SSTable compression is not activated by default but can be activated or deactivated at any time. - Compressed SSTable blocks are checksummed to protect against bitrot - New LevelDB-inspired compaction algorithm can be enabled by setting the Columnfamily compaction_strategy=LeveledCompactionStrategy option. Leveled compaction means you only need to keep a few MB of space free for compaction instead of (in the worst case) 50%. - Ability to use multiple threads during a single compaction. See multithreaded_compaction in cassandra.yaml for more details. 
- Windows Service ("cassandra.bat install" to enable) - A dead node may be replaced in a single step by starting a new node with -Dcassandra.replace_token=<token>. More details can be found at - It is now possible to repair only the first range returned by the partitioner for a node with `nodetool repair -pr`. It makes it easier/possible to repair a full cluster without any work duplication by running this command on every node of the cluster. New data types -------------- - decimal Other ----- - Hinted Handoff has two major improvements: - Hint replay is much more efficient thanks to a change in the data model - Hints are created for all replicas that do not ack a write. (Formerly, only replicas known to be down when the write started were hinted.) This means that running with read repair completely off is much more viable than before, and the default read_repair_chance is reduced from 1.0 ("always repair") to 0.1 ("repair 10% of the time"). - The old per-ColumnFamily memtable thresholds (memtable_throughput_in_mb, memtable_operations_in_millions, memtable_flush_after_mins) are ignored, in favor of the global memtable_total_space_in_mb and commitlog_total_space_in_mb settings. This does not affect client compatibility -- the old options are still allowed, but have no effect. These options may be removed entirely in a future release. - Backlogged compactions will begin five minutes after startup. The 0.8 behavior of never starting compaction until a flush happens is usually not what is desired, but a short grace period is useful to allow caches to warm up first. - The deletion of compacted data files is not performed during Garbage Collection anymore. This means compacted files will now be deleted without delay. 0.8.5 ===== Features -------- - SSTables copied to a data directory can be loaded by a live node through nodetool refresh (may be handy to load snapshots). - The configured compaction throughput is exposed through JMX. 
Other ----- - The sstableloader is now bundled with the debian package. - Repair detects when a participating node is dead and fails instead of hanging forever. 0.8.4 ===== Upgrading --------- - Nothing specific to 0.8.4 Other ----- - This release fixes a bug in counters that could lead to (significant) over-counting. - It also fixes a slight upgrade regression from 0.8.3. It is thus advised to jump directly to 0.8.4 if upgrading from before 0.8.3. 0.8.3 ===== Upgrading --------- - Token removal has been revamped. Removing tokens in a mixed cluster with 0.8.3 will not work, so the entire cluster will need to be running 0.8.3 first, except for the dead node. Features -------- - It is now possible to use thrift asynchronous and half-synchronous/half-asynchronous servers (see cassandra.yaml for more details). - It is now possible to access counter columns through Hadoop. Other ----- - This release fixes a regression in 0.8 that can cause a commit log segment to be deleted even though not all the data it contains has been flushed. Upgrading from 0.8.* is strongly encouraged. 0.8.2 ===== Upgrading --------- - 0.8.0 and 0.8.1 shipped with a bug that was setting the replicate_on_write option for counter column families to false (this option has no effect on non-counter column families). This is an unsafe default and 0.8.2 corrects it: the default for replicate_on_write is now true. It is advised to update your counter column family definitions if replicate_on_write was incorrectly set to false (before or after upgrade). 0.8.1 ===== Upgrading --------- - 0.8.1 is backwards compatible with 0.8; the upgrade can be achieved by a simple rolling restart. - If upgrading from an earlier version (0.7), please refer to the 0.8 section for instructions. Features -------- - Numerous additions/improvements to CQL (support for counters, TTL, batch inserts/deletes, index dropping, ...).
- Add two new AbstractTypes (comparator) to support compound keys (CompositeType and DynamicCompositeType), as well as a ReverseType to reverse the order of any existing comparator. - New option to bypass the commit log on some keyspaces (for advanced users). Tools ----- - Add new data bulk loading utility (sstableloader). 0.8 === Upgrading --------- - Upgrading from version 0.7.1 or later can be done with a rolling restart, one node at a time. You do not need to bring down the whole cluster at once. - After upgrading, run nodetool scrub against each node before running repair, moving nodes, or adding new ones. - Running nodetool drain before shutting down the 0.7 node is recommended but not required. (Skipping this will result in replay of entire commitlog, so it will take longer to restart but is otherwise harmless.) - 0.8 is fully API-compatible with 0.7. You can continue to use your 0.7 clients. - Avro record classes used in map/reduce and Hadoop streaming code have been removed. Map/reduce can be switched to Thrift by changing org.apache.cassandra.avro in import statements to org.apache.cassandra.thrift (no class names change). Streaming support has been removed for the time being. - The loadbalance command has been removed from nodetool. For similar behavior, decommission then rebootstrap with empty initial_token. - Thrift unframed mode has been removed. - The addition of key_validation_class means the cli will assume keys are bytes, instead of strings, in the absence of other information. See for more details. 
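The Avro-to-Thrift switch described in the 0.8 upgrading notes (change org.apache.cassandra.avro to org.apache.cassandra.thrift in import statements, with no class names changing) is mechanical enough to script. The following is a minimal sketch, assuming your map/reduce sources are plain Java files. The function name switch_imports_to_thrift and the sample imports are hypothetical and used only for illustration; review the result by hand before committing it.

```python
import re

# Matches the package prefix at the start of a Java import line, including
# "import static", and captures everything before the package name.
AVRO_IMPORT = re.compile(
    r"^(\s*import\s+(?:static\s+)?)org\.apache\.cassandra\.avro\b",
    re.MULTILINE,
)


def switch_imports_to_thrift(java_source: str) -> str:
    """Rewrite Avro record imports to their Thrift equivalents.

    Only import statements are touched; class names are left unchanged,
    as the 0.8 notes say no class names change.
    """
    return AVRO_IMPORT.sub(r"\1org.apache.cassandra.thrift", java_source)


if __name__ == "__main__":
    # Illustrative input only; substitute the contents of your own source files.
    sample = (
        "import org.apache.cassandra.avro.Column;\n"
        "import java.util.List;\n"
    )
    print(switch_imports_to_thrift(sample))
```

Fully qualified references to the old package outside of import lines are deliberately left alone by this sketch and would need a separate search.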
Features -------- - added CQL client API and JDBC/DBAPI2-compliant drivers for Java and Python, respectively (see: drivers/ subdirectory and doc/cql) - added distributed Counters feature; see - optional intranode encryption; see comments around 'encryption_options' in cassandra.yaml - compaction multithreading and rate-limiting; see 'concurrent_compactors' and 'compaction_throughput_mb_per_sec' in cassandra.yaml - cassandra will limit total memtable memory usage to 1/3 of the heap by default. This can be adjusted or disabled with the memtable_total_space_in_mb option. The old per-ColumnFamily throughput, operations, and age settings are still respected but will be removed in a future major release once we are satisfied that memtable_total_space_in_mb works adequately. Tools ----- - stress and py_stress moved from contrib/ to tools/ - clustertool was removed (see for examples of how to script nodetool across the cluster instead) Other ----- - sstable2json used to write column names and values as hex strings; it now creates human-readable values based on the comparator/validator. As a result, JSON dumps created with older versions of sstable2json are no longer compatible with json2sstable, and imports must be made with a configuration that is identical to the export. - manually-forced compactions ("nodetool compact") will do nothing if only a single SSTable remains for a ColumnFamily. To force it to compact that anyway (which will free up space if there are a lot of expired tombstones), use the new forceUserDefinedCompaction JMX method on CompactionManager. - most of contrib/ (which was not part of the binary releases) has been moved either to examples/ or tools/. We plan to move the rest for 0.8.1. JMX --- - By default, JMX now listens on port 7199. 0.7.6 ===== Upgrading --------- - Nothing specific to 0.7.6, but see 0.7.3 Upgrading if upgrading from earlier than 0.7.1.
0.7.5 ===== Upgrading --------- - Nothing specific to 0.7.5, but see 0.7.3 Upgrading if upgrading from earlier than 0.7.1. Changes ------- - system_update_column_family no longer snapshots before applying the schema change. (_update_keyspace never did. _drop_keyspace and _drop_column_family continue to snapshot.) - added memtable_flush_queue_size option to cassandra.yaml to avoid blocking writes when multiple column families (or a column family with indexes) are flushed at the same time. - allow overriding initial_token, storage_port and rpc_port using system properties 0.7.4 ===== Upgrading --------- - Nothing specific to 0.7.4, but see 0.7.3 Upgrading if upgrading from earlier than 0.7.1. Features -------- - Output to Pig is now supported as well as input 0.7.3 ===== Upgrading --------- - 0.7.1 and 0.7.2 shipped with a bug that caused incorrect row-level bloom filters to be generated when compacting sstables generated with earlier versions. This would manifest in IOExceptions during column name-based queries. 0.7.3 provides "nodetool scrub" to rebuild sstables with correct bloom filters, with no data lost. (If your cluster was never on 0.7.0 or earlier, you don't have to worry about this.) Note that nodetool scrub will snapshot your data files before rebuilding, just in case. 0.7.1 ===== Upgrading --------- - 0.7.1 is completely backwards compatible with 0.7.0. Just restart each node with the new version, one at a time. (The cluster does not all need to be upgraded simultaneously.)
Features -------- - added flush_largest_memtables_at and reduce_cache_sizes_at options to cassandra.yaml as an escape valve for memory pressure - added option to specify -Dcassandra.join_ring=false on startup to allow "warm spare" nodes or performing JMX maintenance before joining the ring Performance ----------- - Disk writes and sequential scans avoid polluting page cache (requires JNA to be enabled) - Cassandra performs writes efficiently across datacenters by sending a single copy of the mutation and having the recipient forward that to other replicas in its datacenter. - Improved network buffering - Reduced lock contention on memtable flush - Optimized supercolumn deserialization - Zero-copy reads from mmapped sstable files - Explicitly set higher JVM new generation size - Reduced i/o contention during saving of caches 0.7.0 ===== Features -------- - Secondary indexes (indexes on column values) are now supported - Row size limit increased from 2GB to 2 billion columns. Rows are no longer read into memory during compaction. - Keyspace and ColumnFamily definitions may be added and modified live - Streaming data for repair or node movement no longer requires anticompaction step first - NetworkTopologyStrategy (formerly DatacenterShardStrategy) is ready for use, enabling ConsistencyLevel.DCQUORUM and DCQUORUMSYNC. See comments in `cassandra.yaml`. - Optional per-Column time-to-live field allows expiring data without having to issue explicit remove commands - `truncate` thrift method allows clearing an entire ColumnFamily at once - Hadoop OutputFormat and Streaming [non-jvm map/reduce via stdin/out] support - Up to 8x faster reads from row cache - A new ByteOrderedPartitioner supports bytes keys with arbitrary content, and orders keys by their byte value. This should be used in new deployments instead of OrderPreservingPartitioner.
- Optional round-robin scheduling between keyspaces for multitenant clusters - Dynamic endpoint snitch mitigates the impact of impaired nodes - New `IntegerType`, faster than LongType and allows integers of both less and more bits than Long's 64 - A revamped authentication system that decouples authorization and allows finer-grained control of resources. Upgrading --------- The Thrift API has changed in incompatible ways; see below, and refer to for a list of higher-level clients that have been updated to support the 0.7 API. The Cassandra inter-node protocol is incompatible with 0.6.x releases (and with 0.7 beta1), meaning you will have to bring your cluster down prior to upgrading: you cannot mix 0.6 and 0.7 nodes. The hints schema was changed from 0.6 to 0.7. Cassandra automatically snapshots and then truncates the hints column family as part of starting up 0.7 for the first time. Keyspace and ColumnFamily definitions are stored in the system keyspace, rather than the configuration file. The process to upgrade is: 1) run "nodetool drain" on _each_ 0.6 node. When drain finishes (log message "Node is drained" appears), stop the process. 2) Convert your storage-conf.xml to the new cassandra.yaml using "bin/config-converter". 3) Rename any of your keyspace or column family names that do not adhere to the '^\w+' regex convention. 4) Start up your cluster with the 0.7 version. 5) Initialize your Keyspace and ColumnFamily definitions using "bin/schematool <host> <jmxport> import". _You only need to do this to one node_. Thrift API ---------- - The Cassandra server now defaults to framed mode, rather than unframed. Unframed is obsolete and will be removed in the next major release. - The Cassandra Thrift interface file has been updated for Thrift 0.5. If you are compiling your own client code from the interface, you will need to upgrade the Thrift compiler to match. - Row keys are now bytes: keys stored by versions prior to 0.7.0 will be returned as UTF-8 encoded bytes. 
OrderPreservingPartitioner and CollatingOrderPreservingPartitioner continue to expect that keys contain UTF-8 encoded strings, but RandomPartitioner now works on any key data. - keyspace parameters have been replaced with the per-connection set_keyspace method. - The return type for login() is now AccessLevel. - The get_string_property() method has been removed. - The get_string_list_property() method has been removed. Configuration ------------- - Configuration file renamed to cassandra.yaml and to - PropertyFileSnitch configuration file renamed to - The ThriftAddress and ThriftPort directives have been renamed to RPCAddress and RPCPort respectively. - EndPointSnitch was renamed to RackInferringSnitch. A new SimpleSnitch has been added. - RackUnawareStrategy and RackAwareStrategy have been renamed to SimpleStrategy and OldNetworkTopologyStrategy, respectively. - RowWarningThresholdInMB replaced with in_memory_compaction_limit_in_mb - GCGraceSeconds is now per-ColumnFamily instead of global - Keyspace and column family names that do not conform to a '^\w+' regex are considered illegal. - Keyspace and column family definitions will need to be loaded via "bin/schematool <host> <jmxport> import". _You only need to do this to one node_. - In addition to an authenticator, an authority must be configured as well. Users of SimpleAuthenticator should use SimpleAuthority for this value (the default is AllowAllAuthority, which corresponds with AllowAllAuthenticator). - The format of has changed, see the sample configuration conf/ for documentation on the new format.
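Before converting a 0.6 configuration, it can help to list the keyspace and column family names that would be rejected by the '^\w+' naming rule. The Python sketch below is one way to do that. It makes two assumptions that the notes do not spell out: the whole name must consist of word characters, and "word character" means ASCII letters, digits and underscore. The function name and the sample names are illustrative only.

```python
import re
from typing import Iterable, List

# Assumption: the entire name must match, not just a prefix, and \w is
# limited to [A-Za-z0-9_] (re.ASCII), which may differ from the server's check.
_VALID_NAME = re.compile(r"\w+", re.ASCII)


def illegal_names(names: Iterable[str]) -> List[str]:
    """Return the names that would need renaming before the upgrade."""
    return [name for name in names if _VALID_NAME.fullmatch(name) is None]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    candidates = ["Keyspace1", "my-keyspace", "user data", "Standard_1"]
    print(illegal_names(candidates))  # ['my-keyspace', 'user data']
```

Running this against the names in storage-conf.xml gives the rename list for step 3 of the upgrade process.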
JMX --- - StreamingService moved from o.a.c.streaming to o.a.c.service - GMFD renamed to GOSSIP_STAGE - {Min,Mean,Max}RowCompactedSize renamed to {Min,Mean,Max}RowSize since it no longer has to wait until compaction to be computed Other ----- - If extending AbstractType, make sure you follow the singleton pattern followed by Cassandra core AbstractType classes: provide a public static final variable called 'instance'. 0.6.6 ===== Upgrading --------- - As part of the cache-saving feature, a third directory (along with data and commitlog) has been added to the config file. You will need to set and create this directory when restarting your node into 0.6.6. 0.6.1 ===== Upgrading --------- - We try to keep minor versions 100% compatible (data format, commitlog format, network format) within the major series, but we introduced a network-level incompatibility in 0.6.1. Thus, if you are upgrading from 0.6.0 to any higher version (0.6.1, 0.6.2, etc.) then you will need to restart your entire cluster with the new version, instead of being able to do a rolling restart. 0.6.0 ===== Features -------- - row caching: configure with the RowsCached attribute in ColumnFamily definition - Hadoop map/reduce support: see contrib/word_count for an example - experimental authentication support, described under Authenticator in storage.conf Configuration ------------- - MemtableSizeInMB has been replaced by MemtableThroughputInMB which triggers a memtable flush when the specified amount of data has been written, including overwrites. - MemtableObjectCountInMillions has been replaced by the MemtableOperationsInMillions directive which causes a memtable flush to occur after the specified number of operations. - Like MemtableSizeInMB, BinaryMemtableSizeInMB has been replaced by BinaryMemtableThroughputInMB. - Replication factor is now per-keyspace, rather than global.
- KeysCachedFraction is deprecated in favor of KeysCached - RowWarningThresholdInMB added, to warn before very large rows get big enough to threaten node stability Thrift API ---------- - removed deprecated get_key_range method - added batch_mutate method - deprecated multiget and batch_insert methods in favor of multiget_slice and batch_mutate, respectively - added ConsistencyLevel.ANY, for when you want write availability even when it may not be readable immediately. Unlike CL.ZERO, though, it will throw an exception if it cannot be written *somewhere*. JMX metrics ----------- - read and write statistics are reported as lifetime totals, instead of averages over the last minute. Averages since the last request are also available for convenience. - cache hit rate statistics are now available from JMX under org.apache.cassandra.db.Caches - compaction JMX metrics are moved to org.apache.cassandra.db.CompactionManager. PendingTasks is now a much better estimate of compactions remaining, and the progress of the current compaction has been added. - commitlog JMX metrics are moved to org.apache.cassandra.db.Commitlog - progress of data streaming during bootstrap, loadbalance, or other data migration, is available under org.apache.cassandra.streaming.StreamingService. See for details. Installation/Upgrade -------------------- - 0.6 network traffic is not compatible with earlier versions. You will need to shut down all your nodes at once, upgrade, then restart. 0.5.0 ===== 0. The commitlog format has changed (but sstable format has not). When upgrading from 0.4, empty the commitlog either by running bin/nodeprobe flush on each machine and waiting for the flush to finish, or simply remove the commitlog directory if you only have test data. (If more writes come in after the flush command, starting 0.5 will error out; if that happens, just go back to 0.4 and flush again.) The format changed twice: from 0.4 to beta1, and from beta2 to RC1.
.5 The gossip protocol has changed, meaning 0.5 nodes cannot coexist in a cluster of 0.4 nodes or vice versa; you must upgrade your whole cluster at the same time. 1. Bootstrap, move, load balancing, and active repair have been added. See When upgrading from 0.4, leave autobootstrap set to false for the first restart of your old nodes. 2. Performance improvements across the board, especially on the write path (over 100% improvement in throughput). 3. Configuration: - Added "comment" field to ColumnFamily definition. - Added MemtableFlushAfterMinutes, a global replacement for the old per-CF FlushPeriodInMinutes setting - Key cache settings 4. Thrift: - Added get_range_slice, deprecating get_key_range 0.4.2 ===== 1. Improve default garbage collector options significantly -- throughput will be 30% higher or more. 0.4.1 ===== 1. SnapshotBeforeCompaction configuration option allows snapshotting before each compaction, which allows rolling back to any version of the data. 0.4.0 ===== 1. On-disk data format has changed to allow billions of keys/rows per node instead of only millions. The new format is incompatible with 0.3; see 0.3 notes below for how to import data from a 0.3 install. 2. Cassandra now supports multiple keyspaces. Typically you will have one keyspace per application, allowing applications to be able to create and modify ColumnFamilies at will without worrying about collisions with others in the same cluster. 3. Many Thrift API changes and documentation. See 4. Removed the web interface in favor of JMX and bin/nodeprobe, which has significantly enhanced functionality. 5. Renamed configuration "<Table>" to "<Keyspace>". 6. Added commitlog fsync; see "<CommitLogSync>" in configuration. 0.3.0 ===== 1. With enough and large enough keys in a ColumnFamily, Cassandra will run out of memory trying to perform compactions (data file merges). 
The size of what is stored in memory is (S + 16) * (N + M) where S is the size of the key (usually 2 bytes per character), N is the number of keys and M is the map overhead (which can be estimated at around 32 bytes per key). So, if you have 10-character keys and 1GB of headroom in your heap space for compaction, you can expect to store about 17M keys before running into problems. See 2. Because fixing #1 requires a data file format change, 0.4 will not be binary-compatible with 0.3 data files. A client-side upgrade can be done relatively easily with the following algorithm:

    for key in old_client.get_key_range(everything):
        columns = old_client.get_slice or get_slice_super(key, all columns)
        new_client.batch_insert or batch_insert_super(key, columns)

The inner loop can be trivially parallelized for speed. 3. Commitlog does not fsync before reporting a write successful. Using blocking writes mitigates this to some degree, since all nodes that were part of the write quorum would have to fail before sync for data to be lost. See Additionally, row size (that is, all the data associated with a single key in a given ColumnFamily) is limited by available memory, because compaction deserializes each row before merging. See
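The client-side upgrade loop from item 2 above, including the note that the per-key work can be parallelized, could look roughly like the Python sketch below. The `InMemoryClient` class is a hypothetical stand-in used so the example runs on its own. Its method names mirror the pseudocode but its signatures are not the real Thrift API, which takes additional arguments such as the table and column family.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Optional

Columns = Dict[str, bytes]


class InMemoryClient:
    """Hypothetical stand-in for a client connection; not the Thrift API."""

    def __init__(self, rows: Optional[Dict[str, Columns]] = None) -> None:
        self.rows: Dict[str, Columns] = dict(rows or {})

    def get_key_range(self) -> List[str]:
        return list(self.rows)

    def get_slice(self, key: str) -> Columns:
        return dict(self.rows[key])

    def batch_insert(self, key: str, columns: Columns) -> None:
        self.rows.setdefault(key, {}).update(columns)


def migrate(old_client, new_client, workers: int = 4) -> int:
    """Copy every row from old_client to new_client; return rows copied."""

    def copy_row(key: str) -> str:
        # One read from the old cluster, one write to the new one.
        new_client.batch_insert(key, old_client.get_slice(key))
        return key

    keys = old_client.get_key_range()
    # Rows are independent, so the per-key copies can run concurrently.
    # With real network clients, each worker would likely need its own
    # connection; sharing one is only safe for this in-memory stand-in.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(1 for _ in pool.map(copy_row, keys))


if __name__ == "__main__":
    old = InMemoryClient({"k1": {"c1": b"v1"}, "k2": {"c2": b"v2"}})
    new = InMemoryClient()
    print(migrate(old, new), new.rows == old.rows)
```

Super column families would need the `get_slice_super` and `batch_insert_super` variants mentioned in the pseudocode, which this sketch leaves out.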
