11/1/2024

GitHub - datastax/zdm-proxy: An open-source component designed to seamlessly handle the real-time client application activity while a migration is in progress.

This resource is based on an article originally published here.

The ZDM Proxy is a client-server component written in Go that enables users to migrate from one Apache Cassandra® cluster to another (which may be an Astra cluster) with zero downtime and without requiring code changes in the client application.

The only change to the client is pointing it to the proxy rather than directly to the original cluster (Origin). In turn, the proxy connects to both Origin and Target clusters.

By default, the proxy forwards read requests only to the Origin cluster, though you can optionally configure it to forward reads to both clusters asynchronously. Writes are always sent to both clusters concurrently.
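The routing rules described above can be modeled in a few lines. The following Python sketch is purely illustrative and is not the proxy's actual implementation (which is written in Go). The function name `route_request` is hypothetical, and the read mode value `DUAL_ASYNC_ON_SECONDARY` is an assumption not stated in this article; only `PRIMARY_ONLY` and `ORIGIN` appear in the variable list further down.

```python
CLUSTERS = ("ORIGIN", "TARGET")


def route_request(is_write, primary="ORIGIN", read_mode="PRIMARY_ONLY"):
    """Return (direct, asynchronous) cluster tuples for one request.

    Illustrative model only. "direct" clusters receive the request as part
    of the normal request path; "asynchronous" clusters receive an extra
    copy of a read when dual reads are enabled.
    """
    if primary not in CLUSTERS:
        raise ValueError(f"unknown primary cluster: {primary}")
    secondary = "TARGET" if primary == "ORIGIN" else "ORIGIN"

    if is_write:
        # Writes always go to both clusters concurrently.
        return CLUSTERS, ()
    if read_mode == "PRIMARY_ONLY":
        # Default: reads go only to the primary cluster (Origin by default).
        return (primary,), ()
    if read_mode == "DUAL_ASYNC_ON_SECONDARY":  # assumed name for dual reads
        # Optional: reads are also sent asynchronously to the other cluster.
        return (primary,), (secondary,)
    raise ValueError(f"unknown read mode: {read_mode}")


if __name__ == "__main__":
    print(route_request(is_write=True))
    print(route_request(is_write=False))
    print(route_request(is_write=False, read_mode="DUAL_ASYNC_ON_SECONDARY"))
```

The model keeps writes independent of the read mode, which mirrors the statement that writes always reach both clusters regardless of how reads are configured.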

An overview of the proxy architecture and logical flow can be viewed here.

To run the proxy, you need to set some environment variables or pass a reference to a YAML configuration file. Below is a list of the most important variables along with their default values; the required ones are marked with a comment. In the YAML configuration file, variable names drop the ZDM_ prefix and are lower-cased.

ZDM_ORIGIN_CONTACT_POINTS=10.0.0.1  #required
ZDM_ORIGIN_USERNAME=cassandra       #required
ZDM_ORIGIN_PASSWORD=cassandra       #required
ZDM_ORIGIN_PORT=9042
ZDM_TARGET_CONTACT_POINTS=10.0.0.2  #required
ZDM_TARGET_USERNAME=cassandra       #required
ZDM_TARGET_PASSWORD=cassandra       #required
ZDM_TARGET_PORT=9042
ZDM_PROXY_LISTEN_PORT=14002
ZDM_PROXY_LISTEN_ADDRESS=127.0.0.1
ZDM_PRIMARY_CLUSTER=ORIGIN
ZDM_READ_MODE=PRIMARY_ONLY
ZDM_LOG_LEVEL=INFO

The environment variables (or the YAML configuration file) must be set for the proxy to work.
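As a small aid, the naming rule and the required variables listed above can be checked before launching the proxy. The Python sketch below is a hypothetical helper, not part of the ZDM Proxy distribution: `yaml_key` and `missing_required` are names invented here, and the required set is copied from the lines marked `#required` in the list above.

```python
import os

# Variables marked "#required" in the list above.
REQUIRED_VARS = (
    "ZDM_ORIGIN_CONTACT_POINTS",
    "ZDM_ORIGIN_USERNAME",
    "ZDM_ORIGIN_PASSWORD",
    "ZDM_TARGET_CONTACT_POINTS",
    "ZDM_TARGET_USERNAME",
    "ZDM_TARGET_PASSWORD",
)


def yaml_key(env_name):
    """Map an environment variable name to its YAML key.

    Applies the rule stated above: drop the ZDM_ prefix and lower-case.
    """
    prefix = "ZDM_"
    if not env_name.startswith(prefix):
        raise ValueError(f"not a ZDM variable: {env_name}")
    return env_name[len(prefix):].lower()


def missing_required(env=None):
    """Return the required variables that are unset or empty in env."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    for name in missing_required():
        print(f"missing: {name} (YAML key: {yaml_key(name)})")
```

Running the script in the shell that will start the proxy prints one line per required variable that is still unset.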

To get started quickly in your local environment, grab a copy of the binary distribution from the Releases page. For the recommended installation in a production environment, check the Production Setup section below.

Now, suppose you have two clusters running at 10.0.0.1 and 10.0.0.2 with cassandra/cassandra credentials and the same key-value schema. You can start the proxy and connect it to these clusters like this:

$ export ZDM_ORIGIN_CONTACT_POINTS=10.0.0.1
$ export ZDM_TARGET_CONTACT_POINTS=10.0.0.2
$ export ZDM_ORIGIN_USERNAME=cassandra
$ export ZDM_ORIGIN_PASSWORD=cassandra
$ export ZDM_TARGET_USERNAME=cassandra
$ export ZDM_TARGET_PASSWORD=cassandra
$ ./zdm-proxy-v2.0.0 # run the ZDM proxy executable

If you prefer to use a YAML configuration file, an equivalent setup would look like this:

$ cat zdm-config.yml
origin_contact_points: 10.0.0.1
target_contact_points: 10.0.0.2
origin_username: cassandra
origin_password: cassandra
target_username: cassandra
target_password: cassandra
$ ./zdm-proxy-v2.0.0 --config=./zdm-config.yml # run the ZDM proxy executable

At this point, you should be able to connect a client such as CQLSH to the proxy and write data to it. The proxy will take care of forwarding the requests to both clusters concurrently.

$ cqlsh <proxy-ip-address> 14002 # this is the proxy's default listen port

From the CQLSH prompt:

cqlsh> INSERT INTO test.keyvalue (key, value) VALUES (1, 'ABC');
cqlsh> INSERT INTO test.keyvalue (key, value) VALUES (2, 'DEF');
cqlsh> SELECT * FROM test.keyvalue;
cqlsh> UPDATE test.keyvalue SET value='GYEKJF' WHERE key = 1;
cqlsh> DELETE FROM test.keyvalue WHERE key = 2;

You can confirm that the data is stored in both clusters by querying them directly in other cqlsh sessions.
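If you copy the query results from each cluster into a script, the comparison itself is easy to automate. The Python sketch below is an illustration only: `diff_rows` is a hypothetical helper, and the snapshots are typed in by hand rather than fetched with a driver. It assumes the `test.keyvalue` rows are represented as `{key: value}` dictionaries.

```python
def diff_rows(origin_rows, target_rows):
    """Compare two {key: value} snapshots of the same table.

    Returns a dict listing keys present on only one side and keys whose
    values differ between the two clusters.
    """
    origin_keys = origin_rows.keys()
    target_keys = target_rows.keys()
    return {
        "only_in_origin": sorted(origin_keys - target_keys),
        "only_in_target": sorted(target_keys - origin_keys),
        "mismatched": sorted(
            key
            for key in origin_keys & target_keys
            if origin_rows[key] != target_rows[key]
        ),
    }


if __name__ == "__main__":
    # State expected after the statements above: key 1 was updated,
    # key 2 was deleted.
    origin = {1: "GYEKJF"}
    target = {1: "GYEKJF"}
    print(diff_rows(origin, target))
```

When both clusters received every write, all three lists in the result are empty.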

Note: For the moment, the keyspace must be specified when accessing a table, even after using USE <keyspace>.

If you don't have test clusters readily available to try with, check the alternative method with docker-compose in the Contributor's guide, which will set up all the dependencies, including two test clusters and a proxy instance, in a containerized sandbox environment.

ZDM Proxy supports protocol versions v2, v3, v4, DSE_V1 and DSE_V2.

It technically doesn't support v5, but handles protocol negotiation so that the client application properly downgrades the protocol version to v4 if v5 is requested. This means that any client application using a recent driver that supports protocol version v5 can be migrated using the ZDM Proxy (as long as it does not use v5-specific functionality).
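The downgrade behavior can be pictured with a simplified model in which the client starts at its highest protocol version and steps down until the proxy accepts one. The Python sketch below is an assumption-laden illustration, not the actual negotiation code of any driver or of the proxy: `proxy_accepts` and `client_negotiate` are hypothetical names, and the DSE_V1 and DSE_V2 versions are left out for brevity.

```python
# Native protocol versions accepted by the proxy, per the list above
# (DSE_V1 and DSE_V2 omitted in this simplified model).
PROXY_SUPPORTED = frozenset({2, 3, 4})


def proxy_accepts(version):
    """Return True if the proxy would accept this protocol version."""
    return version in PROXY_SUPPORTED


def client_negotiate(max_version, accepts=proxy_accepts):
    """Try versions from max_version downward; return the first accepted."""
    for version in range(max_version, 0, -1):
        if accepts(version):
            return version
    raise RuntimeError("no protocol version in common with the proxy")


if __name__ == "__main__":
    # A driver that prefers v5 ends up on v4 when talking to the proxy.
    print(client_negotiate(5))
```

In this model a v5-capable client settles on v4, while a client limited to v3 stays on v3.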

ZDM Proxy requires the origin and target clusters to have at least one protocol version in common. It is therefore not feasible to configure Apache Cassandra 2.0 as origin and 3.x / 4.x as target. The table below shows the protocol versions supported by each Apache Cassandra version:

Apache Cassandra    Protocol Versions
2.0                 V2
2.1                 V2, V3
2.2                 V2, V3, V4
3.x                 V3, V4
4.x                 V3, V4, V5
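The common-version requirement amounts to a set intersection over the table above. The Python sketch below encodes the table and the proxy's supported versions (v2, v3, v4) and returns the versions usable for a given origin and target pair. The names `CASSANDRA_PROTOCOLS` and `usable_protocols` are hypothetical, and DSE clusters are not covered.

```python
# Protocol versions per Apache Cassandra release, from the table above.
CASSANDRA_PROTOCOLS = {
    "2.0": {2},
    "2.1": {2, 3},
    "2.2": {2, 3, 4},
    "3.x": {3, 4},
    "4.x": {3, 4, 5},
}

# Native protocol versions supported by the proxy (v5 is not supported).
PROXY_PROTOCOLS = {2, 3, 4}


def usable_protocols(origin, target):
    """Return the sorted protocol versions shared by origin, target and proxy."""
    shared = CASSANDRA_PROTOCOLS[origin] & CASSANDRA_PROTOCOLS[target]
    return sorted(shared & PROXY_PROTOCOLS)


if __name__ == "__main__":
    for origin, target in [("2.0", "3.x"), ("2.1", "4.x"), ("3.x", "4.x")]:
        versions = usable_protocols(origin, target)
        status = ", ".join(f"v{v}" for v in versions) or "none (not feasible)"
        print(f"{origin} -> {target}: {status}")
```

An empty result, as for 2.0 with 3.x, corresponds to the unsupported combination mentioned above.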

⚠️ Thrift is not supported by ZDM Proxy. If you are using a very old driver or cluster version that only supports Thrift then you need to change your client application to use CQL and potentially upgrade your cluster before starting the migration process.


In practice this means that ZDM Proxy supports the following cluster versions (as Origin and / or Target):

  • Apache Cassandra 2.0 and later, up to and including Apache Cassandra 4.x (although both clusters must support a common protocol version, as mentioned above).
  • DataStax Enterprise 4.8+. DataStax Enterprise 4.6 and 4.7 support will be introduced when protocol version v2 is supported.
  • DataStax Astra DB (both Serverless and Classic)

The setup described above is only for testing in a local environment. It is NOT recommended for a production installation, where the minimum number of proxy instances is 3.

For a comprehensive guide to the recommended production setup, check the documentation available at DataStax Migration.

There you'll find information about an Ansible-based tool that automates most of the process.

For information on the packaged dependencies of the Zero Downtime Migration (ZDM) Proxy and their licenses, check out our open source report.

For frequently asked questions, please refer to our separate FAQ page.
