Apache Cassandra Lunch #68: DataStax Apache Kafka Connector

6/25/2022


Apache Cassandra Lunch #68: DataStax Apache Kafka Connector - Business Platform Team

This resource is based on an article originally published here.

In Apache Cassandra Lunch #68: DataStax Apache Kafka Connector, we introduce the DataStax Apache Kafka Connector and discuss how we can use it to connect Apache Kafka and Cassandra. The live recording of Cassandra Lunch, which includes a more in-depth discussion and a demo, is embedded below in case you were not able to attend live. If you would like to attend Apache Cassandra Lunch live, it is hosted every Wednesday at 12 PM EST. Register here now!

In the video recording embedded below, we cover basic information about the connector and the architecture behind it, then walk through a simple Katacoda example from DataStax that shows how to use the connector. We also discuss how we use the DataStax Apache Kafka Connector in our Cassandra.Realtime repo, so be sure to check out the embedded video below!

The DataStax Apache Kafka Connector is open source software that works with the Kafka Connect framework. It synchronizes records from a Kafka topic with table rows in the following supported databases: DataStax Astra cloud databases, DataStax Enterprise (DSE) 4.7 and later, and open source Apache Cassandra® 2.1 and later. The connector is deployed on the Kafka Connect worker nodes and runs within the worker JVM. Workers running one or more instances of the connector pull messages from Kafka topics and write them to a database table using the DataStax Enterprise Java driver.
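As a rough illustration of how a connector instance could be registered on a Kafka Connect worker, the sketch below builds the JSON payload that would be sent to the worker's REST API (`POST /connectors`). The connector class and setting names reflect our reading of the DataStax connector documentation and should be checked against the version you deploy. The connector name, topic, keyspace, table, contact point, and datacenter values are hypothetical.

```python
import json


def build_connector_payload(name, topic, keyspace, table, mapping):
    """Build a Kafka Connect REST payload for one connector instance.

    `mapping` maps table column names to Kafka record fields
    (for example "key" or "value.customer").
    """
    column_map = ", ".join(f"{col}={field}" for col, field in mapping.items())
    config = {
        # Assumed connector class; verify against your connector version.
        "connector.class": "com.datastax.oss.kafka.sink.CassandraSinkConnector",
        "tasks.max": "1",
        "topics": topic,
        # Hypothetical cluster coordinates.
        "contactPoints": "cassandra-host",
        "loadBalancing.localDc": "datacenter1",
        # One mapping entry per topic/keyspace/table combination.
        f"topic.{topic}.{keyspace}.{table}.mapping": column_map,
    }
    return {"name": name, "config": config}


payload = build_connector_payload(
    name="orders-sink",
    topic="orders_topic",
    keyspace="demo",
    table="orders",
    mapping={
        "order_id": "key",
        "customer": "value.customer",
        "total": "value.total",
    },
)
print(json.dumps(payload, indent=2))
```

The payload is only constructed here, not sent. In a real deployment it would be posted to a running worker, which then starts the connector tasks inside its own JVM.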

- Each instance of the DataStax Apache Kafka Connector creates a single session with the cluster.
  - A single connector instance can process records from multiple Kafka topics and write to several database tables.
- Data is pulled from the Kafka topic and written to the mapped table using a CQL batch that contains multiple write statements.
- A map specification binds a Kafka topic field to a table column.
  - Fields that are omitted from the specification are not included in the write request.
  - Fields with null values are written to the database as UNSET (see nullToUnset).
  - To ensure proper ordering, all records are written using the Kafka record timestamp.
- Use multiple connectors when different global connect settings are required for different scenarios, such as writing to different clusters or datacenters.
- The DataStax Connector tasks store the offsets in config.offset.topic.
  - In the event of a failure, the DataStax Connector task resumes reading from the last recorded location.
- Ingest data from Kafka topics with records in the following data structures:
  - Primitive type values, such as integer or string
  - Complex field values in record types:
    - JSON formatted string
    - Kafka Struct
    - Avro
- Built-in SSL, LDAP/Active Directory, and Kerberos integration

More Features: https://docs.datastax.com/en/kafka/doc/kafka/kafkaFeatures.html
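The mapping behavior described in the list above can be sketched with a small, self-contained simulation. This is not the connector's code. It is a toy model, under stated assumptions, of how a map specification selects fields, how null values become UNSET, and how the Kafka record timestamp is carried into the write. The `UNSET` sentinel, the record layout, and the function name are all hypothetical, and the millisecond-to-microsecond conversion assumes Kafka timestamps are in milliseconds and CQL write timestamps are in microseconds.

```python
# Stand-in for an "unset" marker; not an object from any real driver.
UNSET = object()


def build_write(record, mapping, null_to_unset=True):
    """Return (columns, write_timestamp_micros) for one Kafka record.

    `mapping` maps a table column to "key" or "value.<field>".
    Fields that do not appear in `mapping` are never written.
    """
    columns = {}
    for column, path in mapping.items():
        source, _, field = path.partition(".")
        value = record[source].get(field) if field else record[source]
        if value is None and null_to_unset:
            # Leave the column untouched instead of writing a null.
            value = UNSET
        columns[column] = value
    # Assumption: Kafka timestamps are milliseconds, CQL uses microseconds.
    return columns, record["timestamp_ms"] * 1000


record = {
    "key": "o-1",
    "value": {"customer": "ada", "total": None, "note": "gift"},
    "timestamp_ms": 1_650_000_000_000,
}
mapping = {
    "order_id": "key",
    "customer": "value.customer",
    "total": "value.total",
}

columns, write_ts = build_write(record, mapping)
print(sorted(columns))  # "note" is absent because it is not mapped
print(columns["total"] is UNSET)
print(write_ts)
```

Treating a null as UNSET rather than as an explicit null matters in Cassandra because writing a null creates a tombstone, while an unset value leaves the existing column data alone.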

The demo portion of Apache Cassandra Lunch #68: DataStax Apache Kafka Connector is split into two parts, as mentioned above. In the first part, we cover a DataStax Katacoda scenario in which we create a Kafka topic, configure and start a Kafka Connect worker, download and configure the DataStax Apache Kafka Connector, and push data from the Kafka topic to a Cassandra instance. In the second part, we take a look at Cassandra.Realtime and discuss how that walkthrough uses the same basics we covered in the Katacoda scenario. If you want a more in-depth discussion and video demo, be sure to watch the embedded YouTube video below!
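For readers who want to try the worker configuration step outside Katacoda, the sketch below renders a standalone Kafka Connect worker properties file. The property names are standard Kafka Connect worker settings. The broker address, file paths, and converter choices are assumptions for illustration, not values taken from the scenario.

```python
def render_worker_properties(bootstrap_servers, plugin_path, offsets_file):
    """Render a minimal standalone Kafka Connect worker configuration."""
    props = {
        "bootstrap.servers": bootstrap_servers,
        # Assumed converters: string keys and schemaless JSON values.
        "key.converter": "org.apache.kafka.connect.storage.StringConverter",
        "value.converter": "org.apache.kafka.connect.json.JsonConverter",
        "value.converter.schemas.enable": "false",
        # Standalone workers keep source offsets in a local file.
        "offset.storage.file.filename": offsets_file,
        # Directory where the connector jar has been unpacked.
        "plugin.path": plugin_path,
    }
    return "\n".join(f"{key}={value}" for key, value in props.items()) + "\n"


worker_properties = render_worker_properties(
    bootstrap_servers="localhost:9092",
    plugin_path="/opt/kafka/plugins",
    offsets_file="/tmp/connect.offsets",
)
print(worker_properties)
```

A file like this would typically be passed to Kafka's `connect-standalone.sh` script together with a connector properties file. The exact steps used in the Katacoda scenario may differ.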

Resources

Cassandra.Link

Cassandra.Link is a knowledge base that we created for all things Apache Cassandra. Our goal with Cassandra.Link was not only to fill the gap left by Planet Cassandra but also to bring the Cassandra community together. Feel free to reach out if you wish to collaborate with us on this project in any capacity.

We are a technology company that specializes in building business platforms. If you have any questions about the tools discussed in this post or about any of our services, feel free to send us an email!
