December 9th, 2013

 How to Do an Upsert in Cassandra blog posting was created by Ike Walker. To view more postings by Ike, check out the Mechanics of Flite blog here.


In a recent post I talked about how to do an upsert in MySQL. Doing an upsert in Cassandra 1.2 using CQL 3 is more straightforward. Why? Because in Cassandra every INSERT (and every UPDATE) is actually an upsert. Cassandra is a distributed database that avoids reading before a write, so an INSERT or UPDATE sets the column values you specify regardless of whether the row already exists. This means inserts can update existing rows, and updates can create new rows. It also means it’s easy to accidentally overwrite existing data, so keep that in mind.

Read on for some examples that illustrate Cassandra’s upsert behavior. For the sake of consistency I’ll use the same row from my related MySQL post.

First I create the actor table in Cassandra:

Now I INSERT my test row and read it back to verify the data:

I could update the last_name and last_update values in that row using a CQL UPDATE, but since this post is about upserts I’ll do it with an INSERT instead:

Now that I’ve covered updating an existing row using an INSERT, how about inserting a non-existent row using UPDATE? I’ll delete the row first and verify it’s gone, then do the UPDATE and SELECT to verify the behavior:

If the row already exists I can update specific columns and leave others alone by only naming the columns I want to update. In this case any columns that I do not specify will keep their existing values. For example: