Last week at the Cassandra Summit I gave a talk with Blake Eggleston on diagnosing performance problems in production. We spoke to about 300 people for about 25 minutes followed by a healthy Q&A session. I’ve expanded on our presentation to include a few extra tools, screenshots, and more clarity on our talking points.
There’s finally a lot of material available for someone looking to get started with Cassandra. There are several introductory videos on YouTube by both me and Patrick McFadin, as well as videos on time series data modeling. I’ve posted videos for my own project, cqlengine (intro & advanced), and there are plenty more on the PlanetCassandra channel. There’s also a boatload of getting started material on PlanetCassandra written by Rebecca Mills.
This is the guide for what to do once you’ve built your application and you’re ready to put Cassandra in production. Whether you’ve been in operations for years or you are first getting started, this post should give you a good sense of what you need in order to address any issues you encounter.
The original slides are available via Slideshare.
Before you even put your cluster under load, there are a few things you can set up that will help you diagnose problems if they pop up.
- OpsCenter
This is the standard management tool for Cassandra clusters, and I recommend it for every cluster. While not open source, the community version is free. It gives you a high level overview of your cluster and provides historical metrics for the most important information. It comes with a variety of graphs that handle about 90% of what you need on a day to day basis.
- Metrics plugins
Cassandra has included the metrics library since version 1.1, and every release tracks more metrics with it. Why is this awesome? In previous versions of Cassandra, the standard way to access what was going on in the internals was over JMX, a very Java centric communications protocol. That meant writing a Java agent, or setting up mx4j or Jolokia, then digging through JMX, which can be a little hairy. Not everyone wants to do this much work.
The metrics library allows you to tell Cassandra to report its internal, table level metrics out to a whole slew of different places: CSV, Ganglia, Graphite, and STDOUT. It’s also pluggable, so you can push metrics anywhere you want.
- Munin, Nagios, Icinga (or other system metrics monitoring)
I’ve found these tools to be incredibly useful at graphing system metrics as well as custom application metrics. There are many options. If you’re already familiar with one tool, you can probably keep using it. There are hosted solutions as well (Server Density, Datadog, etc.).
- Statsd, Graphite, Grafana
Your application should be tracking internal metrics: query timings, frequently called functions, etc. These tools let you get a profile of what’s going on with your code in production. Statsd collects raw stats and aggregates them together, then kicks them to Graphite. Grafana is an optional (better) front end to Graphite.
There was a great post by Etsy, Measure Anything, Measure Everything, that introduced statsd and outlined its usage with Graphite.
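To make the idea of timing queries concrete, here is a minimal sketch of sending a timer to statsd from Python. It talks the plain statsd UDP line format (`name:value|ms`) directly rather than using a client library. The address, the metric name, and the helper names `format_timing` and `timed` are assumptions made up for illustration.

```python
import socket
import time
from contextlib import contextmanager

# Assumption: a statsd daemon listening on its usual UDP port on localhost.
STATSD_ADDR = ("127.0.0.1", 8125)


def format_timing(name, millis):
    """Build a statsd timer packet of the form '<name>:<ms>|ms'."""
    return "{}:{}|ms".format(name, int(round(millis)))


@contextmanager
def timed(name, sock=None, addr=STATSD_ADDR):
    """Time the wrapped block and send the duration to statsd over UDP."""
    own_sock = sock is None
    if own_sock:
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    start = time.monotonic()
    try:
        yield
    finally:
        elapsed_ms = (time.monotonic() - start) * 1000.0
        # UDP is fire-and-forget, so the application does not wait on statsd.
        sock.sendto(format_timing(name, elapsed_ms).encode("ascii"), addr)
        if own_sock:
            sock.close()
```

Wrapping a query call in `with timed("app.query.get_user"):` (a hypothetical metric name) would then produce one timer sample per call, which statsd aggregates before forwarding to Graphite.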
We didn’t mention Logstash in our presentation, but we’ve found it to be incredibly useful in correlating application issues with other failures. This is useful for application logging aggregation. If you don’t want to host your own log analysis tool, there are hosted services for this as well.
There’s a bunch of system tools that are useful if you’re logged onto a machine and want to see real time information.
iostat is useful for seeing what’s happening with each disk on your machine. If you’re hitting I/O issues, you’ll see it here. Specifically, if you see high read & write rates along with a big avgqu-sz (disk queue) or a high svctm (service time), there’s a good chance you’re bottlenecked on your disk. You either want to use more disks or faster disks. Cassandra loves SSDs.
htop is a better version of top. It shows load, running processes, memory usage, and a bunch of other information at a quick glance.
- iftop & netstat
iftop is like top, but shows you active connections and the transfer rates between your server and whoever is at the other end.
Netstat is more of a networking swiss army knife. You can see network connections, routing tables, interface statistics, and a variety of other network information.
I prefer to use dstat over iostat now since it includes all of its functionality and much of the functionality of other tools as well.
strace is useful when you want to know what system calls are happening for a given process.
pcstat, written by Al Tobey, allows you to examine a bunch of files and quickly determine how much of each file is in the buffer cache. If you’re trying to figure out why table access is slow, this tool can tell you if your data is in cache already or if you have to go out to disk. Here’s a good read to get familiar with buffer cache. Check out the repo.
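As a rough illustration of the iostat check described earlier, the sketch below scans extended iostat output (`iostat -x`) and flags devices with a large queue size. The function name `busy_disks`, the threshold, and the sample text are all made up for illustration. Column names vary between sysstat versions, so the column to inspect is a parameter rather than hard-coded.

```python
def busy_disks(iostat_output, column="avgqu-sz", threshold=1.0):
    """Return {device: value} for devices whose `column` exceeds `threshold`."""
    header = None
    flagged = {}
    for line in iostat_output.splitlines():
        fields = line.split()
        if not fields:
            header = None  # a blank line ends the current device table
            continue
        if fields[0].startswith("Device"):
            header = fields
            continue
        # Skip anything that is not a row of the device table we understand.
        if header is None or column not in header or len(fields) != len(header):
            continue
        value = float(fields[header.index(column)])
        if value > threshold:
            flagged[fields[0]] = value
    return flagged


# Hypothetical sample, hand-written for this example (not real measurements).
SAMPLE_IOSTAT = "\n".join([
    "Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util",
    "sda 0.00 1.20 350.00 120.00 14000.00 4800.00 80.00 12.50 26.60 2.10 98.70",
    "sdb 0.00 0.10 2.00 1.00 16.00 8.00 16.00 0.02 0.50 0.40 0.12",
])

print(busy_disks(SAMPLE_IOSTAT))
```

On the sample above only `sda` is flagged, since its queue size is well over the threshold while `sdb` is nearly idle.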
There’s a few issues that are easy to run into that I’d consider “gotchas”, things that come up often enough that they’re worth mentioning.
An important design decision in Cassandra is that it uses last write wins when there are two inserts, updates, or deletes to a cell. To determine the last update, Cassandra uses the system clock (or the client can specify the time explicitly). If server times are different, the last write may not actually win. Instead, the winner will be the write whose timestamp is skewed furthest into the future.
To address this issue, always make sure your clocks are synced. ntpd will constantly correct for drift, while ntpdate performs a hard adjustment to your system clock. Use ntpdate if your clock is significantly off, and ntpd to keep it at the correct time.
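The effect of clock skew on last write wins can be shown with a toy model. This is only a sketch of the idea, not Cassandra’s actual reconciliation code: it compares timestamps alone and ignores how ties are broken. The function names and the numbers are made up for illustration.

```python
def last_write_wins(writes):
    """Return the value of the write with the highest timestamp.

    Each write is a (timestamp_micros, value) tuple.
    """
    return max(writes, key=lambda w: w[0])[1]


def stamp(true_time_micros, clock_skew_micros):
    """Timestamp a node would assign if its clock were off by the given skew."""
    return true_time_micros + clock_skew_micros


# Node A's clock runs 5 seconds fast; node B's clock is accurate.
first = (stamp(1000000, 5000000), "written first, via node A")
second = (stamp(2000000, 0), "written second, via node B")

winner = last_write_wins([first, second])
print(winner)
```

Even though the write through node B happened a second later in real time, the write through node A carries the larger timestamp, so it is the one that survives.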
Disk space not reclaimed
If you add new nodes to a cluster, each replica is responsible for less data. That data is streamed to the new nodes. However, it is not removed from the old nodes. If you’re adding new nodes because you’re running low on disk space, this is extremely important. You are required to run nodetool cleanup in order to reclaim that disk space. This is a good idea any time you change your database topology.
Issues adding nodes, or running repairs
There are two common problems that come up with repair. The first is that repairs take forever in 2.0. This is solved in 2.1 which uses an incremental repair, and does not repair data which has already been repaired. The second issue relates to trying to repair (or add nodes) to a cluster when the versions do not match. It is, in general, not a good idea (yet) to stream data between servers which are of different versions. It will appear to have started, but will just hang around doing nothing.
Cassandra comes with several tools to help diagnose performance problems. They are available via nodetool, Cassandra’s multipurpose administration tool.
Compaction is the process of merging SSTables together. It reduces the number of seeks required to return a result. It’s a necessary part of Cassandra. If not configured correctly, it can be problematic. You can limit the I/O used by compaction by using
There are two types of compaction available out of the box. Size Tiered is the default and is great for write heavy workloads. Leveled compaction is good for read & update heavy workloads, but since it uses much more I/O, it’s recommended only if you’re on SSD. I recommend reading through the documentation to understand more about which is right for your workload.
Cfstats and Histograms
Histograms let you quickly understand at both a high level and table level what your performance looks like on a single node in your cluster. The first histogram, proxyhistograms, gives you a quick top level view of all your tables on a node. This includes network latency. Histogram output has changed between versions to be more user friendly. The screenshot below is from Cassandra 2.1.
If you’d like to find out if you’ve got a performance problem isolated to a particular table, I suggest first running nodetool cfstats on a keyspace. You’ll be able to scan the list of tables and see if there are any abnormalities. You’ll be able to quickly tell which tables are queried the most (both reads and writes).
nodetool cfhistograms lets you identify performance problems with a single table on a single node. The statistics are more easily read in Cassandra 2.1.
If you’ve narrowed down your problem to a particular table, you can start to trace the queries that you execute. If you’re coming from something like MySQL, you’re used to the command explain, which tells you in advance what the query plan is for a given query. Tracing takes a different approach. Instead of showing a query plan, query tracing keeps track of the events in the system when the query actually executes. Here’s an example where we’ve created a whole bunch of tombstones on a partition. Even on an SSD you still want to avoid a lot of tombstones – it’s disk, CPU, and memory intensive.
JVM Garbage Collection
The JVM gets a reputation for being a bit of a beast. It’s a really impressive feat of engineering, but it shouldn’t be regarded as black magic. I strongly recommend reading through Blake Eggleston’s post on the JVM. It’s well written and does a great job of explaining things (much better than I would here).
OK – we’ve got all these tools under our belt. Now we can start to narrow down the problem.
- Are you seeing weird consistency issues, even on consistency level ALL?
It’s possible you’re dealing with a clock sync issue. If you’re sending queries really close to one another, they might also be getting the same millisecond level timestamp due to an async race condition in your code. If you’re sending lots of writes at the same time to the same row, you may have a problem in your application. Try to rethink your data model to avoid this.
- Has query performance dropped? Are you bottlenecked on disk, network, CPU, memory? Use the tools above to figure out your bottleneck. Did the number of queries to your cluster increase? Are you seeing longer than normal garbage collection times? OpsCenter has historical graphs that are useful here. Is there a single table affected, or every table? Use histograms and cfstats to dig into it.
- Are nodes going up and down? Use a combination of OpsCenter and your system metrics to figure out which node it is. If it’s the same node, start investigating why. Is there a hot partition? Is it doing a lot of garbage collection? Is your application opening more connections than before? You should have system metrics that show these trends over time. Maybe you just have additional load on the system – it may be necessary to add new nodes. Don’t forget to run cleanup.
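For the case above where writes sent very close together end up with the same timestamp, one possible mitigation is to have the client hand out strictly increasing timestamps itself. The sketch below is an assumption-laden illustration, not a feature of any driver: the class name `MonotonicTimestamps` is made up, and it only guarantees ordering within a single process. Writes from different clients can still collide.

```python
import threading
import time


class MonotonicTimestamps(object):
    """Hand out strictly increasing microsecond timestamps within one process."""

    def __init__(self, clock=time.time):
        self._clock = clock  # injectable so the behaviour can be tested
        self._last = 0
        self._lock = threading.Lock()

    def next(self):
        with self._lock:
            now = int(self._clock() * 1000000)
            # If the clock has not advanced (or went backwards),
            # bump the previous value by one microsecond instead.
            self._last = max(now, self._last + 1)
            return self._last
```

A value from `next()` could then be supplied explicitly with a write, for example through CQL’s `USING TIMESTAMP` clause, so that two updates issued back to back never share a timestamp. Rethinking the data model, as suggested above, is still the better long-term fix.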
This started out as a small recap but has evolved into much more than that. The tools above have helped me with a wide variety of problems, not just Cassandra ones. If you follow the above recommendations you should be in a great spot to diagnose most problems that come your way.
You can find me on Twitter for any comments or suggestions.
As one of the fastest growing online and mobile services in the world, Spotify delivers streaming music in
real time to over 40 million active users and growing, without interruption. To achieve the level of service
demanded by its fast growing userbase, Spotify needed a database technology that could keep up with its
growth without performance or availability issues.
Spotify initially started out as a PostgreSQL shop, but with skyrocketing popularity, Spotify realized that a
relational system couldn’t keep up with their performance, scalability and availability requirements. What
Spotify needed was a scalable solution that was highly available and could support multiple data centers.
“After we had scaled up to one or two million users we started to experience some scalability problems with
certain services,” said Axel Liljencrantz, Backend Engineer at Spotify. “Once you hit multiple data centers,
streaming replication in PostgreSQL doesn’t really work that well for high write volumes and so on because
of its limiting architecture.”
Last week at the Cassandra Summit in San Francisco we announced the 2014-2015 DataStax Cassandra MVP roster. The individuals who made it onto the roster go above and beyond to share their knowledge and expertise to help the Cassandra community. Simply put, they are the BEST. They educate others and assist the community by speaking at meetups and conferences, answering questions on StackOverflow, flagging Jira tickets, etc.
This year we are very excited to be welcoming some new faces as well as seeing some old friends on the list. As a part of this year’s program, MVPs will get exclusive swag, be promoted as thought leaders within the community, get notified of various speaking opportunities (webinars, conferences, etc.), participate in an MVP day, and more!
During the Cassandra Summit I was able to weed through the hundreds of people and meet a lot of the MVPs. It was great meeting (almost) everyone in person and putting a face to the name!
For a complete list of the MVPs, you can go here. Congratulations to all the 2014-2015 MVPs and a big thank you for all the work you do within the community. The Cassandra community wouldn’t be where it is without members like you.
Julien Anguenot (@anguenot) is an open-source advocate and veteran Python and Java developer.
He is now serving as the director of software engineering at iland internet solutions where he is leading, on the one end, the development of a Java EE distributed platform running on top of Cassandra, connecting VMware and OpenStack services across multiple data-centers and on the other end, the customer-facing iland cloud ECS portal app.
With data centers in the U.S., U.K. and Singapore, iland (@ilandcloud) delivers proven enterprise cloud solutions that help companies do business faster, smarter and more flexibly. Unlike any other provider, iland’s technology and consultative approach mean anyone–regardless of expertise, location or business objective–can experience the benefits of a hassle-free cloud. From scaling production workloads, to supporting testing and development, to disaster recovery, iland’s secure cloud and decades of experience translate into unmatched service. Underscoring the strength of its platform, the company has been recognized as VMware’s Service Provider Partner of the Year, Global and Americas. Visit www.iland.com.
Below is a compilation of my notes as well as some thoughts about the Cassandra Summit 2014 conference day I attended last week in San Francisco.
Billy Bosworth’s (@BillyBosworth) Opening Keynote
The conference day started with the opening keynote of Billy Bosworth, CEO of DataStax. I was especially looking forward to it since DataStax just announced a $106M series E round of venture capital the week right before the conference. Billy did not mention anything in particular about this new round of funding but rather spoke about the evolution of internet usages (Internet of things, multi-devices, mobility, etc.), more specifically in the enterprise, and, as a consequence, the insane pace at which data are now being generated, growing and the increasing need to store, process and access these data everywhere, any time and with low latency.
Evidently, according to Billy, Cassandra is the perfect match for such new usages, which are the needs of enterprise big data applications, and Cassandra will disrupt the old traditional (relational) database market led by Oracle. DataStax is after a chunk of Oracle’s enterprise market, always has been, and this is what is exciting about DataStax and Cassandra if you ask me.
DataStax customers were also invited to speak about their respective use case and success story: Jeff Ludwig, VP, Network Platform Data and Engineering at Sony Network Entertainment and Yi Li, CEO at Orbeus who presented, with an actual live demo, their cloud-based visual computing solution for recognizing faces, scenes and objects.
The keynote left attendees with the clear feeling that we were now in the era of enterprise big data applications and that DataStax and Cassandra not only have a solid footing in it but that they are leading this new era.
No new service, initiative or partnership was announced. I guess I was just curious to hear what DataStax would be up to with their new round of venture capital, and was especially wondering if some kind of Cassandra as a Service offering was about to be launched.
Jonathan Ellis’ (@Spyced) Technical Keynote
Following Billy’s opening keynote, Jonathan Ellis, CTO of DataStax and Apache Cassandra lead, took the stage for a technical keynote.
As anticipated, the Cassandra 2.1.0 release was officially announced. Jonathan described the new features and improvements coming with this release and highlighted major performance improvements. Although the CQL3 benchmark numbers are better than with Cassandra 2.0, the most interesting thing, IMHO, is that performance is now more consistent over time thanks to incremental repairs and anticompaction. Jonathan described the process to migrate to new SSTables and enable incremental repair, which is something I am in a hurry to try out on our development cluster here.
New features in Cassandra 2.1 include user-defined types, indexes on collections, and a better, “non-buggy” implementation of counters.
Performance improvements in Cassandra 2.1 include faster reads and writes, an improved row cache, off-heap memtables, compaction improvements, incremental repairs and better bootstrapping of new nodes (which I am in a hurry to try out as well).
Jonathan mentioned that Windows support is in beta with 2.1 and that they are targeting full support with 3.0. I guess this is quite important at this point considering the DataStax enterprise focus highlighted during Billy’s keynote.
Jonathan also mentioned the anticipated stability of the 2.1 release and the fact that DataStax now has a dedicated team of engineers doing QA against base Apache Cassandra.
The keynote was great but a tentative roadmap for Cassandra 3.0 and / or new DataStax products would have been interesting.
Breakout Sessions I Attended
The breakout sessions were organized in 6 parallel tracks, with more than 60 45-minute sessions during the day.
I deliberately forced myself to avoid the overcrowded use-case sessions presented by the big names (Netflix, Apple, Sony, eBay, etc.) to focus on smaller and more technical sessions around operations and development. We will definitely be able to read about big name use-cases online later anyway.
Diagnosing Problems in Production by Jon Haddad and Blake Eggleston
Jon and Blake did a great job at presenting tools and process to diagnose and solve production and performance related problems with a focus on the JVM garbage collector. They mentioned quite a few OS level tools as well as tricks to monitor and debug a Cassandra Cluster. What was especially great about this talk is that although the speakers mentioned DataStax OpsCenter, they did not focus on it. I am not going into details about the tools they mentioned here as their slides will be available soon.
TitanDB – Scaling Relationship Data and Analysis with Cassandra by Matthias Broecheler
This talk was about storing relationship data in Cassandra using TitanDB.
TitanDB’s property graph data model on top of Cassandra is well designed: it allows the definition of formal, clean relations and lets one take advantage of a more sophisticated, relation-centric query mechanism. It also adds an abstraction at the data model level that comes with some significant complexity of its own. I am not sure that moving that complexity out of the application level will have such clear benefits for developers as advertised, because it may make their job much harder by forcing them to work with the TitanDB paradigm instead of the simpler Cassandra data model, which is currently rather easy to program against.
Reading Cassandra SSTables Directly for Offline Data Analysis by Ben Vanberg
Great session from Ben Vanberg on how FullContact runs MapReduce analytics jobs against large SSTables for downstream analytics using their own splittable input format. They use a couple of Netflix open source tools in addition to their Hadoop SSTable implementation: https://github.com/fullcontact/hadoop-sstable
Streaming From Backups – Reducing Cluster Load When Adding Nodes by Ben Bromhead
I was really looking forward to this talk, having faced issues bootstrapping new nodes against a large cluster in the past. Unfortunately, this talk was only about benchmarks, explanations and brainstorming around their attempts and ideas for making streaming from backups work. I knew I should have gone to the Gossip session that was running at the same time…
Apache Spark – The SDK for All Big Data Platforms by Pat McDonough
Interesting introductory talk about Apache Spark. Just what it took to “sparkle” interest about Spark and its Cassandra as well as Hadoop integration. Will definitely be looking into this for iland.
Monitor Everything! by Chris Lohfink
One of my favorite breakout sessions of the day: mostly a walk-through of key Cassandra performance metrics and tools, including JMX and JVM GC. Definitely check the slides for details when available.
Interactive OLAP Queries using Apache Cassandra and Spark by Evan Chan
Great complementary talk to the Spark introductory session I attended earlier. Loved Spark as an in-memory cache on top of Cassandra.
Cassandra in Large Scale Enterprise Grade xPatterns Deployments by Claudiu Barbura
Left this session after 15 minutes, mostly because I needed a coffee break.
Down with Tweaking! Removing Tunable Complexity for Cassandra Performance and Administrator by Don Marti, Glauber Costa and Dor Laor
The title of this session was a bit misleading: it was about OSv, which is an optimized operating system for the cloud. Great session but not really specific to Cassandra, except for the Cassandra OSv image they used to demonstrate the lightning fast boot time of OSv images. I found the presentation fun and very interesting, covering the challenges involved in designing such an OS. More about OSv here: http://osv.io/
Lightning Talks and Summit Closing Reception
The conference day ended with a dozen 5-minute lightning talks while the attendees enjoyed food and beverages during the closing reception. Definitely a good way to wrap up the conference day. I have to admit I did not really follow many of the lightning talks, being busy talking with people and enjoying refreshments…
Final Thoughts About the Conference
This was my first Cassandra Summit and I have to say I was rather impressed by DataStax’s organization: everything was perfectly planned and well organized. And you have to remember that it was a free conference (I purchased a VIP ticket at $99 for priority seating…). The welcome and closing ceremonies were awesome too.
There were around 2200 attendees this year. The Westin St Francis is a perfectly located but rather old hotel, and with sessions on multiple floors and narrow corridors it is not necessarily well suited for conferences with this number of attendees.
It was definitely great to meet face to face with all the people at DataStax with whom I have been in touch by email over the past year.
Had the pleasure to talk briefly with Jonathan Ellis who not only is a great Cassandra project leader but a great French speaker too!
As a final note, I especially loved the feeling of this conference: it was a true tech conference without excessive commercial and marketing displays for tech people. (or maybe was it because I attended VMWorld the week before?)
Big up and thanks to DataStax for a great conference!