Cassandra version 1.2 server problems

We have a Cassandra cluster with 24 servers running version 1.2. A few weeks ago, two of the servers started crashing every day, always at the same time. Their status appears as down in the nodetool status command output, and the only way to bring them back is to reboot both servers. On one of them we are able to stop the Cassandra service and then reboot. On the other one the service does not stop until we force a reboot.
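For reference, this is roughly how the down state can be picked out of the nodetool status output. It is a minimal sketch, assuming the usual layout where each node line starts with a two-letter status/state flag (for example UN or DN) followed by the address. The function name down_nodes and the sample addresses are made up for illustration.

```python
def down_nodes(status_output):
    """Return the addresses of nodes whose status flag starts with 'D' (down).

    Assumes each node line begins with a two-letter flag: status (U/D)
    followed by state (N/L/J/M), then the node address.
    """
    down = []
    for line in status_output.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # blank lines, separators, single-word header lines
        flag = fields[0]
        is_node_line = len(flag) == 2 and flag[0] in "UD" and flag[1] in "NLJM"
        if is_node_line and flag[0] == "D":
            down.append(fields[1])
    return down


# Hypothetical sample output, not taken from our cluster.
SAMPLE = """Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load     Tokens  Owns   Host ID  Rack
UN  10.0.0.1   1.2 GB   256     4.2%   aaaa     rack1
DN  10.0.0.2   1.1 GB   256     4.1%   bbbb     rack1
DN  10.0.0.3   1.3 GB   256     4.3%   cccc     rack1
"""

if __name__ == "__main__":
    print(down_nodes(SAMPLE))
```

In practice the text would come from running nodetool status on any live node, for example through subprocess, rather than from a hard-coded sample.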

After the reboots they work normally until the next day at exactly the same time. We analyzed the logs for errors, and the main problem seems to be that heap memory usage exceeds the threshold. The servers have 32GB of memory and the heap is set to 22GB. All the other servers in the cluster have the same memory and heap size, and they have no problems whatsoever.
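To put the 22GB figure in context, here is a small sketch of the heap size that the stock cassandra-env.sh would pick when MAX_HEAP_SIZE is not set by hand. It assumes the documented rule max(min(1/2 RAM, 1024MB), min(1/4 RAM, 8GB)); that our 1.2 installation ships with exactly this rule is an assumption, and the function name default_max_heap_mb is hypothetical.

```python
def default_max_heap_mb(system_memory_mb):
    """Heap size in MB under the assumed stock rule:
    max(min(1/2 RAM, 1024MB), min(1/4 RAM, 8192MB)).
    """
    half = system_memory_mb // 2
    quarter = system_memory_mb // 4
    return max(min(half, 1024), min(quarter, 8192))


if __name__ == "__main__":
    ram_mb = 32 * 1024         # our servers have 32GB of memory
    configured_mb = 22 * 1024  # heap we have configured
    print("default heap (MB):", default_max_heap_mb(ram_mb))  # 8192
    print("configured heap (MB):", configured_mb)
```

Under that assumption the default for a 32GB server would be 8GB, so our configured heap is well above what the script would choose on its own.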

We have checked that the repair and compaction processes run without any errors. We also noticed that just before the servers crash, the gossip service starts reporting that some servers intermittently fail to respond to the handshake and then start responding again. They flap between DOWN and UP until these two servers crash.

If we reboot the servers before the time they usually crash, they do not crash until the next day.

As a workaround, we have set up a script that reboots the servers before the time they crash.

We are running out of ideas about what might be causing this problem on these two servers.

Any help would be much appreciated! Thanks in advance.
