Michael Kjellman Software Engineer at Barracuda Networks
"We needed a highly scalable system that could be real-time. No other database was ready for what we needed to do."

While best known for its security solutions, Barracuda Networks offers products across three areas of IT: Content Security; Networking and Application Delivery; and Data Storage, Protection and Disaster Recovery.

Battling the Zombies!

The Barracuda Central Research Database uses Cassandra to battle the Zombies. Before adopting Cassandra, we could not monitor every malicious site and IP forever because the data volumes were just too great. We would monitor a site or IP for a while, and once we saw that the IP address was no longer alive we would stop monitoring it or have to truncate our history. The big problem, however, was that once we stopped monitoring a site or domain, it would frequently come back to life, hence the Zombie moniker.

Initially, around version 0.8, we were using Cassandra as a key-value store, but around 1.0 we looked at it to replace MySQL. In the past, taking down one botnet or IP would drastically reduce spam to our customers, but today spammers are smarter; the attacks change constantly. We had a scale problem that MySQL could not handle, whereas Cassandra is designed to scale and be highly available. We needed a highly scalable system that could be real-time. No other database was ready for what we needed to do. The thriving community was also a reason for us to choose Cassandra.

We had data coming in from multiple databases and flat files, and now we use Cassandra to consolidate all that data. Previously, it could take us as long as three or four hours to mark a site or IP; now with Cassandra we are able to do that in real time without worrying about losing history.

Barracuda Website stats

Deployment at Barracuda

At the Barracuda Central Research Database, our configuration is 2 spindles with no RAID, 2 data directories (one directory per spindle), and an SSD for small “hot” column families, with 12 cores and 32GB of RAM.

Currently we have 12 nodes online in one datacenter. We have racked another 12 nodes (with the OS installed and everything) at a second datacenter and are bringing them online with the official 1.2.1 release.

Collections & CQL

Collections and CQL3 were compelling for us. With collections, we are able to maintain associations between IPs, domains, and full URIs. With CQL3 and collections, we will be able to pull back all the data we need with one call.
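
As a rough illustration of how collections can hold these associations, here is a minimal sketch in Python. The table name ip_associations, its columns, and the helper functions are hypothetical and are not Barracuda's actual schema; the sketch only builds CQL3 statements with their parameters and does not connect to a cluster.

```python
# Hypothetical schema for illustration only: one row per IP, with the
# associated domains and full URIs stored as CQL3 set collections.
CREATE_TABLE = """
CREATE TABLE ip_associations (
    ip      text PRIMARY KEY,
    domains set<text>,
    uris    set<text>
)
"""

# Appending to a set uses the "column = column + {...}" form in CQL3.
ADD_ASSOCIATION = (
    "UPDATE ip_associations "
    "SET domains = domains + %s, uris = uris + %s "
    "WHERE ip = %s"
)

# A single SELECT returns the IP together with both collections.
FETCH_ASSOCIATIONS = (
    "SELECT ip, domains, uris FROM ip_associations WHERE ip = %s"
)


def add_association(ip, domain, uri):
    """Return (statement, params) that adds one domain and one URI to an IP's sets."""
    return ADD_ASSOCIATION, ({domain}, {uri}, ip)


def fetch_associations(ip):
    """Return (statement, params) that reads every association for an IP in one query."""
    return FETCH_ASSOCIATIONS, (ip,)


if __name__ == "__main__":
    statement, params = add_association(
        "192.0.2.10", "example.com", "http://example.com/login"
    )
    print(statement)
    print(params)
```

Assuming a driver that accepts %s-style placeholders, such as the DataStax Python driver, each (statement, params) pair could be passed to session.execute(statement, params). The design point is that a single row keyed by IP carries both sets, so one read returns the full association.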

I also wrote Perlcassa; no other Perl drivers were up to snuff.
