November 13th, 2013

“To meet all requirements of having large scale storage with predictable scalability, high availability and robustness of a mission-critical application, we found the match in Cassandra.”

-Hallo Khaznadar, Chief Architect at Qafe

TL;DR: QAFE stands for “Qualogy Application Framework for the Enterprise”. QAFE allows you to recycle legacy applications, build applications from scratch and execute your mobile strategy to an anywhere, anyplace collaboration.

 

In the telecom industry, QAFE uses Cassandra to store call data records. They also use Cassandra when processing those records: decoding, rating, transformation and feeding of several subsystems in a telecom company's landscape, such as the billing system.

 

QAFE, predominantly an Oracle shop, needed a large-scale storage solution with the predictable scalability, high availability and robustness that a mission-critical application demands. They also looked at HBase and MongoDB, but Cassandra's relative ease of management, linear scalability and lack of a single point of failure made it their preferred choice.

 

Hello Planet Cassandra listeners! This is Brady Gentile, Community Manager at DataStax. Today we have Hallo Khaznadar here with us to share his Apache Cassandra use case. He is the Chief Architect at Qafe, a subsidiary of Qualogy. Hallo, thank you so much for joining us today. To start things off, could you tell us a little bit about what Qualogy does?

Hi Brady. It’s nice talking to you, and thank you for offering us this opportunity. Qualogy specializes in integrating, streamlining and accelerating complex business processes, from advice, development and testing to implementation, hosting, training, and post-completion monitoring and maintenance. Qualogy offers high-quality Oracle, Java and HTML5 ICT solutions every step of the way while responding rapidly and flexibly to change.

 

QAFE stands for “Qualogy Application Framework for the Enterprise”. QAFE allows you to recycle legacy applications, build applications from scratch and execute your mobile strategy to an anywhere, anyplace collaboration.

 

We also offer a one-of-a-kind Oracle forms modernization tool that lets you unlock legacy applications to any modern front end of your choice. That’s in short about the company.

 

Excellent. How does Apache Cassandra fit into the mix at QAFE?

Within QAFE we are not only developing a product, but we’re also doing projects with our product. We are using Cassandra to create a telecom solution that will make a difference in the telecom marketplace by filling needs that we have identified in the market: the need for real-time processing and storage of a huge amount of communication data, and the ability to provide a framework for running analytic processes on huge volumes of historical communications data.

 

This project is for one of our key clients, and we are currently in phase one. We are using Cassandra for storing telecom call data records, which in telecom terminology are called “CDRs” and which are used for performing the mediation process. Mediation is the core business of any telecom company. This process involves decoding, rating, transformation and feeding of several subsystems in the landscape of a telecom company, such as the billing system.
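The mediation steps named above (decoding, rating, transformation and feeding) can be pictured with a minimal Python sketch. Everything specific in it is an assumption made for illustration: the comma-separated raw record layout, the per-started-minute tariff rule, the field names, and the `feed` callback standing in for a billing system. None of it is taken from QAFE's actual implementation.

```python
import math
from dataclasses import dataclass
from datetime import datetime
from decimal import Decimal
from typing import Callable, Iterable


@dataclass(frozen=True)
class CallRecord:
    caller: str
    callee: str
    start: datetime
    duration_sec: int


def decode(raw: str) -> CallRecord:
    """Decode one raw line. The CSV layout is hypothetical."""
    caller, callee, start, duration = raw.strip().split(",")
    return CallRecord(caller, callee, datetime.fromisoformat(start), int(duration))


def rate(record: CallRecord, price_per_minute: Decimal) -> Decimal:
    """Charge per started minute (an assumed tariff rule)."""
    minutes = math.ceil(record.duration_sec / 60)
    return price_per_minute * minutes


def transform(record: CallRecord, charge: Decimal) -> dict:
    """Reshape the rated record into the form a downstream system might expect."""
    return {
        "subscriber": record.caller,
        "date": record.start.date().isoformat(),
        "duration_sec": record.duration_sec,
        "charge": str(charge),
    }


def mediate(raw_lines: Iterable[str], price_per_minute: Decimal,
            feed: Callable[[dict], None]) -> None:
    """Run decode -> rate -> transform -> feed for every raw record."""
    for raw in raw_lines:
        record = decode(raw)
        charge = rate(record, price_per_minute)
        feed(transform(record, charge))
```

In this sketch a list's `append` method can stand in for the billing feed, for example `mediate(lines, Decimal("0.10"), billed.append)`.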

 

In the coming phases we are also aiming to use Hadoop capabilities to build analytics-based functionality. After that we will adopt Solr so that we can perform ad hoc queries, as we want to use Cassandra to store all the data of the enterprise.

 

What was your motivation for choosing Apache Cassandra for this project over other technologies? I know you had mentioned you are also an Oracle shop. What benefits have you found by using Cassandra over other technologies?

To meet all requirements of having large scale storage with predictable scalability, high availability and robustness of a mission-critical application, we found the match in Cassandra. It gives you linear scalability as storage grows, without compromising performance, especially for data writes. Cassandra has no single point of failure, making it the right choice for fault tolerance in highly available applications, yet the cluster of Cassandra nodes is relatively easy to manage.

 

With Cassandra you can also find the right answer for modeling the data captured by large data acquisition systems, which has the nature of a time series. At the same time you can build a real-time processing application on top of it. It is also a well-proven technology with a supportive community and trusted high-end specialists. In terms of comparing to other technologies, we have actually used the Google Datastore, which is backed by Google Bigtable. It is part of Google App Engine, which we have used in other projects developed for our other clients.
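A common way to model time-series data such as call records in a wide-row store is to group rows into partitions by source and time bucket, and to keep the rows inside each partition ordered by timestamp. The Python sketch below imitates that layout in memory so the idea can be seen without a cluster. The daily bucket, the `subscriber` key and the class name are assumptions chosen for illustration, not the schema QAFE uses.

```python
from collections import defaultdict
from datetime import datetime


def partition_key(subscriber: str, ts: datetime) -> tuple[str, str]:
    """One partition per subscriber per day (an assumed bucket size)."""
    return (subscriber, ts.strftime("%Y-%m-%d"))


class TimeSeriesStore:
    """In-memory stand-in for a table partitioned by (subscriber, day)
    and ordered by timestamp within each partition."""

    def __init__(self) -> None:
        self._partitions = defaultdict(list)

    def write(self, subscriber: str, ts: datetime, payload: str) -> None:
        rows = self._partitions[partition_key(subscriber, ts)]
        rows.append((ts, payload))
        # Keep rows ordered by timestamp, like a clustering column would.
        rows.sort(key=lambda row: row[0])

    def read_day(self, subscriber: str, day: datetime) -> list:
        """Return all rows of one partition, oldest first."""
        return list(self._partitions.get(partition_key(subscriber, day), []))
```

Bucketing by day keeps any single partition from growing without bound, and a read for one subscriber and one day touches exactly one partition.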

 

What we have seen with Google App Engine is that it is only available as a public cloud solution. This is not an option for our key clients right now, which are telecom companies, banks and governmental institutions. They all want the application to be behind their firewalls. Google App Engine also does not provide a flexible framework or APIs that enable you to incorporate faceted search capabilities like Solr, or MapReduce-based processes like the ones you find with Cassandra, with the ease of use that Cassandra offers.

 

We also looked at HBase and MongoDB, but the relative ease of management that you find in Cassandra, the linear scalability and no single point of failure made Cassandra our preferred choice.

 

It sounds like you did a lot of research before deciding on Cassandra. Also it’s very interesting to hear about the companies that you’re working with, the government and banking companies, not wanting to store their data in the public cloud.

 

Would you like to share some insights into what your deployment looks like?

Obviously, our solution is not the public cloud. As we have broad experience in Oracle we decided to use Oracle Exalogic as a private elastic cloud platform for our deployment and project development.

 

Basically we are starting with one data center in phase one, but we are targeting two data centers. Our main telecom client right now is the number one telecom company in the Caribbean area, which covers several islands. Right now we are doing this implementation for the biggest island. Of course, we are taking into consideration scaling to cover the entire Caribbean area.

 

Regarding the servers, it’s whatever you can run on Exalogic: Oracle Linux, with Tomcat or WebLogic. It doesn’t matter; it’s up to the client which to use. For disks we are using SSDs. The number of nodes is on a growth plan of four nodes per six months, as we are expecting one terabyte of data written per month.

 

We are also taking into consideration the overhead storage required by Cassandra internals, like metadata and the compaction process. One terabyte is really the final total that you get per month for the expected current network. They are currently using a 3G network over there. In the near future they are moving to a 4G network.

 

Regarding the read rate, we are measuring 100 gigabytes read per month. Actually, most of the data obtained is raw data that is stored together with the decoded version. Only a fraction of that will be read. Most of the workload will be writes, which is also a very good match with the architecture of Cassandra. For storage, each node has an 8-terabyte SSD with the leveled compaction strategy, 64 gigabytes of RAM, and four cores, which works out to 16 gigabytes of RAM per core.
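The sizing figures above (one terabyte written per month, 8-terabyte SSDs per node, four nodes added every six months) can be related to each other with a little arithmetic. The Python sketch below is only a rough illustration: the replication factor of 3 and the 60% maximum disk fill, which leaves headroom for compaction and metadata, are assumptions and were not stated in the interview. It also assumes the one-terabyte figure is measured before replication.

```python
import math


def nodes_needed(monthly_write_tb: float, months: int, replication_factor: int,
                 disk_tb_per_node: float, max_disk_fill: float) -> int:
    """Estimate how many nodes a period of writes requires.

    replication_factor and max_disk_fill are assumed planning inputs,
    not values from the interview.
    """
    # Total data on disk across the cluster once every replica is stored.
    stored_tb = monthly_write_tb * months * replication_factor
    # Disk space per node that is safe to fill, leaving compaction headroom.
    usable_tb_per_node = disk_tb_per_node * max_disk_fill
    return math.ceil(stored_tb / usable_tb_per_node)


# Six months of writes at 1 TB per month on 8 TB nodes, with assumed
# replication factor 3 and at most 60% disk fill.
six_month_nodes = nodes_needed(1.0, 6, 3, 8.0, 0.6)
```

Under these assumed inputs, 18 terabytes of replicated data spread over 4.8 usable terabytes per node rounds up to four nodes, which is in line with the growth plan described above. Different replication or headroom choices would change the result.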

 

In regards to future versions of Apache Cassandra, is there anything specific that you are looking forward to or that you would like to see?

From what we have experienced, we have identified some improvements that we would like to see: distributed transactions, and a better recycling strategy for deleted data, as we have realized that when you delete data you actually create gaps in the data that are not recycled.

 

On the Java Driver, as we are a Java enterprise shop of course and our solution is built in Java, we would like to have more generic APIs that allow better mapping of result sets, reducing the boilerplate code that you need to write and maintain in your Java layer. That would be nice to have in future releases of the Java Driver.

 

Hallo, thank you so much for joining us today. I really enjoyed hearing about how you’re using Apache Cassandra. Is there anything else that you’d like to add before we sign off here?

You’re very welcome. Yes, we are very happy to work with you guys. I hope the future will open more opportunities to do business together.
