Vassilis Bekiaris: Director of IT at Icon Platforms
Christian Hasker: Editor at Planet Cassandra, A DataStax Community Service
TL;DR: Icon Platforms helps brands build customer engagement and loyalty through technology that tracks customer actions within mobile applications and websites.
Cassandra is used to store customer profiles alongside the series of events generated as each customer interacts with a mobile application or a website. Icon Platforms monitors these events and uses them to score each customer’s behavior against certain profiles which are defined for each customer.
Icon Platforms migrated from a Microsoft SQL Server relational database which, as they grew, became difficult to maintain and hit performance ceilings. To fix this, they looked at the lineup of NoSQL databases and Cassandra won hands down due to its performance and masterless design.
Hi, everyone. I am joined today, for this Apache Cassandra use case 5 Minute Interview, by Vassilis from Icon Platforms. He’s the director of IT there. Vassilis, welcome. Why don’t you start off by telling us a little bit about what Icon Platforms does and what your role is there?
Hi, Christian. Icon Platforms’ business is to help brands build customer engagement and loyalty. We are building technology that tracks customer actions within mobile applications and websites. Using the series of events that we track from customers, we reward them with points and implement gamification features such as badges, achievements, coupon redemption, rewards in virtual currency or virtual items, and more.
Any company wanting to launch a customer loyalty program could contract with Icon Platforms?
Correct. Yes, that’s right.
Vassilis, if you wouldn’t mind, talk to us a little bit about why you use Apache Cassandra and how you are using it.
Our platform processes a series of customer events which are generated as a customer interacts with the mobile application or a website. We monitor for certain kinds of concepts which we use to score the customers’ behavior against certain profiles which are defined for each customer. For example, a company might want to identify the most vocal users within a community or measure engagement with specific products.
All of this data that we gather and its event processing is stored in our Cassandra database.
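As an illustration of the scoring idea described above, here is a minimal Python sketch. It is not Icon Platforms' implementation: the profile name, the event types, and the point weights are all hypothetical, and the events live in a plain list rather than in Cassandra.

```python
from collections import defaultdict

# Hypothetical profile: points per event type. These names and weights are
# illustrative assumptions, not taken from Icon Platforms.
VOCAL_USER_PROFILE = {"comment": 3, "share": 5, "like": 1}


def score_events(events, profile):
    """Sum the profile's weight for every event, grouped by customer."""
    scores = defaultdict(int)
    for event in events:
        # Event types the profile does not mention contribute nothing.
        scores[event["customer_id"]] += profile.get(event["type"], 0)
    return dict(scores)


events = [
    {"customer_id": "alice", "type": "comment"},
    {"customer_id": "bob", "type": "like"},
    {"customer_id": "alice", "type": "share"},
    {"customer_id": "bob", "type": "page_view"},
]
scores = score_events(events, VOCAL_USER_PROFILE)
```

Ranking customers by the resulting score would be one way to surface the "most vocal" users mentioned in the answer.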
Could you talk a little bit about why you chose Cassandra? Did you look at anything else?
Actually, when we started out developing our platform, we started with a Microsoft SQL Server relational database. We chose it because we had staff with SQL Server skills and it was the obvious choice for a small startup. As we grew and grew, it became obvious that some of our data needs were not a good fit for a relational database.
When you try to implement a data model that has so many dynamic aspects to it, it starts looking really ugly once you try to map it in relational terms. Another strong motive was that we seemed to be hitting a performance ceiling with our relational database back end. We looked at the lineup of NoSQL databases and Cassandra won hands down due to its masterless design.
Its configuration is much simpler and its horizontal scalability is very attractive. Another feature that is quite important for us is Cassandra’s support for distributed counters, which we use a lot, and it is great.
Thank you very much for outlining that. Coming from a relational database background, Vassilis, what was it like? Any advice for those looking to make the transition?
I think the single most important piece of advice I would give to anyone coming to Cassandra is to look closely at your use cases. Model your data in such a way that you can accommodate your needs for queries. We have redesigned our Cassandra schema a couple of times and each time it’s getting better and better. It really affects the performance of our solution.
Data modeling is very important; learning a bit of the internals of Cassandra really helps a lot with that because you can understand what you can do and what you cannot do with Cassandra. It’s important to distinguish between the two. Another thing that was a bit confusing to us as newcomers to Cassandra was knowing which driver to pick. In the relational world you have ODBC or JDBC, whereas there are lots and lots of choices with Cassandra.
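The advice above, to model data around the queries you need, can be sketched in Python as an in-memory analogy. This is not Cassandra's API and not Icon Platforms' schema: the class and method names are hypothetical. The sketch assumes the main query is "latest events for one customer", so rows are grouped by customer (playing the role of a partition key) and kept sorted by timestamp (playing the role of a clustering column).

```python
import bisect
from collections import defaultdict


class EventsByCustomer:
    """Hypothetical in-memory stand-in for a query-first table layout."""

    def __init__(self):
        # customer_id -> list of (timestamp, event_type), kept sorted
        self._partitions = defaultdict(list)

    def write(self, customer_id, timestamp, event_type):
        # Insert in sorted position so reads never need to sort.
        bisect.insort(self._partitions[customer_id], (timestamp, event_type))

    def latest(self, customer_id, limit):
        """Return up to `limit` newest events for one customer, newest first."""
        if limit <= 0:
            return []
        rows = self._partitions.get(customer_id, [])
        return rows[-limit:][::-1]


table = EventsByCustomer()
table.write("alice", 3, "share")
table.write("alice", 1, "login")
table.write("alice", 2, "comment")
table.write("bob", 5, "like")
newest = table.latest("alice", 2)
```

The design choice is that the layout serves one query cheaply. A different query, for example "all customers who shared something", would call for a second structure keyed for that access pattern, which is one way a schema can end up redesigned more than once.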
How did you find the right drivers? Did you ask the community, did you go through DataStax? How did you end up solving that for yourself?
Well, in fact, we did our own performance testing internally with different drivers. That testing also helped us fix our data model and address our specific needs. We settled on the DataStax Java driver because we found it to be the most performant by a large margin for our use case, and it also uses the CQL interface, which is the way forward.
Okay, and lastly, if you wouldn’t mind, talk a little bit about what your actual Cassandra deployment looks like. Are you hosted in your own data centers? How many nodes do you have, how many clusters, things like that?
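A comparison like the one described above could be structured along these lines. This is a minimal sketch only: the two "drivers" are hypothetical stand-ins that write to local Python containers, not real Cassandra drivers, and a real test would replay a representative workload against an actual cluster.

```python
import time


def measure_throughput(operation, iterations):
    """Call `operation(i)` `iterations` times and return operations per second."""
    start = time.perf_counter()
    for i in range(iterations):
        operation(i)
    elapsed = time.perf_counter() - start
    return iterations / elapsed if elapsed > 0 else float("inf")


# Hypothetical stand-ins for each candidate driver's write call.
store_a, store_b = {}, []
candidates = {
    "driver_a": lambda i: store_a.__setitem__(i, i),
    "driver_b": lambda i: store_b.append(i),
}

results = {name: measure_throughput(op, 10_000) for name, op in candidates.items()}
fastest = max(results, key=results.get)
```

Running each candidate through the same operation mix is what makes the numbers comparable. The absolute figures depend on the machine and are not meaningful beyond the comparison itself.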
Our deployment is hosted on Rackspace Cloud Servers with 16 gigabytes of RAM each. With our own setup, we are able to process something like 10,000 events per second for each node we have in the cluster, with each event process cycle being composed of several reads and writes to Cassandra. We’re really happy with the performance.
Well, thank you very much, Vassilis. Is there any advice you have for those just starting out?
In terms of community direction, I think it’s good advice for everyone starting to get their feet wet with Cassandra to subscribe to the mailing lists. When you start looking at Cassandra, there are so many things to learn. All of it really affects how you’re going to use Cassandra. It’s a really good idea to watch what others are talking about. Start learning yourself and digging deeper.
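To put the per-node figure above in context, here is a back-of-the-envelope Python sketch. Only the 10,000 events per second per node comes from the interview. The node count and the number of Cassandra operations per event are not stated, so the values below (4 nodes, 5 operations) are purely illustrative assumptions, as is the assumption that throughput scales linearly with nodes.

```python
def cluster_throughput(nodes, events_per_node_per_sec=10_000, ops_per_event=5):
    """Estimate cluster-wide events/s and Cassandra reads+writes/s.

    `nodes` and `ops_per_event` are hypothetical inputs; the interview only
    says each event involves "several" reads and writes.
    """
    events_per_sec = nodes * events_per_node_per_sec
    cassandra_ops_per_sec = events_per_sec * ops_per_event
    return events_per_sec, cassandra_ops_per_sec


# Illustrative only: a 4-node cluster with 5 operations per event.
events_per_sec, ops_per_sec = cluster_throughput(nodes=4)
```

Under those assumed inputs the estimate is 40,000 events per second and 200,000 Cassandra operations per second across the cluster.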