April 9th, 2013


Talkbits

Max Alexejev | IT Consultant

What does Talkbits do?

Talkbits is a mobile application whose goal is to connect people and let them explore the world through sound and voice, starting where you live.

 

Imagine you want to connect with nearby people in San Francisco or Singapore. You launch Talkbits, which works like a walkie-talkie, and push a button to talk. As you talk, the app sends voice messages (bits) to nearby people in real time.

 

All voice communication is grouped by channels of common interest, common location or friendship relations.

 

How do you use Cassandra (C*) at Talkbits?

At Talkbits, Cassandra is our primary storage for all metadata, such as user accounts, channels, and bit streams. We rely heavily on Cassandra’s fault tolerance and scalability characteristics.

 

Our storage infrastructure includes Amazon S3 for voice and image blobs, ElasticSearch for full-text and geolocation capabilities, and Cassandra for everything else.

 

Have you always used C* or did you switch to C* from another database offering (HBase, MySQL, MongoDB, etc.)?

Our backend architecture was designed from scratch, so I was free to choose the technology stack. I also had a great professional team, experienced with modern technologies and scalability approaches.

 

Our backend now consists of a number of independent stateless services working together in a Finagle cluster. Each service runs as multiple instances across several Amazon availability zones within a region. This gives us a highly available, horizontally scalable business layer with no single points of failure.

 

So we had the same requirements for the data layer, that is, the backing services that actually store data. We wanted our storage to be distributed, highly available, easy to scale horizontally, and proven in production environments of companies bigger than Talkbits. We also wanted a symmetrical architecture with no “special” nodes or “active-standby” patterns. After an initial evaluation, we began using Cassandra, and we still think it was a good decision.

 

As an additional bonus, we use the JVM platform, with Java and Scala as our primary languages and Python for all infrastructure tasks, so Cassandra was a great fit for our technology stack. Keeping the technology stack compact is crucial in small startup teams like Talkbits, because you usually don’t have many people to support and manage multiple languages and platforms.

 

Is your C* data stored in the cloud or a physical data center?

Currently, we use Amazon AWS as our hosting provider. Cassandra 1.1.2 is deployed on 3 ‘m1.large’ EC2 nodes across 3 availability zones in a single region. It stores both data and the commit log on local ephemeral disks (i.e., we do not rely on EBS for Cassandra).
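
The interview does not state the replication factor or consistency level in use. As a rough sketch only, assuming a replication factor of 3 (one replica per availability zone) and QUORUM reads and writes, the arithmetic below shows how many replicas must respond and how many can be lost. The function names are illustrative and not part of any Cassandra API.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that must respond to a QUORUM read or write."""
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor: int) -> int:
    """Number of replicas that can be down while QUORUM still succeeds."""
    return replication_factor - quorum(replication_factor)


# Assumption (not from the interview): RF = 3, one replica per zone.
rf = 3
print("quorum size:", quorum(rf))
print("replicas that may be lost:", tolerated_failures(rf))
```

Under these assumptions, losing a single node, or the availability zone that hosts it, would still leave a quorum of two replicas.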

 

As our dataset grows, we will increase cluster capacity by doubling the cluster size, keeping the general node topology unchanged.

As our business reach expands, we will launch additional clusters in other AWS regions with WAN replication between them. Our nodes and TopologyStrategy are already configured for it.

 

We may also consider moving to a hybrid cloud in the future, serving sustained load from managed hardware servers and expanding elastically on EC2 to serve traffic peaks. However, this is not a priority yet.

 

Do you have any thoughts on the physical and/or virtual C* community?

After working as a scalability consultant in the Bay Area, I see some lack of cutting-edge IT events here in Moscow (although Talkbits is an international company, its engineering team is mostly located in Russia).

 

We do have many events on highly loaded, distributed and low latency systems, ranging from user groups to bigger conferences.

 

The Cassandra trend is also there, and I know lots of people who are already using or evaluating Cassandra for their needs. Hopefully, I will be able to catalyze this process as a local events organizer for DataStax.

 

Is there anything that you’ve learned while using C*, that in hindsight you would have done differently?

Data and index modeling is paramount. Think about your queries more than you think about your data. Cassandra’s columnar data model was a bit new for our team, and we still plan to improve the initial design we started with.
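
A minimal sketch of what "queries first" can mean in practice, using plain Python dictionaries as stand-ins for column families. The view names, fields, and sample data are hypothetical and do not reflect the actual Talkbits schema. Each view is keyed for exactly one query, and the same fact is written to every view that needs it.

```python
from collections import defaultdict

# Hypothetical views, one per query the application needs to answer.
bits_by_channel = defaultdict(list)  # query: recent bits in a channel
bits_by_user = defaultdict(list)     # query: recent bits from a user


def record_bit(bit_id, user_id, channel_id, timestamp):
    """Write the same fact into every view that some query depends on."""
    row = (timestamp, bit_id)
    bits_by_channel[channel_id].append(row)
    bits_by_user[user_id].append(row)


def latest_bits(view, key, limit=10):
    """Read the rows stored under one key, newest first; no joins needed."""
    newest_first = sorted(view[key], reverse=True)
    return [bit_id for _, bit_id in newest_first[:limit]]


# Illustrative data only.
record_bit("b1", "alice", "sf", timestamp=1)
record_bit("b2", "bob", "sf", timestamp=2)
record_bit("b3", "alice", "sg", timestamp=3)

print(latest_bits(bits_by_channel, "sf"))
print(latest_bits(bits_by_user, "alice"))
```

The trade-off is duplicated data and more writes in exchange for reads that touch a single key.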

 

Don’t ever use CQL as a limited substitute for SQL, and don’t believe Cassandra when it pretends to be a relational database. I personally consider CQL a good way to do some ad-hoc querying (and I’d love to see some index-independent, full-scan-like capabilities just for this purpose), but I’m still not sure whether it is a good primary API for external clients. We went with CQL 3 and Cassandra 1.1 in production and had some minor problems with that (for example, protocol version setup in the Hector client).

 

Understand the guarantees Cassandra gives you and don’t ask for more. It is important to understand which operations are atomic and which are not. Can you live with a possibly inconsistent state or do you need any kind of transaction manager on top of Cassandra and other participating systems?
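
To make the question concrete, here is a small in-memory simulation, not Cassandra client code. It assumes a store in which each single-key write is atomic but two writes are not tied together, and it shows how a failure between them leaves an inconsistent state. All class, function, and key names are hypothetical. Retrying idempotent writes is one simple mitigation, and the interview does not say which approach Talkbits uses.

```python
class FlakyStore:
    """Stand-in store: each put is atomic, but puts are independent."""

    def __init__(self, fail_on_write=None):
        self.rows = {}
        self.writes = 0
        self.fail_on_write = fail_on_write  # 1-based index of the write to fail

    def put(self, key, value):
        self.writes += 1
        if self.writes == self.fail_on_write:
            raise TimeoutError("simulated write timeout")
        self.rows[key] = value


def follow_channel(store, user, channel):
    """Two independent writes; a failure between them updates only one side."""
    store.put(("channels_of", user), channel)
    store.put(("members_of", channel), user)


def follow_with_retry(store, user, channel, attempts=3):
    """Both writes are idempotent overwrites, so repeating the pair is safe."""
    for _ in range(attempts):
        try:
            follow_channel(store, user, channel)
            return True
        except TimeoutError:
            continue
    return False


store = FlakyStore(fail_on_write=2)
try:
    follow_channel(store, "alice", "sf")
except TimeoutError:
    pass

# The first write landed, the second did not.
inconsistent = (("channels_of", "alice") in store.rows
                and ("members_of", "sf") not in store.rows)
repaired = follow_with_retry(store, "alice", "sf")
print(inconsistent, repaired)
```

If the writes were not idempotent, or if readers could not tolerate the window in which only one side is updated, some form of coordination on top of the store would be needed.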