Stack Overflow Q&A

311 votes | 6 answers | 146269 views

MongoDB vs. Cassandra

I am evaluating what might be the best migration option. Currently, I am on a sharded MySQL (horizontal partition), with most of my data stored in JSON blobs. I do not have any complex SQL q
165 votes | 14 answers | 14479 views

What scalability problems have you encountered using a NoSQL data store?

NoSQL refers to non-relational data stores that break with the history of relational databases and ACID guarantees. Popular open source NoSQL data stores include:
163 votes | 7 answers | 37086 views

NoSQL (MongoDB) vs Lucene (or Solr) as your database

With the NoSQL movement growing based on document-based databases, I've looked at MongoDB lately. I have noticed a striking similarity with how to treat items as "Documents", just like Lucene does
145 votes | 5 answers | 36975 views

Ways to implement data versioning in MongoDB

Can you share your thoughts on how you would implement data versioning in MongoDB? (I've asked simil
106 votes | 3 answers | 56645 views

Facebook Architecture

I have been scrounging for articles/info about the architecture at Facebook, the challenges & ways they tackle them. What they use & why they use it. How do they scale & what are the desig
102 votes | 5 answers | 14118 views

Non-Relational Database Design

I'm interested in hearing about design strategies you have used with non-relational "nosql" databases - that is, the (mostly new) class of data stores that don't use traditional re
63 votes | 3 answers | 30312 views

Large scale data processing Hbase vs Cassandra

I have nearly settled on Cassandra after my research on large-scale data storage solutions. But it's generally said that HBase is a better solution for large-scale data processing and analysis. W
62 votes | 8 answers | 27630 views

What should I choose: MongoDB/Cassandra/Redis/CouchDB?

We're developing a really big project and I was wondering if anyone can give me some advice about which DB backend we should pick. Our system is composed of 1100 electronic devices that send
58 votes | 6 answers | 21948 views

What's The Best Practice In Designing A Cassandra Data Model?

And what are the pitfalls to avoid? Are there any deal breakers for you? E.g., I've heard that exporting/importing the Cassandra data is very difficult, making me wonder if that's going to hinder syn
56 votes | 7 answers | 2839 views

Life without JOINs... understanding, and common practices

Lots of "BAW"s (big-ass websites) are using data storage and retrieval techniques that rely on huge tables with indexes, and queries that won't/can't use JOINs (BigTable, HQL
56 votes | 15 answers | 15157 views

Which key value store is the most promising/stable?

I'm looking to start using a key/value store for some side projects (mostly as a learning experience), but so many have popped up in the recent past that I've got no idea where to begin. Just listi
56 votes | 8 answers | 18836 views

Best data store for billions of rows

I need to be able to store small bits of data (approximately 50-75 bytes) for billions of records (~3 billion/month for a year). The only requirement is fast inserts and fast lookups for all
56 votes | 3 answers | 1997 views

Cassandra server throws java.lang.AssertionError: DecoratedKey(...) != DecoratedKey

I'm currently experimenting around with Cassandra. On the client-side (with Hector) I look up a few keys like this: ColumnFamilyResult<String, String> result = template
53 votes | 2 answers | 10244 views

What does "Document-oriented" vs. Key-Value mean when talking about MongoDB vs Cassandra?

What does going with a document-based NoSQL option buy you over a KV store, and vice versa?
49 votes | 3 answers | 22978 views

What is the difference between Cassandra and CouchDB?

I'm looking at both projects and I can't really see the difference. From the Cassandra site: Cassandra is a highly scalable, eventually consistent, distributed, structured k
48 votes | 1 answer | 11386 views

Explain Merkle Trees for use in Eventual Consistency

Merkle Trees are used as an anti-entropy mechanism in several distributed, replicated key/value stores:
47 votes | 3 answers | 25527 views

Switching from MySQL to Cassandra - Pros/Cons?

For a bit of background - this question deals with a project running on a single small EC2 instance, and is about to migrate to a medium one. The main components are Django, MySQL and a large numbe
46 votes | 9 answers | 22207 views

When NOT to use Cassandra?

There has been a lot of talk related to Cassandra lately. Twitter, Digg, Facebook, etc. all use it. When does it make sense to:
46 votes | 5 answers | 9442 views

Why are document stores like Lucene / Solr not included in NoSQL conversations?

All of us have come across the recent hype around NoSQL solutions. MongoDB, CouchDB, BigTable, Cassandra, and others have been listed as NoSQL options. Here's an example:
44 votes | 2 answers | 14280 views

Difference between Document-based and Key/Value-based databases?

I know there are three different, popular types of non-SQL databases. Key/Value: Redis, Tokyo Cabinet, Memcached; ColumnFamily: Cassandra, HBase; Document: MongoDB, Cou
37 votes | 6 answers | 24784 views

Row count of a column family in Cassandra

Is there a way to get a row count (key count) of a single column family in Cassandra? get_count can only be used to get the column count. For instance, if I have a column family containing u
36 votes | 5 answers | 21565 views

What is an SSTable?

In BigTable/GFS and Cassandra terminology, what is the definition of an SSTable?
36 votes | 1 answer | 30563 views

How to choose between Cassandra, Membase, Hadoop, MongoDB, RDBMS etc.?

Is there a paper/blog post on when to use Cassandra or Membase or Hadoop or plain old relational databases? Is there a paper discussing the strengths/weaknesses of each, and in what scenarios eith
34 votes | 5 answers | 30691 views

Cassandra port usage - how are the ports used?

When experimenting with Cassandra I've observed that Cassandra listens to the following ports: TCP *:8080 TCP *:8888 TCP *:57311 TCP *:57312 TCP 127
34 votes | 2 answers | 16521 views

Redis, CouchDB or Cassandra?

What are the strengths and weaknesses of the various NoSQL databases available? In particular, it seems like Redis is weak when it comes to distributing write load over multiple servers. Is
34 votes | 5 answers | 16403 views

Cassandra Client Java APIs

I have recently started working with the Cassandra database. Now I am in the process of evaluating which Cassandra client we should go forward with. I have seen various posts on sta
31 votes | 5 answers | 13313 views

Is Cassandra production ready for Ruby on Rails?

I'm working on a project that is considering using Cassandra as a database. We would like to eventually migrate to Cassandra even if we use MySQL to start with, given its scalability. I know that b
30 votes | 5 answers | 14163 views

How to use Cassandra in Django framework

Is there a robust way to implement a Cassandra back end for a web application developed with the Django web framework? Thanks
29 votes | 5 answers | 11667 views

storing massive ordered time series data in bigtable derivatives

I am trying to figure out exactly what these newfangled data stores such as Bigtable, HBase and Cassandra really are. I work with massive amounts of stock market data, billions of rows of p
29 votes | 2 answers | 9551 views

How does Voldemort compare to Cassandra?

How does Voldemort compare to Cassandra? I'm not talking about size of community and only wan
29 votes | 5 answers | 25746 views

How do I delete all data in a Cassandra column family?

I'm looking for a way to delete all of the rows from a given column family in Cassandra. This is the equivalent of TRUNCATE TABLE in SQL.
29 votes | 3 answers | 13424 views

Anybody tried Neo4j vs Titan - pros and cons

Can anybody please provide or point to a good comparison between Neo4j and Titan? One thing I can see is in terms of scale - Titan is scale-out and requires an underlying scalable datastore like
28 votes | 4 answers | 22320 views

PHP-friendly NoSQL solutions

I'm looking to use a NoSQL solution for my next project, which will be written in PHP. What choices do I have in terms of NoSQL solutions that can easily be interfaced via PHP? I haven't done much thi
28 votes | 9 answers | 24707 views

MongoDB vs. Redis vs. Cassandra for a fast-write, temporary row storage solution

I'm building a system that tracks and verifies ad impressions and clicks. This means that there are a lot of insert commands (about 90/second average, peaking at 250) and some read operations, but
27 votes | 5 answers | 10522 views

MySQL and NoSQL: Help me to choose the right one

There is a big database, 1,000,000,000 rows, called threads (these threads actually exist, I'm not making things harder just because I enjoy it). Threads has only a few things in it, to make thin
27 votes | 3 answers | 9887 views

Sorted String Table (SSTable) or B+ Tree for a Database Index?

Using two databases to illustrate this example: CouchDB and Cassandra. CouchDB C
25 votes | 4 answers | 6919 views

Scala + Akka: How to develop a Multi-Machine Highly Available Cluster

We're developing a server system in Scala + Akka for a game that will serve clients in Android, iPhone, and Second Life. There are parts of this server that need to be highly available, running on
25 votes | 1 answer | 7115 views

Difference between partition key, composite key and clustering key in Cassandra?

I have been reading articles around the net to understand the differences between the following key types. But it just seems hard for me to grasp. Examples will definitely help make un
24 votes | 1 answer | 6654 views

Why Does the Leap Second Cause Problems?

So at this moment (but most likely not for long) Reddit, Meetup, Fark, LinkedIn, Yelp, 4Chan are all down. Netflix apparently was out for a while too. According to Reddit's tweet, they are h
23 votes | 2 answers | 17772 views

bigtable vs cassandra vs simpledb vs dynamo vs couchdb vs hypertable vs riak vs hbase, what do they have in common?

Sorry if this question is somewhat subjective. I am new to 'cloud store', 'distributed store' or some concepts like this. I really wonder what they have in common and want to get an overview on