November 29th, 2012

Schema design is the cornerstone of making awesome products. If you do not understand your data, do not understand what users need, and do not understand the limitations of hardware and software, you cannot effectively design a schema.

To understand schema design in Cassandra, I think we should start with what Cassandra is:

Cassandra is a highly scalable, eventually consistent, distributed,
structured key-value store. Cassandra brings together the distributed
systems technologies from Dynamo and the data model from Google’s BigTable. Like Dynamo, Cassandra is eventually consistent. Like BigTable, Cassandra provides a ColumnFamily-based data model richer than typical key/value systems. 


A Bigtable is a sparse, distributed, persistent multi-dimensional
sorted map. The map is indexed by a row key, column key, and a
timestamp; each value in the map is an uninterpreted array of bytes.
(row:string, column:string, time:int64) → string
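
To make that definition concrete, here is a minimal in-memory sketch of a "sparse, multi-dimensional sorted map". It is only an illustration of the idea, not the Bigtable or Cassandra API: the names CellKey, CellKeyOrder, SparseTable, put, get_latest and demo are all made up for this post, and the "contents:" column is just an example. I assume timestamps should sort newest first within a cell so that the latest version is the first one found.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <map>
#include <string>

// Hypothetical names throughout; this is not the Bigtable or Cassandra API.
struct CellKey {
    std::string row;
    std::string column;
    std::int64_t time;
};

// Order: row ascending, then column ascending, then timestamp descending,
// so the newest version of a cell sorts first within its (row, column) group.
struct CellKeyOrder {
    bool operator()(const CellKey& a, const CellKey& b) const {
        if (a.row != b.row) return a.row < b.row;
        if (a.column != b.column) return a.column < b.column;
        return a.time > b.time;  // newer timestamps first
    }
};

// Sparse: only cells that were actually written occupy an entry.
using SparseTable = std::map<CellKey, std::string, CellKeyOrder>;

// Store one version of a cell; the value is treated as opaque bytes.
void put(SparseTable& t, const std::string& row, const std::string& column,
         std::int64_t time, const std::string& value) {
    t[CellKey{row, column, time}] = value;
}

// Return a pointer to the newest value for (row, column), or nullptr if absent.
const std::string* get_latest(const SparseTable& t, const std::string& row,
                              const std::string& column) {
    // The largest timestamp sorts at or before every stored version of the cell.
    CellKey probe{row, column, std::numeric_limits<std::int64_t>::max()};
    auto it = t.lower_bound(probe);
    if (it == t.end() || it->first.row != row || it->first.column != column) {
        return nullptr;
    }
    return &it->second;
}

// Small usage example: two versions of one cell, and a cell never written.
void demo() {
    SparseTable t;
    put(t, "com.cnn.www", "contents:", 1, "<html>v1");
    put(t, "com.cnn.www", "contents:", 2, "<html>v2");
    const std::string* latest = get_latest(t, "com.cnn.www", "contents:");
    assert(latest != nullptr && *latest == "<html>v2");
    assert(get_latest(t, "com.cnn.www", "anchor:") == nullptr);
}
```

Because the map is sorted by row first, all cells of one row sit next to each other, which is what makes row-oriented reads cheap in this model. Calling demo() from a main function runs the example.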

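// Open the table
Table *T = OpenOrDie("/bigtable/web/webtable");
// Build a mutation for the row "com.cnn.www" and set one column to "CNN"
RowMutation r1(T, "com.cnn.www");
r1.Set("", "CNN");
// Apply the mutation to the table
Operation op;
Apply(&op, &r1);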