Example of a music service
CREATE TABLE songs ( id uuid PRIMARY KEY, title text, album text, artist text, data blob );
In a relational database, you would create a playlists table with a foreign key to the songs, but in Cassandra, you denormalize the data because joins are not performant in a distributed system. Later, you use a collection to accomplish the same goal as joining the tables to tag songs.
CREATE TABLE playlists ( id uuid, song_order int, song_id uuid, title text, album text, artist text, PRIMARY KEY (id, song_order ) );
The combination of the id and song_order in the playlists table uniquely identifies a row in the playlists table. You can have more than one row with the same id as long as the rows contain different song_order values.
After inserting the example data into playlists, the output of selecting all the data looks like this:
SELECT * FROM playlists;
INSERT INTO playlists (id, song_order, song_id, title, artist, album) VALUES (62c36092-82a1-3a00-93d1-46196ee77204, 4, 7db1a490-5878-11e2-bcfd-0800200c9a66, 'Ojo Rojo', 'Fu Manchu', 'No One Rides for Free');
With the schema as given so far, a query that includes the artist filter would require a sequential scan across the entire playlists dataset. Cassandra will reject such a query. If you first create an index on artist, Cassandra can efficiently pull out the records in question.
CREATE INDEX ON playlists(artist );
Now, you can query the playlists for songs by Fu Manchu, for example:
SELECT * FROM playlists WHERE artist = 'Fu Manchu';
The output looks something like this: