DataStax Developer Blog

Coming in 1.2: Collections support in CQL3

By Sylvain Lebresne -  August 5, 2012 | 24 Comments

Apache Cassandra is a distributed store. When you distribute data over many machines, doing joins at read time is expensive in the general case (compared to what can be done on a single host), as you might have to join over data that is not stored on the same physical host. That is why Cassandra has always encouraged denormalization instead of joins, i.e. physical collocation of the data that is queried together. In particular, the Cassandra Query Language does not offer JOIN in the language.

Not using joins has however the drawback of making some simple patterns less elegant. Consider the case where you want to allow users to have multiple email addresses. In a relational database, the canonical way to do that would be a create an email_addresses table with a many-to-one relationship to users … which implies a join. So in Cassandra, you would traditionally denormalize that as muliple columns email1, email2, etc. While this is usually fine from a performance standpoint (because both adding new columns and having columns without values is virtually free in Cassandra), this is tedious to use, not very natural, and have a few drawback like forcing you to do a read before adding a new email address (to know which column name to use). Enter the CQL3 supports for collections.

Sets

Cassandra 1.2 (slated for release towards the end of October) will provide a much powerful and elegant alternative to the “multiple email addresses problem” above. It introduces 3 new CQL3 types: map, set and list. Concretely, you will be able to define the users table as:

cqlsh> CREATE TABLE users (
           user_id text PRIMARY KEY,
           first_name text,
           last_name text,
           emails set<text>
       );

The operations allowed on that set include setting the set to specific value using a set literal:

cqlsh> INSERT INTO users (user_id, first_name, last_name, emails)
       VALUES('frodo', 'Frodo', 'Baggins', {'f@baggins.com', 'baggins@gmail.com'});

as well as adding new elements:

cqlsh> UPDATE users
       SET emails = emails + {'fb@friendsofmordor.org'} WHERE user_id = 'frodo';

which, upon query, will return:

cqlsh> SELECT user_id, emails FROM users WHERE user_id = 'frodo';
 user_id | emails
---------+-------------------------------------------------------------------
 frodo   | {"baggins@caramail.com","f@baggins.com","fb@friendsofmordor.org"}


Note that the elements of the set are returned in sorted order (i.e. sets are really sorted sets). If a different order (like the insertion order) is important, you might prefer using a list instead.

You can also remove an email:

cqlsh> UPDATE users
       SET emails = emails - {'fb@friendsofmordor.org'} WHERE user_id = 'frodo'

or remove all the emails, which can be done by either one of the two following queries (that are equivalent):

cqlsh> UPDATE users SET emails = {} WHERE user_id = 'frodo';
cqlsh> DELETE emails FROM users WHERE user_id = 'frodo';

What this means is that a set (or a list or map for that matter) only exists when it has one element. Or in other words, the empty set is not distinguishable of the null value:

cqlsh> SELECT user_id, emails FROM users WHERE user_id = 'frodo';
 user_id | emails
---------+-------------------------------------------------------------------
 frodo   | null

Note that if the set (or list/map) is empty (i.e. null), the two following queries will yield the same result:

cqlsh> UPDATE users SET emails = {'a@b.com'} WHERE user_id = 'frodo';
cqlsh> UPDATE users SET emails = emails + {'a@b.com'} WHERE user_id = 'frodo';

but if emails is not empty, the first query will replace the existing set by the set containing a@b.com as sole element, while the second query will add the value to the existing set.

Using a set has the following advantages over the email1, email2, … solution hinted above:

  • It’s undeniably cleaner and more natural to use.
  • Adding a new email don’t require a read (and in particular, no reads are done internally). And because it is a set, you do not have to care about adding an email multiple times.

Lists

The list type is also provided for the cases where the order of elements matters (and is not the natural order induced by the type of the elements) or when the same value should be allowing multiple times. Suppose for instance that each users can specify its top 5 list of preferred places. You could build on the previous example with:

cqlsh> ALTER TABLE users ADD top_places list<text>;

and then:

cqlsh> UPDATE users
       SET top_places = [ 'rivendell', 'rohan' ] WHERE user_id = 'frodo';

You can prepend elements:

cqlsh> UPDATE users
       SET top_places = [ 'the shire' ] + top_places WHERE user_id = 'frodo';

as well as append ones:

cqlsh> UPDATE users
       SET top_places = top_places + [ 'mordor' ] WHERE user_id = 'frodo';

Note that all of these operations are implemented internally without any read-before-write, and appending/prepending a new element only writes this new element, it neither read anything nor imply rewriting the whole list.

Lists also support setting (and removing) an element by its index:

cqlsh> UPDATE users SET top_places[2] = 'riddermark' WHERE user_id = 'frodo';
cqlsh> DELETE top_places[3] FROM users WHERE user_id = 'frodo';

However, one must be aware that these two operations do require a read of the whole list internally (only the updated element is written however after that read, not the whole list) and so they will have a greater latency than appending or prepending for instance.

At that point, you will have:

cqlsh> SELECT user_id, top_places FROM users WHERE user_id = 'frodo';
 user_id | top_places
---------+----------------------------------------
 frodo   | ["the shire","rivendell","riddermark"]

The last supported operation on lists is removing elements by values:

cqlsh> UPDATE users
       SET top_places = top_places - ['riddermark'] WHERE user_id = 'frodo';

will remove all the occurrences of 'riddermark' from the list. As with index based operations, this requires a read internally. Note that one could want to “emulate” this operation client-side by reading the whole list, finding the indexes that contain the value to remove and then issue remove of those indexes. This however wouldn’t be “thread-safe”, in the sense that if another thread/client prepend elements to the list between the read and the write, the wrong elements will be removed. This native operation does not suffer from that problem.

Maps

The last provided collection type is the map. Say you want to store in each user profile a very basic reminder/todo list, that associates to a timestamp something that the user should remember before that time. You can add:

cqlsh> ALTER TABLE users ADD todo map<timestamp, text>;

Then the map can be wholly set/replaced using a map literal:

cqlsh> UPDATE users
       SET todo = { '2012-9-24' : 'enter mordor',
                    '2012-10-2 12:00' : 'throw ring into mount doom' }
       WHERE user_id = 'frodo';

or you can update/set/delete a specific element using:

cqlsh> DELETE todo['2012-9-24'] FROM users WHERE user_id = 'frodo';
cqlsh> UPDATE users SET todo['2012-10-2 12:00'] = 'throw my precious into mount doom'
       WHERE user_id = 'frodo';
cqlsh> UPDATE users SET todo['2012-10-2 12:10'] = 'die' WHERE user_id = 'frodo';

Note that similarly to sets, maps are really ordered maps (the order is the one induced by the type of the keys):

cqlsh> SELECT user_id, todo FROM users WHERE user_id = 'frodo';
 user_id | todo
---------+-------------------------------------------------------------------
 frodo   | {"2012-10-02 12:00:00+0200":"throw my precious into mount doom",
            "2012-10-02 12:10:00+0200":"die"}

Note that each element of the map is internally stored as one Cassandra column (this is true of sets and lists too), meaning that each element can have an individual TTL for instance. If you want elements of the todo list to expire the day of their timestamp, you can compute the correct ttl and do:

cqlsh> UPDATE users USING TTL <computed_ttl>
       SET todo['2012-10-1'] = 'find water' WHERE user_id = 'frodo';

Things to know

  • You can only retrieve a collection in its entirety. And while we may (or may not) relax that rule a bit in the future, this still means that collections are not meant to be excessively large. They are not a replacement for a proper modelisation into tables.
  • Collections are typed but cannot currently be nested. You can have a list<text> or a list<int>, but you cannot define a list<list<int>>. It might be possible to support such nesting in the future if that is deemed useful, but it will require additional work.
  • There is no support for secondary indexes on collections yet. Concretely, you could associate a set of tags to a user, but you cannot automatically index users by their tags yet. Adding that support is definitively on the roadmap but remains to be implemented.

Conclusion

Collection support has already been added to the Cassandra trunk so you can already play with them if you are curious. They will be part of the final release of CQL3 that will be part of Cassandra 1.2.

We believe collections in CQL3 will be a great help for data modelisation and are a very natural fit for the language. Furthermore, given that any decent programming language has good support for lists, sets and maps, the integration with higher level API will be straightforward.



Comments

  1. Patrick says:

    How does that a table like that actually translate to the underlying system?

    So if you use “users” from CQL, what do you actually see? I mean is this a CQL only feature?

  2. Aaron says:

    Is this implemented purely in CQL or will other API’s like Hector/pycassa also be able to use lists, maps and sets?

  3. Ivan says:

    Good news:) Do they rely on SCF?

  4. Sylvain Lebresne says:

    @Patrick @Aaron: The short answer is yes, this is a CQL3 only feature.

    The slightly longer, more nuanced answer is that internally each collection is implemented using one column per element, and with some composite comparator trick to make that work. So in theory you could access those columns from thrift, but you’d have to regroup the columns together, and the thrift API doesn’t expose enough metadata, so this will be hard to correctly.

    @Ivan: no, this doesn’t rely on SCF, but as said above on the composite comparator. Though the latter can largely be though as a more general and more flexible way to achieve super columns.

  5. Manfred Dardenne says:

    Would it be possible to do range select on list for instance ?

    Seems that range select according to the documentation is only available for second column of a composite primary key for the moment

  6. Abhijit says:

    Can we assume that secondary index support will be provided on collections at the time of release or it will take some more time???

  7. Sylvain Lebresne says:

    @manfred Not really. More precisely, we wouldn’t be able to do that without reading the whole list (due to the way it is implemented), which removes most interest.

    @Abhijit Sadly, the secondary index support won’t likely make it for 1.2, it will take a bit more time (in other words, it’s not yet implemented, it’s not just a 5 minutes work and we don’t want to delay 1.2 for that).

  8. Ian Campbell says:

    Will we be able to include items in the sets/lists/maps in the WHERE statement? Such as: ingredients set … SELECT * … WHERE ingredients IN (['pineapple', 'salami']) … or a CONTAINS or NOT CONTAINS? This would be immensely useful.

  9. Sylvain Lebresne says:

    @Ian Basically the secondary index support discussed in the previous comments will give us the CONTAINS. So that will happen but is not done yet. For the NOT CONTAINS, I’m not sure how we could do that efficiently, so it’s unclear if or when this might happen.

  10. Ian Campbell says:

    @Sylvain, so it won’t be able to inspect the data itself, but will depend on an index of the data? If it inspected the actual data, an index wouldn’t be needed, and CONTAINS / NOT CONTAINS would be straightforward since the data is sorted.

  11. Sylvain Lebresne says:

    @Ian The problem with doing a CONTAINS without an index is that this is very inefficient in general, and forbidding such generally inefficient query is something we want with Cassandra/CQL (much like the fact that we don’t allow ‘WHERE a = 3′ unless a is either part of the PK or is indexed).

  12. Ian Campbell says:

    @Sylvian Ok, I understand. I was thinking more from the point of view of filtering results to reduce network traffic. For example, one could query against several PK columns (composite key), but cannot filter values. At the end of the day, this is going to happen on the client, which is worse. My 2 cents worth :)

  13. Roshni Rajagopal says:

    Would it be possible to store a name value pair, like column name is Item Id and value is a collection with 2 elements in each row like {{‘name’ , ‘apple’}, {‘Descr’ , ‘Granny Smith’}, {‘Qty’,’3′} }

    or would we need to have a column named Item Id and value is {‘apple’, ‘granny smith’, ’3′}. and the application needs to figure out that 1st element is name, second is descr etc

    Also can you store different data types in each row of the list or do all elements need to be of the same type?

    Im guessing we cant do both- but can anyone confirm.

  14. Sylvain Lebresne says:

    @Roshni First, let me note that the first thing that come to mind given your example in CQL3 would be to use something like:
      CREATE TABLE products (
        SomeKey text,
        ItemId text,
        Name text,
        Descr text,
        Qty int,
        PRIMARY KEY (someKey, ItemId)
       )

    That being said, if you really want you could use a map (i.e. declare ‘ItemId map‘). However, the map will be typed, and so both key and values will have to have the same type, which in your exemple is fine for keys, but mean you’ll have to coerce the value for ‘Qty’ to a string for instance. That’s why I believe the schema described above likely correspond better to your need.

  15. Natalia Levine says:

    I have the same problem as commenter in comment 13 – I need to store a small number of child objects in the same row as the parent object. These child objects are not accessed separately from the parent. The use case is identical to your use case with e-mails, except I want to add some more data to these child objects. For example, I could store instant messaging addresses instead of e-mails and I would like to specify the IM provider along with address. I could of course follow the relational design pattern as you suggested and add another table for these child objects. However this will require a join every time I want to get the parent object (and custom implemented join at that). I could also serialize the objects, but I’d rather avoid it for obvious reasons. If you supported collections of collections, it would solve this problem. In the meantime I guess I’m back to IMaddress1, IMaddress2 etc… Do you have any other suggestions to resolve this?

  16. Homayoun Afshari says:

    @Natalia, “Map” is good for you!

  17. Joselo says:

    If i have table something like:

    CREATE table nicetable (
    key varchar PRIMARY KEY,
    mapcounter map
    );

    it is posible to update the counter? and how i do that.

  18. Joselo says:

    mapcounter map<text, counter>

  19. Sylvain Lebresne says:

    @Joselo: no, collections of counters are not supported. Though we don’t validate it correctly (see CASSANDRA-5082)

  20. Harry says:

    How can i load collection types with sstableloader?

  21. cowardlydragon says:

    Ok, so how do you test the existence of a member of a set in CQL, or get a specific key value, or select the keys of a map in CQL?

    If these are truly mapped as individual columns for the set keys and map keys, that should be doable through CQL3, but I haven’t found any syntax that works.

    For example, how does one do an efficient query to check for set/map key membership existence using CQL3? That should translate to a bloom filter check if the keys/set members are actually implemented as separate columns under the hood, isn’t it?

    Or if I want to retrieve a specific map’s value for a key, that should be a single column retrieve under the hood, but where’s the CQL3 syntax for SELECT map['key'] FROM table WHERE rowkey = ‘rowval’ ?

    - See more at: http://www.opensourceconnections.com/2013/07/24/understanding-how-cql3-maps-to-cassandras-internal-data-structure-sets-lists-and-maps/#sthash.qAgDH9Yk.dpuf

  22. Daniel Smedegaard Buus says:

    “Note that each element of the map is internally stored as one Cassandra column (this is true of sets and lists too)”

    So, does this also mean that changes to a map are replicated on this sub-level?

    What I mean is, given, say, a CF “meetup” with a map participants, one data center might remove a participant from the map while another data center might add a participant at the same time, and those changes, when replicated are merged? So that we get to eventual consistency with a map where both changes are reflected?

  23. Daniel Smedegaard Buus says:

    Okay, the details of the participants map where removed from my comment due to it resembling html markup.

    Basically the “participants” map would just be userid => username, but either way it shouldn’t matter ;)

  24. Sylvain Lebresne says:

    > So, does this also mean that changes to a map are replicated on this sub-level?

    Yes. Each element of the map will have it’s own timestamp, and replication will resolve each elements separately.

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>