Hi all experts,
First, please excuse my English; it is not my native language. I'm working on moving a SQL database to Cassandra, but I have a question I haven't been able to answer. Let's say I have a SQL table where I store songs. Each song has an ID as its primary key, which gives access to all its related data, stored in the fields of the row identified by that key. I also have some indexes to search by different criteria, such as author, genre, title...
When I think about moving this to a Cassandra schema, my idea is to create an equivalent column family where the song ID is the row key and the song attributes are the columns. Then I can create 5 or 6 manual indexes to search by author, title, genre and more. In each index, the author, title... will be the column name (with some extra data added to keep it unique, using a composite column name), and the value will be the song ID, which I then use to look up the song in the static column family where each row is identified by the song ID.
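To make the layout I have in mind concrete, here is a minimal sketch in plain Python (no Cassandra driver involved). The names `songs`, `build_index` and `lookup_ids`, the sample data, and the choice of `(value, song_id)` as the composite column name are only assumptions for illustration. A sorted list stands in for one wide index row.

```python
from bisect import bisect_left

# Static column family: row key = song ID, columns = song attributes.
# (Sample data is made up for the example.)
songs = {
    1: {"title": "Song A", "author": "Alice"},
    2: {"title": "Song B", "author": "Bob"},
    3: {"title": "Song C", "author": "Alice"},
}

def build_index(songs, attribute):
    """Build one wide index row as sorted composite names (value, song_id).

    Appending the song ID keeps each column name unique even when
    several songs share the same author or title.
    """
    return sorted((attrs[attribute], song_id) for song_id, attrs in songs.items())

def lookup_ids(index_row, value):
    """Slice the index row: collect every song ID whose first component is value."""
    start = bisect_left(index_row, (value,))
    ids = []
    for name, song_id in index_row[start:]:
        if name != value:
            break
        ids.append(song_id)
    return ids

# First read: slice the index. Second read: fetch the rows by song ID.
author_index = build_index(songs, "author")
alice_ids = lookup_ids(author_index, "Alice")
alice_songs = [songs[song_id] for song_id in alice_ids]
```

In this sketch every search costs one slice of the index row plus one lookup per matching song in the static column family.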
But here is my doubt. Which is better: each index CF storing only the ID, or storing all the attributes? The first option reduces the amount of memory needed, but I need (at least) 2 reads to get each song's attributes. The second option needs more memory because the same information is repeated once per index, but I can get all the attributes I need in a single read. I think I can accept the extra memory if this schema is faster, but will it really be faster? Won't a bigger database make it slower? Or is the slower operation looking up each row returned by the index CF, because of the way Cassandra stores rows and because of the 2 reads?
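The trade-off can be put in numbers with a small sketch in plain Python. It only counts rows touched and cells stored for a toy data set, so it says nothing about real latency on a cluster. The helper names (`build_index`, `search`, `stored_cells`), the `store_all` flag and the sample data are hypothetical.

```python
# Static column family: row key = song ID (made-up sample data).
songs = {
    1: {"title": "Song A", "author": "Alice"},
    2: {"title": "Song B", "author": "Bob"},
    3: {"title": "Song C", "author": "Alice"},
}

def build_index(songs, attribute, store_all):
    """Index row keyed by the composite name (value, song_id).

    store_all=False: the cell holds only the song ID (option 1).
    store_all=True:  the cell holds a copy of all attributes (option 2).
    """
    index = {}
    for song_id, attrs in songs.items():
        name = (attrs[attribute], song_id)
        index[name] = dict(attrs) if store_all else song_id
    return index

def search(index, songs, value, store_all):
    """Return (matching songs, number of rows touched)."""
    hits = [index[name] for name in sorted(index) if name[0] == value]
    if store_all:
        return hits, 1  # everything comes from the index row
    # Option 1: one index row plus one static-CF row per matching song.
    return [songs[song_id] for song_id in hits], 1 + len(hits)

def stored_cells(index, store_all):
    """Count the values stored in the index (a rough proxy for its size)."""
    return sum(len(cell) if store_all else 1 for cell in index.values())

ids_index = build_index(songs, "author", store_all=False)
full_index = build_index(songs, "author", store_all=True)

ids_result, ids_rows = search(ids_index, songs, "Alice", store_all=False)
full_result, full_rows = search(full_index, songs, "Alice", store_all=True)
```

Both searches return the same songs. The ID-only index touches one row per match on top of the index row, while the denormalized index touches a single row but stores one copy of every attribute per index.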
Any help would be much appreciated.
Thanks in advance!