Learning Objectives for Developer Training with Cassandra

  • 2.1. Basics.
    • 2.1.1. Understand what Cassandra (hereafter “C*”) columns are, their components, and their role.

    • 2.1.2. Describe the column’s role in a column family.
    • 2.1.3. Identify column components (TimeStamp, Name/Value, TTL, Optional or Required Components).
    • 2.1.4. Details for Column Name.
    • 2.1.5. Details for Column Value.
    • 2.1.6. What is the TimeStamp used for?
    • 2.1.7. What is the TimeStamp?
    • 2.1.8. Details for the TTL.
    • 2.1.9. What are SuperColumns?
    • 2.1.10. What are CounterColumns?
    • 2.1.11. How are columns kept in sorted order inside rows?
    • 2.1.12. How are column values validated?
    • 2.1.13. What are comparators used for?
    • 2.1.14. Describe what a comparator is.
    • 2.1.15. What does it mean to reverse a comparator?

  • 2.2. Column Families.
    • 2.2.1. What is a column family?

    • 2.2.2. What is a Keyspace?

    • 2.2.3. What is stored in a Keyspace?

    • 2.2.4. What is a cluster?

    • 2.2.5. Acquire a basic understanding of Consistent Hashing.

    • 2.2.6. How does the Partitioner select partitions?

    • 2.2.7. What are rows?

    • 2.2.8. What is Denormalization?
    • 2.2.9. What is Eventual Consistency?

    • 2.2.10. Describe Static Column Families.

    • 2.2.11. Describe Dynamic Column Families.
    • 2.2.12. Use effective query patterns with C*.
    • 2.2.13. Identify anti-patterns in C* queries.

    • 2.2.14. Be able to denormalize your data into C*.
    • 2.2.15. What do TTLs do?

    • 2.2.16. How do Comparators affect the data modeling?

    • 2.2.17. Recognize a client side join and how to avoid it.

    • 2.2.18. Know how C* models relationships in the data (it doesn’t).

    • 2.2.19. Analyze a denormalized schema for the additional work caused by the absence of triggers, foreign keys, and joins.

    • 2.2.20. Perform a cost-benefit analysis when choosing whether to optimize for reads or writes.

    • 2.2.21. Understand what happens when you insert data in the absence of primary key constraints.

    • 2.2.22. Understand what UUIDs are and the different types.
  • 2.3. The API.
    • 2.3.1. Understand the purpose of the Thrift API.

    • 2.3.2. What is Thrift?

    • 2.3.3. For what do we use Thrift?

    • 2.3.4. Identify what could go wrong and why.

    • 2.3.5. Describe consistency requirements for reads and writes.

    • 2.3.6. Describe the function of Thrift methods for get, set, multi-get, indexed slice.

    • 2.3.7. What is execute_cql_query used for?

    • 2.3.8. Understand what is in the Thrift type ColumnOrSuperColumn.
    • 2.3.9. What is a slice predicate?
    • 2.3.10. What are the available constraints?

    • 2.3.11. What is a key range?

    • 2.3.12. Why you shouldn’t use raw Thrift.

    • 2.3.13. How does Thrift access CQL?

    • 2.3.14. Which client is best for you, and how to choose one.
  • 2.4. Indexing.
    • 2.4.1. Explain how C* uses “Primary Indexes” or “Partition Keys” to locate data in the cluster.

    • 2.4.2. Explain how C* uses “Primary Indexes” to locate data on a machine.
    • 2.4.3. Know why you should choose the Random Partitioner.

    • 2.4.4. What are secondary indexes?

    • 2.4.5. What is the absolute column limit?

    • 2.4.6. What is a more practical column limit?

    • 2.4.7. What are schemas good for?

    • 2.4.8. Why do people typically use wide rows in Cassandra?

    • 2.4.9. Know how rows are stored on partitions.

    • 2.4.10. Understand how to create and maintain wide row indexes for grouping, sorting, and range queries.

    • 2.4.11. Understand how Super and Composite columns sort data.

  • 2.5. Looking at a Cassandra Application.
    • 2.5.1. Analyze an existing Cassandra application.

    • 2.5.2. Identify effective applications of Column Families.

    • 2.5.3. Identify Static and Dynamic column families and how they are used.

    • 2.5.4. Identify C* application anti-patterns.

    • 2.5.5. Use CQL3 to denormalize a schema and eliminate client side joins.

  • 2.6. Composite Columns.
    • 2.6.1. What are composite columns in Thrift?

    • 2.6.2. How should I use composites in CQL?

    • 2.6.3. What should be avoided?
  • 2.7. Sorted Lists Wide Rows.
    • 2.7.1. Leverage columns to create sorted lists.

    • 2.7.2. Understand how columns are sorted.
    • 2.7.3. Understand what happens when you change a value in a column.

    • 2.7.4. Understand what happens when you insert a new column name (or an item with a new column name).

    • 2.7.5. Describe what happens when you try to “update” the column name in a list (inserts a new item).

    • 2.7.6. Know some alternative ways to update lists (create new lists, versions, surrogate keys).

    • 2.7.7. How to use inverted indexes to add, update, and delete items from a list.

  • 2.8. Secondary Indexes.
    • 2.8.1. Understand how Secondary Indexes are implemented in C*.

    • 2.8.2. Identify effective query patterns with Secondary indexes.

    • 2.8.3. Identify anti-patterns with secondary indexes.
    • 2.8.4. Be able to create a secondary index on a table.

    • 2.8.5. Be able to query on a secondary index.

    • 2.8.6. Understand how cardinality affects querying secondary indexes.
  • 2.9. Time Series Data.
  • 2.10. Modeling Achievements.
  • 2.11. Hadoop.
  • 2.12. Hive.
  • 2.13. Pig.
  • 2.14. Solr.