Learning Objectives for Administrator Training with Cassandra

Module 1: Introduction to Operations

The student will be able to:

  • understand the basic operations of Cassandra 1.2
  • demonstrate how to configure Cassandra 1.2 operations

Module 2: The Write Path

The student will be able to:

  • understand why Cassandra 1.2 can write data fast
  • understand what the coordinator role is
  • understand what the data role of a node is
  • describe what a Memtable is
  • describe what the CommitLog is
  • explain how durability is achieved in Cassandra 1.2
  • define what a successful write is within Cassandra 1.2
  • understand what could cause a write failure in Cassandra 1.2
  • understand how Cassandra 1.2 recovers writes via the CommitLog
  • identify the different methods available in Cassandra to monitor writes
  • understand how concurrent reads and writes can affect each other
  • identify the different issues that can cause writes to be slow or fail
  • understand why the Murmur3Partitioner (M3P) is preferred over the ordered partitioner
  • understand that an existing cluster cannot be upgraded from the MD5-based RandomPartitioner to M3P

Module 3: The Read Path

The student will be able to:

  • describe the sequence of the read path
  • understand the role of the row cache in a read operation
  • understand the role of the key cache in a read operation
  • explain what bloom filters are and how they improve read performance
  • understand the role of the primary index in a read operation
  • identify the various components that Cassandra 1.2 uses when reading columns versus an entire row
  • identify the causes of slow reads or a read failure

Module 4: Ring Management

The student will be able to:

  • explain what virtual nodes are
  • understand why to use virtual nodes
  • demonstrate how virtual nodes are enabled
  • describe how repair works
  • describe what decommission does
  • describe what remove token does
  • understand how to recover a downed node in your cluster
  • understand what bootstrapping is and how it works
  • describe what a seed node is
  • understand how many seeds a cluster should have
  • understand how to set up a multi-datacenter cluster

Module 5: Packaging for Cassandra

The student will be able to:

  • understand the different Cassandra distributions
  • explain how Cassandra is packaged from Apache
  • differentiate between the different DataStax distributions
  • identify how the DataStax distributions are shipped
  • identify where to download the DataStax distributions
  • explain what is in the DataStax Community distribution
  • explain what is in the DataStax Enterprise distribution

Module 6: Tuning

The student will be able to:

  • understand effective methods of tuning Cassandra 1.2
  • understand how the row cache works
  • understand when and when not to use the row cache
  • tune the row cache
  • demonstrate how to monitor the row cache
  • understand how the key cache works
  • demonstrate how to tune the key cache
  • explain how to properly estimate the size of the cache(s)
  • understand how to tune for database performance through data modeling
  • identify hardware and OS causes of performance issues
  • determine the appropriate heap size for a Cassandra 1.2 instance
  • understand the different methods of tuning for write performance
  • understand how to tune for better garbage collection
  • understand how to tune for better compaction performance

Module 7: The System Keyspace

The student will be able to:

  • identify the different components of the system keyspace
  • understand what can be found in the index info
  • identify the different items that can be found in the location info
  • understand what schema migrations are and where to view the data
  • locate hint information from the current node
  • locate version information from the system keyspace
  • understand how to manage the system keyspace if problems occur

Module 8: Managing Data

The student will be able to:

  • understand and identify the various components of the Cassandra 1.2 data structure
  • establish data directory location(s)
  • explain how keyspaces fit into the data structure
  • explain how tables fit into the data structure
  • identify the table filename structure
  • understand what information is held within a table data file
  • understand what a snapshot is
  • create and locate snapshots within the Cassandra 1.2 system

Module 9: Sizing

The student will be able to:

  • determine how to properly size a Cassandra 1.2 cluster
  • determine how much memory to give to Java
  • determine what type of CPU to use
  • determine the optimal disk configuration for the nodes in a Cassandra 1.2 cluster
  • choose the proper disk type for the CommitLog and data directories
  • understand how different compaction strategies affect disk sizing requirements
  • determine disk capacity requirements for a Cassandra 1.2 node
  • explain how large per-node capacity can and should be
  • decide if SSDs are good for a Cassandra 1.2 node
  • understand concerns when using the cloud for a Cassandra 1.2 cluster

Module 10: Troubleshooting

The student will be able to:

  • understand how to troubleshoot different issues with Cassandra 1.2
  • identify the different issues that can cause data corruption in Cassandra 1.2
  • understand the indicators for data corruption
  • list the various steps and methods to manually fix corruption issues
  • understand what can cause schema problems
  • understand how to fix schema migration problems

Module 11: Monitoring

The student will be able to:

  • describe what OpsCenter is and how it operates
  • describe what can be monitored with OpsCenter
  • identify how OpsCenter monitors a cluster
  • describe how to monitor a cluster with nodetool
  • describe how to monitor a cluster using JConsole
  • explain what can be monitored with JConsole
  • explain how to navigate the JConsole UI
  • describe what the different compaction metrics are
  • describe the thread pool monitoring options
  • understand how to monitor Read and Write latency in JConsole
  • understand how to monitor and adjust your cache(s) in JConsole

Module 12: Compaction

The student will be able to:

  • describe what compaction is and why it is necessary
  • describe the compaction process in general
  • identify the two different compaction strategies available in Cassandra 1.2
  • explain what a full compaction is when using Size-Tiered compaction and why it is not a good practice
  • describe the repair operation, cleanup operation, and scrub operation
  • describe what the upgradesstables tool does
  • describe when caches are saved
  • describe how to tune compaction in the JVM and in YAML
  • demonstrate the use of nodetool operations to monitor compaction

Module 13: Tombstones

The student will be able to:

  • explain what a tombstone is
  • explain why tombstones exist
  • identify when tombstones are created
  • explain how deletes and tombstones work
  • understand deletes in a distributed system
  • identify when tombstones are evicted
  • explain how the tombstone configuration parameters affect tombstone removal
  • name the different components of a row that can receive a tombstone
  • identify which methods of tombstone removal are better

Module 14: Failure

The student will be able to:

  • identify the different types of failure that can occur in a distributed system
  • understand the effects of failure on the Cassandra 1.2 system
  • understand the consequences of a node failure and how to protect against it
  • identify the various methods of addressing node failure
  • identify the causes of a partial node failure
  • understand the various causes of network failure
  • explain the causes and effects of a network partition
  • understand how to fix a network partition error
  • understand how Cassandra 1.2 reacts to disk failure
  • understand which disk configurations are nominal for a Cassandra 1.2 cluster

For more information, contact us.