DataStax Enterprise 2.2 Documentation

Completing the Configuration and Starting Up the Upgraded Node

A few configuration settings need to be made before completing the upgrade.

To complete the upgrade

  1. Configure the snitch setting as described in Configuring the Snitch Setting.

  2. If you customized property files other than cassandra-topology.properties, update the new files by hand: merge the settings from the old property files into the new property files instead of overwriting the new files. Users who overwrote these property files have reported problems.

    It is OK to overwrite the old cassandra-topology.properties file with the new one, as instructed in Configuring the Snitch Setting.

  3. Start the node.

  4. If necessary, upgrade any CQL drivers and client libraries, such as python-cql, Hector, or Pycassa, that are incompatible with the new DSE version. You can download CQL drivers and client libraries from the DataStax download page.

    The CQL utility is included in the DataStax Enterprise installation, so no upgrade of the CQL utility is necessary.

  5. Restart client applications.

  6. To upgrade the rest of the cluster with a rolling restart, repeat the upgrade procedure on the next node in the cluster, following the instructions in Performing a Rolling Upgrade exactly. Monitor the log files for any issues.

  7. If you created column families using the default SizeTieredCompactionStrategy, continue to the next step. If you created column families using LeveledCompactionStrategy, scrub the SSTables that store those column families.

  8. Validate the upgrade.

    If you use counter columns, upgrading SSTables is highly recommended.
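
Step 2 above asks you to merge customized settings from the old property files into the new ones rather than overwrite them. The following Python sketch illustrates one way to do that for simple `key=value` files. The function names `parse_properties` and `merge_properties` are hypothetical and not part of DataStax Enterprise, and the sketch assumes single-line `key=value` entries with `#` or `!` comment lines. Review the merged result by hand, and do not use it for cassandra-topology.properties, which is handled in Configuring the Snitch Setting.

```python
def parse_properties(text):
    """Return {key: value} for simple key=value lines; skip blanks and comments."""
    props = {}
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith(("#", "!")) or "=" not in stripped:
            continue
        key, value = stripped.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def merge_properties(old_text, new_text):
    """Carry customized values from old_text into new_text.

    Keeps the layout and comments of the new file. Returns the merged text
    and the sorted list of keys found only in the old file, which need a
    manual decision (they may be obsolete in the new version).
    """
    old = parse_properties(old_text)
    new_keys = set()
    merged_lines = []
    for line in new_text.splitlines():
        stripped = line.strip()
        is_setting = (stripped
                      and not stripped.startswith(("#", "!"))
                      and "=" in stripped)
        if is_setting:
            key = stripped.split("=", 1)[0].strip()
            new_keys.add(key)
            if key in old:
                # Assumption: the old value is the customization to keep.
                line = f"{key}={old[key]}"
        merged_lines.append(line)
    only_in_old = sorted(k for k in old if k not in new_keys)
    return "\n".join(merged_lines) + "\n", only_in_old
```

Keys that exist only in the old file are reported rather than copied, because the sketch cannot tell whether the new version still honors them.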

Scrubbing SSTables

If you created column families using LeveledCompactionStrategy, you need to scrub the SSTables that store those column families.

First, upgrade all nodes to the latest version of DataStax Enterprise, according to the platform-specific instructions presented earlier in this document. Next, complete steps 1-7 of Completing the Configuration and Starting Up the Upgraded Node. At this point, all nodes are upgraded and started.

Finally, follow these steps to install the sstablescrub utility and scrub the affected SSTables:

Tarball Installations

Download the sstablescrub and dse-env.sh utilities.

  1. Place the downloaded sstablescrub script into the $DSE_HOME/bin directory.

  2. Replace the dse-env.sh script in the $DSE_HOME/bin directory with the version you downloaded.

Packaged Installations (deb/rpm)

Download the sstablescrub and dse-env.sh utilities.

  1. Place the downloaded sstablescrub script in the /usr/bin directory.

  2. Replace dse.in.sh in the /usr/share/dse directory with the version you downloaded.

    Note

    Do not replace dse-env.sh in the /etc/dse directory.

To scrub SSTables:

  1. Shut down the nodes, one at a time.

  2. On each offline node, run the sstablescrub utility. The help output for sstablescrub is:

    usage: sstablescrub [options] <keyspace> <column_family>
    --
    Scrub the sstable for the provided column family.
    --
    Options are:
      --debug display stack traces
      -h,--help display this help message
      -m,--manifest-check only check and repair the leveled manifest, without
      actually scrubbing the sstables
      -v,--verbose verbose output
    

    For example, on a tarball installation:

    cd <install directory>/bin
    ./sstablescrub mykeyspace mycolumnfamily
    
  3. Restart each node and the client applications, one node at a time.

  4. Validate the upgrade.
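
Step 2 runs sstablescrub once per keyspace and column family. If several column families use LeveledCompactionStrategy, a small wrapper can build the command lines for you. The Python sketch below is illustrative only: `build_scrub_commands` is a hypothetical helper, and the only details taken from this document are the `sstablescrub [options] <keyspace> <column_family>` usage and the `-m` option shown in the help output.

```python
def build_scrub_commands(bin_dir, targets, manifest_check=False):
    """Return one sstablescrub argument list per (keyspace, column_family).

    bin_dir is the directory that holds the sstablescrub script.
    targets is an iterable of (keyspace, column_family) pairs.
    """
    scrub = bin_dir.rstrip("/") + "/sstablescrub"
    commands = []
    for keyspace, column_family in targets:
        cmd = [scrub]
        if manifest_check:
            # -m: only check and repair the leveled manifest
            cmd.append("-m")
        cmd += [keyspace, column_family]
        commands.append(cmd)
    return commands


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be reviewed.
    for cmd in build_scrub_commands("bin", [("mykeyspace", "mycolumnfamily")]):
        print(" ".join(cmd))
```

Each argument list could then be passed to `subprocess.run(cmd, check=True)` on a node that has been shut down, so that a failed scrub stops the loop instead of being silently skipped.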

If you do not scrub the affected SSTables, you might encounter the following error during compactions on column families using LeveledCompactionStrategy:

ERROR [CompactionExecutor:150] 2012-07-05 04:26:15,570 AbstractCassandraDaemon.java (line 134)
Exception in thread Thread[CompactionExecutor:150,1,main]
java.lang.AssertionError
at org.apache.cassandra.db.compaction.LeveledManifest.promote
(LeveledManifest.java:214)
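
To check whether a node has already hit this error, you can search its log for the assertion. The Python sketch below looks for a `java.lang.AssertionError` line followed closely by a `LeveledManifest` stack frame. The function name `find_leveled_manifest_errors` and the `window` parameter are hypothetical, and the matching is based only on the log excerpt above, so treat a match as a prompt to inspect the log rather than as a diagnosis.

```python
def find_leveled_manifest_errors(log_text, window=3):
    """Return 1-based line numbers of AssertionError lines that are followed,
    within `window` lines, by a LeveledManifest stack frame."""
    lines = log_text.splitlines()
    hits = []
    for i, line in enumerate(lines):
        if "java.lang.AssertionError" not in line:
            continue
        following = lines[i + 1:i + 1 + window]
        if any("LeveledManifest" in nxt for nxt in following):
            hits.append(i + 1)
    return hits
```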

Performing a Rolling Upgrade

With a rolling restart, you upgrade and start one node at a time instead of bringing down the entire cluster and starting all nodes at once. From the time the first node begins the upgrade until the last node completes it, a schema disagreement condition exists. This is expected behavior.

When the schema disagreement condition exists, client interfaces block the following operations:

  • DDL
  • TRUNCATE
  • Solr queries

For example, during a rolling upgrade, the following CQL commands are and are not supported:

OK to Run    Do Not Run          Do Not Run (continued)
DELETE       ALTER TABLE [1]     DROP TABLE [1]
INSERT       CREATE TABLE [1]    DROP INDEX
SELECT       CREATE INDEX        DROP KEYSPACE
UPDATE       CREATE KEYSPACE     TRUNCATE

[1] TABLE and COLUMNFAMILY are interchangeable.
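
If client applications or maintenance scripts issue CQL during a rolling upgrade, a simple guard can hold back the statements listed as "Do Not Run". The Python sketch below encodes the table above and nothing more. The names `rolling_upgrade_status`, `BLOCKED_PREFIXES`, and `ALLOWED_PREFIXES` are hypothetical, and the check is a plain prefix match, not a CQL parser, so statements that the table does not list are reported as "unknown".

```python
# Statement prefixes from the "Do Not Run" columns; TABLE and COLUMNFAMILY
# are interchangeable, so both spellings are listed.
BLOCKED_PREFIXES = (
    "ALTER TABLE", "ALTER COLUMNFAMILY",
    "CREATE TABLE", "CREATE COLUMNFAMILY",
    "CREATE INDEX", "CREATE KEYSPACE",
    "DROP TABLE", "DROP COLUMNFAMILY",
    "DROP INDEX", "DROP KEYSPACE",
    "TRUNCATE",
)
# Statement prefixes from the "OK to Run" column.
ALLOWED_PREFIXES = ("DELETE", "INSERT", "SELECT", "UPDATE")


def rolling_upgrade_status(statement):
    """Classify a CQL statement as 'ok', 'blocked', or 'unknown'."""
    # Uppercase and collapse whitespace so "create   table" still matches.
    normalized = " ".join(statement.upper().split())
    if normalized.startswith(BLOCKED_PREFIXES):
        return "blocked"
    if normalized.startswith(ALLOWED_PREFIXES):
        return "ok"
    return "unknown"
```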

Cassandra throws a SchemaDisagreementException when a schema disagreement occurs. Continue upgrading until you complete all upgrade steps on all nodes. Then, using the Command Line Interface (CLI), run the DESCRIBE CLUSTER command:

$ cd <new DSE installation>
$ cassandra-cli -host localhost -port 9160
[default@demo] DESCRIBE CLUSTER;

Ensure that the output shows a single schema version for all nodes. If the output indicates a schema disagreement, or if a node is UNREACHABLE, perform these steps:

  1. Restart the node.

  2. Run the DESCRIBE CLUSTER command again.

  3. Repeat this process until the output shows a single schema version for all nodes.

    Ensure that the schema agrees before running DDL workloads.

Performing a Rolling Restart on Analytics or Solr Nodes

A rolling restart is not fully supported on Analytics and Solr nodes: exceptions, which you can ignore, flood the log file.

The Hadoop job tracker repeatedly logs exceptions until all analytics nodes are upgraded. If you can tolerate these exceptions being added to the log file, use the rolling restart. The runtime exceptions you might see when starting analytics nodes look something like these snippets:

ERROR [GossipStage:1] 2012-09-21 01:09:21,510 AbstractCassandraDaemon.java
 (line 139) Fatal exception in thread . . .

INFO [JOB-TRACKER-INIT] 2012-09-20 07:06:38,064 JobTracker.java (line 2427) problem
 cleaning system directory: cfs:/tmp/hadoop-automaton/mapred/system
 java.io.IOException: java.lang.RuntimeException: TimedOutException() . . .

The runtime exceptions you might see when starting Solr nodes look something like this snippet:

javax.management.InstanceAlreadyExistsException: solr/Logging.log_entries:
  type= . . .

Ignore these exceptions. When the last node upgrades, restarts, and joins the cluster, the exceptions cease. As previously mentioned, upgrade and start the new job tracker node first.

Validating the Upgrade

After all nodes are upgraded, validate the upgrade by checking that these conditions do not exist:

  • A schema disagreement
  • Missing keyspaces

To check for a schema disagreement

  1. Using the Command Line Interface (CLI), run the DESCRIBE CLUSTER command:

    $ cassandra-cli -host localhost -port 9160
    
    [default@unknown] DESCRIBE cluster;
    

    If any node is UNREACHABLE, you see output something like this:

      [default@unknown] describe cluster;
      Cluster Information:
      Snitch: com.datastax.bdp.snitch.DseDelegateSnitch
      Partitioner: org.apache.cassandra.dht.RandomPartitioner
      Schema versions:
        UNREACHABLE: [10.202.205.203, 10.80.207.102, 10.116.138.23]
    
  2. Restart unreachable nodes.

  3. Repeat steps 1 and 2 until all nodes show the same schema version.
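
If you capture the DESCRIBE CLUSTER output to a file or string, the check in these steps can be automated. The Python sketch below assumes that every entry under "Schema versions:" has the form `<version>: [address, ...]`, the same shape as the UNREACHABLE line in the sample output. That format, and the names `parse_schema_versions` and `schema_agrees`, are assumptions for illustration, so compare against real output from your cluster before relying on it.

```python
import re

# One entry under "Schema versions:", e.g. "UNREACHABLE: [10.0.0.1, 10.0.0.2]".
ENTRY = re.compile(r"^\s*(?P<label>[^:\[\]]+):\s*\[(?P<hosts>[^\]]*)\]\s*$")


def parse_schema_versions(output):
    """Map each schema version (or 'UNREACHABLE') to its list of hosts."""
    versions = {}
    in_section = False
    for line in output.splitlines():
        if line.strip().startswith("Schema versions:"):
            in_section = True
            continue
        if not in_section:
            continue
        match = ENTRY.match(line)
        if match:
            hosts = [h.strip() for h in match.group("hosts").split(",") if h.strip()]
            versions[match.group("label").strip()] = hosts
    return versions


def schema_agrees(output):
    """True when exactly one schema version is listed and no node is unreachable."""
    versions = parse_schema_versions(output)
    live = [v for v in versions if v != "UNREACHABLE"]
    return len(live) == 1 and not versions.get("UNREACHABLE")
```

A result of `False` corresponds to the cases described above: either more than one schema version is listed or at least one node is UNREACHABLE.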

To check for missing keyspaces

  1. Use the CLI to run a SHOW KEYSPACES command:

    $ cd <new installation>/bin
    $ cassandra-cli -host localhost -port 9160
    [default@unknown] SHOW KEYSPACES;
    

    The output consists of the schema definitions in the upgraded installation.

  2. Compare the output with the schema definitions you saved.
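
Comparing two long SHOW KEYSPACES listings by eye is error-prone, so the comparison in step 2 can be reduced to a set difference on keyspace names. The Python sketch below assumes that each keyspace in the output is introduced by a line beginning with `Keyspace: <name>`. That line format, and the names `keyspace_names` and `missing_keyspaces`, are assumptions for illustration. The sketch checks only for missing keyspaces, not for differences inside a keyspace definition.

```python
import re

# Assumed header line for each keyspace, e.g. "Keyspace: demo:".
KEYSPACE_LINE = re.compile(r"^\s*Keyspace:\s*(?P<name>[^:\s]+)")


def keyspace_names(show_keyspaces_output):
    """Collect keyspace names from lines that start with 'Keyspace:'."""
    names = set()
    for line in show_keyspaces_output.splitlines():
        match = KEYSPACE_LINE.match(line)
        if match:
            names.add(match.group("name"))
    return names


def missing_keyspaces(saved_output, current_output):
    """Return keyspaces present in the saved output but absent after the upgrade."""
    return sorted(keyspace_names(saved_output) - keyspace_names(current_output))
```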

Post-Upgrade Problems?

In the event of a post-upgrade problem, such as a schema disagreement, contact Support before attempting further DDL operations.