DataStax Enterprise 2.0 Documentation

Upgrading DataStax Enterprise

You can upgrade these releases to DataStax Enterprise 2.0:

  • A previous release of DataStax Enterprise
  • Cassandra 0.7.10, 0.8.10, and 1.0.x

To upgrade from a Brisk release, contact Support.

To upgrade DataStax OpsCenter, see Upgrading OpsCenter and OpsCenter Agents.

This section lists component version changes and other major changes included in DSE upgrades:

Upgrade: DSE 1.0 or 1.0.x to 2.0.x

Changes:

  • Cassandra updated to 1.0.8
  • Hadoop updated to 1.0
  • Hive updated to 0.8.1
  • Pig updated to 0.8.3 (effective in DSE 1.0.2)
  • Sqoop 1.4.1 added
  • Solr 4.0 added

Best Practices for Upgrading

The following best practices are recommended when upgrading:

  • Always take a snapshot before upgrading to a new release. A snapshot allows you to roll back to the previous version if necessary. Cassandra can read data files created by the previous version, but the inverse is not always true.

    Note

    Taking a snapshot is fast, especially if you have JNA installed, and consumes effectively zero disk space until you start compacting the live data files again.

  • Always read NEWS.txt before starting an upgrade. NEWS.txt contains critical information about upgrading from early releases, especially Cassandra 0.6.x and 0.7, to DSE. NEWS.txt is in the following directory:

    Binary tarball: <install_location>/resources/cassandra/NEWS.txt

    Debian or RPM: /usr/share/doc/dse-libcassandra*/NEWS.txt

  • If your cluster includes a job tracker (Hadoop-enabled) node, upgrade that node first, then upgrade other nodes.
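As a minimal sketch of the snapshot practice above, the following builds the nodetool snapshot command for one node so that you can review it before running it. It assumes nodetool is on the PATH and that the node is reachable at the given host; NODE_HOST and snapshot_command are hypothetical names used only for this illustration.

```shell
# Assumption: nodetool is on PATH and the node is reachable on NODE_HOST.
# NODE_HOST is a hypothetical variable introduced for this sketch.
NODE_HOST="${NODE_HOST:-localhost}"

# Build the snapshot command as a string so it can be reviewed before use.
snapshot_command() {
  printf 'nodetool -h %s snapshot' "$1"
}

# Print the command instead of running it.
echo "Would run: $(snapshot_command "$NODE_HOST")"
```

Run the printed command on each node before you install the new release on that node.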

Upgrading to DataStax Enterprise 2.0

  1. In general, follow the instructions for a new installation, with the modifications for upgrading described in the following steps.

  2. Diff the following configuration files:

    • The cassandra.yaml from the old installation
    • The new DSE 2.0 cassandra.yaml

    The new DSE 2.0 cassandra.yaml is in:

    • Binary tarball installation: <install_location>/resources/cassandra/conf
    • Debian or RPM package: /etc/dse/cassandra

  3. Merge the diffs, except those related to snitches, from your old file into the new DSE 2.0 version of cassandra.yaml. Observe these Do's and Don'ts:

    Do perform merging by hand. For example, set the seed location and local host name in the new cassandra.yaml to the same values as the old cassandra.yaml.

    Don't attempt to copy property files from the prior release and overwrite files in the new release. Users who have attempted this have reported problems.

    Don't add snitch settings from the old cassandra.yaml to the new cassandra.yaml. The new default snitch in the cassandra.yaml is com.datastax.bdp.snitch.DseDelegateSnitch. In previous versions, the default snitch was com.datastax.bdp.snitch.DseSimpleSnitch.

  4. Perform one of the following tasks, depending on the snitch setting (endpoint_snitch URL) of your old cassandra.yaml file:

    • org.apache.cassandra.locator.SimpleSnitch - Leave the default delegated_snitch in the new dse.yaml unchanged.
    • org.apache.cassandra.locator.PropertyFileSnitch - Copy cassandra-topology.properties from the old installation. Paste it to <install_location>/resources/cassandra/conf, overwriting the new properties file.
    • Any other snitch URL - Change the default delegated_snitch in the new dse.yaml file to your current snitch setting.

    Note

    The default delegated_snitch setting in the new dse.yaml file in <install_location>/resources/dse/conf is: delegated_snitch: com.datastax.bdp.snitch.DseSimpleSnitch.

  5. If necessary, upgrade any CQL drivers and client libraries, such as python-cql, Hector, or Pycassa, that are incompatible with the new DSE version. You can download CQL drivers and client libraries from the DataStax download page.

  6. Flush the commit log on the upgraded node by running nodetool drain.

  7. Stop the old Cassandra process. Start the new Cassandra process as described in the next section.
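The diff and snitch steps above can be sketched as two small shell functions. This is an illustration under stated assumptions, not a DataStax-supplied tool: diff_without_snitch and snitch_task are hypothetical helper names, the file paths are supplied by the caller, and the sketch assumes endpoint_snitch appears as a top-level key at the start of a line in cassandra.yaml.

```shell
# Sketch only: both function names are hypothetical helpers.

# diff_without_snitch OLD NEW: show the differences between two cassandra.yaml
# files, leaving out the endpoint_snitch line, which must not be carried over.
diff_without_snitch() {
  old_file="$1"
  new_file="$2"
  tmp_old=$(mktemp)
  tmp_new=$(mktemp)
  grep -v '^endpoint_snitch:' "$old_file" > "$tmp_old"
  grep -v '^endpoint_snitch:' "$new_file" > "$tmp_new"
  diff "$tmp_old" "$tmp_new"
  status=$?
  rm -f "$tmp_old" "$tmp_new"
  return "$status"
}

# snitch_task OLD_SNITCH: print the task from step 4 that applies to the
# endpoint_snitch value found in the old cassandra.yaml.
snitch_task() {
  case "$1" in
    org.apache.cassandra.locator.SimpleSnitch)
      echo "leave delegated_snitch in dse.yaml unchanged" ;;
    org.apache.cassandra.locator.PropertyFileSnitch)
      echo "copy cassandra-topology.properties from the old installation" ;;
    *)
      echo "set delegated_snitch in dse.yaml to $1" ;;
  esac
}
```

For example, `diff_without_snitch /path/to/old/cassandra.yaml /path/to/new/cassandra.yaml` lists the settings to merge by hand (both paths are placeholders), and `snitch_task` with the old endpoint_snitch value as its argument names the matching task.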

About Starting the Upgraded Node

DataStax supports rolling restarts of nodes other than Analytics nodes. In a rolling restart, you upgrade and start one node at a time instead of bringing down the entire cluster and starting all nodes at once.

Note

You can also start Analytics nodes using a rolling restart if you can tolerate the log files filling with exceptions.

The Hadoop job tracker repeatedly logs exceptions until all Analytics nodes are upgraded. The runtime exception logged when starting Analytics nodes resembles the following:

INFO [pool-3-thread-1] 2012-03-22 02:09:08,868 Server.java (line 542) IPC Server listener
on 8012: readAndProcess threw exception
java.lang.RuntimeException: readObject can't find class. Count of bytes read: 0
java.lang.RuntimeException: readObject can't find class
at org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:185)
. . .

You can ignore these exceptions. When the last node upgrades, restarts, and joins the cluster, the exceptions cease.

Note

If you have Analytics nodes in the cluster, upgrade and start the new job tracker node first.

Completing the Upgrade

  1. Flush the commit log on the upgraded node by running nodetool drain.

  2. You might need to run the following command against each node before running repair, moving nodes, or adding new ones.

    • Binary tarball: <install_location>/bin/nodetool -h <hostname> upgradesstables
    • Debian or RPM Package: nodetool -h <hostname> upgradesstables

    To determine whether you need to run this command, see NEWS.txt in the location mentioned earlier.

  3. Monitor the log files for any issues.

  4. Upgrade the next node.
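The per-node commands in steps 1 and 2 can be sketched as follows. This is a minimal illustration, assuming nodetool is on the PATH; run, finish_node, and DRY_RUN are hypothetical names introduced here. With DRY_RUN=1 the commands are only printed, which lets you check them before touching a node.

```shell
# run: print the command when DRY_RUN=1, otherwise execute it.
# DRY_RUN is a hypothetical variable introduced for this sketch.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# finish_node HOST: flush the commit log (step 1), then rewrite the SSTables
# (step 2). Keep the upgradesstables line only if NEWS.txt calls for it.
finish_node() {
  host="$1"
  run nodetool -h "$host" drain &&
  run nodetool -h "$host" upgradesstables
}

# Print the commands for one node without running them.
DRY_RUN=1
finish_node localhost
```

Unset DRY_RUN to execute the commands, and repeat for each node as you work through the cluster.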

Upgrading a DataStax Enterprise AMI

Before upgrading, be sure to read Best Practices for Upgrading above and the /usr/share/doc/dse-libcassandra/NEWS.txt in the newer packages.

Note

If you have Analytics nodes in the cluster, upgrade and restart the job tracker node first.

  1. On each node, ensure that the DataStax repository is listed in /etc/apt/sources.list:

    deb http://<username>:<password>@debian.datastax.com/enterprise stable main
    

    where <username> and <password> are the DataStax account credentials from your registration confirmation email.

  2. If necessary, add the DataStax repository key to your aptitude trusted keys.

    $ wget -O - http://debian.datastax.com/debian/repo_key | sudo apt-key add -
    
  3. On each node, run the following commands:

    $ sudo apt-get update
    $ sudo apt-get install dse-full
    
  4. Compare the new and old versions of the cassandra.yaml file and any other property files that may have changed in the /etc/dse directory.

    After installing the upgrade, a backup of the cassandra.yaml is created in the /etc/dse/conf/cassandra/conf directory. Use this copy to compare the two configurations and make appropriate changes.

    1. Diff the following configuration files:

      • The cassandra.yaml from the old installation
      • The new DSE 2.0 cassandra.yaml
    2. Merge the differences by hand from the old cassandra.yaml into the new DSE 2.0 cassandra.yaml, observing the following:

      Don't add snitch settings from the old file to the new file. The new default snitch in the cassandra.yaml is com.datastax.bdp.snitch.DseDelegateSnitch. In previous versions, the default snitch was com.datastax.bdp.snitch.DseSimpleSnitch.

      Don't copy property files from the prior release and overwrite files in the new release. Users who have attempted this have reported problems.

  5. Perform one of the following tasks, depending on the snitch setting (endpoint_snitch URL) of your old cassandra.yaml file:

    • org.apache.cassandra.locator.SimpleSnitch - Leave the default delegated_snitch in the new dse.yaml unchanged.
    • org.apache.cassandra.locator.PropertyFileSnitch - Copy cassandra-topology.properties from the old installation. Paste it to <install_location>/resources/cassandra/conf, overwriting the new properties file.
    • Any other snitch URL - Change the default delegated_snitch in the new dse.yaml file to your current snitch setting.

    Note

    The default delegated_snitch setting in the new dse.yaml file in <install_location>/resources/dse/conf is delegated_snitch: com.datastax.bdp.snitch.DseSimpleSnitch.

  6. If necessary, upgrade any CQL drivers and client libraries, such as python-cql, Hector, or Pycassa, that are incompatible with the new DSE version. You can download CQL drivers and client libraries from the DataStax download page.

  7. Flush the commit log on the upgraded node by running nodetool drain.

  8. Restart the node:

    sudo service dse restart
    
  9. Restart client applications.
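Before running the apt-get commands in the procedure above, you can confirm that the repository entry from step 1 is present. The following is a minimal sketch: has_datastax_repo and SOURCES_FILE are hypothetical names introduced for this illustration, and the check only looks for an uncommented deb line that mentions debian.datastax.com/enterprise.

```shell
# has_datastax_repo FILE: succeed if FILE contains an uncommented deb line
# for the DataStax Enterprise repository.
has_datastax_repo() {
  grep -Eq '^[[:space:]]*deb[[:space:]].*debian\.datastax\.com/enterprise' "$1" 2>/dev/null
}

# SOURCES_FILE is a hypothetical override; the default is the file named in step 1.
sources_file="${SOURCES_FILE:-/etc/apt/sources.list}"

if has_datastax_repo "$sources_file"; then
  echo "DataStax repository found in $sources_file"
else
  echo "DataStax repository missing from $sources_file"
fi
```

If the entry is missing, add the deb line from step 1 with your own credentials, then run apt-get update again.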