DataStax Enterprise 3.0 Documentation

Upgrading Analytics/Hadoop nodes

This documentation corresponds to an earlier product version. Make sure this document corresponds to your version.

To upgrade DataStax Enterprise 1.x.x - 2.2.x to 3.0.x, perform these upgrade steps on each node in the cluster, completing all steps on one node before starting to upgrade the next. If the cluster is a mixed-workload cluster, upgrade in the order described in Order of upgrading nodes. Before upgrading, restart each node as a real-time Cassandra node, as described in this procedure. Restarting the nodes in this mode prevents unwanted schema changes from occurring when you start the upgraded node.

Tarball release

  1. Stop the first node to be upgraded and restart it in real-time Cassandra mode:

    dse cassandra
    
  2. Create a directory for the new installation, download the tarball, and move it to that directory.

  3. Unpack the DataStax Enterprise 3.0.x tarball in the new install location.

    tar -xzvf <dse-3.0.x tarball name>
    
  4. If you customized the location of the data in the old installation, create a symbolic link to the old data directory:

    cd <new install location>
    ln -s <old data directory> <new install location>/<new data directory>
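
The symbolic link in step 4 can be guarded so that a mistyped path does not create a dangling or misplaced link. The following is a minimal sketch, not part of the DataStax Enterprise tooling: the function name link_old_data and its three arguments are hypothetical, and a POSIX shell is assumed.

```shell
# Hypothetical helper (not shipped with DataStax Enterprise).
# Usage: link_old_data <old data directory> <new install location> <new data directory>
link_old_data() {
    old_data=$1
    new_install=$2
    link_name=$3

    # The old data directory must already exist.
    if [ ! -d "$old_data" ]; then
        echo "old data directory not found: $old_data" >&2
        return 1
    fi

    # The new install location must already exist (it is created in step 2).
    if [ ! -d "$new_install" ]; then
        echo "new install location not found: $new_install" >&2
        return 1
    fi

    # Refuse to replace anything already present at the link path,
    # including a dangling symbolic link.
    if [ -e "$new_install/$link_name" ] || [ -L "$new_install/$link_name" ]; then
        echo "already exists: $new_install/$link_name" >&2
        return 1
    fi

    ln -s "$old_data" "$new_install/$link_name"
}
```

A call takes the same three values as the placeholders in step 4, in the order old data directory, new install location, new data directory.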
    

To configure the upgraded node

  1. In the new installation, open the cassandra.yaml for writing. The file is located in:

    <install location>/resources/cassandra/conf
    
  2. In the old installation of Cassandra, open the cassandra.yaml. The file is located in:

    <install location>/conf
    
  3. Diff the new and old cassandra.yaml files.

  4. Merge the differences by hand from the old file into the new one, but do not merge the snitch settings.

    If you are migrating data and set up the symbolic link described in the previous procedure, ensure that you merge the data_file_directories, commitlog_directory, and saved_caches_directory properties correctly.

  5. Configure the snitch setting.

  6. If you customized property files other than cassandra-topology.properties, update those files by hand: merge the settings from the old property files into the new property files instead of overwriting the files. Users who overwrote these property files have reported problems.

    It is ok to overwrite the old with the new cassandra-topology.properties file as instructed in Configuring the snitch setting.

  7. Start each node in real-time Cassandra mode during the rolling restart.

  8. Check for schema disagreements on each node.

  9. After all nodes are upgraded and the schemas agree, restart the nodes in Hadoop mode using a rolling restart.

  10. In DataStax Enterprise 3.0, the ownership of the Hadoop mapred staging directory in the CassandraFS changed. After upgrading, set the owner of /tmp/hadoop-<dseuser>/mapred/staging to the dse user. For example, if you run DataStax Enterprise 3.0.x as root, use the following command on Linux:

    dse hadoop fs -chown root /tmp/hadoop-root/mapred/staging
    
  11. If you created column families using the default SizeTieredCompactionStrategy, continue to the next step. If you created column families that use LeveledCompactionStrategy, scrub the SSTables that store those column families.

  12. Check for schema disagreements again.

  13. If you meet conditions for upgrading SSTables, upgrade SSTables now.
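
The comparison in steps 3 and 4 is easier to review when comments, blank lines, and the snitch setting are filtered out first. The following is a rough sketch under stated assumptions: yaml_diff is a hypothetical helper, not a DataStax tool, and it assumes the snitch is set on a top-level endpoint_snitch line in cassandra.yaml. It only prints differences; merging is still done by hand.

```shell
# Hypothetical helper (not shipped with DataStax Enterprise).
# Usage: yaml_diff <old cassandra.yaml> <new cassandra.yaml>
yaml_diff() {
    old_yaml=$1
    new_yaml=$2

    if [ ! -r "$old_yaml" ] || [ ! -r "$new_yaml" ]; then
        echo "cannot read one of: $old_yaml $new_yaml" >&2
        return 2
    fi

    work=$(mktemp -d) || return 2

    # Keep only active settings: drop comments, blank lines, and the
    # endpoint_snitch line, which is configured separately and not merged.
    for side in old new; do
        if [ "$side" = old ]; then src=$old_yaml; else src=$new_yaml; fi
        grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' -e '^endpoint_snitch:' \
            "$src" > "$work/$side" || true
    done

    # diff exits with 1 when the files differ, which is the expected case.
    status=0
    (cd "$work" && diff -u old new) || status=$?
    rm -rf "$work"
    return "$status"
}
```

In the unified diff output, lines prefixed with - come from the old file and lines prefixed with + come from the new one. An exit status of 0 means the active settings are identical.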

Packaged release

  1. Stop the dse service, and then disable Hadoop by setting this option in /etc/default/dse: HADOOP_ENABLED=0

  2. Restart the dse service.

  3. Run the Yum (CentOS/RHEL/Oracle Linux) or Aptitude (Debian/Ubuntu) update commands.

  4. Run the install commands shown in Installing the DataStax Enterprise package on Debian and Ubuntu or Installing the DataStax Enterprise package on RHEL-based distributions.

  5. Start the first node.

  6. Configure the node. Open the old cassandra.yaml, and then open the new cassandra.yaml, which is located in:

    Debian/Ubuntu: /etc/dse/cassandra

    RHEL-based: /etc/dse/cassandra/cassandra.yaml

    Diff the new and old cassandra.yaml files. Merge the differences by hand from the old file into the new one, but do not merge the snitch setting.

  7. Configure the snitch setting.

  8. If you customized property files other than cassandra-topology.properties, update those files by hand: merge the settings from the old property files into the new property files instead of overwriting the files. Users who overwrote these property files have reported problems.

    It is ok to overwrite the old with the new cassandra-topology.properties file as instructed in Configuring the snitch setting.

  9. Start up each node in real-time Cassandra mode during the rolling restart (Hadoop mode disabled).

  10. Check for schema disagreements.

  11. After all nodes are upgraded and the schemas agree, reset the mode from real-time Cassandra to Hadoop.
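
The HADOOP_ENABLED option from step 1 can be changed with a text editor. For scripted rolling upgrades, the edit might look like the following sketch. The function set_hadoop_enabled is hypothetical, not a DataStax tool; it takes the file path as an argument so that it can be tried on a copy before it is pointed at /etc/default/dse.

```shell
# Hypothetical helper (not shipped with DataStax Enterprise).
# Usage: set_hadoop_enabled <defaults file> <0|1>
set_hadoop_enabled() {
    file=$1
    value=$2

    case $value in
        0|1) ;;
        *) echo "value must be 0 or 1" >&2; return 2 ;;
    esac

    tmp=$(mktemp) || return 1
    if grep -q '^HADOOP_ENABLED=' "$file"; then
        # Rewrite the existing setting and leave every other line alone.
        sed "s/^HADOOP_ENABLED=.*/HADOOP_ENABLED=$value/" "$file" > "$tmp"
    else
        # No setting present yet: append one.
        cat "$file" > "$tmp"
        echo "HADOOP_ENABLED=$value" >> "$tmp"
    fi

    # Write back in place so the file keeps its owner and permissions.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

The same helper can set the value back to 1 when the mode is reset to Hadoop. The dse service still has to be stopped before the change and started afterward, as the steps describe.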

To reset the mode to Hadoop

  1. Stop the dse service, and then enable Hadoop by setting this option in /etc/default/dse: HADOOP_ENABLED=1

  2. Start the dse service using a rolling restart.

  3. In DataStax Enterprise 3.0, the ownership of the Hadoop mapred staging directory in the CassandraFS changed. After upgrading, set the owner of /tmp/hadoop-<dseuser>/mapred/staging to the dse user. For example, if you run DataStax Enterprise 3.0.x as root, use the following command on Linux:

    dse hadoop fs -chown root /tmp/hadoop-root/mapred/staging
    
  4. Repeat the previous steps for each node. Monitor the log files for any issues.

  5. If you created column families using the default SizeTieredCompactionStrategy, continue to the next step. If you created column families that use LeveledCompactionStrategy, scrub the SSTables that store those column families.

  6. Check for schema disagreements.

  7. If you meet conditions for upgrading SSTables, upgrade SSTables now.
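
The ownership command in step 3 follows a fixed pattern in which the user name appears twice: once as the new owner and once inside the path. The sketch below builds that command line for an arbitrary user and prints it instead of running it, so it can be reviewed first. The function name staging_chown_cmd is hypothetical, and the sketch assumes that the owner and the <dseuser> path component are always the same name, as in the root example.

```shell
# Hypothetical helper (not shipped with DataStax Enterprise).
# Usage: staging_chown_cmd <dse user>
# Prints the chown command for that user's mapred staging directory.
staging_chown_cmd() {
    dse_user=$1

    if [ -z "$dse_user" ]; then
        echo "usage: staging_chown_cmd <dse user>" >&2
        return 2
    fi

    printf 'dse hadoop fs -chown %s /tmp/hadoop-%s/mapred/staging\n' \
        "$dse_user" "$dse_user"
}
```

Calling staging_chown_cmd root prints the same command shown in step 3.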