DataStax Enterprise 2.2 Documentation

Release Notes

DataStax Enterprise 2.2.3

This release includes features that improve DSE Search/Solr operations.

  • Two features for performing an anti-entropy node repair on a subrange of data instead of all the data in a keyspace.

    • A new dsetool command, list_subranges, estimates subranges of data in a keyspace based on a specified number of rows.
    • New nodetool repair options, start token (-st) and end token (-et), designate the subrange of data to repair.

    Using these commands, DSE Search now performs a partial re-index instead of a full re-index of Solr data after an anti-entropy repair.

  • A new nodetool error code. When a replica node is dead and repair cannot proceed, nodetool sends an error status code to standard output.

New dsetool list_subranges command

The new dsetool command syntax for listing subranges of data in a keyspace is:

dsetool list_subranges <keyspace> <table> <rows per subrange> <start token> <end token>

<rows per subrange> is the approximate number of rows per subrange.

<start token> is the start range of the node.

<end token> is the end range of the node.

Note

You run nodetool repair on a single node using the output of list_subranges. The output must be tokens used on that node.

Example

dsetool list_subranges Keyspace1 Standard1 10000 113427455640312821154458202477256070485 0

The dsetool utility is located in <install_location>/bin on Linux platforms.

Output

The output lists the subranges to use as input to the nodetool repair command. For example:

Start Token                             End Token                               Estimated Size
------------------------------------------------------------------------------------------------
113427455640312821154458202477256070485 132425442795624521227151664615147681247 11264
132425442795624521227151664615147681247 151409576048389227347257997936583470460 11136
151409576048389227347257997936583470460 0                                       11264

New nodetool repair command options

The start token (-st) and end token (-et) options specify the portion of the node needing repair. You get the values for the start and end tokens from the output of the dsetool list_subranges command. The new nodetool repair syntax for using these options is:

nodetool repair <keyspace> <table> -st <start_token> -et <end_token>

Example

nodetool repair Keyspace1 Standard1 -st 113427455640312821154458202477256070485 -et 132425442795624521227151664615147681247
nodetool repair Keyspace1 Standard1 -st 132425442795624521227151664615147681247 -et 151409576048389227347257997936583470460
nodetool repair Keyspace1 Standard1 -st 151409576048389227347257997936583470460 -et 0

Each of these commands begins an anti-entropy node repair from its start token to its end token.
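
The two steps above, listing subranges and then repairing each one, can be chained in a small script. The following Python sketch is illustrative only and is not part of DataStax Enterprise: the helper names parse_subranges and build_repair_commands are hypothetical, and the parser assumes the output layout matches the example shown earlier (a header row, a dashed rule, then three whitespace-separated numeric columns).

```python
def parse_subranges(output):
    """Extract (start_token, end_token) pairs from list_subranges output.

    Assumes the layout from the example: a header row, a dashed
    separator, then rows of start token, end token, estimated size.
    """
    subranges = []
    for line in output.splitlines():
        fields = line.split()
        # Data rows have exactly three numeric fields; skip everything else.
        if len(fields) == 3 and all(f.isdigit() for f in fields):
            subranges.append((fields[0], fields[1]))
    return subranges


def build_repair_commands(keyspace, table, subranges):
    """Build one nodetool repair argument list per subrange."""
    return [
        ["nodetool", "repair", keyspace, table, "-st", start, "-et", end]
        for start, end in subranges
    ]


# Sample text copied from the Output example in these notes.
SAMPLE_OUTPUT = """\
Start Token                             End Token                               Estimated Size
------------------------------------------------------------------------------------------------
113427455640312821154458202477256070485 132425442795624521227151664615147681247 11264
132425442795624521227151664615147681247 151409576048389227347257997936583470460 11136
151409576048389227347257997936583470460 0                                       11264
"""

if __name__ == "__main__":
    subranges = parse_subranges(SAMPLE_OUTPUT)
    for cmd in build_repair_commands("Keyspace1", "Standard1", subranges):
        print(" ".join(cmd))
```

The sketch only builds the command lines. Running them one at a time on the node that owns the tokens, and stopping when a command reports an error, is left to the operator.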

Resolved issue

The following issue has been resolved:

  • The SliceFromReadCommand assertion has been removed: assert maxLiveColumns <= count; (CASSANDRA-5284).

Unresolved issues

Issues listed for DataStax Enterprise 2.2 through DataStax Enterprise 2.2.2 that are not listed as resolved have not yet been fixed.

DataStax Enterprise 2.2.2

This release contains the following changes:

Issues

The Cassandra log4j appender doesn't support multiple hosts. (DSP-1601)

DataStax Enterprise 2.2.1

This release contains the following changes:

  • Updates Cassandra to Cassandra 1.1.6. See 1.1.6 CHANGES.txt.
  • Fixes issues, including the following noteworthy ones:
    • cassandra.yaml and cassandra-env.sh - Corrects Issue 1 in 2.2 Issues. DSP-1053
    • Oracle JDK/JRE 6 updates 34-35 can now be used. - Corrects Issue 4 in 2.2 Issues.
    • Cassandra CLI error - Cannot update keyspace without first issuing use <keyspace>;. DSP-833
    • Hive error - Hive returns incorrect results (missing columns). DSP-1076
    • Hinted handoff fails - Hinted handoff fails to deliver hints. CASSANDRA-4772
  • Includes two patches to Cassandra 1.1.6. These patches are:

Issues

This release has the following issues:

  • Issue 1: In this release, an exception occurs under either of these conditions:

    • Dropping a Solr keyspace and then recreating it. DSP-1126
    • Updating the schema.xml or solrconfig.xml. (The schema.xml exception is a known issue from all previous releases.) DSP-655 and DOC-62

    The workaround is to perform a rolling restart (restart each node, one at a time) before adding any data to the database.

    Note

    Adding data before performing the workaround can cause unpredictable problems.

    After you drop a Solr keyspace or column family, Solr-specific residual data remains in memory until you perform the workaround. For example, if you drop a Solr keyspace on node 1 and search for the data on node 2, Solr returns residual data. To completely remove the residual data, you need to perform the workaround to restart all Solr nodes.

  • Issue 2: DataStax Enterprise is designed for a multiple data center environment and not intended for use with the SimpleStrategy replication placement strategy. SimpleStrategy is not data center-aware. DataStax Enterprise does not work correctly using SimpleStrategy. Use NetworkTopologyStrategy. DSP-1195

  • Issue 3: The nodetool repair command does not completely repair a keyspace unless the keyspace is in every datacenter. CASSANDRA-5424

DataStax Enterprise 2.2

  • Apache Cassandra 1.1.5
  • Apache Hadoop 1.0.2
  • Apache Hive 0.9.0
  • Apache Pig 0.9.2
  • Apache Solr 4.0
  • Apache Thrift 0.6.1
  • Apache log4j 1.2.16
  • Apache Sqoop 1.4.2
  • Apache Mahout 0.6

What's New

DataStax Enterprise 2.2 has been enhanced in the following ways:

  • Production certified Cassandra – DataStax Enterprise contains a fully tested, benchmarked, and certified version of Apache Cassandra that is suitable for mission-critical production deployments.
  • Updates Cassandra 1.0 to Cassandra 1.1.5 - In Cassandra 1.1, key improvements have been made in the areas of CQL, performance, and management ease of use.
  • Support for Installation on the HP Cloud - In addition to Amazon Elastic Compute Cloud, DataStax now supports installation of DataStax Enterprise in the HP Cloud environment. You can install DataStax Enterprise on Ubuntu 11.04 Natty Narwhal and Ubuntu 11.10 Oneiric Ocelot.
  • Support for SUSE Enterprise Linux - DataStax Enterprise adds SUSE Enterprise Linux 11.4 and 11.2 to its list of supported platforms.
  • Improved Solr shard selection algorithm - Previously, for each queried token range, Cassandra selected the first closest node to the node issuing the query within that range. Equally distant nodes were always tried in the same order, which created hotspots on one or more nodes and often selected more shards than needed. The improved algorithm uses a shuffling technique to balance the load, and also attempts to minimize the number of shards queried as well as the amount of data transferred from non-local nodes.
  • Capability to Set Solr Column Expiration - You can update a DSE Search column to set a column expiration date using CQL, which eventually causes removal of the column from the database.
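
The shard selection improvement listed above can be pictured with a toy model. The following Python sketch is not the DataStax Enterprise implementation; it only assumes one plausible reading of the description: for each token range, consider the replicas closest to the querying node, prefer a replica already selected for another range (fewer shards), and otherwise break ties at random (load balancing). The function name select_shards, the distance map, and the node labels are all hypothetical.

```python
import random


def select_shards(replicas_by_range, distance, rng=None):
    """Pick one replica per token range (illustrative model only).

    replicas_by_range: dict of token range label -> list of replica nodes.
    distance: dict of node -> distance from the querying node (lower is closer).
    rng: optional random.Random instance, useful for repeatable runs.
    """
    if rng is None:
        rng = random.Random()
    chosen = set()
    selection = {}
    for token_range, replicas in replicas_by_range.items():
        # Keep only the replicas at the minimum distance for this range.
        closest = min(distance[node] for node in replicas)
        nearest = [node for node in replicas if distance[node] == closest]
        # Reusing a node already picked keeps the number of shards low.
        already_used = [node for node in nearest if node in chosen]
        # A random pick among ties avoids always hitting the same node.
        node = rng.choice(already_used or nearest)
        chosen.add(node)
        selection[token_range] = node
    return selection


if __name__ == "__main__":
    replicas = {"range1": ["a", "b"], "range2": ["a", "b"], "range3": ["b", "c"]}
    distance = {"a": 0, "b": 0, "c": 1}
    print(select_shards(replicas, distance))
```

In this model, repeated queries spread the first pick across equally distant nodes, while ranges that share a replica tend to be served by the same shard.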

Issues

This release has the following issues:

  • Issue 1: The cassandra.yaml file in DataStax Enterprise 2.2 is incomplete. Download the correct cassandra.yaml file for DataStax Enterprise 2.2 from:

    Use this file to overwrite the existing cassandra.yaml file in the following location:

    • Binary Tarball Install

      <install location>/resources/cassandra/conf

    • Packaged Install

      <install location>/etc/dse/cassandra

    The next release will correct this issue. DSP-1053

  • Issue 2: You might experience a problem upgrading to DataStax Enterprise 2.2. You will not lose data if you experience the problem. The workaround is to save the keyspaces in your old installation before upgrading, and then validate the upgrade to ensure that the keyspaces were migrated. A patch will be issued to resolve this issue. CASSANDRA-4698

  • Issue 3: This release has no native support for Cassandra composite columns when using Hive, Pig, Solr, Mahout, or Sqoop components. When using these components, the columns are transposed in CQL 3 query results. It is the user's responsibility to create a user-defined function (UDF) to display the tables correctly.

  • Issue 4: DataStax recommends that you use the latest version of Oracle JDK/JRE 6, but not Oracle JDK/JRE 6 updates 34-35, updates prior to 30, or JDK/JRE 7.

  • Issue 5: Sometimes, under a heavy write load, Cassandra fails with an assertion error that looks something like this:

    java.lang.AssertionError: DecoratedKey(xxx, yyy)
      != DecoratedKey(zzz, kkk) . . .
    

    The workaround is to disable caching using CQL.

  • Issue 6: If a node has hints for several nodes, that node delivers hints only for the first one of them. CASSANDRA-4772

  • Issue 7: MapReduce jobs hang before completing or finishing cleanup with older versions of Hadoop (MAPREDUCE-4560, MAPREDUCE-4299). The workaround is to remove the mapred.reduce.slowstart.completed.maps parameter and restart. DSP-1154

  • Issue 8: The nodetool repair -pr command does not completely repair a keyspace unless the keyspace is in every datacenter. CASSANDRA-5424