When you install a node within a cluster, you mark it as either real-time (Cassandra), analytics (Hadoop), or search (Solr). To meet the requirements of changing workloads, DataStax Enterprise allows you to re-provision your existing Cassandra and Hadoop nodes at will and change the overall dynamic and capacity of your clusters.
For example, suppose an online Web application's daily operations dictate that a cluster's nodes are allocated as follows:
To meet the analytics requirements of various marketing programs, more than two nodes are needed to perform the required analysis. To meet this need, you can re-provision two of the Cassandra nodes as Hadoop nodes during low-traffic hours so that the cluster looks like this:
Then after the Hadoop tasks are completed, you can return the cluster to its daily configuration.
The first step for enabling workload re-provisioning is to change the delegated snitch, which is designated by the DseDelegateSnitch (located in the dse.yaml file). A snitch is a configurable component of a Cassandra cluster that defines how the nodes are grouped together within the overall network topology.
By default, the delegated snitch is the DseSimpleSnitch (org.apache.cassandra.locator.DseSimpleSnitch). In addition to the DseSimpleSnitch, the DseDelegateSnitch can delegate to the standard Cassandra snitches. For more information, see About Snitches in the DataStax Cassandra documentation.
The second step for packaged installations, such as RHEL and Debian, is to set the node's role in a configuration file and then restart the node.
The second step for tarball installations is to stop the node and restart it in the desired role by setting an option.
For more information about installing, see Installing a Multiple Node Cluster.
Note
Solr nodes cannot be re-provisioned.
The DseDelegateSnitch sets which snitch is used for re-provisioning. You need to set the snitch only once. All nodes in a cluster must use the same snitch.
This section provides an example of delegating the RackInferringSnitch to enable workload re-provisioning. The RackInferringSnitch infers the topology of the network by analyzing the node IP addresses.
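To make the inference concrete, the following is a minimal sketch of the octet convention that the RackInferringSnitch is generally documented to use: the second octet of a node's IP address is treated as the data center and the third octet as the rack. The function name infer_topology and the sample address are hypothetical and are used only for illustration; the snitch itself does this work inside Cassandra.

```shell
#!/bin/sh
# infer_topology: hypothetical helper that mimics the octet convention
# (2nd octet = data center, 3rd octet = rack). It is not part of DSE.
infer_topology() {
    ip=$1
    dc=$(echo "$ip" | cut -d. -f2)     # second octet
    rack=$(echo "$ip" | cut -d. -f3)   # third octet
    echo "dc=$dc rack=$rack"
}
```

For example, `infer_topology 110.100.200.105` prints `dc=100 rack=200`, so nodes that share the second octet would be grouped into the same data center.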
To delegate a snitch:
Open the dse.yaml file.
Set the delegated snitch and save the file:
delegated_snitch: org.apache.cassandra.locator.RackInferringSnitch
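Because every node in the cluster needs the same setting, it may help to script the change. The following is a minimal sketch, assuming that dse.yaml already contains a delegated_snitch line at the start of a line; set_delegated_snitch is a hypothetical helper, not a DataStax tool, and the path to dse.yaml depends on your installation.

```shell
#!/bin/sh
# set_delegated_snitch: hypothetical helper that rewrites the
# delegated_snitch line of a dse.yaml file. All other lines are left
# untouched. If no delegated_snitch line exists, the file is unchanged.
set_delegated_snitch() {
    yaml_file=$1
    snitch_class=$2
    tmp_file="${yaml_file}.tmp"
    # Write to a temporary file, then copy back so that the original
    # file keeps its ownership and permissions.
    sed "s|^delegated_snitch:.*|delegated_snitch: ${snitch_class}|" "$yaml_file" > "$tmp_file" &&
        cat "$tmp_file" > "$yaml_file" &&
        rm -f "$tmp_file"
}
```

After running the helper on each node, `grep '^delegated_snitch:' dse.yaml` shows the value that is in effect, which makes it easy to confirm that all nodes match.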
Packaged installations provide startup scripts in /etc/init.d.
Edit the /etc/default/dse file to set the node's role:
Restart the node:
$ sudo service dse restart
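The role change for packaged installations can be sketched as a small helper. This sketch assumes that the role is controlled by a HADOOP_ENABLED line in /etc/default/dse (1 for a Hadoop node, 0 for a Cassandra node); verify the setting name against your installed file before relying on it. The function set_hadoop_enabled is hypothetical.

```shell
#!/bin/sh
# set_hadoop_enabled: hypothetical helper that sets HADOOP_ENABLED in a
# defaults file. Assumption: HADOOP_ENABLED=1 marks the node as Hadoop
# (analytics) and HADOOP_ENABLED=0 marks it as Cassandra (real-time).
set_hadoop_enabled() {
    defaults_file=$1
    value=$2
    case $value in
        0|1) ;;
        *) echo "value must be 0 or 1" >&2; return 1 ;;
    esac
    tmp_file="${defaults_file}.tmp"
    # Copy back with cat so the file keeps its ownership and permissions.
    sed "s|^HADOOP_ENABLED=.*|HADOOP_ENABLED=${value}|" "$defaults_file" > "$tmp_file" &&
        cat "$tmp_file" > "$defaults_file" &&
        rm -f "$tmp_file"
}
```

The helper must run with permission to write the file (for example, as root), and the node must still be restarted with `sudo service dse restart` for the new role to take effect.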
Use these instructions for Mac and other tarball installations:
To stop a node, find the Cassandra or DSE Java process ID (PID) and kill the process using the PID. For example:
$ ps auwx | grep cassandra
$ kill <pid>
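Note that the grep command itself usually appears in the process listing. The following is a minimal sketch of a filter that prints only the PIDs of matching processes and skips the grep line; it assumes the PID is the second column, as in `ps auwx` output. The function name cassandra_pids is hypothetical.

```shell
#!/bin/sh
# cassandra_pids: hypothetical filter. Reads ps output on stdin and
# prints the PID (second column) of every line that mentions cassandra,
# skipping any line that mentions grep.
cassandra_pids() {
    awk '/cassandra/ && !/grep/ { print $2 }'
}
```

A typical use would be `ps auwx | cassandra_pids`, followed by `kill <pid>` for the PID that is reported.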
Start the node:
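As a hedged illustration of restarting in a role, the sketch below maps a role name to the arguments passed to the dse script in the installation's bin directory. It assumes that `bin/dse cassandra` starts a Cassandra node and that the `-t` option starts the node as a Hadoop node; confirm the options against the documentation for your release. The function dse_start_args is hypothetical.

```shell
#!/bin/sh
# dse_start_args: hypothetical helper that prints the arguments for
# bin/dse for a given role. Assumption: "-t" selects the Hadoop role.
# Solr is deliberately not handled, because Solr nodes cannot be
# re-provisioned.
dse_start_args() {
    case $1 in
        cassandra) echo "cassandra" ;;
        hadoop)    echo "cassandra -t" ;;
        *) echo "unsupported role: $1" >&2; return 1 ;;
    esac
}
```

From the installation directory, a node could then be started as a Hadoop node with `bin/dse $(dse_start_args hadoop)`.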
Note
DataStax does not recommend running Hadoop and Solr on the same node in production environments.