To install Cassandra, download and unpack the binary distribution on each machine in the cluster. You can install Cassandra on Windows, Mac OS X, and Linux variants such as Ubuntu, Red Hat, and CentOS. In general, any platform with a recent Sun JVM (1.6 or higher) and a working network interface should support Cassandra.
All Cassandra hosts must meet basic requirements, described on this page, for a Java virtual machine and for required data directories. On Linux platforms, installing JNA is highly recommended but not required.
DataStax provides rpm and dpkg packages for each stable release of Cassandra. The steps to install packaged releases differ from the steps described on this page for binary downloads; see Packaged Releases for more information.
Binary distributions of Cassandra are available in the downloads section of the Cassandra website. Unless you have a specific need for features in a beta version, DataStax recommends the binary distribution of the latest stable version.
The rest of this document will refer to the directory where you unpack the distribution as $CASSANDRA_HOME. The subdirectories of $CASSANDRA_HOME are:
bin: Cassandra executables, including a startup script, the command-line client (CLI), and the nodetool utility for cluster management.
conf: Files for configuring Cassandra, including cassandra.yaml, an SH file containing important environment variables, and properties files for logging, authentication, and network topology.
interface: Contains cassandra.thrift (RPC client API) and Avro.
javadoc: Standard Javadoc API documentation for Cassandra.
lib: Contains external JARs and license documents.
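The unpack step and the directory layout above can be sketched in shell. This is a minimal sketch, assuming the download is a gzip-compressed tarball with a single top-level directory; the function names `unpack_cassandra` and `check_layout` are hypothetical helpers, not part of the Cassandra distribution.

```shell
# Unpack a Cassandra binary tarball into a destination directory and print
# the resulting top-level directory (the value to use for CASSANDRA_HOME).
# Assumption: the archive is a gzip-compressed tarball with one top-level
# directory. Both function names are hypothetical.
unpack_cassandra() {
    tarball=$1
    dest=$2
    # The first archive entry is the top-level directory; strip everything
    # from its first slash onward to get the directory name.
    top=$(tar -tzf "$tarball" | sed -n '1s|/.*||p')
    tar -xzf "$tarball" -C "$dest" || return 1
    echo "$dest/$top"
}

# Report any of the expected subdirectories that are missing under $1.
# Returns non-zero if at least one is missing.
check_layout() {
    missing=0
    for sub in bin conf interface javadoc lib; do
        [ -d "$1/$sub" ] || { echo "missing: $1/$sub"; missing=1; }
    done
    return "$missing"
}
```

For example, `CASSANDRA_HOME=$(unpack_cassandra /path/to/archive.tar.gz /opt)` followed by `check_layout "$CASSANDRA_HOME"`, where the archive path is a placeholder for the file you downloaded.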
A JVM is required to run Cassandra. DataStax recommends using the most recently released version of the Sun JVM. Versions earlier than 1.6.0_19 are specifically not recommended.
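To check a host against the 1.6.0_19 minimum, something like the following sketch may help. It assumes `java -version` prints a line containing `version "x.y.z_nn"` (it writes to standard error, hence the redirect in the usage line); the function names are hypothetical.

```shell
# Extract the quoted version string from `java -version` output on stdin.
parse_java_version() {
    sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1
}

# Return 0 if the version string $1 (for example 1.6.0_24) is at least
# 1.6.0_19, and non-zero otherwise.
java_version_ok() {
    # Turn every non-digit into a space so "1.6.0_24" becomes "1 6 0 24",
    # then load the fields into the positional parameters.
    set -- $(printf '%s\n' "$1" | tr -c '0-9\n' ' ')
    major=${1:-0}; minor=${2:-0}; patch=${3:-0}; update=${4:-0}
    [ "$major" -gt 1 ] && return 0
    [ "$major" -lt 1 ] && return 1
    [ "$minor" -gt 6 ] && return 0
    [ "$minor" -lt 6 ] && return 1
    [ "$patch" -gt 0 ] && return 0
    [ "$update" -ge 19 ]
}
```

Usage: `java_version_ok "$(java -version 2>&1 | parse_java_version)" || echo "JVM older than 1.6.0_19"`.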
By default, Cassandra uses the following directories for data and commitlog storage:
/var/lib/cassandra
/var/log/cassandra
Make sure that both of these directories exist and are writable by Cassandra, either by changing their ownership or their permissions. On Linux, this can be done as follows, where $USER and $GROUP are the user and group that will run Cassandra:
sudo mkdir /var/lib/cassandra
sudo mkdir /var/log/cassandra
sudo chown -R $USER:$GROUP /var/lib/cassandra
sudo chown -R $USER:$GROUP /var/log/cassandra
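After creating the directories, it can be useful to confirm that they are usable before starting Cassandra. The following is a small sketch of such a check; `check_cassandra_dirs` is a hypothetical helper and should be run as the user that will run Cassandra.

```shell
# Print a status line for each directory given as an argument.
# Returns non-zero if any directory is missing or not writable by the
# current user.
check_cassandra_dirs() {
    status=0
    for dir in "$@"; do
        if [ -d "$dir" ] && [ -w "$dir" ]; then
            echo "ok: $dir"
        else
            echo "missing or not writable: $dir"
            status=1
        fi
    done
    return "$status"
}
```

Usage: `check_cassandra_dirs /var/lib/cassandra /var/log/cassandra`.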
Before starting Cassandra, it is recommended that you set the first node’s initial token value to zero. This simplifies load balancing as you later expand the cluster. To set the initial token value, edit $CASSANDRA_HOME/conf/cassandra.yaml and set 0 as the value of the initial_token parameter. If this parameter is unset (the default), Cassandra picks a token randomly.
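The edit can also be scripted. The sketch below assumes that cassandra.yaml contains an uncommented top-level `initial_token:` line (if none is found, one is appended); `set_initial_token` is a hypothetical helper name.

```shell
# Set initial_token in the given cassandra.yaml to the given value.
# Assumption: the parameter appears as an uncommented line starting with
# "initial_token:"; if it does not, a new line is appended instead.
set_initial_token() {
    yaml=$1
    token=$2
    tmp="$yaml.tmp.$$"
    if grep -q '^initial_token:' "$yaml"; then
        sed "s/^initial_token:.*/initial_token: $token/" "$yaml" > "$tmp"
    else
        { cat "$yaml"; echo "initial_token: $token"; } > "$tmp"
    fi
    # Write back through the original file so its ownership and
    # permissions are preserved, then remove the temporary copy.
    cat "$tmp" > "$yaml" && rm -f "$tmp"
}
```

Usage: `set_initial_token "$CASSANDRA_HOME/conf/cassandra.yaml" 0`.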
For multi-node clusters, initial tokens should be calculated and specified for each node. This procedure is described in Adding Nodes to a Cluster.
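As an illustration only, a commonly used way to space tokens evenly under RandomPartitioner (whose token range runs from 0 to 2^127) is shown below for a cluster of N nodes, with a worked example for N = 4. This is a sketch of the arithmetic, not a substitute for the procedure in Adding Nodes to a Cluster.

```latex
\[
\text{token}_i = i \cdot \frac{2^{127}}{N}, \qquad i = 0, 1, \dots, N-1
\]
\[
\begin{aligned}
N = 4:\quad
\text{token}_0 &= 0 \\
\text{token}_1 &= 2^{125} = 42535295865117307932921825928971026432 \\
\text{token}_2 &= 2^{126} = 85070591730234615865843651857942052864 \\
\text{token}_3 &= 3 \cdot 2^{125} = 127605887595351923798765477786913079296
\end{aligned}
\]
```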
Another important factor for load balancing in Cassandra is the selection of the partitioner. The default, RandomPartitioner, is usually the best option for balancing a typical cluster for testing and evaluation.
Installing JNA (Java Native Access) on Linux platforms can improve Cassandra’s memory usage. With JNA installed and configured as described in this section, Linux does not swap out the JVM, and thus avoids related performance issues.
To install JNA with Cassandra:
$USER soft memlock unlimited
$USER hard memlock unlimited
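The two memlock entries use the format of the PAM limits file, which on most Linux systems is /etc/security/limits.conf; that location is an assumption here, since this page does not name the file. The following sketch adds the entries only if they are not already present; `add_memlock_limits` is a hypothetical helper.

```shell
# Append soft and hard "memlock unlimited" entries for a user to a limits
# file, skipping any entry that is already present.
# Assumption: the target is the PAM limits file, typically
# /etc/security/limits.conf; writing to it requires root privileges.
add_memlock_limits() {
    user=$1
    limits_file=$2
    for kind in soft hard; do
        entry="$user $kind memlock unlimited"
        # -x matches whole lines only; -F treats the entry as a literal.
        grep -qxF "$entry" "$limits_file" 2>/dev/null ||
            echo "$entry" >> "$limits_file"
    done
}
```

Usage, as root: `add_memlock_limits cassandra /etc/security/limits.conf`, where `cassandra` stands for the user that runs Cassandra. After that user logs in again, `ulimit -l` should report `unlimited`.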
Start Cassandra using the startup script in $CASSANDRA_HOME/bin.
cd $CASSANDRA_HOME
sh bin/cassandra -f
You can verify connectivity to your Cassandra instance with the nodetool command-line utility in $CASSANDRA_HOME/bin:
~$ nodetool -h localhost -p 8080 ring
Address       Status  State   Load        Owns   Range                                    Ring
127.0.0.1     Up      Normal  495 bytes   100%   95315431979199388464207182617231204396   |<--|
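For scripted checks, the ring output can be filtered rather than read by eye. This sketch assumes the column layout shown in the sample output above, with the node status in the second column; `count_up_nodes` is a hypothetical helper.

```shell
# Read `nodetool ring` output on stdin and print the number of nodes whose
# Status column (second field) is "Up".
# Assumption: the output uses the column order shown in the sample above.
count_up_nodes() {
    awk '$2 == "Up" { n++ } END { print n + 0 }'
}
```

Usage: `nodetool -h localhost -p 8080 ring | count_up_nodes`. On a healthy single-node installation this should print 1.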
The Cassandra instance uses the following ports:
| Port | Description | Configured in |
|------|-------------|---------------|
| 9160 | Client traffic via the Thrift protocol | cassandra.yaml |
| 7000 | Cluster traffic via gossip | cassandra.yaml |
| 8080 | Port for monitoring attributes via JMX | cassandra.in.sh |
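To confirm which ports a node is actually configured to use, the values can be read back from cassandra.yaml. The sketch below assumes the Thrift and gossip ports are stored under the keys `rpc_port` and `storage_port`; those key names are not given on this page, so check them against your own file. `yaml_port` is a hypothetical helper.

```shell
# Print the numeric value of a top-level "key: value" setting in a YAML
# file. $1 is the key name and $2 is the path to the file.
# Assumption: the setting is on a single uncommented line.
yaml_port() {
    sed -n "s/^$1:[[:space:]]*\([0-9][0-9]*\).*/\1/p" "$2" | head -n 1
}
```

Usage: `yaml_port rpc_port "$CASSANDRA_HOME/conf/cassandra.yaml"` and `yaml_port storage_port "$CASSANDRA_HOME/conf/cassandra.yaml"`.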
After Cassandra is installed successfully on a single node, you can continue to the steps for adding nodes to a Cassandra cluster.