A distributed system like Cassandra is designed from the ground up to run on a cluster (or clusters) of nodes. For testing and evaluation, however, it is easiest to start with a single node.
The most recent stable version can be found in the Downloads section of the Cassandra website. Because Cassandra is written in Java, a recent JVM is required to run it. (Java 1.6.0_22 works well, although any version after 1.6.0_19 should be fine.)
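As a quick pre-flight check, the version requirement above can be sketched as a small comparison. This is only an illustration: the names `BASELINE`, `parse_java_version`, and `is_supported` are hypothetical, the code assumes the legacy `1.x.y_zz` version format, and it reads "after 1.6.0_19" literally as strictly newer.

```python
import re

# 1.6.0_19, the version the text says to be newer than.
BASELINE = (1, 6, 0, 19)

def parse_java_version(text):
    """Extract (major, minor, patch, update) from a string such as
    'java version "1.6.0_22"'. Returns None if no version is found."""
    match = re.search(r'(\d+)\.(\d+)\.(\d+)(?:_(\d+))?', text)
    if match is None:
        return None
    # A missing update number (for example plain "1.6.0") counts as update 0.
    return tuple(int(part) if part else 0 for part in match.groups())

def is_supported(text, baseline=BASELINE):
    """True if the version found in text is strictly newer than baseline."""
    version = parse_java_version(text)
    return version is not None and version > baseline
```

The string to check can be taken from the output of `java -version`, which the JVM prints to standard error rather than standard output.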
By default, Cassandra uses the following directories for data, commit log, and log file storage:
/var/lib/cassandra
/var/log/cassandra
Make sure that both of these directories exist and are writable by Cassandra, by changing either their ownership or their permissions. On Linux, this can be done as follows:
sudo mkdir /var/lib/cassandra
sudo mkdir /var/log/cassandra
sudo chown -R $USER:$USER /var/lib/cassandra
sudo chown -R $USER:$USER /var/log/cassandra
This assumes the current user is the same one that will run Cassandra.
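To confirm the directory setup before starting Cassandra, a short check along these lines may help. It is a minimal sketch: `check_writable_dirs` is a hypothetical helper, and it only tests the user running the script, which is assumed to be the user that will run Cassandra.

```python
import os

def check_writable_dirs(paths):
    """Return a dict mapping each path to 'ok', 'missing', or 'not writable'."""
    results = {}
    for path in paths:
        if not os.path.isdir(path):
            results[path] = "missing"
        # Creating files inside a directory needs both write and execute access.
        elif not os.access(path, os.W_OK | os.X_OK):
            results[path] = "not writable"
        else:
            results[path] = "ok"
    return results

if __name__ == "__main__":
    # The two default directories named in the text.
    defaults = ["/var/lib/cassandra", "/var/log/cassandra"]
    for path, status in check_writable_dirs(defaults).items():
        print(f"{path}: {status}")
```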
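Once you have extracted the downloaded archive, starting Cassandra for the first time is simple. Here $CASSANDRA_HOME stands for the directory the archive was extracted to, and the -f flag keeps Cassandra in the foreground so that its log output appears on the console: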
cd $CASSANDRA_HOME
sh bin/cassandra -f
Cassandra should now be running and, by default, listening on the following ports:
|Port|Description|Configured in|
|9160|Client traffic via the Thrift protocol|storage-conf.xml|
|7000|Cluster traffic via gossip|storage-conf.xml|
|8080|Port for monitoring attributes via JMX|cassandra.in.sh|
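A quick way to confirm that these ports are accepting connections is a plain TCP probe. The sketch below is illustrative only: `port_is_open` and `DEFAULT_PORTS` are hypothetical names, and an open port shows only that something is listening there, not that it is Cassandra.

```python
import socket

# Default ports from the table above.
DEFAULT_PORTS = {9160: "Thrift clients", 7000: "gossip", 8080: "JMX"}

def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0

if __name__ == "__main__":
    for port, purpose in DEFAULT_PORTS.items():
        state = "open" if port_is_open("127.0.0.1", port) else "closed"
        print(f"{port} ({purpose}): {state}")
```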
You can verify connectivity to your Cassandra instance with the nodetool command line utility:
~$ nodetool -h localhost -p 8080 ring
Address       Status     Load          Range                                      Ring
127.0.0.1     Up         495 bytes     95315431979199388464207182617231204396     |<--|
Working with a single Cassandra node is a good way to get a feel for the API, but to truly understand the functionality, operations, and performance characteristics, installing and running your own cluster is the best method.
For those with a background in administering large RDBMS installations, the term cluster carries a lot of baggage when it comes to installation and operation. In practice, the same features that give Cassandra its scalability and fault tolerance also make cluster configuration significantly easier. Installing a multi-node cluster is almost as direct as the single-node example above, but requires some minor edits to storage-conf.xml on each node, as described below.
At least one node must act as the seed for other hosts that will join the ring. There is no hard and fast rule about which hosts should be listed as seeds, but all nodes need the same list of seeds. The gossip protocol uses this list to disseminate ring topology. Edit storage-conf.xml on each node and add the first node (10.203.55.186 in this example) as the seed.
<Seeds>
    <Seed>10.203.55.186</Seed>
</Seeds>
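Because every node needs the same seed list, it can be worth comparing the configuration files before starting the cluster. The following sketch assumes the config text of each node has already been read into a string. The helper names and the trimmed-down `<Storage>` documents are illustrative, not complete configuration files.

```python
import xml.etree.ElementTree as ET

def seeds_in(config_xml):
    """Return the sorted list of <Seed> addresses found in a config document."""
    root = ET.fromstring(config_xml)
    return sorted(seed.text.strip() for seed in root.iter("Seed") if seed.text)

def seeds_match(configs):
    """True if every config in the mapping {node_name: xml_text} lists the same seeds."""
    seed_lists = [seeds_in(xml_text) for xml_text in configs.values()]
    return all(seeds == seed_lists[0] for seeds in seed_lists)

# Trimmed-down example documents (assumed for illustration).
NODE_A = "<Storage><Seeds><Seed>10.203.55.186</Seed></Seeds></Storage>"
NODE_B = "<Storage><Seeds><Seed>10.203.55.186</Seed></Seeds></Storage>"

if __name__ == "__main__":
    print(seeds_match({"10.203.55.186": NODE_A, "10.205.2.67": NODE_B}))
```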
The next change involves setting the interfaces on which your nodes will listen for client traffic via Thrift and intra-cluster traffic via gossip. This is accomplished by changing the ThriftAddress and ListenAddress elements to interfaces that are routable from clients and from other servers in the cluster, respectively.
Again, edit storage-conf.xml on both nodes and replace the default localhost entries with the interfaces that will listen for traffic. For the first node:
<ListenAddress>10.203.55.186</ListenAddress>
...
<ThriftAddress>10.203.55.186</ThriftAddress>
and for the second node (10.205.2.67 for this example):
<ListenAddress>10.205.2.67</ListenAddress>
...
<ThriftAddress>10.205.2.67</ThriftAddress>
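A common mistake at this step is leaving one of the two elements on its localhost default, which would make the node unreachable from other hosts. The check below is a rough sketch of how that could be caught. `address_problems` is a hypothetical helper, and it inspects only the first ListenAddress and ThriftAddress elements of a config document passed in as a string.

```python
import xml.etree.ElementTree as ET

# Values that are reachable only from the node itself.
LOOPBACK = {"localhost", "127.0.0.1"}

def address_problems(config_xml):
    """Return a list of messages for address elements still set to loopback."""
    root = ET.fromstring(config_xml)
    problems = []
    for tag in ("ListenAddress", "ThriftAddress"):
        element = next(root.iter(tag), None)
        if element is None or not element.text:
            continue  # nothing set explicitly, so nothing to compare
        value = element.text.strip()
        if value in LOOPBACK:
            problems.append(f"{tag} is still set to {value}")
    return problems

# Trimmed-down example document (assumed for illustration).
EXAMPLE = (
    "<Storage>"
    "<ListenAddress>10.205.2.67</ListenAddress>"
    "<ThriftAddress>localhost</ThriftAddress>"
    "</Storage>"
)

if __name__ == "__main__":
    for problem in address_problems(EXAMPLE):
        print(problem)
```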
Start the seed node and verify connectivity with nodetool ring, as in the single-node example above. Now start the remaining nodes. After a few minutes, during which the nodes pause to exchange data (you can follow the progress on the second node via the system log, located by default at /var/log/cassandra/system.log), running nodetool ring again should produce something like the following (the example here shows two nodes):
~$ nodetool -h localhost -p 8080 ring
Address       Status     Load          Range                                      Ring
                                       95315431979199388464207182617231204396
10.205.2.67   Up         495 bytes     61078635599166706937511052402724559481     |<--|
10.203.55.186 Up         1.24 KB       95315431979199388464207182617231204396     |-->|
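If this check needs to be scripted, the ring output can be reduced to a mapping of node address to status. The parser below is a sketch that assumes the column layout shown in the sample output above (address first, status second). `parse_ring` is a hypothetical helper, and other Cassandra versions may print a different layout.

```python
import ipaddress

def parse_ring(output):
    """Map each node address to its status, based on `nodetool ring` text."""
    nodes = {}
    for line in output.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # blank lines and the line holding only a token
        try:
            ipaddress.ip_address(fields[0])
        except ValueError:
            continue  # header or shell prompt line
        nodes[fields[0]] = fields[1]
    return nodes

# Sample text copied from the two-node output shown above.
SAMPLE = """\
Address       Status     Load          Range                                      Ring
                                       95315431979199388464207182617231204396
10.205.2.67   Up         495 bytes     61078635599166706937511052402724559481     |<--|
10.203.55.186 Up         1.24 KB       95315431979199388464207182617231204396     |-->|
"""

if __name__ == "__main__":
    ring = parse_ring(SAMPLE)
    print(all(status == "Up" for status in ring.values()))
```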
Congratulations, you now have a multi-node Cassandra cluster.