In this scenario, data replication is distributed across a single data center in mixed workload clusters. For example, if the cluster has 3 Hadoop nodes, 3 Cassandra nodes, and 2 Solr nodes, the cluster has 3 data centers: one for each type of node. A multiple data center cluster has more than one data center for each type of node.
Data replicates across the data centers automatically and transparently - no ETL work is necessary to move data between different systems or servers. You can configure the number of copies of the data in each data center and Cassandra handles the rest, replicating the data for you. To configure a multiple data center cluster, see Multiple data center deployment.
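As an illustration of setting the number of copies per data center, the following Python sketch builds a CQL 3 `CREATE KEYSPACE` statement that uses `NetworkTopologyStrategy`. The helper name `create_keyspace_cql`, the keyspace name, and the data center names are assumptions made for the example. Use the data center names your snitch actually reports, and note that older releases may need a different syntax.

```python
def create_keyspace_cql(keyspace, replicas_per_dc):
    """Build a CREATE KEYSPACE statement with one replica count per data center.

    replicas_per_dc maps a data center name to the number of copies kept there.
    The names used by callers are illustrative; they must match your snitch.
    """
    if not replicas_per_dc:
        raise ValueError("at least one data center is required")
    options = ["'class': 'NetworkTopologyStrategy'"]
    for dc_name, copies in sorted(replicas_per_dc.items()):
        options.append(f"'{dc_name}': {int(copies)}")
    body = ", ".join(options)
    return f"CREATE KEYSPACE {keyspace} WITH replication = {{{body}}};"


if __name__ == "__main__":
    # Hypothetical data center names and replica counts.
    print(create_keyspace_cql("demo", {"Cassandra": 3, "Analytics": 2}))
```

The generated statement can be pasted into a CQL shell session; the sketch only formats text and does not connect to a cluster.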
Correctly configuring a multi-node cluster requires the following:
This information is used to configure Node and Cluster Initialization Properties in the cassandra.yaml configuration file on each node in the cluster. Each node should be correctly configured before starting up the cluster.
This example describes installing an eight-node cluster (node0 to node7) spanning two racks in a single data center.
Location of the property file:
You set properties for each node in the cassandra.yaml file. This file is located in different places depending on the type of installation:
Note
After changing properties in the cassandra.yaml file, you must restart the node for the changes to take effect.
To configure a mixed-workload cluster:
The nodes have the following IPs, and one node per rack will serve as a seed:
Calculate the token assignments using the Token Generating Tool for a single data center.
| Node | Token |
|---|---|
| node0 | 0 |
| node1 | 21267647932558653966460912964485513216 |
| node2 | 42535295865117307932921825928971026432 |
| node3 | 63802943797675961899382738893456539648 |
| node4 | 85070591730234615865843651857942052864 |
| node5 | 106338239662793269832304564822427566080 |
| node6 | 127605887595351923798765477786913079296 |
| node7 | 148873535527910577765226390751398592512 |
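The token values can be cross-checked by hand. Assuming the RandomPartitioner token space of 0 to 2**127, evenly spaced tokens are `i * 2**127 / N` for node `i` of `N`. The following is a minimal Python sketch of that arithmetic. It is not the Token Generating Tool itself, and the function name `evenly_spaced_tokens` is made up for illustration.

```python
def evenly_spaced_tokens(node_count):
    """Return one initial_token per node, evenly spaced over 0 .. 2**127."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    ring_size = 2 ** 127  # assumed RandomPartitioner token space
    # Integer division keeps every token an exact whole number.
    return [i * ring_size // node_count for i in range(node_count)]


if __name__ == "__main__":
    for i, token in enumerate(evenly_spaced_tokens(8)):
        print(f"node{i}: {token}")
```

For eight nodes, each token is a multiple of 2**124, which is the spacing used in this example.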
If you have a firewall running on the nodes in your Cassandra or DataStax Enterprise cluster, you must open certain ports to allow communication between the nodes. See Configuring firewall port access.
Stop the nodes and clear the data.
For packaged installs, run the following commands:
$ sudo service dse stop (stops the service)
$ sudo rm -rf /var/lib/cassandra/* (clears the data from the default directories)
For binary installs, run the following commands from the install directory:
$ ps auwx | grep cassandra (finds the Cassandra and DataStax Enterprise Java process ID [PID])
$ sudo kill <pid> (stops the process)
$ sudo rm -rf /var/lib/cassandra/* (clears the data from the default directories)
Modify the following property settings in the cassandra.yaml file for each node:
Note
In the - seeds list property, include the internal IP addresses of each seed node.
node0
cluster_name: 'MyDemoCluster'
initial_token: 0
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "110.82.155.0,110.82.155.3"
listen_address: 110.82.155.0
rpc_address: 0.0.0.0
node1 to node7
The properties for the rest of the nodes are the same as for node0, except for initial_token and listen_address:
| Node | initial_token | listen_address |
|---|---|---|
| node1 | 21267647932558653966460912964485513216 | 110.82.155.1 |
| node2 | 42535295865117307932921825928971026432 | 110.82.155.2 |
| node3 | 63802943797675961899382738893456539648 | 110.82.155.3 |
| node4 | 85070591730234615865843651857942052864 | 110.82.155.4 |
| node5 | 106338239662793269832304564822427566080 | 110.82.155.5 |
| node6 | 127605887595351923798765477786913079296 | 110.82.155.6 |
| node7 | 148873535527910577765226390751398592512 | 110.82.155.7 |
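Because only initial_token and listen_address change from node to node, the per-node settings can be generated rather than typed by hand. The Python sketch below renders the properties shown for node0 for any node index. It assumes the example's addressing scheme (110.82.155.N for nodeN), the RandomPartitioner token space, and the two seed addresses from the node0 settings; the function name `node_settings` is hypothetical.

```python
SEEDS = ["110.82.155.0", "110.82.155.3"]  # seed IPs taken from the node0 example


def node_settings(index, node_count=8, cluster_name="MyDemoCluster"):
    """Render the example cassandra.yaml properties for node<index> as text."""
    if not 0 <= index < node_count:
        raise ValueError("index must be between 0 and node_count - 1")
    token = index * 2 ** 127 // node_count  # evenly spaced, RandomPartitioner assumed
    address = f"110.82.155.{index}"  # example addressing scheme, not a requirement
    seeds = ",".join(SEEDS)
    return "\n".join([
        f"cluster_name: '{cluster_name}'",
        f"initial_token: {token}",
        "seed_provider:",
        "    - class_name: org.apache.cassandra.locator.SimpleSeedProvider",
        "      parameters:",
        f'          - seeds: "{seeds}"',
        f"listen_address: {address}",
        "rpc_address: 0.0.0.0",
    ])


if __name__ == "__main__":
    print(node_settings(3))
```

The output is meant to be reviewed and merged into each node's existing cassandra.yaml, not to replace the file, since the real file contains many other properties.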
After you have installed and configured DataStax Enterprise on all nodes, start the seed nodes one at a time, and then start the rest of the nodes.
Note
If a node has restarted because of automatic restart, you must stop the node and clear the data directories, as described above.
Check that your ring is up and running:
Packaged installs: nodetool ring -h localhost
Binary installs:
$ cd /<install_directory>
$ bin/nodetool ring -h localhost