DataStax Enterprise 2.2 Documentation

Data center operations

This documentation corresponds to an earlier product version. Make sure this document corresponds to your version.

This section describes the following common DSE Search/Solr operations.

Adding a new Solr node

To increase the number of nodes in a Solr cluster, you can add or bootstrap a DSE node to the cluster. To increase search capacity, bootstrap the node and then, optionally, rebalance the cluster. To bootstrap a Solr node, use the same method you use to bootstrap a Cassandra node. The default DSESimpleSnitch automatically puts all the Solr nodes in the same data center. Use OpsCenter Enterprise to rebalance the cluster.

Modifying Solr data

When you insert data into Cassandra, it shows up in Solr. When you add data to Solr, it shows up in Cassandra. You can use any Solr API to write data to Solr; however, the native Solr HTTP REST API is recommended. Writes are durable: a Solr API client writes data to Cassandra first, and then Cassandra updates secondary indexes.

To modify or remove data from a Solr node, use the Cassandra Query Language (CQL), the Command Line Interface (CLI), or Solr APIs. Updating a field in Cassandra also updates the data in Solr: when you update the column family, the corresponding Solr document is updated.

Updating individual fields in a Solr document

You can use the Solr API to insert into, modify, or delete data from a Solr node. When using the Solr API to change a document, the entire document is updated. Using DSE Search, you can update an individual field. After writing the modifications to the Solr document, use a URL in the following format to update the document:

curl http://<host>:<port>/solr/<keyspace>.<column family>/update?replacefields=false

The Solr convention is to use curl for issuing update commands instead of using a browser.

When you use CQL or CLI to update a field, DSE Search implicitly sets replacefields to false and updates individual fields in the Solr document.
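
The following is a minimal sketch of a field-level update sent with curl. It assumes the changed fields are posted as a standard Solr XML add message and that the document's unique key field is named id. The function names, the keyspace demo, and the column family books are hypothetical and only illustrate the URL format described above.

```shell
#!/bin/sh
# Build the DSE Search update URL with replacefields=false.
# Arguments: host, port, keyspace, column family (all placeholders).
build_update_url() {
  host=$1; port=$2; keyspace=$3; column_family=$4
  printf 'http://%s:%s/solr/%s.%s/update?replacefields=false\n' \
    "$host" "$port" "$keyspace" "$column_family"
}

# Post an XML payload to the update URL (assumption: Solr XML add format).
# The fifth argument is the XML document containing the key and the
# fields to change.
update_field() {
  url=$(build_update_url "$1" "$2" "$3" "$4")
  curl "$url" -H 'Content-type: text/xml' --data-binary "$5"
}

# Example call (hypothetical keyspace "demo" and column family "books"):
# update_field localhost 8983 demo books \
#   '<add><doc><field name="id">123</field><field name="title">New title</field></doc></add>'
```

Quoting the URL inside the function keeps the shell from interpreting the ? character.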

Warning about using optimize

Do not use the optimize command. Using the optimize command in a URL can cause nodes to fail.

Decommissioning and repairing a node

You can decommission and repair a Solr node in the same manner as you would a Cassandra node.

Rebuilding an index

The dsetool utility can rebuild a Solr index from existing Cassandra data. To rebuild a corrupted index:

  1. Run nodetool drain.

  2. Shut down the node.

  3. Delete the Solr index directory for the bad column family. The Solr index directory path is <Cassandra data directory>/solr.data/<keyspace_name>.<column-family-name>.

  4. Restart the node.

  5. Use this command to rebuild the index:

    ./dsetool rebuild_indexes <keyspace> <columnfamily>
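
The five steps above can be sketched as a script. This is an illustration under stated assumptions, not a supported tool: stop_node and start_node are hypothetical functions that you would define for your installation, and the data directory is passed in as an argument. By default the script only prints each command; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Derive the Solr index directory:
# <Cassandra data directory>/solr.data/<keyspace>.<column family>
solr_index_dir() {
  printf '%s/solr.data/%s.%s\n' "$1" "$2" "$3"
}

# Print the command in dry-run mode (the default); otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Arguments: Cassandra data directory, keyspace, column family.
rebuild_solr_index() {
  data_dir=$1; keyspace=$2; column_family=$3
  index_dir=$(solr_index_dir "$data_dir" "$keyspace" "$column_family")

  run nodetool drain                                          # step 1
  run stop_node                                               # step 2 (hypothetical helper)
  run rm -rf "$index_dir"                                     # step 3
  run start_node                                              # step 4 (hypothetical helper)
  run ./dsetool rebuild_indexes "$keyspace" "$column_family"  # step 5
}

# Example: rebuild_solr_index /path/to/cassandra/data demo books
```

Reviewing the dry-run output first is a cheap way to confirm that the rm -rf target is the intended index directory.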
    

Managing the Location of Solr Data

Solr has its own set of data files. As with Cassandra data files, you can control where the Solr data files are saved on the server. By default, the data is saved in <Cassandra data directory>/solr.data. You can change the location from <Cassandra data directory> to another directory from the command line. For example:

cassandra -s -Ddse.solr.data.dir=/opt

In this example, the data in solr.data is saved in the /opt directory.

Accessing the Validation Log

DSE Search stores validation errors that arise from non-indexable data sent from non-Solr nodes in this log:

/var/log/cassandra/solrvalidation.log

For example, if a Cassandra node that is not running Solr puts a string in a date field, an exception is logged for that column when the data is replicated to the Solr node.
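
As a small sketch, the most recent entries in the validation log can be inspected with tail. The function name and the default of 20 lines are assumptions made for illustration. The format of the log entries is not described here, so the sketch prints the lines unparsed.

```shell
#!/bin/sh
# Show the last N lines of the Solr validation log.
# Arguments (both optional): log file path, number of lines.
show_validation_errors() {
  log_file=${1:-/var/log/cassandra/solrvalidation.log}
  lines=${2:-20}
  if [ -r "$log_file" ]; then
    tail -n "$lines" "$log_file"
  else
    echo "validation log not readable: $log_file" >&2
    return 1
  fi
}

# Example: show_validation_errors
# Example: show_validation_errors /var/log/cassandra/solrvalidation.log 50
```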

Changing the Solr Connector Port

To change the Solr port from the default, 8983, change the http.port setting in the catalina.properties file installed with DSE in <dse-version>/resources/tomcat/conf.
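
The edit can be scripted as in the sketch below. It assumes the setting appears in catalina.properties as a line of the form http.port=8983 with no spaces around the equals sign; check the file before relying on this. The function name is hypothetical, and a backup copy is written alongside the original. A restart of the node is likely needed before the new port takes effect.

```shell
#!/bin/sh
# Rewrite the http.port line in a catalina.properties file.
# Arguments: path to catalina.properties, new port number.
set_solr_port() {
  props_file=$1; new_port=$2
  # Keep a backup, then rewrite the file from the backup copy.
  cp "$props_file" "$props_file.bak" || return 1
  sed "s/^http\.port=.*/http.port=$new_port/" "$props_file.bak" > "$props_file"
}

# Example:
# set_solr_port <dse-version>/resources/tomcat/conf/catalina.properties 8984
```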