DataStax Enterprise 2.2 Documentation

Querying Search Results

This documentation corresponds to an earlier product version. Make sure this document corresponds to your version.

DSE Search hooks into the Cassandra Command Line Interface (CLI), Cassandra Query Language (CQL) library, the CQLsh tool, existing Solr APIs, and Thrift APIs.

Using Existing Solr Clients

All existing Solr clients work with DSE 2.0 and later. If you have an existing Solr application, moving it to DSE is straightforward: create a schema, import your data, and query using your existing Solr tools. The Wikipedia demo is built and queried using SolrJ. The query is done using pure Ajax; no Cassandra API is used for the demo.

Integration of Solr Queries into Cassandra API

Assuming you have set up DSE Search and have data from a column family indexed in Solr, you can include a solr_query expression in a CQL statement to take advantage of the DSE Search hooks into the Solr API. This capability offers extensive query options, such as fuzzy matching.

The solr_query value accepts any Lucene query syntax. You can also use any Thrift-based client, such as Pycassa or Hector. Pycassa supports secondary indexes, and you can use secondary indexes in Pycassa just as you use the solr_query expression in DSE Search.

Querying Search Results Using CQL

You can use the CQL SELECT statement to retrieve Solr data.

Synopsis

SELECT [FIRST <n>] [REVERSED] <select expression>
FROM <column family>
[USING <consistency>]
[WHERE solr_query = '<search expression>'] [LIMIT <n>]

<select expression> syntax is:

{ <start_of_range> .. <end_of_range> | * }
| COUNT(* | 1)

A SELECT expression reads one or more records from a Cassandra column family and returns a result-set of rows. Each row consists of a row key and a collection of columns corresponding to the query.

Unlike the projection in a SQL SELECT, there is no guarantee that the results will contain all of the columns specified, because Cassandra is schema-optional. Requesting non-existent columns does not cause an error. In a production environment that uses a mixed workload cluster, you must search using the LOCAL_QUORUM consistency level, as described in the Data Consistency in DSE Search article.

Example

To query the Wikipedia demo search results:

  1. Connect to the Cassandra Query Language (CQL) shell program. On a Mac, for example:

    cd <install_location>/bin
    
    ./cqlsh localhost
    
  2. Use the wiki keyspace and include the solr_query expression in a select statement to find the titles in the solr column family that begin with the letters natio:

    use wiki;
    
    SELECT title FROM solr USING CONSISTENCY local_quorum
      WHERE solr_query='title:natio*';
    

The query output appears:

 title
--------------------------------------------------------------------------
                                      Bolivia national football team 2002
 List of French born footballers who have played for other national teams
                    Lithuania national basketball team at Eurobasket 2009
                                      Bolivia national football team 2000
                                    Kenya national under-20 football team
                                      Bolivia national football team 1999
                                 Israel men's national inline hockey team
                                      Bolivia national football team 2001
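
The statement used in the example above can also be assembled programmatically before it is handed to a CQL client. The following Python sketch only builds the statement text and does not connect to a cluster. The function name build_solr_select and its parameters are hypothetical, and the quote handling assumes the usual CQL convention of doubling a single quote inside a string literal.

```python
def build_solr_select(column_family, search_expression, columns="*",
                      consistency=None, limit=None):
    """Assemble a CQL SELECT that carries a solr_query expression.

    Hypothetical helper for illustration; it returns text only.
    """
    # Assumption: a single quote inside a CQL string literal is
    # escaped by doubling it.
    escaped = search_expression.replace("'", "''")

    parts = ["SELECT {} FROM {}".format(columns, column_family)]
    if consistency is not None:
        # Same clause form as the cqlsh example above.
        parts.append("USING CONSISTENCY {}".format(consistency))
    parts.append("WHERE solr_query='{}'".format(escaped))
    if limit is not None:
        parts.append("LIMIT {}".format(int(limit)))
    return " ".join(parts) + ";"


if __name__ == "__main__":
    # Prefix search, as in the Wikipedia demo example.
    print(build_solr_select("solr", "title:natio*",
                            columns="title", consistency="LOCAL_QUORUM"))
    # Lucene fuzzy matching uses a trailing tilde on the term.
    print(build_solr_select("solr", "title:nation~", columns="title", limit=10))
```

Keeping the search expression as a separate argument makes it easier to swap in other Lucene syntax, such as the fuzzy query in the second call, without touching the rest of the statement.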

Querying Multiple Column Families

To map multiple Cassandra column families to a single Solr core, use the Solr API. Specify multiple column families using the shards parameter. For example:

http://<host>:<port>/solr/<keyspace1>.<cf1>/select?q=*:*&shards=
  <host>:<port>/solr/<keyspace1>.<cf1>,<host>:<port>/solr/<keyspace2>.<cf2>

Using the Solr API, you can query multiple column families simultaneously if they have the same schema.
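
As a rough illustration of the URL pattern above, the following Python sketch builds a select URL whose shards parameter lists several <keyspace>.<column family> cores. The helper names are hypothetical, and the host, port, keyspace, and column family values in the usage lines are placeholders, not values taken from a real cluster. The sketch only constructs the URL; it does not send a request.

```python
from urllib.parse import urlencode


def core_path(host, port, keyspace, column_family):
    """Return '<host>:<port>/solr/<keyspace>.<cf>' for one core (hypothetical helper)."""
    return "{}:{}/solr/{}.{}".format(host, port, keyspace, column_family)


def build_sharded_select_url(host, port, cores, query="*:*"):
    """Build a Solr select URL that queries several column families.

    cores is a list of (keyspace, column_family) pairs. The first pair
    is used for the request path, and every pair is listed in shards.
    """
    if not cores:
        raise ValueError("at least one (keyspace, column_family) pair is required")

    shards = ",".join(core_path(host, port, ks, cf) for ks, cf in cores)
    first_ks, first_cf = cores[0]
    base = "http://" + core_path(host, port, first_ks, first_cf) + "/select"

    # Leave ':', '/', ',' and '*' unescaped so the URL matches the
    # pattern shown above; other characters are percent-encoded.
    params = urlencode({"q": query, "shards": shards}, safe=":/,*")
    return base + "?" + params


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(build_sharded_select_url(
        "localhost", 8983, [("keyspace1", "cf1"), ("keyspace2", "cf2")]))
```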