DataStax Enterprise 3.1 Documentation

Using the ExtendedDisMax query parser

This documentation corresponds to an earlier product version. Make sure this document corresponds to your version.

The traditional Solr query parser (defType=lucene) is the default query parser and is intended for compatibility with traditional Solr queries. The ExtendedDisMax query parser (eDisMax) includes more features than the traditional Solr query parser, such as multi-field query (Disjunction Max Query), relevancy calculations, full query syntax, and aliasing. eDisMax is essentially a combination of the traditional Solr query parser and the dismax query parser, plus a number of other functional and usability enhancements. It is the most powerful query parser that Solr offers out of the box. For more information, see Solr 4.x Deep Dive by Jack Krupansky.

eDisMax supports phrase query features, such as phrase fields and phrase slop. You can use full query syntax to search multiple fields transparently, eliminating the need for inefficient copyField directives.

eDisMax example

To query the title and body fields of the mykeyspace.mysolr table from the previous example, specify edismax as the defType, the title and body fields as the query fields (qf), and a boost factor for each field in your query:

http://localhost:8983/solr/mykeyspace.mysolr/
  select?q=life+Life
  &defType=edismax&qf=title^10.0+body^0.2
  &wt=json&indent=on&omitHeader=on

Output in JSON format is:

{
 "response":{"numFound":3,"start":0,"docs":[
  {
    "id":"123",
    "body":"Life is a foreign language; all men mispronounce it.",
    "name":"Christopher Morley",
    "title":"Life"},
  {
    "id":"124",
    "body":"In matters of self-control as we shall see again and again, speed kills. But a little friction really can save lives.",
    "name":"Daniel Akst",
    "title":"Life"},
  {
    "id":"126",
    "body":"If A is success in life, then A equals x plus y plus z. Work is x; y is play; and z is keeping your mouth shut.",
    "name":"Albert Einstein",
    "title":"Success"}]
}}
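
The same request can be assembled programmatically, which avoids hand-encoding the boost syntax. The following is a minimal Python sketch, assuming the host, port, and core name from the example above. The helper name build_edismax_url is hypothetical and is not part of DSE or Solr.

```python
from urllib.parse import urlencode

# Assumed from the example above: a local node and the mykeyspace.mysolr core.
BASE_URL = "http://localhost:8983/solr/mykeyspace.mysolr/select"


def build_edismax_url(query, boosts, base_url=BASE_URL):
    """Return a select URL that uses eDisMax with per-field boosts.

    boosts maps a field name to its boost factor, e.g. {"title": 10.0}.
    (Hypothetical helper, for illustration only.)
    """
    # qf is a space-separated list of field^boost entries.
    qf = " ".join(f"{field}^{boost}" for field, boost in boosts.items())
    params = {
        "q": query,
        "defType": "edismax",
        "qf": qf,
        "wt": "json",
        "indent": "on",
        "omitHeader": "on",
    }
    # urlencode turns spaces into '+' and '^' into '%5E'.
    return base_url + "?" + urlencode(params)


url = build_edismax_url("life Life", {"title": 10.0, "body": 0.2})
print(url)
```

The percent-encoded %5E in the generated URL is equivalent to the literal ^ in the hand-written URL.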

If you change the boost factors in the query to make matches in the body more significant than matches in the title (for example, qf=title^0.2+body^10.0), the response lists the quotations from Morley and Einstein before the quotation from Akst:

{
  "response":{"numFound":3,"start":0,"docs":[
      {
        "id":"123",
        "body":"Life is a foreign language; all men mispronounce it.",
        "name":"Christopher Morley",
        "title":"Life"},
      {
        "id":"126",
        "body":"If A is success in life, then A equals x plus y plus z. Work is x; y is play; and z is keeping your mouth shut.",
        "name":"Albert Einstein",
        "title":"Success"},
      {
        "id":"124",
        "body":"In matters of self-control as we shall see again and again, speed kills. But a little friction really can save lives.",
        "name":"Daniel Akst",
        "title":"Life"}]
  }}
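
The reordering follows from how a Disjunction Max Query scores a document: the score is the highest boosted score among the matching fields, plus a tie-breaker fraction of the remaining field scores. The sketch below is a toy model of that rule, not Lucene's actual relevancy calculation. The per-field match scores, the tie value of 0.1, and the assumption that the body of the Akst quotation does not match are all hypothetical, chosen only to illustrate the effect of the boosts.

```python
def dismax_score(field_scores, boosts, tie=0.0):
    """Toy Disjunction Max score: best boosted field score + tie * the rest."""
    boosted = [boosts.get(field, 1.0) * score
               for field, score in field_scores.items()]
    if not boosted:
        return 0.0
    best = max(boosted)
    return best + tie * (sum(boosted) - best)


# Hypothetical raw match scores per field (NOT real Solr output).
docs = {
    "Morley":   {"title": 1.0, "body": 0.5},
    "Akst":     {"title": 1.0, "body": 0.0},
    "Einstein": {"title": 0.0, "body": 0.4},
}


def rank(boosts, tie=0.1):
    """Return author names ordered from highest to lowest toy score."""
    return sorted(docs,
                  key=lambda name: dismax_score(docs[name], boosts, tie),
                  reverse=True)


print(rank({"title": 10.0, "body": 0.2}))  # title-heavy boosts
print(rank({"title": 0.2, "body": 10.0}))  # body-heavy boosts
```

With these assumed inputs, the title-heavy boosts rank Morley, Akst, Einstein, and the body-heavy boosts rank Morley, Einstein, Akst, which is the same order as the two responses.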

Configuring the default parser and query fields

You can set default values for most Solr request parameters in the search request handler in solrconfig.xml. To simplify queries you can make eDisMax the default query parser and also specify the query fields in the solrconfig.xml. After configuring the default parser, you no longer need to specify &defType=edismax on the Solr query request.

To modify the default query parser:

  1. Open the solrconfig.xml file for editing. For example, navigate to the demos/wikipedia directory and open the solrconfig.xml file located there.

  2. Locate the solr.SearchHandler and add edismax as the defType.

    This step eliminates the need to use defType in a query.

  3. Add a line to the solr.SearchHandler that specifies different default query fields and field boosting.

    For example:

    <requestHandler name="search" class="solr.SearchHandler" default="true">
      <!-- default values for query parameters can be specified, these
           will be overridden by parameters in the request
      -->
      <lst name="defaults">
        <str name="echoParams">explicit</str>
        <int name="rows">10</int>
        <str name="defType">edismax</str>
        <str name="qf">body^.2 title^10.0</str>
      </lst>
    </requestHandler>
    

    This step eliminates the need to use qf in a query.

  4. Post the configuration and schema files using the cURL utility:

    curl http://localhost:8983/solr/resource/mykeyspace.mysolr/solrconfig.xml \
      --data-binary @solrconfig.xml -H 'Content-type:text/xml; charset=utf-8'

    curl http://localhost:8983/solr/resource/mykeyspace.mysolr/schema.xml \
      --data-binary @schema.xml -H 'Content-type:text/xml; charset=utf-8'
    
  5. Reload the Solr core for the keyspace and table:

    curl "http://localhost:8983/solr/admin/cores?action=RELOAD&name=mykeyspace.mysolr"
    
  6. Test the configuration using this simplified query:

    http://localhost:8983/solr/mykeyspace.mysolr/
      select?q=life+Life
      &wt=json&indent=on&omitHeader=on
    

    Output is:

    {
     "response":{"numFound":3,"start":0,"docs":[
       {
          "id":"123",
          "body":"Life is a foreign language; all men mispronounce it.",
          "name":"Christopher Morley",
          "title":"Life"
       },
       {
          "id":"124",
          "body":"In matters of self-control as we shall see again and again, speed kills. But a little friction really can save lives.",
          "name":"Daniel Akst",
          "title":"Life"},
       {
          "id":"126",
          "body":"If A is success in life, then A equals x plus y plus z. Work is x; y is play; and z is keeping your mouth shut.",
          "name":"Albert Einstein",
          "title":"Success"}]
    }}
    

    The default query field boost factors make matches in the title more significant than matches in the body, so the output lists the quotations from Morley and Akst before the quotation of Einstein.
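
Steps 2 through 5 can also be scripted. The following Python sketch is one possible approach, not a DSE tool: the helper names are hypothetical, the handler is assumed to be named search as in the example, and the URLs are the ones used in the curl commands above. The sketch only builds the requests; sending them (for example with urllib.request.urlopen) requires a running node. ElementTree discards XML comments and the XML declaration when it parses and reserializes a file, so treat the rewritten configuration as a starting point.

```python
import urllib.request
import xml.etree.ElementTree as ET

SOLR = "http://localhost:8983/solr"  # assumed local node, as in the steps above
CORE = "mykeyspace.mysolr"


def set_edismax_defaults(config_xml, qf, handler_name="search"):
    """Return solrconfig XML with defType=edismax and qf set as defaults."""
    root = ET.fromstring(config_xml)
    for handler in root.iter("requestHandler"):
        if handler.get("name") != handler_name:
            continue
        # Reuse the existing defaults list, or create one if it is missing.
        defaults = handler.find("lst[@name='defaults']")
        if defaults is None:
            defaults = ET.SubElement(handler, "lst", {"name": "defaults"})
        for name, value in (("defType", "edismax"), ("qf", qf)):
            node = defaults.find(f"str[@name='{name}']")
            if node is None:
                node = ET.SubElement(defaults, "str", {"name": name})
            node.text = value
    return ET.tostring(root, encoding="unicode")


def upload_request(resource, payload):
    """Build (but do not send) the POST that uploads a resource file."""
    return urllib.request.Request(
        f"{SOLR}/resource/{CORE}/{resource}",
        data=payload.encode("utf-8"),
        headers={"Content-type": "text/xml; charset=utf-8"},
        method="POST",
    )


def reload_url():
    """Return the core-reload URL used in step 5."""
    return f"{SOLR}/admin/cores?action=RELOAD&name={CORE}"


# A trimmed-down stand-in for a real solrconfig.xml.
sample = """<config>
  <requestHandler name="search" class="solr.SearchHandler" default="true">
    <lst name="defaults">
      <str name="echoParams">explicit</str>
      <int name="rows">10</int>
    </lst>
  </requestHandler>
</config>"""

updated = set_edismax_defaults(sample, "body^.2 title^10.0")
request = upload_request("solrconfig.xml", updated)
print(request.get_method(), request.full_url)
print(reload_url())
```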