Riak SOLR over HTTP and date ranges?


Can anyone tell me what date format Riak expects when searching through the Solr API over HTTP? I have some indexed data, which a wildcard search confirms:

{
    "responseHeader": {
        "status": 0,
        "QTime": 13,
        "params": {
            "q": "*",
            "q.op": "or",
            "filter": "",
            "wt": "json"
        }
    },
    "response": {
        "numFound": 2,
        "start": 0,
        "maxScore": "0.00000e+0",
        "docs": [
            {
                "id": "09d1bf74-9cdc-4001-8797-fc5a4b9170b0",
                "index": "TestIndex",
                "fields": {
                    "Timestamp_dt": "2014-06-06T02:10:35.367Z"
                },
                "props": {}
            },
            {
                "id": "09d1bf74-9cdc-4001-8797-fc5a4b9170b0",
                "index": "TestIndex",
                "fields": {
                    "Timestamp_dt": "2014-06-08T02:10:35.367Z"
                },
                "props": {}
            }
        ]
    }
}

I've also confirmed my schema is picking up _dt as datetime:

%% Field names ending in "_dt" are indexed as dates
        {dynamic_field, [
            {name, "*_dt"},
            {type, date},
            {analyzer_factory, {erlang, text_analyzers, noop_analyzer_factory}}
        ]},

I've tried a bunch of variations including these:

/solr/TestIndex/select?wt=json&q=Timestamp_dt:[20140508000000%20TO%2020140608000000]
/solr/TestIndex/select?wt=json&q=Timestamp_dt:[20140508T000000Z TO 20140607T000000Z]
/solr/TestIndex/select?wt=json&q=Timestamp_dt:%5B2014-05-08T00%3A00%3A00.000Z%20TO%202014-06-07T00%3A00%3A00.000Z%5D

I'm stumped, and the docs on date ranges are somewhat lacking. Has anyone gotten this to work? Am I stuck converting to epoch timestamps?

1 Answer

Answered by Joe

Date fields use the noop analyzer, so the indexed text is exactly what you stored. However, the colon is a special character in the query syntax (it separates the field name from the value), so you need to escape each colon inside the value with a backslash:

% curl localhost:8098/buckets/testbucket/keys/1 -XPUT -H "content-type: application/json" \
  -d '{"item":"1","stamp_dt":"2014-06-06T02:10:35.367Z"}'
% curl localhost:8098/buckets/testbucket/keys/2 -XPUT -H "content-type: application/json" \
  -d '{"item":"2","stamp_dt":"2014-06-07T02:10:35.367Z"}'
% curl localhost:8098/buckets/testbucket/keys/3 -XPUT -H "content-type: application/json" \
  -d '{"item":"3","stamp_dt":"2014-06-07T06:10:35.367Z"}'

% curl -g 'localhost:8098/solr/testbucket/select?q=stamp_dt:2014-06-06T02\:10\:35.367Z'
<?xml version="1.0" encoding="UTF-8"?>
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">1</int>
    <lst name="params">
      <str name="indent">on</str>
      <str name="start">0</str>
      <str name="q">stamp_dt:2014-06-06T02\:10\:35.367Z</str>
      <str name="q.op">or</str>
      <str name="filter"></str>
      <str name="df">value</str>
      <str name="wt">standard</str>
      <str name="version">1.1</str>
      <str name="rows">1</str>
    </lst>
  </lst>
  <result name="response" numFound="1" start="0" maxScore="0.353553">
    <doc>
      <str name="id">1
      </str>
      <str name="item">1
      </str>
      <date name="stamp_dt">2014-06-06T02:10:35.367Z
      </date>
    </doc>
  </result>
</response>
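
If you build these queries from code rather than by hand, the escaping above can be sketched roughly as follows. This is a minimal sketch, not part of any Riak client: `escape_colons` and `term_query_path` are hypothetical helper names, and it assumes the server decodes a percent-encoded `q` parameter the usual way.

```python
from urllib.parse import quote


def escape_colons(value: str) -> str:
    """Backslash-escape every colon so the query parser reads it as literal text."""
    return value.replace(":", "\\:")


def term_query_path(index: str, field: str, value: str) -> str:
    """Build a request path for an exact-match query on one field (hypothetical helper)."""
    # The colon after the field name is the real field/value separator, so it stays unescaped.
    q = f"{field}:{escape_colons(value)}"
    # Percent-encode the whole q parameter; safe="" also encodes ":" and "\".
    return f"/solr/{index}/select?q={quote(q, safe='')}"


print(term_query_path("testbucket", "stamp_dt", "2014-06-06T02:10:35.367Z"))
```

The backslashes survive percent-encoding as `%5C`, so the query parser still sees `\:` after the server decodes the URL.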

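The date format you've chosen also lends itself well to range queries, because ISO 8601 timestamps in the same time zone sort as plain text in chronological order. The spaces around TO have to be URL-encoded as %20 because curl sends the URL exactly as written and a literal space is not valid in a URL; the -g flag stops curl from treating the square brackets as a glob range: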

% curl -g 'localhost:8098/solr/testbucket/select?q=stamp_dt:[2014-06-06%20TO%202014-06-07T23\:59]'
<?xml version="1.0" encoding="UTF-8"?>
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">4</int>
    <lst name="params">
      <str name="indent">on</str>
      <str name="start">0</str>
      <str name="q">stamp_dt:[2014-06-06 TO 2014-06-07T23\:59]</str>
      <str name="q.op">or</str>
      <str name="filter"></str>
      <str name="df">value</str>
      <str name="wt">standard</str>
      <str name="version">1.1</str>
      <str name="rows">3</str>
    </lst>
  </lst>
  <result name="response" numFound="3" start="0" maxScore="0.00000e+0">
    <doc>
      <str name="id">1
      </str>
      <str name="item">1
      </str>
      <date name="stamp_dt">2014-06-06T02:10:35.367Z
      </date>
    </doc>
    <doc>
      <str name="id">2
      </str>
      <str name="item">2
      </str>
      <date name="stamp_dt">2014-06-07T02:10:35.367Z
      </date>
    </doc>
    <doc>
      <str name="id">3
      </str>
      <str name="item">3
      </str>
      <date name="stamp_dt">2014-06-07T06:10:35.367Z
      </date>
    </doc>
  </result>
</response>
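
Applied to the index and field from the question, the same approach might look like the sketch below. `range_query_path` is a hypothetical helper, and the final check only assumes (based on the noop analyzer behaviour described above) that the bounds are compared against the stored text as plain strings.

```python
from urllib.parse import quote


def range_query_path(index: str, field: str, start: str, end: str) -> str:
    """Build a request path for an inclusive range query (hypothetical helper)."""
    # Escape colons inside the bounds; the field/value colon stays as-is.
    start_esc = start.replace(":", "\\:")
    end_esc = end.replace(":", "\\:")
    q = f"{field}:[{start_esc} TO {end_esc}]"
    # quote() encodes spaces as %20 and brackets as %5B / %5D.
    return f"/solr/{index}/select?wt=json&q={quote(q, safe='')}"


path = range_query_path(
    "TestIndex", "Timestamp_dt",
    "2014-05-08T00:00:00.000Z", "2014-06-07T00:00:00.000Z",
)
print(path)

# Plain string comparison on the three stored values from the example above,
# using the same partial bounds as the curl range query.
stamps = [
    "2014-06-06T02:10:35.367Z",
    "2014-06-07T02:10:35.367Z",
    "2014-06-07T06:10:35.367Z",
]
low, high = "2014-06-06", "2014-06-07T23:59"
in_range = [s for s in stamps if low <= s <= high]
print(in_range)
```

Compared with the third attempt in the question, the only difference in the generated path is the `%5C` (backslash) in front of each encoded colon inside the bounds.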