Solr - Match whole word only in text fields

Question

Asked by axelrod on 22 December 2013 at 14:19

I have a text field that can contain very long values (such as entire text files). I want to create a field type for it (text, not string) that behaves like "Match whole word only" in Notepad++, except that the word delimiters should not be limited to whitespace. If I have:

myName=aaa bbb

I would like this document to match the search strings "aaa", "bbb", "aaa bbb", "myName=aaa bbb", and "myName", but not "aa", "ame=a", or "a bb". Another example is:

<myName>aaa bbb</myName>

Can I do this somehow?

What should be my field type definition?

[EDIT] The text can contain any character. Before searching, I escape the search string using ClientUtils: http://lucene.apache.org/solr/4_2_1/solr-solrj/org/apache/solr/client/solrj/util/ClientUtils.html

Thanks

There is 1 answer

**Arun** · Answer 1 · 2013-12-22T20:49:31+00:00

To start with: why do you need to escape special characters? You need to let the text be tokenized on them at both index and query time. Start with this field type:

<!-- A general text field that has reasonable, generic
     cross-language defaults: it tokenizes with StandardTokenizer,
     removes stop words from case-insensitive "stopwords.txt"
     (empty by default), and down cases.  At query time only, it
     also applies synonyms. -->
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
    <!-- in this example, we will only use synonyms at query time
    <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
    -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
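
A field type only takes effect once a field in schema.xml refers to it. As a minimal sketch, assuming a field named "content" (a hypothetical name, not taken from the question) holds the long text, the declaration could look like this:

```xml
<!-- Hypothetical field name "content"; replace it with the real field.
     type="text_general" refers to the fieldType defined above. -->
<field name="content" type="text_general" indexed="true" stored="true"/>
```

With this type, the value myName=aaa bbb should be indexed as the lowercased tokens myname, aaa, and bbb, so a search for "aa" finds no matching token. Multi-word searches such as "aaa bbb" would presumably need to be sent as a quoted phrase so that word order and adjacency are taken into account. Changing the type of an existing field requires reindexing the documents.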

The analysis page in the Solr admin UI is a good place to learn how your text is processed at both index and query time. It is a very useful tool: http://localhost:8983/solr/#/collection1/analysis
