RediSearch fuzzy search with Chinese text



I want to perform a fuzzy search in RediSearch.

name is a text field. I tried

FT.SEARCH panda_beta.mp2c.job_store.json:index '@name_fts:*店员* @pk:{16509}' RETURN 2 pk company_id LIMIT 0 1

but it returned no results.

When I set name='大员大学路创智天地店' and searched with

FT.SEARCH panda_beta.mp2c.job_store.json:index '@name_fts:*大员* @pk:{16509}' RETURN 2 pk company_id LIMIT 0 1

the result was returned. Why? How can I resolve this?

There is 1 answer

A. Guy

For Chinese you have to specify the language when creating the index.

Take a look at FT.CREATE documentation:

LANGUAGE {default_lang}

if set, indicates the default language for documents in the index. Default is English.

LANGUAGE_FIELD {lang_attribute}

is document attribute set as the document language.

A stemmer is used for the supplied language during indexing. If an unsupported language is sent, the command returns an error. The supported languages are Arabic, Basque, Catalan, Danish, Dutch, English, Finnish, French, German, Greek, Hungarian, Indonesian, Irish, Italian, Lithuanian, Nepali, Norwegian, Portuguese, Romanian, Russian, Spanish, Swedish, Tamil, Turkish, and Chinese.

When adding Chinese language documents, set LANGUAGE chinese for the indexer to properly tokenize the terms. If you use the default language, then search terms are extracted based on punctuation characters and whitespace. The Chinese language tokenizer makes use of a segmentation algorithm (via Friso), which segments text and checks it against a predefined dictionary. See Stemming for more information.
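As a rough illustration of the FT.CREATE option quoted above, the sketch below assembles the command arguments with LANGUAGE chinese placed before SCHEMA. The index name comes from the question; the key prefix `job_store:` and the JSON paths `$.name` and `$.pk` are assumptions, since the question does not show the real schema. The helper `build_create_args` is hypothetical and only builds the argument list.

```python
def build_create_args(index, prefix, language="chinese"):
    """Return the argument list for an FT.CREATE call on JSON documents.

    The prefix and the two schema fields are illustrative assumptions;
    replace them with the real key prefix and JSON paths.
    """
    return [
        "FT.CREATE", index,
        "ON", "JSON",
        "PREFIX", "1", prefix,
        "LANGUAGE", language,                  # default language for the index
        "SCHEMA",
        "$.name", "AS", "name_fts", "TEXT",    # assumed path for the text field
        "$.pk", "AS", "pk", "TAG",             # assumed path for the tag field
    ]


create_args = build_create_args(
    "panda_beta.mp2c.job_store.json:index",  # index name from the question
    "job_store:",                            # hypothetical key prefix
)
print(" ".join(create_args))
```

The resulting list can be sent with any client that accepts raw commands, for example redis-py's `execute_command(*create_args)`. As far as I know the default language is fixed when the index is created, so an existing index would likely need to be dropped and recreated for this to take effect.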

And the FT.SEARCH documentation:

LANGUAGE {language}

use a stemmer for the supplied language during search for query expansion. If querying documents in Chinese, set to chinese to properly tokenize the query terms. Defaults to English. If an unsupported language is sent, the command returns an error. See FT.CREATE for the list of languages.
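On the query side, a minimal sketch of the same idea: the search from the question with LANGUAGE chinese added. The helper `build_search_args` is hypothetical, and the query string is reused from the question unchanged; whether the surrounding wildcards are still needed once the Chinese tokenizer is active is something to verify against your own data.

```python
def build_search_args(index, query, language="chinese",
                      fields=("pk", "company_id"), offset=0, num=1):
    """Return the argument list for an FT.SEARCH call with a query language."""
    args = ["FT.SEARCH", index, query, "LANGUAGE", language]
    args += ["RETURN", str(len(fields)), *fields]   # fields to return
    args += ["LIMIT", str(offset), str(num)]        # paging
    return args


search_args = build_search_args(
    "panda_beta.mp2c.job_store.json:index",  # index name from the question
    "@name_fts:*店员* @pk:{16509}",           # query from the question
)
print(" ".join(search_args))
```

With redis-py this list could be passed as `execute_command(*search_args)`; on the command line, quote the query string as in the question.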