RediSearch full-text index not working with Python client


I am trying to follow this Redis documentation link to create a small DB of notable people searchable in real time (using Python client).

I tried similar code, but the final line, which queries by "s", should return two documents; instead, it returns an empty result set. Can anybody help me find the mistake I am making?

import redis
from redis.commands.json.path import Path
import redis.commands.search.aggregation as aggregations
import redis.commands.search.reducers as reducers
from redis.commands.search.field import TextField, NumericField, TagField
from redis.commands.search.indexDefinition import IndexDefinition, IndexType
from redis.commands.search.query import NumericFilter, Query

d1 = {"key": "shahrukh khan", "pl": '{"d": "mvtv", "id": "1234-a", "img": "foo.jpg", "t: "act", "tme": "1965-"}', "org": "1", "p": 100}
d2 = {"key": "salman khan", "pl": '{"d": "mvtv", "id": "1236-a", "img": "fool.jpg", "t: "act", "tme": "1965-"}', "org": "1", "p": 100}
d3 = {"key": "aamir khan", "pl": '{"d": "mvtv", "id": "1237-a", "img": "fooler.jpg", "t: "act", "tme": "1965-"}', "org": "1", "p": 100}


schema = ( 
    TextField("$.key", as_name="key"),  
    NumericField("$.p", as_name="p"),  
) 

r = redis.Redis(host='localhost', port=6379)
rs = r.ft("idx:au") 
rs.create_index(     
    schema,     
    definition=IndexDefinition(     
        prefix=["au:"], index_type=IndexType.JSON   
    )    
)

r.json().set("au:mvtv-1234-a", Path.root_path(), d1)  
r.json().set("au:mvtv-1236-a", Path.root_path(), d2)  
r.json().set("au:mvtv-1237-a", Path.root_path(), d3)  

rs.search(Query("s"))

There are 3 answers

3
L. Meyer On BEST ANSWER

When you execute a query from the redis-py client, it sends an FT.SEARCH command to the Redis server. You can observe this by running the MONITOR command from redis-cli, for example.

According to the documentation, when you provide a single word as the search term, it must match a whole word exactly. That is why your query returns an empty set. If you want to search by prefix, you need to use the expression prefix*.

However, the documentation says:

The prefix needs to be at least two characters long.

Hence, you cannot search for words starting with just s. What you can do instead:

rs.search(Query("sa*"))
#Result{1 total, docs: [Document {'id': 'au:mvtv-1236-a', 'payload': None, 'json': '{"key":"salman khan","pl":"{\\"d\\": \\"mvtv\\", \\"id\\": \\"1236-a\\", \\"img\\": \\"fool.jpg\\", \\"t: \\"act\\", \\"tme\\": \\"1965-\\"}","org":"1","p":100}'}]}
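To make the difference between whole-word and prefix matching concrete, here is a small toy model in plain Python. It is only a sketch and not how RediSearch is implemented: the names `toy_search`, `NAMES` and `MIN_PREFIX` are made up for illustration, and returning nothing for a too-short prefix is an assumption of the toy model (the real server may behave differently).

```python
# Toy model of whole-word vs. prefix matching; NOT RediSearch's implementation.
MIN_PREFIX = 2  # assumed to mirror the documented two-character minimum

NAMES = ["shahrukh khan", "salman khan", "aamir khan"]


def toy_search(term, names=NAMES):
    """Return the names matched by a bare word or a `prefix*` expression."""
    if term.endswith("*"):
        prefix = term[:-1].lower()
        if len(prefix) < MIN_PREFIX:
            return []  # prefix too short: the toy model matches nothing
        # prefix match: some word of the name starts with the prefix
        return [n for n in names if any(w.startswith(prefix) for w in n.split())]
    # bare word: must be equal to one whole word of the name
    return [n for n in names if term.lower() in n.split()]


print(toy_search("s"))     # [] because no name contains the word "s"
print(toy_search("sa*"))   # ['salman khan']
print(toy_search("khan"))  # all three names
```

In this model, "s" matches nothing for the same reason as in the question: no document contains "s" as a complete word.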

Side note

If you want to scope your search to a specific field, the syntax is:

Query("@field_name: word") # Query("@key: sa*")

where @field_name is the schema field's name. Otherwise, the search covers all TextField attributes.
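The query strings above can also be assembled by a small helper that enforces the two-character minimum before anything is sent to the server. This is a minimal sketch: `prefix_query` and `any_prefix` are hypothetical helper names, not part of redis-py, and the `(a*|b*)` union form is assumed to follow the usual RediSearch OR syntax.

```python
def prefix_query(prefix, field=None, min_len=2):
    """Build a prefix query string such as 'sa*' or '@key:sa*'."""
    if len(prefix) < min_len:
        raise ValueError(f"prefix must be at least {min_len} characters long")
    expr = f"{prefix}*"
    # Scope the expression to one field when a field name is given.
    return f"@{field}:{expr}" if field else expr


def any_prefix(prefixes, field=None, min_len=2):
    """OR several prefixes together, e.g. '@key:(sa*|sh*)'."""
    parts = [prefix_query(p, min_len=min_len) for p in prefixes]
    expr = "(" + "|".join(parts) + ")"
    return f"@{field}:{expr}" if field else expr


print(prefix_query("sa"))               # sa*
print(prefix_query("sa", field="key"))  # @key:sa*
print(any_prefix(["sa", "sh"], "key"))  # @key:(sa*|sh*)
```

With the `rs` object from the question, the result would be passed on as `rs.search(Query(any_prefix(["sa", "sh"], "key")))`, which should cover both names starting with "s" without needing a one-character prefix.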

0
Kareem Adel On

The issue might be related to the way you are constructing the Query object. You are trying to find documents whose "key" field matches the string "s". However, a bare term in a query is matched against whole words of the TextField, so "s" only matches a document that contains the exact word "s"; it does not match words that merely start with "s".

So, if you want partial matching on the "key" field, you should use the TextField's prefix (wildcard) search capabilities:

import redis
from redis.commands.json.path import Path
from redis.commands.search.field import TextField, NumericField
from redis.commands.search.indexDefinition import IndexDefinition, IndexType
from redis.commands.search.query import Query

d1 = {"key": "shahrukh khan", "pl": '{"d": "mvtv", "id": "1234-a", "img": "foo.jpg", "t": "act", "tme": "1965-"}', "org": "1", "p": 100}
d2 = {"key": "salman khan", "pl": '{"d": "mvtv", "id": "1236-a", "img": "fool.jpg", "t": "act", "tme": "1965-"}', "org": "1", "p": 100}
d3 = {"key": "aamir khan", "pl": '{"d": "mvtv", "id": "1237-a", "img": "fooler.jpg", "t": "act", "tme": "1965-"}', "org": "1", "p": 100}

schema = (
    TextField("$.key", as_name="key"),
    NumericField("$.p", as_name="p"),
)

r = redis.Redis(host='localhost', port=6379)
rs = r.ft("idx:au")
rs.create_index(
    schema,
    definition=IndexDefinition(
        prefix=["au:"], index_type=IndexType.JSON
    )
)

r.json().set("au:mvtv-1234-a", Path.root_path(), d1)
r.json().set("au:mvtv-1236-a", Path.root_path(), d2)
r.json().set("au:mvtv-1237-a", Path.root_path(), d3)

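# Prefix (wildcard) match on the "key" field.
# Note: with the default minimum prefix length of 2, a one-character prefix such as
# "s*" may match nothing; use at least two characters instead, e.g. Query("@key:sa*").
rs.search(Query("@key:s*"))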
1
Kareem Adel On

You can try redefining the documents d1, d2, and d3. There are syntax errors in the JSON strings: the "t" key is missing its closing quote.

In the following code, I corrected the syntax errors in the JSON strings of the "pl" field and fixed the typo in the query string.

d1 = {"key": "shahrukh khan", "pl": '{"d": "mvtv", "id": "1234-a", "img": "foo.jpg", "t": "act", "tme": "1965-"}', "org": "1", "p": 100}
d2 = {"key": "salman khan", "pl": '{"d": "mvtv", "id": "1236-a", "img": "fool.jpg", "t": "act", "tme": "1965-"}', "org": "1", "p": 100}
d3 = {"key": "aamir khan", "pl": '{"d": "mvtv", "id": "1237-a", "img": "fooler.jpg", "t": "act", "tme": "1965-"}', "org": "1", "p": 100}


schema = ( 
    TextField("$.key", as_name="key"),  
    NumericField("$.p", as_name="p"),  
) 

r = redis.Redis(host='localhost', port=6379)
rs = r.ft("idx:au") 
rs.create_index(     
    schema,     
    definition=IndexDefinition(     
        prefix=["au:"], index_type=IndexType.JSON   
    )    
)

r.json().set("au:mvtv-1234-a", Path.root_path(), d1)  
r.json().set("au:mvtv-1236-a", Path.root_path(), d2)  
r.json().set("au:mvtv-1237-a", Path.root_path(), d3)
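One way to avoid hand-written JSON errors like the one in "pl" is to build that string with `json.dumps` and check it with `json.loads`. The sketch below uses only the standard library; `is_valid_json` is a hypothetical helper name. Note that the schema only indexes `$.key` and `$.p`, so the malformed "pl" string is stored as an ordinary string and is unlikely to be the cause of the empty search result.

```python
import json

# The "pl" string from the question: the "t" key is missing its closing quote.
broken = '{"d": "mvtv", "id": "1234-a", "img": "foo.jpg", "t: "act", "tme": "1965-"}'


def is_valid_json(text):
    """Return True if `text` parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# Build the payload as a dict and let json.dumps produce the string.
pl = {"d": "mvtv", "id": "1234-a", "img": "foo.jpg", "t": "act", "tme": "1965-"}
d1 = {"key": "shahrukh khan", "pl": json.dumps(pl), "org": "1", "p": 100}

print(is_valid_json(broken))    # False
print(is_valid_json(d1["pl"]))  # True
```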