MongoDB string index that is not text?


How does MongoDB index a string field that is not declared as a text index? For example, tweets have many string fields, and I can create an index on any of them. In my application, I have created indexes on when the tweet was written, who wrote it, and the text of the tweet, but only the tweet text is declared as a text index.

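import pymongo as pm
client = pm.MongoClient()  # assumes a MongoDB server on localhost, default port
db = client['twitter']  # hypothetical database name
db.collection.create_index('created_at')  # tweet creation time is a string
db.collection.create_index('user.screen_name')  # user's screen name
db.collection.create_index([('text', pm.TEXT)])  # tweet text is a string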

Yet I can still search the other string fields:

db.collection.find({'user.screen_name': 'johndoe'})

Why? MongoDB's documentation says only one text index can be created per collection, so what is the difference between an ordinary index on a string field and a text index?

Answer by Sylvain Leroux (best answer)

Text indexes are for full-text search. The implementation is somewhat more complex than that, but think of it as an index on every word in the string.

In contrast, plain indexes index the whole field value at once. They should be your default choice, even when a field contains a string, because they are very efficient for equality, range, and prefix searches. They cannot, however, help you find a word in the middle of a field.
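To make the distinction concrete, here is a toy model in plain Python. It is only a sketch of the idea, not how MongoDB implements its indexes: the plain index is modeled as a sorted list of whole field values, and the text index as a map from each word to the documents containing it. The sample tweets and all function names are made up for illustration.

```python
import bisect
from collections import defaultdict

# Made-up sample data: document id -> tweet text.
tweets = {
    1: "coffee is great",
    2: "great coffee today",
    3: "tea time",
}

# Toy "plain" index: whole field values kept in sorted order.
entries = sorted((text, doc_id) for doc_id, text in tweets.items())
keys = [text for text, _ in entries]

def find_equal(value):
    """Equality lookup: binary search for the exact whole value."""
    start = bisect.bisect_left(keys, value)
    end = bisect.bisect_right(keys, value)
    return [doc_id for _, doc_id in entries[start:end]]

def find_prefix(prefix):
    """Prefix lookup: matching values are contiguous in sorted order."""
    start = bisect.bisect_left(keys, prefix)
    ids = []
    for text, doc_id in entries[start:]:
        if not text.startswith(prefix):
            break
        ids.append(doc_id)
    return ids

# Toy "text" index: every word points to the documents that contain it.
word_index = defaultdict(set)
for doc_id, text in tweets.items():
    for word in text.lower().split():
        word_index[word].add(doc_id)

def find_word(word):
    """Word lookup: works wherever the word sits in the field."""
    return sorted(word_index.get(word.lower(), set()))
```

In this model, `find_prefix("coffee")` only reaches tweet 1, because tweet 2 has "coffee" in the middle of the value, while `find_word("coffee")` reaches both.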

Given your example, it makes sense to use a plain index on the user name and a full-text index on the tweet content.
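As a rough sketch of what that looks like on the query side, the filters below are the kind of plain dictionaries you would pass to `find()`. The helper names are hypothetical; the field names come from the question, and `$regex` and `$text` / `$search` are standard MongoDB query operators.

```python
import re

def screen_name_equals(name):
    # Equality match on the whole field; a plain index serves this directly.
    return {'user.screen_name': name}

def screen_name_starts_with(prefix):
    # Anchored, case-sensitive prefix regex; a plain index can serve this too.
    return {'user.screen_name': {'$regex': '^' + re.escape(prefix)}}

def tweet_contains_words(words):
    # Full-text search; requires the text index on the 'text' field.
    return {'$text': {'$search': words}}
```

For example, `db.collection.find(screen_name_equals('johndoe'))` is the query from the question, and `db.collection.find(tweet_contains_words('coffee'))` would search tweet contents through the text index.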