Searching on multiple fields with Indextank


I'm indexing books and performing text searches on several fields of each book:

  • Title
  • Author
  • Book summary

I tried to create an index by concatenating the book name, the author name and the book summary, but some of my searches don't return the expected results and I don't understand why.

What is the right way to index the books so that I can search on all these fields at the same time?

--

Here is the code sample:

book_text_index = "#{book.name} #{book.author} #{book.summary}"

idx.document("book_502").add({  :text => book_text_index,
                                  :book_id => "#{book.id}",
                                  :name => "#{book.name}",
                                  :author => "#{book.author}",
                                  :summary => "#{book.summary}"
                                })

And here is an example of the results I get for the book "L'art de la guerre" by "Sun Tzu".

If I search for the author name ("tzu"), it returns the book:

idx.search("tzu", :function => 1, :fetch => 'text' )['results']

=> [{"text"=>"L'art de la guerre Sun Tzu Youboox libres de droits Traduit pour la première fois...", "docid"=>"book_502", "query_relevance_score"=>-2967.0}]

But if I search for part of the book title ("guerre"), I don't get the book in the results:

idx.search("guerre", :function => 1, :fetch => 'book_id' )['results'].map { |result| result["docid"]}

=> ["book_1962", "book_1963", "book_1951", "book_1832", "book_1812", "book_1787", "book_1775", "book_1778", "book_1730", "book_1740"]

You can see that book_502 is not in the results.

Chris Lamprecht (best answer)

To answer your question about the right way to index the books so that all these fields are searched at the same time: concatenating the fields into a single 'text' field is the simplest way to achieve this. One possible downside of this method is that, for relevance (the order of the results), it gives equal weight to the book title, author, and summary.

In this particular case (book title, author, and summary), I would guess that the book title and author are more "important" for matching than the summary. In other words, if the user's query matches a book title, it is a better result than if it only matched the summary. If this is the case, here is how you can get more relevant results for your users (it's a little more work, but often worth it).

First, you index into 3 separate fields:

  1. name - contains the book title
  2. author - contains the author
  3. text - contains the book summary, and possibly other keywords you want to match
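
The three-field layout above could be built roughly like this. It is only a sketch: `Book` is a hypothetical Struct standing in for your own model, `book_fields` is an illustrative helper name, the summary string is a placeholder, and the commented-out call at the end simply mirrors the `idx.document(...).add(...)` call from the question.

```ruby
# Hypothetical stand-in for the asker's book model (an assumption).
Book = Struct.new(:id, :name, :author, :summary)

# Build the field hash for one document: title, author and summary
# each go into their own field instead of one concatenated string.
def book_fields(book)
  {
    :book_id => book.id.to_s,
    :name    => book.name.to_s,
    :author  => book.author.to_s,
    :text    => book.summary.to_s
  }
end

# Placeholder summary text, not the real one.
book = Book.new(502, "L'art de la guerre", "Sun Tzu", "(book summary goes here)")
fields = book_fields(book)

# With an index handle like the `idx` from the question, this would be sent as:
# idx.document("book_#{book.id}").add(fields)
```

Deriving the docid from `book.id` rather than hardcoding it keeps one document per book.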

Then at search time, in order to search across all fields, you will use an OR query. However, to give more weight to the title and author than the summary, your queries will look like this (example user search for "guerre"):

name:(guerre)^6 OR author:(guerre)^5 OR text:(guerre)

Another example, if the user searches for "sun tzu":

name:(sun tzu)^6 OR author:(sun tzu)^5 OR text:(sun tzu)

The parentheses are necessary to keep proper field grouping. So your query template will be something like this (note, my Ruby is rusty):

searchify_query = "name:(#{user_query})^6 OR author:(#{user_query})^5 OR text:(#{user_query})"
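
One thing to watch with that template: the user's text is interpolated as-is, so a search containing a parenthesis or a colon could change the field grouping. Here is a hedged sketch of a small builder that strips such characters first. The helper names (`sanitize_terms`, `weighted_query`) are made up for illustration, and the list of characters treated as special is an assumption, not something taken from the query syntax documentation.

```ruby
# Characters assumed to carry meaning in the query syntax (grouping,
# field separator, boost marker, quotes). This list is an assumption.
QUERY_SPECIALS = /[()\[\]{}:^"\\]/

# Replace special characters with spaces and collapse runs of whitespace.
def sanitize_terms(user_query)
  user_query.to_s.gsub(QUERY_SPECIALS, " ").split.join(" ")
end

# Build the weighted OR query from the template above.
# Returns nil when nothing searchable is left.
def weighted_query(user_query, name_boost = 6, author_boost = 5)
  terms = sanitize_terms(user_query)
  return nil if terms.empty?
  "name:(#{terms})^#{name_boost} OR author:(#{terms})^#{author_boost} OR text:(#{terms})"
end

puts weighted_query("sun tzu")
```

For "sun tzu" this produces the same string as the example above. The result would then be passed to `idx.search` in place of the raw user query.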

Hope this helps!