We're running Solr 3.4 and have a relatively small index of roughly 90,000 documents. These documents are split over several logical sources, so each search has a filter query applied for a particular source, e.g.:
?q=<query>&fq=source:<source>
where source is a classic string field. We're using edismax and have a default search field, text.
We are currently seeing q=* take on average 20 times longer to run than q=*:*. The difference is quite noticeable, with *:* taking 100ms and * taking up to 3500ms. A search for a common word in the document set (matching nearly 50% of all documents) returns a result in less than 200ms.
Looking at the queries with debugQuery on, we can see that * is parsed to a DisjunctionMaxQuery((text:*)), while *:* is parsed to a MatchAllDocsQuery(*:*). This makes sense, but I still don't feel like it accounts for a slowdown of this magnitude (a slowdown of 2000% over something that matches 50% of the documents).
What could be causing this? Is there anything we can tweak?
When you pass just *, you are telling Solr to check every value in the field and match it against *, and that is a lot of work. However, when you use *:*, you are asking Solr to give you everything and skip any matching. Solr/Lucene is optimized to handle *:* quickly and efficiently.
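To make the difference concrete, here is a minimal sketch in Python of a toy inverted index. It is only an illustration of the idea above, not how Lucene actually stores or searches data; the class ToyIndex and its methods match_all and wildcard_all are hypothetical names invented for this example.

```python
from collections import defaultdict


class ToyIndex:
    """Toy inverted index (illustrative assumption, not Lucene's real structure)."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> set of doc IDs containing it
        self.num_docs = 0

    def add(self, text):
        """Index one document and return its doc ID."""
        doc_id = self.num_docs
        self.num_docs += 1
        for term in text.lower().split():
            self.postings[term].add(doc_id)
        return doc_id

    def match_all(self):
        """Rough analogue of *:* : return every doc ID without touching any term."""
        return set(range(self.num_docs)), 0

    def wildcard_all(self):
        """Rough analogue of text:* : visit every term and union its postings."""
        matched = set()
        terms_visited = 0
        for docs in self.postings.values():
            terms_visited += 1
            matched |= docs
        return matched, terms_visited


index = ToyIndex()
for text in ["solr query parser", "lucene wildcard query", "match all documents"]:
    index.add(text)

all_docs, all_terms = index.match_all()
wild_docs, wild_terms = index.wildcard_all()

print(all_docs == wild_docs)   # both approaches find the same documents here
print(all_terms, wild_terms)   # terms visited by each approach
```

In this toy model both calls return the same documents, but wildcard_all has to walk every distinct term in the field first, so its cost grows with the size of the vocabulary, while match_all does no term work at all.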