Suppose a user searches for "koreanpop" when they really mean "korean pop".
I don't think I can build a dictionary that would recognize "korean" and "pop" as words.
I'm going to use nGram for the query analyzer. (Is this a horrible idea?)
I'd like to try out every split of the query into two parts:
"ko/reanpop"
"kor/eanpop"
"kore/anpop"
"korea/npop"
"korean/pop"
"koreanp/op"
and find the documents that contain both halves of some split, e.g. both "korean" and "pop". (The first halves would be edge n-grams with min=2.)
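To make the idea concrete, here is a rough Python sketch of the split enumeration and of the kind of query body I imagine sending. The names `candidate_splits` and `build_query` and the `title` field are made up for illustration, and the query shape (a `bool` query with one `must` pair per split) is only my assumption about how this could be expressed, not something I have verified to be the right approach.

```python
from typing import Dict, List, Tuple


def candidate_splits(term: str, min_len: int = 2) -> List[Tuple[str, str]]:
    """Return every two-way split of term whose halves are at least min_len long."""
    return [
        (term[:i], term[i:])
        for i in range(min_len, len(term) - min_len + 1)
    ]


def build_query(term: str, field: str = "title") -> Dict:
    """Build a bool query that matches documents containing both halves of any split.

    `field` is a hypothetical field name; replace it with the real one.
    """
    should = []
    for left, right in candidate_splits(term):
        # Both halves of this particular split must be present.
        should.append({
            "bool": {
                "must": [
                    {"match": {field: left}},
                    {"match": {field: right}},
                ]
            }
        })
    # Any single split matching is enough for the document to qualify.
    return {"query": {"bool": {"should": should, "minimum_should_match": 1}}}


if __name__ == "__main__":
    for left, right in candidate_splits("koreanpop"):
        print(f"{left}/{right}")
```

Run as a script, this prints the six splits listed above, from "ko/reanpop" to "koreanp/op". Building the splits on the client side like this is what I would like to avoid if an analyzer can do it for me.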
- Is this an OK strategy in practice? (I know that Koreans often omit the whitespace that should separate words, because Korean search engines handle such queries anyway.)
- How do I accomplish this with Elasticsearch?