Combine search engine and machine learning

226 views Asked by At

I'm pretty new on search engines and pretty newbie on machine learning. But I wanted to know if there is a way to combine functionalities of search engines like elasticsearch or Apache Solr and machine learning project like Apache Mahout, H2O or PredictionIO.

For exemple, if you work on a travel website where you can search for a destination. You start type "au", so the first suggestions are "AUstria", "AUstralia", "mAUrice island", "mAUritania"... etc... This is typically what elasticsearch can do.

But you know that this user has already travelled on Mauritania three times, so you want that Mauritania goes on the first place of suggestions. And I guess that's typically what machine learning can do.

Is there bridges between this two type of technologies ? Can machine learning ensure the work of search engine efficiently ?

I'm open to all answers, regardless of the technologies used. If you have ever experienced this type of problems, my ears are wide open :-)

Thank you

1

There are 1 answers

0
rawkintrevo On

Your question is very general in nature- so my answer will have to be the same.

Consider a recommender framework such as the one in Apache Mahout correlated co-occurance. Unlike the vanilla spark recommender, this implementation allows for multiple types of actions, such as viewed a web site, booked a trip their before, demographic information, etc.

Now you would calculate the recommendations for each user at whatever interval. Recommendations being based on multiple criteria and what other people similar to this user has done. Consider your 'items' in this case to be every destination in the world. So we now have every possible destination ranked for each user.

It is then a trivial extension to index elastic search by user/the ordered list of that users recommended destinations.

For example, we have a user who has visited Berlin, looked at several hotels in Vienna, and is from Romainia. When the user types in "au", we would expect to see "Austria" come up in the results much higher than 'Austrailia'

Per the comments and down votes- you probably should have either A) asked a more specific programming question or B) asked this question on another forum such as Data Science Stack Exchange, fyi