Polish search for Sphinx?


I want to implement a search solution for a website written in Django. Of the available options (I have researched Solr, Sphinx, Xapian, PostgreSQL/Tsearch2, and MySQL), Sphinx looks like the nicest. However, it does not support stemming for Polish, and that is the language of the data that I want to make searchable.

What are the best ways of dealing with unsupported languages in Sphinx? I have an intuition that I could create a stemming corpus from the Ispell dictionary. How can I make that work with Sphinx?


There is 1 answer

aditirex

Search the mailing list at http://snowball.tartarus.org/; you might find some information there if anyone has tried to create a Polish stemmer. There are two free stemmers available, but they are written in Java (I think at least one was made for Solr/Lucene). As for Ispell, I'm not sure whether a stemming corpus would help you, but you could generate files to be used for Sphinx's wordforms or exceptions options.
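As a rough illustration of the wordforms idea, here is a minimal Python sketch that turns a lemma-to-forms mapping into lines in the `form > lemma` format that a Sphinx wordforms file uses. It assumes the Ispell dictionary has already been expanded into full inflected forms by some other tool; the function names and the input mapping are hypothetical, and a form that belongs to more than one lemma is simply assigned to the first lemma seen.

```python
from typing import Dict, Iterable, List


def build_wordforms_lines(lemma_to_forms: Dict[str, Iterable[str]]) -> List[str]:
    """Return sorted 'form > lemma' lines, one per inflected form.

    Assumption: lemma_to_forms was produced elsewhere by expanding the
    dictionary's affix rules; this sketch does not parse Ispell files.
    """
    lines = []
    seen = set()  # forms already mapped; the first lemma wins on ambiguity
    for lemma, forms in lemma_to_forms.items():
        lemma_norm = lemma.strip().lower()
        for form in forms:
            form_norm = form.strip().lower()
            # Skip empty entries, identity mappings, and repeated forms.
            if not form_norm or form_norm == lemma_norm or form_norm in seen:
                continue
            seen.add(form_norm)
            lines.append(f"{form_norm} > {lemma_norm}")
    return sorted(lines)


def write_wordforms(path: str, lemma_to_forms: Dict[str, Iterable[str]]) -> None:
    """Write the mapping to a UTF-8 text file, one mapping per line."""
    with open(path, "w", encoding="utf-8") as fh:
        for line in build_wordforms_lines(lemma_to_forms):
            fh.write(line + "\n")


if __name__ == "__main__":
    # Hypothetical sample data: a few inflected forms of two Polish nouns.
    sample = {
        "dom": ["dom", "domu", "domem"],
        "kot": ["kota", "kotem"],
    }
    for line in build_wordforms_lines(sample):
        print(line)
```

The resulting file would then be referenced from the index definition through the `wordforms` option. Whether this scales to a full Polish dictionary depends on how many forms the expansion produces, so treat it as a starting point rather than a tested solution.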