Perform Fuzzy Search on Wikipedia


I'm trying to retrieve the Wikipedia page for the author Agatha Christie through the Wikipedia API. The site's own search handles this well:

From https://en.wikipedia.org/wiki/Main_Page I search for Agatha Christie and find her page.

Through the API this does not seem to be possible. I built my query in the API sandbox:

https://en.wikipedia.org/wiki/Special:ApiSandbox#action=query&list=search&srsearch=Agatha%20Christie&utf8=

I get one band, the Agatha Christie Memorial, some books, and nothing else about her, and I cannot understand why. With Albert Einstein the same kind of query works.

Of course the search API returns all the matching pages and it is my job to refine the results, but why are there no useful results in this case?


There is 1 answer

Nemo (best answer)

You misread the results: the first result is the correct one. See https://en.wikipedia.org/wiki/Special:ApiSandbox#action=query&list=search&format=json&srsearch=Agatha%20Christie&srprop=timestamp, which returns:

        {
            "ns": 0,
            "title": "Agatha Christie",
            "timestamp": "2015-07-06T19:37:15Z"
        },

Visit that title: https://en.wikipedia.org/wiki/Agatha_Christie. It's the correct page. The snippet mistakenly extracted the disambiguation information at the top, «For the band, see Agatha Christie (band). For the video game series, see Agatha Christie (video game series)», but it's just a snippet.
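If you want to do the same thing from a script, here is a minimal sketch in Python using only the standard library. It builds the same `list=search` query as the sandbox link above and picks the hit whose title matches the search term exactly, ignoring the snippet. The function names, the `srlimit` value and the User-Agent string are my own choices for illustration, not anything prescribed by the API.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"


def build_search_url(term, limit=10):
    """Build the same list=search query as the sandbox link."""
    params = {
        "action": "query",
        "list": "search",
        "format": "json",
        "srsearch": term,
        "srprop": "timestamp",  # skip snippets, they are only excerpts
        "srlimit": limit,
    }
    return API_URL + "?" + urllib.parse.urlencode(params)


def pick_exact_title(response, term):
    """Return the first hit whose title equals the term (case-insensitive)."""
    for hit in response.get("query", {}).get("search", []):
        if hit["title"].casefold() == term.casefold():
            return hit["title"]
    return None


def page_url(title):
    """Turn a page title into its article URL."""
    return "https://en.wikipedia.org/wiki/" + urllib.parse.quote(title.replace(" ", "_"))


def search(term):
    """Run the query against the live API (needs network access)."""
    request = urllib.request.Request(
        build_search_url(term),
        headers={"User-Agent": "search-example/0.1 (illustrative)"},  # hypothetical UA
    )
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)
```

Usage would be something like `page_url(pick_exact_title(search("Agatha Christie"), "Agatha Christie"))`. Matching on the title is a deliberate choice here, since the snippet may show hatnote text rather than the article's own opening.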

For an overview of the various search APIs, including the one which allows fuzzy searches (via CirrusSearch), see https://www.mediawiki.org/wiki/API:Search_and_discovery.
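As a rough illustration of the fuzzy part: CirrusSearch treats a tilde after a word as a request for fuzzy matching on that word, so a misspelled name can still be sent through the same `srsearch` parameter. The sketch below only builds the query string and URL. The helper names are hypothetical, and whether a given misspelling actually finds the page depends on the search backend, so treat it as a starting point rather than a guarantee.

```python
import urllib.parse

API_URL = "https://en.wikipedia.org/w/api.php"


def fuzzy_query(term):
    """Append '~' to every word so CirrusSearch matches each one fuzzily."""
    return " ".join(word + "~" for word in term.split())


def build_fuzzy_search_url(term):
    """Build a list=search URL whose srsearch uses the fuzzy form of the term."""
    params = {
        "action": "query",
        "list": "search",
        "format": "json",
        "srsearch": fuzzy_query(term),
    }
    return API_URL + "?" + urllib.parse.urlencode(params)
```

For example, `fuzzy_query("Agata Cristie")` produces `Agata~ Cristie~`, which you can also paste into the `srsearch` field of the sandbox to try it out.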