Recommendations for programmatic web searches


I am working on a system that needs to associate URLs with data based on keywords. I was hoping I could use a web service to automatically perform full-web searches based on keywords or tags, and the results would be in a machine-friendly format like JSON.

My first thought was Google. Their Google Custom Search service looks pretty good and has proven itself in my tests: it has a simple REST-like URL and returns results in JSON format. The only problem is the limit of 100 queries per day; I need more like 1000. Their higher-quota paid option (Google Site Search) does not allow full-web searches, so it is useless to me.
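For reference, here is a minimal sketch of the kind of request I have been testing. It assumes the Custom Search JSON API endpoint at `https://www.googleapis.com/customsearch/v1` with the query parameters `key`, `cx`, `q`, and `start`, and a response whose results sit in an `items` array with a `link` field each. The helper names `build_search_url` and `extract_urls` are my own, and the key and engine ID values are placeholders.

```python
import json
from urllib.parse import urlencode

# Assumed endpoint of the Custom Search JSON API.
SEARCH_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_search_url(api_key, engine_id, keywords, start=1):
    """Return the GET URL for one page of results for the given keywords.

    api_key and engine_id are placeholders for your own credentials.
    """
    params = {
        "key": api_key,            # API key from the APIs console
        "cx": engine_id,           # ID of the custom search engine
        "q": " ".join(keywords),   # keywords joined into one query string
        "start": start,            # 1-based index of the first result
    }
    return SEARCH_ENDPOINT + "?" + urlencode(params)


def extract_urls(response_text):
    """Pull the result URLs out of a JSON response body.

    Assumes results are listed under "items", each with a "link" field.
    A response without "items" yields an empty list.
    """
    payload = json.loads(response_text)
    return [item["link"] for item in payload.get("items", [])]


if __name__ == "__main__":
    # Hypothetical credentials; no network request is made here.
    print(build_search_url("YOUR_API_KEY", "YOUR_ENGINE_ID", ["web", "search"]))

    # Hand-written sample body in the assumed response shape.
    sample = '{"items": [{"link": "http://a.example/"}, {"link": "http://b.example/"}]}'
    print(extract_urls(sample))
```

Fetching the URL with any HTTP client and passing the body to `extract_urls` gives the list of URLs to associate with the keywords. Each such request counts against the daily quota.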

Surely others have wanted to do programmatic web searches before. Does Google offer another B2B search service that we could use? We are happy to pay per query, sign agreements, etc. I fear I am not looking in the right place on Google's site.

As I wrote this question I found Microsoft's Bing web services home page. At first blush it looks pretty good. I have a slight preference for Google, but am open to Microsoft. I would love to hear any advice about using Microsoft's APIs.


There are 2 answers

sync On BEST ANSWER

Google Custom Search offers a paid option for going beyond 100 queries per day, I believe:

https://developers.google.com/custom-search/v1/overview (see 'paid usage' section at the bottom)

Randall Cook On

@sync found the right way in, and I believe I now understand the problem: Google has two control panels for Custom Search, and you can't get to one from the other.

I was on the panel for my Google Custom Search engine (www.google.com/cse/panel), which gives me control over low-level aspects of my search engine, and the only pay option was to convert to Google Site Search, but in so doing I would lose my full-web search power.

There is another, higher-level control panel for all of Google's APIs (code.google.com/apis/console), of which Custom Search is one component. From there, the link to set up billing for a larger quota is easy to find.

Sorry I am not providing proper links, as the relevant pages require login to access. While I consider this answer to be the authoritative one for my question, I am giving the green checkmark to @sync, without whose help I would not have been able to figure it out. I'd still love to see some comments on Bing's APIs, however!