I'm writing a Greasemonkey (v2.3) script that basically screen-scrapes the contents served by lema.rae.es/drae/srv/search, for lack of an API of any sort.
The thing is, I want to query that URL from Google Translate, a different domain. For that, I can use GM_xmlhttpRequest without problems, but a GET request to a specific URL (for instance lema.rae.es/drae/srv/search?val=test) yields an HTML page with a hidden form that gets POSTed after calling the challenge() JavaScript function, which calculates some sort of token that gets passed along in the POST request.
Obviously, this happens asynchronously and Greasemonkey sees nothing of it. Via trial and error I have come to realise that if my browser (Iceweasel 31.2.0) has a cookie for lema.rae.es, then the GET request issued using GM_xmlhttpRequest actually returns what I want, which is the HTML of the definition of the word passed as the parameter "val" in the URL. However, if I delete all cookies for lema.rae.es, the GET request returns the aforementioned hidden form.
In short, I need a way to receive the response of that POST request from within Greasemonkey. I believe that if I could receive the cookie from the server and store it, I could include it as a request header in a further request and it should work as I expect. Alternatively, the cookie could simply be stored in the browser, in which case it would be sent as a header whenever I trigger GM_xmlhttpRequest.
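A minimal sketch of the GET request described above, for reference. The http:// scheme and the helper name buildSearchUrl are assumptions made for illustration (the post only gives the host and path), and the typeof guard is there only so the snippet also loads outside Greasemonkey.

```javascript
// Hypothetical helper: builds the search URL for a word.
// The "http://" scheme is an assumption; the post gives only host and path.
function buildSearchUrl(word) {
    return "http://lema.rae.es/drae/srv/search?val=" + encodeURIComponent(word);
}

// Guarded so the snippet also loads where GM_xmlhttpRequest is undefined.
// In a userscript this call needs "// @grant GM_xmlhttpRequest" in the
// metadata block.
if (typeof GM_xmlhttpRequest === "function") {
    GM_xmlhttpRequest({
        method: "GET",
        url: buildSearchUrl("test"),
        onload: function (response) {
            // With a valid cookie this should be the definition HTML;
            // without one it is the page containing the hidden form.
            console.log(response.responseText);
        }
    });
}
```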
I tried a different solution to my problem, namely using a hidden iframe, but the creation of such an iframe was blocked by the browser on the grounds of the Same Origin policy, even after configuring the userscript to run on both domains.
Hopefully I've made clear what I want to achieve, and I hope somebody can point me in the right direction.
On a side note: if someone could explain what the challenge() function calculates, I would really appreciate it. My hypothesis is that the token it generates gets sent to the server, which in turn uses it to produce the cookie, but that sounds overly complicated...
The hidden iframe route is the way to go, but it is being blocked by translate.google.com in this case.
Here is an alternate approach to ensure that Firefox has the fresh cookies it needs to keep your mashup site (lema.rae.es) happy:
1. Find some source HTML that is present when the mashup site wants fresh cookies, but is absent otherwise. In this case, the JS source "function challenge" will do.
2. Make the GM_xmlhttpRequest to the mashup site and test the response.
3. If the GM_xmlhttpRequest response has the desired data, parse it as desired. Done!
4. If the GM_xmlhttpRequest response has the "needs cookies" source, open a special query, of the mashup site, in a popup window:
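The response test in the steps above could be sketched roughly as follows. This is only an illustration, not the tested script: the names needsFreshCookies and handleResponse, the two callbacks, and the http:// scheme are assumptions, and what the popup step does is deliberately left to the caller.

```javascript
// True when the response is the "needs cookies" page, i.e. when it contains
// the source of the challenge() function mentioned in the question.
function needsFreshCookies(responseText) {
    return /function\s+challenge\b/.test(responseText);
}

// Hypothetical dispatcher: onDefinition receives the HTML to parse,
// onNeedsCookies is where the popup step would be started.
function handleResponse(responseText, onDefinition, onNeedsCookies) {
    if (needsFreshCookies(responseText)) {
        onNeedsCookies();
    } else {
        onDefinition(responseText);
    }
}

// Guarded so the snippet also loads where GM_xmlhttpRequest is undefined.
if (typeof GM_xmlhttpRequest === "function") {
    GM_xmlhttpRequest({
        method: "GET",
        // Assumed scheme; the post gives only host and path.
        url: "http://lema.rae.es/drae/srv/search?val=test",
        onload: function (response) {
            handleResponse(
                response.responseText,
                function (html) { console.log("Got definition HTML:", html.length, "chars"); },
                function () { console.log("Fresh cookies needed; open the popup query."); }
            );
        }
    });
}
```

Keeping the detection in a small pure function makes it easy to swap in a different marker string if the site changes its challenge page.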
Here is a complete and tested (on Firefox) Greasemonkey script that mashes lema.rae.es/drae/srv/search into translate.google.com: