How to use dbpedia spotlight docker image?


I'm facing a problem with dbpedia spotlight. I can't seem to connect to the local docker image found here.

I used the command docker pull dbpedia/spotlight-english with docker run -i -p 2222:80 dbpedia/spotlight-english and then checked that the container is running with docker ps. Everything works fine.

After that, I try to query the server by running the curl given in the spotlight documentation:

curl http://0.0.0.0:2222/en/annotate  \
  --data-urlencode "text=President Obama called Wednesday on Congress to extend a tax break
  for students included in last year's economic stimulus package, arguing
  that the policy provides more generous assistance." \
  --data "confidence=0.35"

And the same with the following URLs:

All I get is curl: (52) Empty reply from server.

What am I not getting here? All help appreciated.


There are 4 answers

3
Sandro Athaide On

The correct request is:

curl -X POST \
  http://localhost:2222/rest/annotate \
  -H 'accept: application/json' \
  -H 'content-type: application/x-www-form-urlencoded' \
  --data-urlencode "text=President Obama called Wednesday on Congress to extend a tax break for students included in last year's economic stimulus package, arguing that the policy provides more generous assistance" \
  --data-urlencode "confidence=0.35"
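
For anyone calling the service from a script rather than from curl, here is a rough Python equivalent of the request above, using only the standard library. It is a sketch, not an official client: the function names `build_annotate_request` and `annotate` are made up for illustration, and the default base URL assumes the container is reachable on localhost:2222 as in the question.

```python
import urllib.parse
import urllib.request


def build_annotate_request(text, confidence=0.35,
                           base_url="http://localhost:2222"):
    """Build (but do not send) a POST request for /rest/annotate.

    base_url is an assumption taken from the question's port mapping.
    """
    # urlencode percent-encodes the form fields, like curl's --data-urlencode.
    body = urllib.parse.urlencode({"text": text, "confidence": confidence})
    return urllib.request.Request(
        url=base_url + "/rest/annotate",
        data=body.encode("utf-8"),
        headers={
            "Accept": "application/json",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


def annotate(text, confidence=0.35, base_url="http://localhost:2222"):
    """Send the request and return the raw response body as a string."""
    request = build_annotate_request(text, confidence, base_url)
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8")
```

Calling `annotate("President Obama called Wednesday on Congress ...")` should return the JSON body if the server is up; if it is not, `urlopen` raises an exception instead of returning.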
2
AliOli On

The empty reply error indicates that nothing was listening on your local port 2222. This is caused by the docker command docker run -i -p 2222:80 dbpedia/spotlight-english, in which the Spotlight container's port 2222 is mapped to port 80 on the host machine.

With the correct request syntax, as @Sandro has shared, the example should work on a locally running docker container with the url http://localhost:80/rest/annotate (or by omitting the port number altogether, given that 80 is the default).

0
Laharika Vidiyala On

To run the Docker image of the English version:

  1. docker run -i -p 2222:80 dbpedia/spotlight-english spotlight.sh

  2. Open localhost in a browser and pass the text in the following format: localhost:2222/rest/annotate?text=TextYouWantToAnnotate&confidence=0.2&support=20

Example:

localhost:2222/rest/annotate?text=When I was growing up, my zealously frugal parents refused to buy anything from a bookstore, insisting that the local library had whatever it was we could possibly want to read. Faced with a small child’s intensive lobbying for repeated storytelling sessions with a lavishly illustrated picture book, my father would borrow one from the library and photocopy it. I still remember how anything colorful on the page (i.e. everything) would get transformed into dark blobs, the toner blurring the text and smudging my fingers.&confidence=0.2&support=20
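
A browser will usually escape the spaces and punctuation in such a URL for you, but a script has to do it itself. A minimal sketch of building the same GET URL in Python follows; `build_annotate_url` is a hypothetical helper name, and the defaults simply mirror the values used in the example above.

```python
from urllib.parse import urlencode


def build_annotate_url(text, confidence=0.2, support=20,
                       base_url="http://localhost:2222"):
    """Return a GET URL for /rest/annotate with an escaped query string."""
    # urlencode escapes spaces, '&', quotes and other reserved characters,
    # so text containing them cannot break the query string.
    query = urlencode({"text": text,
                       "confidence": confidence,
                       "support": support})
    return f"{base_url}/rest/annotate?{query}"


print(build_annotate_url("When I was growing up"))
```

Without the escaping, an ampersand inside the text would be read as the start of a new parameter.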

0
AHH On

The error "Empty reply from server" actually comes from Docker, not from Spotlight. It simply means that docker, which received your request on port 2222, didn't receive a response from the container's port 80.

To test this, you can run the command from a terminal inside the container (Docker Desktop, for example, can open a terminal in the container). If you run the same curl against localhost:80 from that terminal, I think you would get a "connection refused" error, meaning that DBpedia Spotlight is not listening on that port at all.
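
If curl is not at hand, the same "is anything listening?" check can be done with a plain TCP connect. Below is a small Python sketch; `port_is_open` is an illustrative name, and it assumes Python is available wherever you run it. Run inside the container against port 80, it tests Spotlight itself. Run from the host against port 2222, it only tells you whether the published port accepts connections.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused" as well as timeouts.
        return False


if __name__ == "__main__":
    # Port 80 matches the container-side port in the question.
    print(port_is_open("localhost", 80))
```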

In my case, I suspected that it had to do with the hardware. I was trying to run it on a MacBook, so I decided to test it on a Linux machine.

What I did to make it work was to take the jar and the shell script from the Docker image, put them on a Linux machine with enough memory (8 GB wasn't enough for the English version) and do the following:

  • Put dbpedia-spotlight.jar and spotlight.sh in /opt/spotlight
  • Create a folder named models under /opt/spotlight
  • Run "./spotlight.sh en" (or pass your preferred language as the parameter) from that directory.

The script will now start downloading the model for the chosen language (since the models directory is empty) and extract it into the models folder. After that, Spotlight starts and begins loading data into memory. The whole process might take up to 15 minutes, but the logs should show activity throughout. At the end, Spotlight was running and the request was answered as expected.
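
Because startup can take that long, a script that talks to Spotlight may want to wait for it rather than fail on the first attempt. Here is a minimal polling sketch in Python. The name `wait_until_ready` and the `probe` argument are assumptions for illustration: `probe` is any function you supply that returns True once the server answers, and the 900-second default just mirrors the 15 minutes mentioned above.

```python
import time


def wait_until_ready(probe, timeout=900.0, interval=10.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Call probe() until it returns True or timeout seconds have passed.

    Returns True if probe() succeeded, False if the deadline was reached.
    """
    deadline = clock() + timeout
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        # Wait before the next attempt so the server is not hammered.
        sleep(interval)
```

A suitable probe could, for example, send a very short annotate request and return False on any connection error.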

And yes, the right URL in this case would be http://localhost/rest/annotate (no port is needed, since Spotlight now listens on the standard port 80).