I am trying to send a GET request to the DSpace 5.5 API to check whether an item with a given handle is present in DSpace.
When I tested it in the browser, it worked fine (return code 200, and I got the data about the searched item).
Then I tested sending the request with the Python 3 requests module in the Python console. Again, the DSpace API returned the correct response code (200) and JSON data in the response.
So I put the tested function into my script, and suddenly the DSpace API started returning error code 500. In the DSpace log I came across this error message:
org.dspace.rest.RestIndex @ REST Login Success for user: [email protected]
2017-01-03 15:38:34,326 ERROR org.dspace.rest.Resource @ Something get wrong. Aborting context in finally statement.
2017-01-03 15:38:34,474 ERROR org.dspace.rest.Resource @ Something get wrong. Aborting context in finally statement.
2017-01-03 15:38:34,598 ERROR org.dspace.rest.Resource @ Something get wrong. Aborting context in finally statement.
According to the DSpace documentation, the request should look like this:
GET /handle/{handle-prefix}/{handle-suffix}
This points to the handle endpoint of the REST API on our DSpace server, so the whole request should be sent to https://dspace.cuni.cz/rest/handle/123456789/937
(I think you can test it yourself).
In the browser I get the following response:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<item>
<expand>metadata</expand>
<expand>parentCollection</expand>
<expand>parentCollectionList</expand>
<expand>parentCommunityList</expand>
<expand>bitstreams</expand>
<expand>all</expand>
<handle>123456789/937</handle>
<id>1423</id>
<name>Komparace vývoje české a slovenské pravicové politiky od roku 1989 do současnosti</name>
<type>item</type>
<archived>true</archived>
<lastModified>2016-12-20 17:52:30.641</lastModified>
<withdrawn>false</withdrawn>
</item>
When testing in Python console, my code looked like this:
from urllib.parse import urljoin

import requests


def document_in_dspace(handle):
    url = 'https://dspace.cuni.cz/rest/handle/'
    r_url = urljoin(url, handle)
    print(r_url)
    r = requests.get(r_url)
    if r.status_code == requests.codes.ok:
        print(r.text)
        print(r.reason)
        return True
    else:
        print(r.reason)
        print(r.text)
        return False
After calling this function in the Python console with document_in_dspace('123456789/937'), the response was this:
https://dspace.cuni.cz/rest/handle/123456789/937
{"id":1423,"name":"Komparace vývoje české a slovenské pravicové politiky od roku 1989 do současnosti","handle":"123456789/937","type":"item","link":"/rest/items/1423","expand":["metadata","parentCollection","parentCollectionList","parentCommunityList","bitstreams","all"],"lastModified":"2016-12-20 17:52:30.641","parentCollection":null,"parentCollectionList":null,"parentCommunityList":null,"bitstreams":null,"archived":"true","withdrawn":"false"}
OK
True
So I decided to put this function into my script (without any changes), but now the DSpace API returns response code 500 when the function is called.
Details of the implementation are below:
def get_workflow_process(document):
    if document.document_in_dspace(handle=document.handle) is True:
        return 'delete'
    else:
        return None


wf_process = get_workflow_process(document)
log.msg("Document:", document.doc_id, "Workflow process:", wf_process)
And the output is:
2017-01-04 11:08:45+0100 [-] DSPACE API response code: 500
2017-01-04 11:08:45+0100 [-] Internal Server Error
2017-01-04 11:08:45+0100 [-]
2017-01-04 11:08:45+0100 [-] False
2017-01-04 11:08:45+0100 [-] Document: 28243 Workflow process: None
Can you please give me any suggestions about what might be causing this and how to solve it? I am quite surprised that this works in the Python console but not in the actual script, and I can't seem to figure it out by myself. Thank you!
I think I figured it out. The problem was probably some trailing newline characters in the handle param of the document_in_dspace function. The updated function looks like this:
Basically, what I did was call .rstrip() on the handle string to get rid of all unwanted trailing characters, then I separated the prefix and suffix parts of the handle (just for the sake of being sure) and constructed the request URL (r_url) by joining all the parts together.
I will make the function prettier in the future, but at least this now works as intended.
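A minimal sketch of the steps described above, for illustration only. It is not the exact function from the script: the helper name build_handle_url, the base_url default argument, and the timeout value are assumptions. The requests import sits inside the function so that the URL helper can be tried without the dependency installed.

```python
def build_handle_url(handle, base_url="https://dspace.cuni.cz/rest/handle/"):
    """Build the REST URL for a handle, ignoring trailing whitespace.

    build_handle_url is a hypothetical helper name, not part of DSpace
    or of the original script.
    """
    # rstrip() with no arguments removes trailing whitespace,
    # including "\n" and "\r\n" left over from reading a file or a log.
    cleaned = handle.rstrip()
    # Split once into prefix and suffix; a handle without "/" raises ValueError.
    prefix, suffix = cleaned.split("/", 1)
    if not prefix or not suffix:
        raise ValueError("handle must look like 'prefix/suffix': %r" % handle)
    # Join the parts explicitly instead of relying on urljoin.
    return base_url.rstrip("/") + "/" + prefix + "/" + suffix


def document_in_dspace(handle):
    """Return True if the handle endpoint answers with 200 OK."""
    import requests  # third-party; imported here so the helper above works without it

    r_url = build_handle_url(handle)
    r = requests.get(r_url, timeout=10)  # timeout value is an assumption
    return r.status_code == requests.codes.ok
```

When a handle works in the console but not in the script, logging repr(handle) before the request makes a stray newline visible, because repr() shows the escape sequence instead of an invisible line break.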
Output is following:
Nevertheless, the DSpace API seems to return response code 500, instead of 404, when an item with the given handle is not present in the repository.
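Given that observation, a caller cannot tell a missing item from a real server failure by the status code alone. The sketch below is one possible way to keep those cases apart in the calling code. It is based only on the behavior seen on this DSpace 5.5 instance, and the function name and the returned labels are made up for illustration.

```python
def interpret_handle_status(status_code):
    """Map an HTTP status code from the handle endpoint to a coarse label.

    The labels are hypothetical; they only reflect the behavior observed
    above, where a missing handle produced 500 rather than 404.
    """
    if status_code == 200:
        return "present"
    if status_code == 404:
        return "absent"
    if status_code == 500:
        # Ambiguous: either the handle does not exist or the server failed.
        return "absent-or-server-error"
    return "unexpected"
```

With this, a script can log the ambiguous case separately instead of silently treating every 500 as "item not found".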