Rest Template unable to parse json rest api response properly

Question

548 views · Asked by Ravi on 14 September 2017 at 15:01

I am trying to extract named entities from German text using spaCy's NER. I have exposed the service as a REST POST endpoint that takes the source text as input and returns a dictionary (map) of lists of named entities (persons, locations, organizations). The service is exposed using Flask-RESTPlus and hosted on a Linux server.

For a sample text, I get the following response when I send the POST request to the REST API through the Swagger UI:

{
  "ner_locations": [
    "Deutschland",
    "Niederlanden"
  ],
  "ner_organizations": [
    "Miele & Cie. KG",
    "Bayer CropScience AG"
  ],
  "ner_persons": [
    "Sebastian Krause",
    "Alex Schröder"
  ]
}

When I send the POST request with Spring's RestTemplate from a Spring Boot application (running in Eclipse on Windows) to the API hosted on the Linux server, the JSON is parsed correctly. I added the following line to use UTF-8 encoding:

restTemplate.getMessageConverters().add(0, new StringHttpMessageConverter(Charset.forName("UTF-8")));

But when I deploy this Spring Boot application on a Linux machine and send the POST request to the API for NER tagging, the ner_persons entries are not parsed correctly. While remotely debugging, I get the following response:

{
  "ner_locations": [
    "Deutschland",
    "Niederlanden"
  ],
  "ner_organizations": [
    "Miele & Cie. KG",
    "Bayer CropScience AG"
  ],
  "ner_persons": [
    "Sebastian ",
    "Krause",
    "Alex ",
    "Schröder"
  ]
}

I cannot understand why this strange behavior occurs for persons but not for organizations.

Original Q&A

There is 1 answer

**Ravi** · Answer 1 · 2017-09-18T15:45:15+00:00

Being new to Python, it took me 2 days of debugging to understand the real problem and find a workaround.

The cause was that the parts of each name (e.g., "Sebastian Krause") were separated by \xa0, the non-breaking space character (e.g., "Sebastian\xa0Krause"), instead of a regular space. As a result, spaCy failed to detect them as a single named entity.

Browsing through SO, I found the following solution:
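A non-breaking space looks identical to a regular space when printed, so it is easy to miss while debugging. The following is a minimal sketch of one way to locate such characters in the input; the helper name `find_unusual_whitespace` and the sample string are hypothetical and not part of the original service.

```python
import unicodedata


def find_unusual_whitespace(text):
    """Return (index, code point, Unicode name) for every whitespace-like
    character that is not a plain ASCII space, tab, or line break."""
    hits = []
    for i, ch in enumerate(text):
        if ch in " \t\r\n":
            continue
        # str.isspace() is True for U+00A0; category "Zs" covers space separators.
        if ch.isspace() or unicodedata.category(ch) == "Zs":
            hits.append((i, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN")))
    return hits


# Hypothetical sample mimicking the problematic input.
sample = "Sebastian\xa0Krause und Alex\xa0Schr\u00f6der"
print(find_unusual_whitespace(sample))
print(repr(sample))  # repr() makes the \xa0 escapes visible
```

Printing `repr(source_text)` on the server side is often enough to spot the difference between the two environments.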

import unicodedata 
norm_text = unicodedata.normalize("NFKD", source_text)

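This also normalizes other Unicode compatibility characters, such as \u2026 (the horizontal ellipsis, which becomes three periods). Characters without a compatibility decomposition, such as \u2013 (the en dash), are left unchanged.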
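As a quick check of what the normalization changes, here is a small sketch using a made-up sample string. It also compares NFKD with NFKC, which is an alternative not mentioned in the answer above: NFKD splits precomposed letters such as "ö" into a base letter plus a combining mark, while NFKC recomposes them. Whether that difference matters for spaCy's German model is an assumption to verify on your own data.

```python
import unicodedata

# Hypothetical input: non-breaking spaces, an en dash, and an ellipsis.
source_text = "Sebastian\xa0Krause \u2013 Alex\xa0Schr\u00f6der\u2026"

nfkd = unicodedata.normalize("NFKD", source_text)
nfkc = unicodedata.normalize("NFKC", source_text)

# Both forms replace U+00A0 with a regular space and U+2026 with "...".
print(repr(nfkd))
print(repr(nfkc))

# NFKD decomposes "ö" into "o" + U+0308; NFKC keeps the single code point.
print("o\u0308" in nfkd, "\u00f6" in nfkc)

# The en dash U+2013 has no compatibility decomposition and stays as is.
print("\u2013" in nfkd)
```

If decomposed umlauts turn out to cause problems downstream, using "NFKC" in place of "NFKD" still removes the non-breaking spaces while keeping "ö" as one character.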
