There are HTML entities for ">" and "<" ("&gt;" and "&lt;") in the OBX-5 field, which cause the Terser.get(..) method to fetch only the characters up to the ampersand. The encoding characters in MSH-2 are "^~\&". Is Terser.get(..) failing because there is an encoding character in the OBX-5 field? Is there an easy way to change these entities to ">" and "<"?
Thanks a lot for your help.
Yes, it fails because the ampersand has been declared as the subcomponent separator, so the message you are trying to process is not valid: it should not contain (unescaped) HTML character entities (&lt; and &gt;).
If you have no control over how the incoming messages are encoded, you should preprocess the message before passing it to the Terser, replacing the illegal characters. I'm pretty sure HAPI cannot help you there.
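As a rough illustration of such a preprocessing step, here is a minimal sketch in plain Java. The class and method names (Hl7EntityPreprocessor, replaceHtmlEntities) are made up for this example and are not part of HAPI. It assumes the default encoding characters "^~\&" in MSH-2 and handles only the three entities shown; "&amp;" is mapped to the HL7 escape sequence \T\ rather than to a bare ampersand, since a bare ampersand would again be read as a subcomponent separator.

```java
public class Hl7EntityPreprocessor {

    // Hypothetical helper, not a HAPI API: replaces a few HTML entities in the
    // raw message text before it is handed to the parser.
    // Assumes MSH-2 is "^~\&", so '&' is the subcomponent separator.
    static String replaceHtmlEntities(String rawMessage) {
        return rawMessage
                .replace("&lt;", "<")        // '<' is not a separator, safe as a literal
                .replace("&gt;", ">")        // '>' is not a separator, safe as a literal
                .replace("&amp;", "\\T\\");  // a literal ampersand must be escaped as \T\
    }

    public static void main(String[] args) {
        String obx = "OBX|1|ST|1234^Result||&lt;5 and &gt;2|";
        // prints: OBX|1|ST|1234^Result||<5 and >2|
        System.out.println(replaceHtmlEntities(obx));
    }
}
```

The cleaned string can then be parsed and read with the Terser as usual. Any ampersand that is not part of one of these entities is left untouched, so genuine subcomponent separators in the message are preserved.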
In a valid HL7v2 message, the data type used in OBX-5 is determined by OBX-2. OBX-5 should only contain the characters and escape sequences allowed by the declared data type. The literal characters < and > are among them (as long as they are not declared as separators in MSH-2).
The HL7 standard defines escape sequences for the separator and delimiter characters (e.g. \T\ is the escape sequence for the subcomponent separator).
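To make the escape sequences concrete, here is a small sketch of how a sender could escape a text value before placing it in a field. It is plain Java, not HAPI code; the class and method names (Hl7Escaper, escape) are invented for this example, and it assumes the default delimiters (field separator "|" and encoding characters "^~\&").

```java
public class Hl7Escaper {

    // Hypothetical helper, not a HAPI API: escapes the default HL7v2
    // delimiters in a single text value.
    static String escape(String value) {
        StringBuilder out = new StringBuilder();
        for (char c : value.toCharArray()) {
            switch (c) {
                case '\\': out.append("\\E\\"); break; // escape character
                case '|':  out.append("\\F\\"); break; // field separator
                case '^':  out.append("\\S\\"); break; // component separator
                case '&':  out.append("\\T\\"); break; // subcomponent separator
                case '~':  out.append("\\R\\"); break; // repetition separator
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: Smith \T\ Jones <5
        System.out.println(escape("Smith & Jones <5"));
    }
}
```

Note that < and > pass through unchanged, because they are not delimiters under the default MSH-2.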