The Java node receives an Erlang string encoded in UTF-8. Its class type is OtpErlangString. If I simply call .toString() or .stringValue(), the resulting java.lang.String has invalid code points (basically, every byte of the Erlang string is treated as a distinct character).
Now, I want to use new String(bytes, "UTF-8") when creating the Java String, but how do I get the bytes from the OtpErlangString?
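One possible way to get at those bytes, sketched with the JDK alone: if each char of the received string really holds one byte of the UTF-8 sequence (as described above), encoding the string as ISO-8859-1 maps every char in the range 0..255 back to the same byte value, and the result can then be decoded as UTF-8. The class and method names here are hypothetical, and the input literal only stands in for what .stringValue() is assumed to return in this situation.

```java
import java.nio.charset.StandardCharsets;

public class ByteCharRedecoder {

    // Assumption: every char in the input is in 0..255 and represents one raw byte.
    // Chars above 255 cannot be encoded as ISO-8859-1 and would be replaced by '?'.
    static String redecodeAsUtf8(String oneCharPerByte) {
        byte[] bytes = oneCharPerByte.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Stand-in for the mis-decoded string: the UTF-8 bytes D0 9F D1 80,
        // each stored as a separate char.
        String received = "\u00D0\u009F\u00D1\u0080";
        String fixed = redecodeAsUtf8(received);
        // The four byte-chars collapse into the two Cyrillic letters U+041F U+0440.
        System.out.println(fixed.equals("\u041F\u0440")); // prints true
    }
}
```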
It's strange that you get an OtpErlangString on the Java side when you use UTF-8 characters. I get an object of this type only when I use ASCII characters exclusively. If I add at least one UTF-8 character, the resulting type is OtpErlangList (which is logical, as strings are just lists of integers in Erlang), and then I can use its stringValue() method. So, after sending a string from Erlang like:
On the Java node I receive and print it with:
The output is correct:
However, if that is not the case in your situation, you could try to work around it by forcing the OtpErlangList representation, e.g. by adding an empty tuple as the very first element of the string list:
And on the Java side, something like:
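A minimal sketch of the decoding step only, kept free of Jinterface calls so that it runs on the JDK alone. It assumes the leading empty tuple has already been skipped and that the remaining list elements have been read out as plain int values, each holding one byte of the UTF-8 string. The class name, the method name and the sample values are hypothetical.

```java
import java.nio.charset.StandardCharsets;

public class Utf8ListDecoder {

    // 'elements' stands in for the integers taken from the list
    // after the empty-tuple marker has been dropped.
    static String decodeUtf8(int[] elements) {
        byte[] bytes = new byte[elements.length];
        for (int i = 0; i < elements.length; i++) {
            int value = elements[i];
            if (value < 0 || value > 255) {
                // Not a byte: the list holds code points rather than UTF-8 bytes.
                throw new IllegalArgumentException("not a byte value: " + value);
            }
            bytes[i] = (byte) value;
        }
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // UTF-8 bytes of a six-letter Cyrillic word, one list element per byte.
        int[] fromErlang = {208, 159, 209, 128, 208, 184, 208, 178, 208, 181, 209, 130};
        String text = decodeUtf8(fromErlang);
        System.out.println(text.length()); // prints 6
    }
}
```

If the list carries Unicode code points instead of UTF-8 bytes, this helper rejects it on purpose; in that case stringValue() on the list, as described above, is the simpler route.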