Java doesn't support IRI?


Example code:

java.net.URI.create("http://测试.com").getHost(); // returns null
new java.net.URL("http://测试.com").getHost(); // returns "测试.com"
  • Actual: URI.getHost() returns null for the IRI host, while URL.getHost() returns it
  • Expected: both return "测试.com"

Related documents:

The Javadoc of URI lists the following among its supported character categories:

other: The Unicode characters that are not in the US-ASCII character set, are not control characters (according to the Character.isISOControl method), and are not space characters (according to the Character.isSpaceChar method) (Deviation from RFC 2396, which is limited to US-ASCII)
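
To illustrate what the "other" category permits, here is a minimal sketch in which the non-ASCII characters sit in the path rather than in the host. The class name UriOtherCharsDemo, the helper asciiForm and the example.com URL are made up for illustration; only java.net.URI is real API.

```java
import java.net.URI;

final class UriOtherCharsDemo {
    // Hypothetical helper: parses the string and returns the US-ASCII form,
    // in which "other" characters are percent-encoded as UTF-8 octets.
    static String asciiForm(String uri) {
        return URI.create(uri).toASCIIString();
    }

    // Hypothetical helper: returns the decoded path component.
    static String pathOf(String uri) {
        return URI.create(uri).getPath();
    }

    public static void main(String[] args) {
        String s = "http://example.com/测试";
        System.out.println(pathOf(s));    // non-ASCII path is accepted as-is
        System.out.println(asciiForm(s)); // path is percent-encoded here
    }
}
```

So the "other" characters are accepted by the parser; the question is specifically about how they are treated in the host component.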

It is also well-known that

every URL is a URI, abstractly speaking, but not every URI is a URL.

So the behavior above doesn't seem to follow the expectation.

There used to be an RFE for this, but it seems to have been reverted, if I understand correctly.

Answer by Stephen C (best answer)

You are correct. Java does not support IRIs properly at this time.
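
As a possible workaround (a sketch, not full IRI support), the host can be converted to its ASCII-compatible form with java.net.IDN before it is handed to URI. The class name IriHostDemo and the helper withAsciiHost are hypothetical, and the sketch assumes the scheme, host and path are already available as separate strings.

```java
import java.net.IDN;
import java.net.URI;
import java.net.URISyntaxException;

final class IriHostDemo {
    // Hypothetical helper: builds a URI whose host is the ASCII (Punycode)
    // form of the given Unicode host, so that getHost() is non-null.
    // The path must be empty or start with "/".
    static URI withAsciiHost(String scheme, String unicodeHost, String path)
            throws URISyntaxException {
        String asciiHost = IDN.toASCII(unicodeHost);
        return new URI(scheme, asciiHost, path, null);
    }

    public static void main(String[] args) throws URISyntaxException {
        // The direct approach from the question: the host is not recognized.
        System.out.println(URI.create("http://测试.com").getHost());

        // With the host converted first, getHost() returns the "xn--" form.
        URI uri = withAsciiHost("http", "测试.com", "");
        System.out.println(uri.getHost());

        // IDN.toUnicode converts the host back for display purposes.
        System.out.println(IDN.toUnicode(uri.getHost()));
    }
}
```

This only addresses the host component; it does not make URI parse arbitrary IRI strings.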

The RFE that you found indicates that an attempt to implement IRIs was made in Java 6. It was rolled back for compatibility reasons:

The change integrated into Mustang b67 as part of CCC 6348622 causes incompatibility issue, i.e. 6380332. We need to reexamine the usage of the term 'registry-based' in java.net.URI specification and record the incompatibility issue.

Will rollback URI class to Tiger version for compatibility reason.

More recently, David Fuchs did some work to analyze the problem and came up with some prototype code, but this doesn't seem to have progressed since 2019: