I need to reference to a Unicode character with a URI. Following IANA references list multiple schemes and namespaces but do not mention anything about identifiers for the Unicode characters. Does anyone know if something like this exists already?
- http://www.iana.org/assignments/uri-schemes.html
- http://www.iana.org/assignments/urn-namespaces/urn-namespaces.xml
I hoped to find something like
unicode://U+0394
urn:unicode://0394
http://unicode.org/unicode/0394
for the greek capital letter delta Δ.
If someone wonders, this is for a semantic web like application that uses URIs as identifiers for concepts, including concepts of the Unicode characters.
I’m afraid there is no URL or URN for referring authoritative information on a Unicode character in general. In the Unicode Standard, information about individual characters is partly in the so-called character database (mostly plain text files in specific formats), partly in the Code Charts (PDF files). Neither of them offers a way to point at an individual character. Moreover, the information there is not exhaustive: there are important remarks on individual characters information scattered around the standard.
The Decodeunicode site has individually addressable items, such as
http://www.decodeunicode.org/en/u+0394
but its information content varies a lot and is generally very limited. It is not official, and it currently contains Unicode 5.0 only.
The Fileformat.info site is much more systematic, but it, too, is unofficial. It is basically limited to formal properties and data derivable from them, plus comments extracted from the Code Charts, plus instructions on typing the character in Windows, plus information about support in fonts—but that’s quite a lot! Example:
http://www.fileformat.info/info/unicode/char/0394/