How to express the full range of values of a char in F#?


Unicode code points range from U+0000 to U+10FFFF. While writing a lexer generator in F#, I ran into the following problem:

For the character set definitions, I intend to use a simple tuple of type char * char to express a range of characters. Omitting some peripheral details, I also need a range I call All, which is supposed to cover the full Unicode range.

It is possible to define a char literal like this: let c = '\u3000'. For strings, it is also possible to refer to a full 32-bit code point like this: let s = "\U0010FFFF". But the latter does not work for chars. The reason is that a char in .NET is a 16-bit UTF-16 code unit, and a code point above U+FFFF needs two of them (a surrogate pair), not one.
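To make the char versus string difference concrete, here is a small sketch that can be pasted into F# Interactive. The binding names (codeUnits, isPair, codePoint) are only illustrative; the calls are the standard System.Char helpers.

```fsharp
open System

// A code point in the Basic Multilingual Plane fits in a single char.
let c = '\u3000'

// A code point above U+FFFF in a string literal is stored as two chars.
let s = "\U0010FFFF"

// Number of 16-bit code units in the string, not the number of code points.
let codeUnits = s.Length

// True when the two chars form a high surrogate followed by a low surrogate.
let isPair = Char.IsSurrogatePair(s.[0], s.[1])

// Decode the pair starting at index 0 back into a single code point.
let codePoint = Char.ConvertToUtf32(s, 0)

printfn "%d %b 0x%X" codeUnits isPair codePoint
```

So the string literal compiles, but it holds two chars, which is why the same escape cannot be used for a single char literal.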

So the question is: is there a way to stick to my char * char tuple and still define All somehow, or do I need to change it to uint32 * uint32 and define all my character ranges as 32-bit values? And if I have to change, is there a type I have not discovered yet that I should prefer over uint32?
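For illustration, the second option could look roughly like the sketch below. It is only a sketch under assumptions: the names CodePointRange, all, ofChars and containsAt are hypothetical, and it uses int rather than uint32 purely because System.Char.ConvertToUtf32 returns an int. On newer .NET versions System.Text.Rune might be another candidate type; that is not shown here.

```fsharp
open System

// Hypothetical alias: a range of Unicode code points, both ends inclusive.
type CodePointRange = int * int

// The full Unicode range, which a char * char tuple cannot express.
let all : CodePointRange = (0x0000, 0x10FFFF)

// Lift an existing char * char range into the wider representation.
let ofChars (lo: char, hi: char) : CodePointRange = (int lo, int hi)

// Test whether the code point starting at index i of s lies inside the range.
// Note: ConvertToUtf32 throws if index i points at an unpaired surrogate.
let containsAt ((lo, hi): CodePointRange) (s: string) (i: int) =
    let cp = Char.ConvertToUtf32(s, i)
    lo <= cp && cp <= hi

printfn "%b" (containsAt all "\U0010FFFF" 0)
printfn "%b" (containsAt (ofChars ('a', 'z')) "q" 0)
```

With this shape, the existing char-based ranges can still be written with char literals and converted through ofChars, while All is just the pair (0x0000, 0x10FFFF).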

Thanks in advance.
