Why can't an escape sequence be represented as a Unicode escape in Java?


In Java, a carriage return is written as '\r' and a line feed as '\n'.

But Java does not allow a carriage return to be written as '\u000d' or a line feed as '\u000a'.

Why?


There are 2 answers

3
Jon Skeet

The Unicode escape sequences are applied earlier in the source transformation than the character literal escape sequences. Unicode escape sequences are transformed very early in the process - before any other lexing happens, including before line breaks are detected. See JLS 3.2 for details.

So when you put \u000a into a Java source file, it will behave exactly as if you'd put an actual line feed in there - causing a line break as far as the rest of the compiler is concerned.
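As a rough sketch of what does compile (the class and constant names are made up for this example), the character escapes and their octal equivalents are legal because they are interpreted only once the literal itself is processed, after the source has already been split into lines. A Unicode escape for a character that is not a line terminator, such as \u0041, is harmless too:

```java
class LineTerminatorEscapes {
    // Character escapes: interpreted only when the literal itself is processed.
    static final char LF = '\n';
    static final char CR = '\r';

    // Octal escapes are character escapes too, so these are also legal.
    static final char LF_OCTAL = '\012';
    static final char CR_OCTAL = '\015';

    // A Unicode escape for a character that is not a line terminator.
    static final char CAPITAL_A = '\u0041';

    public static void main(String[] args) {
        System.out.println((int) LF);        // prints 10
        System.out.println((int) CR);        // prints 13
        System.out.println(LF == LF_OCTAL);  // prints true
        System.out.println(CR == CR_OCTAL);  // prints true
        System.out.println(CAPITAL_A);       // prints A
    }
}
```

If the numeric value matters more than the notation, a cast such as (char) 10 is another option that avoids escapes altogether.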

(Personally I think this was a design mistake; I prefer the C# approach of only allowing Unicode escape sequences at very specific points in the code, but that's a different matter.)

2
Ian Roberts

Unicode escapes are recognised anywhere in a Java source file, not just inside string literals, and are processed very early in the compiler chain. A \u000d is treated as a literal carriage return, not an escaped one. That is, for the source code

String cr = "\u000d";

what the compiler sees is

String cr = "
";

And this is not legal Java code, because an ordinary string literal must end on the same line where it starts.
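To illustrate that the escapes really are translated everywhere, here is a small sketch (class and method names are hypothetical, chosen only for this example). The first method declares a variable through a Unicode escape in an identifier. The second shows a \u000a inside a line comment ending that comment early, so the text after it is compiled as code:

```java
class UnicodeEscapesEverywhere {
    static int identifierDemo() {
        // The escape in the next line is translated to the letter a before
        // lexing, so it declares an ordinary variable named a.
        int \u0061 = 41;
        return a + 1;
    }

    static String commentDemo() {
        String s = "unchanged";
        // The escape in this comment acts as a real line feed: \u000a s = "changed";
        return s;
    }

    public static void main(String[] args) {
        System.out.println(identifierDemo()); // prints 42
        System.out.println(commentDemo());    // prints changed
    }
}
```

This is a demonstration of the translation order only, not a pattern to use in real code.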