Weird output with characters overlaying each other

I collected a bunch of tweets and printed them to the command line. Here is what I got:

[screenshot of the console output]

The tweets are in different languages, so I suspect some of them are Arabic. Can control characters be responsible for this output? There are a few thousand lines that somehow get contracted into one, and as far as I can tell, the characters overlay each other.

What is going on?

Answer by Lookaji:

How your data is interpreted when it is printed to a console depends on the system's default text encoding and locale.
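
To illustrate the point, here is a minimal Python sketch (Python is my choice here, the question does not say which language was used) that decodes the same two bytes under two different encodings and prints the encodings the current console reports. The sample character is only an example, not taken from your data.

```python
import locale
import sys

# Hypothetical sample: U+00E9 (e with acute accent) encoded as UTF-8.
raw = "\u00e9".encode("utf-8")  # two bytes: c3 a9

# The same bytes give different text depending on the decoder.
as_utf8 = raw.decode("utf-8")      # one character, U+00E9
as_latin1 = raw.decode("latin-1")  # two characters, U+00C3 U+00A9

# ascii() escapes non-ASCII characters, so this prints safely on any console.
print(ascii(as_utf8), ascii(as_latin1))

# What this system would use by default when printing text.
print("stdout encoding:", sys.stdout.encoding)
print("locale encoding:", locale.getpreferredencoding(False))
```

If the two reported encodings differ from the encoding the tweets were stored in, the console may show characters other than the ones that were sent.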

I'd rather look at the raw hex data you receive (e.g. 0x4142430d0a...) than at the text as decoded with Unicode, UTF, or whatever text encoding your system is using.
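
A rough sketch of that kind of inspection in Python, assuming the tweets are available as ordinary strings. The function names `hex_dump` and `find_control_chars` are made up for this example. It dumps the raw bytes as hex and lists control (Cc) and format (Cf) characters, which include the carriage return (0x0d) and the Unicode right-to-left marks that can appear in Arabic text. A carriage return without a following line feed moves the cursor back to the start of the line, so that is one possible cause of lines overwriting each other, though I can't confirm it from the screenshot alone.

```python
import unicodedata


def hex_dump(data: bytes) -> str:
    """Return the bytes as space-separated two-digit hex values."""
    return " ".join(f"{b:02x}" for b in data)


def find_control_chars(text: str):
    """Return (index, code point, category) for control/format characters.

    Plain line feeds are skipped because they are expected in normal output.
    """
    hits = []
    for i, ch in enumerate(text):
        category = unicodedata.category(ch)
        if category in ("Cc", "Cf") and ch != "\n":
            hits.append((i, f"U+{ord(ch):04X}", category))
    return hits


# Hypothetical sample matching the hex example above: "ABC" + CR + LF.
sample = "ABC\r\n"
print(hex_dump(sample.encode("utf-8")))
print(find_control_chars(sample))
```

Running `find_control_chars` over each tweet before printing would show whether any of them contain carriage returns or direction marks, and at which positions.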

An introduction to the different text encodings can be found on Wikipedia: http://en.wikipedia.org/wiki/Character_encoding