I've got an issue with iterating through a Unicode string, character by character, in Python.
print "w: ", word
for c in word:
    print "word: ", c
This is my output:
w: 文本
word: ?
word: ?
word: ?
word: ?
word: ?
word: ?
My desired output is:
文
本
When I use len(word) I get 6, so apparently each character takes up 3 bytes.
So my Unicode string is successfully stored in the variable, but I cannot get the individual characters out. I have tried encode('utf-8'), decode('utf-8') and the codecs module, but none of them gave me the result I want. This seems like a simple problem, but it is frustratingly hard for me.
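To make the length of 6 concrete, here is a minimal sketch of what seems to be happening, assuming word holds UTF-8 encoded bytes rather than a decoded Unicode string. The names raw and text are made up for illustration, and the byte values are simply the UTF-8 encoding of 文本; the real word may come from somewhere else entirely.

```python
# -*- coding: utf-8 -*-
# Assumption: the original `word` is a UTF-8 encoded byte string,
# which is what len(word) == 6 suggests for two CJK characters.
raw = b"\xe6\x96\x87\xe6\x9c\xac"  # UTF-8 bytes of the two characters

# Decoding turns the bytes into a Unicode string of code points.
text = raw.decode("utf-8")

byte_count = len(raw)    # counts bytes
char_count = len(text)   # counts characters

# Iterating over the decoded string yields one character per step.
chars = [c for c in text]

for c in chars:
    print(c)
```

If this assumption holds, iterating over the decoded value instead of the raw bytes should give one item per character, provided the terminal can display them.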
Hope someone can point me in the right direction.
Thanks!