UnicodeEncodeError Python 2.7


I am using Tweepy for authentication and trying to print the text of each tweet on my home timeline, but the script stops with a UnicodeEncodeError. I tried a few approaches but could not solve it.

# -*- coding: utf-8 -*-

import tweepy

consumer_key = ""
consumer_secret = ""
access_token = ''
access_token_secret = ''

auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)

api = tweepy.API(auth)

public_tweets = api.home_timeline()
for tweet in public_tweets:
    print tweet.text.decode("utf-8")+'\n'

Error:

(venv) C:\Users\e2sn7cy\Documents\GitHub\Tweepy>python tweepyoauth.py
Throwback to my favourite! Miss this cutie :) #AdityaRoyKapur https://t.co/sxm8g1qhEb/n
Cristiano Ronaldo: 3 hat-tricks in his last 3 matches.

Lionel Messi: 3 trophies in his last 3 matches. http://t.co/For1It4QxF/n
How to Bring the Outdoors in With Indoor Gardens http://t.co/efQjwcszDo http://t.co/1NLxSzHxlI/n
Traceback (most recent call last):
  File "tweepyoauth.py", line 17, in <module>
    print tweet.text.decode("utf-8")+'/n'
  File "C:\myPython\venv\lib\encodings\utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-7: ordinal not in range(128)

There is 1 answer

Serge Ballesta

The line print tweet.text.decode("utf-8")+'/n' is the cause.

The traceback shows the exception being raised inside decode itself. tweet.text is almost certainly already a unicode string, and in Python 2 calling decode on a unicode string first encodes it implicitly with the ascii codec. That implicit step fails on the first non-ASCII character, which is why a call to decode ends in a UnicodeEncodeError.

So drop the decode call and concatenate with a unicode string, so that the result stays unicode without any implicit conversion (BTW, I think you really wanted \n rather than /n):

print tweet.text + u'\n'

If this is not enough, it could be because your environment cannot print unicode strings directly. In that case, explicitly encode the string in the native charset of your console:

print (tweet.text + u'\n').encode('cp850')

[replace 'cp850' (my charset) with the charset of your system]
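As a rough sketch of the two steps above (keep the text as unicode, then encode it once for the console), the helpers below should run on both Python 2.7 and Python 3. The names to_unicode and encode_for_console are made up for illustration and are not part of Tweepy. The sample strings stand in for tweet.text, and the "replace" error handler is an added assumption so that characters missing from the console charset become ? instead of raising.

```python
# -*- coding: utf-8 -*-
import sys


def to_unicode(value, encoding="utf-8"):
    """Return value as unicode text, decoding only if it is a byte string."""
    # On Python 2, bytes is an alias of str, so unicode objects pass through
    # untouched and never hit the implicit ASCII encode that decode() triggers.
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def encode_for_console(text, encoding=None):
    """Encode unicode text for the console, replacing unencodable characters."""
    # sys.stdout.encoding may be None (e.g. when output is piped), so fall
    # back to UTF-8 in that case.
    encoding = encoding or getattr(sys.stdout, "encoding", None) or "utf-8"
    return to_unicode(text).encode(encoding, "replace")


# Hypothetical stand-ins for tweet.text: one already unicode, one UTF-8 bytes.
as_unicode = u"caf\xe9"
as_bytes = b"caf\xc3\xa9"

same_text = to_unicode(as_unicode) == to_unicode(as_bytes)
cp850_bytes = encode_for_console(as_unicode + u"\n", "cp850")
ascii_bytes = encode_for_console(as_unicode, "ascii")
```

On Python 2.7, the loop from the question would then print encode_for_console(tweet.text) for each tweet, with the charset either passed explicitly or taken from sys.stdout.encoding.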