I am trying to run text from the Twitter API through sentiment analysis using the TextBlob library. When I run my code, it prints one or two sentiment values and then fails with the following error:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 31: ordinal not in range(128)
I do not understand why this is a problem if the code is only analyzing text. I have tried to encode the text as UTF-8 in the script. Here is the code:
from tweepy.streaming import StreamListener
from tweepy import OAuthHandler
from tweepy import Stream
import json
import sys
import csv
from textblob import TextBlob
# Variables that contain the user credentials to access the Twitter API
access_token = ""
access_token_secret = ""
consumer_key = ""
consumer_secret = ""
# This is a basic listener that just prints received tweets to stdout.
class StdOutListener(StreamListener):
    def on_data(self, data):
        json_load = json.loads(data)
        texts = json_load['text']
        coded = texts.encode('utf-8')
        s = str(coded)
        content = s.decode('utf-8')
        #print(s[2:-1])
        wiki = TextBlob(s[2:-1])
        r = wiki.sentiment.polarity
        print r
        return True

    def on_error(self, status):
        print(status)
auth = OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
stream = Stream(auth, StdOutListener())
# This line filters Twitter streams to capture data by the keywords: 'dollar', 'euro'
stream.filter(track=['dollar', 'euro'], languages=['en'])
Can someone please help me with this situation?
Thank you in advance.
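You're mixing too many things together. As the error says, you're trying to decode byte data. json.loads already returns the text as a Unicode string, so the only conversion you could need from there is to encode it. In your script, once you switch from that string to coded (its UTF-8 bytes), you get an error about decoding byte data: Python 2 falls back to the 'ascii' codec whenever it has to decode a byte string implicitly, and 0xc2 is outside that range.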
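To illustrate the point, here is a minimal sketch of keeping the text as a Unicode string all the way through. The helper name tweet_text and the sample payload are made up for this example and are not part of tweepy or TextBlob. The sample contains a non-breaking space, whose UTF-8 encoding starts with the same 0xc2 byte that appears in your traceback.

```python
import json


def tweet_text(data):
    """Return the tweet text from a raw JSON payload as a Unicode string.

    tweet_text is a hypothetical helper written for this example.
    """
    payload = json.loads(data)
    # Not every stream message is a tweet (delete notices, for example,
    # have no 'text' key), so fall back to None instead of raising KeyError.
    return payload.get('text')


# Made-up sample payload with two non-ASCII characters:
# U+00A0 (non-breaking space) and U+20AC (euro sign).
sample = '{"text": "dollar vs\\u00a0euro \\u20ac"}'

text = tweet_text(sample)          # already Unicode, nothing to decode
encoded = text.encode('utf-8')     # bytes: only needed for files or sockets
decoded = encoded.decode('utf-8')  # round-trips to the same Unicode string
```

With TextBlob installed, the value returned by tweet_text could be passed directly, as in TextBlob(text).sentiment.polarity, with no encode, str() or slicing step in between.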