CoreNLP Sentiment Analysis Python Loop through dataframe


How can I make this code loop through (run on) all the sentences in my dataframe?

def get_sentiment(review):
    for text in review:
        senti = nlp.annotate(text,
                       properties={
                           'annotators': 'sentiment',
                           'outputFormat': 'json',
                           'timeout': 40000,
                       })

    #for i in senti["sentences"]:
        return ("{}: '{}': {} (Sentiment Value) {} (Sentiment)".format(
        s["index"],
        " ".join([t["word"] for t in s["tokens"]]),
        s["sentimentValue"], s["sentiment"]))

When executed, the code above returns only the sentence from the first row, shown below:

"0: 'you can see everything from thousands of years in human history it was an unforgettable and wonderful trip to paris for me': 3 (Sentiment Value) Positive (Sentiment)"

I have tried several variations for the get_sentiment function but the best result I get is the one shown.

My dataframe is called 'reviews' and has one column (Review). This is the content:

                                                                                                 Review
0   you can see everything from thousands of years in human history it was an unforgettable and wonderful trip to paris for me
1   buy your tickets in advance and consider investing in one of many reputable tour guides that you can find online for at least part of your visit to the louvre these 2 arrangements will absolutely maximize your time and enjoyment of th...
2   quite an interesting place and a must see for art lovers the museum is larger than i expected and has so many exhibition areas that a full day trip might be needed if one wants to visit the whole place
3   simply incredible do not forget to get a three day pass if you love architecture art and history it is a must
4   we got here about 45 minutes before opening time and we were very first in line to get into the museum make sure to buy tickets ahead of time to help get in faster this museum is massive and can easily take your entire day an incredi...

There are 2 answers

arnaud On BEST ANSWER

Define your function get_sentiment as follows:

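def get_sentiment(row):
    # Annotate the text of one row; `nlp` is the CoreNLP client from the question.
    result = nlp.annotate(
        row.Review,
        properties={
            "annotators": "sentiment",
            "outputFormat": "json",
            "timeout": 40000,
        },
    )

    # The JSON output holds one entry per detected sentence under "sentences".
    for s in result["sentences"]:
        print(
            "{}: '{}': {} (Sentiment Value) {} (Sentiment)".format(
                row.name,  # index label of the current dataframe row
                " ".join([t["word"] for t in s["tokens"]]),
                s["sentimentValue"],
                s["sentiment"],
            )
        )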

Use pandas.DataFrame.apply() and run:

>>> reviews.apply(get_sentiment, axis=1)
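
If the results should be kept rather than printed, the function passed to apply() can return a value, and the returned Series can be stored as a new column. The following is a minimal sketch, not a definitive implementation: fake_annotate is a hypothetical stand-in for nlp.annotate (so the example runs without a CoreNLP server), sentiment_label and the "Sentiment" column are illustrative names, and keeping only the first sentence's label is a simplifying assumption.

```python
import pandas as pd


def fake_annotate(text, properties=None):
    """Hypothetical stand-in for nlp.annotate; returns CoreNLP-shaped JSON."""
    positive = "wonderful" in text or "incredible" in text
    return {
        "sentences": [
            {
                "index": 0,
                "tokens": [{"word": w} for w in text.split()],
                "sentimentValue": "3" if positive else "2",
                "sentiment": "Positive" if positive else "Neutral",
            }
        ]
    }


def sentiment_label(row, annotate=fake_annotate):
    result = annotate(
        row.Review,
        properties={"annotators": "sentiment", "outputFormat": "json"},
    )
    # Assumption for this sketch: keep only the first sentence's label.
    return result["sentences"][0]["sentiment"]


reviews = pd.DataFrame(
    {"Review": ["a wonderful trip to paris", "buy your tickets in advance"]}
)
# apply(..., axis=1) calls the function once per row and collects the
# return values into a Series aligned with the dataframe index.
reviews["Sentiment"] = reviews.apply(sentiment_label, axis=1)
```

With a running server, the real nlp.annotate from the question would presumably be passed in place of the stub.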
Azle Blade On

Your return statement is inside the for loop. Because return exits the function as soon as it executes, the function stops after the first iteration.

What you need to do:

Create a variable (for example, a list) just before the loop starts and append each result to it inside the loop. Then move the return statement out of the for loop so that it returns that variable once the loop has finished.
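
The approach described above could look roughly like the sketch below. It is only an illustration under stated assumptions: fake_annotate is a hypothetical stand-in for nlp.annotate so the code runs without a CoreNLP server, and the annotate parameter is added here purely to make that swap possible.

```python
def fake_annotate(text, properties=None):
    """Hypothetical stand-in for nlp.annotate so the sketch runs offline."""
    positive = any(w in text for w in ("wonderful", "incredible"))
    return {
        "sentences": [
            {
                "index": 0,
                "tokens": [{"word": w} for w in text.split()],
                "sentimentValue": "3" if positive else "2",
                "sentiment": "Positive" if positive else "Neutral",
            }
        ]
    }


def get_sentiment(review, annotate=fake_annotate):
    results = []  # created once, before the loop starts
    for text in review:
        senti = annotate(
            text,
            properties={
                "annotators": "sentiment",
                "outputFormat": "json",
                "timeout": 40000,
            },
        )
        # One formatted line per sentence found in this text.
        for s in senti["sentences"]:
            results.append(
                "{}: '{}': {} (Sentiment Value) {} (Sentiment)".format(
                    s["index"],
                    " ".join(t["word"] for t in s["tokens"]),
                    s["sentimentValue"],
                    s["sentiment"],
                )
            )
    return results  # returned once, after every text has been processed


lines = get_sentiment(["a wonderful trip to paris", "buy your tickets in advance"])
```

With the objects from the question, the call would presumably be get_sentiment(reviews["Review"], annotate=nlp.annotate), since iterating over a dataframe column yields its values one by one.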