Sentiment Analysis Case Study

Full Answer Section

   

Data Overview

Python
df.head()

Output:

                                             feedback_text        date
0                     "Thank you for the prompt delivery!"  2023-10-01
1                                         "Great service!"  2023-10-01
2  "The delivery guy was so friendly and helpful. Thanks!"  2023-10-01
3     "I'm always satisfied with CouchPotato Couriers. :)"  2023-10-01
4                        "My order arrived late and cold."  2023-10-01

Python
df.info()

Output:

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 674 entries, 0 to 673
Data columns (total 2 columns):
 #   Column         Non-Null Count  Dtype
---  ------         --------------  -----
 0   feedback_text  674 non-null    object
 1   date           674 non-null    object
dtypes: object(2)
memory usage: 107.2 KB

As shown above, the DataFrame contains 674 observations with no missing values. Both columns, 'feedback_text' and 'date', are of object data type.

Text Preprocessing

Before performing sentiment analysis, it's essential to preprocess the text data to clean and standardize the feedback. This involves steps like removing punctuation, converting text to lowercase, removing stop words, and stemming or lemmatization.

Python
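import re

import nltk
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer

# Download the stop-word list once (does nothing if it is already present)
nltk.download('stopwords')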

# Remove punctuation
df['feedback_text'] = df['feedback_text'].apply(lambda text: re.sub(r'[^\w\s]', '', text))

# Convert text to lowercase
df['feedback_text'] = df['feedback_text'].apply(lambda text: text.lower())

# Remove stop words
stop_words = set(stopwords.words('english'))
df['feedback_text'] = df['feedback_text'].apply(lambda text: [word for word in text.split() if word not in stop_words])

# Stemming
stemmer = PorterStemmer()
df['feedback_text'] = df['feedback_text'].apply(lambda text: [stemmer.stem(word) for word in text])

After these steps, each entry in 'feedback_text' is a list of lowercase, stemmed tokens with punctuation and stop words removed, which makes it more suitable for sentiment analysis.
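
To make the effect of these steps concrete, here is a minimal, dependency-free sketch applied to a single sentence from the data. The small STOP_WORDS set is an assumption standing in for NLTK's English stop-word list, and stemming is left out, so the tokens shown are unstemmed.

```python
import re

# Assumption: a tiny hand-picked stop-word set standing in for
# stopwords.words('english'), so the sketch needs no downloads.
STOP_WORDS = {'my', 'and', 'the', 'was', 'so', 'for', 'you', 'with'}

def preprocess(text):
    """Strip punctuation, lowercase, tokenize, and drop stop words."""
    text = re.sub(r'[^\w\s]', '', text)  # remove punctuation
    text = text.lower()                  # normalize case
    return [word for word in text.split() if word not in STOP_WORDS]

tokens = preprocess("My order arrived late and cold.")
print(tokens)  # ['order', 'arrived', 'late', 'cold']
```

Note that the result is a list of tokens rather than a string, which is why the scoring step later loops over words instead of characters.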

Lexicon-Based Sentiment Analysis

Lexicon-based sentiment analysis involves using a predefined dictionary or lexicon of words with associated sentiment scores. For each word in the text, we add its sentiment score to obtain an overall sentiment score for the text.

Sentiment Dictionary

Python
sentiment_dict = {
    'excellent': 5,
    'great': 4,
    'good': 3,
    'ok': 2,
    'bad': 1,
    'terrible': 0
}

This sentiment dictionary assigns sentiment scores ranging from 0 (negative) to 5 (positive) to a set of common sentiment-bearing words.
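
One caveat worth checking: the preprocessing step stems every token, so a lexicon keyed on unstemmed words can miss matches (a Porter stemmer typically reduces 'terrible' to 'terribl', for example). The sketch below shows one possible way to align the two by passing the lexicon keys through the same stemming function. The helper stem_lexicon and the stand-in toy_stem are hypothetical names introduced only for illustration; in the pipeline above you would presumably pass PorterStemmer().stem instead.

```python
def stem_lexicon(lexicon, stem):
    """Return a copy of the lexicon whose keys have been passed through `stem`."""
    return {stem(word): score for word, score in lexicon.items()}

def toy_stem(word):
    # Hypothetical stand-in for a real stemmer: it only strips a trailing 'e'.
    # It keeps this sketch dependency-free and is NOT the Porter algorithm.
    return word[:-1] if word.endswith('e') else word

sentiment_dict = {
    'excellent': 5,
    'great': 4,
    'good': 3,
    'ok': 2,
    'bad': 1,
    'terrible': 0,
}

stemmed_dict = stem_lexicon(sentiment_dict, toy_stem)
print('terrible' in stemmed_dict)  # False
print(stemmed_dict['terribl'])     # 0
```

Whether this adjustment is needed depends on how the lexicon was built; if its keys were already stemmed, no change is required.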

Calculating Sentiment Scores

Python
# Sum the lexicon scores of the words in each feedback entry;
# words missing from the lexicon contribute 0.
df['sentiment_score'] = df['feedback_text'].apply(
    lambda words: sum(sentiment_dict.get(word, 0) for word in words)
)

We iterate through each feedback text and add the sentiment score of each word to the overall sentiment score for that observation.
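
As a worked example of this summation, the sketch below scores a few token lists without pandas and also reports which words matched. The function name score_with_matches and the token lists are illustrative assumptions, not part of the original solution.

```python
def score_with_matches(tokens, lexicon):
    """Return (total score, list of tokens that were found in the lexicon)."""
    matched = [word for word in tokens if word in lexicon]
    total = sum(lexicon[word] for word in matched)
    return total, matched

sentiment_dict = {
    'excellent': 5,
    'great': 4,
    'good': 3,
    'ok': 2,
    'bad': 1,
    'terrible': 0,
}

# Illustrative token lists (assumed, roughly what preprocessing might produce)
print(score_with_matches(['great', 'servic'], sentiment_dict))
# (4, ['great'])
print(score_with_matches(['order', 'arriv', 'late', 'cold'], sentiment_dict))
# (0, [])
print(score_with_matches(['terrible'], sentiment_dict))
# (0, ['terrible'])
```

The last two calls show a limitation of a purely additive, non-negative scale: a total of 0 can mean either that the text contained 'terrible' or that no lexicon word was found at all. Tracking the matched words is one simple way to tell the two cases apart.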

Python
df.head()

Output:

                                             feedback_text        date  sentiment_score
0                     "Thank you for the prompt delivery!"  2023-10-01                5
1                                         "Great service!"  2023-10-01                4
2  "The delivery guy was so friendly and helpful. Thanks!"  2023-10-01                5
3  "I'm always satisfied with

Sample Solution

 

To begin our analysis, let's load the customer feedback data into a pandas DataFrame.

Python
import pandas as pd

# Load the data into a DataFrame
df = pd.read_csv('cc_customer_feedback.csv')

The DataFrame consists of two columns: 'feedback_text' and 'date'. The 'feedback_text' column contains the customer feedback, while the 'date' column indicates when the feedback was submitted.
