Sentiment Analysis of Customer Reviews

Introduction to Sentiment Analysis

Sentiment analysis helps in understanding the emotional tone behind text data like customer reviews, support conversations, social media feeds, or survey responses. This technique is critical for businesses to gauge customer satisfaction, perform market research, and improve products or services.

This example uses the Women's Ecommerce Clothing Reviews dataset from Kaggle, queried here as the womens_clothing table. Alongside the review text, it contains attributes such as Age, Rating, Recommended_IND, and Positive_Feedback_Count, which we relate to customer sentiment. You can download the raw dataset for this analysis here.
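
The queries in this example refer to columns with underscores, such as Review_Text and Recommended_IND. If your downloaded CSV uses headers with spaces instead (an assumption about the raw file, so check your copy), a small preprocessing step along these lines could rename them before loading. The function names and file paths are hypothetical and not part of SQL-inf.

```python
import csv
import re


def normalize_header(name: str) -> str:
    """Turn a header like 'Review Text' into 'Review_Text'."""
    # Collapse any run of non-alphanumeric characters into one underscore.
    return re.sub(r"[^0-9A-Za-z]+", "_", name.strip()).strip("_")


def rewrite_csv_headers(src_path: str, dst_path: str) -> list[str]:
    """Copy a CSV, renaming only its header row. Returns the new headers."""
    with open(src_path, newline="", encoding="utf-8") as src, \
         open(dst_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        headers = [normalize_header(h) for h in next(reader)]
        writer.writerow(headers)
        # Data rows are copied unchanged.
        writer.writerows(reader)
    return headers
```

Only the header row is rewritten, so the review text and all other values stay exactly as downloaded.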

SQL-inf Query for Sentiment Prediction

  SELECT * FROM womens_clothing SENTIMENT(Review_Text)

This query labels each review as 'Positive', 'Negative', or 'Neutral' based on its Review_Text. The result is returned in a column named label, which the next query uses.

A more powerful way of understanding sentiment is to combine SENTIMENT with PREDICT. The query below derives a sentiment label from the review text, keeps other relevant columns such as Age and Rating, and passes them to PREDICT(label) so that the label can be modeled from those columns.

  WITH sentiment_features AS (
    SELECT * FROM womens_clothing SENTIMENT(Review_Text)
  )
  SELECT Age, Rating, Recommended_IND, Positive_Feedback_Count, Division_Name, Department_Name, Class_Name, label
  FROM sentiment_features PREDICT(label)

Key Findings

  • Model Performance: The model reaches an accuracy of 61%, well above the roughly 33% expected from random guessing among three equally likely labels, so its sentiment predictions are moderately reliable.

  • Feature Importance: The four features with the most influence on predicted sentiment are Recommended Indicator, Rating, Age, and Positive Feedback Count.

  • Highlights:

    1. Recommended Indicator: Reviews with a '0' in this indicator are more likely to be negative, and make up 76% of the dataset.
    2. Rating: Lower ratings correspond to more negative sentiment, while reviews with a rating of 3, which make up a majority of the data, show a lower median negative sentiment.
    3. Age: Positive and negative sentiment are spread evenly across age groups, so age does little to differentiate sentiment.
    4. Positive Feedback Count: As the count increases, the positive sentiment generally rises but slightly dips at the highest quantile, indicating a nonlinear relationship.
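
To put the accuracy figure in context, it can help to compare it against simple baselines. The sketch below computes two: uniform random guessing, which gives 1/k for k labels (33% for three), and always predicting the most common label, which depends on how the labels are distributed. The function name is hypothetical, and the label counts in the usage lines are made up for illustration rather than taken from the dataset.

```python
from collections import Counter


def baseline_accuracies(labels):
    """Return (uniform_random, majority_class) expected accuracies."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    # Guessing uniformly among k classes is right 1/k of the time on average,
    # regardless of how the classes are distributed.
    uniform = 1 / len(counts)
    # Always predicting the most frequent class is right as often as that class occurs.
    majority = counts.most_common(1)[0][1] / len(labels)
    return uniform, majority


# Hypothetical label counts, for illustration only.
toy_labels = ["Positive"] * 5 + ["Negative"] * 3 + ["Neutral"] * 2
uniform, majority = baseline_accuracies(toy_labels)
print(f"uniform: {uniform:.2f}, majority: {majority:.2f}")
```

If one label dominates the data, the majority baseline can sit well above 33%, which makes it the stricter comparison for a reported accuracy.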

Strategic Conclusions and Recommendations

  • Recommendation Metrics: Because a '0' in the recommended indicator is strongly linked to negative reviews, investigate why those customers do not recommend the product and work to move more of them toward '1'.

  • Importance of Ratings: Ratings move with sentiment, with lower ratings going with more negative sentiment, so focusing on the areas that receive low ratings could significantly improve overall sentiment.

  • Age-Neutral Strategies: Since age does not significantly influence sentiment, marketing strategies and feedback systems can be designed to be age-neutral, focusing instead on other key factors.

  • Leveraging Positive Feedback: A higher Positive Feedback Count generally indicates more positive sentiment, but businesses should be cautious as this is not a guarantee. It would be beneficial to dive deeper into why the highest quantile of positive feedback experiences a dip in sentiment.
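
One way to examine the dip at the top of the Positive Feedback Count range is to bin reviews by count and compare the share of positive labels per bin. The following is a minimal sketch using only the Python standard library; the function name, the choice of four bins, and the sample rows are assumptions for illustration, not part of the original analysis.

```python
from bisect import bisect_left
from statistics import quantiles


def positive_share_by_quantile(rows, n_bins=4):
    """rows: iterable of (positive_feedback_count, label) pairs.

    Returns a list of (bin_index, n_reviews, share_positive), lowest bin first.
    share_positive is None for a bin that received no reviews.
    """
    rows = list(rows)
    counts = [c for c, _ in rows]
    # Cut points that split the counts into n_bins groups.
    cuts = quantiles(counts, n=n_bins, method="inclusive")
    totals = [0] * n_bins
    positives = [0] * n_bins
    for count, label in rows:
        # bisect_left places a value equal to a cut point in the lower bin.
        b = bisect_left(cuts, count)
        totals[b] += 1
        positives[b] += label == "Positive"
    return [
        (b, totals[b], positives[b] / totals[b] if totals[b] else None)
        for b in range(n_bins)
    ]


# Hypothetical (count, label) rows, for illustration only.
sample_rows = [
    (0, "Negative"), (1, "Positive"), (2, "Negative"), (3, "Positive"),
    (4, "Positive"), (5, "Positive"), (6, "Positive"), (7, "Negative"),
]
for bin_index, n_reviews, share in positive_share_by_quantile(sample_rows):
    print(bin_index, n_reviews, share)
```

When many reviews share the same count (for example, a count of zero), some bins may be empty or uneven, so the number of reviews per bin is worth checking before reading much into a dip.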

With these insights, companies can fine-tune their products, services, and customer interaction strategies to enhance overall customer satisfaction and brand reputation.