Game Overview
You're a data scientist at Amazon tasked with building the ultimate review sentiment classifier. Your challenge: correctly identify positive, negative, and edge-case reviews to train an AI that will process millions of customer opinions. But watch out: sarcasm, mixed feelings, and clever wordplay will test your detective skills!
Phase Details
Phase 1: Human Training
You receive 20 product reviews without labels. Your task:
- Classify each as Positive, Negative, or Neutral
- Flag potential sarcasm or mixed sentiment
- Note key sentiment indicators (words/phrases)
- Build intuition for patterns
Time Pressure: 30 seconds per review!
Phase 2: Algorithm Design
Choose your text analytics approach:
- Bag of Words: Count positive/negative terms
- TF-IDF: Weight important words
- N-grams: Capture phrases like "not bad"
- Sentiment Lexicons: Use pre-built dictionaries
Build rules based on your Phase 1 observations!
Phase 3: Prediction Challenge
Your model faces 80 unseen reviews:
- Apply your algorithm to new data
- No manual intervention allowed
- Reviews include various products
- Hidden test set has tricky cases
Goal: Beat 75% accuracy baseline!
Phase 4: Edge Case Bonus
Special challenge reviews worth double points:
- Sarcastic reviews ("Great, another broken product")
- Mixed sentiment ("Love the design, hate the price")
- Negations ("Not disappointed at all!")
- Context-dependent ("Sick headphones" - good or bad?)
Correctly identify these for bonus points!
Available Techniques
Word Lists (Easy)
- Method: Count positive and negative words
- Example: "amazing" (+1), "terrible" (-1)
- Accuracy: ~65-70%
- Pitfall: Misses context and negations
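A minimal sketch of this counting approach in Python; the tiny word lists are illustrative stand-ins for a real lexicon:

```python
# Illustrative word lists (a real lexicon would be far larger).
POSITIVE = {"amazing", "great", "love", "excellent"}
NEGATIVE = {"terrible", "broken", "hate", "awful"}

def word_list_sentiment(review: str) -> str:
    tokens = review.lower().split()
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "Positive"
    if score < 0:
        return "Negative"
    return "Neutral"

print(word_list_sentiment("I love it, amazing value"))  # Positive
print(word_list_sentiment("broken on arrival, awful"))  # Negative
```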
TF-IDF (Medium)
- Method: Weight words by importance
- Example: "broken" matters more than "the"
- Accuracy: ~75-80%
- Pitfall: Still misses word relationships
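A sketch of TF-IDF weighting, assuming scikit-learn is installed; the three example reviews are made up for illustration:

```python
from sklearn.feature_extraction.text import TfidfVectorizer

reviews = [
    "the screen arrived broken",
    "the battery life is great",
    "the delivery was fast",
]
vec = TfidfVectorizer()
X = vec.fit_transform(reviews)

# "broken" appears in one review, "the" in all three, so "broken"
# gets a higher TF-IDF weight in the first review.
weights = dict(zip(vec.get_feature_names_out(), X.toarray()[0]))
print(weights["broken"], ">", weights["the"])
```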
N-grams (Medium)
- Method: Capture word pairs and triplets
- Example: "not bad" → positive
- Accuracy: ~78-83%
- Pitfall: Exponential feature growth
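A sketch of n-gram extraction with scikit-learn's CountVectorizer; once "not bad" is a single bigram feature, a downstream model can learn its positive polarity even though "bad" alone is negative:

```python
from sklearn.feature_extraction.text import CountVectorizer

# ngram_range=(1, 2) extracts unigrams and bigrams in one pass.
vec = CountVectorizer(ngram_range=(1, 2))
vec.fit(["this is not bad at all"])
print(vec.get_feature_names_out())
# includes 'not bad' as its own feature, alongside 'not' and 'bad'
```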
VADER (Easy)
- Method: Pre-built sentiment analyzer
- Example: Handles emphasis (!!!) and emojis
- Accuracy: ~80-85%
- Pitfall: Generic, not product-specific
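A sketch using the VADER analyzer bundled with NLTK (assumes nltk is installed and the vader_lexicon resource can be downloaded):

```python
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)  # one-time lexicon download
sia = SentimentIntensityAnalyzer()

# Caps and repeated '!' push the 'compound' score further positive.
print(sia.polarity_scores("This phone is AMAZING!!!"))
print(sia.polarity_scores("this phone is amazing"))
```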
Machine Learning (Hard)
- Method: Train a classifier on extracted features
- Example: Naive Bayes, SVM, neural net
- Accuracy: ~85-92%
- Pitfall: Needs lots of labeled data
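A sketch of the machine-learning route with a Naive Bayes classifier over bag-of-words features; the four-review training set is illustrative only, and a real run needs far more labeled data (the pitfall above):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Illustrative training data; label many more reviews in practice.
train_texts = ["love this product", "works great", "totally broken", "waste of money"]
train_labels = ["Positive", "Positive", "Negative", "Negative"]

model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(train_texts, train_labels)
print(model.predict(["arrived broken again"]))  # ['Negative']
```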
Hybrid Approach (Hard)
- Method: Combine multiple techniques
- Example: VADER + custom rules + ML
- Accuracy: ~88-95%
- Pitfall: Complex to implement quickly
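One possible hybrid sketch: VADER supplies the base score and hand-written domain rules override it. The slang list is an assumption for illustration, not a recommended design:

```python
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)
sia = SentimentIntensityAnalyzer()

# Assumed domain slang that generic lexicons tend to misread.
DOMAIN_POSITIVE = {"sick", "killer"}

def hybrid_sentiment(review: str) -> str:
    if DOMAIN_POSITIVE & set(review.lower().split()):
        return "Positive"  # custom rule overrides the generic analyzer
    compound = sia.polarity_scores(review)["compound"]
    if compound >= 0.05:
        return "Positive"
    if compound <= -0.05:
        return "Negative"
    return "Neutral"

print(hybrid_sentiment("These are sick headphones"))  # Positive
```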
Scoring System
Detailed Scoring Breakdown
- Base Accuracy (500 pts): 5 points per correctly classified review
- Speed Bonus (200 pts): First team to submit earns full points, minus 20 points per rank after that
- Sarcasm Detection (100 pts): 20 points per correctly identified sarcastic review
- Mixed Sentiment (100 pts): 25 points per correctly handled mixed review
- Innovation Bonus (100 pts): Creative approaches, good documentation, clean code
Accuracy Thresholds
- < 60%: "Needs Coffee" - Random guessing territory
- 60-70%: "Junior Detective" - Better than random!
- 70-80%: "Senior Analyst" - Solid performance
- 80-90%: "Sentiment Master" - Professional level
- > 90%: "Algorithm Whisperer" - Are you cheating?
Tricky Cases to Watch For
Sarcasm Signals
- "Great" followed by negative context
- Excessive punctuation (!!!???)
- Quotes around 'positive' words
- "Thanks for nothing" patterns
Negation Patterns
- "Not bad" โ Actually positive
- "Can't complain" โ Positive
- "Wasn't disappointed" โ Positive
- "No issues whatsoever" โ Very positive
Domain-Specific Terms
- "Sick" (headphones) โ Positive
- "Killer" (app) โ Positive
- "Addictive" (game) โ Positive
- "Cheap" โ Context-dependent!
Mixed Signals
- Product good, service bad
- Love X, hate Y patterns
- Star rating vs. text mismatch
- Comparison reviews (better than X, worse than Y)
Rating Inconsistencies
- 5 stars but complaints in text
- 1 star but "not that bad"
- 3 stars could be anything!
- Cultural differences in rating
Cultural Context
- British understatement
- American enthusiasm
- Technical jargon variations
- Generation-specific slang
Leaderboard Categories
| Award | Description | Prize |
|---|---|---|
| Accuracy Champion | Highest overall classification accuracy | 500 bonus XP |
| Sarcasm Detector | Best at identifying sarcastic reviews | 300 bonus XP |
| Speed Demon | First to submit with >70% accuracy | 200 bonus XP |
| Creative Coder | Most innovative approach | Special recognition |
| Most Improved | Biggest gain from manual to automated classification | Encouragement award |
Pro Strategies
Quick Wins
- Start with simple word counting
- Handle "not" + positive word cases
- Use star ratings as a hint (but don't trust them completely)
- Look for ALL CAPS for emphasis
- Check first and last sentences (usually summary)
Advanced Tactics
- Build product-specific lexicons
- Weight recent reviews more heavily
- Detect review bombing patterns
- Use emoji sentiment (positive vs. negative emoji)
- Ensemble multiple approaches
Common Pitfalls
- Over-relying on single words
- Ignoring context around sentiment words
- Missing double negatives
- Treating all products the same
- Forgetting about neutral reviews
Learning Objectives
By Playing This Game, You'll Learn:
- Text Preprocessing: Why cleaning and tokenization matter
- Feature Engineering: Converting text to numbers for analysis
- Sentiment Complexity: Why sentiment isn't just positive/negative
- Context Importance: How word relationships change meaning
- Model Limitations: When automated systems fail and need human help
- Real-World Challenges: Sarcasm, irony, and cultural differences
- Performance Metrics: Accuracy, precision, recall, and F1 scores (see the sketch after this list)
- Business Impact: How sentiment analysis drives product decisions
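For the metrics objective, scikit-learn's built-in scorers cover all four measures; a sketch with made-up label lists:

```python
from sklearn.metrics import accuracy_score, classification_report

# Illustrative labels; in the game these would be the hidden test
# labels and your model's predictions.
y_true = ["Positive", "Negative", "Neutral", "Positive"]
y_pred = ["Positive", "Negative", "Positive", "Positive"]

print(accuracy_score(y_true, y_pred))  # 0.75
print(classification_report(y_true, y_pred, zero_division=0))  # per-class P/R/F1
```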
Ready to Start Detecting?
Choose your implementation and become a Sentiment Detective!