🔍 The Analytics Detective

Solve real-world predictive analytics challenges from top companies


Welcome, Analytics Detective! 🕵️

You've been hired as a consultant to help major companies solve their business challenges using predictive analytics. Each case will test your ability to:

  • Choose the right analytical technique
  • Interpret model results correctly
  • Translate findings into business action
  • Understand deployment challenges

Remember: In the real world, the hardest part isn't just building the model—it's taking the results back to operations and managing change.

Case 1: Netflix - Content Recommendation Crisis

The Situation

Netflix's subscriber growth has plateaued. User engagement data shows that 40% of subscribers watch less than 5 hours per month, and churn risk is highest in the first 90 days. The content team has a $200M budget for new shows but doesn't know what to produce.

Your Mission: Recommend an analytics approach to increase engagement and reduce churn.

Key Metrics

  • Monthly Active Users: 180M
  • Avg Watch Time: 12 hours/month
  • Churn Rate (90 days): 18%
  • Content Library: 5,000+ titles

Question: Which predictive modeling approach makes the most sense?

A. Logistic Regression
Build a churn prediction model using viewing history, account age, and subscription type to identify at-risk users in their first 90 days.
B. Clustering + Recommendation System
Segment users based on viewing behavior, then build collaborative filtering to recommend content similar to what their cluster enjoys.
C. Linear Regression
Predict monthly viewing hours based on content genre preferences and send generic "watch more" emails to low-engagement users.
D. Market Basket Analysis
Use association rules to find which shows are frequently watched together, similar to "customers who bought this also bought that."
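
To make the candidate techniques concrete, here is a minimal, hypothetical sketch of the approaches named in options A and B, using scikit-learn on synthetic viewing data. Every column, coefficient, and user count below is invented for illustration; nothing comes from actual Netflix data.

```python
# Hypothetical sketch of the techniques named in options A and B.
# All data below is synthetic; column names and numbers are invented.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(42)
n_users = 10_000

# Synthetic features: hours watched in the first 90 days, account age (days),
# and a 0/1 flag for a premium-tier subscription.
X = np.column_stack([
    rng.exponential(scale=12, size=n_users),   # watch hours
    rng.integers(1, 90, size=n_users),         # account age
    rng.integers(0, 2, size=n_users),          # premium tier
])
# Synthetic churn label: low watch time makes churn more likely.
churn_prob = 1 / (1 + np.exp(0.25 * X[:, 0] - 2.0))
y = rng.binomial(1, churn_prob)

# Option A: logistic regression to flag at-risk users in their first 90 days.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
churn_model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("Churn AUC:", roc_auc_score(y_te, churn_model.predict_proba(X_te)[:, 1]))

# Option B (first half): cluster users on viewing behavior; each segment
# would then get its own collaborative-filtering recommendations.
segments = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(X)
print("Users per segment:", np.bincount(segments))
```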

Case 2: Capital One - Real-Time Fraud Detection

The Situation

Capital One processes 3 billion credit card transactions annually. Current fraud detection flags 2% of transactions for review, but only 0.1% of all transactions are actual fraud. That means roughly 95% of flagged transactions are false alarms, frustrating customers and costing $50M/year in review costs.

Your Mission: Improve fraud detection accuracy while maintaining real-time processing.

Current Performance

  • Transactions/day: 8.2M
  • Fraud rate: 0.1%
  • False positive rate: 1.9%
  • Manual review cost: $12 per transaction
  • Avg transaction processing time: 0.3 seconds

Question: What's your biggest challenge in building this model?

A. Model Complexity
The hardest part is choosing between random forest, neural networks, or logistic regression for best accuracy.
B. Imbalanced Data
With only 0.1% fraud cases, you have a massive class imbalance. Your model might just predict "not fraud" every time and still be 99.9% accurate.
C. Feature Engineering
Finding the right input variables like transaction amount, merchant type, and location patterns.
D. Data Cleaning
Dealing with missing values and outliers in the transaction data before model building.
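
Whichever answer you lean toward, it helps to see what a 0.1% fraud rate does to accuracy as a metric. The tiny demo below uses simulated labels only; it is a sketch of the imbalance problem, not Capital One's pipeline.

```python
# Why raw accuracy is misleading at a 0.1% fraud rate (illustrative numbers only).
import numpy as np
from sklearn.metrics import accuracy_score, precision_score, recall_score

rng = np.random.default_rng(7)
n = 1_000_000
y_true = rng.binomial(1, 0.001, size=n)        # ~0.1% of transactions are fraud

# "Model" 1: never flag anything.
always_legit = np.zeros(n, dtype=int)
print("Accuracy if we never flag fraud:", accuracy_score(y_true, always_legit))   # ~0.999
print("Recall (fraud caught):", recall_score(y_true, always_legit, zero_division=0))  # 0.0

# "Model" 2: flag 2% of transactions at random (roughly the current review volume).
random_flags = rng.binomial(1, 0.02, size=n)
print("Precision of random 2% flagging:",
      precision_score(y_true, random_flags, zero_division=0))
```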

Case 3: Spotify - The Premium Conversion Challenge

The Situation

Spotify has 180M free users but only converts 8% to Premium ($9.99/month). Your model predicts conversion probability with 85% accuracy. You've identified 20M "high-probability" users. The marketing team wants to send them all upgrade offers immediately.

Your Mission: Should you deploy this as-is, or are there operational considerations?

Model Performance

  • Overall Accuracy: 85%
  • Precision: 12%
  • Recall: 78%
  • AUC: 0.82
  • Cost per offer: $2
  • Value per conversion: $60/year

Question: What's the critical issue before deploying?

A. Technical Infrastructure
Deploy immediately. 85% accuracy is excellent, and the system can handle sending 20M emails.
B. Precision vs. Recall Trade-off
The 12% precision means you'll waste money on 17.6M users who won't convert. You need to adjust the probability threshold to balance marketing spend with conversions.
C. Model Retraining
Wait to collect more data before deploying—you need at least 90% accuracy for a production model.
D. A/B Testing Setup
Just show the offers to a random sample first to test if they work, ignoring the model predictions.
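
One way to pressure-test the "send offers to all 20M" plan is to sweep the probability threshold and compare expected offer spend against expected conversion value. The sketch below uses the stated $2 cost and $60/year value, but the propensity scores and conversion labels are synthetic, so treat the output as illustrative only.

```python
# Hedged sketch: sweeping the decision threshold to balance offer cost vs. conversion value.
# Only the $2 cost and $60/year value come from the case; everything else is simulated.
import numpy as np

rng = np.random.default_rng(1)
n_users = 1_000_000                     # scaled-down free user base for illustration
true_convert = rng.binomial(1, 0.08, n_users)   # ~8% baseline conversion
# Synthetic, imperfect propensity scores: converters tend to score higher.
scores = np.clip(rng.normal(0.25 + 0.3 * true_convert, 0.2), 0, 1)

COST_PER_OFFER = 2.0
VALUE_PER_CONVERSION = 60.0

for threshold in (0.3, 0.4, 0.5, 0.6, 0.7):
    targeted = scores >= threshold
    n_offers = targeted.sum()
    n_conversions = (targeted & (true_convert == 1)).sum()
    profit = n_conversions * VALUE_PER_CONVERSION - n_offers * COST_PER_OFFER
    precision = n_conversions / n_offers if n_offers else 0.0
    print(f"threshold={threshold:.1f}  offers={n_offers:>7}  "
          f"precision={precision:.2f}  expected profit=${profit:,.0f}")
```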

Case 4: Cleveland Clinic - Predicting Hospital Readmissions

The Situation

Cleveland Clinic faces penalties from Medicare for high readmission rates. Your model identifies patients at 70%+ risk of returning within 30 days. The COO asks: "Should we just keep them longer?" The nursing staff says: "We're already overwhelmed."

Your Mission: Navigate the change management and operational reality.

The Numbers

  • Annual discharges: 45,000
  • Current readmission rate: 17.8%
  • Medicare penalty: $2.4M/year
  • Avg readmission cost: $15,000
  • High-risk patients identified: 6,000/year
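
To size the opportunity before arguing about who does the work, a rough back-of-envelope helps. The sketch below uses the case figures plus two invented inputs, the intervention's effectiveness and its per-patient cost, so swap in real estimates before acting on it.

```python
# Back-of-envelope value of an intervention program, using the case numbers plus
# two assumed inputs: how much the intervention cuts readmissions, and what it costs.
HIGH_RISK_PATIENTS = 6_000          # identified per year (from the case)
BASELINE_READMIT_RATE = 0.70        # model flags patients at 70%+ risk (from the case)
COST_PER_READMISSION = 15_000       # from the case
MEDICARE_PENALTY = 2_400_000        # per year, from the case

ASSUMED_RELATIVE_REDUCTION = 0.20   # hypothetical: intervention prevents 20% of readmissions
COST_PER_INTERVENTION = 500         # hypothetical: discharge planning + home visit per patient

prevented = HIGH_RISK_PATIENTS * BASELINE_READMIT_RATE * ASSUMED_RELATIVE_REDUCTION
savings = prevented * COST_PER_READMISSION
program_cost = HIGH_RISK_PATIENTS * COST_PER_INTERVENTION
print(f"Readmissions prevented: {prevented:,.0f}")
print(f"Gross savings:          ${savings:,.0f}")
print(f"Program cost:           ${program_cost:,.0f}")
print(f"Net (before any change in the ${MEDICARE_PENALTY:,} penalty): "
      f"${savings - program_cost:,.0f}")
```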

Question: What's the real challenge here?

A. Model Accuracy
The model needs to be better than 70% before anyone should act on it.
B. Change Management
This isn't a technical problem—it's operational. You need to design interventions (discharge planning, home visits, medication management) that staff can actually execute, then measure if they reduce readmissions.
C. Data Quality
You need more historical patient data before building reliable predictions.
D. Legal Compliance
Focus on making sure the model meets HIPAA regulations before deployment.

Case 5: Target - The Pregnancy Prediction Controversy

The Situation (Real Story!)

Target built a model that predicted pregnancy using purchase patterns (unscented lotion, supplements, large purses). Their model worked SO WELL that they sent baby coupons to a teenager. Her father complained to Target... until he learned his daughter actually was pregnant.

Your Mission: You're the analytics lead. What went wrong?

Model Performance

  • Prediction Accuracy: 87%
  • Estimated Revenue Lift: $50M/year
  • Customer Complaints: 247
  • PR Crisis Cost: Priceless

Question: What's the key lesson about model deployment?

A. Privacy Laws
Target should have obtained explicit consent before using purchase data for predictions.
B. Model Ethics & Business Context
Being technically right doesn't mean you should deploy. You need to consider ethics, customer perception, and potential backlash. Maybe bury the baby coupons among unrelated offers, or use the predictions for inventory planning instead.
C. Prediction Accuracy
The model wasn't accurate enough—87% isn't good enough for such sensitive predictions.
D. Segmentation Issues
Target should have built separate models for different age groups instead of one universal model.

Case Closed! 🎉


Key Takeaways

Technical skills are just the beginning. Choosing the right algorithm matters, but understanding business context matters more.

Accuracy isn't everything. A 99% accurate model that predicts "no fraud" on every transaction is useless.

Deployment is where models meet reality. Organizational resistance, ethical considerations, and operational constraints often matter more than R-squared.

Change management is critical. You're not building models for fun—you're trying to change how businesses operate. That's the hard part.