🔍 Fraud Detective Challenge

Anomaly Detection: Team-Based In-Class Activity
👥 Teams of 3-4 | ⏱️ 25-30 min | 📊 Supervised + Unsupervised

🎯 Activity Goal

Your team is a fraud investigation unit. You'll analyze transaction data to identify anomalies, design your own fraud patterns, and try to stump other teams, all while practicing supervised and unsupervised anomaly detection thinking.

🗂️ Choose Your Scenario

Select the industry scenario your team wants to investigate:

  • 🏦 Banking: Credit card transactions with potential fraud
  • 🏥 Healthcare: Insurance claims with suspicious billing
  • 🎓 Campus: Student dining & access card swipes
  • 🛒 E-Commerce: Online orders with return/refund abuse

📖 How It Works

Round 1 - Detect (10 min): Examine 30 transactions. Flag the ones you think are anomalies. Some have labels (supervised clues), others don't, so you'll need to spot patterns on your own (unsupervised thinking).

Round 2 - Plant (5 min): Your team designs 3 fraudulent transactions to hide in a clean dataset. Make them clever enough to fool another team!

Round 3 - Swap & Hunt (10 min): Exchange planted datasets with another team. Find their hidden fraud. The harder your fakes are to detect, the more points you earn.


🔎 Round 1: Detect the Anomalies

Review the transactions below. Click "Flag" on any row you believe is anomalous. Look for unusual amounts, odd timing, location mismatches, or behavioral patterns that don't fit.

💡 Hint: The first 5 transactions have a "Risk Label" column (supervised). The rest don't, so you're on your own (unsupervised). Think about what features or rules you'd use.

# | Date/Time | Customer | Amount | Category | Location | Risk Label | Action

🎭 Round 2: Plant Your Fraud

Now you are the fraudster! Design 3 sneaky anomalous transactions that will blend into a normal dataset. The other team will try to catch them. Make them realistic but subtly off.

💡 Strategy Tips: Avoid obviously huge amounts. Think about subtle mismatches: a purchase at 3 AM in a different city, a category that doesn't match the customer profile, or a velocity anomaly (many transactions in a short window).
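
To make the "velocity anomaly" idea concrete, here is a minimal Python sketch that counts how many of one customer's transactions fall inside a sliding time window. The function name `max_velocity`, the 10-minute window, and the sample timestamps are all assumptions made for illustration; they are not part of the activity's dataset.

```python
from datetime import datetime, timedelta

def max_velocity(timestamps, window=timedelta(minutes=10)):
    """Return the largest number of transactions inside any sliding window."""
    times = sorted(timestamps)
    best = 0
    start = 0
    for end in range(len(times)):
        # Advance the left edge until the span is at most `window` long.
        while times[end] - times[start] > window:
            start += 1
        best = max(best, end - start + 1)
    return best

# Hypothetical swipes for a single customer (made-up data).
swipes = [
    datetime(2024, 3, 1, 12, 0),
    datetime(2024, 3, 1, 18, 30),
    datetime(2024, 3, 2, 3, 1),
    datetime(2024, 3, 2, 3, 4),
    datetime(2024, 3, 2, 3, 7),
    datetime(2024, 3, 2, 3, 9),
]
burst = max_velocity(swipes)
print(burst)
```

A team planting fraud can use the same idea in reverse: spread transactions out so that no window looks crowded, and the hunters will need a different feature to catch them.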

Planted Transaction #1

Planted Transaction #2

Planted Transaction #3


🔄 Round 3: Swap & Hunt

You've received another team's dataset with their 3 planted anomalies hidden among normal transactions. Work together to find all 3!

💡 Discussion Questions as You Hunt:
  • What unsupervised technique would help here? (Clustering? Isolation Forest? Z-score?)
  • If you had labels, what supervised model would you choose? (Logistic regression? Decision tree?)
  • What features would you engineer to catch these anomalies?
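
As one possible starting point for the first question, the z-score approach can be sketched in a few lines of Python. This is an illustration under assumptions, not the activity's official method: the function name `zscore_flags`, the threshold of 2.0, and the sample amounts are all made up.

```python
from statistics import mean, pstdev

def zscore_flags(amounts, threshold=2.0):
    """Return indices whose amount lies more than `threshold` std devs from the mean."""
    mu = mean(amounts)
    sigma = pstdev(amounts)
    if sigma == 0:
        # All amounts identical: nothing stands out.
        return []
    return [i for i, a in enumerate(amounts) if abs(a - mu) / sigma > threshold]

# Hypothetical transaction amounts (made-up data).
amounts = [12.5, 9.8, 14.2, 11.0, 13.4, 10.9, 250.0, 12.1, 9.5, 11.7]
flagged = zscore_flags(amounts)
print(flagged)
```

Note the limitation: a z-score on amount alone only catches unusually large or small values. A plant that follows the "avoid obviously huge amounts" tip would likely slip through, which is why timing, location, and category features matter.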

Record Your Findings

For each suspected anomaly, write which transaction and why:

🧠 Team Reflection (Discuss Together)

After finding the planted frauds, discuss:

  1. Supervised vs. Unsupervised: Which approach would work better for this dataset, and why?
  2. Feature Engineering: What new features (time-of-day bucket, amount z-score, velocity count) would improve detection?
  3. Real-World Application: Where in your career or industry could anomaly detection create the most value?
  4. False Positives vs. False Negatives: What's worse, flagging a good transaction or missing a fraud? How does the business context change your answer?
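
For the feature engineering question, the "time-of-day bucket" feature could be sketched as below. The bucket names and hour boundaries are assumptions chosen for illustration; a real team would pick boundaries that match how its customers actually behave.

```python
from datetime import datetime

def time_of_day_bucket(ts):
    """Map a timestamp to a coarse time-of-day label (boundaries are assumed)."""
    hour = ts.hour
    if hour < 6:
        return "night"
    if hour < 12:
        return "morning"
    if hour < 18:
        return "afternoon"
    return "evening"

# A hypothetical 3:01 AM transaction falls in the "night" bucket.
bucket = time_of_day_bucket(datetime(2024, 3, 2, 3, 1))
print(bucket)
```

A bucket like this turns a raw timestamp into a categorical feature, so either a hand-written rule or a supervised model can ask whether a customer who normally buys in the afternoon suddenly has night-time activity.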

🏆 Scoring & Leaderboard

  • Correct Flags (R1): 0
  • False Alarms (R1): 0
  • Missed Frauds (R1): 0
  • Total Points: 0

📊 Scoring Rubric

Action | Points | Notes
Correctly flag an anomaly (R1) | +10 | True Positive
False alarm: flag a normal transaction (R1) | -5 | False Positive penalty
Miss a real anomaly (R1) | -3 | False Negative
Other team fails to find your plant (R3) | +8 | Per undetected plant
Other team catches your plant (R3) | +2 | Good sportsmanship
Correct explanation of technique | +5 | Bonus for stating which method
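
The rubric arithmetic can be checked with a short Python sketch. The point values come from the table above; the function names, the example counts, and the choice to treat the technique bonus as a count (how often it is awarded is left to the instructor) are assumptions.

```python
def round1_points(true_positives, false_positives, false_negatives):
    """Round 1: +10 per correct flag, -5 per false alarm, -3 per missed anomaly."""
    return 10 * true_positives - 5 * false_positives - 3 * false_negatives

def round3_points(undetected_plants, caught_plants):
    """Round 3: +8 per plant the other team misses, +2 per plant they catch."""
    return 8 * undetected_plants + 2 * caught_plants

def total_points(tp, fp, fn, undetected, caught, technique_bonuses=0):
    """Sum both rounds plus +5 for each technique bonus awarded (assumed count)."""
    return (round1_points(tp, fp, fn)
            + round3_points(undetected, caught)
            + 5 * technique_bonuses)

# Hypothetical team: 4 correct flags, 2 false alarms, 1 miss,
# 2 undetected plants, 1 caught plant, 1 technique bonus.
example_total = total_points(tp=4, fp=2, fn=1, undetected=2, caught=1,
                             technique_bonuses=1)
print(example_total)
```

Because a false alarm costs more than a miss under this rubric (-5 versus -3), teams are nudged to flag only when they are fairly confident, which connects to the false positive versus false negative discussion.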

💼 Career Connection Rubric (Bonus Points)

Each team gets up to 10 bonus points for connecting anomaly detection to a real-world career scenario:

  • +3: Named a specific industry use case
  • +3: Identified which approach (supervised/unsupervised) fits and why
  • +4: Proposed a realistic feature engineering strategy