Tuesday, July 8, 2025

Data Science: Importance of Statistics

90% of students learn ML during grad school; only 10% master statistics. It's like putting the roof on before the walls.

Master these 7 statistics concepts and stand out in data science and product analytics interviews.

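1️⃣ P-value: If there were truly no effect, how likely is a result at least as extreme as the one you saw? For instance, if your resume A gets 2X the conversions of resume B with a P-value of 0.05, a gap that large or larger would show up about 5% of the time even if the two resumes performed the same. Note that this is not the probability that the result itself is due to chance. https://lnkd.in/g-nXm_SZ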
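To make the definition concrete, here is a minimal sketch of an exact one-sided test for the resume example. The counts (10 callbacks out of 50 for resume A, 5 out of 50 for resume B) and the function name are made up for illustration, and the hypergeometric (Fisher-style) test is just one reasonable choice for small counts.

```python
from math import comb


def one_sided_exact_p_value(success_a, total_a, success_b, total_b):
    """Probability, assuming A and B perform the same, that A would collect
    at least `success_a` of the pooled successes (hypergeometric tail)."""
    total = total_a + total_b
    successes = success_a + success_b
    denominator = comb(total, total_a)
    p_value = 0.0
    # Add up the probability of every outcome as extreme or more extreme.
    for k in range(success_a, min(successes, total_a) + 1):
        p_value += comb(successes, k) * comb(total - successes, total_a - k) / denominator
    return p_value


# Hypothetical data: resume A gets 10 callbacks from 50 applications,
# resume B gets 5 callbacks from 50 applications (a 2X difference).
p_value = one_sided_exact_p_value(10, 50, 5, 50)
print(f"one-sided p-value: {p_value:.3f}")
```

With counts this small, the p-value stays above 0.05 even though A has twice the callbacks, which is a good reminder that "2X" alone does not say much without the sample size.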


2️⃣ Power of a test: How likely are you to find an effect if it exists? Learn about P-hacking and how planning for enough power before you collect data helps you avoid it.
https://lnkd.in/gR32QvKW
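As a rough illustration of power, the sketch below approximates it for a two-sided two-proportion z-test using the normal approximation. The conversion rates (10% vs. 12%), the sample sizes, and the function name are assumptions picked for the example, not numbers from any real experiment.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(p_control, p_treatment, n_per_group, alpha=0.05):
    """Approximate power of a two-sided two-proportion z-test."""
    normal = NormalDist()
    z_critical = normal.inv_cdf(1 - alpha / 2)
    standard_error = sqrt(
        p_control * (1 - p_control) / n_per_group
        + p_treatment * (1 - p_treatment) / n_per_group
    )
    effect_in_se_units = abs(p_treatment - p_control) / standard_error
    # Chance the test statistic clears the critical value when the effect
    # is real (the tiny opposite-tail term is ignored in this sketch).
    return normal.cdf(effect_in_se_units - z_critical)


# Power grows with sample size for the same true effect.
for n in (500, 1000, 2000, 4000):
    print(n, round(approx_power(0.10, 0.12, n), 2))
```

Fixing the sample size from a calculation like this before the test starts is one common guard against stopping the moment the P-value happens to dip below 0.05.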

3️⃣ Z-test, T-test, and Chi-Square Test: I never paid attention to these during grad school, but they are fundamental tools for understanding outliers, statistical significance, and experimentation. https://lnkd.in/grfGUrFu
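Of the three, the chi-square test is the easiest to work out by hand. Here is a small sketch of a chi-square test of independence on a 2x2 table; the table values and function names are hypothetical, and the P-value shortcut used here only holds for one degree of freedom (a 2x2 table).

```python
from math import erfc, sqrt


def chi_square_statistic(table):
    """Pearson chi-square statistic for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


def p_value_one_degree_of_freedom(statistic):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return erfc(sqrt(statistic / 2))


# Hypothetical counts. Rows: variant A, variant B. Columns: converted, did not convert.
table = [[10, 40], [5, 45]]
chi2 = chi_square_statistic(table)
p_value = p_value_one_degree_of_freedom(chi2)
print(f"chi-square = {chi2:.3f}, p-value = {p_value:.3f}")
```

For larger tables the statistic is computed the same way, but the P-value needs the chi-square distribution with (rows - 1) x (columns - 1) degrees of freedom, which a stats library can provide.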

4️⃣ Stats behind regression: Regression is the father of all machine learning in a way, so master the fundamentals. Here’s my favorite video: https://lnkd.in/gr5hPAz9
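For a feel of the stats behind regression, here is a minimal sketch of simple linear regression from scratch, including the standard error and t-statistic of the slope. The five data points and the function name are made up for illustration.

```python
from math import sqrt


def simple_linear_regression(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual and total sums of squares give R-squared.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1 - ss_res / ss_tot

    # Standard error of the slope uses n - 2 degrees of freedom.
    slope_se = sqrt(ss_res / (n - 2) / sxx)
    return {
        "slope": slope,
        "intercept": intercept,
        "r_squared": r_squared,
        "slope_se": slope_se,
        "t_stat": slope / slope_se,
    }


# Hypothetical data that is roughly y = 2x.
fit = simple_linear_regression([1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 7.8, 10.1])
print(fit)
```

The t-statistic is the slope divided by its standard error, which is where the T-test from the previous point shows up inside regression output.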

5️⃣ ROC AUC and PR AUC Curves: You can technically get 99% accuracy without a real model. If 99% of your examples are negative, always predicting "negative" does it. Crazy, right?
Understand why this happens, and learn which metrics are best suited for different use cases. These curves will make you a better model evaluator.
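The accuracy trap is easy to reproduce. The sketch below builds a hypothetical dataset with 10 positives in 1,000 examples and a "model" that always predicts negative, then compares accuracy, recall, and a pairwise ROC AUC. The data and helper name are assumptions for illustration.

```python
def roc_auc(labels, scores):
    """ROC AUC as the share of positive/negative pairs ranked correctly (ties count half)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for pos_score in positives:
        for neg_score in negatives:
            if pos_score > neg_score:
                wins += 1.0
            elif pos_score == neg_score:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical imbalanced data: 10 positives, 990 negatives.
labels = [1] * 10 + [0] * 990
predictions = [0] * len(labels)   # "No model": always predict negative.
scores = [0.0] * len(labels)      # The same constant score for everyone.

accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)
recall = sum(p == 1 and y == 1 for p, y in zip(predictions, labels)) / sum(labels)
auc = roc_auc(labels, scores)

print(accuracy)  # 0.99
print(recall)    # 0.0
print(auc)       # 0.5
```

Accuracy looks great, but recall is zero and the ROC AUC sits at the coin-flip level of 0.5, which is why these curves matter on imbalanced data.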

6️⃣ Learn the math behind Decision Trees: Calculate entropy, Gini, and information gain for a small dataset. Ask GPT "give calculations with example for gini, entropy and information gain"
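If you would rather check the numbers yourself, here is a short sketch that computes entropy, Gini impurity, and information gain for one split. The 10-row dataset and the split are invented for the example.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy in bits."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity: chance of mislabeling a random item drawn from the node."""
    total = len(labels)
    return 1 - sum((c / total) ** 2 for c in Counter(labels).values())


def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child nodes."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted


# Hypothetical node with 5 "yes" and 5 "no" labels, split into two children.
parent = ["yes"] * 5 + ["no"] * 5
left = ["yes"] * 4 + ["no"] * 1
right = ["yes"] * 1 + ["no"] * 4

print(f"parent entropy:   {entropy(parent):.3f}")
print(f"parent Gini:      {gini(parent):.3f}")
print(f"left child Gini:  {gini(left):.3f}")
print(f"information gain: {information_gain(parent, [left, right]):.3f}")
```

A 50/50 node has the maximum entropy of 1 bit and a Gini of 0.5; the split makes both children purer, so the information gain is positive.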

7️⃣ Run an A/B test: Work with a startup or ask GPT for sample A/B test data. Calculate the minimum sample size, Z-score, and P-value.
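Here is one way those three calculations could look, as a sketch using the standard normal-approximation formulas. The baseline rate, target rate, conversion counts, and function names are all hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist

NORMAL = NormalDist()


def min_sample_size_per_group(p_baseline, p_target, alpha=0.05, power=0.80):
    """Minimum users per variant to detect p_baseline -> p_target (two-sided test)."""
    z_alpha = NORMAL.inv_cdf(1 - alpha / 2)
    z_power = NORMAL.inv_cdf(power)
    variance = p_baseline * (1 - p_baseline) + p_target * (1 - p_target)
    return ceil((z_alpha + z_power) ** 2 * variance / (p_target - p_baseline) ** 2)


def two_proportion_z_test(conversions_a, n_a, conversions_b, n_b):
    """Z-score and two-sided P-value for the difference in conversion rates."""
    rate_a = conversions_a / n_a
    rate_b = conversions_b / n_b
    pooled = (conversions_a + conversions_b) / (n_a + n_b)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z_score = (rate_b - rate_a) / standard_error
    p_value = 2 * (1 - NORMAL.cdf(abs(z_score)))
    return z_score, p_value


# Hypothetical plan: baseline 10% conversion, hoping to detect a lift to 12%.
n_required = min_sample_size_per_group(0.10, 0.12)
print("users needed per variant:", n_required)

# Hypothetical results: 400/4000 conversions for A, 480/4000 for B.
z_score, p_value = two_proportion_z_test(400, 4000, 480, 4000)
print(f"z = {z_score:.2f}, p-value = {p_value:.4f}")
```

Run the sample-size step first and only then look at the Z-score and P-value, so the stopping point is not chosen after peeking at the data.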
Statistics comes up in interviews far more often than I ever imagined.

StatQuest is a really cool place to learn about statistics! https://lnkd.in/gPS9yubM


