iPython Notebooks

Expository notebooks

Here is a collection of Jupyter (IPython) notebooks I've created to solidify my understanding of ideas from my studies of probability, statistics, and machine learning.

Kaggle notebooks

I've attempted a few Kaggle competitions; here are the relevant notebooks:

Titanic: Machine Learning from Disaster
- Attempt 1: quick and dirty first attempt with a few models
- Attempt 2: more exploration, feature engineering
Forest Cover Type Prediction
- Attempt 1: quick and dirty first attempt with a few models, also using PCA
- Attempt 2: pipelines, deeper performance analysis with k-fold cross validation and learning curves, hyperparameter tuning
Predicting Red Hat Business Value
- Attempt 1: quick first attempt using a routine preprocessing pipeline for categorical and quantitative variables, with logistic regression and random forest models. Ignores categorical variables whose thousands of unique values make one-hot encoding impractical.
- Attempt 2: exploring ways of including categorical variables that have thousands of unique values (ordinal encoding, a mix of one-hot and binary encoding)
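
As a rough illustration of the idea in the first Red Hat attempt (one-hot encode the categorical variables, but skip any with too many unique values), here is a minimal pure-Python sketch. It is not taken from the notebooks: the helper names `low_cardinality_columns` and `one_hot_encode`, the `max_unique` threshold, and the toy rows are all hypothetical and chosen only for illustration.

```python
def low_cardinality_columns(rows, categorical, max_unique=50):
    """Return the categorical columns with at most max_unique distinct values."""
    keep = []
    for col in categorical:
        distinct = {row[col] for row in rows}
        if len(distinct) <= max_unique:
            keep.append(col)
    return keep


def one_hot_encode(rows, columns):
    """One-hot encode the given columns; returns (feature_names, matrix)."""
    # Fix a sorted category order per column so the output is deterministic.
    categories = {col: sorted({row[col] for row in rows}) for col in columns}
    names = [f"{col}={value}" for col in columns for value in categories[col]]
    matrix = [
        [1 if row[col] == value else 0
         for col in columns
         for value in categories[col]]
        for row in rows
    ]
    return names, matrix


# Toy data (hypothetical): "group" has few values, "person_id" is unique per row.
rows = [
    {"group": "a", "person_id": "p1"},
    {"group": "b", "person_id": "p2"},
    {"group": "a", "person_id": "p3"},
]

# With a threshold of 2, "person_id" (3 distinct values) is dropped.
kept = low_cardinality_columns(rows, ["group", "person_id"], max_unique=2)
names, matrix = one_hot_encode(rows, kept)
```

Each distinct value becomes its own column, so a variable with thousands of values would add thousands of mostly-zero columns. That is the motivation for filtering by cardinality first, and for the alternative encodings explored in the second attempt.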

KarlRosaen

Expository notebooks

Expectation Maximization with Coin Flips (Thu, 12/22)

Simulating Random Variables with Inverse Transform Sampling (Thu, 6/9)

The Sigmoid Function in Logistic Regression (Mon, 5/16)

The Birthday Problem Simulated (Fri, 4/29)

NBA Team Net Ratings (Thu, 3/17)
