Topic 5. Bagging and Random Forest


Yet again, both theory and practice are exciting. We discuss why the “wisdom of the crowd” works for machine learning models, and why an ensemble of several models works better than any of its individual members. In practice, we try out Random Forest (an ensemble of many decision trees) – a “default algorithm” in many tasks. We discuss in detail the numerous advantages of the Random Forest algorithm and its applications. No silver bullet, though: in some cases, linear models still work better and faster.
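To make the “wisdom of the crowd” claim concrete, here is a minimal sketch (not part of the course materials; the dataset and parameters are arbitrary choices) comparing a single decision tree with a Random Forest built from many such trees:

```python
# A minimal sketch illustrating the ensemble effect: a Random Forest
# (an ensemble of decision trees) typically beats a single decision tree.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data; parameters are illustrative only.
X, y = make_classification(n_samples=1000, n_features=20,
                           n_informative=10, random_state=17)

tree = DecisionTreeClassifier(random_state=17)
forest = RandomForestClassifier(n_estimators=100, random_state=17)

tree_acc = cross_val_score(tree, X, y, cv=5).mean()
forest_acc = cross_val_score(forest, X, y, cv=5).mean()
print(f"Single tree accuracy:   {tree_acc:.3f}")
print(f"Random Forest accuracy: {forest_acc:.3f}")
```

The forest averages out the individual trees’ errors, which is why its cross-validated accuracy is typically noticeably higher than that of any single tree.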

Steps in this block

1. Read 5 articles:

2. Watch a video lecture coming in 3 parts;

3. Complete demo assignment 5 (also available as a Kaggle Notebook), where you compare logistic regression and Random Forest in the credit scoring problem;

4. Check out the solution (also available as a Kaggle Notebook) to the demo assignment (optional);

5. Complete Bonus Assignment 5, where you’ll apply logistic regression and Random Forest in two different tasks – great for understanding the application scenarios of these two extremely popular algorithms (optional, available under the Patreon “Bonus Assignments” tier).
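The kind of comparison run in the demo assignment can be sketched as follows. This is a hedged illustration, not the assignment itself: the data here is synthetic (with class imbalance, as is typical of credit scoring), while the actual assignment uses a real credit scoring dataset.

```python
# Sketch: comparing logistic regression and Random Forest on an
# imbalanced binary classification task (a stand-in for credit scoring).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic imbalanced data; 10% positive class mimics rare defaults.
X, y = make_classification(n_samples=2000, n_features=15, n_informative=8,
                           weights=[0.9, 0.1], random_state=17)

# Logistic regression benefits from feature scaling; trees do not need it.
logit = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
forest = RandomForestClassifier(n_estimators=100, random_state=17)

# ROC AUC handles class imbalance better than plain accuracy.
logit_auc = cross_val_score(logit, X, y, cv=5, scoring="roc_auc").mean()
forest_auc = cross_val_score(forest, X, y, cv=5, scoring="roc_auc").mean()
print(f"Logistic regression ROC AUC: {logit_auc:.3f}")
print(f"Random Forest ROC AUC:       {forest_auc:.3f}")
```

Which model wins depends on the data: as the section notes, linear models can still be better and faster in some cases, which is exactly what the assignment asks you to investigate.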