Topic 10. Gradient Boosting

Gradient boosting is one of the most prominent machine learning algorithms and has many industrial applications. For instance, the Yandex search engine is a big and complex system with gradient boosting (MatrixNet) somewhere deep inside. Many recommender systems are also built on boosting. It is a very versatile approach, applicable to classification, regression, and ranking. Therefore, here we cover both the theoretical basics of gradient boosting and the specifics of its most widespread implementations: XGBoost, LightGBM, and CatBoost.

Steps in this block

1. Read the article (also available as a Kaggle Notebook);

2. Watch a video lecture on gradient boosting, which comes in 2 parts:

  • the theoretical part covers fundamental ideas behind gradient boosting;

  • the practical part reviews key ideas behind the major implementations: XGBoost, LightGBM, and CatBoost;

3. Complete demo assignment 10, where you’ll beat baselines in the Kaggle “Flight delays” competition, starting from a provided CatBoost starter;

4. Complete Bonus Assignment 10, where you’ll implement gradient boosting from scratch (optional, available under the Patreon “Bonus Assignments” tier).