Topic 3. Classification, Decision Trees, and k Nearest Neighbors
Here we delve into machine learning and discuss two simple approaches to the classification problem. In a real project, it is best to start with something simple, and decision trees or nearest neighbors (as well as linear models, the next topic) are often what you would try right after even simpler heuristics. We discuss the pros and cons of trees and nearest neighbors. We also touch upon the important topic of assessing the quality of model predictions and performing cross-validation. The article is long, but decision trees in particular deserve it: they form the foundation for Random Forest and Gradient Boosting, two algorithms that you will likely use most often in practice.
Steps in this block
Complete Bonus Assignment 3, where you’ll go through the math of decision trees, practice with scikit-learn’s implementation, and then implement the algorithm on your own, from scratch (optional, available under the Patreon “Bonus Assignments” tier).