# Topic 1. Exploratory data analysis with Pandas

Diving into machine learning and seeing the math in action is an exciting prospect. In real-world projects, however, a large share of the time, often estimated at around 70-80%, goes into preparing and cleaning the data. This is where Pandas proves its value; I use it daily in my work. This article outlines the essential Pandas methods for preliminary data analysis. We will then analyze a dataset on telecom customer churn and try to predict churn using common sense alone (and Pandas, of course), without training any model. Don’t underestimate the power of such an approach.
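To give a feel for what "predicting with common sense and Pandas" can look like, here is a minimal sketch. It uses a tiny made-up table rather than the real churn dataset, and the column names (`international_plan`, `service_calls`, `churn`) as well as the threshold in the rule are assumptions chosen purely for illustration.

```python
import pandas as pd

# Toy, made-up data: column names and values are illustrative assumptions,
# not taken from the telecom churn dataset analyzed in the article.
df = pd.DataFrame(
    {
        "international_plan": ["no", "yes", "no", "yes", "no", "no", "yes", "no"],
        "service_calls": [1, 5, 0, 4, 2, 6, 1, 3],
        "churn": [0, 1, 0, 1, 0, 1, 0, 0],
    }
)

# Preliminary look: table size and the overall share of churned customers.
n_rows, n_cols = df.shape
churn_rate = df["churn"].mean()

# Compare churn rates across groups to spot a feature that seems to matter.
churn_by_plan = df.groupby("international_plan")["churn"].mean()

# A hand-written rule in place of a trained model:
# flag customers with an international plan and many service calls.
rule_prediction = (
    (df["international_plan"] == "yes") & (df["service_calls"] > 3)
).astype(int)

# Share of rows where the rule agrees with the actual churn label.
rule_accuracy = (rule_prediction == df["churn"]).mean()

print(churn_by_plan)
print(f"Overall churn rate: {churn_rate:.3f}")
print(f"Rule accuracy: {rule_accuracy:.3f}")
```

The point is the workflow, not the numbers: look at group-level rates, form a hypothesis, write it down as a rule, and check how often it agrees with the labels.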

## Steps in this block

1. Read the article “Exploratory data analysis with Pandas” (also available as a Kaggle Notebook);
2. Watch the video lecture “Pandas & Data Analysis” (optional);
3. Complete demo assignment 1 (also available as a Kaggle Notebook), in which you’ll explore demographic data from the UCI “Adult” dataset;
4. Check out the solution to the demo assignment (also available as a Kaggle Notebook; optional);
5. Complete Bonus Assignment 1, in which you’ll analyze the history of the Olympic Games with Pandas (optional, available under the Patreon “Bonus Assignments” tier).