Roadmap

All activities that count toward the spring 2019 course rating are listed here.

Plan

  • Calendar and deadlines
  • Assignments
  • Kaggle Inclass Competition “Alice”
  • Kaggle Inclass Competition “Medium”
  • 2 more Kaggle competitions
  • Tutorials

Calendar and deadlines

Google Calendar with all deadlines.

Current deadlines (see also the assignments page):

  • February 24 - A1
  • March 10 - A2
  • March 17 - uploading solutions to Alice & Medium

All deadlines are 20:59 UTC (London time).
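Since every deadline is given as 20:59 UTC, it can help to convert it to your own clock. The snippet below is a minimal sketch using only the Python standard library; the function name `deadline_local` and the fixed-offset approach (no daylight-saving handling) are illustrative assumptions, not part of the course materials.

```python
from datetime import datetime, timedelta, timezone


def deadline_local(year, month, day, utc_offset_hours):
    """Return the 20:59 UTC deadline of the given date in a fixed-offset zone.

    `utc_offset_hours` is the caller's offset from UTC (e.g. 3 for UTC+3).
    Daylight saving time is NOT handled; pass the offset valid on that date.
    """
    deadline_utc = datetime(year, month, day, 20, 59, tzinfo=timezone.utc)
    local_zone = timezone(timedelta(hours=utc_offset_hours))
    return deadline_utc.astimezone(local_zone)


# A1 deadline (February 24, 2019) as seen from a UTC+3 clock
print(deadline_local(2019, 2, 24, 3).isoformat())  # 2019-02-24T23:59:00+03:00
```

For zones with daylight saving time, `zoneinfo.ZoneInfo` (Python 3.9+) is likely a better fit than a fixed offset.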

Assignments

Assignments are announced in the #mlcourse_ai channel of the ODS Slack team (pinned items). Links to fresh assignments are also provided in the Readme file of the course repository and on mlcourse.ai/assignments. Deadlines are typically on Sundays at 20:59 UTC. You can also practice with demo assignments, but don’t confuse them with the “real” ones. The rough plan for spring 2019 assignments is the following:

  • A1. Pandas and exploratory data analysis
  • A2. Beating baselines in Alice & Medium competitions
  • A3. Decision trees, random forest, and gradient boosting. Beating baseline in the “flight delays” competition
  • A4. Time series analysis with Python
  • A5. Vowpal Wabbit

If you get stuck on an assignment, check the course video lectures.

Kaggle Inclass Competition “Alice”

In the 1st competition, you’ll be solving the task of identifying a user from tracking data on the websites he or she visited. Let’s call this competition “Alice”, because we’ll be classifying whether a person is Alice (let’s say, an intruder) or somebody else (an innocent user). The competition is held in cooperation with Yandex and the MIPT specialization “Machine Learning and Data Analysis” (in Russian).

Rules:

  • Deadline for submissions: 2019 March 10th
  • You can make a maximum of 5 submissions a day, and the competition is individual (that is, 1 person per team; team merges are not allowed)
  • If you want to get credits, rename your team (of 1 person) so that it exactly matches your name in the course rating
  • If you overfitted and dropped several positions on the private LB, no offense, that’s life. Only the private LB is used to calculate final credits (that’s not true for baselines in A2; check the instructions therein)
  • Until March 17th, those who managed to beat all benchmarks must upload their reproducible solutions in .py format (a Python script) here; more details are provided in this post (dated March 5th) in the #mlcourse_ai_news channel
  • The results of the competition and the final course rating will be published on 2019 April 26th.

Scoring rules for competitions (Alice & Medium):

Necessary conditions:

  • Beat all Yorko’s baselines on the private LB
  • A reproducible solution must be submitted within the specified period
  • Team name must exactly correspond to the name in the course rating

Credits:

  • 1st place – 40 credits
  • 2nd place – 30 credits
  • 3rd place – 25 credits
  • 4th-10th place – 20 credits
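The scoring rules for Alice and Medium combine three necessary conditions with a place-based table. As a rough sketch of how they fit together (the function name `inclass_credits` and its boolean parameters are hypothetical, and returning 0 for places below 10th is an assumption, since the rules above only list the top 10):

```python
def inclass_credits(place, beat_all_baselines, solution_submitted, team_name_matches):
    """Credits for an Alice/Medium private-LB place, per the rules listed above."""
    # All three necessary conditions must hold, otherwise no credits are given.
    if not (beat_all_baselines and solution_submitted and team_name_matches):
        return 0
    if place == 1:
        return 40
    if place == 2:
        return 30
    if place == 3:
        return 25
    if 4 <= place <= 10:
        return 20
    return 0  # assumption: places outside the top 10 earn no place-based credits


print(inclass_credits(2, True, True, True))   # 30
print(inclass_credits(2, True, False, True))  # 0 (no reproducible solution)
```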

Kaggle Inclass Competition “Medium”

In this Kaggle Inclass competition, you are asked to predict the popularity (number of claps) of an article on Medium.com. The rules are the same as for the “Alice” competition; only the web form for uploading .py solution files is different.

Kaggle Inclass Competition “DotA 2 winner prediction”

In the 3rd competition you are going to utilize all knowledge and skills acquired during the course session.

Rules:

  • Deadline for submissions: 2019 April 21st
  • Teams of up to 4 people are allowed.
  • The results of the competition and the final course rating will be published on 2019 April 26th.

Credits

Grading is as close as possible to “real” Kaggle competitions; each team member gets the specified amount of credits.

  • 1st place – 80 credits
  • 2nd place – 60 credits
  • 3rd place – 50 credits
  • 4th-10th place – 40 credits
  • top 10% – 30 credits
  • top 20% – 15 credits
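The tiers above can be read as a lookup in which the first matching tier wins. The sketch below is only one possible interpretation: the function name `dota_credits` is hypothetical, and treating "top 10%" as `place <= 10% of the number of teams` is an assumption, since the exact cut-off rule is not stated here.

```python
def dota_credits(place, n_teams):
    """Per-member credits for a final place among n_teams (first matching tier wins)."""
    if place == 1:
        return 80
    if place == 2:
        return 60
    if place == 3:
        return 50
    if 4 <= place <= 10:
        return 40
    # Assumption: percentile tiers are measured against the total number of teams.
    # Integer arithmetic avoids floating-point rounding at the boundaries.
    if place * 10 <= n_teams:  # top 10%
        return 30
    if place * 5 <= n_teams:   # top 20%
        return 15
    return 0


# With 500 teams: place 50 is in the top 10%, place 100 is in the top 20%
print(dota_credits(50, 500))   # 30
print(dota_credits(100, 500))  # 15
```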

Other bonuses in the Dota 2 competition

  • The 5 most upvoted Kernels will be awarded 10 to 20 credits
  • The 5 most active Discussions participants will be awarded 5 to 15 credits (yes, this will be totally subjective)

Tutorials

We invite you to write and publish a tutorial on any ML/DS-related topic that is not fully covered in our course. Some example topics are given in a template, but you can choose any new one.

Rules:

  • Tutorials must be written in English. As for the programming language, only Python is allowed
  • It must be a tutorial (a reproducible Jupyter notebook, or more precisely a Kaggle Kernel; see “How to publish a tutorial” below), i.e., you should teach some skills, not just express your ideas on some theoretical concept
  • It must not be just a translation of somebody else’s material. You can borrow some material, but with fair links/citations
  • The prerequisite for reading your tutorial should be basic ML as taught in mlcourse.ai. Do not write in-depth articles about neural nets, probabilistic programming, the Bayesian approach, reinforcement learning, etc.; topics like these need a thorough approach. That is, sure, you can write articles like that, but not in the format of mlcourse.ai tutorials. On the other hand, it’s hardly worth making a tutorial too simple (e.g., picking some library and demonstrating only a couple of methods from it)
  • A typical tutorial should take 30-60 minutes to read and digest (exceptions are possible)
  • Check out these lists of published tutorials from previous runs of this course: English, Russian. Yes, the second list is in Russian, but Google Translate can still give you an idea of the topics that are already covered. For those who have already passed the course: translating somebody else’s (or your own) tutorial into English is definitely not going to work
  • Tutorial submission is due on 2019 April 22nd

How to publish a tutorial

  1. First of all, choose a unique topic and register it in the Google doc
  2. Create a Kaggle Kernel with your tutorial here (and of course do upvote all kernels that you find useful)
  3. Make sure your Kernel runs normally and produces no errors
  4. Then share the link to your Kernel in the #mlcourse_ai Slack channel with a short description of the tutorial. You also need to add the #tutorial tag. It’s obligatory; otherwise, the tutorial will be ignored.

For discussions, please stick to ODS Slack, channel #mlcourse_ai, pinned thread #tutorial.

Grading tutorials

Grading is done solely by other participants upvoting Kernels on Kaggle in the mlcourse.ai Kaggle Dataset. The best tutorials will get up to 40 credits. The exact grading formula will be provided later. Voting is finalised on 2019 April 26th.