Bonus Assignment 6. Beating baselines in the competition “How good is your Medium article?”


You can purchase a Bonus Assignments pack with the best non-demo versions of assignments. Select the “Bonus Assignments” tier.

Details of the deal

The course is still in self-paced mode, but we offer you Bonus Assignments with solutions for a contribution of $17/month. The idea is that you pay for ~1-5 months while studying the course materials, but a single contribution is still fine and opens your access to the bonus pack.

Note: the first payment is charged at the moment of joining the Patreon tier, and the next payment is charged on the 1st day of the next month, so it’s better to purchase the pack in the 1st half of the month.

The course is never supposed to go fully monetized (it’s created in the wonderful open community and will remain open and free), but the contributions would help to cover some operational costs, and Yury also put quite some effort into assembling all the best assignments into one pack. Please note that unlike the rest of the course content, Bonus Assignments are copyrighted. Informally, Yury’s fine if you share the pack with 2-3 friends, but public sharing of the Bonus Assignments pack is prohibited.

In this assignment, you’ll be challenged to beat a baseline in a competition where the goal is to predict the popularity of a Medium article. For this purpose, you’ll be provided with instructions on extracting features from raw JSON files, such as the title, author, and content, as well as some time-based features.
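
To give a feel for this kind of feature extraction, here is a minimal sketch in Python. It is not the assignment’s actual code: the record layout (the keys `title`, `author`, `content`, `published`), the sample values, and the helper names `strip_html` and `extract_features` are all assumptions made for illustration, and the real competition files may be structured differently.

```python
import json
import re
from datetime import datetime

# Hypothetical record: the real competition files may use a different schema.
SAMPLE_LINE = json.dumps({
    "title": "Why feature engineering matters",
    "author": "jane_doe",
    "content": "<p>Hello <b>Medium</b> readers!</p>",
    "published": "2017-03-04T21:15:00",
})

TAG_RE = re.compile(r"<[^>]+>")


def strip_html(text):
    """Crudely drop HTML tags and collapse runs of whitespace."""
    return " ".join(TAG_RE.sub(" ", text).split())


def extract_features(json_line):
    """Turn one JSON line into a flat dict of text and time-based features."""
    record = json.loads(json_line)
    published = datetime.fromisoformat(record["published"])
    content = strip_html(record["content"])
    return {
        "title": record["title"],
        "author": record["author"],
        "content": content,
        "content_words": len(content.split()),
        # Time-based features derived from the publication timestamp.
        "hour": published.hour,
        "weekday": published.weekday(),  # Monday is 0, Sunday is 6
        "is_weekend": int(published.weekday() >= 5),
    }


features = extract_features(SAMPLE_LINE)
print(features)
```

In practice you would likely apply such a function to every line of the raw files and then vectorize the text fields before training a model; the regex-based tag stripping here is deliberately crude and only meant to show the idea.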

This assignment is much closer to real-world Data Science, where you spend time fussing with JSONs to extract features and waiting while the model is being trained. At the same time, just like with the “Alice” competition (see Bonus Assignment 4), feature engineering is the key, and that’s fun.