Mathematics for Data Science

Overwhelmed by the search for resources to understand the math behind data science and machine learning? We’ve got you covered.

Motivation

Learning the theoretical background for data science or machine learning can be a daunting experience, as it involves multiple fields of mathematics and a long list of online resources.

In this piece, my goal is to suggest resources for building the mathematical background you need to get up and running in practical or research work in data science. These suggestions come from my own experience in the field, and from following the latest resources recommended by the community.

However, if you are a beginner in machine learning and looking to get a job in industry, I don’t recommend studying all the math before you start doing practical work. This bottom-up approach is counterproductive, and you’ll get discouraged because you started with the theory (dull?) before the practice (fun!).

My advice is to do it the other way around (the top-down approach): learn how to code, learn how to use the PyData stack (Pandas, sklearn, Keras, etc.), and get your hands dirty building real-world projects with the help of library documentation and YouTube/Medium tutorials. THEN you’ll start to see the bigger picture and notice the gaps in your theoretical background. At that moment, studying math to actually understand how those algorithms work will make much more sense to you!

Here’s an article by the awesome fast.ai team supporting the top-down learning approach:

Providing a Good Education in Deep Learning · fast.ai

And another one by Jason Brownlee, from his gold mine of a blog, “Machine Learning Mastery”:

You’re Doing it Wrong. Why Machine Learning Does Not Have to Be So Hard

Resources

I will divide the resources into three sections: Linear Algebra, Calculus, and Statistics and Probability. The resources are listed in no particular order, and they are a mix of video tutorials, books, blogs, and online courses.

Linear Algebra

Used in machine learning (and deep learning) to understand how algorithms work under the hood. Basically, it’s all about vector/matrix/tensor operations; no black magic is involved!

  1. Khan Academy Linear Algebra series (beginner friendly).
  2. Coding the Matrix course (and book).
  3. 3Blue1Brown Linear Algebra series.
  4. fast.ai Linear Algebra for coders course, highly related to modern ML workflow.
  5. First course in Coursera Mathematics for Machine Learning specialization.
  6. “Introduction to Applied Linear Algebra — Vectors, Matrices, and Least Squares” book.
  7. MIT Linear Algebra course, highly comprehensive.
  8. Stanford CS229 Linear Algebra review.
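
To make the “no black magic” point concrete, here is a minimal sketch of what one dense neural network layer boils down to: a matrix-vector product plus a bias. The function names (`matvec`, `dense_layer`) and all the numbers are made up for illustration, and real libraries do the same thing with optimized array code rather than Python lists.

```python
# A tiny "dense layer": outputs = weights * inputs + bias, in plain Python.

def matvec(matrix, vector):
    """Multiply a matrix (a list of rows) by a vector (a list of numbers)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def dense_layer(weights, bias, inputs):
    """Affine transform: one dot product per output unit, plus a bias term."""
    return [z + b for z, b in zip(matvec(weights, inputs), bias)]

# Hypothetical values: 2 output units, 3 input features.
weights = [[1.0, 0.0, -1.0],
           [0.5, 0.5, 0.5]]
bias = [0.0, 1.0]
inputs = [2.0, 4.0, 6.0]

outputs = dense_layer(weights, bias, inputs)
print(outputs)  # [-4.0, 7.0]
```

Each output is just a weighted sum of the inputs, which is why a solid grip on dot products and matrix multiplication pays off so quickly.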

Calculus

Used in machine learning (and deep learning) to formulate and optimize the functions that algorithms are trained on to reach their objective, known as loss/cost/objective functions.

  1. Khan Academy Calculus series (beginner friendly).
  2. 3Blue1Brown Calculus series.
  3. Second course in Coursera Mathematics for Machine Learning specialization.
  4. The Matrix Calculus You Need For Deep Learning paper.
  5. MIT Single Variable Calculus.
  6. MIT Multivariable Calculus.
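
As a rough illustration of where calculus shows up, the sketch below fits a one-parameter model `y = w * x` by gradient descent on a mean squared error loss. The data, the learning rate, and the number of steps are assumptions I picked for the example, and the gradient is derived by hand from the loss rather than taken from any library.

```python
# Toy data generated from y = 2 * x, so the ideal weight is 2.0 (an assumption).
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

def loss(w):
    """Mean squared error of the model y = w * x on the toy data."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def grad(w):
    """Derivative of the loss with respect to w: (2/n) * sum((w*x - y) * x)."""
    return 2.0 * sum((w * x - y) * x for x, y in zip(xs, ys)) / len(xs)

w = 0.0               # initial guess
learning_rate = 0.05  # hypothetical step size, small enough to converge here

for _ in range(200):
    w -= learning_rate * grad(w)  # step against the gradient

print(round(w, 4), loss(w))
```

The derivative tells us which direction makes the loss smaller, and training is mostly repeating that step many times.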

Statistics and Probability

Used in data science to analyze and visualize data, in order to discover (infer) helpful insights.

  1. Khan Academy Statistics and Probability series (beginner friendly).
  2. Intro to Descriptive Statistics from Udacity.
  3. Intro to Inferential Statistics from Udacity.
  4. Statistics with R Specialization from Coursera.
  5. Stanford CS229 Probability Theory review.
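
For a first taste of the descriptive side, here is a small sketch using Python’s built-in `statistics` module. The sample values are invented for the example, and this only summarizes the data; it does not cover inference.

```python
import statistics

# Hypothetical sample, e.g. eight measurements of some quantity.
sample = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(sample)       # arithmetic average
median = statistics.median(sample)   # middle value of the sorted data
spread = statistics.pstdev(sample)   # population standard deviation

print(mean, median, spread)
```

Comparing the mean with the median is a quick way to spot skew, and the standard deviation tells you how far values typically sit from the mean.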

Bonus materials

  1. Part one of Deep Learning book.
  2. CMU Math Background for ML course.
  3. The Math of Intelligence playlist by Siraj Raval.

So, that was me giving away my carefully curated Math bookmarks folder for the common good! Hope that helps you expand your machine learning knowledge, and fight your fear of discovering what’s happening behind the scenes of your sklearn/keras/pandas import statements.

Your contributions are very welcome, whether by reviewing one of the listed resources or by adding new awesome ones.


Mathematics for Data Science was originally published in Towards Data Science on Medium.