Maths behind Loss (weight update)

Karthik
4 min read · Oct 4, 2019

I think the toughest part of understanding AI/machine learning is building the intuition and a proper understanding of the loss function. And if you are a new entrant to machine learning, it can drive you crazy.

Note: I presume you know the basic machine learning concepts :)

I think there are two kinds of people when it comes to the loss function:
1) Those who simply accept that the loss function (specifically its gradients) updates the weights, and that this is how the machine learns a little more after each epoch.

2) Those who never get beyond the first epoch for their entire lifetime, i.e. people who sit back and keep wondering how exactly gradient descent helps in training a machine.

Actually, this post is for both sets of people who want to understand the maths behind the loss :) It takes some time to understand, but it is really worth it.
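Before diving into the resources, it helps to keep the destination in sight. The rule that all of the maths below builds towards is the ordinary gradient-descent weight update (written here in my own notation, with η as the learning rate and L as the loss):

```latex
w_{t+1} = w_t - \eta \, \nabla_{w} L(w_t)
```

In words: work out how the loss changes with respect to each weight, then nudge every weight a small step in the opposite direction.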

The idea of this post is to gather the best resources available on the web (mostly from Khan Academy and 3Blue1Brown) and present them in a structured format for beginners [like me :)].

Let’s start

Contents:

Below are the key modules required to understand the loss-function concepts. For more in-depth material, I have attached external links at the end of this post.

  1. Vectors
  2. Derivatives
  3. Gradient
  4. Directional derivative

Prerequisites:

a) https://www.mathsisfun.com/

I recommend going through this website once; it covers the basic maths we all studied in school. Highly recommended if you are an absolute beginner in maths.

Vectors:

  1. Vector Intro (https://www.khanacademy.org/math/linear-algebra/vectors-and-spaces/vectors/v/vector-introduction-linear-algebra)

This chapter of Khan Academy is enough to understand the basics of vectors.

2. Essence of linear algebra (https://www.youtube.com/playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE_ab)

3Blue1Brown’s explanation is a visual treat for understanding linear algebra concepts.
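Since everything a network does starts with vectors, here is a minimal sketch (my own toy numbers, not taken from the videos) of a single neuron’s inputs and weights stored as vectors and combined with a dot product:

```python
import numpy as np

# Toy example: a single neuron's inputs and weights, each stored as a vector.
x = np.array([0.5, -1.2, 3.0])   # input vector
w = np.array([0.8, 0.1, -0.4])   # weight vector
b = 0.2                          # bias (a scalar)

# The neuron's pre-activation is just the dot product of the two vectors plus the bias.
z = np.dot(w, x) + b
print(z)   # 0.8*0.5 + 0.1*(-1.2) + (-0.4)*3.0 + 0.2 = -0.72
```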

Calculus:

Alright, next comes calculus, which really tests our limits.

To begin with you can go through the 3Blue1Brown’s essence of calculus first:

https://www.youtube.com/playlist?list=PLZHQObOWTQDMsr9K-rj53DwVRMYO3t5Yr

It gives you an idea of what is happening behind the scenes, and even if you don’t understand all of it, that’s perfectly OK. I recommend going through the Khan Academy videos below and then revisiting the previous playlist:

https://www.khanacademy.org/math/calculus-1/cs1-limits-and-continuity

https://www.khanacademy.org/math/calculus-1/cs1-derivatives-definition-and-basic-rules
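To see what these videos are driving at, here is a small sketch (my own example) of the derivative as the slope over a tiny interval, checked against the answer from the power rule:

```python
# f is a made-up function; f_prime is its derivative from the power rule.
def f(x):
    return 3 * x**2 + 2 * x

def f_prime(x):
    return 6 * x + 2

x0, h = 1.5, 1e-6
numerical = (f(x0 + h) - f(x0)) / h   # slope of a tiny secant line
analytic = f_prime(x0)                # 6*1.5 + 2 = 11.0
print(numerical, analytic)            # both are (approximately) 11.0
```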

Multivariable calculus

Now we come to the second-to-last module, and this is where the logic of the weight update lives:

Go through the full list of videos in “Thinking about multivariable functions”

https://www.khanacademy.org/math/multivariable-calculus/thinking-about-multivariable-function

And “Derivatives of multivariable functions”

https://www.khanacademy.org/math/multivariable-calculus/multivariable-derivatives/partial-derivatives/v/partial-derivatives-introduction

By the end of these videos you will understand key concepts like the determinant, partial derivatives, etc.
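As a quick illustration (again my own toy example, not from the course), a partial derivative is just “nudge one weight while holding the others fixed”. For a tiny squared-error loss with two weights:

```python
# Squared-error loss for a single made-up training point (x1, x2, y).
def loss(w1, w2):
    x1, x2, y = 1.0, 2.0, 3.0
    prediction = w1 * x1 + w2 * x2
    return (prediction - y) ** 2

w1, w2, h = 0.5, 0.5, 1e-6
dL_dw1 = (loss(w1 + h, w2) - loss(w1, w2)) / h   # vary w1, hold w2 fixed
dL_dw2 = (loss(w1, w2 + h) - loss(w1, w2)) / h   # vary w2, hold w1 fixed
print(dL_dw1, dL_dw2)   # analytically: 2*(pred - y)*x1 = -3.0 and 2*(pred - y)*x2 = -6.0
```

The two numbers together form the gradient vector, which is exactly what the next section is about.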

Directional derivatives

And finally we come to the end with “Directional derivatives”, which is the heart of the weight update.

https://www.khanacademy.org/math/multivariable-calculus/multivariable-derivatives/gradient-and-directional-derivatives/v/gradient

If you are still not clear about directional derivatives, I recommend going through the list of videos below:

Gradients and Partial Derivatives : https://www.youtube.com/watch?v=GkB4vW16QHI

Gradient vs. Directional Derivative: https://www.youtube.com/watch?v=NomUbVmmyro

Gradient vs Directional derivative: https://www.youtube.com/watch?v=3xVMVT-2_t4

Directional derivative: https://www.youtube.com/watch?v=TNwHXWApyH4
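Putting it all together, here is a minimal gradient-descent sketch on the same toy loss (my own example, not from the videos). The gradient points in the direction of steepest increase of the loss, so stepping against it is what “weight update” means in practice:

```python
# Toy loss L(w1, w2) = (w1*x1 + w2*x2 - y)^2 and its gradient, worked out by hand.
def loss_and_grad(w1, w2):
    x1, x2, y = 1.0, 2.0, 3.0
    error = w1 * x1 + w2 * x2 - y
    return error ** 2, (2 * error * x1, 2 * error * x2)

w1, w2 = 0.0, 0.0
learning_rate = 0.05
for step in range(50):
    current_loss, (g1, g2) = loss_and_grad(w1, w2)
    w1 -= learning_rate * g1   # step opposite the gradient
    w2 -= learning_rate * g2   # (the direction of steepest descent)
print(current_loss, w1, w2)    # the loss shrinks towards 0 as w1 + 2*w2 approaches 3
```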

Other resources:

http://cs229.stanford.edu/section/cs229-linalg.pdf

http://mathonline.wikidot.com/

https://machinelearningmastery.com/start-here/#linear_algebra

https://blog.ycombinator.com/learning-math-for-machine-learning/

https://gwthomas.github.io/docs/math4ml.pdf

Finally, if you think I have got something wrong, please leave a comment so that I can retrain myself.
