The infamous slope/derivative — Intuition

Kruthika Kulkarni
May 16, 2021

It seems like you have finally decided that you will understand this term today at any cost. You must have seen it multiple times in backward propagation and wondered why we use the derivative in the weight updates. Good job at choosing this topic. It’s always good to get your basics right.

First things first, remember anything from school about derivatives? Your teacher must have said that the derivative of y with respect to x is the change in y for a small change in x. Does that ring a bell?

So, what does the above sentence mean?

Let’s take a simple equation: y = x.

When x = 1, y = 1. Now we need to see what happens when there is a small change in x. So, when x = 1.1 then y = 1.1. This implies that when x shifts by 0.1 (1.1 − 1), y changes by 0.1 (1.1 − 1). Therefore the change in y for a small change in x is given by 0.1 / 0.1 = 1.

For the above equation, the derivative is 1. Notice that y = 1*x.
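If you like seeing this in code, here is a tiny Python sketch (my own illustration, not part of the original walkthrough) that estimates the slope exactly the way we just did: nudge x a little and see how much y moves.

```python
# Numerically estimate the slope of f at x by nudging x by a small amount h
def slope(f, x, h=0.1):
    return (f(x + h) - f(x)) / h

# y = x: a change of 0.1 in x changes y by 0.1, so the slope is 1
print(slope(lambda x: x, 1.0))  # 1.0 (up to floating-point error)
```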

Let us take another example: y = x².

When x = 1, y = 1; when x = 1.01, y = 1.0201.

When x = 3, y = 9; when x = 3.01, y = 9.0601.

By the definition of the derivative/slope we gave, the change in y for a small change in x is (1.0201 − 1) / 0.01 = 2.01 ≈ 2·1 in the first case and (9.0601 − 9) / 0.01 = 6.01 ≈ 2·3 in the second. So, for the above equation, we can generalize that 2x is the derivative.
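The same nudge-and-measure trick, again as a small Python sketch of my own, reproduces the 2x pattern:

```python
# Reuse the finite-difference idea with the step of 0.01 used above
def slope(f, x, h=0.01):
    return (f(x + h) - f(x)) / h

def square(x):
    return x ** 2  # y = x^2

print(slope(square, 1.0))  # ~2.01, close to 2 * 1 = 2
print(slope(square, 3.0))  # ~6.01, close to 2 * 3 = 6
```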

We have now got a basic understanding of the slope.

However, why do we need it?

Consider the second equation again, y = x².

The minimum value y takes is 0, at x = 0. So the minimum of the equation is obtained at x = 0.

Note: in the paragraphs below, “minimum” refers to the value of x at which y is minimum.

The slope of y = x² is 2x. So if the x value is greater than the minimum (0), the slope is positive, and if the x value is less than the minimum (0), the slope is negative. Also, the farther the x value is from the minimum, the larger the absolute value of the slope.
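As a quick illustrative sketch (the values are just 2x evaluated at a few sample points), you can see the sign and size pattern:

```python
# Slope of y = x^2 is 2x; print it at a few points around the minimum (x = 0)
for x in [-3, -1, 0, 1, 3]:
    print(f"x = {x:>2}, slope = {2 * x:>3}")
# x < 0 gives a negative slope, x > 0 a positive one,
# and the farther x is from 0, the larger the absolute slope
```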

If x > minimum, subtracting a value proportional to the slope will take us closer to the minimum. As seen above, in this case slope > 0, so the updated value of x will be smaller than before and hence closer to the minimum.

If x < minimum, subtracting a value proportional to the slope still takes us closer to the minimum. As seen above, in this case slope < 0, so subtracting a negative quantity makes the updated value of x greater than before and hence closer to the minimum.

Note that choosing the right alpha (the learning rate, i.e. the constant of proportionality in the update) is crucial to the process described above.
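Putting the two cases together gives the update x ← x − alpha·slope, which is exactly the shape of a weight update in backward propagation. Here is a minimal sketch, assuming y = x², a starting point of x = 3, and a hand-picked alpha = 0.1:

```python
# Minimize y = x^2 by repeatedly stepping against the slope (gradient descent)
alpha = 0.1   # learning rate: the constant of proportionality in the update
x = 3.0       # start to the right of the minimum at x = 0

for step in range(50):
    grad = 2 * x          # slope of x^2 at the current x
    x = x - alpha * grad  # subtract a value proportional to the slope

print(x)  # very close to 0, the minimizer of y = x^2
```

With alpha = 0.1, each step shrinks x by a factor of 1 − 0.1·2 = 0.8, so after 50 steps x is essentially at the minimum; a much larger alpha could overshoot, and a much smaller one would crawl.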

I hope this article has given you a basic understanding of the slope/derivative and that it helps you see why we use it in weight updates during backward propagation.
