Tuesday 16 October 2018

An intuitive explanation for the chain rule

The chain rule allows us to take the derivative of a composition of two functions. It has the form
$$(f\circ g)'(x)=f'(g(x))g'(x)$$

Intuitively, why should it be a product of the two functions' individual derivatives?

We can answer that by looking at first-order Taylor expansions.
If we wiggle \(x\) in \(g(x)\) by a small amount \(\delta\), we get
$$
g(x+\delta)\approx g(x)+g'(x)\delta
$$

That means if we wiggle \(x\) by \(\delta\), \(g\) gets wiggled by approximately \(g'(x)\delta\).
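As a quick numerical sanity check of this first-order approximation, here is a minimal sketch. The choice \(g(x)=x^2\), the point \(x=1.5\), and the helper name `linear_error` are assumptions made purely for illustration; they are not part of the argument above.

```python
# Illustrative choice of g (an assumption, not from the text): g(x) = x^2.
def g(x):
    return x ** 2


def g_prime(x):
    return 2 * x


def linear_error(x, delta):
    """Gap between g(x + delta) and the estimate g(x) + g'(x) * delta."""
    exact = g(x + delta)
    estimate = g(x) + g_prime(x) * delta
    return abs(exact - estimate)


# The gap should shrink much faster than delta itself as delta gets smaller.
for delta in (1e-1, 1e-2, 1e-3):
    print(delta, linear_error(1.5, delta))
```

For this particular \(g\), the gap is \(\delta^2\) up to floating-point rounding, which is why the linear term dominates for small wiggles.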

A function \(f(y)\) would change in a very similar way if we wiggle \(y\):
$$
f(y+\delta') \approx f(y) + f'(y)\delta'
$$

Now set \(y=g(x)\). We replace \(\delta'\) by \(g'(x)\delta\) because that is how much \(y\) is perturbed if we perturb \(x\) by \(\delta\).

Overall, we get the following, where the coefficient of \(\delta\) is exactly the product in the chain rule:
$$
f(g(x+\delta)) \approx f(g(x))+f'(g(x))g'(x)\delta
$$