The Chain Rule

From Department of Mathematics at UTSA
Revision as of 22:15, 16 September 2021 by Khanh (talk | contribs) (Created page with "In calculus, the '''chain rule''' is a formula that expresses the derivative of the composition of two differentiable functions {{Mvar|f}} and {{Mvar|g}} in terms of the deriv...")
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to navigation Jump to search

In calculus, the chain rule is a formula that expresses the derivative of the composition of two differentiable functions f and g in terms of the derivatives f and g. More precisely, if Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle h=f\circ g} is the function such that for every x, then the chain rule is, in Lagrange's notation,

or, equivalently,

Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle h'=(f\circ g)'=(f'\circ g)\cdot g'.}

The chain rule may also be expressed in Leibniz's notation. If a variable z depends on the variable y, which itself depends on the variable x (that is, y and z are dependent variables), then z depends on x as well, via the intermediate variable y. In this case, the chain rule is expressed as

Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \frac{dz}{dx} = \frac{dz}{dy} \cdot \frac{dy}{dx},}

and

Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \left.\frac{dz}{dx}\right|_{x} = \left.\frac{dz}{dy}\right|_{y(x)} \cdot \left. \frac{dy}{dx}\right|_{x} ,}

for indicating at which points the derivatives have to be evaluated.

In integration, the counterpart to the chain rule is the substitution rule.

Intuitive explanation

Intuitively, the chain rule states that knowing the instantaneous rate of change of z relative to y and that of y relative to x allows one to calculate the instantaneous rate of change of z relative to x as the product of the two rates of change.

As put by George F. Simmons: "if a car travels twice as fast as a bicycle and the bicycle is four times as fast as a walking man, then the car travels 2 × 4 = 8 times as fast as the man."

The relationship between this example and the chain rule is as follows. Let z, y and x be the (variable) positions of the car, the bicycle, and the walking man, respectively. The rate of change of relative positions of the car and the bicycle is Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\textstyle \frac {dz}{dy}=2.} Similarly, Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\textstyle \frac {dy}{dx}=4.} So, the rate of change of the relative positions of the car and the walking man is

Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \frac{dz}{dx}=\frac{dz}{dy}\cdot\frac{dy}{dx}=2\cdot 4=8.}

The rate of change of positions is the ratio of the speeds, and the speed is the derivative of the position with respect to the time; that is,

Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \frac{dz}{dx}=\frac \frac{dz}{dt}\frac{dx}{dt},}

or, equivalently,

Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \frac{dz}{dt}=\frac{dz}{dx}\cdot \frac{dx}{dt},}

which is also an application of the chain rule.

Statement

The simplest form of the chain rule is for real-valued functions of one real variable. It states that if g is a function that is differentiable at a point c (i.e. the derivative g′(c) exists) and f is a function that is differentiable at g(c), then the composite function Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle f\circ g} is differentiable at c, and the derivative is

Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle (f\circ g)'(c) = f'(g(c))\cdot g'(c). }

The rule is sometimes abbreviated as

Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle (f\circ g)' = (f'\circ g) \cdot g'.}

If y = f(u) and u = g(x), then this abbreviated form is written in Leibniz notation as:

Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx}.}

The points where the derivatives are evaluated may also be stated explicitly:

Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \left.\frac{dy}{dx}\right|_{x=c} = \left.\frac{dy}{du}\right|_{u = g(c)} \cdot \left.\frac{du}{dx}\right|_{x=c}.}

Carrying the same reasoning further, given n functions Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle f_1, \ldots, f_n\!} with the composite function Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle f_1 \circ ( f_2 \circ \cdots (f_{n-1} \circ f_n) )\!} , if each function Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle f_i\!} is differentiable at its immediate input, then the composite function is also differentiable by the repeated application of Chain Rule, where the derivative is (in Leibniz's notation):

Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \frac{df_1}{dx} = \frac{df_1}{df_2}\frac{df_2}{df_3}\cdots\frac{df_n}{dx}.}

Proofs

First proof

One proof of the chain rule begins with the definition of the derivative:

Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle (f \circ g)'(a) = \lim_{x \to a} \frac{f(g(x)) - f(g(a))}{x - a}.}

Assume for the moment that Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle g(x)\!} does not equal Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle g(a)} for any x near a. Then the previous expression is equal to the product of two factors:

Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \lim_{x \to a} \frac{f(g(x)) - f(g(a))}{g(x) - g(a)} \cdot \frac{g(x) - g(a)}{x - a}.}

If Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle g} oscillates near a, then it might happen that no matter how close one gets to a, there is always an even closer x such that g(x) = g(a). For example, this happens near a = 0 for the continuous function g defined by g(x) = 0 for x = 0 and g(x) = x2 sin(1/x) otherwise. Whenever this happens, the above expression is undefined because it involves division by zero. To work around this, introduce a function Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle Q} as follows:

Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle Q(y) = \begin{cases} \displaystyle\frac{f(y) - f(g(a))}{y - g(a)}, & y \neq g(a), \\ f'(g(a)), & y = g(a). \end{cases}}

We will show that the difference quotient for fg is always equal to:

Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle Q(g(x)) \cdot \frac{g(x) - g(a)}{x - a}.}

Whenever g(x) is not equal to g(a), this is clear because the factors of g(x) − g(a) cancel. When g(x) equals g(a), then the difference quotient for fg is zero because f(g(x)) equals f(g(a)), and the above product is zero because it equals f′(g(a)) times zero. So the above product is always equal to the difference quotient, and to show that the derivative of fg at a exists and to determine its value, we need only show that the limit as x goes to a of the above product exists and determine its value.

To do this, recall that the limit of a product exists if the limits of its factors exist. When this happens, the limit of the product of these two factors will equal the product of the limits of the factors. The two factors are Q(g(x)) and (g(x) − g(a)) / (xa). The latter is the difference quotient for g at a, and because g is differentiable at a by assumption, its limit as x tends to a exists and equals g′(a).

As for Q(g(x)), notice that Q is defined wherever f is. Furthermore, f is differentiable at g(a) by assumption, so Q is continuous at g(a), by definition of the derivative. The function g is continuous at a because it is differentiable at a, and therefore Qg is continuous at a. So its limit as x goes to a exists and equals Q(g(a)), which is f′(g(a)).

This shows that the limits of both factors exist and that they equal f′(g(a)) and g′(a), respectively. Therefore, the derivative of fg at a exists and equals f′(g(a))g′(a).

Second proof

Another way of proving the chain rule is to measure the error in the linear approximation determined by the derivative. This proof has the advantage that it generalizes to several variables. It relies on the following equivalent definition of differentiability at a point: A function g is differentiable at a if there exists a real number g′(a) and a function ε(h) that tends to zero as h tends to zero, and furthermore

Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle g(a + h) - g(a) = g'(a) h + \varepsilon(h) h.}

Here the left-hand side represents the true difference between the value of g at a and at a + h, whereas the right-hand side represents the approximation determined by the derivative plus an error term.

In the situation of the chain rule, such a function ε exists because g is assumed to be differentiable at a. Again by assumption, a similar function also exists for f at g(a). Calling this function η, we have

Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle f(g(a) + k) - f(g(a)) = f'(g(a)) k + \eta(k) k.}

The above definition imposes no constraints on η(0), even though it is assumed that η(k) tends to zero as k tends to zero. If we set η(0) = 0, then η is continuous at 0.

Proving the theorem requires studying the difference f(g(a + h)) − f(g(a)) as h tends to zero. The first step is to substitute for g(a + h) using the definition of differentiability of g at a:

Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle f(g(a + h)) - f(g(a)) = f(g(a) + g'(a) h + \varepsilon(h) h) - f(g(a)).}

The next step is to use the definition of differentiability of f at g(a). This requires a term of the form f(g(a) + k) for some k. In the above equation, the correct k varies with h. Set kh = g′(a) h + ε(h) h and the right hand side becomes f(g(a) + kh) − f(g(a)). Applying the definition of the derivative gives:

Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle f(g(a) + k_h) - f(g(a)) = f'(g(a)) k_h + \eta(k_h) k_h.}

To study the behavior of this expression as h tends to zero, expand kh. After regrouping the terms, the right-hand side becomes:

Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle f'(g(a)) g'(a)h + [f'(g(a)) \varepsilon(h) + \eta(k_h) g'(a) + \eta(k_h) \varepsilon(h)] h.}

Because ε(h) and η(kh) tend to zero as h tends to zero, the first two bracketed terms tend to zero as h tends to zero. Applying the same theorem on products of limits as in the first proof, the third bracketed term also tends zero. Because the above expression is equal to the difference f(g(a + h)) − f(g(a)), by the definition of the derivative fg is differentiable at a and its derivative is f′(g(a)) g′(a).

The role of Q in the first proof is played by η in this proof. They are related by the equation:

Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle Q(y) = f'(g(a)) + \eta(y - g(a)). }

The need to define Q at g(a) is analogous to the need to define η at zero.

Third proof

Constantin Carathéodory's alternative definition of the differentiability of a function can be used to give an elegant proof of the chain rule.

Under this definition, a function f is differentiable at a point a if and only if there is a function q, continuous at a and such that f(x) − f(a) = q(x)(xa). There is at most one such function, and if f is differentiable at a then f ′(a) = q(a).

Given the assumptions of the chain rule and the fact that differentiable functions and compositions of continuous functions are continuous, we have that there exist functions q, continuous at g(a), and r, continuous at a, and such that,

Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle f(g(x))-f(g(a))=q(g(x))(g(x)-g(a))}

and

Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle g(x)-g(a)=r(x)(x-a).}

Therefore,

Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle f(g(x))-f(g(a))=q(g(x))r(x)(x-a),}

but the function given by h(x) = q(g(x))r(x) is continuous at a, and we get, for this a

Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle (f(g(a)))'=q(g(a))r(a)=f'(g(a))g'(a).}

A similar approach works for continuously differentiable (vector-)functions of many variables. This method of factoring also allows a unified approach to stronger forms of differentiability, when the derivative is required to be Lipschitz continuous, Hölder continuous, etc. Differentiation itself can be viewed as the polynomial remainder theorem (the little Étienne Bézout|Bézout theorem, or factor theorem), generalized to an appropriate class of functions.

Proof via infinitesimals

If Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle y=f(x)} and Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle x=g(t)} then choosing infinitesimal Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \Delta t\not=0} we compute the corresponding Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \Delta x=g(t+\Delta t)-g(t)} and then the corresponding Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \Delta y=f(x+\Delta x)-f(x)} , so that

Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \frac{\Delta y}{\Delta t}=\frac{\Delta y}{\Delta x} \frac{\Delta x}{\Delta t}}

and applying the standard part we obtain

Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \frac{d y}{d t}=\frac{d y}{d x} \frac{dx}{dt}}

which is the chain rule.

Resources

References

  1. George F. Simmons, Calculus with Analytic Geometry (1985), p. 93.
  2. Apostol, Tom (1974). Mathematical analysis (2nd ed.). Addison Wesley. Theorem 5.5.
  3. "Chain Rule for Derivative". Math Vault. 2016-06-05. Retrieved 2019-07-28.
  4. Kuhn, Stephen (1991). "The Derivative á la Carathéodory". The American Mathematical Monthly. 98 (1): 40–44. JSTOR 2324035.