Difference between revisions of "Chain Rule"

From Department of Mathematics at UTSA
The '''chain rule''' is a method to compute the derivative of the functional composition of two or more functions.

If a function <math>f</math> depends on a variable <math>u</math>, which in turn depends on another variable <math>x</math>, that is <math>f(x)=y\bigl(u(x)\bigr)</math>, then the rate of change of <math>f</math> with respect to <math>x</math> can be computed as the rate of change of <math>y</math> with respect to <math>u</math> multiplied by the rate of change of <math>u</math> with respect to <math>x</math>.

{| WIDTH="75%"
|-
| style="background-color: #FFFFFF; border: solid 1px #D6D6FF; padding: 1em;" valign="top" |
<center>'''Chain Rule'''<br>
If a function <math>f</math> is composed of two differentiable functions <math>y(u)</math> and <math>u(x)</math>, so that <math>f(x)=y\bigl(u(x)\bigr)</math>, then <math>f(x)</math> is differentiable and

<math>\frac{df}{dx}=\frac{dy}{du}\cdot\frac{du}{dx}</math>

</center>
|}

The method is called the "chain rule" because it can be applied sequentially to as many functions as are nested inside one another. For example, if <math>f</math> is a function of <math>g</math>, which is in turn a function of <math>h</math>, which is in turn a function of <math>x</math>, that is
:<math>f\bigl(g(h(x))\bigr)</math>
then the derivative of <math>f</math> with respect to <math>x</math> is given by
:<math>\frac{df}{dx}=\frac{df}{dg}\cdot\frac{dg}{dh}\cdot\frac{dh}{dx}</math>
and so on for longer chains.

A useful mnemonic is to think of the differentials as individual entities that can be canceled algebraically, such as
:<math>\frac{df}{dx}=\frac{df}{\cancel{dg}}\cdot\frac{\cancel{dg}}{\cancel{dh}}\cdot\frac{\cancel{dh}}{dx}</math>
However, keep in mind that this trick comes about through a clever choice of notation rather than through actual algebraic cancellation.

The chain rule has broad applications in physics, chemistry, and engineering, and is used to study related rates in many disciplines. It can also be generalized to multiple variables in cases where the nested functions depend on more than one variable.

==Examples==
===Example I===
Suppose that a mountain climber ascends at a rate of <math>0.5\frac{km}{h}</math> . The temperature is lower at higher elevations; suppose the rate by which it decreases is <math>6^\circ C</math> per kilometer. To calculate the decrease in air temperature per hour that the climber experiences, one multiplies <math>\frac{6^\circ C}{km}</math> by <math>0.5\frac{km}{h}</math> , to obtain <math>\frac{3^\circ C}{h}</math> . This calculation is a typical chain rule application.
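
The arithmetic in this example can be written out as a short script. This is only an illustrative sketch; the variable names are chosen here and do not come from the original text.

```python
# Rates taken from the climber example above.
climb_rate_km_per_h = 0.5   # dh/dt: how fast the climber gains height
lapse_rate_c_per_km = 6.0   # size of dT/dh: temperature drop per km of height

# Chain rule: dT/dt = (dT/dh) * (dh/dt). We track the size of the decrease.
cooling_rate_c_per_h = lapse_rate_c_per_km * climb_rate_km_per_h
print(cooling_rate_c_per_h)  # degrees Celsius lost per hour of climbing
```

The units behave like the differentials in the mnemonic: kilometers appear once in a numerator and once in a denominator, leaving degrees per hour.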

===Example II===
Consider the function <math>f(x)=(x^2+1)^3</math>. It follows from the chain rule that

{| cellpadding=5
|-
| <math>f(x)=(x^2+1)^3</math> || width="250pt" |Function to differentiate
|-
| <math>u(x)=x^2+1</math> || Define <math>u(x)</math> as the inside function
|-
| <math>f(x)=u(x)^3</math> || Express <math>f(x)</math> in terms of <math>u(x)</math>
|-
| <math>\frac{df}{dx}=\frac{df}{du}\cdot\frac{du}{dx}</math> || State the chain rule as it applies here
|-
| <math>\frac{df}{dx}=\frac{d}{du}u^3\cdot\frac{d}{dx}(x^2+1)</math> || Substitute in <math>f(u)</math> and <math>u(x)</math>
|-
| <math>\frac{df}{dx}=3u^2\cdot2x</math> || Compute derivatives with the power rule
|-
| <math>\frac{df}{dx}=3(x^2+1)^2\cdot 2x</math> || Substitute <math>u(x)</math> back in terms of <math>x</math>
|-
| <math>\frac{df}{dx}=6x(x^2+1)^2</math> || Simplify
|}
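
One way to sanity-check the result of Example II is to compare it with a numerical estimate of the derivative. The sketch below uses a central difference quotient; the helper name `central_difference`, the step size, and the sample points are assumptions made for illustration.

```python
def f(x):
    # The function from Example II.
    return (x**2 + 1) ** 3

def df_chain_rule(x):
    # The derivative obtained in the table: 6x(x^2 + 1)^2.
    return 6 * x * (x**2 + 1) ** 2

def central_difference(func, x, step=1e-6):
    # Symmetric difference quotient, a numerical estimate of func'(x).
    return (func(x + step) - func(x - step)) / (2 * step)

for x in (-2.0, 0.0, 0.5, 3.0):
    estimate = central_difference(f, x)
    exact = df_chain_rule(x)
    print(f"x={x:5.1f}  estimate={estimate:14.6f}  chain rule={exact:14.6f}")
```

The two columns should agree to several decimal places. A numerical check like this does not prove the formula, but it catches most algebra slips.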

===Example III===
In order to differentiate the trigonometric function
:<math>f(x)=\sin(x^2)</math>
one can write:

{| cellpadding=5
|-
| <math>f(x)=\sin(x^2)</math> || width="250pt" |Function to differentiate
|-
| <math>u(x)=x^2</math> || Define <math>u(x)</math> as the inside function
|-
| <math>f(x)=\sin(u)</math> || Express <math>f(x)</math> in terms of <math>u(x)</math>
|-
| <math>\frac{df}{dx}=\frac{df}{du}\cdot\frac{du}{dx}</math> || State the chain rule as it applies here
|-
| <math>\frac{df}{dx}=\frac{d}{du}\sin(u)\cdot\frac{d}{dx}(x^2)</math> || Substitute in <math>f(u)</math> and <math>u(x)</math>
|-
| <math>\frac{df}{dx}=\cos(u)\cdot 2x</math> || Evaluate derivatives
|-
| <math>\frac{df}{dx}=\cos(x^2)\cdot 2x</math> || Substitute <math>u</math> back in terms of <math>x</math>
|}
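
The same steps can be carried out mechanically by storing each quantity together with its derivative with respect to <math>x</math>. The following is a minimal sketch of that idea, often called dual numbers or forward-mode differentiation; the class `Dual` and the helpers `dual_sin` and `f_with_derivative` are hypothetical names introduced here and are not part of the original text.

```python
import math
from dataclasses import dataclass

@dataclass
class Dual:
    value: float  # value of the quantity
    deriv: float  # its derivative with respect to x

    def __mul__(self, other):
        # Product rule: (a*b)' = a'*b + a*b'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

def dual_sin(u):
    # Chain rule: d/dx sin(u) = cos(u) * du/dx
    return Dual(math.sin(u.value), math.cos(u.value) * u.deriv)

def f_with_derivative(x):
    seed = Dual(x, 1.0)   # dx/dx = 1
    u = seed * seed       # u = x^2, so du/dx = 2x
    return dual_sin(u)    # f = sin(u)

result = f_with_derivative(1.5)
expected = math.cos(1.5**2) * 2 * 1.5   # cos(x^2) * 2x from the table
print(result.deriv, expected)
```

Each operation only needs to know its own derivative rule; the chain rule is what lets those local rules be multiplied together.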

===Example IV: absolute value===
The chain rule can be used to differentiate <math>|x|</math>, the absolute value function, at any <math>x\ne0</math>:

{| cellpadding=5
|-
| <math>f(x)=|x|</math> || width="250pt" |Function to differentiate
|-
| <math>f(x)=\sqrt{x^2}</math> || Equivalent function
|-
| <math>u(x)=x^2</math> || Define <math>u(x)</math> as the inside function
|-
| <math>f(x)=u(x)^\frac12</math> || Express <math>f(x)</math> in terms of <math>u(x)</math>
|-
| <math>\frac{df}{dx}=\frac{df}{du}\cdot\frac{du}{dx}</math> || State the chain rule as it applies here
|-
| <math>\frac{df}{dx}=\frac{d}{du}u^\frac12\cdot\frac{d}{dx}(x^2)</math> || Substitute in <math>f(u)</math> and <math>u(x)</math>
|-
| <math>\frac{df}{dx}=\frac{u^{-\frac12}}{2}\cdot 2x</math> || Compute derivatives with the power rule
|-
| <math>\frac{df}{dx}=\frac{(x^2)^{-\frac12}}{2}\cdot 2x</math> || Substitute <math>u(x)</math> back in terms of <math>x</math>
|-
| <math>\frac{df}{dx}=\frac{x}{\sqrt{x^2}}</math> || Simplify
|-
| <math>\frac{df}{dx}=\frac{x}{|x|}</math> || Express <math>\sqrt{x^2}</math> as an absolute value
|}
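
The formula <math>\frac{x}{|x|}</math> equals <math>1</math> for positive <math>x</math> and <math>-1</math> for negative <math>x</math>, and it is undefined at <math>x=0</math>. The sketch below illustrates both points; the function names and the step size are assumptions chosen for illustration.

```python
def abs_derivative(x):
    # Formula from the table; raises ZeroDivisionError at x = 0.
    return x / abs(x)

def difference_quotient(x, step):
    # One-sided difference quotient of |x| with the given signed step.
    return (abs(x + step) - abs(x)) / step

print(abs_derivative(2.5), abs_derivative(-0.3))

# At x = 0 the quotients from the right and from the left disagree,
# so the limit defining the derivative does not exist there.
from_right = difference_quotient(0.0, 1e-8)
from_left = difference_quotient(0.0, -1e-8)
print(from_right, from_left)
```

This matches the table: the chain rule step needs <math>\sqrt{u}</math> to be differentiable at <math>u=x^2</math>, which fails when <math>x=0</math>.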

===Example V: three nested functions===
The chain rule can be applied sequentially when more than two functions are nested. For example, if <math>f\bigl(g(h(x))\bigr)=e^{\sin(x^2)}</math>, sequential application of the chain rule yields the derivative as follows (we make use of the fact that <math>\frac{d}{dx}e^x=e^x</math>, which will be proved in a later section):

{| cellpadding=5
|-
| <math>f(x)=e^{\sin(x^2)}=e^g</math> || width="250pt" |Original (outermost) function
|-
| <math>h(x)=x^2</math> || Define <math>h(x)</math> as the innermost function
|-
| <math>g(h)=\sin(h)=\sin(x^2)</math> || Define <math>g(h)=\sin(h)</math> as the middle function
|-
| <math>\frac{df}{dx}=\frac{df}{dg}\cdot\frac{dg}{dh}\cdot\frac{dh}{dx}</math> || State the chain rule as it applies here
|-
| <math>\frac{df}{dg}=e^g=e^{\sin(x^2)}</math> || Differentiate <math>f(g)</math>
|-
| <math>\frac{dg}{dh}=\cos(h)=\cos(x^2)</math> || Differentiate <math>g(h)</math>
|-
| <math>\frac{dh}{dx}=2x</math> || Differentiate <math>h(x)</math>
|-
| <math>\frac{d}{dx}e^{\sin(x^2)}= e^{\sin(x^2)}\cdot\cos(x^2)\cdot 2x</math> || Substitute into the chain rule
|}
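
The sequential procedure generalizes to any number of nested functions: walk from the innermost function outward and multiply the derivative of each link, evaluated at that link's input. The sketch below shows one possible way to write this; `chain_derivative` and the `links` list are hypothetical names introduced for illustration.

```python
import math

def chain_derivative(links, x):
    """Evaluate a nested composition and its derivative at x.

    links is an innermost-first list of (function, derivative) pairs.
    """
    value = x
    derivative = 1.0
    for func, dfunc in links:
        derivative *= dfunc(value)  # this link's derivative at its input
        value = func(value)         # pass the output on to the next link
    return value, derivative

links = [
    (lambda t: t**2, lambda t: 2 * t),  # h(x) = x^2
    (math.sin, math.cos),               # g(h) = sin(h)
    (math.exp, math.exp),               # f(g) = e^g
]

x = 0.7
value, derivative = chain_derivative(links, x)
closed_form = math.exp(math.sin(x**2)) * math.cos(x**2) * 2 * x
print(derivative, closed_form)
```

The two printed numbers should agree up to rounding, since the loop multiplies the same three factors that appear in the last row of the table.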

==Proof of the chain rule==
Suppose <math>y</math> is a function of <math>u</math>, which is in turn a function of <math>x</math>. It is assumed that <math>u</math> is differentiable at <math>x</math> and that <math>y</math> is differentiable at <math>u(x)</math>.

To prove the chain rule we use the definition of the derivative.
:<math>\frac{dy}{dx}=\lim_{\Delta x\to0}\frac{\Delta y}{\Delta x}</math>
We now multiply <math>\frac{\Delta y}{\Delta x}</math> by <math>\frac{\Delta u}{\Delta u}</math> and perform some algebraic manipulation. This step requires <math>\Delta u\ne0</math> for all sufficiently small nonzero <math>\Delta x</math>, which the argument given here assumes.
:<math>\lim_{\Delta x\to0}\frac{\Delta y}{\Delta x}=\lim_{\Delta x\to0}\frac{\Delta y}{\Delta u}\cdot\frac{\Delta u}{\Delta x}=\lim_{\Delta x\to0}\frac{\Delta y}{\Delta u}\cdot\lim_{\Delta x\to0}\frac{\Delta u}{\Delta x}=\lim_{\Delta x\to0}\frac{\Delta y}{\Delta u}\cdot\frac{du}{dx}</math>

Because <math>u</math> is differentiable at <math>x</math>, it is also continuous there, so <math>\Delta u</math> approaches <math>0</math> as <math>\Delta x</math> approaches <math>0</math>. Taking the limit of <math>\frac{\Delta y}{\Delta u}</math> as <math>\Delta x</math> approaches <math>0</math> is therefore the same as taking its limit as <math>\Delta u</math> approaches <math>0</math>. Thus
:<math>\lim_{\Delta x\to0}\frac{\Delta y}{\Delta u}=\lim_{\Delta u\to0}\frac{\Delta y}{\Delta u}=\frac{dy}{du}</math>
So we have
:<math>\frac{dy}{dx}=\frac{dy}{du}\cdot\frac{du}{dx}</math>
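
The limit argument can be observed numerically by computing the two difference quotients for shrinking values of <math>\Delta x</math>. The sketch below uses <math>u=x^2</math> and <math>y=\sin(u)</math> from Example III as an assumed test case; the point <math>x_0=1</math> and the step sizes are likewise chosen only for illustration.

```python
import math

def u(x):
    return x**2

def y(u_value):
    return math.sin(u_value)

x0 = 1.0
exact = math.cos(u(x0)) * 2 * x0  # dy/du * du/dx at x0

errors = []
for delta_x in (1e-1, 1e-3, 1e-5):
    delta_u = u(x0 + delta_x) - u(x0)        # nonzero here, as the proof assumes
    delta_y = y(u(x0 + delta_x)) - y(u(x0))
    dy_du = delta_y / delta_u                # approaches cos(u(x0))
    du_dx = delta_u / delta_x                # approaches 2 * x0
    product = dy_du * du_dx                  # equals delta_y / delta_x up to rounding
    errors.append(abs(product - exact))
    print(f"dx={delta_x:g}  dy/du={dy_du:.6f}  du/dx={du_dx:.6f}  product={product:.6f}")

print("exact:", exact)
```

As <math>\Delta x</math> shrinks, each quotient approaches its own derivative and the product approaches the exact value.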
==Resources==
 
* [https://mathresearch.utsa.edu/wikiFiles/MAT1214/The%20Chain%20Rule_/MAT1214-3.6TheChainRulePwPt.pptx  The Chain Rule] PowerPoint file created by Dr. Sara Shirinkam, UTSA.
 
  
 
* [https://mathresearch.utsa.edu/wikiFiles/MAT1214/The%20Chain%20Rule_/MAT1214-3.6TheChainRuleWS1.pdf  The Chain Rule Worksheet]
 
* [https://en.wikibooks.org/wiki/Calculus/Chain_Rule Chain Rule], WikiBooks Calculus

==Licensing==
Content obtained and/or adapted from:
* [https://en.wikibooks.org/wiki/Calculus/Chain_Rule Chain Rule, WikiBooks: Calculus] under a CC BY-SA license

Latest revision as of 15:38, 15 January 2022
