Relative Extrema and Convex Functions

Local and global maxima and minima for cos(3πx)/x, 0.1≤ x ≤1.1

In mathematical analysis, the maxima and minima (the respective plurals of maximum and minimum) of a function, known collectively as extrema (the plural of extremum), are the largest and smallest values of the function, either within a given range (the local or relative extrema), or on the entire domain (the global or absolute extrema). Pierre de Fermat was one of the first mathematicians to propose a general technique, adequality, for finding the maxima and minima of functions.

As defined in set theory, the maximum and minimum of a set are the greatest and least elements in the set, respectively. Unbounded infinite sets, such as the set of real numbers, have no minimum or maximum.

Definition

A real-valued function f defined on a domain X has a global (or absolute) maximum point at x^∗, if f(x^∗) ≥ f(x) for all x in X. Similarly, the function has a global (or absolute) minimum point at x^∗, if f(x^∗) ≤ f(x) for all x in X. The value of the function at a maximum point is called the maximum value of the function, denoted $\max(f(x))$ , and the value of the function at a minimum point is called the minimum value of the function. Symbolically, this can be written as follows:

$x_{0}\in X$ is a global maximum point of function $f:X\to \mathbb {R}$ , if $(\forall x\in X)\,f(x_{0})\geq f(x).$

The definition of global minimum point also proceeds similarly.

If the domain X is a metric space, then f is said to have a local (or relative) maximum point at the point x^∗, if there exists some ε > 0 such that f(x^∗) ≥ f(x) for all x in X within distance ε of x^∗. Similarly, the function has a local minimum point at x^∗, if f(x^∗) ≤ f(x) for all x in X within distance ε of x^∗. A similar definition can be used when X is a topological space, since the definition just given can be rephrased in terms of neighborhoods. Mathematically, the given definition is written as follows:

Let $(X,d_{X})$ be a metric space and $f:X\to \mathbb {R}$ a function. Then $x_{0}\in X$ is a local maximum point of function $f$ if $(\exists \varepsilon >0)$ such that $(\forall x\in X)\,d_{X}(x,x_{0})<\varepsilon \implies f(x_{0})\geq f(x).$

The definition of local minimum point can also proceed similarly.

In both the global and local cases, the concept of a strict extremum can be defined. For example, x^∗ is a strict global maximum point if for all x in X with x ≠ x^∗, we have f(x^∗) > f(x), and x^∗ is a strict local maximum point if there exists some ε > 0 such that, for all x in X within distance ε of x^∗ with x ≠ x^∗, we have f(x^∗) > f(x). Note that a point is a strict global maximum point if and only if it is the unique global maximum point, and similarly for minimum points.

A continuous real-valued function with a compact domain always has a maximum point and a minimum point. An important example is a function whose domain is a closed and bounded interval of real numbers (see the graph above).

Search

Finding global maxima and minima is the goal of mathematical optimization. If a function is continuous on a closed interval, then by the extreme value theorem, global maxima and minima exist. Furthermore, a global maximum (or minimum) either must be a local maximum (or minimum) in the interior of the domain, or must lie on the boundary of the domain. So a method of finding a global maximum (or minimum) is to look at all the local maxima (or minima) in the interior, and also look at the maxima (or minima) of the points on the boundary, and take the largest (or smallest) one.
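The candidate-comparison method described above can be sketched in Python. This is a minimal illustration, not a general solver: it assumes the cubic x³ + 3x² − 2x + 1 on the closed interval [−4, 2], whose derivative is a quadratic that can be solved exactly, and the helper names `critical_points` and `global_extrema` are hypothetical.

```python
import math

def f(x):
    """Illustrative cubic: x^3 + 3x^2 - 2x + 1."""
    return x**3 + 3 * x**2 - 2 * x + 1

def critical_points():
    """Real roots of f'(x) = 3x^2 + 6x - 2, from the quadratic formula."""
    a, b, c = 3.0, 6.0, -2.0
    root = math.sqrt(b * b - 4 * a * c)
    return [(-b - root) / (2 * a), (-b + root) / (2 * a)]

def global_extrema(lo, hi):
    """Compare f at both endpoints and at the interior critical points."""
    candidates = [lo, hi] + [x for x in critical_points() if lo < x < hi]
    x_min = min(candidates, key=f)
    x_max = max(candidates, key=f)
    return x_min, x_max

x_min, x_max = global_extrema(-4.0, 2.0)
print("global minimum at", x_min, "with value", f(x_min))
print("global maximum at", x_max, "with value", f(x_max))
```

For this particular cubic both global extrema land on the boundary, which is why the endpoints have to be included among the candidates.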

A useful sufficient condition for continuous real-valued functions of a real variable is that a function which decreases just before a point and increases just after it has a local minimum there, and likewise for maxima. (Formally, if f is a continuous real-valued function of a real variable x and there exist a < x₀ < b such that f decreases on (a, x₀) and increases on (x₀, b), then x₀ is a local minimum point. The converse can fail for functions that oscillate near x₀, such as x²(2 + sin(1/x)) extended by 0 at x = 0.) A related result is Fermat's theorem, which states that local extrema in the interior of the domain must occur at critical points (points where the derivative is zero or where the function is non-differentiable). One can distinguish whether a critical point is a local maximum or local minimum by using the first derivative test, second derivative test, or higher-order derivative test, given sufficient differentiability.
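As a rough sketch of the second derivative test, the Python snippet below classifies stationary points from the sign of the second derivative. The function name `second_derivative_test` is hypothetical, and the second derivatives are supplied by hand (2x for x³/3 − x, and 6x for x³) rather than computed symbolically.

```python
def second_derivative_test(d2f, stationary_points):
    """Label each stationary point using the sign of the second derivative."""
    labels = {}
    for x in stationary_points:
        curvature = d2f(x)
        if curvature > 0:
            labels[x] = "local minimum"
        elif curvature < 0:
            labels[x] = "local maximum"
        else:
            # A zero second derivative decides nothing; a higher-order
            # test or the first derivative test would be needed.
            labels[x] = "inconclusive"
    return labels

# f(x) = x^3/3 - x has f'(x) = x^2 - 1 and f''(x) = 2x.
labels = second_derivative_test(lambda x: 2 * x, [-1.0, 1.0])

# g(x) = x^3 has g'(0) = 0 and g''(x) = 6x, so the test is silent at 0.
cubic = second_derivative_test(lambda x: 6 * x, [0.0])

print(labels)
print(cubic)
```

The "inconclusive" branch matters: a vanishing second derivative is compatible with a minimum, a maximum, or an inflection point.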

For any function that is defined piecewise, one finds a maximum (or minimum) by finding the maximum (or minimum) of each piece separately, and then seeing which one is largest (or smallest).

Examples

The global maximum of ${\sqrt[{x}]{x}}$ occurs at x = e.

Function	Maxima and minima
x²	Unique global minimum at x = 0.
x³	No global minima or maxima. Although the first derivative (3x²) is 0 at x = 0, this is an inflection point. (2nd derivative is 0 at that point.)
${\sqrt[{x}]{x}}$	Unique global maximum at x = e. (See figure at right)
x^−x	Unique global maximum over the positive real numbers at x = 1/e.
x³/3 − x	First derivative x² − 1 and second derivative 2x. Setting the first derivative to 0 and solving for x gives stationary points at −1 and +1. From the sign of the second derivative, we can see that −1 is a local maximum and +1 is a local minimum. This function has no global maximum or minimum.
|x|	Global minimum at x = 0 that cannot be found by taking derivatives, because the derivative does not exist at x = 0.
cos(x)	Infinitely many global maxima at 0, ±2π, ±4π, ..., and infinitely many global minima at ±π, ±3π, ±5π, ....
2 cos(x) − x	Infinitely many local maxima and minima, but no global maximum or minimum.
cos(3πx)/x with 0.1 ≤ x ≤ 1.1	Global maximum at x = 0.1 (a boundary), a global minimum near x = 0.3, a local maximum near x = 0.6, and a local minimum near x = 1.0. (See figure at top of page.)
x³ + 3x² − 2x + 1 defined over the closed interval (segment) [−4,2]	Local maximum at x = −1 − ${\sqrt {15}}$/3, local minimum at x = −1 + ${\sqrt {15}}$/3, global maximum at x = 2 and global minimum at x = −4.
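When derivatives are awkward to solve by hand, the table entry for cos(3πx)/x can be checked approximately by sampling. The sketch below is a crude grid search, not a rigorous method: the helper `sample_extrema` and the grid size are illustrative assumptions, and the reported locations are only accurate to about one grid step.

```python
import math

def f(x):
    return math.cos(3 * math.pi * x) / x

def sample_extrema(func, lo, hi, n=1000):
    """Sample func on n + 1 evenly spaced points and read off extrema."""
    xs = [lo + (hi - lo) * i / n for i in range(n + 1)]
    ys = [func(x) for x in xs]
    # Interior samples that beat both neighbours approximate local extrema.
    local_max = [xs[i] for i in range(1, n) if ys[i - 1] < ys[i] > ys[i + 1]]
    local_min = [xs[i] for i in range(1, n) if ys[i - 1] > ys[i] < ys[i + 1]]
    # Global extrema over all samples, boundary points included.
    i_max = max(range(n + 1), key=ys.__getitem__)
    i_min = min(range(n + 1), key=ys.__getitem__)
    return xs[i_max], xs[i_min], local_max, local_min

x_gmax, x_gmin, interior_max, interior_min = sample_extrema(f, 0.1, 1.1)
print("global maximum near", x_gmax)
print("global minimum near", x_gmin)
print("interior local maxima near", interior_max)
print("interior local minima near", interior_min)
```

The global maximum is found at the left endpoint, which an interior-only search would miss.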

For a practical example, assume a situation where someone has $200$ feet of fencing and is trying to maximize the square footage of a rectangular enclosure, where $x$ is the length, $y$ is the width, and $xy$ is the area:

2x+2y=200

2y=200-2x

{\frac {2y}{2}}={\frac {200-2x}{2}}

y=100-x

xy=x(100-x)

The derivative with respect to $x$ is:

{\begin{aligned}{\frac {d}{dx}}xy&={\frac {d}{dx}}x(100-x)\\&={\frac {d}{dx}}\left(100x-x^{2}\right)\\&=100-2x\end{aligned}}

Setting this equal to $0$

0=100-2x

2x=100

x=50

reveals that $x=50$ is our only critical point. Now retrieve the endpoints by determining the interval to which $x$ is restricted. Since length is positive, $x>0$ , and since the width $y=100-x$ is also positive, $x<100$ . Plug in critical point $50$ , as well as endpoints $0$ and $100$ , into $xy=x(100-x)$ , and the results are $2500,0,$ and $0$ respectively.

Therefore, the greatest area attainable with a rectangle of $200$ feet of fencing is $50\times 50=2500$ square feet.
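The fencing result can be sanity-checked by brute force. The following sketch assumes a 0.1-foot grid of candidate lengths, which is an arbitrary illustrative choice; it confirms the calculus answer on that grid rather than proving it.

```python
def area(x, perimeter=200.0):
    """Area of a rectangle with length x when 2x + 2y = perimeter."""
    y = perimeter / 2 - x
    return x * y

# Candidate lengths 0.0, 0.1, ..., 100.0 (an assumed grid resolution).
lengths = [i / 10 for i in range(0, 1001)]
best_length = max(lengths, key=area)

print(best_length, area(best_length))
```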

Functions of more than one variable

Peano surface, a counterexample to some criteria of local maxima of the 19th century

The global maximum is the point at the top

Counterexample: The red dot shows a local minimum that is not a global minimum

For functions of more than one variable, similar conditions apply. For example, in the (enlargeable) figure on the right, the necessary conditions for a local maximum are similar to those of a function with only one variable. The first partial derivatives of z (the variable to be maximized) are zero at the maximum (the glowing dot on top in the figure). The second partial derivatives are negative. These are only necessary, not sufficient, conditions for a local maximum, because of the possibility of a saddle point. For use of these conditions to solve for a maximum, the function z must also be differentiable throughout. The second partial derivative test can help classify the point as a relative maximum or relative minimum.

In contrast, there are substantial differences between functions of one variable and functions of more than one variable in the identification of global extrema. For example, if a bounded differentiable function f defined on a closed interval in the real line has a single critical point, which is a local minimum, then it is also a global minimum (use the intermediate value theorem and Rolle's theorem to prove this by reductio ad impossibile). In two and more dimensions, this argument fails. This is illustrated by the function

f(x,y)=x^{2}+y^{2}(1-x)^{3},\qquad x,y\in \mathbb {R} ,

whose only critical point is at (0,0), which is a local minimum with f(0,0) = 0. However, it cannot be a global one, because f(2,3) = −5.
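The two claims about this function can be checked numerically. The sketch below evaluates f at the two points mentioned and samples a small square around the origin; the helper `is_grid_local_min`, its radius, and its grid density are illustrative assumptions, and a grid check is evidence rather than proof.

```python
def f(x, y):
    return x**2 + y**2 * (1 - x) ** 3

def is_grid_local_min(func, x0, y0, radius=0.1, steps=20):
    """Return True if no sampled point in the square beats func(x0, y0)."""
    base = func(x0, y0)
    for i in range(-steps, steps + 1):
        for j in range(-steps, steps + 1):
            x = x0 + radius * i / steps
            y = y0 + radius * j / steps
            if func(x, y) < base:
                return False
    return True

print(f(0, 0))                          # value at the critical point
print(f(2, 3))                          # a lower value far from the origin
print(is_grid_local_min(f, 0.0, 0.0))   # local minimum on the sampled square
```

Near the origin the factor (1 − x)³ is positive, so both terms are non-negative; once x exceeds 1 that factor turns negative and the function can drop below f(0,0).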

Maxima or minima of a functional

If the domain of a function for which an extremum is to be found consists itself of functions (i.e. if an extremum is to be found of a functional), then the extremum is found using the calculus of variations.

In relation to sets

Maxima and minima can also be defined for sets. In general, if an ordered set S has a greatest element m, then m is a maximal element of the set, also denoted as $\max(S)$ . Furthermore, if S is a subset of an ordered set T and m is the greatest element of S (with respect to the order induced by T), then m is a least upper bound of S in T. Similar results hold for least element, minimal element and greatest lower bound. The maximum and minimum functions for sets are used in databases, and can be computed rapidly, since the maximum (or minimum) of a set can be computed from the maxima (or minima) of a partition; formally, they are self-decomposable aggregation functions.
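The self-decomposable property can be illustrated with a short Python sketch: the maximum of a list equals the maximum of the maxima of its chunks. The helpers `chunked` and `max_from_partition` and the sample data are hypothetical, chosen only for illustration.

```python
def chunked(values, size):
    """Split a list into consecutive chunks of at most `size` items."""
    return [values[i:i + size] for i in range(0, len(values), size)]

def max_from_partition(values, size):
    """Compute the maximum from the per-chunk maxima."""
    partial_maxima = [max(chunk) for chunk in chunked(values, size)]
    return max(partial_maxima)

data = [7, -2, 19, 4, 11, 0, 19, 3]
print(max_from_partition(data, 3), max(data))
```

Because each chunk can be processed independently, the per-chunk maxima could be computed in parallel or cached and combined later.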

In the case of a general partial order, the least element (i.e., one that is smaller than all others) should not be confused with a minimal element (nothing is smaller). Likewise, a greatest element of a partially ordered set (poset) is an upper bound of the set which is contained within the set, whereas a maximal element m of a poset A is an element of A such that if m ≤ b (for any b in A), then m = b. Any least element or greatest element of a poset is unique, but a poset can have several minimal or maximal elements. If a poset has more than one maximal element, then these elements will not be mutually comparable.
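The distinction between maximal and greatest elements can be made concrete with a small poset. The sketch below assumes the set {1, 2, 3, 4, 6, 9} ordered by divisibility, an example chosen for illustration; all function names are hypothetical.

```python
def divides(a, b):
    """Partial order: a <= b means a divides b."""
    return b % a == 0

def maximal_elements(elements, leq):
    """Elements m such that m <= b implies m == b."""
    return [m for m in elements
            if all(not leq(m, b) or m == b for b in elements)]

def minimal_elements(elements, leq):
    """Elements m such that b <= m implies m == b."""
    return [m for m in elements
            if all(not leq(b, m) or m == b for b in elements)]

def greatest_element(elements, leq):
    """The element above every element, or None if there is none."""
    for m in elements:
        if all(leq(b, m) for b in elements):
            return m
    return None

def least_element(elements, leq):
    """The element below every element, or None if there is none."""
    for m in elements:
        if all(leq(m, b) for b in elements):
            return m
    return None

poset = [1, 2, 3, 4, 6, 9]
print(maximal_elements(poset, divides))   # several maximal elements
print(greatest_element(poset, divides))   # no greatest element
print(minimal_elements(poset, divides), least_element(poset, divides))
```

Here 4, 6 and 9 are pairwise incomparable under divisibility, so each is maximal while none is greatest; 1 divides everything, so it is both the unique minimal element and the least element.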

In a totally ordered set, or chain, all elements are mutually comparable, so such a set can have at most one minimal element and at most one maximal element. Then, due to mutual comparability, the minimal element will also be the least element, and the maximal element will also be the greatest element. Thus in a totally ordered set, we can simply use the terms minimum and maximum.

If a chain is finite, then it will always have a maximum and a minimum. If a chain is infinite, then it need not have a maximum or a minimum. For example, the set of natural numbers has no maximum, though it has a minimum. If an infinite chain S is bounded, then the closure Cl(S) of the set occasionally has a minimum and a maximum, in which case they are called the greatest lower bound and the least upper bound of the set S, respectively.

Convex and Concave Functions

Definition: A function $f:I\to \mathbb {R}$ , where $I$ is an interval, is said to be convex if for every $x,y\in I$ and for every $t\in [0,1]$ we have that

$f(tx+(1-t)y)\leq tf(x)+(1-t)f(y)$ .

A function $f:I\to \mathbb {R}$ is said to be Concave if for every $x,y\in I$ and for every $t\in [0,1]$ we have that

$f(tx+(1-t)y)\geq tf(x)+(1-t)f(y)$ .
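The defining inequality can be tested numerically on sample points. The sketch below is a necessary-condition check only: passing it on a finite grid does not prove convexity, while a single violation does disprove it. The names `violates_convexity` and `looks_convex`, the grid sizes, and the tolerance are illustrative assumptions.

```python
def violates_convexity(func, x, y, t, tol=1e-12):
    """True if f(tx + (1-t)y) exceeds t f(x) + (1-t) f(y) beyond rounding."""
    lhs = func(t * x + (1 - t) * y)
    rhs = t * func(x) + (1 - t) * func(y)
    return lhs > rhs + tol

def looks_convex(func, lo, hi, n=40):
    """Check the convexity inequality on a grid of points and weights."""
    points = [lo + (hi - lo) * i / n for i in range(n + 1)]
    weights = [k / 10 for k in range(11)]
    return not any(
        violates_convexity(func, x, y, t)
        for x in points for y in points for t in weights
    )

print(looks_convex(lambda x: x * x, -2.0, 2.0))   # x^2 passes the check
print(looks_convex(abs, -2.0, 2.0))               # |x| passes the check
print(looks_convex(lambda x: x**3, -2.0, 2.0))    # x^3 fails on [-2, 2]
```

A concave function can be checked the same way by applying `looks_convex` to its negative.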

We now give equivalent definitions for convex and concave functions.

Theorem 1: Let $f:I\to \mathbb {R}$ .

a) $f$ is convex on $I$ if and only if for all $a,b,c\in I$ with $a<b<c$ we have that $\displaystyle {{\frac {f(b)-f(a)}{b-a}}\leq {\frac {f(c)-f(a)}{c-a}}}$ .

b) $f$ is concave on $I$ if and only if for all $a,b,c\in I$ with $a<b<c$ we have that $\displaystyle {{\frac {f(b)-f(a)}{b-a}}\geq {\frac {f(c)-f(a)}{c-a}}}$ .

We only prove (a) above. The proof of (b) is analogous.

Proof of a): Let $a,b,c\in I$ be such that $a<b<c$ .

$\Rightarrow$ Suppose that $f$ is convex on $I$ . Then for all $t\in [0,1]$ we have that:

{\begin{aligned}\quad f(tx+(1-t)y)\leq tf(x)+(1-t)f(y)\end{aligned}}

Set $a=x$ , $b=tx+(1-t)y$ , and $c=y$ . Combining the first and third equations with the second equation gives us:

{\begin{aligned}\quad b=ta+(1-t)c\end{aligned}}

Solving for $t$ gives us:

{\begin{aligned}\quad b=ta+c-tc\\\quad b-c=t(a-c)\\\end{aligned}}

Therefore:

{\begin{aligned}\quad t={\frac {b-c}{a-c}}={\frac {c-b}{c-a}}\end{aligned}}

Similarly, we compute $1-t$ to be:

{\begin{aligned}\quad 1-t={\frac {c-a}{c-a}}-{\frac {c-b}{c-a}}={\frac {b-a}{c-a}}\end{aligned}}

From the convexity of $f$ we have $f(tx+(1-t)y)\leq tf(x)+(1-t)f(y)$ , or equivalently:

{\begin{aligned}\quad f(b)\leq {\frac {c-b}{c-a}}f(a)+{\frac {b-a}{c-a}}f(c)\\\end{aligned}}

And hence:

{\begin{aligned}\quad (c-a)f(b)\leq (c-b)f(a)+(b-a)f(c)=(c-a+a-b)f(a)+(b-a)f(c)=(c-a)f(a)-(b-a)f(a)+(b-a)f(c)\end{aligned}}

Therefore:

{\begin{aligned}\quad (c-a)[f(b)-f(a)]\leq (b-a)[f(c)-f(a)]\quad \Leftrightarrow \quad {\frac {f(b)-f(a)}{b-a}}\leq {\frac {f(c)-f(a)}{c-a}}\end{aligned}}

$\Leftarrow$ Obtained by working backwards from above. $\blacksquare$
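The converse direction can be spelled out as follows. This is a sketch of the "working backwards" step in the same notation, assuming $x<y$ and $t\in (0,1)$ ; the cases $t\in \{0,1\}$ and $x=y$ are immediate, and the case $x>y$ follows by exchanging the roles of $x$ and $y$ and of $t$ and $1-t$ .

```latex
\begin{aligned}
&\text{Set } a=x,\quad c=y,\quad b=tx+(1-t)y,
  \text{ so that } a<b<c \text{ and } b-a=(1-t)(c-a).\\
&\frac{f(b)-f(a)}{b-a}\leq \frac{f(c)-f(a)}{c-a}
  \;\Longrightarrow\; f(b)-f(a)\leq (1-t)\,[f(c)-f(a)]\\
&\;\Longrightarrow\; f(tx+(1-t)y)=f(b)\leq tf(a)+(1-t)f(c)=tf(x)+(1-t)f(y).
\end{aligned}
```

The second line multiplies the slope inequality by the positive quantity $b-a=(1-t)(c-a)$ , and the third line adds $f(a)$ to both sides.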

We state yet another important characterization of convex and concave functions.

Theorem 2: Let $f:I\to \mathbb {R}$ .

a) $f$ is convex on $I$ if and only if for all $a,b,c\in I$ with $a<b<c$ we have that $\displaystyle {{\frac {f(b)-f(a)}{b-a}}\leq {\frac {f(c)-f(b)}{c-b}}}$ .

b) $f$ is concave on $I$ if and only if for all $a,b,c\in I$ with $a<b<c$ we have that $\displaystyle {{\frac {f(b)-f(a)}{b-a}}\geq {\frac {f(c)-f(b)}{c-b}}}$ .

Theorem 2 gives us a nice characterization of convex functions. It tells us that a function $f:I\to \mathbb {R}$ is convex if and only if whenever we take three points $a,b,c\in I$ with $a<b<c$ we have that the slope of the line connecting $(a,f(a))$ and $(b,f(b))$ is less than or equal to the slope of the line connecting $(b,f(b))$ and $(c,f(c))$ . In other words, the slopes of the line segments connecting consecutive pairs of points on the graph of $f$ are nondecreasing from left to right.

We can combine Theorems 1 and 2 to get a nice chain of inequalities. That is, $f:I\to \mathbb {R}$ is convex if and only if for all $a,b,c\in I$ with $a<b<c$ we have that:

{\begin{aligned}{\frac {f(b)-f(a)}{b-a}}\leq {\frac {f(c)-f(a)}{c-a}}\leq {\frac {f(c)-f(b)}{c-b}}\end{aligned}}
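The chain of slope inequalities lends itself to a quick numerical check. The sketch below tests it on every ordered triple drawn from a small grid; the helper names, the grid on [−2, 2], and the tolerance are illustrative assumptions, and as with any sampled check, a pass is only evidence of convexity while a failure is a genuine counterexample.

```python
import math
from itertools import combinations

def slope(func, p, q):
    """Slope of the chord through (p, f(p)) and (q, f(q))."""
    return (func(q) - func(p)) / (q - p)

def chain_holds(func, points, tol=1e-9):
    """Check slope(a,b) <= slope(a,c) <= slope(b,c) for all a < b < c."""
    for a, b, c in combinations(sorted(points), 3):
        s_ab = slope(func, a, b)
        s_ac = slope(func, a, c)
        s_bc = slope(func, b, c)
        if not (s_ab <= s_ac + tol and s_ac <= s_bc + tol):
            return False
    return True

points = [-2 + 0.25 * i for i in range(17)]   # grid on [-2, 2]
print(chain_holds(math.exp, points))   # exp is convex, so the chain holds
print(chain_holds(math.sin, points))   # sin is not convex on [-2, 2]
```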

Licensing

Content obtained and/or adapted from:

Maxima and minima, Wikipedia under a CC BY-SA license
Convex and Concave Functions, Mathonline under a CC BY-SA license
