*Figure: Illustration of the intermediate value theorem. The theorem implies that any randomly chosen $y$ value between $f(a)$ and $f(b)$ will have at least one $x$ in $[a,b]$ with $f(x)=y$.*
In the early years of calculus, the intermediate value theorem was
intricately connected with the definition of continuity; now it is a
consequence.
The basic proof starts with a set of points in $[a,b]$: $C = \{x
\text{ in } [a,b] \text{ with } f(x) \leq y\}$. The set is not empty
(as $a$ is in $C$) so it *must* have a least upper bound, call it $c$
(this requires the completeness property of the real numbers). By
continuity of $f$, it can be shown that $\lim_{x \rightarrow c-} f(x)
= f(c) \leq y$ and $\lim_{x \rightarrow c+}f(x) = f(c) \geq y$, which
forces $f(c) = y$.
### Bolzano and the bisection method
Suppose we have a continuous function $f(x)$ on $[a,b]$ with $f(a) <
0$ and $f(b) > 0$. Then as $f(a) < 0 < f(b)$, the intermediate value
theorem guarantees the existence of a $c$ in $[a,b]$ with $f(c) =
0$. Bolzano first proved this special case of the intermediate value
theorem. Such $c$ are called *zeros* of the function $f$.
We use this fact when building a "sign chart" of a polynomial function.
Between any two consecutive real zeros the polynomial cannot
change sign. (Why?) So a "test point" can be used to determine the
sign of the function over an entire interval, as illustrated below.
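For instance (the polynomial and test points here are our own illustrative choices):

```julia;
q(x) = x * (x - 1) * (x - 2)      # real zeros at 0, 1, and 2
tests = [-1, 1/2, 3/2, 3]         # one test point per interval determined by the zeros
[(x, sign(q(x))) for x in tests]  # the sign at a test point holds over its whole interval
```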
Here, we use the Bolzano theorem to give an algorithm - the *bisection method* - to locate the value $c$ under the assumption $f$ is continuous on $[a,b]$ and changes sign between $a$ and $b$.
*Figure: Illustration of the bisection method (here applied to $f(x) = x^2 - 2$ on $[0,2]$) to find a zero of a function. At each step the bracketing interval has $f(a)$ and $f(b)$ of opposite signs, so that the intermediate value theorem guarantees a zero.*
Call $[a,b]$ a *bracketing* interval if $f(a)$ and $f(b)$ have different signs.
We remark that having different signs can be expressed mathematically as $f(a) \cdot f(b) < 0$.
We can narrow down where a zero is in $[a,b]$ by following this recipe:
* Pick a midpoint of the interval, for concreteness $c = (a+b)/2$.
* If $f(c) = 0$ we are done, having found a zero in $[a,b]$.
* Otherwise it must be that either $f(a)\cdot f(c) < 0$ or $f(c) \cdot f(b) < 0$. If $f(a) \cdot f(c) < 0$, then let $b=c$ and repeat the above. Otherwise, let $a=c$ and repeat the above.
At each step the bracketing interval is narrowed -- indeed split in half
as defined -- or a zero is found.
For the real numbers this algorithm never stops unless a zero is
found. A "limiting" process is used to say that if it doesn't stop, it
will converge to some value.
However, using floating point numbers leads to differences from the
real-number situation. In this case, due to the ultimate granularity of the
approximation of floating point values to the real numbers, the
bracketing interval eventually can't be subdivided; that is, no $c$ is found over
the floating point numbers with $a < c < b$. So there is a natural
stopping criterion: stop when there is an exact zero, when the
bracketing interval gets too small to subdivide, or when the interval is as small as desired.
We can write a relatively simple program to implement this algorithm:
```julia;
function simple_bisection(f, a, b)
    # check the endpoints and the bracketing assumption
    if f(a) == 0 return a end
    if f(b) == 0 return b end
    if f(a) * f(b) > 0 error("[a,b] is not a bracketing interval") end

    tol = 1e-14 # small number (but should depend on size of a, b)
    c = a/2 + b/2
    while abs(b - a) > tol
        if f(c) == 0 return c end
        if f(a) * f(c) < 0
            a, b = a, c   # the sign change, hence a zero, is in [a, c]
        else
            a, b = c, b   # otherwise the sign change is in [c, b]
        end
        c = a/2 + b/2
    end
    c
end
```
This function uses a `while` loop to repeat the process of subdividing
$[a,b]$. A `while` loop will repeat until the condition is no longer
`true`. The above will stop for reasonably sized floating point
values (within $(-100, 100)$, say), but, as written, ignores the fact
that the gap between floating point values depends on their magnitude.
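For example, the gap between a floating point value and the next largest one grows with the magnitude:

```julia;
nextfloat(1.0) - 1.0, nextfloat(1e6) - 1e6
```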
The value $c$ returned *need not* be an exact zero. Let's see:
```julia;
c = simple_bisection(sin, 3, 4)
```
This value of $c$ is a floating-point approximation to $\pi$, but is not *quite* a zero:
```julia;
sin(c)
```
(Even `pi` itself is not a "zero" due to floating point issues.)
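A quick check:

```julia;
sin(pi)   # the floating point value `pi` is not an exact zero of `sin`
```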
### The `find_zero` function.
The `Roots` package has a function `find_zero` that implements the
bisection method when called with a bracketing interval. Its use is similar to `simple_bisection` above. This package is loaded when `CalculusWithJulia` is. We illustrate the usage of `find_zero` by finding the zero of `sin` over the bracketing interval $(3, 4)$:
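```julia;
find_zero(sin, (3, 4))
```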
Notice, the call `find_zero(sin, (3, 4))` again fits the template `action(function, args...)` that we see repeatedly. The `find_zero` function can also be called through `fzero`. The use of the tuple `(3, 4)` to specify the interval is not necessary; for example, `[3, 4]` would work equally well. (Anything where `extrema` is defined works.)
This function utilizes some facts about floating point values to
guarantee that the answer will be an *exact* zero or a value where there is a sign change between it and an adjacent floating point value; that is, the signs of the function at the returned value and at the next (or previous) floating point value differ. For example, with a polynomial (the particular choice here is ours, as the original example is elided):
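```julia;
p(x) = x^5 - x + 1           # an illustrative polynomial; its single real zero is in (-2, -1)
c = find_zero(p, (-2, -1))
p(c)
```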
We see, as before, that $p(c)$ is not quite $0$. But it can be easily checked that `p` is negative at the previous floating point number, while `p` is seen to be positive at the returned value:
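```julia;
sign(p(prevfloat(c))), sign(p(c))   # the sign changes between adjacent floating point values
```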
The equation $\cos(x) = x$ has just one solution, as can be seen in this plot:
```julia;
𝒇(x) = cos(x)
𝒈(x) = x
plot(𝒇, -pi, pi)
plot!(𝒈)
```
Find it.
We see from the graph that it is clearly between $0$ and $2$, so all we need is a function. (We have two.) The trick is to observe that solving $f(x) = g(x)$ is the same problem as solving for $x$ where $f(x) - g(x) = 0$. So we define the difference and use that (the helper name `𝒉` below is our choice):
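```julia;
𝒉(x) = 𝒇(x) - 𝒈(x)     # a zero of the difference solves 𝒇(x) = 𝒈(x)
find_zero(𝒉, (0, 2))    # the bracket (0, 2) is read off the graph
```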
#### Using parameterized functions (`f(x,p)`) with `find_zero`
Geometry will tell us that ``\cos(x) = x/p`` for *one* ``x`` in ``[0, \pi/2]`` whenever ``p>0``. We could set up finding this value for a given ``p`` by making ``p`` part of the function definition, but as an illustration of passing parameters, we leave `p` as a parameter (in this case, as a second value with default of ``1``):
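```julia;
𝒇ₚ(x, p=1) = cos(x) - x/p    # the function name is our choice; a zero solves cos(x) = x/p
find_zero(𝒇ₚ, (0, pi/2))     # with no parameter given, the default p=1 is used
```

For another value of the parameter, a closure such as `x -> 𝒇ₚ(x, 2)` can always be passed to `find_zero`; recent versions of `Roots` also allow the parameter to be passed along directly.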
For each model, we wish to find the value of $x$ after launching where
the height is modeled to be ``0``. That is how far will the arrow travel
before touching the ground?
For the model without wind resistance, we can graph the function
easily enough. Let's guess the distance is no more than ``500`` feet:
```julia;
plot(j, 0, 500)
```
Well, we haven't even seen the peak yet. Better to do a little spade
work first. This is a quadratic function, so we can use `roots` from `SymPy` to find the roots:
```julia;
roots(j(x))
```
We see that $1250$ is the largest root. So we plot over this domain to visualize the flight:
```julia;
plot(j, 0, 1250)
```
As for the model with wind resistance, a quick plot over the same interval, $[0, 1250]$ yields:
```julia;
plot(d, 0, 1250)
```
This graph eventually goes negative and then stops. This is due to the asymptote in the model when `(a - gamma^2*x)/a` is zero. To plot the trajectory until it returns to ``0``, we need to identify the value of the zero.
This model is non-linear and we don't have the simplicity of using `roots` to find the answer, so we solve for when $a - \gamma^2 x$ is $0$:
```julia;
gamma = 1
a = 200 * cos(pi/4)
b = a/gamma^2
```
Note that the function is infinite at `b`:
```julia;
d(b)
```
From the graph, we can see the zero is near `b`. As `d(b)` is `-Inf`, we can use the bracket `(b/2, b)` to find the zero, called `x1` below:
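```julia;
x1 = find_zero(d, (b/2, b))   # `x1`, the landing position, is used in the plot below
```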
(The bisection method only needs to know the sign of the function. Other bracketing methods would have issues with an endpoint having an infinite function value; to use them, some value between the zero and `b` would be needed.)
Finally, we plot both graphs at once to see that it was a very windy
day indeed.
```julia;
plot(j, 0, 1250, label="no wind")
plot!(d, 0, x1, label="windy day")
```
##### Example: bisection and non-continuity
The Bolzano theorem assumes a continuous function $f$ and, when
applicable, yields an algorithm to find a guaranteed zero.
However, the algorithm itself does not know whether the function is
continuous, only that the function changes sign. As such, it can produce
answers that are not "zeros" when used with discontinuous
functions.
In general, a function over floating point values can be considered a large table of mappings: each of the ``2^{64}`` floating point values gets assigned a value. This is a discrete mapping; there is nothing the computer sees related to continuity.
> The concept of continuity, if needed, must be verified by the user of the algorithm.
We have seen this when plotting rational functions or functions with vertical asymptotes. The default algorithms just connect points with lines. The user must manage the discontinuity (by assigning some values `NaN`, say); the algorithms used do not.
In this particular case, the bisection algorithm can still be fruitful
even when the function is not continuous, as the algorithm will yield
information about where the function changes sign, possibly at
discontinuities. But the user of the algorithm must be aware that the
answers are only guaranteed to be zeros of the function if the
function is continuous; the algorithm does not check that assumption.
As an example, let $f(x) = 1/x$. Clearly the interval $[-1,1]$ is a
"bracketing" interval as $f(x)$ changes sign between $a=-1$ and $b=1$. What
does the algorithm yield?
The output of `find_zeros` is a vector of values. Checking that each value
is an approximate zero can be done with the "." (broadcast) syntax:
```julia;
f₁.(zs)
```
(For a continuous function, the values returned by `find_zeros` should be
approximate zeros. Bear in mind that if $f$ is not
continuous, the algorithm might find jumping points that are not zeros and may not even be in the domain of the function.)
### An alternate interface to `find_zero`
The `find_zero` function in the `Roots` package is an interface to one of several methods. For now we focus on the *bracketing* methods; later we will see others. Bracketing methods include `Roots.Bisection()`, the basic bisection method (though with a different sense of "middle" than ``(a+b)/2``), used by default above; `Roots.A42()`, which will typically converge much faster than simple bisection; `Roots.Brent()`, the classic method of Brent; and `FalsePosition()`, a family of *regula falsi* methods. These can all be used by specifying the method in a call to `find_zero`.
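For example, reusing the bracket for `sin` from above:

```julia;
find_zero(sin, (3, 4), Roots.A42())   # the third positional argument selects the method
```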
Alternatively, `Roots` implements the `CommonSolve` interface popularized by its use in the `DifferentialEquations.jl` ecosystem, a wildly successful area for `Julia`. The basic setup has two steps: set up a "problem", then solve the problem.
To set up a problem, we call `ZeroProblem` with the function and an initial interval, as in:
```julia
f₅(x) = x^5 - x - 1
prob = ZeroProblem(f₅, (1,2))
```
Then we can "solve" this problem with `solve`. For example:
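```julia;
solve(prob), solve(prob, Roots.A42())   # the default method, then A42
```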
Though the answers are identical, the methods employed were not. The first call, with an unspecified method, defaults to bisection.
## Extreme value theorem
The Extreme Value Theorem is another consequence of continuity.
To discuss the extreme value theorem, we define an *absolute maximum*.
> The absolute maximum of $f(x)$ over an interval $I$, when it exists, is the value $f(c)$, $c$ in $I$,
> where $f(x) \leq f(c)$ for any $x$ in $I$.
>
> Similarly, an *absolute minimum* of
> $f(x)$ over an interval $I$ can be defined, when it exists, by a value ``f(c)`` where ``c`` is in ``I`` *and*
> ``f(c) \leq f(x)`` for any ``x`` in ``I``.
Related but different is the concept of *relative* or *local* extrema:
> A local maximum for ``f`` is a value ``f(c)`` where ``c`` is in **some** *open* interval ``I=(a,b)``, ``I`` in the domain of ``f``, and ``f(c)`` is an absolute maximum for ``f`` over ``I``. Similarly, a local minimum for ``f`` is a value ``f(c)`` where ``c`` is in **some** *open* interval ``I=(a,b)``, ``I`` in the domain of ``f``, and ``f(c)`` is an absolute minimum for ``f`` over ``I``.
The term *local extremum* is used to describe either a local maximum or a local minimum.
The key point is that the extrema are values in the *range* that are realized by some value in the *domain* (possibly more than one).
This chart of the [Hardrock 100](http://hardrock100.com/) illustrates the two concepts.
```julia; echo=false
###{{{hardrock_profile}}}
imgfile = "figures/hardrock-100.png"
caption = """
Elevation profile of the Hardrock 100 ultramarathon. Treating the elevation profile as a function, the absolute maximum is just about 14,000 feet and the absolute minimum about 7600 feet. These are of interest to the runner for different reasons. Also of interest would be each local maxima and local minima - the peaks and valleys of the graph - and the total elevation climbed - the latter so important/unforgettable its value makes it into the chart's title.
"""
ImageFile(:limits, imgfile, caption)
```
The extreme value theorem discusses an assumption that ensures
absolute maximum and absolute minimum values exist.
> The *extreme value theorem*: If $f(x)$ is continuous over a closed
> interval $[a,b]$ then $f$ has an absolute maximum and an absolute
> minimum over $[a,b]$.
(By continuous over $[a,b]$ we mean continuous on $(a,b)$ and right
continuous at $a$ and left continuous at $b$.)
The assumption that $[a,b]$ includes its endpoints (that it is *closed*) is crucial to the
guarantee. There are functions which are continuous on open intervals
for which this result is not true. For example, $f(x) = 1/x$ on $(0,1)$
has no smallest value or largest value, as defined above.
The extreme value theorem is an important theoretical tool for
investigating maxima and minima of functions.
##### Example
The function $f(x) = \sqrt{1-x^2}$ is continuous on the interval
$[-1,1]$ (in the sense above). It then has an absolute maximum, which we can
see to be $1$, occurring at the interior point $x=0$. The absolute minimum
is $0$; it occurs at each endpoint.
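A graph confirms this:

```julia; hold=true;
plot(x -> sqrt(1 - x^2), -1, 1)
```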
##### Example
The function $f(x) = x \cdot e^{-x}$ on the closed interval $[0, 5]$ is continuous. Hence it has an absolute maximum, which a graph shows to be about $0.37$ (in fact $1/e$, occurring at $x=1$). It has an absolute minimum, clearly the value $0$ occurring at the endpoint $x=0$.
```julia; hold=true;
plot(x -> x * exp(-x), 0, 5)
```
##### Example
The tangent function has no *guarantee* of an absolute maximum
or minimum over $(-\pi/2, \pi/2)$, as it is not *continuous* at the
endpoints (indeed, it is not even defined there). In fact it has neither extremum: there are vertical asymptotes at each endpoint of this interval.
##### Example
The function $f(x) = x^{2/3}$ over the interval $[-2,2]$ has a cusp at $0$. However, it is continuous on this closed interval, so it must have an absolute maximum and an absolute minimum. They can be seen from the graph to occur at the endpoints and at the cusp $x=0$, respectively:
```julia;hold=true;
plot(x -> (x^2)^(1/3), -2, 2)
```
(The use of just `x^(2/3)` would fail; can you guess why?)
##### Example
A New York Times [article](https://www.nytimes.com/2016/07/30/world/europe/norway-considers-a-birthday-gift-for-finland-the-peak-of-an-arctic-mountain.html) discusses an idea of Norway moving its border some 490 feet north and 650 feet east in order to have the peak of Mount Halti be the highest point in Finland, as currently it would be on the boundary. Mathematically this hints at a higher dimensional version of the extreme value theorem.
## Continuity and closed and open sets
We comment on two implications of continuity that can be generalized to more general settings.
The two intervals ``(a,b)`` and ``[a,b]`` differ in that the latter includes the endpoints. The extreme value theorem shows this distinction can make a big difference in what can be said regarding *images* of such intervals.
In particular, if ``f`` is continuous and ``I = [a,b]`` with ``a`` and ``b`` finite (``I`` is *closed* and bounded), then the *image* of ``I``, sometimes denoted ``f(I) = \{y: y=f(x) \text{ for } x \in I\}``, has the property that it is an interval that includes its endpoints (that is, also closed and bounded).
That ``f(I)`` is an interval is a consequence of the intermediate value theorem. That ``f(I)`` contains its endpoints is the extreme value theorem.
On the real line, sets that are closed and bounded are "compact," a term that generalizes to other settings.
> Continuity implies that the *image* of a compact set is compact.
Now let ``(c,d)`` be an *open* interval in the range of ``f``. An open interval is an open set. On the real line, an open set is one where each point in the set, ``a``, has some ``\delta`` such that if ``|b-a| < \delta`` then ``b`` is also in the set.
> Continuity implies that the *preimage* of an open set is an open set.
The *preimage* of an open set, ``I``, is ``\{a: f(a) \in I\}`` (all ``a`` with an image in ``I``). Take some pair ``(a, y)`` with ``y`` in ``I`` and ``a`` in the preimage, so that ``f(a) = y``.
Let ``\epsilon`` be such that ``|x - y| < \epsilon`` implies ``x`` is in ``I``.
Then, as ``f`` is continuous at ``a``, for this ``\epsilon`` there is a ``\delta`` such that ``|b - a| < \delta`` implies ``|f(b) - f(a)| < \epsilon``, or ``|f(b) - y| < \epsilon``, which means ``f(b)`` is in ``I``, so ``b`` is in the preimage. As every point of the preimage has such a ``\delta``, the preimage is an open set.
## Questions
###### Question
There is a negative zero in the interval $[-10, 0]$ for the function
###### Question

*Figure: Trajectories of potential cannonball fires with air-resistance included. (http://ej.iop.org/images/0143-0807/33/1/149/Full/ejp405251f1_online.jpg)*
In 1638, according to Amir D. [Aczel](http://books.google.com/books?id=kvGt2OlUnQ4C&pg=PA28&lpg=PA28&dq=mersenne+cannon+ball+tests&source=bl&ots=wEUd7e0jFk&sig=LpFuPoUvODzJdaoug4CJsIGZZHw&hl=en&sa=X&ei=KUGcU6OAKJCfyASnioCoBA&ved=0CCEQ6AEwAA#v=onepage&q=mersenne%20cannon%20ball%20tests&f=false),
an experiment was performed in the French countryside. A monk, Marin
Mersenne, launched a cannonball straight up into the air in an attempt
to help Descartes prove facts about the rotation of the earth. Though
the experiment was not successful, Mersenne later observed that the
time for the cannonball to go up was greater than the time to come
down. ["Vertical Projection in a Resisting Medium: Reflections on Observations of Mersenne".](http://www.maa.org/publications/periodicals/american-mathematical-monthly/american-mathematical-monthly-contents-junejuly-2014)
This isn't the case for simple ballistic motion where the time to go
up is equal to the time to come down. We can "prove" this numerically. For simple ballistic
motion:
```math
f(t) = -\frac{1}{2} \cdot 32 t^2 + v_0t.
```
The times to go up and to come down are found from
the two zeros of this function. The peak time is related to a zero of
the function given by `f'`, which for now we'll take as a mystery
operation, but which later will be known as the derivative. (The notation assumes `CalculusWithJulia` has been loaded.)
Let $v_0= 390$. The three times in question can be found from the zeros of `f` and `f'`. What are they?
([From "On the trajectories of projectiles depicted in early ballistic Woodcuts"](http://www.researchgate.net/publication/230963032_On_the_trajectories_of_projectiles_depicted_in_early_ballistic_woodcuts))
Here $g=32$, again we take $v_0=390$, and $\gamma$ is a drag
coefficient that we will take to be $1$. This is valid when $h(t)
\geq 0$. In `Julia`, rather than hard-code the parameter values, for
added flexibility we can pass them in as keyword arguments; a sketch, assuming the standard linear-drag solution for the height (the original's definition is elided):
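```julia;
# assumes the standard linear-drag solution; h(0) = 0 and h'(0) = v0
h(t; g=32, v0=390, gamma=1) = (g/gamma^2 + v0/gamma) * (1 - exp(-gamma*t)) - (g/gamma) * t
```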
###### Question

Part of the proof of the intermediate value theorem rests on knowing the limit of $f(x)$ when $f(x) > y$ for all $x$. What can we say about $L$, supposing $L = \lim_{x \rightarrow c+}f(x)$, under this assumption on $f$?
```julia; hold=true; echo=false
choices = [L"It must be that $L > y$ as each $f(x)$ is.",
           L"It must be that $L \geq y$, as a limit need not preserve a strict inequality.",
           L"Nothing can be said about $L$."]
answ = 2   # the distractors and answer are reconstructed; only the first choice survives from the original
radioq(choices, answ)
```
###### Question

The zeros of the equation $\cos(x) \cdot \cosh(x) = 1$ are related to vibrations of rods. Using `find_zeros`, what is the largest zero in the interval $[0, 6\pi]$?
###### Question

A parametric equation is specified by a parameterization $(f(t), g(t)), a \leq t \leq b$. The parameterization will be continuous if and only if each function is continuous.
Suppose $k_x$ and $k_y$ are positive integers and $a, b$ are positive numbers, will the [Lissajous](https://en.wikipedia.org/wiki/Parametric_equation#Lissajous_Curve) curve given by $(a\cos(k_x t), b\sin(k_y t))$ be continuous?
```julia; hold=true; echo=false
yesnoq(true)
```
Here is a sample graph for $a=1, b=2, k_x=3, k_y=4$:
```julia; hold=true;
a,b = 1, 2
k_x, k_y = 3, 4
plot(t -> a * cos(k_x *t), t-> b * sin(k_y * t), 0, 4pi)