use quarto, not Pluto to render pages

This commit is contained in:
jverzani
2022-07-24 16:38:24 -04:00
parent 93c993206a
commit 7b37ca828c
879 changed files with 793311 additions and 2678 deletions

View File

@@ -0,0 +1,17 @@
[deps]
DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0"
EllipsisNotation = "da5c29d0-fa7d-589e-88eb-ea29b0a81949"
ForwardDiff = "f6369f11-7733-5829-9624-2563aa707210"
ImplicitPlots = "55ecb840-b828-11e9-1645-43f4a9f9ace7"
IntervalArithmetic = "d1acc4aa-44c8-5952-acd4-ba5d80a2a253"
IntervalConstraintProgramming = "138f1668-1576-5ad7-91b9-7425abbf3153"
LaTeXStrings = "b964fa9f-0449-5b57-a5c2-d3ea65f4040f"
MDBM = "dd61e66b-39ce-57b0-8813-509f78be4b4d"
Plots = "91a5bcdd-55d7-5caf-9e0b-520d859cae80"
Printf = "de0858da-6303-5e67-8744-51eddeeeb8d7"
QuadGK = "1fd47b50-473d-5c70-9696-f719f8f3bcdc"
Roots = "f2b01f46-fcfa-551c-844a-d8ac1e96c665"
SymPy = "24249f21-da20-56a4-8eb1-6a02cf4ae2e6"
TaylorSeries = "6aa5eb33-94cf-58f4-a9d0-e4b2c4fc25ea"
TermInterface = "8ea1fca8-c5ef-4a55-8b96-4e9afe9c9a3c"
Unitful = "1986cc42-f94f-5a68-af5c-568840ba703d"

View File

@@ -0,0 +1,640 @@
# Curve Sketching
```{julia}
#| echo: false
import Logging
Logging.disable_logging(Logging.Info) # or e.g. Logging.Info
Logging.disable_logging(Logging.Warn)
import SymPy
function Base.show(io::IO, ::MIME"text/html", x::T) where {T <: SymPy.SymbolicObject}
println(io, "<span class=\"math-left-align\" style=\"padding-left: 4px; width:0; float:left;\"> ")
println(io, "\\[")
println(io, sympy.latex(x))
println(io, "\\]")
println(io, "</span>")
end
# hack to work around issue
import Markdown
import CalculusWithJulia
function CalculusWithJulia.WeaveSupport.ImageFile(d::Symbol, f::AbstractString, caption; kwargs...)
nm = joinpath("..", string(d), f)
u = "![$caption]($nm)"
Markdown.parse(u)
end
nothing
```
This section uses the following add-on packages:
```{julia}
using CalculusWithJulia
using Plots
using SymPy
using Roots
using Polynomials # some name clash with SymPy
```
```{julia}
#| echo: false
#| results: "hidden"
using CalculusWithJulia.WeaveSupport
fig_size=(800, 600)
const frontmatter = (
title = "Curve Sketching",
description = "Calculus with Julia: Curve Sketching",
tags = ["CalculusWithJulia", "derivatives", "curve sketching"],
);
nothing
```
---
The figure illustrates a means to *sketch* a sine curve - identify as many of the following values as you can:
* asymptotic behaviour (as $x \rightarrow \pm \infty$),
* periodic behaviour,
* vertical asymptotes,
* the $y$ intercept,
* any $x$ intercept(s),
* local peaks and valleys (relative extrema),
* concavity.
With these, a sketch fills in between the points/lines associated with these values.
```{julia}
#| hold: true
#| echo: false
#| cache: true
### {{{ sketch_sin_plot }}}
function sketch_sin_plot_graph(i)
f(x) = 10*sin(pi/2*x) # [0,4]
deltax = 1/10
deltay = 5/10
zs = find_zeros(f, 0-deltax, 4+deltax)
cps = find_zeros(D(f), 0-deltax, 4+deltax)
xs = range(0, stop=4*(i-2)/6, length=50)
if i == 1
## plot zeros
title = "Plot the zeros"
p = scatter(zs, 0*zs, title=title, xlim=(-deltax,4+deltax), ylim=(-10-deltay,10+deltay), legend=false)
elseif i == 2
## plot extrema
title = "Plot the local extrema"
p = scatter(zs, 0*zs, title=title, xlim=(-deltax,4+deltax), ylim=(-10-deltay,10+deltay), legend=false)
scatter!(p, cps, f.(cps))
else
## sketch graph
title = "sketch the graph"
p = scatter(zs, 0*zs, title=title, xlim=(-deltax,4+deltax), ylim=(-10-deltay,10+deltay), legend=false)
scatter!(p, cps, f.(cps))
plot!(p, xs, f.(xs))
end
p
end
caption = L"""
After identifying asymptotic behaviours,
a curve sketch involves identifying the $y$ intercept, if applicable; the $x$ intercepts, if possible; the local extrema; and changes in concavity. From there a sketch fills in between the points. In this example, the periodic function $f(x) = 10\cdot\sin(\pi/2\cdot x)$ is sketched over $[0,4]$.
"""
n = 8
anim = @animate for i=1:n
sketch_sin_plot_graph(i)
end
imgfile = tempname() * ".gif"
gif(anim, imgfile, fps = 1)
ImageFile(imgfile, caption)
```
Though this approach is most useful for hand-sketches, the underlying concepts are important for properly framing graphs made with the computer.
We can easily make a graph of a function over a specified interval. What is not always so easy is to pick an interval that shows off the features of interest. In the section on [rational](../precalc/rational_functions.html) functions there was a discussion about how to draw graphs for rational functions so that horizontal and vertical asymptotes can be seen. These are properties of the "large." In this section, we build on this, but concentrate now on more local properties of a function.
##### Example
Produce a graph of the function $f(x) = x^4 -13x^3 + 56x^2-92x + 48$.
We identify this as a fourth-degree polynomial with positive leading coefficient. Hence it will eventually look $U$-shaped. If we graph over a too-wide interval, that is all we will see. Rather, we do some work to produce a graph that shows the zeros, peaks, and valleys of $f(x)$. To do so, we need to know the extent of the zeros. We can try some theory, but instead we just guess and, if that fails, will work harder:
```{julia}
f(x) = x^4 - 13x^3 + 56x^2 -92x + 48
rts = find_zeros(f, -10, 10)
```
As we found $4$ roots of this degree-$4$ polynomial, we know by the fundamental theorem of algebra that we have them all. This means our graph need not focus on values much larger than $6$ or much smaller than $1$.
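Had the guess failed, some theory could narrow the search. The following is a minimal sketch, assuming the classical Cauchy bound $1 + \max_i |a_i/a_n|$ is an acceptable choice of "theory"; the names `f_coeffs` and `cauchy_bound` are illustrative only:

```{julia}
# Coefficients of f, lowest degree first: 48 - 92x + 56x^2 - 13x^3 + x^4
f_coeffs = [48, -92, 56, -13, 1]

# Cauchy's bound: every root satisfies |x| <= 1 + max |a_i / a_n|
cauchy_bound = 1 + maximum(abs, f_coeffs[1:end-1] ./ f_coeffs[end])
```

The bound, $93$ here, is crude compared with the actual roots, which is why guessing a modest interval first is often the quicker route.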
To know where the peaks and valleys are, we look for the critical points:
```{julia}
cps = find_zeros(f', 1, 6)
```
Because $f$ has $4$ distinct real zeros, Rolle's theorem puts a critical point between each consecutive pair, so the peaks and valleys interleave with the zeros. As the cubic $f'$ has at most three zeros, a search over $[1,6]$ finds all three critical points and, without checking, they must correspond to relative extrema.
Next we identify the *inflection points* which are among the zeros of the second derivative (when defined):
```{julia}
ips = find_zeros(f'', 1, 6)
```
If there is no sign change for either $f'$ or $f''$ over $[a,b]$ then the sketch of $f$ on this interval must be one of:
* increasing and concave up (if $f' > 0$ and $f'' > 0$)
* increasing and concave down (if $f' > 0$ and $f'' < 0$)
* decreasing and concave up (if $f' < 0$ and $f'' > 0$)
* decreasing and concave down (if $f' < 0$ and $f'' < 0$)
This aids in sketching the graph between the critical points and inflection points.
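The four cases can be checked mechanically. This is a sketch only: `dq` and `ddq` are hand-computed derivatives of the polynomial from this example, and `describe_shape` is a hypothetical helper name, not part of any package.

```{julia}
# The example polynomial and its first two derivatives, computed by hand
q(x)   = x^4 - 13x^3 + 56x^2 - 92x + 48
dq(x)  = 4x^3 - 39x^2 + 112x - 92
ddq(x) = 12x^2 - 78x + 112

# Describe the shape at a point where neither derivative vanishes
function describe_shape(x)
    direction = dq(x) > 0 ? "increasing" : "decreasing"
    concavity = ddq(x) > 0 ? "concave up" : "concave down"
    return "$direction and $concavity"
end

describe_shape(0.0), describe_shape(7.0)
```

Evaluating at one test point inside each sub-interval between consecutive critical points and inflection points is enough, since neither sign changes within such a sub-interval.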
Finally, we check that plotting over $[0,7]$ would not produce $y$ values so large that they mask the oscillations. This could happen if the $y$ values at the end points are much larger than the $y$ values at the peaks and valleys, as only so many pixels can be used within a graph. For this we have:
```{julia}
f.([0, cps..., 7])
```
The values at $0$ and at $7$ are a bit large, as compared to the relative extrema, and since we know the graph is eventually $U$-shaped, this offers no insight. So we narrow the range a bit for the graph:
```{julia}
plot(f, 0.5, 6.5)
```
---
This sort of analysis can be automated. The plot "recipe" for polynomials from the `Polynomials` package makes similar considerations to choose a viewing window:
```{julia}
xₚ = variable(Polynomial)
plot(f(xₚ)) # f(xₚ) of Polynomial type
```
##### Example
Graph the function
$$
f(x) = \frac{(x-1)\cdot(x-3)^2}{x \cdot (x-2)}.
$$
Not much to do here if you are satisfied with a graph that only gives insight into the asymptotes of this rational function:
```{julia}
𝒇(x) = ( (x-1)*(x-3)^2 ) / (x * (x-2) )
plot(𝒇, -50, 50)
```
We can see the slant asymptote and hints of vertical asymptotes, but we'd like to see more of the basic features of the graph.
Previously, we have discussed rational functions and their asymptotes. This function has numerator of degree $3$ and denominator of degree $2$, so will have a slant asymptote. As well, the zeros of the denominator, $0$ and $2$, will lead to vertical asymptotes.
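The slant asymptote can be made explicit by polynomial long division (a worked sketch). Expanding the numerator gives $(x-1)(x-3)^2 = x^3 - 7x^2 + 15x - 9$, and the denominator is $x^2 - 2x$, so:

$$
f(x) = \frac{x^3 - 7x^2 + 15x - 9}{x^2 - 2x} = x - 5 + \frac{5x - 9}{x^2 - 2x}.
$$

The remainder term goes to $0$ as $x \rightarrow \pm\infty$, so the graph approaches the line $y = x - 5$.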
To identify how wide a viewing window should be, note that for a rational function the asymptotic behaviour takes over once the concavity is done changing and we are past all relative extrema. So we should take an interval that includes all potential inflection points and critical points:
```{julia}
𝒇cps = find_zeros(𝒇', -10, 10)
poss_ips = find_zeros(𝒇'', -10, 10)  # all zeros of 𝒇'', not just one from a bracket
extrema(union(𝒇cps, poss_ips))
```
So a range over $[-5,5]$ should display the key features including the slant asymptote.
Previously we used the `rangeclamp` function defined in `CalculusWithJulia` to avoid the distortion that vertical asymptotes can cause:
```{julia}
plot(rangeclamp(𝒇), -5, 5)
```
With this graphic, we can now clearly see in the graph the two zeros at $x=1$ and $x=3$, the vertical asymptotes at $x=0$ and $x=2$, and the slant asymptote.
---
Again, this sort of analysis can be systematized. The rational function type in the `Polynomials` package takes a stab at that, but isn't quite so good at capturing the slant asymptote:
```{julia}
xᵣ = variable(RationalFunction)
plot(𝒇(xᵣ)) # f(x) of RationalFunction type
```
##### Example
Consider the function $V(t) = 170 \sin(2\pi\cdot 60 \cdot t)$, a model for the alternating current waveform for an outlet in the United States. Create a graph.
Blindly trying to graph this, we will see immediate issues:
```{julia}
V(t) = 170 * sin(2*pi*60*t)
plot(V, -2pi, 2pi)
```
Ahh, this periodic function is *too* rapidly oscillating to be plotted without care. We recognize this as being of the form $V(t) = a\cdot\sin(c\cdot t)$, so where the sine function has a period of $2\pi$, this will have a period of $2\pi/c$, or $1/60$. So instead of using $(-2\pi, 2\pi)$ as the interval to plot over, we need something much smaller:
```{julia}
plot(V, -1/60, 1/60)
```
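The choice of window can be computed rather than guessed. A small sketch, where `ω`, `period`, and `window` are illustrative names and a two-period window is just one reasonable choice:

```{julia}
# V(t) = a*sin(ω*t) with ω = 2π*60, as in the example
ω = 2pi * 60
period = 2pi / ω            # one full cycle

# A viewing window spanning two periods, centered at 0
window = (-period, period)
```

Showing one or two periods is enough, as the rest of the graph can be inferred from the periodicity.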
##### Example
Plot the function $f(x) = \ln(x/100)/x$.
We guess that this function has a *vertical* asymptote at $x=0+$ and a horizontal asymptote as $x \rightarrow \infty$. We verify this through:
```{julia}
@syms x
ex = log(x/100)/x
limit(ex, x=>0, dir="+"), limit(ex, x=>oo)
```
The $\ln(x/100)$ part of $f$ goes to $-\infty$ as $x \rightarrow 0+$; yet $f(x)$ is eventually positive as $x \rightarrow \infty$. So a graph should
* not show too much of the vertical asymptote
* capture the point where $f(x)$ must cross $0$
* capture the point where $f(x)$ has a relative maximum
* show enough past this maximum to indicate to the reader the eventual horizontal asymptote.
For that, we need to get the $x$ intercepts and the critical points. The $x/100$ means this graph has some scaling to it, so we first look between $0$ and $200$:
```{julia}
find_zeros(ex, 0, 200) # domain is (0, oo)
```
Trying the same for the critical points comes up empty. We know there is one, but it is past $200$. Scanning wider, we see:
```{julia}
find_zeros(diff(ex,x), 0, 500)
```
So maybe graphing over $[50, 300]$ will be a good start:
```{julia}
plot(ex, 50, 300)
```
But it isn't! The function takes its time getting back towards $0$. We know that there must be a change of concavity as $x \rightarrow \infty$, as there is a horizontal asymptote. We look for the anticipated inflection point to ensure our graph includes that:
```{julia}
find_zeros(diff(ex, x, x), 1, 5000)
```
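These numeric values can be cross-checked by hand, as the derivatives of $f(x) = \ln(x/100)/x$ are simple enough to write out (a worked sketch using the quotient rule):

$$
f'(x) = \frac{1 - \ln(x/100)}{x^2}, \quad f''(x) = \frac{2\ln(x/100) - 3}{x^3}.
$$

So the $x$ intercept is at $x = 100$, the critical point is at $x = 100e \approx 271.8$, and the inflection point is at $x = 100e^{3/2} \approx 448.2$.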
So a better plot is found by going well beyond that inflection point:
```{julia}
plot(ex, 75, 1500)
```
## Questions
###### Question
Consider this graph
```{julia}
#| hold: true
#| echo: false
f(x) = (x-2)* (x-2.5)*(x-3) / ((x-1)*(x+1))
p = plot(f, -20, -1-.3, legend=false, xlim=(-15, 15), color=:blue)
plot!(p, f, -1 + .2, 1 - .02, color=:blue)
plot!(p, f, 1 + .05, 20, color=:blue)
```
What kind of *asymptotes* does it appear to have?
```{julia}
#| hold: true
#| echo: false
choices = [
L"Just a horizontal asymptote, $y=0$",
L"Just vertical asymptotes at $x=-1$ and $x=1$",
L"Vertical asymptotes at $x=-1$ and $x=1$ and a horizontal asymptote $y=1$",
L"Vertical asymptotes at $x=-1$ and $x=1$ and a slant asymptote"
]
answ = 4
radioq(choices, answ)
```
###### Question
Consider the function $p(x) = x + 2x^2 + 3x^3 + 4x^4 + 5x^5 +6x^6$. Which interval shows more than a $U$-shaped graph that dominates for large $x$ due to the leading term being $6x^6$?
(Find an interval that contains the zeros, critical points, and inflection points.)
```{julia}
#| hold: true
#| echo: false
choices = ["``(-5,5)``, the default bounds of a calculator",
"``(-3.5, 3.5)``, the bounds given by Cauchy for the real roots of ``p``",
"``(-1, 1)``, as many special polynomials have their roots in this interval",
"``(-1.1, .25)``, as this contains all the roots, the critical points, and inflection points and just a bit more"
]
radioq(choices, 4, keep_order=true)
```
###### Question
Let $f(x) = x^3/(9-x^2)$.
What points are *not* in the domain of $f$?
```{julia}
#| echo: false
qchoices = [
"The values of `find_zeros(f, -10, 10)`: `[-3, 0, 3]`",
"The values of `find_zeros(f', -10, 10)`: `[-5.19615, 0, 5.19615]`",
"The values of `find_zeros(f'', -10, 10)`: `[-3, 0, 3]`",
"The zeros of the numerator: `[0]`",
"The zeros of the denominator: `[-3, 3]`",
"The value of `f(0)`: `0`",
"None of these choices"
]
radioq(qchoices, 5, keep_order=true)
```
The $x$-intercepts are:
```{julia}
#| hold: true
#| echo: false
radioq(qchoices, 4, keep_order=true)
```
The $y$-intercept is:
```{julia}
#| hold: true
#| echo: false
radioq(qchoices, 6, keep_order=true)
```
There are *vertical asymptotes* at $x=\dots$?
```{julia}
#| hold: true
#| echo: false
radioq(qchoices, 5, keep_order=true)
```
The *slant* asymptote has slope?
```{julia}
#| hold: true
#| echo: false
numericq(-1)  # x^3/(9 - x^2) = -x + 9x/(9 - x^2), so the slope is -1
```
The function has critical points at
```{julia}
#| hold: true
#| echo: false
radioq(qchoices, 2, keep_order=true)
```
The function has relative extrema at
```{julia}
#| hold: true
#| echo: false
radioq(qchoices, 7, keep_order=true)
```
The function has inflection points at
```{julia}
#| hold: true
#| echo: false
radioq(qchoices, 7, keep_order=true)
```
###### Question
A function $f$ has
* zeros of $\{-0.7548\dots, 2.0\}$,
* critical points at $\{-0.17539\dots, 1.0, 1.42539\dots\}$,
* inflection points at $\{0.2712\dots,1.2287\}$.
Is this a possible graph of $f$?
```{julia}
#| hold: true
#| echo: false
f(x) = x^4 - 3x^3 + 2x^2 + x - 2
plot(f, -1, 2.5, legend=false)
```
```{julia}
#| hold: true
#| echo: false
yesnoq("yes")
```
###### Question
Two models for population growth are *exponential* growth: $P(t) = P_0 a^t$ and [logistic growth](https://en.wikipedia.org/wiki/Logistic_function#In_ecology:_modeling_population_growth): $P(t) = K P_0 a^t / (K + P_0(a^t - 1))$. The exponential growth model has growth rate proportional to the current population. The logistic model has growth rate depending on the current population *and* the available resources (which can limit growth).
Let $K=50$, $P_0=5$, and $a= e^{1/4}$. A plot over $[0,5]$ shows somewhat similar behaviour:
```{julia}
K, P0, a = 50, 5, exp(1/4)
exponential_growth(t) = P0 * a^t
logistic_growth(t) = K * P0 * a^t / (K + P0*(a^t-1))
plot(exponential_growth, 0, 5)
plot!(logistic_growth)
```
Does a plot over $[0,50]$ show qualitatively similar behaviour?
```{julia}
#| hold: true
#| echo: false
yesnoq(false)  # exponential growth is unbounded; logistic growth levels off near K
```
Exponential growth has $P''(t) = P_0 a^t \log(a)^2 > 0$, so has no inflection point. By plotting over a sufficiently wide interval, can you answer: does the logistic growth model have an inflection point?
```{julia}
#| hold: true
#| echo: false
yesnoq(true)
```
If yes, find it numerically:
```{julia}
#| hold: true
#| echo: false
val = find_zero(D(logistic_growth,2), (0, 20))
numericq(val)
```
The available resources are quantified by $K$. As $K \rightarrow \infty$ what is the limit of the logistic growth model?
```{julia}
#| hold: true
#| echo: false
choices = [
"The exponential growth model",
"The limit does not exist",
"The limit is ``P_0``"]
answ = 1
radioq(choices, answ)
```
###### Question
The plotting algorithm for plotting functions starts with a small initial set of points over the specified interval ($21$) and then refines those sub-intervals where the second derivative is determined to be large.
Why are sub-intervals where the second derivative is large different than those where the second derivative is small?
```{julia}
#| hold: true
#| echo: false
choices = [
"The function will increase (or decrease) rapidly when the second derivative is large, so there needs to be more points to capture the shape",
"The function will have more curvature when the second derivative is large, so there needs to be more points to capture the shape",
"The function will be much larger (in absolute value) when the second derivative is large, so there needs to be more points to capture the shape",
]
answ = 2
radioq(choices, answ)
```
###### Question
Is there a nice algorithm to identify what domain a function should be plotted over to produce an informative graph? [Wilkinson](https://www.cs.uic.edu/~wilkinson/Publications/plotfunc.pdf) has some suggestions. (Wilkinson is well known to the `R` community as the specifier of the grammar of graphics.) It is mentioned that "finding an informative domain for a given function depends on at least three features: periodicity, asymptotics, and monotonicity."
Why would periodicity matter?
```{julia}
#| hold: true
#| echo: false
choices = [
"An informative graph only needs to show one or two periods, as others can be inferred.",
"An informative graph need only show a part of the period, as the rest can be inferred.",
L"An informative graph needs to show several periods, as that will allow proper computation for the $y$ axis range."]
answ = 1
radioq(choices, answ)
```
Why should asymptotics matter?
```{julia}
#| hold: true
#| echo: false
choices = [
L"A vertical asymptote can distort the $y$ range, so it is important to avoid too-large values",
L"A horizontal asymptote must be plotted from $-\infty$ to $\infty$",
"A slant asymptote must be plotted over a very wide domain so that it can be identified."
]
answ = 1
radioq(choices, answ)
```
Monotonicity means increasing or decreasing. This is important for what reason?
```{julia}
#| hold: true
#| echo: false
choices = [
"For monotonic regions, a large slope or very concave function might require more care to plot",
"For monotonic regions, a function is basically a straight line",
"For monotonic regions, the function will have a vertical asymptote, so the region should not be plotted"
]
answ = 1
radioq(choices, answ)
```

View File

@@ -0,0 +1,17 @@
## Used to make ring figure. Redo in Julia??
plot.new()
plot.window(xlim=c(0,1), ylim=c(-5, 1.1))
x <- seq(.1, .9, length=9)
y <- c(-4.46262,-4.46866, -4.47268, -4.47469, -4.47468, -4.47267, -4.46864, -4.4626 , -4.45454)
lines(c(0, x[3], 1), c(0, y[3], 1))
points(c(0,1), c(0,1), pch=16, cex=2)
text(c(0,1), c(0,1), c("(0,0)", c("(a,b)")), pos=3)
lines(c(0, x[3], x[3]), c(0, 0, y[3]), cex=2, col="gray")
lines(c(1, x[3], x[3]), c(1, 1, y[3]), cex=2, col="gray")
text(x[3]/2, 0, "x", pos=1)
text(x[3], y[3]/2, "|y|", pos=2)
text(x[3], (1 + y[3])/2, "b-y", pos=4)
text((x[3] + 1)/2, 1, "a-x", pos=1)
text(x[3], y[3], "0", cex=4, col="gold")

View File

@@ -0,0 +1,865 @@
# L'Hospital's Rule
```{julia}
#| echo: false
import Logging
Logging.disable_logging(Logging.Info) # or e.g. Logging.Info
Logging.disable_logging(Logging.Warn)
import SymPy
function Base.show(io::IO, ::MIME"text/html", x::T) where {T <: SymPy.SymbolicObject}
println(io, "<span class=\"math-left-align\" style=\"padding-left: 4px; width:0; float:left;\"> ")
println(io, "\\[")
println(io, sympy.latex(x))
println(io, "\\]")
println(io, "</span>")
end
# hack to work around issue
import Markdown
import CalculusWithJulia
function CalculusWithJulia.WeaveSupport.ImageFile(d::Symbol, f::AbstractString, caption; kwargs...)
nm = joinpath("..", string(d), f)
u = "![$caption]($nm)"
Markdown.parse(u)
end
nothing
```
This section uses these add-on packages:
```{julia}
using CalculusWithJulia
using Plots
using SymPy
```
```{julia}
#| echo: false
#| results: "hidden"
using CalculusWithJulia.WeaveSupport
using Roots
fig_size=(800, 600)
const frontmatter = (
title = "L'Hospital's Rule",
description = "Calculus with Julia: L'Hospital's Rule",
tags = ["CalculusWithJulia", "derivatives", "l'hospital's rule"],
);
nothing
```
---
Let's return to limits of the form $\lim_{x \rightarrow c}f(x)/g(x)$ which have an indeterminate form of $0/0$ if both are evaluated at $c$. The typical example is the limit considered by Euler:
$$
\lim_{x\rightarrow 0} \frac{\sin(x)}{x}.
$$
We know this is $1$ using a bound from geometry, but might also guess this is one, as we know from linearization near $0$ that we have $\sin(x) \approx x$ or, more specifically:
$$
\sin(x) = x - \sin(\xi)x^2/2, \quad 0 < \xi < x.
$$
This would yield:
$$
\lim_{x \rightarrow 0} \frac{\sin(x)}{x} = \lim_{x\rightarrow 0} \frac{x -\sin(\xi) x^2/2}{x} = \lim_{x\rightarrow 0} \left(1 - \sin(\xi) \cdot x/2\right) = 1.
$$
This is because we know $\sin(\xi) x/2$ has a limit of $0$, when $|\xi| \leq |x|$.
That doesn't look any easier, as we worried about the error term, but if we just mentally replace $\sin(x)$ with $x$ - which it basically is near $0$ - then we can see that the limit should be the same as that of $x/x$, which we know is $1$ without thinking.
Basically, we found that in terms of limits, if both $f(x)$ and $g(x)$ are $0$ at $c$, that we *might* be able to just take this limit: $(f(c) + f'(c) \cdot(x-c)) / (g(c) + g'(c) \cdot (x-c))$ which is just $f'(c)/g'(c)$.
Wouldn't that be nice? We could find difficult limits just by differentiating the top and the bottom at $c$ (and not use the messy quotient rule).
Well, in fact that is more or less true, a fact that dates back to [L'Hospital](http://en.wikipedia.org/wiki/L%27H%C3%B4pital%27s_rule) - who wrote the first textbook on differential calculus - though this result is likely due to one of the Bernoulli brothers.
> *L'Hospital's rule*: Suppose:
>
> * that $\lim_{x\rightarrow c+} f(x) =0$ and $\lim_{x\rightarrow c+} g(x) =0$,
> * that $f$ and $g$ are differentiable in $(c,b)$, and
> * that $g'(x)$ is non-zero for *all* $x$ in $(c,b)$,
>
> then **if** the following limit exists: $\lim_{x\rightarrow c+}f'(x)/g'(x)=L$ it follows that $\lim_{x \rightarrow c+}f(x)/g(x) = L$.
That is *if* the right limit of $f(x)/g(x)$ is indeterminate of the form $0/0$, but the right limit of $f'(x)/g'(x)$ is known, possibly by simple continuity, then the right limit of $f(x)/g(x)$ exists and is equal to that of $f'(x)/g'(x)$.
The rule equally applies to *left limits* and *limits* at $c$. Later we will see there are other generalizations.
To apply this rule to Euler's example, $\sin(x)/x$, we just need to consider that:
$$
L = 1 = \lim_{x \rightarrow 0}\frac{\cos(x)}{1},
$$
So, as well, $\lim_{x \rightarrow 0} \sin(x)/x = 1$.
This is due to $\cos(x)$ being continuous at $0$, so this limit is just $\cos(0)/1$. (More importantly, the tangent line expansion of $\sin(x)$ at $0$ is $\sin(0) + \cos(0)x$, and $\cos(0)$, appearing as the coefficient of $x$, is why the answer is what it is. Rather than thinking in terms of $\cos(0)$, we can think in terms of the tangent-line approximation $\sin(x) \approx x$.)
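A quick numerical check is consistent with this value. The following is only a sketch, with `steps` and `sinc_ratios` as illustrative names:

```{julia}
# Step sizes shrinking toward 0 from the right
steps = [10.0^(-k) for k in 1:6]

# The ratio sin(h)/h should approach cos(0)/1 = 1
sinc_ratios = [sin(h) / h for h in steps]
```

Such a table suggests the limit but proves nothing; L'Hospital's rule is what justifies the value.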
:::{.callout-note}
## Note
In [Gruntz](http://www.cybertester.com/data/gruntz.pdf), in a reference attributed to Speiss, we learn that L'Hospital was a French Marquis who was taught in $1692$ the calculus of Leibniz by Johann Bernoulli. They made a contract obliging Bernoulli to leave his mathematical inventions to L'Hospital in exchange for a regular compensation. This result was discovered in $1694$ and appeared in L'Hospital's book of $1696$.
:::
##### Examples
* Consider this limit at $0$: $(a^x - 1)/x$. The top, $f(x) =a^x-1$, has $f(0) = 0$, as does the bottom, $g(x) = x$, so this limit is indeterminate of the form $0/0$. The derivative of $f(x)$ is $f'(x) = a^x \log(a)$ which has $f'(0) = \log(a)$. The derivative of the bottom is $g'(x) = 1$, so $g'(0) = 1$, and we have:
$$
\log(a) = \frac{\log(a)}{1} = \frac{f'(0)}{g'(0)} = \lim_{x \rightarrow 0}\frac{f'(x)}{g'(x)} = \lim_{x \rightarrow 0}\frac{f(x)}{g(x)}
= \lim_{x \rightarrow 0}\frac{a^x - 1}{x}.
$$
:::{.callout-note}
## Note
Why rewrite in the "opposite" direction? Because the theorem's conclusion, that $L$ is the limit, holds only if the related limit involving the derivatives exists. We don't do this in the following, but did so here to emphasize the need for the limit of the ratio of the derivatives to exist.
:::
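The value $\log(a)$ can be compared against a difference quotient for a specific base. This is a sketch under the assumption $a = 2$; the names `a_val`, `h_small`, and `quotient` are illustrative:

```{julia}
# Difference quotient (a^h - 1)/h for a small h, compared with log(a)
a_val = 2.0
h_small = 1e-6
quotient = (a_val^h_small - 1) / h_small

quotient, log(a_val)
```

The two values agree to several digits; shrinking `h_small` much further would eventually lose accuracy to floating point cancellation in the numerator.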
* Consider this limit:
$$
\lim_{x \rightarrow 0} \frac{e^x - e^{-x}}{x}.
$$
It too is of the indeterminate form $0/0$. The derivative of the top is $e^x + e^{-x}$, which is $2$ when $x=0$, so the ratio $f'(0)/g'(0)$ is seen to be $2$. By continuity, the limit of the ratio of the derivatives is $2$. Then by L'Hospital's rule, the limit above is $2$.
* Sometimes, L'Hospital's rule must be applied twice. Consider this limit:
$$
\lim_{x \rightarrow 0} \frac{1 - \cos(x)}{x^2}
$$
By L'Hospital's rule, *if* the following limit exists, the two will be equal:
$$
\lim_{x \rightarrow 0} \frac{\sin(x)}{2x}.
$$
But if we didn't guess the answer, we see that this new problem is *also* indeterminate of the form $0/0$. So, repeating the process, this new limit will exist and be equal to the following limit, should it exist:
$$
\lim_{x \rightarrow 0} \frac{\cos(x)}{2} = 1/2.
$$
As $L = 1/2$ for this related limit, it must also be the limit of the original problem, by L'Hospital's rule.
* Our "intuitive" limits can bump into issues. Take for example the limit of $(\sin(x)-x)/x^2$ as $x$ goes to $0$. Using $\sin(x) \approx x$ makes this look like $0/x^2$ which is still indeterminate. (Because the difference is higher order than $x$.) L'Hospital's rule says this limit will exist (and be equal) if the following one does:
$$
\lim_{x \rightarrow 0} \frac{\cos(x) - 1}{2x}.
$$
This particular limit is indeterminate of the form $0/0$, so we again try L'Hospital's rule and consider
$$
\lim_{x \rightarrow 0} \frac{-\sin(x)}{2} = 0
$$
So as this limit exists, working backwards, the original limit in question will also be $0$.
* This example comes from the Wikipedia page. It "proves" a discrete approximation for the second derivative.
Show if $f''(x)$ exists at $c$ and is continuous at $c$, then
$$
f''(c) = \lim_{h \rightarrow 0} \frac{f(c + h) - 2f(c) + f(c-h)}{h^2}.
$$
This will follow from two applications of L'Hospital's rule to the right-hand side. The first says, the limit on the right is equal to this limit, should it exist:
$$
\lim_{h \rightarrow 0} \frac{f'(c+h) - 0 - f'(c-h)}{2h}.
$$
We have to be careful, as we differentiate in the $h$ variable, not the $c$ one, so the chain rule brings out the minus sign. But again, as we still have an indeterminate form $0/0$, this limit will equal the following limit should it exist:
$$
\lim_{h \rightarrow 0} \frac{f''(c+h) - 0 - (-f''(c-h))}{2} =
\lim_{h \rightarrow 0}\frac{f''(c+h) + f''(c-h)}{2} = f''(c).
$$
That last equality follows, as it is assumed that $f''(x)$ exists at $c$ and is continuous, that is, $f''(c \pm h) \rightarrow f''(c)$.
The expression above finds use when second derivatives are numerically approximated. (The middle expression is the basis of the central-finite difference approximation to the derivative.)
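As a rough numerical illustration of the second-difference formula, the following sketch compares it with a known second derivative. The name `second_difference` and the step size `1e-4` are assumptions made for this example only:

```{julia}
# Central second difference: (f(c+h) - 2f(c) + f(c-h)) / h^2
second_difference(fn, c, h) = (fn(c + h) - 2 * fn(c) + fn(c - h)) / h^2

# For sin, the second derivative at c is -sin(c)
d2_estimate = second_difference(sin, 1.0, 1e-4)

d2_estimate, -sin(1.0)
```

The step size is a compromise: a smaller `h` reduces the truncation error but amplifies floating point roundoff, as the numerator involves subtracting nearly equal values.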
* L'Hospital himself was interested in this limit for $a > 0$ ([math overflow](http://mathoverflow.net/questions/51685/how-did-bernoulli-prove-lh%C3%B4pitals-rule))
$$
\lim_{x \rightarrow a} \frac{\sqrt{2a^3\cdot x-x^4} - a\cdot(a^2\cdot x)^{1/3}}{ a - (a\cdot x^3)^{1/4}}.
$$
These derivatives can be done by hand, but to avoid any minor mistakes we utilize `SymPy` taking care to use rational numbers for the fractional powers, so as not to lose precision through floating point roundoff:
```{julia}
@syms a::positive x::positive
f(x) = sqrt(2a^3*x - x^4) - a * (a^2*x)^(1//3)
g(x) = a - (a*x^3)^(1//4)
```
We can see that at $x=a$ we have the indeterminate form $0/0$:
```{julia}
f(a), g(a)
```
What about the derivatives?
```{julia}
fp, gp = diff(f(x),x), diff(g(x),x)
fp(x=>a), gp(x=>a)
```
Their ratio will not be indeterminate, so the limit in question is just the ratio:
```{julia}
fp(x=>a) / gp(x=>a)
```
Of course, we could have just relied on `limit`, which knows about L'Hospital's rule:
```{julia}
limit(f(x)/g(x), x, a)
```
## Idea behind L'Hospital's rule
A first proof of L'Hospital's rule takes advantage of Cauchy's [generalization](http://en.wikipedia.org/wiki/Mean_value_theorem#Cauchy.27s_mean_value_theorem) of the mean value theorem to two functions. Suppose $f(x)$ and $g(x)$ are continuous on $[c,b]$ and differentiable on $(c,b)$. On $(c,x)$, $c < x < b$, there exists a $\xi$ with $f'(\xi) \cdot (g(x) - g(c)) = g'(\xi) \cdot (f(x) - f(c))$. In our formulation, both $f(c)$ and $g(c)$ are zero, so we have, provided we know that $g(x)$ and $g'(\xi)$ are non-zero, that $f(x)/g(x) = f'(\xi)/g'(\xi)$ for some $\xi$, $c < \xi < x$. That the right-hand side has a limit as $x \rightarrow c+$ is true by the assumption that the limit of the ratio of the derivatives exists. (The $\xi$ part can be removed by considering it as a composition of a function going to $c$.) Thus the right limit of the ratio $f/g$ is known.
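In symbols, with $f(c) = g(c) = 0$, the key step of this argument can be written as:

$$
\frac{f(x)}{g(x)} = \frac{f(x) - f(c)}{g(x) - g(c)} = \frac{f'(\xi)}{g'(\xi)}, \quad c < \xi < x.
$$

As $x \rightarrow c+$, the value $\xi$ is squeezed between $c$ and $x$, so it also goes to $c$ from the right, and the right-hand side tends to $L$.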
---
```{julia}
#| echo: false
#| cache: true
let
## {{{lhopitals_picture}}}
function lhopitals_picture_graph(n)
g = (x) -> sqrt(1 + x) - 1 - x^2
f = (x) -> x^2
ts = range(-1/2, stop=1/2, length=50)
a, b = 0, 1/2^n * 1/2
m = (f(b)-f(a)) / (g(b)-g(a))
## get bounds
tl = (x) -> g(0) + m * (x - f(0))
lx = max(find_zero(x -> tl(x) - (-0.05), (-1000, 1000)), -0.6)
rx = min(find_zero(x -> tl(x) - (0.25), (-1000, 1000)), 0.2)
xs = [lx, rx]
ys = map(tl, xs)
plt = plot(g, f, -1/2, 1/2, legend=false, size=fig_size, xlim=(-.6, .5), ylim=(-.1, .3))
plot!(plt, xs, ys, color=:orange)
scatter!(plt, [g(a),g(b)], [f(a),f(b)], markersize=5, color=:orange)
plt
end
caption = L"""
Geometric interpretation of ``L=\lim_{x \rightarrow 0} x^2 / (\sqrt{1 +
x} - 1 - x^2)``. At ``0`` this limit is indeterminate of the form
``0/0``. The value for a fixed ``x`` can be seen as the slope of a secant
line of a parametric plot of the two functions, plotted as ``(g,
f)``. In this figure, the limiting "tangent" line has ``0`` slope,
corresponding to the limit ``L``. In general, L'Hospital's rule is
nothing more than a statement about slopes of tangent lines.
"""
n = 6
anim = @animate for i=1:n
lhopitals_picture_graph(i)
end
imgfile = tempname() * ".gif"
gif(anim, imgfile, fps = 1)
plotly()
ImageFile(imgfile, caption)
end
```
## Generalizations
L'Hospital's rule generalizes to other indeterminate forms, in particular the indeterminate form $\infty/\infty$ can be proved at the same time as $0/0$ with a more careful [proof](http://en.wikipedia.org/wiki/L%27H%C3%B4pital%27s_rule#General_proof).
The value $c$ in the limit can also be infinite. Consider this case with $c=\infty$:
$$
\begin{align*}
\lim_{x \rightarrow \infty} \frac{f(x)}{g(x)} &=
\lim_{x \rightarrow 0+} \frac{f(1/x)}{g(1/x)}
\end{align*}
$$
L'Hospital's rule applies as $x \rightarrow 0+$, so we differentiate to get:
$$
\begin{align*}
\lim_{x \rightarrow 0+} \frac{[f(1/x)]'}{[g(1/x)]'}
&= \lim_{x \rightarrow 0+} \frac{f'(1/x)\cdot(-1/x^2)}{g'(1/x)\cdot(-1/x^2)}\\
&= \lim_{x \rightarrow 0+} \frac{f'(1/x)}{g'(1/x)}\\
&= \lim_{x \rightarrow \infty} \frac{f'(x)}{g'(x)},
\end{align*}
$$
*assuming* the latter limit exists. L'Hospital's rule then assures the equality
$$
\lim_{x \rightarrow \infty} \frac{f(x)}{g(x)} =
\lim_{x \rightarrow \infty} \frac{f'(x)}{g'(x)}.
$$
##### Examples
For example, consider
$$
\lim_{x \rightarrow \infty} \frac{x}{e^x}.
$$
We see it is of the form $\infty/\infty$. Taking advantage of the fact that L'Hospital's rule applies to limits at $\infty$, we have that this limit will exist and be equal to this one, should it exist:
$$
\lim_{x \rightarrow \infty} \frac{1}{e^x}.
$$
This limit is, of course, $0$, as it is of the form $1/\infty$. It is not hard to build up from here to show that for any integer value of $n>0$:
$$
\lim_{x \rightarrow \infty} \frac{x^n}{e^x} = 0.
$$
This is an expression of the fact that exponential functions grow faster than polynomial functions.
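A quick numeric check can make this concrete. The sketch below is not part of the original discussion; the helper name `poly_over_exp` and the sample points are illustrative assumptions. It evaluates $x^n/e^x$ at growing values of $x$ (rows) for a few values of $n$ (columns):

```{julia}
#| hold: true
# Illustrative helper (a hypothetical name): the ratio x^n / e^x
poly_over_exp(x, n) = x^n / exp(x)

# rows: x = 10, 50, 100; columns: n = 1, 2, 5
[poly_over_exp(x, n) for x in (10.0, 50.0, 100.0), n in (1, 2, 5)]
```

Each column decreases toward $0$, though for larger $n$ it takes larger values of $x$ before the exponential dominates.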
Similarly, powers grow faster than logarithms, as this limit shows, which is indeterminate of the form $\infty/\infty$:
$$
\lim_{x \rightarrow \infty} \frac{\log(x)}{x} =
\lim_{x \rightarrow \infty} \frac{1/x}{1} = 0,
$$
the first equality by L'Hospital's rule, as the second limit exists.
## Other indeterminate forms
Indeterminate forms of the type $0 \cdot \infty$, $0^0$, $1^\infty$, $\infty^0$, $\infty - \infty$ can be re-expressed to be in the form $0/0$ or $\infty/\infty$ and then L'Hospital's rule can be applied.
##### Example: rewriting $0 \cdot \infty$
What is the limit of $x \log(x)$ as $x \rightarrow 0+$? The form is $0\cdot \infty$. Rewriting, we see this is just:
$$
\lim_{x \rightarrow 0+}\frac{\log(x)}{1/x}.
$$
L'Hospital's rule clearly applies to one-sided limits as well as two-sided limits (our proof sketch used one-sided limits), so this limit will equal the following, should it exist:
$$
\lim_{x \rightarrow 0+}\frac{1/x}{-1/x^2} = \lim_{x \rightarrow 0+} -x = 0.
$$
##### Example: rewriting $0^0$
What is the limit of $x^x$ as $x \rightarrow 0+$? The expression is of the form $0^0$, which is indeterminate. (Even though floating point math defines the value as $1$.) We can rewrite this by taking a log:
$$
x^x = \exp(\log(x^x)) = \exp(x \log(x)) = \exp(\log(x)/(1/x)).
$$
We just saw that $\lim_{x \rightarrow 0+}\log(x)/(1/x) = 0$. So by the rules for limits of compositions and the fact that $e^x$ is continuous, we see $\lim_{x \rightarrow 0+} x^x = e^0 = 1$.
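Both limits can be spot-checked numerically. This is only a sketch: the sequence $10^{-1}, \dots, 10^{-8}$ and the name `xs_right` are illustrative choices, not part of the original text. The columns show $x$, $x\log(x)$, and $x^x$:

```{julia}
#| hold: true
# evaluate x*log(x) and x^x along a sequence approaching 0 from the right
xs_right = [10.0^(-k) for k in 1:8]
[xs_right  xs_right .* log.(xs_right)  xs_right .^ xs_right]
```

The middle column tends to $0$ and the last to $1$, in agreement with the limits found through L'Hospital's rule.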
##### Example: rewriting $\infty - \infty$
A limit $\lim_{x \rightarrow c} f(x) - g(x)$ of indeterminate form $\infty - \infty$ can be re-expressed to be of the form $0/0$ through the transformation:
$$
\begin{align*}
f(x) - g(x) &= f(x)g(x) \cdot (\frac{1}{g(x)} - \frac{1}{f(x)}) \\
&= \frac{\frac{1}{g(x)} - \frac{1}{f(x)}}{\frac{1}{f(x)g(x)}}.
\end{align*}
$$
Applying this to
$$
L = \lim_{x \rightarrow 1} \big(\frac{x}{x-1} - \frac{1}{\log(x)}\big)
$$
we get that $L$ is equal to the following limit:
$$
\lim_{x \rightarrow 1} \frac{\log(x) - \frac{x-1}{x}}{\frac{x-1}{x} \log(x)}
=
\lim_{x \rightarrow 1} \frac{x\log(x)-(x-1)}{(x-1)\log(x)}
$$
In `SymPy` we have:
```{julia}
𝒇 = x*log(x) - (x-1)
𝒈 = (x-1)*log(x)
𝒇(1), 𝒈(1)
```
L'Hospital's rule applies to the form $0/0$, so we try:
```{julia}
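# keep 𝒇 and 𝒈 intact; store the derivatives under new names
𝒇′ = diff(𝒇, x)
𝒈′ = diff(𝒈, x)
𝒇′(1), 𝒈′(1)
```
Again, we get the indeterminate form $0/0$, so we try again with second derivatives:
```{julia}
# second derivatives of the original 𝒇 and 𝒈
𝒇′′ = diff(𝒇, x, x)
𝒈′′ = diff(𝒈, x, x)
𝒇′′(1), 𝒈′′(1)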
```
From this we see the limit is $1/2$, as could have been done directly:
```{julia}
limit(𝒇/𝒈, x=>1)
```
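As a sanity check that does not rely on symbolic computation, we can evaluate the original $\infty - \infty$ expression at points near $1$. This is a sketch; the name `diff_form` and the offsets are illustrative assumptions:

```{julia}
#| hold: true
# the original difference, of the form ∞ - ∞ at x = 1
diff_form(x) = x/(x - 1) - 1/log(x)

# approach 1 from both sides; the values should settle near 1/2
[diff_form(1 + h) for h in (0.1, 0.01, 0.001, -0.001, -0.01, -0.1)]
```

Offsets much smaller than these would run into floating point cancellation, as two large, nearly equal values are being subtracted.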
## The assumptions are necessary
##### Example: the limit existing is necessary
The following limit is *easily* seen by comparing terms of largest growth:
$$
1 = \lim_{x \rightarrow \infty} \frac{x - \sin(x)}{x}
$$
However, the limit of the ratio of the derivatives *does* not exist:
$$
\lim_{x \rightarrow \infty} \frac{1 - \cos(x)}{1},
$$
as the function just oscillates. This shows that L'Hospital's rule does not apply when the limit of the ratio of the derivatives does not exist.
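The contrast can be seen numerically. In this sketch (the function names and the sample points are illustrative assumptions), the first column is the original ratio at $x = 2\pi k$, while the second and third columns are the ratio of derivatives at $x = 2\pi k$ and $x = 2\pi k + \pi$:

```{julia}
#| hold: true
ratio_fg(x)   = (x - sin(x)) / x    # original ratio
ratio_dfdg(x) = (1 - cos(x)) / 1    # ratio of the derivatives

ks = 100:100:400
[ratio_fg.(2pi .* ks)  ratio_dfdg.(2pi .* ks)  ratio_dfdg.(2pi .* ks .+ pi)]
```

The original ratio stays near $1$, whereas the ratio of derivatives keeps returning to values near $0$ and near $2$, no matter how large $x$ becomes.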
##### Example: the assumptions matter
This example comes from the thesis of Gruntz to highlight possible issues when computer systems do simplifications.
Consider:
$$
\lim_{x \rightarrow \infty} \frac{1/2\sin(2x) +x}{\exp(\sin(x))\cdot(\cos(x)\sin(x)+x)}.
$$
If we apply L'Hospital's rule using simplification we have:
```{julia}
u(x) = 1//2*sin(2x) + x
v(x) = exp(sin(x))*(cos(x)*sin(x) + x)
up, vp = diff(u(x),x), diff(v(x),x)
limit(simplify(up/vp), x => oo)
```
However, this answer is incorrect. The reason is subtle. The simplification cancels a term of $\cos(x)$ that appears in the numerator and denominator. Before cancellation, `vp` has infinitely many zeros as $x$ approaches $\infty$, so L'Hospital's rule does not apply (the limit of the ratio won't exist, as every $2\pi$ the ratio is undefined, so the function is never eventually close to some $L$).
This ratio has no limit, as it oscillates, as confirmed by `SymPy`:
```{julia}
limit(u(x)/v(x), x=> oo)
```
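The oscillation can also be seen without symbolic tools. Since $\tfrac{1}{2}\sin(2x) = \sin(x)\cos(x)$, the numerator equals the factor $\cos(x)\sin(x) + x$ in the denominator, so the ratio is just $e^{-\sin(x)}$. The sketch below (function names and sample points are illustrative assumptions) evaluates the ratio where $\sin(x) = 1$ and where $\sin(x) = -1$:

```{julia}
#| hold: true
u_num(x) = 1/2 * sin(2x) + x
v_num(x) = exp(sin(x)) * (cos(x) * sin(x) + x)

# sample where sin(x) = 1 and where sin(x) = -1, for increasingly large x
xs_top    = [2pi * k + pi/2  for k in (10, 100, 1000)]
xs_bottom = [2pi * k + 3pi/2 for k in (10, 100, 1000)]
[u_num.(xs_top) ./ v_num.(xs_top)  u_num.(xs_bottom) ./ v_num.(xs_bottom)]
```

The first column sits near $1/e$ and the second near $e$, so the ratio cannot settle on a single value.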
## Questions
###### Question
This function $f(x) = \sin(5x)/x$ is *indeterminate* at $x=0$. What type?
```{julia}
#| echo: false
lh_choices = [
"``0/0``",
"``\\infty/\\infty``",
"``0^0``",
"``\\infty - \\infty``",
"``0 \\cdot \\infty``"
]
nothing
```
```{julia}
#| hold: true
#| echo: false
answ = 1
radioq(lh_choices, answ, keep_order=true)
```
###### Question
This function $f(x) = \sin(x)^{\sin(x)}$ is *indeterminate* at $x=0$. What type?
```{julia}
#| hold: true
#| echo: false
answ =3
radioq(lh_choices, answ, keep_order=true)
```
###### Question
This function $f(x) = (x-2)/(x^2 - 4)$ is *indeterminate* at $x=2$. What type?
```{julia}
#| hold: true
#| echo: false
answ = 1
radioq(lh_choices, answ, keep_order=true)
```
###### Question
This function $f(h) = (g(x+h) - g(x-h)) / (2h)$ ($g$ is continuous) is *indeterminate* at $h=0$. What type?
```{julia}
#| hold: true
#| echo: false
answ = 1
radioq(lh_choices, answ, keep_order=true)
```
###### Question
This function $f(x) = x \log(x)$ is *indeterminate* at $x=0$. What type?
```{julia}
#| hold: true
#| echo: false
answ = 5
radioq(lh_choices, answ, keep_order=true)
```
###### Question
Does L'Hospital's rule apply to this limit:
$$
\lim_{x \rightarrow \pi} \frac{\sin(\pi x)}{\pi x}.
$$
```{julia}
#| hold: true
#| echo: false
choices = [
"Yes. It is of the form ``0/0``",
"No. It is not indeterminate"
]
answ = 2
radioq(choices, answ)
```
###### Question
Use L'Hospital's rule to find the limit
$$
L = \lim_{x \rightarrow 0} \frac{4x - \sin(x)}{x}.
$$
What is $L$?
```{julia}
#| hold: true
#| echo: false
f(x) = (4x - sin(x))/x
L = float(N(limit(f, 0)))
numericq(L)
```
###### Question
Use L'Hospital's rule to find the limit
$$
L = \lim_{x \rightarrow 0} \frac{\sqrt{1+x} - 1}{x}.
$$
What is $L$?
```{julia}
#| hold: true
#| echo: false
f(x) = (sqrt(1+x) - 1)/x
L = float(N(limit(f, 0)))
numericq(L)
```
###### Question
Use L'Hospital's rule *one* or more times to find the limit
$$
L = \lim_{x \rightarrow 0} \frac{x - \sin(x)}{x^3}.
$$
What is $L$?
```{julia}
#| hold: true
#| echo: false
f(x) = (x - sin(x))/x^3
L = float(N(limit(f, 0)))
numericq(L)
```
###### Question
Use L'Hospital's rule *one* or more times to find the limit
$$
L = \lim_{x \rightarrow 0} \frac{1 - x^2/2 - \cos(x)}{x^3}.
$$
What is $L$?
```{julia}
#| hold: true
#| echo: false
f(x) = (1 - x^2/2 - cos(x))/x^3
L = float(N(limit(f, 0)))
numericq(L)
```
###### Question
Use L'Hospital's rule *one* or more times to find the limit
$$
L = \lim_{x \rightarrow \infty} \frac{\log(\log(x))}{\log(x)}.
$$
What is $L$?
```{julia}
#| hold: true
#| echo: false
f(x) = log(log(x))/log(x)
L = N(limit(f(x), x=> oo))
numericq(L)
```
###### Question
By using a common denominator to rewrite this expression, use L'Hospital's rule to find the limit
$$
L = \lim_{x \rightarrow 0} \frac{1}{x} - \frac{1}{\sin(x)}.
$$
What is $L$?
```{julia}
#| hold: true
#| echo: false
f(x) = 1/x - 1/sin(x)
L = float(N(limit(f, 0)))
numericq(L)
```
###### Question
Use L'Hospital's rule to find the limit
$$
L = \lim_{x \rightarrow \infty} \log(x)/x
$$
What is $L$?
```{julia}
#| hold: true
#| echo: false
L = float(N(limit(log(x)/x, x=>oo)))
numericq(L)
```
###### Question
Using L'Hospital's rule, does
$$
\lim_{x \rightarrow 0+} x^{\log(x)}
$$
exist?
Consider $x^{\log(x)} = e^{\log(x)\log(x)}$.
```{julia}
#| hold: true
#| echo: false
yesnoq(false)
```
###### Question
Using L'Hospital's rule, find the limit of
$$
\lim_{x \rightarrow 1} (2-x)^{\tan(\pi/2 \cdot x)}.
$$
(Hint: express as $e^{\tan(\pi/2 \cdot x) \cdot \log(2-x)}$ and take the limit of the resulting exponent.)
```{julia}
#| hold: true
#| echo: false
choices = [
"``e^{2/\\pi}``",
"``{2\\pi}``",
"``1``",
"``0``",
"It does not exist"
]
answ = 1
radioq(choices, answ)
```

View File

@@ -0,0 +1,890 @@
# Linearization
```{julia}
#| echo: false
import Logging
Logging.disable_logging(Logging.Info) # or e.g. Logging.Info
Logging.disable_logging(Logging.Warn)
import SymPy
function Base.show(io::IO, ::MIME"text/html", x::T) where {T <: SymPy.SymbolicObject}
println(io, "<span class=\"math-left-align\" style=\"padding-left: 4px; width:0; float:left;\"> ")
println(io, "\\[")
println(io, sympy.latex(x))
println(io, "\\]")
println(io, "</span>")
end
# hack to work around issue
import Markdown
import CalculusWithJulia
function CalculusWithJulia.WeaveSupport.ImageFile(d::Symbol, f::AbstractString, caption; kwargs...)
nm = joinpath("..", string(d), f)
u = "![$caption]($nm)"
Markdown.parse(u)
end
nothing
```
This section uses these add-on packages:
```{julia}
using CalculusWithJulia
using Plots
using SymPy
using TaylorSeries
using DualNumbers
```
```{julia}
#| echo: false
#| results: "hidden"
using CalculusWithJulia.WeaveSupport
const frontmatter = (
title = "Linearization",
description = "Calculus with Julia: Linearization",
tags = ["CalculusWithJulia", "derivatives", "linearization"],
);
nothing
```
---
The derivative of $f(x)$ has the interpretation as the slope of the tangent line. The tangent line is the line that best approximates the function at the point.
Using the point-slope form of a line, we see that the tangent line to the graph of $f(x)$ at $(c,f(c))$ is given by:
$$
y = f(c) + f'(c) \cdot (x - c).
$$
This is written as an equation, though we prefer to work with functions within `Julia`. Here we write such a function as an operator - it takes a function `f` and returns a function representing the tangent line.
```{julia}
#| eval: false
tangent(f, c) = x -> f(c) + f'(c) * (x - c)
```
(Recall, the `->` indicates that an anonymous function is being generated.)
This function along with the `f'` notation for automatic derivatives is defined in the `CalculusWithJulia` package.
We make some graphs with tangent lines:
```{julia}
#| hold: true
f(x) = x^2
plot(f, -3, 3)
plot!(tangent(f, -1))
plot!(tangent(f, 2))
```
The graph shows that near the point, the line and function are close, but this need not be the case away from the point. We can express this informally as
$$
f(x) \approx f(c) + f'(c) \cdot (x-c)
$$
with the understanding this applies for $x$ "close" to $c$.
Usually for the applications herein, instead of $x$ and $c$ the two points are $x+\Delta_x$ and $x$. This gives:
> *Linearization*: $\Delta_y = f(x +\Delta_x) - f(x) \approx f'(x) \Delta_x$, for small $\Delta_x$.
This section gives some implications of this fact and quantifies what "close" can mean.
##### Example
There are several approximations that are well known in physics, due to their widespread usage:
* That $\sin(x) \approx x$ around $x=0$:
```{julia}
#| hold: true
plot(sin, -pi/2, pi/2)
plot!(tangent(sin, 0))
```
Symbolically:
```{julia}
#| hold: true
@syms x
c = 0
f(x) = sin(x)
f(c) + diff(f(x),x)(c) * (x - c)
```
* That $\log(1 + x) \approx x$ around $x=0$:
```{julia}
#| hold: true
f(x) = log(1 + x)
plot(f, -1/2, 1/2)
plot!(tangent(f, 0))
```
Symbolically:
```{julia}
#| hold: true
@syms x
c = 0
f(x) = log(1 + x)
f(c) + diff(f(x),x)(c) * (x - c)
```
(The `log1p` function implements a more accurate version of this function when numeric values are needed.)
* That $1/(1-x) \approx 1 + x$ around $x=0$:
```{julia}
#| hold: true
f(x) = 1/(1-x)
plot(f, -1/2, 1/2)
plot!(tangent(f, 0))
```
Symbolically:
```{julia}
#| hold: true
@syms x
c = 0
f(x) = 1 / (1 - x)
f(c) + diff(f(x),x)(c) * (x - c)
```
* That $(1+x)^n \approx 1 + nx$ around $x = 0$. For example, with $n=5$:
```{julia}
#| hold: true
n = 5
f(x) = (1+x)^n # f'(0) = n = n(1+x)^(n-1) at x=0
plot(f, -1/2, 1/2)
plot!(tangent(f, 0))
```
Symbolically:
```{julia}
#| hold: true
@syms x, n::real
c = 0
f(x) = (1 + x)^n
f(c) + diff(f(x),x)(x=>c) * (x - c)
```
---
In each of these cases, a more complicated non-linear function is well approximated in a region of interest by a simple linear function.
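To put numbers on "well approximated", the following sketch compares each function with its tangent line at $x = 0.01$. It uses only base `Julia`: the helper names `slope_at` and `tangent_value` are hypothetical, and the central difference is a stand-in (an assumption) for the `f'` notation provided by the `CalculusWithJulia` package:

```{julia}
#| hold: true
# central-difference slope; a stand-in (an assumption) for the package's f' notation
slope_at(f, c; h=1e-6) = (f(c + h) - f(c - h)) / (2h)
tangent_value(f, c, x) = f(c) + slope_at(f, c) * (x - c)

x_near = 0.01
# pairs of (function value, tangent line value) for the four examples above
[(f(x_near), tangent_value(f, 0, x_near)) for f in (sin, x -> log(1 + x), x -> 1/(1 - x), x -> (1 + x)^5)]
```

In each pair the two values agree to a few decimal places, with the largest gap for $(1+x)^5$, the function whose graph bends the most near $0$.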
## Numeric approximations
```{julia}
#| hold: true
#| echo: false
f(x) = sin(x)
a, b = -1/4, pi/2
p = plot(f, a, b, legend=false);
plot!(p, x->x, a, b);
plot!(p, [0,1,1], [0, 0, 1], color=:brown);
plot!(p, [1,1], [0, sin(1)], color=:green, linewidth=4);
annotate!(p, collect(zip([1/2, 1+.075, 1/2-1/8], [.05, sin(1)/2, .75], ["Δx", "Δy", "m=dy/dx"])));
p
```
The plot shows the tangent line with slope $dy/dx$ and the actual change in $y$, $\Delta y$, for some specified $\Delta x$. The small gap above the sine curve is the error incurred were the value of the sine approximated using the drawn tangent line. We can see that approximating the value of $\Delta y = \sin(c+\Delta x) - \sin(c)$ with the often easier to compute $(dy/dx) \cdot \Delta x = f'(c)\Delta x$ is not going to be too far off, provided $\Delta x$ is small enough.
This approximation is known as linearization. It can be used both in theoretical computations and in practical applications. To see how effective it is, we look at some examples.
##### Example
If $f(x) = \sin(x)$, $c=0$ and $\Delta x= 0.1$ then the actual change in the function values, $\Delta y$, and its approximation, $f'(c)\Delta x$, are:
```{julia}
f(x) = sin(x)
c, deltax = 0, 0.1
f(c + deltax) - f(c), f'(c) * deltax
```
The values are pretty close. But what is $0.1$ radians? Let's use degrees. Suppose we have $\Delta x = 10^\circ$:
```{julia}
deltax⁰ = 10*pi/180
actual = f(c + deltax⁰) - f(c)
approx = f'(c) * deltax⁰
actual, approx
```
They agree until the third decimal value. The *percentage error* is just about $1/2$ of a percent:
```{julia}
(approx - actual) / actual * 100
```
### Relative error or relative change
The relative error is defined by
$$
\big| \frac{\text{actual} - \text{approximate}}{\text{actual}} \big|.
$$
However, typically with linearization, we talk about the *relative change*, not relative error, as the denominator is easier to compute. This is
$$
\frac{f(x + \Delta_x) - f(x)}{f(x)} = \frac{\Delta_y}{f(x)} \approx
\frac{f'(x) \cdot \Delta_x}{f(x)}
$$
The *percentage change* multiplies by $100$.
##### Example
What is the relative change in surface area of a sphere if the radius changes from $r$ to $r + dr$?
We have $S = 4\pi r^2$ so the approximate relative change, $dS/S$, is given, using the derivative $dS/dr = 8\pi r$, by
$$
\frac{8\pi\cdot r\cdot dr}{4\pi r^2} = \frac{2\,dr}{r}.
$$
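To see how close the approximate relative change is, this sketch compares the exact relative change in $S$ with the ratio $S'(r)\,dr/S(r)$. The values $r = 10$ and $dr = 0.1$, along with the function names, are illustrative assumptions:

```{julia}
#| hold: true
S_sphere(r)  = 4 * pi * r^2     # surface area
dS_sphere(r) = 8 * pi * r       # its derivative, dS/dr

r0, dr0 = 10.0, 0.1
exact_rel  = (S_sphere(r0 + dr0) - S_sphere(r0)) / S_sphere(r0)
approx_rel = dS_sphere(r0) * dr0 / S_sphere(r0)
exact_rel, approx_rel
```

A one percent change in the radius leads to roughly a two percent change in the surface area.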
##### Example
We are traveling $60$ miles. At $60$ miles an hour, we will take $60$ minutes (or one hour). How long will it take at $70$ miles an hour? (Assume you can't divide, but, instead, can only multiply!)
Well the answer is $60/70$ hours or $60/70 \cdot 60$ minutes. But we can't divide, so we turn this into a multiplication problem via some algebra:
$$
\frac{60}{70} = \frac{60}{60 + 10} = \frac{1}{1 + 10/60} = \frac{1}{1 + 1/6}.
$$
Okay, so far no calculator was needed. We wrote $70 = 60 + 10$, as we know that $60/60$ is just $1$. This almost gets us there. If we really don't want to divide, we can get an answer by using the tangent line approximation for $1/(1+x)$ around $x=0$. This is $1/(1+x) \approx 1 - x$. (You can check by finding that $f'(0) = -1$.) Thus, our answer is approximately $5/6$ of an hour or 50 minutes.
How much in error are we?
```{julia}
abs(50 - 60/70*60) / (60/70*60) * 100
```
That's about $3$ percent. Not bad considering we could have done all the above in our head while driving without taking our eyes off the road to use the calculator on our phone for a division.
##### Example
A $10$cm by $10$cm by $10$cm cube will contain $1$ liter ($1000$cm$^3$). In manufacturing such a cube, the side lengths are actually $10.1$ cm. What will be the volume in liters? Compute this with a linear approximation to $(10.1)^3$.
Here $f(x) = x^3$ and we are asked to approximate $f(10.1)$. Letting $c=10$, we have:
$$
f(c + \Delta) \approx f(c) + f'(c) \cdot \Delta = 1000 + f'(c) \cdot (0.1)
$$
Computing the derivative can be done easily. We get for our answer:
```{julia}
fp(x) = 3*x^2
c₀, Delta = 10, 0.1
approx₀ = 1000 + fp(c₀) * Delta
```
This is a relative error as a percent of:
```{julia}
actual₀ = 10.1^3
(actual₀ - approx₀)/actual₀ * 100
```
The manufacturer may be interested instead in comparing the volume of the actual object to the $1$ liter target. They might use the approximate value for this comparison, which would yield:
```{julia}
(1000 - approx₀)/approx₀ * 100
```
This is off by about $3$ percent. Not so bad for some applications, devastating for others.
##### Example: Eratosthenes and the circumference of the earth
[Eratosthenes](https://en.wikipedia.org/wiki/Eratosthenes) is said to have been the first person to estimate the radius (or by relation the circumference) of the earth. The basic idea is based on the difference of shadows cast by the sun. Suppose Eratosthenes sized the circumference as $252,000$ *stadia*. Taking $1$ stadia as $160$ meters and the actual radius of the earth as $6378.137$ kilometers, we can convert to see that Eratosthenes estimated the radius as $6417$.
If Eratosthenes were to have estimated the volume of a spherical earth, what would be his approximate percentage change between his estimate and the actual?
Using $V = 4/3 \pi r^3$ we get $V' = 4\pi r^2$:
```{julia}
rₑ = 6417
rₐ = 6378.137
Δᵣ = rₑ - rₐ
Vₛ(r) = 4/3 * pi * r^3
Δᵥ = Vₛ'(rₑ) * Δᵣ
Δᵥ / Vₛ(rₑ) * 100
```
##### Example: a simple pendulum
A *simple* pendulum is comprised of a point-mass "bob" on a massless, rigid "rod" of length $l$. The rod swings back and forth making an angle $\theta$ with the vertical. At rest $\theta=0$; here we have $\theta$ swinging with $\lvert\theta\rvert \leq \theta_0$ for some $\theta_0$.
According to [Wikipedia](http://tinyurl.com/yz5sz7e) - and many introductory physics books - while swinging, the angle $\theta$ varies with time following this equation:
$$
\theta''(t) + \frac{g}{l} \sin(\theta(t)) = 0.
$$
That is, the second derivative of $\theta$ is proportional to the sine of $\theta$ where the proportionality constant involves $g$ from gravity and the length of the "rod."
This would be much easier if the second derivative were proportional to the angle $\theta$ and not its sine.
[Huygens](http://en.wikipedia.org/wiki/Christiaan_Huygens) used the approximation of $\sin(x) \approx x$, noted above, to say that when the angle is not too big, we have the pendulum's swing obeying $\theta''(t) = -g/l \cdot \theta(t)$. Without getting too involved in why, we can verify by taking two derivatives that $\theta_0\sin(\sqrt{g/l}\cdot t)$ will be a solution to this modified equation.
With this solution, the motion is periodic with constant amplitude (assuming frictionless behaviour), as the sine function is. More surprisingly, the period is found from $T = 2\pi/(\sqrt{g/l}) = 2\pi \sqrt{l/g}$. It depends on $l$ - longer "rods" take more time to swing back and forth - but does not depend on how wide the pendulum swings (provided $\theta_0$ is not so big the approximation of $\sin(x) \approx x$ fails). This latter fact may be surprising, though not to Galileo who discovered it.
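The claim that $\theta_0\sin(\sqrt{g/l}\cdot t)$ solves the linearized equation can be checked numerically. In this sketch the values of $g$, $l$, and $\theta_0$ are illustrative assumptions, as is the use of a central second difference (the hypothetical helper `second_diff`) in place of an exact second derivative:

```{julia}
#| hold: true
g_acc, len_rod, theta0 = 9.81, 1.0, 0.1      # illustrative values (assumptions)
omega = sqrt(g_acc / len_rod)
theta_lin(t) = theta0 * sin(omega * t)       # proposed solution of θ'' = -(g/l)⋅θ

# second derivative by a central second difference (the step size is an assumption)
second_diff(f, t; h=1e-4) = (f(t + h) - 2f(t) + f(t - h)) / h^2

t_check = 0.3
residual = second_diff(theta_lin, t_check) + (g_acc / len_rod) * theta_lin(t_check)
period = 2pi * sqrt(len_rod / g_acc)
residual, period
```

The residual is close to $0$, up to the error of the finite difference, and the computed period involves only $l$ and $g$, not $\theta_0$.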
## Differentials
The Leibniz notation for a derivative is $dy/dx$ indicating the change in $y$ as $x$ changes. It proves convenient to decouple this using *differentials* $dx$ and $dy$. What do these notations mean? They measure change along the tangent line in the same way $\Delta_x$ and $\Delta_y$ measure change for the function. The differential $dy$ depends on both $x$ and $dx$, it being defined by $dy=f'(x)dx$. As tangent lines locally represent a function, $dy$ and $dx$ are often associated with an *infinitesimal* difference.
Taking $dx = \Delta_x$, as in the previous graphic, we can compare $dy$, the change along the tangent line given by $dy/dx \cdot dx$, and $\Delta_y$, the change along the function given by $f(x + \Delta_x) - f(x)$. The linear approximation, $f(x + \Delta_x) - f(x)\approx f'(x)dx$, says that
$$
\Delta_y \approx dy; \quad \text{ when } \Delta_x = dx
$$
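A small table makes the comparison of $\Delta_y$ and $dy$ concrete. This sketch uses $f(x) = \sqrt{x}$ at $x = 4$ with a hand-written derivative; the function, the base point, and the values of $dx$ are all illustrative assumptions. Each entry is $(dx, \Delta_y, dy)$:

```{julia}
#| hold: true
f_root(x)  = sqrt(x)
fp_root(x) = 1 / (2 * sqrt(x))          # derivative, written out by hand

x_base = 4.0
# (dx, change along the function, change along the tangent line)
[(dx, f_root(x_base + dx) - f_root(x_base), fp_root(x_base) * dx) for dx in (0.5, 0.1, 0.01)]
```

As $dx$ shrinks, the gap between $\Delta_y$ and $dy$ shrinks much faster than $dx$ itself, which the next section quantifies.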
## The error in approximation
How good is the approximation? Graphically we can see it is pretty good for the graphs we choose, but are there graphs out there for which the approximation is not so good? Of course. However, we can say this (the [Lagrange](http://en.wikipedia.org/wiki/Taylor%27s_theorem) form of a more general Taylor remainder theorem):
> Let $f(x)$ be twice differentiable on $I=(a,b)$ and continuous on $[a,b]$, and let $a < c < b$. Then for any $x$ in $I$, there exists some value $\xi$ between $c$ and $x$ such that $f(x) = f(c) + f'(c)(x-c) + (f''(\xi)/2)\cdot(x-c)^2$.
That is, the error is basically a constant depending on the concavity of $f$ times a quadratic function centered at $c$.
For $\sin(x)$ at $c=0$ we get $\lvert\sin(x) - x\rvert = \lvert-\sin(\xi)\cdot x^2/2\rvert$. Since $\lvert\sin(\xi)\rvert \leq 1$, we must have this bound: $\lvert\sin(x) - x\rvert \leq x^2/2$.
Can we verify? Let's do so graphically:
```{julia}
#| hold: true
h(x) = abs(sin(x) - x)
g(x) = x^2/2
plot(h, -2, 2, label="h")
plot!(g, -2, 2, label="g")
```
The graph shows a tight bound near $0$ and then a bound over this viewing window.
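The same bound can be spot-checked numerically on a grid. This is only a sketch, with the grid (and the name `xs_grid`) an illustrative assumption; a check at finitely many points supports the inequality but does not prove it:

```{julia}
#| hold: true
# check |sin(x) - x| <= x^2/2 at each grid point of [-2, 2]
xs_grid = range(-2, 2, length=401)
all(abs(sin(x) - x) <= x^2 / 2 for x in xs_grid)
```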
Similarly, for $f(x) = \log(1 + x)$ we have the following at $c=0$:
$$
f'(x) = 1/(1+x), \quad f''(x) = -1/(1+x)^2.
$$
So, as $f(c)=0$ and $f'(c) = 1$, we have
$$
\lvert f(x) - x\rvert \leq \lvert f''(\xi)\rvert \cdot \frac{x^2}{2}
$$
We see that $\lvert f''(x)\rvert$ is decreasing for $x > -1$. So if $-1 < x < c$ we have
$$
\lvert f(x) - x\rvert \leq \lvert f''(x)\rvert \cdot \frac{x^2}{2} = \frac{x^2}{2(1+x)^2}.
$$
And for $c=0 < x$, we have
$$
\lvert f(x) - x\rvert \leq \lvert f''(0)\rvert \cdot \frac{x^2}{2} = x^2/2.
$$
Plotting we verify the bound on $|\log(1+x)-x|$:
```{julia}
#| hold: true
h(x) = abs(log(1+x) - x)
g(x) = x < 0 ? x^2/(2*(1+x)^2) : x^2/2
plot(h, -0.5, 2, label="h")
plot!(g, -0.5, 2, label="g")
```
Again, we see the very close bound near $0$, which widens at the edges of the viewing window.
### Why is the remainder term as it is?
To see formally why the remainder is as it is, we recall the mean value theorem in the extended form of Cauchy. Suppose $c=0$, $x > 0$, and let $h(x) = f(x) - (f(0) + f'(0) x)$ and $g(x) = x^2$. Then we have that there exists an $e$ with $0 < e < x$ such that
$$
\text{error} = h(x) - h(0) = (g(x) - g(0)) \frac{h'(e)}{g'(e)} = x^2 \cdot \frac{1}{2} \cdot \frac{f'(e) - f'(0)}{e} =
x^2 \cdot \frac{1}{2} \cdot f''(\xi).
$$
The value of $\xi$, from the mean value theorem applied to $f'(x)$, satisfies $0 < \xi < e < x$, so is in $[0,x].$
### The big (and small) "oh"
`SymPy` can find the tangent line expression as a special case of its `series` function (which implements [Taylor series](../taylor_series_polynomials.html)). The `series` function needs an expression to approximate; a variable specified, as there may be parameters in the expression; a value $c$ for *where* the expansion is taken, with default $0$; and a number of terms, for this example $2$ for a constant and linear term. (There is also an optional `dir` argument for one-sided expansions.)
Here we see the answer provided for $e^{\sin(x)}$:
```{julia}
@syms x
series(exp(sin(x)), x, 0, 2)
```
The expression $1 + x$ comes from the fact that `exp(sin(0))` is $1$, and the derivative `exp(sin(0)) * cos(0)` is *also* $1$. But what is the $\mathcal{O}(x^2)$?
We know the answer is *precisely* $f''(\xi)/2 \cdot x^2$ for some $\xi$. But if we are only concerned about the scale as $x$ goes to zero, then, when $f''$ is continuous, the error divided by $x^2$ goes to some finite value ($f''(0)/2$). More generally, if the error divided by $x^2$ is *bounded* as $x$ goes to $0$, then we say the error is "big oh" of $x^2$.
The [big](http://en.wikipedia.org/wiki/Big_O_notation) "oh" notation, $f(x) = \mathcal{O}(g(x))$, says that the ratio $f(x)/g(x)$ is bounded as $x$ goes to $0$ (or some other value $c$, depending on the context). A little "oh" (e.g., $f(x) = o(g(x))$) would mean that the limit of $f(x)/g(x)$ would be $0$, as $x\rightarrow 0$, a much stronger assertion.
Big "oh" and little "oh" give us a sense of how good an approximation is without being bogged down in the details of the exact value. As such they are useful guides in focusing on what is primary and what is secondary. Applying this to our case, we have this rough form of the tangent line approximation valid for functions having a continuous second derivative at $c$:
$$
f(x) = f(c) + f'(c)(x-c) + \mathcal{O}((x-c)^2).
$$
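For $e^{\sin(x)}$ at $c=0$ we can watch the scaled error settle down. Here $f''(x) = (\cos^2(x) - \sin(x))\,e^{\sin(x)}$, so $f''(0)/2 = 1/2$. The sketch below, with hypothetical names `f_es` and `scaled_err` and illustrative sample points, divides the tangent line error by $x^2$:

```{julia}
#| hold: true
f_es(x) = exp(sin(x))
# error of the tangent line 1 + x at 0, scaled by x^2
scaled_err(x) = (f_es(x) - (1 + x)) / x^2

[scaled_err(x) for x in (0.1, 0.01, 0.001)]
```

The values stay bounded, in fact approaching $1/2$, which is what saying the error is $\mathcal{O}(x^2)$ summarizes.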
##### Example: the algebra of tangent line approximations
Suppose $f(x)$ and $g(x)$ are represented by their tangent lines about $c$, respectively:
$$
\begin{align*}
f(x) &= f(c) + f'(c)(x-c) + \mathcal{O}((x-c)^2), \\
g(x) &= g(c) + g'(c)(x-c) + \mathcal{O}((x-c)^2).
\end{align*}
$$
Consider the sum, after rearranging we have:
$$
\begin{align*}
f(x) + g(x) &= \left(f(c) + f'(c)(x-c) + \mathcal{O}((x-c)^2)\right) + \left(g(c) + g'(c)(x-c) + \mathcal{O}((x-c)^2)\right)\\
&= \left(f(c) + g(c)\right) + \left(f'(c)+g'(c)\right)(x-c) + \mathcal{O}((x-c)^2).
\end{align*}
$$
The two big "oh" terms become just one, as the sum of a constant times $(x-c)^2$ plus a constant times $(x-c)^2$ is just some other constant times $(x-c)^2$. What we can read off from this is that the term multiplying $(x-c)$ is just the derivative of $f(x) + g(x)$ (from the sum rule), so this too is a tangent line approximation.
Is it a coincidence that a basic algebraic operation with tangent line approximations produces a tangent line approximation? Let's try multiplication:
$$
\begin{align*}
f(x) \cdot g(x) &= [f(c) + f'(c)(x-c) + \mathcal{O}((x-c)^2)] \cdot [g(c) + g'(c)(x-c) + \mathcal{O}((x-c)^2)]\\
&= [f(c) + f'(c)(x-c)] \cdot [g(c) + g'(c)(x-c)] + [f(c) + f'(c)(x-c)] \cdot \mathcal{O}((x-c)^2) + [g(c) + g'(c)(x-c)] \cdot \mathcal{O}((x-c)^2) + [\mathcal{O}((x-c)^2)]^2\\
&= [f(c) + f'(c)(x-c)] \cdot [g(c) + g'(c)(x-c)] + \mathcal{O}((x-c)^2)\\
&= f(c) \cdot g(c) + [f'(c)\cdot g(c) + f(c)\cdot g'(c)] \cdot (x-c) + [f'(c)\cdot g'(c) \cdot (x-c)^2 + \mathcal{O}((x-c)^2)] \\
&= f(c) \cdot g(c) + [f'(c)\cdot g(c) + f(c)\cdot g'(c)] \cdot (x-c) + \mathcal{O}((x-c)^2)
\end{align*}
$$
The big "oh" notation just sweeps up many things including any products of it *and* the term $f'(c)\cdot g'(c) \cdot (x-c)^2$. Again, we see from the product rule that this is just a tangent line approximation for $f(x) \cdot g(x)$.
The basic mathematical operations involving tangent lines can be computed just using the tangent lines when the desired accuracy is at the tangent line level. This is even true for composition, though there the outer and inner functions may have different "$c$"s.
Knowing this can simplify the task of finding tangent line approximations of compound expressions.
For example, suppose we know that at $c=0$ we have these formulas, where $a \approx b$ is a shorthand for the more formal $a=b + \mathcal{O}(x^2)$:
$$
\sin(x) \approx x, \quad e^x \approx 1 + x, \quad \text{and}\quad 1/(1+x) \approx 1 - x.
$$
Then we can immediately see these tangent line approximations about $x=0$:
$$
e^x \cdot \sin(x) \approx (1+x) \cdot x = x + x^2 \approx x,
$$
and
$$
\frac{\sin(x)}{e^x} \approx \frac{x}{1 + x} \approx x \cdot(1-x) = x-x^2 \approx x.
$$
Since $\sin(0) = 0$, we can use these to find the tangent line approximation of
$$
e^{\sin(x)} \approx e^x \approx 1 + x.
$$
Note that $\sin(\exp(x))$ is approximately $\sin(1+x)$ but not approximately $1+x$, as the expansion for $\sin$ about $1$ is not simply $x$.
### The TaylorSeries package
The `TaylorSeries` package will do these calculations in a manner similar to how `SymPy` transforms a function and a symbolic variable into a symbolic expression.
For example, we have
```{julia}
t = Taylor1(Float64, 1)
```
The number type and the order are specified to the constructor. Linearization is order $1$; other orders will be discussed later. This variable can now be composed with mathematical functions and the linearization of the function will be returned:
```{julia}
sin(t), exp(t), 1/(1+t)
```
```{julia}
sin(t)/exp(t), exp(sin(t))
```
##### Example: Automatic differentiation
Automatic differentiation (forward mode) essentially uses this technique. A "dual" is introduced which has terms $a +b\epsilon$ where $\epsilon^2 = 0$. The $\epsilon$ is like $x$ in a linear expansion, so the `a` coefficient encodes the value and the `b` coefficient reflects the derivative at the value. Numbers are treated like a variable, so their "b coefficient" is a `1`. Here then is how `0` is encoded:
```{julia}
Dual(0, 1)
```
Then what is $\sin(x)$? It should reflect both $(\sin(0), \cos(0))$, the latter being the derivative of $\sin$. We can see this is *almost* what is computed behind the scenes through:
```{julia}
#| hold: true
x = Dual(0, 1)
@code_lowered sin(x)
```
This output of `@code_lowered` can be confusing, but this simple case needn't be. Working from the end we see an assignment to a variable named `%7` of `Dual(%3, %6)`. The value of `%3` is `sin(x)` where `x` is the value `0` above. The value of `%6` is `cos(x)` *times* the value `1` above (the `xp`), which reflects the *chain* rule being used. (The derivative of `sin(u)` is `cos(u)*du`.) So this dual number encodes both the function value at `0` and the derivative of the function at `0`.
Similarly, we can see what happens to `log(x)` at `1` (encoded by `Dual(1,1)`):
```{julia}
#| hold: true
x = Dual(1, 1)
@code_lowered log(x)
```
We can see the derivative again reflects the chain rule, it being given by `1/x * xp` where `xp` acts like `dx` (from assignments `%5` and `%4`). Comparing the two outputs, we see only the assignment to `%4` differs, it reflecting the derivative of the function.
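To make the mechanism explicit, here is a toy version of the idea written from scratch. It is only a sketch under stated assumptions: the type `ToyDual` and its field names are hypothetical, only three rules are defined, and this is not how the `DualNumbers` package is actually implemented:

```{julia}
# A toy dual number a + b⋅ϵ with ϵ^2 = 0; `ToyDual` is a hypothetical name
struct ToyDual
    val::Float64   # function value
    der::Float64   # derivative value
end

# each rule pairs the value with the chain rule for the derivative
Base.sin(d::ToyDual) = ToyDual(sin(d.val), cos(d.val) * d.der)
Base.log(d::ToyDual) = ToyDual(log(d.val), (1 / d.val) * d.der)
# the product rule
Base.:*(d::ToyDual, e::ToyDual) = ToyDual(d.val * e.val, d.der * e.val + d.val * e.der)

sin(ToyDual(0, 1)), log(ToyDual(1, 1)), sin(ToyDual(2, 1)) * log(ToyDual(2, 1))
```

The first two results carry $(\sin(0), \cos(0))$ and $(\log(1), 1/1)$. The last carries the value and derivative of $\sin(x)\log(x)$ at $x=2$, with the derivative assembled from the rules above rather than from a symbolic formula.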
## Questions
###### Question
What is the right linear approximation for $\sqrt{1 + x}$ near $0$?
```{julia}
#| hold: true
#| echo: false
choices = [
"``1 + 1/2``",
"``1 + x^{1/2}``",
"``1 + (1/2) \\cdot x``",
"``1 - (1/2) \\cdot x``"]
answ = 3
radioq(choices, answ)
```
###### Question
What is the right linear approximation for $(1 + x)^k$ near $0$?
```{julia}
#| hold: true
#| echo: false
choices = [
"``1 + k``",
"``1 + x^k``",
"``1 + k \\cdot x``",
"``1 - k \\cdot x``"]
answ = 3
radioq(choices, answ)
```
###### Question
What is the right linear approximation for $\cos(\sin(x))$ near $0$?
```{julia}
#| hold: true
#| echo: false
choices = [
"``1``",
"``1 + x``",
"``x``",
"``1 - x^2/2``"
]
answ = 1
radioq(choices, answ)
```
###### Question
What is the right linear approximation for $\tan(x)$ near $0$?
```{julia}
#| hold: true
#| echo: false
choices = [
"``1``",
"``x``",
"``1 + x``",
"``1 - x``"
]
answ = 2
radioq(choices, answ)
```
###### Question
What is the right linear approximation of $\sqrt{25 + x}$ near $x=0$?
```{julia}
#| hold: true
#| echo: false
choices = [
"``5 \\cdot (1 + (1/2) \\cdot (x/25))``",
"``1 - (1/2) \\cdot x``",
"``1 + x``",
"``25``"
]
answ = 1
radioq(choices, answ)
```
###### Question
Let $f(x) = \sqrt{x}$. Find the actual error in approximating $f(26)$ by the value of the tangent line at $(25, f(25))$ at $x=26$.
```{julia}
#| hold: true
#| echo: false
tgent(x) = 5 + x/10
answ = tgent(1) - sqrt(26)
numericq(answ)
```
###### Question
An estimate of some quantity was $12.34$; the actual value was $12$. What was the *percentage error*?
```{julia}
#| hold: true
#| echo: false
est = 12.34
act = 12.0
answ = (est -act)/act * 100
numericq(answ)
```
###### Question
Find the percentage error in estimating $\sin(5^\circ)$ by $5 \pi/180$.
```{julia}
#| hold: true
#| echo: false
tl(x) = x
x0 = 5 * pi/180
est = x0
act = sin(x0)
answ = (est -act)/act * 100
numericq(answ)
```
###### Question
The side length of a square is measured roughly to be $2.0$ cm. The actual length is $2.2$ cm. What is the difference in area (in absolute value) as *estimated* by a tangent line approximation?
```{julia}
#| hold: true
#| echo: false
tl(x) = 4 + 4x
answ = tl(.2) - 4
numericq(abs(answ))
```
###### Question
The [Birthday problem](https://en.wikipedia.org/wiki/Birthday_problem) computes the probability that, under some assumptions, no two people in a group of $n$ share a birthday. Without trying to spoil the problem, we focus on the calculus-specific part of the problem below:
$$
\begin{align*}
p
&= \frac{365 \cdot 364 \cdot \cdots (365-n+1)}{365^n} \\
&= \frac{365(1 - 0/365) \cdot 365(1 - 1/365) \cdot 365(1-2/365) \cdot \cdots \cdot 365(1-(n-1)/365)}{365^n}\\
&= (1 - \frac{0}{365})\cdot(1 -\frac{1}{365})\cdot \cdots \cdot (1-\frac{n-1}{365}).
\end{align*}
$$
Taking logarithms, we have $\log(p)$ is
$$
\log(1 - \frac{0}{365}) + \log(1 -\frac{1}{365})+ \cdots + \log(1-\frac{n-1}{365}).
$$
Now, use the tangent line approximation for $\log(1 - x)$ and the sum formula for $0 + 1 + 2 + \dots + (n-1)$ to simplify the value of $\log(p)$:
```{julia}
#| hold: true
#| echo: false
choices = ["``-n(n-1)/2/365``",
"``-n(n-1)/2\\cdot 365``",
"``-n^2/(2\\cdot 365)``",
"``-n^2 / 2 \\cdot 365``"]
radioq(choices, 1, keep_order=true)
```
If $n = 10$, what is the approximation for $p$ (not $\log(p)$)?
```{julia}
#| hold: true
#| echo: false
n=10
val = exp(-n*(n-1)/2/365)
numericq(val)
```
If $n=100$, what is the approximation for $p$ (not $\log(p)$)?
```{julia}
#| hold: true
#| echo: false
n=100
val = exp(-n*(n-1)/2/365)
numericq(val, 1e-2)
```


@@ -0,0 +1,23 @@
// https://jsxgraph.uni-bayreuth.de/wiki/index.php?title=Mean_Value_Theorem
var board = JXG.JSXGraph.initBoard('jsxgraph', {boundingbox: [-5, 10, 7, -6], axis:true});
board.suspendUpdate();
var p = [];
p[0] = board.create('point', [-1,-2], {size:2});
p[1] = board.create('point', [6,5], {size:2});
p[2] = board.create('point', [-0.5,1], {size:2});
p[3] = board.create('point', [3,3], {size:2});
var f = JXG.Math.Numerics.lagrangePolynomial(p);
var graph = board.create('functiongraph', [f,-10, 10]);
var g = function(x) {
return JXG.Math.Numerics.D(f)(x)-(p[1].Y()-p[0].Y())/(p[1].X()-p[0].X());
};
var r = board.create('glider', [
function() { return JXG.Math.Numerics.root(g,(p[0].X()+p[1].X())*0.5); },
function() { return f(JXG.Math.Numerics.root(g,(p[0].X()+p[1].X())*0.5)); },
graph], {name:' ',size:4,fixed:true});
board.create('tangent', [r], {strokeColor:'#ff0000'});
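// dashed secant line through the first two points (declared with var to avoid an implicit global)
var line = board.create('line',[p[0],p[1]],{strokeColor:'#ff0000',dash:1});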
board.unsuspendUpdate();


@@ -0,0 +1,704 @@
# The mean value theorem for differentiable functions.
```{julia}
#| echo: false
import Logging
Logging.disable_logging(Logging.Info) # or e.g. Logging.Info
Logging.disable_logging(Logging.Warn)
import SymPy
function Base.show(io::IO, ::MIME"text/html", x::T) where {T <: SymPy.SymbolicObject}
println(io, "<span class=\"math-left-align\" style=\"padding-left: 4px; width:0; float:left;\"> ")
println(io, "\\[")
println(io, sympy.latex(x))
println(io, "\\]")
println(io, "</span>")
end
# hack to work around issue
import Markdown
import CalculusWithJulia
function CalculusWithJulia.WeaveSupport.ImageFile(d::Symbol, f::AbstractString, caption; kwargs...)
nm = joinpath("..", string(d), f)
u = "![$caption]($nm)"
Markdown.parse(u)
end
nothing
```
This section uses these add-on packages:
```{julia}
using CalculusWithJulia
using Plots
using Roots
```
```{julia}
#| echo: false
#| results: "hidden"
using CalculusWithJulia.WeaveSupport
using Printf
using SymPy
fig_size = (800, 600)
const frontmatter = (
title = "The mean value theorem for differentiable functions.",
description = "Calculus with Julia: The mean value theorem for differentiable functions.",
tags = ["CalculusWithJulia", "derivatives", "the mean value theorem for differentiable functions."],
);
nothing
```
---
A function is *continuous* at $c$ if $f(c+h) - f(c) \rightarrow 0$ as $h$ goes to $0$. We can write that as $f(c+h) - f(c) = \epsilon_h$, with $\epsilon_h$ denoting a function going to $0$ as $h \rightarrow 0$. With this notion, differentiability could be written as $f(c+h) - f(c) - f'(c)h = \epsilon_h \cdot h$. This is clearly a more demanding requirement than mere continuity at $c$.
We defined a function to be *continuous* on an interval $I=(a,b)$ if it was continuous at each point $c$ in $I$. Similarly, we define a function to be *differentiable* on the interval $I$ if it is differentiable at each point $c$ in $I$.
This section looks at properties of differentiable functions. As differentiability is a more stringent condition than continuity, we can expect it to have more consequences.
## Differentiable is more restrictive than continuous.
Let $f$ be a differentiable function on $I=(a,b)$. We see that $f(c+h) - f(c) = f'(c)h + \epsilon_h\cdot h = h(f'(c) + \epsilon_h)$. The right hand side will clearly go to $0$ as $h\rightarrow 0$, so $f$ will be continuous. In short:
> A differentiable function on $I=(a,b)$ is continuous on $I$.
Is it possible that all continuous functions are differentiable?
The fact that the derivative is related to the tangent line's slope might give an indication that this won't be the case - we just need a function which is continuous but has a point with no tangent line. The usual suspect is $f(x) = \lvert x\rvert$ at $0$.
```{julia}
#| hold: true
f(x) = abs(x)
plot(f, -1,1)
```
We can see formally that the secant line expression will not have a limit when $c=0$ (the left limit is $-1$, the right limit $1$). But more insight is gained by looking at the shape of the graph. At the origin, the graph is always vee-shaped, no matter how closely we zoom in. There is no linear function that approximates this function well. The function is just not smooth enough, as it has a kink.
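The one-sided limits just mentioned can be checked numerically. This is a minimal sketch; the helper name `secant_slope` and the step sizes are our own choices, not part of any package:

```{julia}
# slope of the secant line of `u` between c and c + h (helper name is illustrative)
secant_slope(u, c, h) = (u(c + h) - u(c)) / h

hs = [0.1, 0.01, 0.001]
right_slopes = [secant_slope(abs, 0, h) for h in hs]    # h > 0
left_slopes  = [secant_slope(abs, 0, -h) for h in hs]   # h < 0
right_slopes, left_slopes
```

The slopes from the right and from the left never approach a common value, which is why no derivative exists at $0$.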
There are other functions that have kinks. These are often associated with powers. For example, at $x=0$ this function will not have a derivative:
```{julia}
#| hold: true
f(x) = (x^2)^(1/3)
plot(f, -1, 1)
```
Other functions have tangent lines that become vertical. The natural slope would be $\infty$, but this isn't a limiting answer (except in the extended sense we don't apply to the definition of derivatives). A candidate for this case is the cube root function:
```{julia}
plot(cbrt, -1, 1)
```
The derivative at $0$ would need to be $+\infty$ to match the graph. This is implied by the formula for the derivative from the power rule: $f'(x) = 1/3 \cdot x^{-2/3}$, which has a vertical asymptote at $x=0$.
:::{.callout-note}
## Note
The `cbrt` function is used above, instead of `f(x) = x^(1/3)`, as the latter is not defined for negative `x`. Though it can be for the exact power `1/3`, it can't be for an exact power like `1/2`. This means the value of the argument is important in determining the type of the output - and not just the type of the argument. Having type-stable functions is part of the magic to making `Julia` run fast, so `x^c` is not defined for negative `x` and most floating point exponents.
:::
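The difference between `cbrt` and a floating point power can be seen directly. A small sketch (the variable names are ours), in which the error is caught rather than allowed to stop the computation:

```{julia}
# `cbrt` is defined for negative arguments
cbrt_value = cbrt(-8.0)

# a floating point power of a negative number throws a `DomainError`
power_outcome = try
    (-8.0)^(1/3)
catch err
    err
end
cbrt_value, power_outcome isa DomainError
```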
Lest you think that continuous functions always have derivatives except perhaps at exceptional points, this isn't the case. The functions used to [model](http://tinyurl.com/cpdpheb) the stock market are continuous but have no points where they are differentiable.
## Derivatives and maxima.
We have defined an *absolute maximum* of $f(x)$ over an interval to be a value $f(c)$ for a point $c$ in the interval that is as large as any other value in the interval. Just specifying a function and an interval does not guarantee an absolute maximum, but specifying a *continuous* function and a *closed* interval does, by the extreme value theorem.
> *A relative maximum*: We say $f(x)$ has a *relative maximum* at $c$ if there exists *some* interval $I=(a,b)$ with $a < c < b$ for which $f(c)$ is an absolute maximum for $f$ and $I$.
The difference is a bit subtle: for an absolute maximum the interval must also be specified, whereas for a relative maximum there just needs to exist some interval, possibly really small, though it must be bigger than a point.
:::{.callout-note}
## Note
A hiker can appreciate the difference. A relative maximum would be the crest of any hill, but an absolute maximum would be the summit.
:::
What does this have to do with derivatives?
[Fermat](http://science.larouchepac.com/fermat/fermat-maxmin.pdf), perhaps with insight from Kepler, was interested in maxima of polynomial functions. As a warm up, he considered a line segment $AC$ and a point $E$ with the task of choosing $E$ so that $(E-A) \times (C-E)$ is a maximum. We might recognize this as finding the maximum of $f(x) = (x-A)\cdot(C-x)$ for some $A < C$. Geometrically, we know this to be at the midpoint, as the equation is a parabola, but Fermat was interested in an algebraic solution that led to more generality.
He takes $b=AC$ and $a=AE$. Then the product is $a \cdot (b-a) = ab - a^2$. He then perturbs this, writing $AE=a+e$, so that the new product is $(a+e) \cdot (b - a - e)$. Equating the two and canceling like terms gives $be = 2ae + e^2$. He cancels the $e$ and basically comments that this must be true for all $e$ even as $e$ goes to $0$, so $b = 2a$ and the value is at the midpoint.
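Written out in modern notation, the steps of Fermat's argument are as follows (this is our expansion of the algebra, not a quotation from Fermat):

$$
\begin{align*}
(a+e)(b-a-e) &= ab - a^2 + be - 2ae - e^2,\\
ab - a^2 &= ab - a^2 + be - 2ae - e^2 \quad\Longrightarrow\quad be = 2ae + e^2,\\
b &= 2a + e \quad (\text{dividing by } e \neq 0),\\
b &= 2a \quad (\text{letting } e \rightarrow 0).
\end{align*}
$$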
In a more modern approach, this would be the same as looking at this expression:
$$
\frac{f(x+e) - f(x)}{e} = 0.
$$
Working on the left hand side, for non-zero $e$ we can cancel the common $e$ terms, and then let $e$ become $0$. This becomes a problem in solving $f'(x)=0$. Fermat could compute the derivative for any polynomial by taking a limit, a task we would do now by the power rule and the sum and difference of function rules.
This insight holds for other types of functions:
> If $f(c)$ is a relative maximum then either $f'(c) = 0$ or the derivative at $c$ does not exist.
When the derivative exists, this says the tangent line is flat. (If it had a slope, then the function would increase by moving left or right, as appropriate, a point we pursue later.)
For a continuous function $f(x)$, call a point $c$ in the domain of $f$ where either $f'(c)=0$ or the derivative does not exist a **critical** **point**.
We can combine Bolzano's extreme value theorem with Fermat's insight to get the following:
> A continuous function on $[a,b]$ has an absolute maximum that occurs at a critical point $c$, $a < c < b$, or an endpoint, $a$ or $b$.
A similar statement holds for an absolute minimum. This gives a restricted set of places to look for absolute maximum and minimum values - all the critical points and the endpoints.
It is also the case that all relative extrema occur at a critical point, *however* not all critical points correspond to relative extrema. We will see *derivative tests* that help characterize when that occurs.
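To see that a critical point need not be a relative extremum, consider $x^3$ (our choice of example): its derivative $3x^2$ vanishes at $0$, yet the function takes both larger and smaller values arbitrarily close to $0$. A small numeric check, with illustrative names:

```{julia}
cubic(x) = x^3          # illustrative function
dcubic(x) = 3x^2        # its derivative, by the power rule

nearby = [-0.1, -0.001, 0.001, 0.1]
dcubic(0), cubic.(nearby) .- cubic(0)
```

The differences change sign across $0$, so $0$ is a critical point that is neither a relative maximum nor a relative minimum.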
```{julia}
#| hold: true
#| echo: false
### {{{lhopital_32}}}
imgfile = "figures/lhopital-32.png"
caption = L"""
Image number ``32`` from L'Hopital's calculus book (the first) showing that
at a relative minimum, the tangent line is parallel to the
$x$-axis. This of course is true when the tangent line is well defined
by Fermat's observation.
"""
ImageFile(:derivatives, imgfile, caption)
```
### Numeric derivatives
The `ForwardDiff` package provides a means to numerically compute derivatives without approximations at a point. In `CalculusWithJulia` this is extended to find derivatives of functions and the `'` notation is overloaded for function objects. Hence these two give nearly identical answers, the difference being only the type of number used:
```{julia}
#| hold: true
f(x) = 3x^3 - 2x
fp(x) = 9x^2 - 2
f'(3), fp(3)
```
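For comparison, the underlying `ForwardDiff.derivative` function can be called directly, and both can be set against a central difference approximation. This is only a sketch: the names `cubic_f` and `central_difference` are ours, and the step size `1e-6` is an arbitrary choice.

```{julia}
using ForwardDiff

cubic_f(x) = 3x^3 - 2x
central_difference(u, c; h=1e-6) = (u(c + h) - u(c - h)) / (2h)

exact_value = 9 * 3^2 - 2                         # from the power rule
ad_value = ForwardDiff.derivative(cubic_f, 3.0)   # automatic differentiation
fd_value = central_difference(cubic_f, 3.0)       # finite difference, only approximate
exact_value, ad_value, fd_value
```

The automatic differentiation value agrees with the power rule, while the finite difference carries a small error that depends on `h`.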
##### Example
For the function $f(x) = x^2 \cdot e^{-x}$ find the absolute maximum over the interval $[0, 5]$.
We have that $f(x)$ is continuous on the closed interval of the question, and in fact differentiable on $(0,5)$, so any critical point will be a zero of the derivative. We can check for these with:
```{julia}
f(x) = x^2 * exp(-x)
cps = find_zeros(f', -1, 6) # find_zeros in `Roots`
```
We get $0$ and $2$ are critical points. The endpoints are $0$ and $5$. So the absolute maximum over this interval is either at $0$, $2$, or $5$:
```{julia}
f(0), f(2), f(5)
```
We see that $f(2)$ is then the maximum.
A few things. First, `find_zeros` can miss some roots, in particular endpoints and roots that just touch $0$. We should graph to verify it didn't. Second, it can be easier sometimes to check the values using the "dot" notation. If `f`, `a`,`b` are the function and the interval, then this would typically follow this pattern:
```{julia}
a, b = 0, 5
critical_pts = find_zeros(f', a, b)
f.(critical_pts), f(a), f(b)
```
For this problem, we have the left endpoint repeated, but in general this won't be a point where the derivative is zero.
As an aside, the output above is not a single container. To achieve that, the values can be combined before the broadcasting:
```{julia}
f.(vcat(a, critical_pts, b))
```
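The pattern of this example can be collected into a small helper. This is a sketch under assumptions: the name `closed_interval_max` is hypothetical, the function is assumed differentiable on the open interval, and `find_zeros` may miss zeros of the derivative, so a graph is still worth checking.

```{julia}
using Roots, ForwardDiff

# Candidates for an absolute maximum on [lo, hi]: the endpoints plus zeros of the derivative.
function closed_interval_max(u, lo, hi)
    du(x) = ForwardDiff.derivative(u, x)
    candidates = unique(vcat(lo, find_zeros(du, lo, hi), hi))
    value, idx = findmax(u.(candidates))
    return (x = candidates[idx], value = value)
end

best = closed_interval_max(x -> x^2 * exp(-x), 0, 5)
```

Replacing `findmax` with `findmin` gives the corresponding helper for an absolute minimum.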
##### Example
For the function $g(x) = e^x\cdot(x^3 - x)$ find the absolute maximum over the interval $[0, 2]$.
We follow the same pattern. Since $g(x)$ is continuous on the closed interval and differentiable on the open interval we know that the absolute maximum must occur at an endpoint ($0$ or $2$) or a critical point where $g'(c)=0$. To solve for these, we have again:
```{julia}
g(x) = exp(x) * (x^3 - x)
gcps = find_zeros(g', 0, 2)
```
And checking values gives:
```{julia}
g.(vcat(0, gcps, 2))
```
Here the maximum occurs at an endpoint. The critical point $c=0.67\dots$ does not produce a maximum value. Rather $g(0.67\dots)$ is an absolute minimum.
:::{.callout-note}
## Note
**Absolute minimum** We haven't discussed the parallel problem of absolute minima over a closed interval. By considering the function $h(x) = - f(x)$, we see that anything true for an absolute maximum should hold in a related manner for an absolute minimum. In particular, an absolute minimum on a closed interval will only occur at a critical point or an end point.
:::
## Rolle's theorem
Let $f(x)$ be differentiable on $(a,b)$ and continuous on $[a,b]$. Then the absolute maximum occurs at an endpoint or where the derivative is $0$ (as the derivative is always defined). This gives rise to:
> *[Rolle's](http://en.wikipedia.org/wiki/Rolle%27s_theorem) theorem*: For $f$ differentiable on $(a,b)$ and continuous on $[a,b]$, if $f(a)=f(b)$, then there exists some $c$ in $(a,b)$ with $f'(c) = 0$.
This modest observation opens the door to many relationships between a function and its derivative, as it ties the two together in one statement.
To see why Rolle's theorem is true, we assume that $f(a)=0$, otherwise consider $g(x)=f(x)-f(a)$. By the extreme value theorem, there must be an absolute maximum and minimum. If $f(x)$ is ever positive, then the absolute maximum occurs in $(a,b)$ - not at an endpoint - so at a critical point where the derivative is $0$. Similarly if $f(x)$ is ever negative. Finally, if $f(x)$ is just $0$, then take any $c$ in $(a,b)$.
The statement in Rolle's theorem speaks to existence. It doesn't give a recipe to find $c$. It just guarantees that there is *one* or *more* values in the interval $(a,b)$ where the derivative is $0$ if we assume differentiability on $(a,b)$ and continuity on $[a,b]$.
##### Example
Let $j(x) = e^x \cdot x \cdot (x-1)$. We know $j(0)=0$ and $j(1)=0$, so on $[0,1]$ Rolle's theorem guarantees at least one $c$ with $j'(c)=0$. We can find *at* *least* one such value numerically (unless numeric issues arise):
```{julia}
j(x) = exp(x) * x * (x-1)
find_zeros(j', 0, 1)
```
This graph illustrates the lone value for $c$ for this problem:
```{julia}
#| echo: false
x0 = find_zero(j', (0, 1))
plot([j, x->j(x0) + 0*(x-x0)], 0, 1)
```
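For this $j$ the value of $c$ can also be found by hand: the product rule gives $j'(x) = e^x (x^2 + x - 1)$, and the only zero of the quadratic factor in $(0,1)$ is $c = (\sqrt{5}-1)/2$. A quick check of that algebra (the names `dj_by_hand` and `c_exact` are ours):

```{julia}
dj_by_hand(x) = exp(x) * (x^2 + x - 1)   # derivative of exp(x) * x * (x - 1) by the product rule
c_exact = (sqrt(5) - 1) / 2              # positive root of x^2 + x - 1
c_exact, dj_by_hand(c_exact)
```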
## The mean value theorem
We are driving south and in one hour cover 70 miles. If the speed limit is 65 miles per hour, were we ever speeding? Well, we averaged more than the speed limit so we know the answer is yes, but why? Speeding would mean our instantaneous speed was more than the speed limit, yet we only know for sure our *average* speed was more than the speed limit. The mean value theorem tells us that if some conditions are met, then at some point (possibly more than one) we must have that our instantaneous speed is equal to our average speed.
The mean value theorem is a direct generalization of Rolle's theorem.
> *Mean value theorem*: Let $f(x)$ be differentiable on $(a,b)$ and continuous on $[a,b]$. Then there exists a value $c$ in $(a,b)$ where $f'(c) = (f(b) - f(a)) / (b - a)$.
This says for any secant line between $a < b$ there will be a parallel tangent line at some $c$ with $a < c < b$ (all provided $f$ is differentiable on $(a,b)$ and continuous on $[a,b]$).
This graph illustrates the theorem. The orange line is the secant line. A parallel line tangent to the graph is guaranteed by the mean value theorem. In this figure, there are two such lines, rendered using red.
```{julia}
#| hold: true
#| echo: false
f(x) = x^3 - x
a, b = -2, 1.75
m = (f(b) - f(a)) / (b-a)
cps = find_zeros(x -> f'(x) - m, a, b)
p = plot(f, a-1, b+1, linewidth=3, legend=false)
plot!(x -> f(a) + m*(x-a), a-1, b+1, linewidth=3, color=:orange)
scatter!([a,b], [f(a), f(b)])
for cp in cps
plot!(x -> f(cp) + f'(cp)*(x-cp), a-1, b+1, color=:red)
end
p
```
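For the cubic in the figure, $f(x) = x^3 - x$ with $a=-2$ and $b=1.75$, the points promised by the theorem can be computed by hand, since $f'(c) = 3c^2 - 1$. A sketch, with variable names of our own choosing:

```{julia}
cubic_mvt(x) = x^3 - x
lo, hi = -2, 1.75
secant_m = (cubic_mvt(hi) - cubic_mvt(lo)) / (hi - lo)   # slope of the secant line

# solve 3c^2 - 1 = secant_m by hand
c_values = [-sqrt((secant_m + 1) / 3), sqrt((secant_m + 1) / 3)]
secant_m, c_values
```

Both values lie in $(a,b)$, which matches the two parallel tangent lines in the figure.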
Like Rolle's theorem this is a guarantee that something exists, not a recipe to find it. In fact, the mean value theorem is just Rolle's theorem applied to:
$$
g(x) = f(x) - (f(a) + (f(b) - f(a)) / (b-a) \cdot (x-a))
$$
That is the function $f(x)$, minus the secant line between $(a,f(a))$ and $(b, f(b))$.
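To spell out why Rolle's theorem applies to this $g$: it vanishes at both endpoints, and its derivative differs from $f'$ by the slope of the secant line:

$$
\begin{align*}
g(a) &= f(a) - f(a) = 0,\\
g(b) &= f(b) - \left(f(a) + \frac{f(b)-f(a)}{b-a}\cdot(b-a)\right) = 0,\\
g'(c) &= f'(c) - \frac{f(b)-f(a)}{b-a}.
\end{align*}
$$

So $g'(c) = 0$ is exactly the conclusion of the mean value theorem.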
```{julia}
#| hold: true
#| echo: false
# Need to bring jsxgraph into PLUTO
#caption = """
#Illustration of the mean value theorem from
#[jsxgraph](https://jsxgraph.uni-bayreuth.de/).
#The polynomial function interpolates the points ``A``,``B``,``C``, and ``D``.
#Adjusting these creates different functions. Regardless of the
#function -- which as a polynomial will always be continuous and
#differentiable -- the slope of the secant line between ``A`` and ``B`` is always matched by **some** tangent line between the points ``A`` and ``B``.
#"""
#JSXGraph(:derivatives, "mean-value.js", caption)
nothing
```
An interactive example can be found at [jsxgraph](http://jsxgraph.uni-bayreuth.de/wiki/index.php?title=Mean_Value_Theorem).
##### Example
The mean value theorem is an extremely useful tool to relate properties of a function with properties of its derivative, as, like Rolle's theorem, it includes both $f$ and $f'$ in its statement.
For example, suppose we have a function $f(x)$ and we know that the derivative is **always** $0$. What can we say about the function?
Well, constant functions have derivatives that are constantly $0$. But do others? We will see the answer is no: If a function has a zero derivative in $(a,b)$ it must be a constant. We can readily see that if $f$ is a polynomial function this is the case, as the derivative of a polynomial is again a polynomial, and it is identically zero only if **all** its coefficients are $0$, which would mean there is no non-constant term in the original polynomial. But polynomials are not representative of all functions, and so a proof requires a bit more effort.
Suppose it is known that $f'(x)=0$ on some interval $I$ and we take any $a < b$ in $I$. Since $f'(x)$ always exists, $f(x)$ is always differentiable, and hence always continuous. So on $[a,b]$ the conditions of the mean value theorem apply. That is, there is a $c$ in $(a,b)$ with $(f(b) - f(a)) / (b-a) = f'(c) = 0$. But this would imply $f(b) - f(a)=0$. That is $f(x)$ is a constant, as for any $a$ and $b$, we see $f(a)=f(b)$.
### The Cauchy mean value theorem
[Cauchy](http://en.wikipedia.org/wiki/Mean_value_theorem#Cauchy.27s_mean_value_theorem) offered an extension to the mean value theorem above. Suppose both $f$ and $g$ satisfy the conditions of the mean value theorem on $[a,b]$ with $g(b)-g(a) \neq 0$, then there exists at least one $c$ with $a < c < b$ such that
$$
f'(c) = g'(c) \cdot \frac{f(b) - f(a)}{g(b) - g(a)}.
$$
The proof follows by considering $h(x) = f(x) - r\cdot g(x)$, with $r$ chosen so that $h(a)=h(b)$. Then Rolle's theorem applies so that there is a $c$ with $h'(c)=0$, so $f'(c) = r g'(c)$, but $r$ can be seen to be $(f(b)-f(a))/(g(b)-g(a))$, which proves the theorem.
Letting $g(x) = x$ demonstrates that the mean value theorem is a special case.
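A concrete instance may help. Take $f(x) = x^4$ and $g(x) = x^2$ on $[0,1]$ (our choice of example). The ratio $(f(1)-f(0))/(g(1)-g(0))$ is $1$, so the theorem asks for $c$ with $4c^3 = 2c$, that is $c = 1/\sqrt{2}$. A quick check, with illustrative names:

```{julia}
quartic(x) = x^4;  dquartic(x) = 4x^3     # play the roles of f and f'
square(x) = x^2;   dsquare(x) = 2x        # play the roles of g and g'

lo, hi = 0, 1
ratio = (quartic(hi) - quartic(lo)) / (square(hi) - square(lo))   # equals 1 here

c_cauchy = 1 / sqrt(2)    # solves 4c^3 = 2c * ratio in (0, 1)
dquartic(c_cauchy), dsquare(c_cauchy) * ratio
```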
##### Example
Suppose $f(x)$ and $g(x)$ satisfy the Cauchy mean value theorem on $[0,x]$, $g'(x)$ is non-zero on $(0,x)$, and $f(0)=g(0)=0$. Then we have:
$$
\frac{f(x) - f(0)}{g(x) - g(0)} = \frac{f(x)}{g(x)} = \frac{f'(c)}{g'(c)},
$$
for some $c$ in $(0,x)$. As $x$ goes to $0$, so must $c$. So if $\lim_{x \rightarrow 0} f'(x)/g'(x) = L$, then the right hand side will have a limit of $L$, and hence the left hand side will too. That is, when the limit exists, we have under these conditions that $\lim_{x\rightarrow 0}f(x)/g(x) = \lim_{x\rightarrow 0}f'(x)/g'(x)$.
This could be used to prove the limit of $\sin(x)/x$ as $x$ goes to $0$ just by showing the limit of $\cos(x)/1$ is $1$, as is known by continuity.
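A numeric look at the two ratios for shrinking $x$ gives an illustration (not a proof) of this; the variable names are ours:

```{julia}
xs_small = [0.1, 0.01, 0.001]
ratio_of_functions   = [sin(x) / x for x in xs_small]    # f(x)/g(x)
ratio_of_derivatives = [cos(x) / 1 for x in xs_small]    # f'(x)/g'(x)
ratio_of_functions, ratio_of_derivatives
```

Both columns of values approach $1$ as $x$ shrinks.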
### Visualizing the Cauchy mean value theorem
The Cauchy mean value theorem can be visualized in terms of a tangent line and a *parallel* secant line in a similar manner as the mean value theorem as long as a *parametric* graph is used. A parametric graph plots the points $(g(t), f(t))$ for some range of $t$. That is, it graphs *both* functions at the same time. The following illustrates the construction of such a graph:
```{julia}
#| hold: true
#| echo: false
#| cache: true
### {{{parametric_fns}}}
function parametric_fns_graph(n)
f = (x) -> sin(x)
g = (x) -> x
ns = (1:10)/10
ts = range(-pi/2, stop=-pi/2 + ns[n] * pi, length=100)
plt = plot(f, g, -pi/2, -pi/2 + ns[n] * pi, legend=false, size=fig_size,
xlim=(-1.1,1.1), ylim=(-pi/2-.1, pi/2+.1))
scatter!(plt, [f(ts[end])], [g(ts[end])], color=:orange, markersize=5)
val = @sprintf("% 0.2f", ts[end])
annotate!(plt, [(0, 1, "t = $val")])
end
caption = L"""
Illustration of parametric graph of $(g(t), f(t))$ for $-\pi/2 \leq t
\leq \pi/2$ with $g(x) = \sin(x)$ and $f(x) = x$. Each point on the
graph is from some value $t$ in the interval. We can see that the
graph goes through $(0,0)$ as that is when $t=0$. As well, it must go
through $(1, \pi/2)$ as that is when $t=\pi/2$
"""
n = 10
anim = @animate for i=1:n
parametric_fns_graph(i)
end
imgfile = tempname() * ".gif"
gif(anim, imgfile, fps = 1)
ImageFile(imgfile, caption)
```
With $g(x) = \sin(x)$ and $f(x) = x$, we can take $I=[a,b] = [0, \pi/2]$. In the figure below, the *secant line* is drawn in red which connects $(g(a), f(a))$ with the point $(g(b), f(b))$, and hence has slope $\Delta f/\Delta g$. The parallel lines drawn show the *tangent* lines with slope $f'(c)/g'(c)$. Two are drawn for this problem, as the graph covers $[-\pi/2, \pi/2]$; the Cauchy mean value theorem guarantees that at least one exists with $c$ in $(a,b)$.
```{julia}
#| hold: true
#| echo: false
g(x) = sin(x)
f(x) = x
ts = range(-pi/2, stop=pi/2, length=50)
a,b = 0, pi/2
m = (f(b) - f(a))/(g(b) - g(a))
cps = find_zeros(x -> f'(x)/g'(x) - m, -pi/2, pi/2)
c = cps[1]
Delta = (0 + m * (c - 0)) - (g(c))
p = plot(g, f, -pi/2, pi/2, linewidth=3, legend=false)
plot!(x -> f(a) + m * (x - g(a)), -1, 1, linewidth=3, color=:red)
scatter!([g(a),g(b)], [f(a), f(b)])
for c in cps
plot!(x -> f(c) + m * (x - g(c)), -1, 1, color=:orange)
end
p
```
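For this pair the parameter values can be computed by hand. With $f'(t)/g'(t) = 1/\cos(t)$ and secant slope $(\pi/2 - 0)/(1 - 0) = \pi/2$, the condition becomes $\cos(c) = 2/\pi$. A sketch, with variable names of our own choosing:

```{julia}
secant_slope_fg = (pi/2 - 0) / (sin(pi/2) - sin(0))   # Δf/Δg over [0, π/2]
c_pos = acos(2 / pi)                                  # solves 1/cos(c) = π/2 with 0 < c < π/2
c_neg = -c_pos                                        # cos is even, so -c matches the slope too
secant_slope_fg, c_pos, 1 / cos(c_pos)
```

Only `c_pos` lies in $(0, \pi/2)$; `c_neg` corresponds to a second parallel tangent line on the part of the graph with $t < 0$.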
## Questions
###### Question
Rolle's theorem is a guarantee of a value, but does not provide a recipe to find it. For the function $1 - x^2$ over the interval $[-5,5]$, find a value $c$ that satisfies the result.
```{julia}
#| hold: true
#| echo: false
c = 0
numericq(c)
```
###### Question
The extreme value theorem is a guarantee of a value, but does not provide a recipe to find it. For the function $f(x) = \sin(x)$ on $I=[0, \pi]$ find a value $c$ satisfying the theorem for an absolute maximum.
```{julia}
#| hold: true
#| echo: false
c = pi/2
numericq(c)
```
###### Question
The extreme value theorem is a guarantee of a value, but does not provide a recipe to find it. For the function $f(x) = \sin(x)$ on $I=[\pi, 3\pi/2]$ find a value $c$ satisfying the theorem for an absolute maximum.
```{julia}
#| hold: true
#| echo: false
c = pi
numericq(c)
```
###### Question
The mean value theorem is a guarantee of a value, but does not provide a recipe to find it. For $f(x) = x^2$ on $[0,2]$ find a value of $c$ satisfying the theorem.
```{julia}
#| hold: true
#| echo: false
c = 1
numericq(c)
```
###### Question
The Cauchy mean value theorem is a guarantee of a value, but does not provide a recipe to find it. For $f(x) = x^3$ and $g(x) = x^2$ find a value $c$ in the interval $[1, 2]$ satisfying the theorem.
```{julia}
#| hold: true
#| echo: false
c,x = symbols("c, x", real=true)
val = solve(3c^2 / (2c) - (2^3 - 1^3) / (2^2 - 1^2), c)[1]
numericq(float(val))
```
###### Question
Will the function $f(x) = x + 1/x$ satisfy the conditions of the mean value theorem over $[-1/2, 1/2]$?
```{julia}
#| hold: true
#| echo: false
radioq(["Yes", "No"], 2)
```
###### Question
Just as it is a fact that $f'(x) = 0$ (for all $x$ in $I$) implies $f(x)$ is a constant, so too is it a fact that if $f'(x) = g'(x)$ that $f(x) - g(x)$ is a constant. What function would you consider, if you wanted to prove this with the mean value theorem?
```{julia}
#| hold: true
#| echo: false
choices = [
"``h(x) = f(x) - (f(b) - f(a)) / (b - a)``",
"``h(x) = f(x) - (f(b) - f(a)) / (b - a) \\cdot g(x)``",
"``h(x) = f(x) - g(x)``",
"``h(x) = f'(x) - g'(x)``"
]
answ = 3
radioq(choices, answ)
```
###### Question
Suppose $f''(x) > 0$ on $I$. Why is it impossible that $f'(x) = 0$ at more than one value in $I$?
```{julia}
#| hold: true
#| echo: false
choices = [
L"It isn't. The function $f(x) = x^2$ has two zeros and $f''(x) = 2 > 0$",
"By Rolle's theorem, there is at least one, and perhaps more",
L"By the mean value theorem, we must have $f'(b) - f'(a) > 0$ whenever $b > a$. This means $f'(x)$ is increasing and can't double back to have more than one zero."
]
answ = 3
radioq(choices, answ)
```
###### Question
Let $f(x) = 1/x$. For $0 < a < b$, find $c$ so that $f'(c) = (f(b) - f(a)) / (b-a)$.
```{julia}
#| hold: true
#| echo: false
choices = [
"``c = (a+b)/2``",
"``c = \\sqrt{ab}``",
"``c = 1 / (1/a + 1/b)``",
"``c = a + (\\sqrt{5} - 1)/2 \\cdot (b-a)``"
]
answ = 2
radioq(choices, answ)
```
###### Question
Let $f(x) = x^2$. For $0 < a < b$, find $c$ so that $f'(c) = (f(b) - f(a)) / (b-a)$.
```{julia}
#| hold: true
#| echo: false
choices = [
"``c = (a+b)/2``",
"``c = \\sqrt{ab}``",
"``c = 1 / (1/a + 1/b)``",
"``c = a + (\\sqrt{5} - 1)/2 \\cdot (b-a)``"
]
answ = 1
radioq(choices, answ)
```
###### Question
In an example, we used the fact that if $0 < c < x$, for some $c$ given by the mean value theorem and $f(x)$ goes to $0$ as $x$ goes to zero then $f(c)$ will also go to zero. Suppose we say that $c=g(x)$ for some function $g$.
Why is it known that $g(x)$ goes to $0$ as $x$ goes to zero (from the right)?
```{julia}
#| hold: true
#| echo: false
choices = [L"The squeeze theorem applies, as $0 < g(x) < x$.",
L"As $f(x)$ goes to zero by Rolle's theorem it must be that $g(x)$ goes to $0$.",
L"This follows by the extreme value theorem, as there must be some $c$ in $[0,x]$."]
answ = 1
radioq(choices, answ)
```
Since $g(x)$ goes to zero, why is it true that if $f(x)$ goes to $L$ as $x$ goes to zero that $f(g(x))$ must also have a limit $L$?
```{julia}
#| hold: true
#| echo: false
choices = ["It isn't true. The limit must be 0",
L"The squeeze theorem applies, as $0 < g(x) < x$",
"This follows from the limit rules for composition of functions"]
answ = 3
radioq(choices, answ)
```


@@ -0,0 +1,643 @@
# Derivative-free alternatives to Newton's method
```{julia}
#| echo: false
import Logging
Logging.disable_logging(Logging.Info) # or e.g. Logging.Info
Logging.disable_logging(Logging.Warn)
import SymPy
function Base.show(io::IO, ::MIME"text/html", x::T) where {T <: SymPy.SymbolicObject}
println(io, "<span class=\"math-left-align\" style=\"padding-left: 4px; width:0; float:left;\"> ")
println(io, "\\[")
println(io, sympy.latex(x))
println(io, "\\]")
println(io, "</span>")
end
# hack to work around issue
import Markdown
import CalculusWithJulia
function CalculusWithJulia.WeaveSupport.ImageFile(d::Symbol, f::AbstractString, caption; kwargs...)
nm = joinpath("..", string(d), f)
u = "![$caption]($nm)"
Markdown.parse(u)
end
nothing
```
This section uses these add-on packages:
```{julia}
using CalculusWithJulia
using Plots
using ImplicitEquations
using Roots
using SymPy
```
```{julia}
#| echo: false
#| results: "hidden"
using CalculusWithJulia.WeaveSupport
const frontmatter = (
title = "Derivative-free alternatives to Newton's method",
description = "Calculus with Julia: Derivative-free alternatives to Newton's method",
tags = ["CalculusWithJulia", "derivatives", "derivative-free alternatives to newton's method"],
);
nothing
```
---
Newton's method is not the only algorithm of its kind for identifying zeros of a function. In this section we discuss some alternatives.
## The `find_zero(f, x0)` function
The function `find_zero` from the `Roots` package provides several different algorithms for finding a zero of a function, including some derivative-free algorithms for finding zeros when started with an initial guess. The default method is similar to Newton's method in that only a good initial guess is needed. However, the algorithm, while possibly slower in terms of function evaluations and steps, is engineered to be a bit more robust to the choice of initial estimate than Newton's method. (If it finds a bracket, it will use a bisection algorithm which is guaranteed to converge, but can be slower to do so.) Here we see how to call the function:
```{julia}
f(x) = cos(x) - x
x₀ = 1
find_zero(f, x₀)
```
Compare to this related call which uses the bisection method:
```{julia}
find_zero(f, (0, 1)) ## [0,1] must be a bracketing interval
```
For this example both give the same answer, but the bisection method is a bit less convenient as a bracketing interval must be pre-specified.
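To make the bracketing idea concrete, here is a bare-bones bisection loop. It is a teaching sketch under our own naming (`simple_bisection`) and with an arbitrary tolerance; it is not the algorithm `Roots` actually uses, which is more refined.

```{julia}
# Repeatedly halve [lo, hi], keeping the half on which `u` changes sign.
function simple_bisection(u, lo, hi; tol=1e-12)
    u(lo) * u(hi) < 0 || error("[lo, hi] is not a bracketing interval")
    while hi - lo > tol
        mid = (lo + hi) / 2
        if u(lo) * u(mid) <= 0
            hi = mid      # sign change (or a zero) in the left half
        else
            lo = mid      # otherwise the sign change is in the right half
        end
    end
    return (lo + hi) / 2
end

bisection_root = simple_bisection(x -> cos(x) - x, 0, 1)
```

Each pass halves the interval, so the number of steps depends only on the starting width and the tolerance, not on how well behaved the function is.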
## The secant method
The default `find_zero` method above uses a secant-like method unless a bracketing interval is found. The secant method is historic, dating back over $3000$ years. Here we discuss the secant method in a more general framework.
One way to view Newton's method is through the inverse of $f$ (assuming it exists): if $f(\alpha) = 0$ then $\alpha = f^{-1}(0)$.
If $f$ has a simple zero at $\alpha$ and is locally invertible (that is some $f^{-1}$ exists) then the update step for Newton's method can be identified with:
* fitting a polynomial to the local inverse function of $f$ going through the point $(f(x_0),x_0)$,
* and matching the slope of $f$ at the same point.
That is, we can write $g(y) = h_0 + h_1 (y-f(x_0))$. Then $g(f(x_0)) = x_0 = h_0$, so $h_0 = x_0$. From $g'(f(x_0)) = 1/f'(x_0)$, we get $h_1 = 1/f'(x_0)$. That is, $g(y) = x_0 + (y-f(x_0))/f'(x_0)$. At $y=0,$ we get the update step $x_1 = g(0) = x_0 - f(x_0)/f'(x_0)$.
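In code, the update step just derived is a one-liner. A sketch with our own names (`newton_step`, `newton_iterates`), applied to $\cos(x) - x$ with a derivative computed by hand; the number of steps is an arbitrary choice:

```{julia}
newton_step(u, du, x) = x - u(x) / du(x)    # x₁ = x₀ - f(x₀)/f'(x₀)

cosx(x) = cos(x) - x
dcosx(x) = -sin(x) - 1                      # derivative of cos(x) - x

# apply the update a fixed number of times
function newton_iterates(u, du, x; steps=5)
    for _ in 1:steps
        x = newton_step(u, du, x)
    end
    return x
end

newton_root = newton_iterates(cosx, dcosx, 1.0)
```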
A similar viewpoint can be used to create derivative-free methods.
For example, the [secant method](https://en.wikipedia.org/wiki/Secant_method) can be seen as the result of fitting a degree-$1$ polynomial approximation for $f^{-1}$ through two points $(f(x_0),x_0)$ and $(f(x_1), x_1)$.
Again, expressing this approximation as $g(y) = h_0 + h_1(y-f(x_1))$ leads to $g(f(x_1)) = x_1 = h_0$. Substituting $f(x_0)$ gives $g(f(x_0)) = x_0 = x_1 + h_1(f(x_0)-f(x_1))$. Solving for $h_1$ leads to $h_1=(x_1-x_0)/(f(x_1)-f(x_0))$. Then $x_2 = g(0) = x_1 - (x_1-x_0)/(f(x_1)-f(x_0)) \cdot f(x_1)$. This is the first step of the secant method:
$$
x_{n+1} = x_n - f(x_n) \frac{x_n - x_{n-1}}{f(x_n) - f(x_{n-1})}.
$$
That is, where the next step of Newton's method comes from the intersection of the tangent line at $x_n$ with the $x$-axis, the next step of the secant method comes from the intersection of the secant line defined by $x_n$ and $x_{n-1}$ with the $x$ axis. That is, the secant method simply replaces $f'(x_n)$ with the slope of the secant line between $x_n$ and $x_{n-1}$.
We code the update step as `λ2`:
```{julia}
λ2(f0,f1,x0,x1) = x1 - f1 * (x1-x0) / (f1-f0)
```
Then we can run a few steps to identify the zero of sine starting at $3$ and $4$:
```{julia}
#| hold: true
#| term: true
x0,x1 = 4,3
f0,f1 = sin.((x0,x1))
@show x1,f1
x0,x1 = x1, λ2(f0,f1,x0,x1)
f0,f1 = f1, sin(x1)
@show x1,f1
x0,x1 = x1, λ2(f0,f1,x0,x1)
f0,f1 = f1, sin(x1)
@show x1,f1
x0,x1 = x1, λ2(f0,f1,x0,x1)
f0,f1 = f1, sin(x1)
@show x1,f1
x0,x1 = x1, λ2(f0,f1,x0,x1)
f0,f1 = f1, sin(x1)
x1,f1
```
Like Newton's method, the secant method converges quickly for this problem (though its rate is less than the quadratic rate of Newton's method).
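The repeated steps above can be wrapped in a loop. The following is a minimal sketch; the name `secant_sketch`, its keyword arguments, and the stopping rule are assumptions made for illustration and are not how `Roots` implements the method.

```{julia}
# A sketch of the secant iteration as a loop; `secant_sketch` and its
# keyword arguments are illustrative names, not part of `Roots`.
function secant_sketch(f, x0, x1; atol=1e-14, maxsteps=50)
    f0, f1 = f(x0), f(x1)
    for _ in 1:maxsteps
        f1 == f0 && break                      # flat secant line: no update possible
        x2 = x1 - f1 * (x1 - x0) / (f1 - f0)   # the secant update step
        x0, f0 = x1, f1
        x1, f1 = x2, f(x2)                     # one new function evaluation per step
        abs(f1) <= atol && break               # stop once f is nearly zero
    end
    return x1
end

secant_sketch(sin, 4.0, 3.0)
```

Only the size of the function value is checked here; a more careful stopping rule would also look at the step size.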
This method is included in `Roots` as `Secant()` (or `Order1()`):
```{julia}
find_zero(sin, (4,3), Secant())
```
The slope of the secant line only approaches the derivative in the limit, so the convergence of the secant method is not as fast as that of Newton's method. However, each step of the secant method needs only one new function evaluation, so it can be more efficient for functions that are expensive to compute or differentiate.
Let $\epsilon_{n+1} = x_{n+1}-\alpha$, where $\alpha$ is assumed to be the *simple* zero of $f(x)$ that the secant method converges to. A [calculation](https://math.okstate.edu/people/binegar/4513-F98/4513-l08.pdf) shows that
$$
\begin{align*}
\epsilon_{n+1} &\approx \frac{x_n-x_{n-1}}{f(x_n)-f(x_{n-1})} \frac{(1/2)f''(\alpha)(\epsilon_n-\epsilon_{n-1})}{x_n-x_{n-1}} \epsilon_n \epsilon_{n-1}\\
& \approx \frac{f''(\alpha)}{2f'(\alpha)} \epsilon_n \epsilon_{n-1}\\
&= C \epsilon_n \epsilon_{n-1}.
\end{align*}
$$
The constant `C` is similar to that for Newton's method, and reveals potential troubles for the secant method similar to those of Newton's method: a poor initial guess (the initial error is too big), a second derivative that is too large, or a first derivative that is too flat near the answer.
Assuming the error term has the form $|\epsilon_{n+1}| = A|\epsilon_n|^\phi$, so that $|\epsilon_{n-1}| = (|\epsilon_n|/A)^{1/\phi}$, and substituting into the above leads to the equation
$$
\frac{A^{1+1/\phi}}{C} = |\epsilon_n|^{1 - \phi +1/\phi}.
$$
The left side being a constant suggests $\phi$ solves: $1 - \phi + 1/\phi = 0$ or $\phi^2 -\phi - 1 = 0$. The positive solution is the golden ratio, $(1 + \sqrt{5})/2 \approx 1.618\dots$.
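The golden-ratio rate can be checked numerically. The sketch below is an illustration under stated assumptions: the helper name `secant_order_sketch`, the test function $x^2 - 2$, and the choice of `BigFloat` precision are all made up for this example. It uses the fact that if $|\epsilon_{n+1}| \approx A|\epsilon_n|^\phi$, then $\log|\epsilon_{n+1}| / \log|\epsilon_n|$ should approach $\phi$ as the errors shrink.

```{julia}
# Estimate the order of convergence of the secant method empirically.
# BigFloat is used so the errors can shrink well past Float64 precision.
function secant_order_sketch(; nsteps=12)
    setprecision(BigFloat, 2048) do
        f(x) = x^2 - 2
        α = sqrt(big(2))                 # the zero being approximated
        x0, x1 = big(1.0), big(2.0)
        errs = BigFloat[]
        for _ in 1:nsteps
            x0, x1 = x1, x1 - f(x1) * (x1 - x0) / (f(x1) - f(x0))
            push!(errs, abs(x1 - α))
        end
        # log|ϵₙ₊₁| / log|ϵₙ| should approach ϕ
        [Float64(log(errs[i+1]) / log(errs[i])) for i in 1:length(errs)-1]
    end
end

secant_order_sketch()
```

The early ratios are far from $\phi$, as the asymptotic form of the error only applies once the iterates are close to the zero.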
### Steffensen's method
Steffensen's method is a secant-like method that converges with $|\epsilon_{n+1}| \approx C |\epsilon_n|^2$. The secant is taken between the points $(x_n,f(x_n))$ and $(x_n + f(x_n), f(x_n + f(x_n)))$. Like Newton's method this requires $2$ function evaluations per step. Steffensen's method is implemented through `Roots.Steffensen()`. It is more sensitive to the initial guess than other methods, so in practice it must be used with care, though it is a starting point for many higher-order derivative-free methods.
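As a rough illustration of the update just described, here is a minimal sketch. The name `steffensen_sketch`, the tolerance, and the step limit are assumptions for this example; this is not the implementation in `Roots`.

```{julia}
# A sketch of Steffensen's iteration; `steffensen_sketch` is an
# illustrative name, not the implementation in `Roots`.
function steffensen_sketch(f, x; atol=1e-14, maxsteps=50)
    for _ in 1:maxsteps
        fx = f(x)
        abs(fx) <= atol && break
        slope = (f(x + fx) - fx) / fx   # secant slope between x and x + f(x)
        x -= fx / slope                 # Newton-like step using the secant slope
    end
    return x
end

steffensen_sketch(sin, 3.0)
```

Each pass through the loop evaluates `f` twice, once at `x` and once at `x + f(x)`.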
## Inverse quadratic interpolation
Inverse quadratic interpolation fits a quadratic polynomial through three points, not just two as with the secant method, the third being $(f(x_2), x_2)$.
For example, here is the inverse quadratic function, $g(y)$, going through three points marked with red dots. The blue dot is found from $(g(0), 0)$.
```{julia}
#| hold: true
#| echo: false
a,b,c = 1,2,3
fa,fb,fc = -1,1/4,1
g(y) = (y-fb)*(y-fa)/(fc-fb)/(fc-fa)*c + (y-fc)*(y-fa)/(fb-fc)/(fb-fa)*b + (y-fc)*(y-fb)/(fa-fc)/(fa-fb)*a
ys = range(-2,2, length=100)
xs = g.(ys)
plot(xs, ys, legend=false)
scatter!([a,b,c],[fa,fb,fc], color=:red, markersize=5)
scatter!([g(0)],[0], color=:blue, markersize=5)
plot!(zero, color=:blue)
```
Here we use `SymPy` to identify the degree-$2$ polynomial as a function of $y$, then evaluate it at $y=0$ to find the next step:
```{julia}
@syms y hs[0:2] xs[0:2] fs[0:2]
H(y) = sum(hᵢ*(y - fs[end])^i for (hᵢ,i) ∈ zip(hs, 0:2))
eqs = [H(fᵢ) ~ xᵢ for (xᵢ, fᵢ) ∈ zip(xs, fs)]
ϕ = solve(eqs, hs)
hy = subs(H(y), ϕ)
```
The value of `hy` at $y=0$ yields the next guess based on the past three, and is given by:
```{julia}
q⁻¹ = hy(y => 0)
```
Though the above can be simplified quite a bit when computed by hand, here we simply make this a function with `lambdify` which we will use below.
```{julia}
λ3 = lambdify(q⁻¹) # fs, then xs
```
(`SymPy`'s `lambdify` function, by default, picks the order of its arguments lexicographically; in this case they will be the `f` values then the `x` values.)
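The same step can also be written by hand using the Lagrange form of the interpolating polynomial in $y$, evaluated at $y=0$. The function name `iqi_step_sketch` is an assumption for this illustration; the argument order (the three `f` values, then the three `x` values) matches the order described above.

```{julia}
# Inverse quadratic interpolation step written out in Lagrange form.
# Arguments: the three `f` values, then the matching three `x` values.
function iqi_step_sketch(f0, f1, f2, x0, x1, x2)
    # each term is xᵢ times its Lagrange basis polynomial in y, evaluated at y = 0
    t0 = x0 * f1 * f2 / ((f0 - f1) * (f0 - f2))
    t1 = x1 * f0 * f2 / ((f1 - f0) * (f1 - f2))
    t2 = x2 * f0 * f1 / ((f2 - f0) * (f2 - f1))
    return t0 + t1 + t2
end

iqi_step_sketch(sin(3.0), sin(3.5), sin(4.0), 3.0, 3.5, 4.0)
```

The three function values must be distinct, otherwise a denominator is zero and the step is undefined.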
An inverse quadratic step is utilized by Brent's method, as possible, to yield a rapidly convergent bracketing algorithm implemented as a default zero finder in many software languages. `Julia`'s `Roots` package implements the method in `Roots.Brent()`. An inverse cubic interpolation is utilized by [Alefeld, Potra, and Shi](https://dl.acm.org/doi/10.1145/210089.210111) which gives an asymptotically even more rapidly convergent algorithm than Brent's (implemented in `Roots.AlefeldPotraShi()` and also `Roots.A42()`). This is used as a finishing step in many cases by the default hybrid `Order0()` method of `find_zero`.
In a bracketing algorithm, the next step should reduce the size of the bracket, so the next iterate should be inside the current bracket. However, an inverse quadratic step is not guaranteed to do this. As such, sometimes a substitute method must be chosen.
[Chandrapatla's](https://www.google.com/books/edition/Computational_Physics/cC-8BAAAQBAJ?hl=en&gbpv=1&pg=PA95&printsec=frontcover) method is a bracketing method utilizing an inverse quadratic step as the centerpiece. The key insight is the test to choose between this inverse quadratic step and a bisection step. This is done in the following based on values of $\xi$ and $\Phi$ defined within:
```{julia}
function chandrapatla(f, u, v, λ; verbose=false)
    a,b = promote(float(u), float(v))
    fa,fb = f(a),f(b)
    @assert fa * fb < 0   # [a,b] must bracket a zero

    # arrange for b to hold the estimate with the smaller |f|
    if abs(fa) < abs(fb)
        a,b,fa,fb = b,a,fb,fa
    end

    c, fc = a, fa

    maxsteps = 100
    for ns in 1:maxsteps
        Δ = abs(b-a)
        m, fm = (abs(fa) < abs(fb)) ? (a, fa) : (b, fb)
        ϵ = eps(m)
        if Δ ≤ 2ϵ
            return m
        end
        verbose && @show m,fm   # report progress only when requested
        iszero(fm) && return m

        ξ = (a-b)/(c-b)
        Φ = (fa-fb)/(fc-fb)

        if Φ^2 < ξ < 1 - (1-Φ)^2
            xt = λ(fa,fc,fb, a,c,b) # inverse quadratic
        else
            xt = a + (b-a)/2        # bisection
        end

        ft = f(xt)
        isnan(ft) && break

        # keep a bracket; a becomes the newest estimate
        if sign(fa) == sign(ft)
            c,fc = a,fa
            a,fa = xt,ft
        else
            c,b,a = b,a,xt
            fc,fb,fa = fb,fa,ft
        end

        verbose && @show ns, a, fa
    end
    error("no convergence: [a,b] = $(sort([a,b]))")
end
```
Like bisection, this method ensures that $a$ and $b$ form a bracket, but it moves $a$ to the newest estimate, so does not maintain that $a < b$ throughout.
We can see it in action on the sine function. Here we pass in $\lambda$, but in a real implementation (as in `Roots.Chandrapatla()`) we would have programmed the algorithm to compute the inverse quadratic value.
```{julia}
#| term: true
chandrapatla(sin, 3, 4, λ3, verbose=true)
```
The condition `Φ^2 < ξ < 1 - (1-Φ)^2` can be visualized. Assume `a,b=0,1` and `fa,fb=-1/2,1`. Then `c < a < b`, and `fc` has the same sign as `fa`, but what values of `fc` will satisfy the inequality?
```{julia}
ξ(c,fc) = (a-b)/(c-b)
Φ(c,fc) = (fa-fb)/(fc-fb)
Φl(c,fc) = Φ(c,fc)^2
Φr(c,fc) = 1 - (1-Φ(c,fc))^2
a,b = 0, 1
fa,fb = -1/2, 1
region = ImplicitEquations.Lt(Φl, ξ) & ImplicitEquations.Lt(ξ,Φr)
plot(region, xlims=(-2,a), ylims=(-3,0))
```
When `(c,fc)` is in the shaded area, the inverse quadratic step is chosen. We can see that `fc < fa` is needed.
For these values, this area is within the area where an inverse quadratic step will result in a value between `a` and `b`:
```{julia}
l(c,fc) = λ3(fa,fb,fc,a,b,c)
region₃ = ImplicitEquations.Lt(l,b) & ImplicitEquations.Gt(l,a)
plot(region₃, xlims=(-2,0), ylims=(-3,0))
```
There are values in the parameter space where this does not occur.
## Tolerances
The `chandrapatla` algorithm typically waits until `abs(b-a) <= 2eps(m)` (where $m$ is either $b$ or $a$ depending on the size of $f(a)$ and $f(b)$) is satisfied. Informally this means the algorithm stops when the two bracketing values are no more than a small amount apart. What is a "small amount?"
To understand, we start with the fact that floating point numbers are an approximation to real numbers.
Floating point numbers effectively represent a number in scientific notation in terms of
* a sign (plus or minus),
* a *mantissa* (a number in $[1,2)$, in binary), and
* an exponent (to represent a power of $2$).
The mantissa is of the form `1.xxxxx...xxx` where there are $m$ different `x`s each possibly a `0` or `1`. The `i`th `x` indicates if the term `1/2^i` should be included in the value. The mantissa is the sum of `1` plus the indicated values of `1/2^i` for `i` in `1` to `m`. So the last `x` represents if `1/2^m` should be included in the sum. As such, the mantissa represents a discrete set of values, separated by `1/2^m`, as that is the smallest difference possible.
For example, if `m=2` then the possible values for the mantissa are `11 => 1 + 1/2 + 1/4 = 7/4`, `10 => 1 + 1/2 = 6/4`, `01 => 1 + 1/4 = 5/4`, and `00 => 1 = 4/4`, values separated by `1/4 = 1/2^m`.
For $64$-bit floating point numbers `m=52`, so the values in the mantissa differ by `1/2^52 = 2.220446049250313e-16`. This is the value of `eps()`.
However, this "gap" between numbers is for values when the exponent is `0`, that is, the numbers in `[1,2)`. For values in `[2,4)` the gap is twice as large; for values in `[1/2,1)` the gap is half as large. That is, the gap depends on the size of the number. The gap between `x` and the next larger floating point number is given by `eps(x)` and that always satisfies `eps(x) <= eps() * abs(x)`.
One way to think about this is that the difference between `x` and the next larger floating point value is *basically* `x*(1+eps()) - x` or `x*eps()`.
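These statements about the gap can be checked directly with `eps` and `nextfloat` from base `Julia`. The name `gap_checks` and the sample values are arbitrary choices for this illustration; each entry of the tuple is a comparison that should hold for $64$-bit floating point values.

```{julia}
# The gap to the next floating point value scales with the size of the number.
gap_checks = (
    eps() == 2.0^-52,                    # gap for values in [1, 2)
    nextfloat(1.0) - 1.0 == eps(),       # eps() is the spacing just above 1
    eps(2.0) == 2 * eps(),               # gap doubles on [2, 4)
    eps(0.5) == eps() / 2,               # gap halves on [1/2, 1)
    all(eps(x) <= eps() * abs(x) for x in (0.1, 1.0, 3.0, 1e10)),
)
```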
For the specific example, `abs(b-a) <= 2eps(m)` means that the gap between `a` and `b` is essentially 2 floating point values from the $x$ value with the smallest $f(x)$ value.
For bracketing methods that is about as good as you can get. However, once floating point values are understood, the absolute best you can get for a bracketing interval would be
* along the way, a value `f(c)` is found which is *exactly* `0.0`
* the endpoints of the bracketing interval are *adjacent* floating point values, meaning the interval cannot be bisected and `f` changes sign between the two values.
There can be problems when the stopping criterion is `abs(b-a) <= 2eps(m)` and the answer is `0.0` that require engineering around. For example, the algorithm above for the function `f(x) = -40*x*exp(-x)` does not converge when started with `[-9,1]`, even though `0.0` is an obvious zero.
```{julia}
#| hold: true
fu(x) = -40*x*exp(-x)
chandrapatla(fu, -9, 1, λ3)
```
Here the issue is `abs(b-a)` is tiny (of the order `1e-119`) but `eps(m)` is even smaller.
For non-bracketing methods, like Newton's method or the secant method, different criteria are useful. There may not be a bracketing interval for `f` (for example `f(x) = (x-1)^2`) so the second criterion above might need to be restated in terms of the last two iterates, $x_n$ and $x_{n-1}$. Calling this difference $\Delta = |x_n - x_{n-1}|$, we might stop if $\Delta$ is small enough. As there are scenarios where this can happen, but the function is not at a zero, a check on the size of $f$ is needed.
However, there may be no floating point value where $f$ is exactly `0.0` so checking the size of `f(x_n)` requires some agreement.
First, if `f(x_n)` is `0.0` then it makes sense to call `x_n` an *exact zero* of $f$, even though this may hold even if `x_n`, a floating point value, is not mathematically an *exact* zero of $f$. (Consider `f(x) = x^2 - 2x + 1`. Mathematically, this is identical to `g(x) = (x-1)^2`, but `f(1 + eps())` is zero, while `g(1+eps())` is `4.930380657631324e-32`.)
However, there may never be a value with `f(x_n)` exactly `0.0`. (The value of `sin(pi)` is not zero, for example, as `pi` is an approximation to $\pi$; likewise, the `sin` of values adjacent to `float(pi)` does not produce `0.0` exactly.)
Suppose `x_n` is the closest floating point number to $\alpha$, the zero. Then $x_n = \alpha \cdot (1 + \delta)$, where the relative rounding error, $\delta = (x_n - \alpha)/\alpha$, is less than `eps()` in absolute value.
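Both situations can be seen with a few evaluations. The names `p_expanded` and `q_factored` are chosen for this illustration only, to avoid clashing with other definitions.

```{julia}
# Floating point "exact zeros" need not be mathematical zeros, and
# mathematical zeros need not evaluate to 0.0.
p_expanded(x) = x^2 - 2x + 1   # expanded form
q_factored(x) = (x - 1)^2      # factored form, mathematically identical

(p_expanded(1 + eps()), q_factored(1 + eps()),
 sin(pi), sin(prevfloat(float(pi))), sin(nextfloat(float(pi))))
```

The first two entries differ even though the two functions agree mathematically; the last three show that `sin` is not `0.0` at `float(pi)` or at either neighboring floating point value.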
How far then can `f(x_n)` be from $0 = f(\alpha)$?
$$
f(x_n) = f(x_n - \alpha + \alpha) = f(\alpha + \alpha \cdot \delta) = f(\alpha \cdot (1 + \delta)),
$$
Assuming $f$ has a derivative, the linear approximation gives:
$$
f(x_n) \approx f(\alpha) + f'(\alpha) \cdot (\alpha\delta) = f'(\alpha) \cdot \alpha \delta
$$
So we should consider `f(x_n)` an *approximate zero* when it is on the scale of $f'(\alpha) \cdot \alpha \delta$.
That $\alpha$ factor means we consider a *relative* tolerance for `f`. Also important, when `x_n` is close to `0`, is the need for an *absolute* tolerance, one not dependent on the size of `x`. So a good condition to check if `f(x_n)` is small is
`abs(f(x_n)) <= abs(x_n) * rtol + atol`, or `abs(f(x_n)) <= max(abs(x_n) * rtol, atol)`
where the relative tolerance, `rtol`, would absorb an estimate for $f'(\alpha)$.
Now, in Newton's method the update step is $f(x_n)/f'(x_n)$. Naturally when $f(x_n)$ is close to $0$, the update step is small and $\Delta$ will be close to $0$. *However*, should $f'(x_n)$ be large, then $\Delta$ can also be small and the algorithm will possibly stop, as $x_{n+1} \approx x_n$ but not necessarily $x_{n+1} \approx \alpha$. So termination on $\Delta$ alone can be off. Checking if $f(x_{n+1})$ is an approximate zero is also useful to include in the stopping criteria.
One thing to keep in mind is that the right-hand side of the rule `abs(f(x_n)) <= abs(x_n) * rtol + atol`, as a function of `x_n`, goes to `Inf` as `x_n` increases. So if `f` has `0` as an asymptote (like `e^(-x)`) for large enough `x_n`, the rule will be `true` and `x_n` could be counted as an approximate zero, despite it not being one.
So modified criteria for convergence might look like:
* stop if $\Delta$ is small and `f` is an approximate zero with some tolerances
* stop if `f` is an approximate zero with some tolerances, but be mindful that this rule can identify mathematically erroneous answers.
It is not uncommon to assign `rtol` to have a value like `sqrt(eps())` to account for accumulated floating point errors and the factor of $f'(\alpha)$, though in the `Roots` package it is set smaller by default.
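The checks described above can be sketched in a few lines. Everything named here is a made-up illustration: `is_approx_zero`, `should_stop`, and the default tolerances are assumptions for this example and are not the names or defaults used by `Roots`.

```{julia}
# A sketch of the stopping checks; names and default tolerances are
# illustrative assumptions, not the defaults used by `Roots`.
is_approx_zero(fx, x; rtol=sqrt(eps()), atol=eps()) = abs(fx) <= abs(x) * rtol + atol

function should_stop(xprev, x, fx; xatol=eps(), xrtol=eps(), kwargs...)
    Δ = abs(x - xprev)
    small_step = Δ <= xatol + xrtol * abs(x)          # the last two iterates agree
    return small_step && is_approx_zero(fx, x; kwargs...)
end

# The asymptote pitfall: exp(-x) has no zero, yet a large x passes the check on f
is_approx_zero(exp(-50.0), 50.0)
```

The last line illustrates why a check on the size of `f` alone can identify mathematically erroneous answers.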
## Questions
###### Question
Let `f(x) = tanh(x)` (the hyperbolic tangent) and `fp(x) = sech(x)^2`, its derivative.
Does *Newton's* method (using `Roots.Newton()`) converge starting at `1.0`?
```{julia}
#| hold: true
#| echo: false
yesnoq("yes")
```
Does *Newton's* method (using `Roots.Newton()`) converge starting at `1.3`?
```{julia}
#| hold: true
#| echo: false
yesnoq("no")
```
Does the secant method (using `Roots.Secant()`) converge starting at `1.3`? (A second starting value will automatically be chosen, if not directly passed in.)
```{julia}
#| hold: true
#| echo: false
yesnoq("yes")
```
###### Question
For the function `f(x) = x^5 - x - 1` both Newton's method and the secant method will converge to the one root when started from `1.0`. Using `verbose=true` as an argument to `find_zero`, (e.g., `find_zero(f, x0, Roots.Secant(), verbose=true)`) how many *more* steps does the secant method need to converge?
```{julia}
#| hold: true
#| echo: false
numericq(2)
```
Do the two methods converge to the exact same value?
```{julia}
#| hold: true
#| echo: false
yesnoq("yes")
```
###### Question
Let `f(x) = exp(x) - x^4` and `x0=8.0`. How many steps (iterations) does it take for the secant method to converge using the default tolerances?
```{julia}
#| hold: true
#| echo: false
numericq(10, 1)
```
###### Question
Let `f(x) = exp(x) - x^4` and a starting bracket be `x0 = [8, 9]`. Then calling `find_zero(f,x0, verbose=true)` will show that 49 steps are needed for exact bisection to converge. What about with the `Roots.Brent()` algorithm, which uses inverse quadratic steps when it can?
It takes how many steps?
```{julia}
#| hold: true
#| echo: false
numericq(36, 1)
```
The `Roots.A42()` method uses inverse cubic interpolation, when possible. How many steps does this method take to converge?
```{julia}
#| hold: true
#| echo: false
numericq(3, 1)
```
The large difference is due to how the tolerances are set within `Roots`. The `Brent` method gets pretty close in a few steps, but takes a much longer time to get close enough for the default tolerances.
###### Question
Consider this crazy function defined by:
```{julia}
#| eval: false
f(x) = cos(100*x)-4*erf(30*x-10)
```
(The `erf` function is the [error function](https://en.wikipedia.org/wiki/Error_function) and is in the `SpecialFunctions` package loaded with `CalculusWithJulia`.)
Make a plot over the interval $[-3,3]$ to see why it is called "crazy".
Does `find_zero` find a zero to this function starting from $0$?
```{julia}
#| hold: true
#| echo: false
yesnoq("yes")
```
If so, what is the value?
```{julia}
#| hold: true
#| echo: false
f(x) = cos(100*x)-4*erf(30*x-10)
val = find_zero(f, 0)
numericq(val)
```
If not, what is the reason?
```{julia}
#| hold: true
#| echo: false
choices = [
"The zero is a simple zero",
"The zero is not a simple zero",
"The function oscillates too much to rely on the tangent line approximation far from the zero",
"We can find an answer"
]
answ = 4
radioq(choices, answ, keep_order=true)
```
Does `find_zero` find a zero to this function starting from $1$?
```{julia}
#| hold: true
#| echo: false
yesnoq(false)
```
If so, what is the value?
```{julia}
#| hold: true
#| echo: false
numericq(-999.999)
```
If not, what is the reason?
```{julia}
#| hold: true
#| echo: false
choices = [
"The zero is a simple zero",
"The zero is not a simple zero",
"The function oscillates too much to rely on the tangent line approximations far from the zero",
"We can find an answer"
]
answ = 3
radioq(choices, answ, keep_order=true)
```

@@ -0,0 +1,72 @@
// newton's method
const b = JXG.JSXGraph.initBoard('jsxgraph', {
boundingbox: [-3,5,3,-5], axis:true
});
var f = function(x) {return x*x*x*x*x - x - 1};
var fp = function(x) { return 5*x*x*x*x - 1};
var x0 = 0.85;
var nm = function(x) { return x - f(x)/fp(x);};
var l = b.create('point', [-1.5,0], {name:'', size:0});
var r = b.create('point', [1.5,0], {name:'', size:0});
var xaxis = b.create('line', [l,r])
var P0 = b.create('glider', [x0,0,xaxis], {name:'x0'});
var P0a = b.create('point', [function() {return P0.X();},
function() {return f(P0.X());}], {name:''});
var P1 = b.create('point', [function() {return nm(P0.X());},
0], {name:''});
var P1a = b.create('point', [function() {return P1.X();},
function() {return f(P1.X());}], {name:''});
var P2 = b.create('point', [function() {return nm(P1.X());},
0], {name:''});
var P2a = b.create('point', [function() {return P2.X();},
function() {return f(P2.X());}], {name:''});
var P3 = b.create('point', [function() {return nm(P2.X());},
0], {name:''});
var P3a = b.create('point', [function() {return P3.X();},
function() {return f(P3.X());}], {name:''});
var P4 = b.create('point', [function() {return nm(P3.X());},
0], {name:''});
var P4a = b.create('point', [function() {return P4.X();},
function() {return f(P4.X());}], {name:''});
var P5 = b.create('point', [function() {return nm(P4.X());},
0], {name:'x5', strokeColor:'black'});
P0a.setAttribute({fixed:true});
P1.setAttribute({fixed:true});
P1a.setAttribute({fixed:true});
P2.setAttribute({fixed:true});
P2a.setAttribute({fixed:true});
P3.setAttribute({fixed:true});
P3a.setAttribute({fixed:true});
P4.setAttribute({fixed:true});
P4a.setAttribute({fixed:true});
P5.setAttribute({fixed:true});
var sc = '#000000';
b.create('segment', [P0,P0a], {strokeColor:sc, strokeWidth:1});
b.create('segment', [P0a, P1], {strokeColor:sc, strokeWidth:1});
b.create('segment', [P1,P1a], {strokeColor:sc, strokeWidth:1});
b.create('segment', [P1a, P2], {strokeColor:sc, strokeWidth:1});
b.create('segment', [P2,P2a], {strokeColor:sc, strokeWidth:1});
b.create('segment', [P2a, P3], {strokeColor:sc, strokeWidth:1});
b.create('segment', [P3,P3a], {strokeColor:sc, strokeWidth:1});
b.create('segment', [P3a, P4], {strokeColor:sc, strokeWidth:1});
b.create('segment', [P4,P4a], {strokeColor:sc, strokeWidth:1});
b.create('segment', [P4a, P5], {strokeColor:sc, strokeWidth:1});
b.create('functiongraph', [f, -1.5, 1.5])

File diff suppressed because it is too large

@@ -0,0 +1,412 @@
# Numeric derivatives
```{julia}
#| echo: false
import Logging
Logging.disable_logging(Logging.Info) # or e.g. Logging.Info
Logging.disable_logging(Logging.Warn)
import SymPy
function Base.show(io::IO, ::MIME"text/html", x::T) where {T <: SymPy.SymbolicObject}
println(io, "<span class=\"math-left-align\" style=\"padding-left: 4px; width:0; float:left;\"> ")
println(io, "\\[")
println(io, sympy.latex(x))
println(io, "\\]")
println(io, "</span>")
end
# hack to work around issue
import Markdown
import CalculusWithJulia
function CalculusWithJulia.WeaveSupport.ImageFile(d::Symbol, f::AbstractString, caption; kwargs...)
nm = joinpath("..", string(d), f)
u = "![$caption]($nm)"
Markdown.parse(u)
end
nothing
```
This section uses these add-on packages:
```{julia}
using CalculusWithJulia
using Plots
using ForwardDiff
using SymPy
using Roots
```
```{julia}
#| echo: false
#| results: "hidden"
using CalculusWithJulia.WeaveSupport
const frontmatter = (
title = "Numeric derivatives",
description = "Calculus with Julia: Numeric derivatives",
tags = ["CalculusWithJulia", "derivatives", "numeric derivatives"],
);
nothing
```
---
`SymPy` returns symbolic derivatives. Up to choices of simplification, these answers match those that would be derived by hand. This is useful when comparing with known answers and for seeing the structure of the answer. However, there are times we just want to work with the answer numerically. For that we have other options within `Julia`. We discuss approximate derivatives and automatic derivatives. The latter will find wide usage in these notes.
### Approximate derivatives
By approximating the limit of the secant line with a value for a small, but positive, $h$, we get an approximation to the derivative. That is
$$
f'(x) \approx \frac{f(x+h) - f(x)}{h}.
$$
This is the forward-difference approximation. The central difference approximation looks both ways:
$$
f'(x) \approx \frac{f(x+h) - f(x-h)}{2h}.
$$
Though in general they are different, they are both approximations. The central difference is usually more accurate for the same size $h$. However, both are susceptible to round-off errors. The numerator is a subtraction of like-size numbers - a perfect opportunity to lose precision.
As such there is a balancing act:
* if $h$ is too small the round-off errors are problematic,
* if $h$ is too big, the approximation to the limit is not good.
For the forward difference, $h$ values around $10^{-8}$ are typically good; for the central difference, values around $10^{-6}$ are typically good.
##### Example
Let's verify that the forward difference isn't too far off.
```{julia}
f(x) = exp(-x^2/2)
c = 1
h = 1e-8
fapprox = (f(c+h) - f(c)) / h
```
We can compare to the actual with:
```{julia}
@syms x
df = diff(f(x), x)
factual = N(df(c))
abs(factual - fapprox)
```
The error is about $1$ part in $100$ million.
The central difference is better here:
```{julia}
#| hold: true
h = 1e-6
cdapprox = (f(c+h) - f(c-h)) / (2h)
abs(factual - cdapprox)
```
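The balancing act in choosing $h$ can be seen by tabulating the errors over a range of step sizes. This sketch assumes a hypothetical helper, `fd_errors`, and uses `sin`, whose derivative `cos` is known, so the errors can be computed exactly:

```{julia}
# Compare forward and central difference errors over a range of step sizes.
# `fd_errors` is an illustrative helper; `dg` is the known derivative of `g`.
function fd_errors(g, dg, c; hs = [10.0^(-k) for k in 1:12])
    [(h = h,
      forward = abs((g(c + h) - g(c)) / h - dg(c)),
      central = abs((g(c + h) - g(c - h)) / (2h) - dg(c))) for h in hs]
end

fd_errors(sin, cos, 1.0)
```

For large `h` the error from approximating the limit dominates; for very small `h` the round-off error in the subtraction takes over.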
---
The [FiniteDifferences](https://github.com/JuliaDiff/FiniteDifferences.jl) and [FiniteDiff](https://github.com/JuliaDiff/FiniteDiff.jl) packages provide performant interfaces for differentiation based on finite differences.
### Automatic derivatives
There are some other ways to compute derivatives numerically that give much more accuracy at the expense of slightly increased computing time. Automatic differentiation is the general name for a few different approaches. These approaches promise less complexity - in some cases - than symbolic derivatives and more accuracy than approximate derivatives; the accuracy is on the order of machine precision.
The `ForwardDiff` package provides one of [several](https://juliadiff.org/) ways for `Julia` to compute automatic derivatives. `ForwardDiff` is well suited for functions encountered in these notes, which depend on at most a few variables and output no more than a few values at once.
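One way to see how a derivative can be computed to machine precision without a limit or symbolic algebra is through dual numbers, the idea underlying forward-mode automatic differentiation. The following is only a toy sketch: the type `DualSketch`, the function `derivative_sketch`, and the handful of rules defined are assumptions for illustration, covering just enough operations for one example, and are far from what `ForwardDiff` provides.

```{julia}
# A minimal dual-number sketch of forward-mode automatic differentiation.
# `DualSketch` and `derivative_sketch` are illustrative names only.
struct DualSketch
    val::Float64   # function value
    der::Float64   # derivative value carried alongside
end

Base.:+(a::DualSketch, b::DualSketch) = DualSketch(a.val + b.val, a.der + b.der)
Base.:-(a::DualSketch) = DualSketch(-a.val, -a.der)
# product rule
Base.:*(a::DualSketch, b::DualSketch) = DualSketch(a.val * b.val, a.der * b.val + a.val * b.der)
Base.:/(a::DualSketch, b::Real) = DualSketch(a.val / b, a.der / b)
# chain rule for exp
Base.exp(a::DualSketch) = DualSketch(exp(a.val), exp(a.val) * a.der)

# seed the derivative part with 1 and read off the result
derivative_sketch(g, c) = g(DualSketch(c, 1.0)).der

derivative_sketch(u -> exp(-(u * u) / 2), 1.0)
```

Each arithmetic operation propagates both a value and a derivative using the familiar differentiation rules, so no step size is involved.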
The `ForwardDiff` package was loaded in this section; in general its features are available when the `CalculusWithJulia` package is loaded, as that package provides a more convenient interface. The `derivative` function is not exported by `ForwardDiff`, so its usage requires qualification. To illustrate, to find the derivative of $f(x)$ at a *point* we have this syntax:
```{julia}
ForwardDiff.derivative(f, c) # derivative is qualified by a module name
```
The `CalculusWithJulia` package defines an operator `D` which goes from finding a derivative at a point with `ForwardDiff.derivative` to defining a function which evaluates the derivative at each point. It is defined along the lines of `D(f) = x -> ForwardDiff.derivative(f,x)` in parallel to how the derivative operation for a function is defined mathematically from the definition for its value at a point.
Here we see the error in estimating $f'(1)$:
```{julia}
fauto = D(f)(c) # D(f) is a function, D(f)(c) is the function called on c
abs(factual - fauto)
```
In this case, it is exact.
The `D` operator is defined for almost all functions in `Julia`, though, like the `diff` operator in `SymPy`, there are some for which it won't work.
##### Example
For $f(x) = \sqrt{1 + \sin(\cos(x))}$ compare the difference between the forward difference approximation with $h=10^{-8}$ and the value computed by `D` at $x=\pi/4$.
The forward difference approximation is found with:
```{julia}
𝒇(x) = sqrt(1 + sin(cos(x)))
𝒄, 𝒉 = pi/4, 1e-8
fwd = (𝒇(𝒄+𝒉) - 𝒇(𝒄))/𝒉
```
That given by `D` is:
```{julia}
ds_value = D(𝒇)(𝒄)
ds_value, fwd, ds_value - fwd
```
Finally, `SymPy` gives an exact value we use to compare:
```{julia}
𝒇𝒑 = diff(𝒇(x), x)
```
```{julia}
actual = N(𝒇𝒑(PI/4))
actual - ds_value, actual - fwd
```
#### Convenient notation
`Julia` allows the possibility of extending functions to different types. Out of the box, the `'` notation is not employed for functions, but is used for matrices. It is used in postfix position, as with `A'`. We can define it to do the same thing as `D` for functions and then, we can evaluate derivatives with the familiar `f'(x)`. This is done in `CalculusWithJulia` along the lines of `Base.adjoint(f::Function) = D(f)`.
Then, we have, for example:
```{julia}
#| hold: true
f(x) = sin(x)
f'(pi), f''(pi)
```
##### Example
Suppose our task is to find a zero of the second derivative of $k(x) = e^{-x^2/2}$ in $[0, 10]$, a known bracket. The `D` function takes a second argument to indicate the order of the derivative (e.g., `D(f,2)`), but we use the more familiar notation:
```{julia}
#| hold: true
k(x) = exp(-x^2/2)
find_zero(k'', 0..10)
```
We pass in the function object, `k''`, and not the evaluated function.
## Recap on derivatives in Julia
A quick summary for finding derivatives in `Julia`, as there are $3$ different manners:
* Symbolic derivatives are found using `diff` from `SymPy`
* Automatic derivatives are found using the notation `f'`, which uses `ForwardDiff.derivative`
* Approximate derivatives at a point, `c`, for a given `h` are found with `(f(c+h)-f(c))/h`.
For example, here all three are computed and compared:
```{julia}
#| hold: true
f(x) = exp(-x)*sin(x)
c = pi
h = 1e-8
fp = diff(f(x),x)
fp, fp(c), f'(c), (f(c+h) - f(c))/h
```
:::{.callout-note}
## Note
The use of `'` to find derivatives provided by `CalculusWithJulia` is convenient, and used extensively in these notes, but it needs to be noted that it does **not conform** with the generic meaning of `'` within `Julia`'s wider package ecosystem and may cause issues with linear algebra operations; the symbol is meant for the adjoint of a matrix.
:::
## Questions
###### Question
Find the derivative using a forward difference approximation of $f(x) = x^x$ at the point $x=2$ using `h=0.1`:
```{julia}
#| hold: true
#| echo: false
f(x) = x^x
c, h = 2, 0.1
val = (f(c+h) - f(c))/h
numericq(val)
```
Using `D` or `f'`, find the value using automatic differentiation:
```{julia}
#| hold: true
#| echo: false
f(x) = x^x
c = 2
val = f'(c)
numericq(val)
```
###### Question
Mathematically, as the value of `h` in the forward difference gets smaller the forward difference approximation gets better. On the computer, this is thwarted by floating point representation issues (in particular the error in subtracting two like-sized numbers in forming $f(x+h)-f(x)$.)
For `h = 1e-16`, what is the error (in absolute value) in finding the forward difference approximation for the derivative of $\sin(x)$ at $x=0$?
```{julia}
#| hold: true
#| echo: false
f(x) = sin(x)
h = 1e-16
c = 0
approx = (f(c+h)-f(c))/h
val = abs(cos(0) - approx)
numericq(val)
```
Repeat for $x=\pi/4$:
```{julia}
#| hold: true
#| echo: false
f(x) = sin(x)
h = 1e-16
c = pi/4
approx = (f(c+h)-f(c))/h
val = abs(cos(c) - approx)
numericq(val)
```
###### Question
Let $f(x) = x^x$. Using `D`, find $f'(3)$.
```{julia}
#| hold: true
#| echo: false
f(x) = x^x
val = D(f)(3)
numericq(val)
```
###### Question
Let $f(x) = \lvert 1 - \sqrt{1 + x}\rvert$. Using `D`, find $f'(3)$.
```{julia}
#| hold: true
#| echo: false
f(x) = abs(1 - sqrt(1 + x))
val = D(f)(3)
numericq(val)
```
###### Question
Let $f(x) = e^{\sin(x)}$. Using `D`, find $f'(3)$.
```{julia}
#| hold: true
#| echo: false
f(x) = exp(sin(x))
val = D(f)(3)
numericq(val)
```
###### Question
For `Julia`'s `airyai` function find a numeric derivative using the forward difference. For $c=3$ and $h=10^{-8}$ find the forward difference approximation to $f'(3)$ for the `airyai` function.
```{julia}
#| hold: true
#| echo: false
h = 1e-8
c = 3
val = (airyai(c+h) - airyai(c))/h
numericq(val)
```
###### Question
Find the rate of change with respect to time of the function $f(t)= 64 - 16t^2$ at $t=1$.
```{julia}
#| hold: true
#| echo: false
fp_(t) = -16*2*t
c = 1
numericq(fp_(c))
```
###### Question
Find the rate of change with respect to height, $h$, of $f(h) = 32h^3 - 62 h + 12$ at $h=2$.
```{julia}
#| hold: true
#| echo: false
fp_(h) = 3*32h^2 - 62
c = 2
numericq(fp_(2))
```

@@ -0,0 +1,36 @@
// inscribe trapezoid
var R = 5;
var Delta = 0.5
const b = JXG.JSXGraph.initBoard('jsxgraph', {
boundingbox: [-R-Delta,R+Delta,R+Delta,-1], axis:true
});
var xax = b.create("segment", [[0,0],[R,0]]);
var P4 = b.create("glider", [R/2,0, xax], {name: "P_4=(r,0)"});
var CL = b.create('point', [function() {return -P4.X()},0], {name:''});
var CR = b.create('point', [function() {return P4.X()},0], {name:''});
var C = b.create('semicircle', [CL,CR]);
var Crestricted = b.create("functiongraph",
[function(x) {
var r = P4.X();
var y = Math.sqrt(r*r - x*x);
return y;
}, 0, function() {return P4.X()}]);
var P3 = b.create("glider", [
P4.X()/2,
Math.sqrt(P4.X()*P4.X()*(1 - 1/4)),
Crestricted], {name:"P_3=(x,y)"});
var P1 = b.create('point', [function() {return -Math.abs(P4.X());},
function() {return P4.Y();}], {name:'P_1'});
var P2 = b.create('point', [function() {return -P3.X();},
function() {return P3.Y();}], {name:'P_2'});
var poly = b.create('polygon',[P1, P2, P3, P4], { borders:{strokeColor:'black'} });
b.create('text',[-1.5,.25, function(){ return 'Area='+ poly.Area().toFixed(1); }]);

File diff suppressed because it is too large

@@ -0,0 +1,44 @@
using WeavePynb
using CwJWeaveTpl
fnames = [
"derivatives", ## more questions
"numeric_derivatives",
"mean_value_theorem",
"optimization",
"curve_sketching",
"linearization",
"newtons_method",
"lhopitals_rule", ## Okay - -but could beef up questions..
"implicit_differentiation", ## add more questions?
"related_rates",
"taylor_series_polynomials"
]
process_file(nm; cache=:off) = CwJWeaveTpl.mmd(nm * ".jmd", cache=cache)
function process_files(;cache=:user)
for f in fnames
@show f
process_file(f, cache=cache)
end
end
"""
## TODO derivatives
tangent lines intersect at average for a parabola
Should we have derivative results: inverse functions, logarithmic differentiation...
"""

@@ -0,0 +1,843 @@
# Related rates
```{julia}
#| echo: false
import Logging
Logging.disable_logging(Logging.Info) # or e.g. Logging.Info
Logging.disable_logging(Logging.Warn)
import SymPy
function Base.show(io::IO, ::MIME"text/html", x::T) where {T <: SymPy.SymbolicObject}
println(io, "<span class=\"math-left-align\" style=\"padding-left: 4px; width:0; float:left;\"> ")
println(io, "\\[")
println(io, sympy.latex(x))
println(io, "\\]")
println(io, "</span>")
end
# hack to work around issue
import Markdown
import CalculusWithJulia
function CalculusWithJulia.WeaveSupport.ImageFile(d::Symbol, f::AbstractString, caption; kwargs...)
nm = joinpath("..", string(d), f)
u = "![$caption]($nm)"
Markdown.parse(u)
end
nothing
```
This section uses these add-on packages:
```{julia}
using CalculusWithJulia
using Plots
using Roots
using SymPy
```
```{julia}
#| echo: false
#| results: "hidden"
using CalculusWithJulia.WeaveSupport
fig_size=(800, 600)
const frontmatter = (
title = "Related rates",
description = "Calculus with Julia: Related rates",
tags = ["CalculusWithJulia", "derivatives", "related rates"],
);
nothing
```
---
Related rates problems involve two (or more) unknown quantities that are related through an equation. As the two variables depend on each other, so do their rates of change with respect to some variable, which is often time. Exactly how the rates are related remains to be discovered. Hence the name "related rates."
#### Examples
The following is a typical "book" problem:
> A screen saver displays the outline of a $3$ cm by $2$ cm rectangle and then expands the rectangle in such a way that the $2$ cm side is expanding at the rate of $4$ cm/sec and the proportions of the rectangle never change. How fast is the area of the rectangle increasing when its dimensions are $12$ cm by $8$ cm? [Source.](http://oregonstate.edu/instruct/mth251/cq/Stage9/Practice/ratesProblems.html)
```{julia}
#| hold: true
#| echo: false
#| cache: true
### {{{growing_rects}}}
## A rectangle growing in time with fixed proportions
function growing_rects_graph(n)
w = (t) -> 2 + 4t
h = (t) -> 3/2 * w(t)
t = n - 1
w_2 = w(t)/2
h_2 = h(t)/2
w_n = w(5)/2
h_n = h(5)/2
plt = plot(w_2 * [-1, -1, 1, 1, -1], h_2 * [-1, 1, 1, -1, -1], xlim=(-17,17), ylim=(-17,17),
legend=false, size=fig_size)
annotate!(plt, [(-1.5, 1, "Area = $(round(Int, 4*w_2*h_2))")])
plt
end
caption = L"""
As $t$ increases, the size of the rectangle grows. The ratio of width to height is fixed. If we know the rate of change in time for the width ($dw/dt$) and the height ($dh/dt$) can we tell the rate of change of *area* with respect to time ($dA/dt$)?
"""
n=6
anim = @animate for i=1:n
growing_rects_graph(i)
end
imgfile = tempname() * ".gif"
gif(anim, imgfile, fps = 1)
ImageFile(imgfile, caption)
```
Here we know $A = w \cdot h$ and we know some things about how $w$ and $h$ are related *and* about the rate at which both $w$ and $h$ grow in time $t$. That means we can express this growth in terms of some functions $w(t)$ and $h(t)$, so the area, as a function of $t$, is:
$$
A(t) = w(t) \cdot h(t).
$$
By the product rule, the *rate of change* of area with respect to time, $A'(t)$, is just:
$$
A'(t) = w'(t) h(t) + w(t) h'(t).
$$
As an aside, it is fairly conventional to suppress the $(t)$ part of the notation $A=wh$ and to use the Leibniz notation for derivatives:
$$
\frac{dA}{dt} = \frac{dw}{dt} h + w \frac{dh}{dt}.
$$
This relationship is true for all $t$, but the problem discusses a certain value of $t$ - when $w(t)=8$ and $h(t) = 12$. At this same value of $t$, we have $w'(t) = 4$ and so $h'(t) = 6$. Substituting these 4 values into the 4 unknowns in the formula for $A'(t)$ gives:
$$
A'(t) = 4 \cdot 12 + 8 \cdot 6 = 96.
$$
Summarizing, from the relationship between $A$, $w$, and $h$, there is a relationship between their rates of growth with respect to $t$, a time variable. Using this and known values, we can compute, in this case, $A'$ at the specific $t$.
We could also have done this differently. We would recognize the following:
* The area of a rectangle is just:
```{julia}
A(w,h) = w * h
```
* The width - expanding at a rate of $4$ cm/sec from a starting value of $2$ - must satisfy:
```{julia}
w(t) = 2 + 4*t
```
* The height is a constant proportion of the width:
```{julia}
h(t) = 3/2 * w(t)
```
This means again that area depends on $t$ through this formula:
```{julia}
A(t) = A(w(t), h(t))
```
This is why the rates of change are related: as $w$ and $h$ change in time, the functional relationship with $A$ means $A$ also changes in time.
Now to answer the question, when the width is 8, we must have that $t$ is:
```{julia}
tstar = find_zero(x -> w(x) - 8, [0, 4]) # or solve by hand to get 3/2
```
The question is to find the rate the area is increasing at the given time $t$, which is $A'(t)$ or $dA/dt$. We get this by performing the differentiation, then substituting in the value.
Here we do so with the aid of `Julia`, though this problem could readily be done "by hand."
We have expressed $A$ as a function of $t$ by composition, so can differentiate that:
```{julia}
A'(tstar)
```
---
Now what? Why is $96$ of any interest? It is if the value at a specific time is needed. But in general, a better question might be to understand if there is some pattern to the numbers in the figure, these being $6, 54, 150, 294, 486, 726$. Their differences are the *average* rate of change:
```{julia}
xs = [6, 54, 150, 294, 486, 726]
ds = diff(xs)
```
Those seem to be increasing by a fixed amount each time, which we can see by one more application of `diff`:
```{julia}
diff(ds)
```
How can this relationship be summarized? Well, let's go back to what we know, though this time using symbolic math:
```{julia}
@syms t
diff(A(t), t)
```
This should be clear: the rate of change, $dA/dt$, is increasing linearly, hence the second derivative, $d^2A/dt^2$, would be constant, just as we saw for the differences of the average rates of change.
So, for this problem, a constant rate of change in width and height leads to a linear rate of change in area. Put otherwise, linear growth in both width and height leads to quadratic growth in area.
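As a quick check of this pattern, the following minimal sketch recomputes the areas at $t = 0, 1, \dots, 5$ using only base `Julia`. The names `rect_w`, `rect_h`, and `rect_area` are chosen here for illustration so that they do not clash with the definitions used earlier:

```julia
# Width and height as in the example: w(t) = 2 + 4t, h(t) = (3/2) w(t)
rect_w(t) = 2 + 4t
rect_h(t) = 3 * rect_w(t) / 2
rect_area(t) = rect_w(t) * rect_h(t)

areas = [rect_area(t) for t in 0:5]   # areas at t = 0, 1, ..., 5
first_diffs = diff(areas)             # average rates of change over unit intervals
second_diffs = diff(first_diffs)      # constant when the growth is quadratic
```

The second differences all equal $48$, which agrees with the second derivative of $A(t) = (2 + 4t)(3 + 6t) = 6 + 24t + 24t^2$.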
##### Example
A ladder, with length $l$, is leaning against a wall. We parameterize this problem so that the top of the ladder is at $(0,h)$ and the bottom at $(b, 0)$. Then $l^2 = h^2 + b^2$ is a constant.
If the ladder starts to slip away at the base, but remains in contact with the wall, express the rate of change of $h$ with respect to $t$ in terms of $db/dt$.
We have from implicitly differentiating in $t$ the equation $l^2 = h^2 + b^2$, noting that $l$ is a constant, that:
$$
0 = 2h \frac{dh}{dt} + 2b \frac{db}{dt}.
$$
Solving, yields:
$$
\frac{dh}{dt} = -\frac{b}{h} \cdot \frac{db}{dt}.
$$
* If $l = 12$ and it is known that $db/dt = 2$ when $b=4$, find $dh/dt$.
We just need to find $h$ for this value of $b$, as the other two quantities in the last equation are known.
But $h = \sqrt{l^2 - b^2}$, so the answer is:
```{julia}
len, bottom, dbdt = 12, 4, 2   # `len`, not `length`, to avoid shadowing Base.length
height = sqrt(len^2 - bottom^2)
-bottom/height * dbdt
```
* What happens to the rate as $b$ goes to $l$?
As $b$ goes to $l$, $h$ goes to $0$, so $b/h$ blows up. Unless $db/dt$ goes to $0$, the expression will become $-\infty$.
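To see this blow-up numerically, here is a minimal sketch using only base `Julia`. The function name `ladder_dhdt` and the sample values of $b$ are assumptions made for illustration; the defaults $l = 12$ and $db/dt = 2$ are the values from the example:

```julia
# Rate of change of the height, dh/dt = -(b/h) * db/dt, with h = sqrt(l^2 - b^2).
# The keyword defaults l = 12 and dbdt = 2 are the values from the example.
ladder_dhdt(b; l=12, dbdt=2) = -b / sqrt(l^2 - b^2) * dbdt

bs = [4, 8, 11, 11.9, 11.99, 11.999]
rates = ladder_dhdt.(bs)    # grows without bound in magnitude as b approaches l
```

With $db/dt$ held fixed, the computed rates become more and more negative as $b$ nears $l$.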
:::{.callout-note}
## Note
Often, this problem is presented with $db/dt$ having a constant rate. In this case, the ladder problem defies physics, as $dh/dt$ eventually is faster than the speed of light as $h \rightarrow 0+$. In practice, were $db/dt$ kept at a constant, the ladder would necessarily come away from the wall. The trajectory would follow that of a tractrix were there no gravity to account for.
:::
##### Example
```{julia}
#| hold: true
#| echo: false
caption = "A man and woman walk towards the light."
imgfile = "figures/long-shadow-noir.png"
ImageFile(:derivatives, imgfile, caption)
```
Shadows are a staple of film noir. In the photo, suppose a man and a woman walk towards a street light. As they approach the light the length of their shadow changes.
Suppose we focus on the $5$ foot tall woman. Her shadow comes from a streetlight $12$ feet high. She is walking at $3$ feet per second towards the light. What is the rate of change of her shadow?
The setup for this problem involves drawing a right triangle with height $12$ and base given by the distance $x$ of the woman from the light *plus* the length $l$ of the shadow. There is a similar triangle formed by the woman's height with a base of length $l$. Equating the ratios of the sides gives:
$$
\frac{5}{l} = \frac{12}{x + l}
$$
As we need to take derivatives, we work with the reciprocal relationship:
$$
\frac{l}{5} = \frac{x + l}{12}
$$
Differentiating in $t$ gives:
$$
\frac{l'}{5} = \frac{x' + l'}{12}
$$
Or
$$
l' \cdot \left(\frac{1}{5} - \frac{1}{12}\right) = \frac{x'}{12}
$$
Solving for $l'$ gives an answer in terms of $x'$, the rate the woman is walking. In this description $x$ is getting shorter, so $x'$ would be $-3$ feet per second and the shadow length would be decreasing at a rate proportional to the walking speed.
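The last step can be carried out exactly with `Julia`'s rational numbers. This is only a sketch of the arithmetic in the equation above; the names `dxdt` and `dldt` are chosen here for illustration:

```julia
# Solve l' * (1/5 - 1/12) = x'/12 for l' using exact rational arithmetic.
# dxdt = -3 because the distance to the light is decreasing.
dxdt = -3
dldt = (dxdt // 12) / (1 // 5 - 1 // 12)
```

Under these assumptions the shadow shortens at $15/7$ feet per second, that is, $5/7$ of the walking speed.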
##### Example
```{julia}
#| hold: true
#| echo: false
p = plot(; axis=nothing, border=:none, legend=false, aspect_ratio=:equal)
scatter!(p, [0],[50], color=:yellow, markersize=50)
plot!(p, [0, 50], [0,0], linestyle=:dash)
plot!(p, [0,50], [50,0], linestyle=:dot)
plot!(p, [25,25],[25,0], linewidth=5, color=:black)
plot!(p, [25,50], [0,0], linewidth=2, color=:black)
```
The sun is setting at the rate of $1/20$ radian/min, and appears to be dropping perpendicular to the horizon, as depicted in the figure. How fast is the shadow of a $25$ meter wall lengthening at the moment when the shadow is $25$ meters long?
Let the shadow length be labeled $x$, as it appears on the $x$ axis above. Then we have by right-angle trigonometry:
$$
\tan(\theta) = \frac{25}{x}
$$
or $x\tan(\theta) = 25$.
As $t$ evolves, we know $d\theta/dt$ but what is $dx/dt$? Using implicit differentiation yields:
$$
\frac{dx}{dt} \cdot \tan(\theta) + x \cdot (\sec^2(\theta)\cdot \frac{d\theta}{dt}) = 0
$$
Substituting known values and identifying $\theta=\pi/4$ when the shadow length, $x$, is $25$ gives:
$$
\frac{dx}{dt} \cdot \tan(\pi/4) + 25 \cdot \left((4/2) \cdot \frac{-1}{20}\right) = 0
$$
This can be solved for the unknown: $dx/dt = 50/20$.
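As a sanity check, the value can be compared with a finite-difference approximation. The sketch below uses only base `Julia` and assumes, for illustration, that time is measured so that $\theta = \pi/4$ at $t = 0$; the names `shadow` and `sun_angle` are hypothetical:

```julia
# Shadow length as a function of the sun's angle: x = 25 / tan(theta)
shadow(theta) = 25 / tan(theta)

# Assumed parameterization: theta = pi/4 at t = 0, decreasing at 1/20 radian/min
sun_angle(t) = pi/4 - t/20

# Central-difference approximation to dx/dt at t = 0
dt = 1e-6
dxdt_numeric = (shadow(sun_angle(dt)) - shadow(sun_angle(-dt))) / (2dt)

# Value from implicit differentiation: dx/dt = -x * sec^2(theta) * dtheta/dt / tan(theta)
dxdt_formula = -25 * sec(pi/4)^2 * (-1/20) / tan(pi/4)
```

Both values should be close to $50/20$ meters per minute.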
##### Example
A batter hits a ball toward third base at $75$ ft/sec and runs toward first base at a rate of $24$ ft/sec. At what rate does the distance between the ball and the batter change when $2$ seconds have passed?
We will answer this with `SymPy`. First we create some symbols for the movement of the ball toward third base, `b(t)`, the runner toward first base, `r(t)`, and the two velocities. We use symbolic functions for the movements, as we will be differentiating them in time:
```{julia}
@syms b() r() v_b v_r
d = sqrt(b(t)^2 + r(t)^2)
```
The distance formula applies to give $d$. As the ball and runner are moving in a perpendicular direction, the formula is easy to apply.
We can differentiate `d` in terms of `t` and in the process we also find the derivatives of `b` and `r`:
```{julia}
db, dr = diff(b(t),t), diff(r(t),t) # b(t), r(t) -- symbolic functions
dd = diff(d,t) # d -- not d(t) -- an expression
```
The slight difference in the commands is due to `b` and `r` being symbolic functions, whereas `d` is a symbolic expression. Now we begin substituting. First, from the problem `db` is just the velocity in the ball's direction, or `v_b`. Similarly for `v_r`:
```{julia}
ddt = subs(dd, db => v_b, dr => v_r)
```
Now, we can substitute in for `b(t)`, as it is `v_b*t`, etc.:
```{julia}
ddt₁ = subs(ddt, b(t) => v_b * t, r(t) => v_r * t)
```
This finds the rate of change of the distance in time for any `t` with symbolic values of the velocities. (And shows how the answer doesn't actually depend on $t$.) The problem's answer comes from a last substitution:
```{julia}
ddt₁(t => 2, v_b => 75, v_r => 24)
```
Were this done by "hand," it would be better to work with distance squared to avoid the expansion of complexity from the square root. That is, using implicit differentiation:
$$
\begin{align*}
d^2 &= b^2 + r^2\\
2d\cdot d' &= 2b\cdot b' + 2r\cdot r'\\
d' &= (b\cdot b' + r \cdot r')/d\\
d' &= (tb'\cdot b' + tr' \cdot r')/d\\
d' &= \left((b')^2 + (r')^2\right) \cdot \frac{t}{d}.
\end{align*}
$$
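The by-hand formula can be checked numerically without `SymPy`. This is a minimal sketch assuming constant speeds, so that $b(t) = 75t$ and $r(t) = 24t$; the function names are chosen here for illustration:

```julia
# Positions along the perpendicular base paths, assuming constant speeds
ball(t) = 75t        # feet from home plate toward third base
runner(t) = 24t      # feet from home plate toward first base
separation(t) = hypot(ball(t), runner(t))

# Central-difference approximation to d'(2)
dt = 1e-6
rate_numeric = (separation(2 + dt) - separation(2 - dt)) / (2dt)

# Closed form from the last line above: ((b')^2 + (r')^2) * t / d
rate_formula = (75^2 + 24^2) * 2 / separation(2)
```

Since $d = t\sqrt{(b')^2 + (r')^2}$ here, the factor $t/d$ is constant, which is why the rate does not depend on $t$.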
##### Example
```{julia}
#| hold: true
#| echo: false
#| cache: true
###{{{baseball_been_berry_good}}}
## The flight of a ball as tracked by a stationary outfielder
function baseball_been_berry_good_graph(n)
v0 = 15
x = (t) -> 50t
y = (t) -> v0*t - 5 * t^2
ns = range(.25, stop=3, length=8)
t = ns[n]
ts = range(0, stop=t, length=50)
xs = map(x, ts)
ys = map(y, ts)
degrees = atand(y(t)/(100-x(t)))
degrees = degrees < 0 ? 180 + degrees : degrees
plt = plot(xs, ys, legend=false, size=fig_size, xlim=(0,150), ylim=(0,15))
plot!(plt, [x(t), 100], [y(t), 0.0], color=:orange)
annotate!(plt, [(55, 4,"θ = $(round(Int, degrees)) degrees"),
(x(t), y(t), "($(round(Int, x(t))), $(round(Int, y(t))))")])
end
caption = L"""
The flight of the ball as being tracked by a stationary outfielder. This ball will go over the head of the player. What can the player tell from the quantity $d\theta/dt$?
"""
n = 8
anim = @animate for i=1:n
baseball_been_berry_good_graph(i)
end
imgfile = tempname() * ".gif"
gif(anim, imgfile, fps = 1)
ImageFile(imgfile, caption)
```
A baseball player stands $100$ meters from home base. A batter hits the ball directly at the player so that the distance from home plate is $x(t)$ and the height is $y(t)$.
The player tracks the flight of the ball in terms of the angle $\theta$ made between the ball and the player. This will satisfy:
$$
\tan(\theta) = \frac{y(t)}{100 - x(t)}.
$$
What is the rate of change of $\theta$ with respect to $t$ in terms of that of $x$ and $y$?
We have by the chain rule and quotient rule:
$$
\sec^2(\theta) \theta'(t) = \frac{y'(t) \cdot (100 - x(t)) - y(t) \cdot (-x'(t))}{(100 - x(t))^2}.
$$
If we have $x(t) = 50t$ and $y(t)=v_{0y} t - 5 t^2$, when is the rate of change of the angle happening most quickly?
The formula for $\theta'(t)$ is
$$
\theta'(t) = \cos^2(\theta) \cdot \frac{y'(t) \cdot (100 - x(t)) - y(t) \cdot (-x'(t))}{(100 - x(t))^2}.
$$
This question requires us to differentiate *again* in $t$. Since we have fairly explicit functions for $x$ and $y$, we will use `SymPy` to do this.
```{julia}
@syms theta()
v0 = 5
x(t) = 50t
y(t) = v0*t - 5 * t^2
eqn = tan(theta(t)) - y(t) / (100 - x(t))
```
```{julia}
thetap = diff(theta(t),t)
dtheta = solve(diff(eqn, t), thetap)[1]
```
We could proceed directly by evaluating:
```{julia}
d2theta = diff(dtheta, t)(thetap => dtheta)
```
That is not so tractable, however.
It helps to simplify $\cos^2(\theta(t))$ using basic right-triangle trigonometry. Recall, $\theta$ comes from a right triangle with height $y(t)$ and length $(100 - x(t))$. The cosine of this angle will be $100 - x(t)$ divided by the length of the hypotenuse. So we can substitute:
```{julia}
dtheta₁ = dtheta(cos(theta(t))^2 => (100 -x(t))^2/(y(t)^2 + (100-x(t))^2))
```
Plotting reveals some interesting things. For $v_{0y} < 10$ we have graphs that look like:
```{julia}
plot(dtheta₁, 0, v0/5)
```
The ball will drop in front of the player, and the change in $d\theta/dt$ is monotonic.
But let's rerun the code with $v_{0y} > 10$:
```{julia}
#| hold: true
v0 = 15
x(t) = 50t
y(t) = v0*t - 5 * t^2
eqn = tan(theta(t)) - y(t) / (100 - x(t))
thetap = diff(theta(t),t)
dtheta = solve(diff(eqn, t), thetap)[1]
dtheta₁ = subs(dtheta, cos(theta(t))^2, (100 - x(t))^2/(y(t)^2 + (100 - x(t))^2))
plot(dtheta₁, 0, v0/5)
```
In the second case we have a different shape. The graph is not monotonic, and before the peak there is an inflection point. Without thinking too hard, we can see that the greatest change in the angle is when it is just above the head ($t=2$ has $x(t)=100$).
That these two graphs differ so much means that the player may be able to read whether the ball is going to go over his or her head by paying attention to how the ball is being tracked.
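The same comparison can be sketched numerically with base `Julia` alone. The function `angle_rate` below is a hypothetical helper that evaluates the formula for $\theta'(t)$ after substituting $\cos^2(\theta) = (100 - x)^2/((100-x)^2 + y^2)$; the grids of time values are arbitrary choices:

```julia
# theta'(t) from the formula above, using cos^2(theta) = (100 - x)^2 / ((100 - x)^2 + y^2),
# which simplifies to (y' * (100 - x) + y * x') / ((100 - x)^2 + y^2).
function angle_rate(t, v0)
    x, dx = 50t, 50
    y, dy = v0 * t - 5t^2, v0 - 10t
    (dy * (100 - x) + y * dx) / ((100 - x)^2 + y^2)
end

ts_short = range(0, stop=1, length=101)     # v0 = 5: the ball lands at t = v0/5 = 1
ts_long = range(0, stop=3, length=3001)     # v0 = 15: the ball lands at t = 3
rates_short = angle_rate.(ts_short, 5)
rates_long = angle_rate.(ts_long, 15)

t_peak = ts_long[argmax(rates_long)]        # time of the fastest change for v0 = 15
```

For $v_{0y} = 5$ the sampled rates only decrease, while for $v_{0y} = 15$ they rise to a peak close to $t=2$ and then fall.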
##### Example
Hipster pour-over coffee is made with a conical coffee filter. The cone is actually a [frustum](http://en.wikipedia.org/wiki/Frustum) of a cone with a small tip, of radius say $r_0$, chopped off. We will parameterize our cone by a value $h \geq 0$ on the $y$ axis and an angle $\theta$ formed by a side and the $y$ axis. Then the coffee filter is the part of the cone between some $h_0$ (related by $r_0=h_0 \tan(\theta)$) and $h$.
The volume of a cone of height $h$ is $V(h) = \pi/3 h \cdot R^2$. From the geometry, $R = h\tan(\theta)$. The volume of the filter then is:
$$
V = V(h) - V(h_0).
$$
What is $dV/dh$ in terms of $dR/dh$?
Differentiating implicitly gives:
$$
\frac{dV}{dh} = \frac{\pi}{3} ( R(h)^2 + h \cdot 2 R \frac{dR}{dh}).
$$
We see that it depends on $R$ and the change in $R$ with respect to $h$. However, we visualize $h$ - the height - so it is better to re-express. Clearly, $dR/dh = \tan\theta$ and using $R(h) = h \tan(\theta)$ we get:
$$
\frac{dV}{dh} = \pi h^2 \tan^2(\theta).
$$
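A finite-difference check of this simplification, using only base `Julia`. The values of `theta0` and `h` are arbitrary choices for illustration, and `cone_volume` is a hypothetical helper name:

```julia
# Volume of a cone of height h with half-angle theta, where R = h * tan(theta).
# The value theta0 = pi/6 is an arbitrary choice for illustration.
cone_volume(h, theta) = pi / 3 * h * (h * tan(theta))^2

theta0, h = pi / 6, 2.0
dh = 1e-6
dVdh_numeric = (cone_volume(h + dh, theta0) - cone_volume(h - dh, theta0)) / (2dh)
dVdh_formula = pi * h^2 * tan(theta0)^2
```

The two values should agree to several digits, as the constant $V(h_0)$ does not affect the derivative in $h$.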
The rate of change goes down as $h$ gets smaller ($h \geq h_0$) and gets bigger for bigger $\theta$.
How do the quantities vary in time?
For an incompressible fluid, balancing the volume leaving through the bottom against the change in volume at the fluid's surface shows that $dh/dt$ is the ratio of the cross-sectional area at the bottom to that at the height of the fluid, $(\pi \cdot (h_0\tan(\theta))^2) / (\pi \cdot (h\tan(\theta))^2)$, times the outward velocity of the fluid.
That is, $dh/dt = (h_0/h)^2 \cdot v$. This makes sense: larger openings ($h_0$) mean more fluid is lost per unit time, so the height changes faster; higher levels ($h$) mean the change in height is slower, as the cross sections there have more area.
By [Torricelli's](http://en.wikipedia.org/wiki/Torricelli's_law) law, the outflow velocity follows the law $v = \sqrt{2g(h-h_0)}$. This gives:
$$
\frac{dh}{dt} = \frac{h_0^2}{h^2} \cdot v = \frac{h_0^2}{h^2} \sqrt{2g(h-h_0)}.
$$
If $h \gg h_0$, then $\sqrt{h-h_0} = \sqrt{h}\sqrt{1 - h_0/h} \approx \sqrt{h}(1 - (1/2)(h_0/h)) \approx \sqrt{h}$. So the rate of change of height in time is like $1/h^{3/2}$.
Now, by the chain rule, we have then the rate of change of volume with respect to time, $dV/dt$, is:
$$
\begin{align*}
\frac{dV}{dt} &=
\frac{dV}{dh} \cdot \frac{dh}{dt}\\
&= \pi h^2 \tan^2(\theta) \cdot \frac{h_0^2}{h^2} \sqrt{2g(h-h_0)} \\
&= \pi \sqrt{2g} \cdot (r_0)^2 \cdot \sqrt{h-h_0} \\
&\approx \pi \sqrt{2g} \cdot r_0^2 \cdot \sqrt{h}.
\end{align*}
$$
This rate depends on the square of the size of the opening ($r_0^2$) and the square root of the height ($h$), but not the angle of the cone.
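The chain-rule computation can be sketched numerically. All numeric values below ($g$ in m/s$^2$, the angle, $h_0$, and $h$) are assumptions chosen only to illustrate the algebra, with $h$ ten times $h_0$:

```julia
g = 9.8                               # gravitational acceleration in m/s^2 (assumed units)
theta0, h0, h1 = pi / 6, 0.01, 0.1    # illustrative values only

dVdh = pi * h1^2 * tan(theta0)^2
dhdt = (h0 / h1)^2 * sqrt(2g * (h1 - h0))
dVdt_chain = dVdh * dhdt              # dV/dt = dV/dh * dh/dt

r0 = h0 * tan(theta0)
dVdt_exact = pi * sqrt(2g) * r0^2 * sqrt(h1 - h0)
dVdt_approx = pi * sqrt(2g) * r0^2 * sqrt(h1)     # drops h0 inside the square root
```

Here `dVdt_chain` and `dVdt_exact` agree up to rounding, while `dVdt_approx` overestimates slightly because $h_0/h$ is small but not negligible.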
## Questions
###### Question
Supply and demand. Suppose demand for product $XYZ$ is $d(x)$ and supply is $s(x)$. The excess demand is $d(x) - s(x)$. Suppose this is positive. How does this influence price? Guess the "law" of economics that applies:
```{julia}
#| hold: true
#| echo: false
choices = [
"The rate of change of price will be ``0``",
"The rate of change of price will increase",
"The rate of change of price will be positive and will depend on the rate of change of excess demand."
]
answ = 3
radioq(choices, answ, keep_order=true)
```
(Theoretically, when demand exceeds supply, prices increase.)
###### Question
Which makes more sense from an economic viewpoint?
```{julia}
#| hold: true
#| echo: false
choices = [
"If the rate of change of unemployment is negative, the rate of change of wages will be negative.",
"If the rate of change of unemployment is negative, the rate of change of wages will be positive."
]
answ = 2
radioq(choices, answ, keep_order=true)
```
(Colloquially, "the rate of change of unemployment is negative" means the unemployment rate is going down, so there are fewer workers available to fill new jobs.)
###### Question
In chemistry there is a fundamental relationship between pressure ($P$), temperature ($T$) and volume ($V$) given by $PV=cT$ where $c$ is a constant. Which of the following would be true with respect to time?
```{julia}
#| hold: true
#| echo: false
choices = [
L"The rate of change of pressure is always increasing by $c$",
"If volume is constant, the rate of change of pressure is proportional to the temperature",
"If volume is constant, the rate of change of pressure is proportional to the rate of change of temperature",
"If pressure is held constant, the rate of change of pressure is proportional to the rate of change of temperature"]
answ = 3
radioq(choices, answ, keep_order=true)
```
###### Question
A pebble is thrown into a lake causing ripples to form expanding circles. Suppose one of the circles expands at a rate of $1$ foot per second. When the radius of the circle is $10$ feet, what is the rate of change of the area enclosed by the circle?
```{julia}
#| hold: true
#| echo: false
# a = pi*r^2
# da/dt = pi * 2r * drdt
r = 10; drdt = 1
val = pi * 2r * drdt
numericq(val, units=L"feet$^2$/second")
```
###### Question
A pizza maker tosses some dough in the air. The dough is formed in a circle with radius $10$ inches. As it rotates, its area increases at a rate of $1$ inch$^2$ per second. What is the rate of change of the radius?
```{julia}
#| hold: true
#| echo: false
# a = pi*r^2
# da/dt = pi * 2r * drdt
r = 10; dadt = 1
val = dadt /( pi * 2r)
numericq(val, units="inches/second")
```
###### Question
An FBI agent with a powerful spyglass is located in a boat anchored 400 meters offshore. A gangster under surveillance is driving along the shore. Assume the shoreline is straight and that the gangster is 1 km from the point on the shore nearest to the boat. If the spyglasses must rotate at a rate of $\pi/4$ radians per minute to track the gangster, how fast is the gangster moving? (In kilometers per minute.) [Source.](http://oregonstate.edu/instruct/mth251/cq/Stage9/Practice/ratesProblems.html)
```{julia}
#| hold: true
#| echo: false
## tan(theta) = x/y
## sec^2(theta) dtheta/dt = 1/y dx/dt (y is constant)
## dxdt = y sec^2(theta) dtheta/dt
dthetadt = pi/4
y0 = .4; x0 = 1.0
theta = atan(x0/y0)
val = y0 * sec(theta)^2 * dthetadt
numericq(val, units="kilometers/minute")
```
###### Question
A flood lamp is installed on the ground 200 feet from a vertical wall. A six foot tall man is walking towards the wall at the rate of 4 feet per second. How fast is the tip of his shadow moving down the wall when he is 50 feet from the wall? [Source.](http://oregonstate.edu/instruct/mth251/cq/Stage9/Practice/ratesProblems.html) (As the question is written the answer should be positive.)
```{julia}
#| hold: true
#| echo: false
## y/200 = 6/x
## dydt = 200 * 6 * -1/x^2 dxdt
x0 = 200 - 50
dxdt = 4
val = 200 * 6 * (1/x0^2) * dxdt
numericq(val, units="feet/second")
```
###### Question
Consider the hyperbola $y = 1/x$ and think of it as a slide. A particle slides along the hyperbola so that its x-coordinate is increasing at a rate of $f(x)$ units/sec. If its $y$-coordinate is decreasing at a constant rate of $1$ unit/sec, what is $f(x)$? [Source.](http://oregonstate.edu/instruct/mth251/cq/Stage9/Practice/ratesProblems.html)
```{julia}
#| hold: true
#| echo: false
choices = [
"``f(x) = 1/x``",
"``f(x) = x^0``",
"``f(x) = x``",
"``f(x) = x^2``"
]
answ = 4
radioq(choices, answ, keep_order=true)
```
###### Question
A balloon is in the shape of a sphere, fortunately, as this gives a known formula, $V=4/3 \pi r^3$, for the volume. If the balloon is being filled so that the rate of change of volume per unit time is $2$ and the radius is $3$, what is the rate of change of the radius per unit time?
```{julia}
#| hold: true
#| echo: false
r, dVdt = 3, 2
drdt = dVdt / (4 * pi * r^2)
numericq(drdt, units="units per unit time")
```
###### Question
Consider the curve $f(x) = x^2 - \log(x)$. For a given $x$, the tangent line intersects the $y$ axis. Where?
```{julia}
#| hold: true
#| echo: false
choices = [
"``y = 1 - x^2 - \\log(x)``",
"``y = 1 - x^2``",
"``y = 1 - \\log(x)``",
"``y = x(2x - 1/x)``"
]
answ = 1
radioq(choices, answ)
```
If $dx/dt = -1$, what is $dy/dt$?
```{julia}
#| hold: true
#| echo: false
choices = [
"``dy/dt = 2x + 1/x``",
"``dy/dt = 1 - x^2 - \\log(x)``",
"``dy/dt = -2x - 1/x``",
"``dy/dt = 1``"
]
answ=1
radioq(choices, answ)
```

View File

@@ -0,0 +1,253 @@
# Symbolic derivatives
```{julia}
#| echo: false
import Logging
Logging.disable_logging(Logging.Info) # or e.g. Logging.Info
Logging.disable_logging(Logging.Warn)
import SymPy
function Base.show(io::IO, ::MIME"text/html", x::T) where {T <: SymPy.SymbolicObject}
println(io, "<span class=\"math-left-align\" style=\"padding-left: 4px; width:0; float:left;\"> ")
println(io, "\\[")
println(io, sympy.latex(x))
println(io, "\\]")
println(io, "</span>")
end
# hack to work around issue
import Markdown
import CalculusWithJulia
function CalculusWithJulia.WeaveSupport.ImageFile(d::Symbol, f::AbstractString, caption; kwargs...)
nm = joinpath("..", string(d), f)
u = "![$caption]($nm)"
Markdown.parse(u)
end
nothing
```
This section uses this add-on package:
```{julia}
using TermInterface
```
```{julia}
#| echo: false
const frontmatter = (
title = "Symbolic derivatives",
description = "Calculus with Julia: Symbolic derivatives",
tags = ["CalculusWithJulia", "derivatives", "symbolic derivatives"],
);
```
---
The ability to break down an expression into operations and their arguments is necessary when trying to apply the differentiation rules. Such rules are applied from the outside in. Identifying the proper "outside" function is usually most of the battle when finding derivatives.
In the following example, we provide a sketch of a framework to differentiate expressions by a chosen symbol to illustrate how the outer function drives the task of differentiation.
The `Symbolics` package provides native symbolic manipulation abilities for `Julia`, similar to `SymPy`, though without the dependence on `Python`. The `TermInterface` package, used by `Symbolics`, provides a generic interface for expression manipulation for this package that *also* is implemented for `Julia`'s expressions and symbols.
An expression is an unevaluated portion of code that for our purposes below contains other expressions, symbols, and numeric literals. They are held in the `Expr` type. A symbol, such as `:x`, is distinct from a string (e.g. `"x"`) and is useful to the programmer to distinguish the contents a variable points to from the name of the variable. Symbols are fundamental to metaprogramming in `Julia`. An expression is a specification of some set of statements to execute. A numeric literal is just a number.
The three main functions from `TermInterface` we leverage are `istree`, `operation`, and `arguments`. The `operation` function returns the "outside" function of an expression. For example:
```{julia}
operation(:(sin(x)))
```
We see the `sin` function, referred to by a symbol (`:sin`). The `:(...)` above *quotes* the argument, and does not evaluate it, hence `x` need not be defined above. (The `:` notation is used to create both symbols and expressions.)
The arguments are the terms that the outside function is called on. For our purposes there may be $1$ (*unary*), $2$ (*binary*), or more than $2$ (*nary*) arguments. (We ignore zero-argument functions.) For example:
```{julia}
arguments(:(-x)), arguments(:(pi^2)), arguments(:(1 + x + x^2))
```
(The last one may be surprising, but all three arguments are passed to the `+` function.)
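For `Julia`'s own `Expr` type, the same information is visible directly in the `head` and `args` fields. The sketch below uses only base `Julia` and assumes that, for call expressions, `operation` and `arguments` correspond to the first entry of `args` and the remaining entries, respectively:

```julia
ex = :(1 + x + x^2)

ex.head          # :call, since the expression is a function call
ex.args[1]       # :+, the "outside" function
ex.args[2:end]   # the three arguments: 1, :x, and :(x^2)
```

Working through `TermInterface` rather than these fields keeps the code generic over other expression types.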
Here we define a function to decide the *arity* of an expression based on the number of arguments it is called with:
```{julia}
function arity(ex)
n = length(arguments(ex))
n == 1 ? Val(:unary) :
n == 2 ? Val(:binary) : Val(:nary)
end
```
Differentiation must distinguish between expressions, variables, and numbers. Mathematically expressions have an "outer" function, whereas variables and numbers can be directly differentiated. The `istree` function in `TermInterface` returns `true` when passed an expression, and `false` when passed a symbol or numeric literal. The latter two may be distinguished by `isa(..., Symbol)`.
Here we create a function, `D`, that *dispatches* to a specific method of `D` based on the outer operation and arity when it encounters an expression; otherwise, when it encounters a symbol or a numeric literal, it does the differentiation directly:
```{julia}
function D(ex, var=:x)
if istree(ex)
op, args = operation(ex), arguments(ex)
D(Val(op), arity(ex), args, var)
elseif isa(ex, Symbol) && ex == var   # compare with the chosen variable, not a hard-coded :x
1
else
0
end
end
```
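The dispatch on `Val(op)` may be unfamiliar. Wrapping a symbol in `Val` lifts it into the type system, so that methods can be selected by the symbol's value. A minimal sketch, where `rule_name` is a hypothetical function used only for illustration:

```julia
# `rule_name` is a hypothetical function used only to illustrate dispatch on `Val` types
rule_name(::Val{:+}) = "sum rule"
rule_name(::Val{:*}) = "product rule"
rule_name(::Val) = "no rule defined"     # fallback for any other symbol

op = :+
rule_name(Val(op))       # selects the method for Val{:+}
rule_name(Val(:sin))     # falls through to the generic method
```

The methods of `D` follow the same pattern, with one method per operation and arity.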
Now to develop methods for `D` for different "outside" functions and arities.
Addition can be unary (`:(+x)` is a valid quoting, even if it might simplify to the symbol `:x` when evaluated), *binary*, or *nary*. Here we implement the *sum rule*:
```{julia}
D(::Val{:+}, ::Val{:unary}, args, var) = D(first(args), var)
function D(::Val{:+}, ::Val{:binary}, args, var)
a′, b′ = D.(args, var)
:($a′ + $b′)
end
function D(::Val{:+}, ::Val{:nary}, args, var)
args′ = D.(args, var)
:(+($(args′...)))
end
```
The `args` are always held in a container, so the unary method must pull out the first one. The binary case should read as: apply `D` to each of the two arguments, and then create a quoted expression containing the sum of the results. The dollar signs interpolate into the quoting. (The "primes" are unicode notation achieved through `\prime[tab]` and not operations.) The *nary* case does something similar, only uses splatting to produce the sum.
Subtraction must also be implemented in a similar manner, but not for the *nary* case:
```{julia}
function D(::Val{:-}, ::Val{:unary}, args, var)
a′ = D(first(args), var)
:(-$a′)
end
function D(::Val{:-}, ::Val{:binary}, args, var)
a′, b′ = D.(args, var)
:($a′ - $b′)
end
```
The *product rule* is similar to addition, in that $3$ cases are considered:
```{julia}
D(op::Val{:*}, ::Val{:unary}, args, var) = D(first(args), var)
function D(::Val{:*}, ::Val{:binary}, args, var)
a, b = args
a′, b′ = D.(args, var)
:($a′ * $b + $a * $b′)   # product rule: a′b + ab′
end
function D(op::Val{:*}, ::Val{:nary}, args, var)
a, bs... = args
b = :(*($(bs...)))       # product of the remaining factors
a′ = D(a, var)
b′ = D(b, var)
:($a′ * $b + $a * $b′)
end
```
The *nary* case above just peels off the first factor and then uses the binary product rule.
Division is only a binary operation, so here we have the *quotient rule*:
```{julia}
function D(::Val{:/}, ::Val{:binary}, args, var)
u, v = args
u′, v′ = D(u, var), D(v, var)
:( ($u′*$v - $u*$v′)/$v^2 )   # quotient rule: (u′v - uv′)/v^2
end
```
Powers are handled a bit differently. The power rule would require checking that the exponent does not contain the variable of differentiation; exponential derivatives would require checking that the base does not contain the variable of differentiation. Trying to implement both would be tedious, so we use the fact that $x = \exp(\log(x))$, so that $a^b = \exp(b \log(a))$ (for $a$ in the domain of `log`; more care is necessary if $a$ is negative), to differentiate:
```{julia}
function D(::Val{:^}, ::Val{:binary}, args, var)
a, b = args
D(:(exp($b*log($a))), var) # a > 0 assumed here
end
```
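Written out, and assuming $a > 0$, this rewriting combines the chain rule with the product rule:

$$
\frac{d}{dx} a^b = \frac{d}{dx} \exp(b \log(a)) = \exp(b \log(a)) \cdot \left(b' \log(a) + b \cdot \frac{a'}{a}\right).
$$

When $b$ is a constant, $b' = 0$ and this reduces to the power rule, $b a^{b-1} a'$. When $a$ is a constant, $a' = 0$ and it reduces to the exponential rule, $a^b \log(a) b'$.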
That leaves the task of defining a rule to differentiate both `exp` and `log`. We do so with *unary* definitions. In the following we also implement `sin` and `cos` rules:
```{julia}
function D(::Val{:exp}, ::Val{:unary}, args, var)
a = first(args)
a′ = D(a, var)
:(exp($a) * $a′)
end
function D(::Val{:log}, ::Val{:unary}, args, var)
a = first(args)
a′ = D(a, var)
:(1/$a * $a′)
end
function D(::Val{:sin}, ::Val{:unary}, args, var)
a = first(args)
a′ = D(a, var)
:(cos($a) * $a′)
end
function D(::Val{:cos}, ::Val{:unary}, args, var)
a = first(args)
a′ = D(a, var)
:(-sin($a) * $a′)
end
```
The pattern is similar for each. The `$a′` factor is needed due to the *chain rule*. The above illustrates the simple pattern necessary to add a derivative rule for a function. More could be added, but for this example the above will suffice, as now the system is ready to be put to work.
```{julia}
ex₁ = :(x + 2/x)
D(ex₁, :x)
```
The output does not simplify, so some work is needed to identify `1 - 2/x^2` as the answer.
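One way to gain confidence without simplifying is to evaluate both expressions at a few points. The sketch below uses only base `Julia`. The helpers `OPS` and `evalat` are hypothetical, and the unsimplified expression is written out by hand under the assumption that the sum rule is applied on the outside and the quotient rule to `2/x`:

```julia
# A tiny evaluator for quoted arithmetic expressions in the variable `x`.
# `OPS` and `evalat` are hypothetical helpers, not part of TermInterface.
OPS = Dict(:+ => +, :- => -, :* => *, :/ => /, :^ => ^)

evalat(ex::Number, x) = ex
evalat(ex::Symbol, x) = ex == :x ? x : error("unexpected symbol $ex")
function evalat(ex::Expr, x)
    f = OPS[ex.args[1]]                             # the "outside" function
    f(map(a -> evalat(a, x), ex.args[2:end])...)    # apply it to the evaluated arguments
end

# The kind of unsimplified output expected for x + 2/x, written out by hand:
# sum rule on the outside, quotient rule (u'v - uv')/v^2 for 2/x.
unsimplified = :(1 + (0 * x - 2 * 1) / x^2)
simplified = :(1 - 2 / x^2)

pts = [0.5, 1.0, 2.0, 3.0]
agree = all(evalat(unsimplified, p) ≈ evalat(simplified, p) for p in pts)
```

Agreement at sample points is not a proof of equality, but it is a cheap check that no rule was misapplied.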
```{julia}
ex₂ = :( (x + sin(x))/sin(x))
D(ex₂, :x)
```
Again, simplification is not performed.
Finally, we have a second derivative taken below:
```{julia}
ex₃ = :(sin(x) - x - x^3/6)
D(D(ex₃, :x), :x)
```
The length of the expression should lead to further appreciation for simplification steps taken when doing such a computation by hand.

File diff suppressed because it is too large