It was remarked that these relationships hold: $D(S(f))(k) = f(k)$ and
$S(D(f))(k) = f(k) - f(0)$; these are a consequence of the inverse
relationship between addition and subtraction. These two
relationships are examples of a more general pair of relationships
known as the
[Fundamental theorem of calculus](http://en.wikipedia.org/wiki/Fundamental_theorem_of_calculus) or FTC.
We will see that with suitable rewriting, the derivative of a function is related to a certain limit of `D(f)` and the definite integral of a function is related to a certain limit of `S(f)`. The addition and subtraction rules encapsulated in the relations of $D(S(f))(k) = f(k)$ and $S(D(f))(k) = f(k) - f(0)$ then generalize to these calculus counterparts.
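As a refresher, the discrete relationships are easy to verify numerically. Here is a minimal sketch, assuming one plausible convention for `D` and `S` (a forward difference and a cumulative sum; the prior section's exact definitions may differ):

```julia
# One convention for the operators (an assumption here):
# D is a forward difference; S(f)(k) accumulates f(0) + ⋯ + f(k-1).
D(f) = k -> f(k + 1) - f(k)
S(f) = k -> sum(f, 0:(k-1); init = 0)

f(k) = k^2 + 1
D(S(f))(4) == f(4)           # true
S(D(f))(4) == f(4) - f(0)    # true
```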
The FTC details the interconnectivity between the operations of integration and
differentiation.
For example:
> What is the definite integral of the derivative?
That is, what is $A = \int_a^b f'(x) dx$? (Assume $f'$ is continuous.)
To investigate, we begin with the right Riemann sum using $h = (b-a)/n$:
```math
A \approx S_n = \sum_{i=1}^n f'(a + ih) \cdot h.
```
But the mean value theorem says that for small $h$ we have $f'(x) \approx (f(x) - f(x-h))/h$. Using this approximation with $x=a+ih$ gives:
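```math
S_n \approx \sum_{i=1}^n \frac{f(a+ih) - f(a+(i-1)h)}{h} \cdot h
= \sum_{i=1}^n \left(f(a+ih) - f(a+(i-1)h)\right)
= f(a + nh) - f(a) = f(b) - f(a).
```

The sum *telescopes*: each term cancels with its neighbor, leaving only $f(b)$ and $-f(a)$. Taking a limit as $n$ goes to $\infty$ suggests one half of the FTC: $\int_a^b f'(x) dx = f(b) - f(a)$.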
The indefinite integral is *linear*. The sum rule, $\int (f(x) + g(x))\, dx = \int f(x)\, dx + \int g(x)\, dx$, follows immediately: if $F(x)$ and $G(x)$ are antiderivatives of $f(x)$ and $g(x)$, then $[F(x) + G(x)]' = f(x) + g(x)$, so the right-hand side has derivative $f(x) + g(x)$. The constant-multiple rule, $\int cf(x)\, dx = c\int f(x)\, dx$, holds for the same reason. In fact, this more general form, where $c$ and $d$ are constants, covers both cases:
```math
\int (cf(x) + dg(x)) dx = c \int f(x) dx + d \int g(x) dx.
```
This statement is nothing more than the derivative formula
$[cf(x) + dg(x)]' = cf'(x) + dg'(x)$. The product rule gives rise to a
technique called *integration by parts* and the chain rule gives rise
to a technique of *integration by substitution*, but we defer those
discussions to other sections.
##### Examples
- The antiderivative of the polynomial $p(x) = a_n x^n + \cdots + a_1 x + a_0$ follows from the linearity of the integral and the general power rule, as shown in the first formula below.
- More generally, a [Laurent](https://en.wikipedia.org/wiki/Laurent_polynomial) polynomial allows for terms with negative powers. These too can be handled by the above, as in the second formula below.
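For the polynomial, integrating term by term gives:

```math
\int p(x) dx = \frac{a_n}{n+1}x^{n+1} + \cdots + \frac{a_1}{2}x^2 + a_0 x + C.
```

For a Laurent polynomial, the power rule still applies to negative powers other than $-1$; a small example (chosen here for illustration):

```math
\int \left(x^2 + \frac{1}{x^2}\right) dx = \frac{x^3}{3} - \frac{1}{x} + C.
```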
The relationship that $[\int_a^x f(u) du]' = f(x)$ is a bit harder to appreciate, as it doesn't help answer many ready-made questions. Here we give some examples of its use.
First, the expression defining an antiderivative, or indefinite integral, is given in terms of a definite integral:
```math
F(x) = \int_a^x f(u) du.
```
The value of $a$ does not matter, as long as the integral is defined.
```julia; hold=true; echo=false; cache=true
##{{{ftc_graph}}}
# Note: the shading and annotation details below are assumed, reconstructing
# a partially missing figure-making function.
function make_ftc_graph(n)
    a, b = 2, 3
    ts = range(0, stop=b, length=50)
    xs = range(a, stop=b, length=8)
    g(x) = x
    G(x) = x^2/2

    xn, xn1 = xs[n:(n+1)]
    xbar = (xn + xn1)/2
    rxs = collect(range(xn, stop=xn1, length=2))
    rys = map(g, rxs)

    # fig_size is defined in the document's setup
    p = plot(g, 0, b, legend=false, size=fig_size, xlim=(0,3.25), ylim=(0,5))
    plot!(p, [xn, rxs..., xn1], [0, rys..., 0], fill=(0, :orange))  # the strip of area A
    plot!(p, ts, G.(ts), color=:red)                                # the accumulated area F
    annotate!(p, [(xbar, g(xbar)/2, "A"), (xn1, G(xn1) + 1/4, "F")])

    p
end

caption = L"""

Illustration showing $F(x) = \int_a^x f(u) du$ is a function that
accumulates area. The value of $A$ is the area over $[x_{n-1}, x_n]$
and also the difference $F(x_n) - F(x_{n-1})$.

"""

n = 7
anim = @animate for i = 1:n
    make_ftc_graph(i)
end

imgfile = tempname() * ".gif"
gif(anim, imgfile, fps = 1)

ImageFile(imgfile, caption)
```
The picture for this, for non-negative $f$, is of accumulating area as
$x$ increases. It can be used to give insight into some formulas:
For any function, we know that $F(b) - F(c) + F(c) - F(a) = F(b) - F(a)$. For this specific function, this translates into this property of the integral:
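```math
\int_a^b f(u) du = \int_a^c f(u) du + \int_c^b f(u) du.
```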
Second, let $F(x) = \int_{a_0}^x f(u) du$ and $G(x) = \int_{a_1}^x f(u) du$ be defined with different starting points. Then $F(x) = G(x) + \int_{a_0}^{a_1} f(u) du$. The additional part may
look complicated, but the point is that, as far as $x$ is involved, it
is a constant. Hence both $F$ and $G$ are antiderivatives if either
one is.
##### Example
From the familiar formula rate $\times$ time $=$ distance, we "know,"
for example, that a car traveling 60 miles an hour for one hour will
have traveled 60 miles. This allows us to translate statements about
the speed (or more generally velocity) into statements about position
at a given time. If the speed is not constant, we don't have such an
easy conversion.
Suppose our velocity at time $t$ is $v(t)$, and always positive. We
want to find the position at time $t$, $x(t)$. Let's assume $x(0) =
0$. Let $h$ be some small time step, say $h=(t - 0)/n$ for some large
$n>0$. Then we can *approximate* $v(t)$ on the interval
$[ih, (i+1)h)$ by the constant $v(ih)$. The change in position over the time interval $[ih, (i+1)h)$ would then be approximately $v(ih) \cdot h$. Ignoring the accumulated errors, the approximate position at time $t$ is found by adding these pieces together: $x(t) \approx v(0h)\cdot h + v(1h)\cdot h + v(2h) \cdot h + \cdots + v((n-1)h)\cdot h$. But we recognize this (as did [Beeckman](http://www.math.harvard.edu/~knill/teaching/math1a_2011/exhibits/bressoud/)
in 1618) as nothing more than a Riemann sum for
$v$ over the interval $[0, t]$. That is, we expect:
```math
x(t) = \int_0^t v(u) du.
```
Hopefully this makes sense: our position is the result of accumulating
our change in position over small units of time. The old
one-foot-in-front-of-another approach to walking out the door.
The above was simplified by the assumption that $x(0) = 0$. What if $x(0) = x_0$ for some non-zero value? Then the above is not exactly correct, as $\int_0^0 v(u) du = 0$. So instead, we write this more generally as:
```math
x(t) = x_0 + \int_0^t v(u) du.
```
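This relationship can be checked numerically with a Riemann sum. A minimal sketch, using a made-up, always-positive velocity (the particular `v`, `x0`, and `t` below are illustrative assumptions):

```julia
v(t) = sin(t) + 2                # hypothetical velocity; always positive
x0, t, n = 1.0, 2.0, 10_000      # starting position, time of interest, number of steps
h = t/n
approx = x0 + sum(v(i*h) * h for i in 1:n)    # x0 plus a right Riemann sum for ∫₀ᵗ v(u) du
exact  = x0 + ((-cos(t) + 2t) - (-cos(0)))    # an antiderivative of v(u) is -cos(u) + 2u
approx - exact                                # the difference shrinks as n grows
```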
There is a similar relationship between velocity and acceleration, but let's think about it formally. If we know that the acceleration is the rate of change of velocity, then we have $a(t) = v'(t)$. By the FTC, then
```math
\int_0^t a(u) du = \int_0^t v'(u) du = v(t) - v(0).
```
Rewriting gives a similar statement as before:
```math
v(t) = v_0 + \int_0^t a(u) du.
```
##### Example
In probability theory, for a positive, continuous random variable, the
probability that the random value is less than $a$ is given by $P(X
\leq a) = F(a) = \int_{0}^a f(x) dx$. (Positive means the integral
starts at $0$, whereas in general it could be $-\infty$, a minor complication that
we haven't yet discussed.)
For example, the exponential distribution with rate $1$ has $f(x) = e^{-x}$. Compute $F(x)$.
This is just $F(x) = \int_0^x e^{-u} du = -e^{-u}\big|_0^x = 1 - e^{-x}$.
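This antiderivative can be verified symbolically; a quick sketch, assuming the `SymPy` package is available:

```julia
using SymPy
@syms x u
integrate(exp(-u), (u, 0, x))    # returns 1 - exp(-x), as above
```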
The "uniform" distribution on $[a,b]$ has
```math
F(x) =
\begin{cases}
0 & x < a\\
\frac{x-a}{b-a} & a \leq x \leq b\\
1 & x > b
\end{cases}
```
Find $f(x)$. There are some subtleties here. If we assume that $F(x) = \int_0^x f(u) du$, then we know that $F'(x) = f(x)$ wherever $f(x)$ is continuous. Differentiating, we get
```math
f(x) = \begin{cases}
0 & x < a\\
\frac{1}{b-a} & a < x < b\\
0 & x > b
\end{cases}
```
However, the function $f$ is *not* continuous at $x=a$ and $x=b$, and
$F(x)$ is not differentiable at those two points. It is true that $f$ is integrable, and
where $F$ is differentiable, $F'=f$. So $f$ is determined except
possibly at the points $x=a$ and $x=b$.
##### Example
The error function is defined by $\text{erf}(x) = \frac{2}{\sqrt{\pi}}\int_0^x e^{-u^2}
du$. It is implemented in `Julia` through `erf` (found in the `SpecialFunctions` package). Suppose we were to
ask where it takes on its maximum value; what would we find?
The answer will either be at a critical point, at $0$ or as $x$ goes to $\infty$. We can differentiate to find critical points:
```math
[\text{erf}(x)]' = \frac{2}{\sqrt{\pi}}e^{-x^2}.
```
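As a check, this derivative can be computed symbolically; a sketch, assuming the `SymPy` package:

```julia
using SymPy
@syms x
diff(sympy.erf(x), x)    # 2*exp(-x^2)/sqrt(pi)
```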
Oh, this is never $0$, so there are no critical points. The maximum occurs at $0$ or as $x$ goes to $\infty$. Clearly at $0$, we have $\text{erf}(0)=0$, so the answer will be as $x$ goes to $\infty$.
In retrospect, this is a silly question. As $f(x) > 0$ for all $x$, we
*must* have that $F(x)$ is strictly increasing, so it never attains a
local maximum.
##### Example
The [Dawson](http://en.wikipedia.org/wiki/Dawson_function) function is
```math
F(x) = e^{-x^2} \int_0^x e^{t^2} dt
```
Characterize any local maxima or minima.
For this we need to consider the product rule. The fundamental theorem of calculus will help with the right-hand side. We have:
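```math
F'(x) = -2x e^{-x^2} \int_0^x e^{t^2} dt + e^{-x^2} \cdot e^{x^2} = 1 - 2x F(x).
```

Setting $F'(x) = 0$ means solving $2xF(x) = 1$, which must be done numerically. A sketch of that computation, assuming the `QuadGK` and `Roots` packages:

```julia
using QuadGK, Roots
F(x)   = exp(-x^2) * quadgk(t -> exp(t^2), 0, x)[1]   # the Dawson function
Fp(x)  = 1 - 2x * F(x)                                # product rule plus FTC
cps    = find_zeros(Fp, -5, 5)                        # ≈ [-0.924139, 0.924139]
Fpp(x) = -2F(x) - 2x * Fp(x)                          # differentiate F′ once more
Fpp.(cps)                                             # second-derivative test values
```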
The first value being positive says there is a relative minimum at $-0.924139$; the second being negative says there is a relative maximum at $0.924139$.
##### Example
Returning to probability, suppose there are ``n`` positive random numbers ``X_1``, ``X_2``, ..., ``X_n``. A natural question might be to ask what formula describes the largest of these values, assuming each is identically distributed. A helpful description is to define ``F(a) = P(X \leq a)`` for some random number ``X``; that is, the probability that ``X`` is less than or equal to ``a`` is ``F(a)``. For many situations, there is a *density* function, ``f``, for which ``F(a) = \int_0^a f(x) dx``.
Under the assumptions that the ``X`` are identical and independent, the largest value, ``M``, may be
characterized by ``P(M \leq a) = \left[F(a)\right]^n``. Using ``f`` and ``F``, describe the derivative of this expression.
This problem is constructed to take advantage of the FTC, and we have:
```math
\begin{align*}
\left[P(M \leq a)\right]'
&= \left[F(a)^n\right]'\\
&= n \cdot F(a)^{n-1} \left[F(a)\right]'\\
&= n F(a)^{n-1}f(a)
\end{align*}
```
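This computation can be mirrored symbolically, treating ``F`` as a generic function; a sketch, assuming the `SymPy` package:

```julia
using SymPy
@syms a n F()        # F() declares a symbolic function
diff(F(a)^n, a)      # n*F(a)^(n - 1)*Derivative(F(a), a)
```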
##### Example
Suppose again that probabilities of a random number between ``0`` and ``1``, say, are given by a positive, continuous function ``f(x)`` on ``(0,1)`` through ``F(a) = P(X \leq a) = \int_0^a f(x) dx``. The median value of the random number is a value of ``a`` for which ``P(X \leq a) = 1/2``. Such an ``a`` makes ``X`` a coin toss: betting that ``X`` is less than ``a`` is like betting on heads to come up. More generally, the ``q``th quantile of ``X`` is a number ``a`` with ``P(X \leq a) = q``. The definition is fine, but for a given ``f`` and ``q``, can we find ``a``?
Abstractly, we are solving ``F(a) = q``, or ``F(a)-q = 0``, for ``a``. That is, this is a zero-finding question. We have discussed different options for this problem: bisection, a range of derivative-free methods, and Newton's method. As evaluating ``F`` involves an integral, which may involve many evaluations of ``f``, a method which converges quickly is preferred. Newton's method is a good choice, as it has quadratic convergence in this case: ``a`` is a simple zero, since ``F`` is increasing under the assumptions above.
Newton's method involves the update step `x = x - f(x)/f'(x)`. For this problem, the "``f``" is ``h(x) = \int_0^x f(u) du - q``. The derivative is easy; the FTC applies directly: ``h'(x) = f(x)``. There is no need for automatic differentiation, which may not even apply to this setup.
To do a concrete example, we take the [Beta](https://en.wikipedia.org/wiki/Beta_distribution)(``\alpha, \beta``) distribution (``\alpha, \beta > 0``) which has density, ``f``, over ``[0,1]`` given by
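```math
f(x; \alpha, \beta) = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)} \cdot x^{\alpha - 1} \cdot (1-x)^{\beta - 1},
```

where ``\Gamma`` is the [gamma function](https://en.wikipedia.org/wiki/Gamma_function).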
The Wikipedia link above gives an approximate answer for the median of ``(\alpha-1/3)/(\alpha+\beta-2/3)`` when ``\alpha,\beta > 1``. Let's see how correct this is when ``\alpha=5`` and ``\beta=6``. The `gamma` function used below implements ``\Gamma``. It is in the `SpecialFunctions` package, which is loaded with the `CalculusWithJulia` package.
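A sketch of this computation, assuming the `QuadGK` package for the integral (the number of Newton steps, `5`, is an arbitrary choice):

```julia
using QuadGK, SpecialFunctions
α, β, q = 5, 6, 1/2
f(x) = gamma(α + β)/(gamma(α) * gamma(β)) * x^(α - 1) * (1 - x)^(β - 1)
h(x) = quadgk(f, 0, x)[1] - q        # we solve h(a) = 0 for the median

function newton(h, hp, x, nsteps = 5)
    for _ in 1:nsteps
        x -= h(x)/hp(x)              # Newton update; hp = f by the FTC
    end
    x
end

a0 = (α - 1/3)/(α + β - 2/3)         # Wikipedia's approximation, used as the initial guess
a = newton(h, f, a0)
a - a0                               # how far the approximation is from the computed median
```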
The trick using a closure relies on an internal way of accessing elements in a closure. The same trick could be implemented in many different ways that don't rely on undocumented internals; this approach was just a tad more convenient. It shouldn't be copied for work intended for distribution, as the internals may change without notice or deprecation.
Suppose $f(x) \geq 0$ and $F(x) = \int_0^x f(u) du$. $F(x)$ is continuous and so has a maximum value on the interval $[0,1]$, taken at some $c$ in $[0,1]$. What is $c$?
Let $F(x) = \int_0^x f(u) du$, where $f(x)$ is given by the graph below. Identify the $x$ values of all *relative maxima* of $F(x)$. Explain why you know these are the values.
Suppose $f(x)$ is monotonically decreasing with $f(0)=1$, $f(1/2) = 0$ and $f(1) = -1$. Let $F(x) = \int_0^x f(u) du$. $F(x)$ is continuous and so has a maximum value on the interval $[0,1]$, taken at some $c$ in $[0,1]$. What is $c$?
Barrow presented a version of the fundamental theorem of calculus in a
1670 volume edited by Newton, Barrow's student
(cf. [Wagner](http://www.maa.org/sites/default/files/0746834234133.di020795.02p0640b.pdf)). His version can be stated as follows (cf. [Jardine](http://www.maa.org/publications/ebooks/mathematical-time-capsules)):
Consider the following figure, where $f$ is a strictly increasing
function with $f(0) = 0$ and $x > 0$. The function $A(x) = \int_0^x
f(u) du$ is also plotted. The point $Q$ is $f(x)$, and the point $P$
is $A(x)$. The point $T$ is chosen so that the length between $T$
and $x$ times the length between $Q$ and $x$ equals the length from
$P$ to $x$. Barrow's result is that the line segment $TP$ is tangent to the graph of $A(x)$ at $P$. As this tangent's slope is the length from $P$ to $x$ divided by the length from $T$ to $x$, which by the construction is the length from $Q$ to $x$, that is $f(x)$, the figure geometrically expresses $A'(x) = f(x)$.
According to [Bressoud](http://www.math.harvard.edu/~knill/teaching/math1a_2011/exhibits/bressoud/), "Newton observes that the rate of change of an accumulated quantity is the rate at which that quantity is accumulating". Which part of the FTC does this refer to?
Finding the value of a definite integral through the fundamental theorem of calculus relies on the algebraic identification of an antiderivative. This can be difficult to do by hand or by computer, and is complicated by the fact that not every [elementary](https://en.wikipedia.org/wiki/Elementary_function) function has an elementary antiderivative.
`SymPy`'s documentation on integration indicates that several different means to integrate a function are used internally. As it is of interest here, it is copied with just minor edits below (from an older version of SymPy):