diff --git a/CwJ/ODEs/Project.toml b/CwJ/ODEs/Project.toml deleted file mode 100644 index d455e5c..0000000 --- a/CwJ/ODEs/Project.toml +++ /dev/null @@ -1,10 +0,0 @@ -[deps] -DiffEqBase = "2b5f629d-d688-5b77-993f-72d75c75574e" -DifferentialEquations = "0c46a032-eb83-5123-abaf-570d42b7fbaa" -ForwardDiff = "f6369f11-7733-5829-9624-2563aa707210" -MonteCarloMeasurements = "0987c9cc-fe09-11e8-30f0-b96dd679fdca" -NLsolve = "2774e3e8-f4cf-5e23-947b-6d7e65073b56" -Plots = "91a5bcdd-55d7-5caf-9e0b-520d859cae80" -QuadGK = "1fd47b50-473d-5c70-9696-f719f8f3bcdc" -Roots = "f2b01f46-fcfa-551c-844a-d8ac1e96c665" -SymPy = "24249f21-da20-56a4-8eb1-6a02cf4ae2e6" diff --git a/CwJ/ODEs/cache/euler.cache b/CwJ/ODEs/cache/euler.cache deleted file mode 100644 index 45f90f7..0000000 Binary files a/CwJ/ODEs/cache/euler.cache and /dev/null differ diff --git a/CwJ/ODEs/cache/odes.cache b/CwJ/ODEs/cache/odes.cache deleted file mode 100644 index 43037df..0000000 Binary files a/CwJ/ODEs/cache/odes.cache and /dev/null differ diff --git a/CwJ/ODEs/differential_equations.jmd b/CwJ/ODEs/differential_equations.jmd deleted file mode 100644 index 38987c7..0000000 --- a/CwJ/ODEs/differential_equations.jmd +++ /dev/null @@ -1,376 +0,0 @@ -# The `DifferentialEquations` suite - -This section uses these add-on packages: - -```julia -using OrdinaryDiffEq -using Plots -using ModelingToolkit -``` - -```julia; echo=false; results="hidden" -using CalculusWithJulia.WeaveSupport - -const frontmatter = ( - title = "The `DifferentialEquations` suite", - description = "Calculus with Julia: The `DifferentialEquations` suite", - tags = ["CalculusWithJulia", "odes", "the `differentialequations` suite"], -); -fig_size = (800, 600) -nothing -``` - ----- - -The -[`DifferentialEquations`](https://github.com/SciML/DifferentialEquations.jl) -suite of packages contains solvers for a wide range of various -differential equations. 
This section just briefly touches on
-ordinary differential equations (ODEs), and so relies only on the
-`OrdinaryDiffEq` part of the suite. For more detail on this type and
-the many others covered by the suite of packages, there are many other
-resources, including the
-[documentation](https://diffeq.sciml.ai/stable/) and accompanying
-[tutorials](https://github.com/SciML/SciMLTutorials.jl).
-
-## SIR Model
-
-We follow along with an introduction to the SIR model for the spread of disease by [Smith and Moore](https://www.maa.org/press/periodicals/loci/joma/the-sir-model-for-spread-of-disease-introduction). This model received a workout due to the COVID-19 pandemic.
-
-The basic model breaks a population into three cohorts: the **susceptible** individuals, the **infected** individuals, and the **recovered** individuals. These add to the population size, ``N``, which is fixed, but the cohort sizes vary in time. We name these cohort sizes ``S(t)``, ``I(t)``, and ``R(t)`` and define ``s(t)=S(t)/N``, ``i(t) = I(t)/N``, and ``r(t) = R(t)/N`` to be the respective proportions.
-
-The following *assumptions* are made about these cohorts by Smith and Moore:
-
-> No one is added to the susceptible group, since we are ignoring births and immigration. The only way an individual leaves the susceptible group is by becoming infected.
-
-
-This implies the rate of change in time of ``S(t)`` depends on the current number of susceptibles and the amount of interaction with the infected cohort. The model *assumes* each infected person has ``b`` contacts per day that are sufficient to spread the disease. Not all contacts will be with susceptible people, but if people are assumed to mix within the cohorts, then there will be on average ``b \cdot S(t)/N`` contacts with susceptible people per infected person.
As each infected person is modeled identically, the time rate of change of ``S(t)`` is:
-
-```math
-\frac{dS}{dt} = - b \cdot \frac{S(t)}{N} \cdot I(t) = -b \cdot s(t) \cdot I(t)
-```
-
-It is negative, as no one is added, only taken off. After dividing by
-``N``, this can also be expressed as ``s'(t) = -b s(t) i(t)``.
-
-> assume that a fixed fraction ``k`` of the infected group will recover during any given day.
-
-This means the change in time of the recovered depends on ``k`` and the number infected, giving rise to the equation
-
-```math
-\frac{dR}{dt} = k \cdot I(t)
-```
-
-which can also be expressed in proportions as ``r'(t) = k \cdot i(t)``.
-
-Finally, from ``S(t) + I(t) + R(t) = N`` we have ``S'(t) + I'(t) + R'(t) = 0`` or ``s'(t) + i'(t) + r'(t) = 0``.
-
-
-Combining, it is possible to express the rate of change of the infected population through:
-
-```math
-\frac{di}{dt} = b \cdot s(t) \cdot i(t) - k \cdot i(t)
-```
-
-The authors apply this model to flu statistics from Hong Kong, where:
-
-```math
-\begin{align*}
-S(0) &= 7,900,000\\
-I(0) &= 10\\
-R(0) &= 0\\
-\end{align*}
-```
-
-In `Julia` we define these, `N` to model the total population, and `u0` to be the proportions.
-
-```julia
-S0, I0, R0 = 7_900_000, 10, 0
-N = S0 + I0 + R0
-u0 = [S0, I0, R0]/N # initial proportions
-```
-
-An *estimated* set of values for ``k`` and ``b`` are ``k=1/3``, coming from the average period of infectiousness being estimated at three days, and ``b=1/2``, which seems low in normal times, but not for an infected person who may be feeling quite ill and staying at home. (The model for COVID would certainly have a larger ``b`` value.)
-
-Okay, the mathematical modeling is done; now we try to solve for the unknown functions using `DifferentialEquations`.
-
-To warm up, if ``b=0`` then ``i'(t) = -k \cdot i(t)`` describes the infected. (There is no circulation of people in this case.)
The solution would be achieved through: - -```julia; hold=true -k = 1/3 - -f(u,p,t) = -k * u # solving u′(t) = - k u(t) -time_span = (0.0, 20.0) - -prob = ODEProblem(f, I0/N, time_span) -sol = solve(prob, Tsit5(), reltol=1e-8, abstol=1e-8) - -plot(sol) -``` - -The `sol` object is a set of numbers with a convenient `plot` method. As may have been expected, this graph shows exponential decay. - - -A few comments are in order. The problem we want to solve is - -```math -\frac{di}{dt} = -k \cdot i(t) = F(i(t), k, t) -``` - -where ``F`` depends on the current value (``i``), a parameter (``k``), and the time (``t``). We did not utilize ``p`` above for the parameter, as it was easy not to, but could have, and will in the following. The time variable ``t`` does not appear by itself in our equation, so only `f(u, p, t) = -k * u` was used, `u` the generic name for a solution which in this case is ``i``. - -The problem we set up needs an initial value (the ``u0``) and a time span to solve over. Here we want time to model real time, so use floating point values. - -The plot shows steady decay, as there is no mixing of infected with others. - -Adding in the interaction requires a bit more work. We now have what is known as a *system* of equations: - -```math -\begin{align*} -\frac{ds}{dt} &= -b \cdot s(t) \cdot i(t)\\ -\frac{di}{dt} &= b \cdot s(t) \cdot i(t) - k \cdot i(t)\\ -\frac{dr}{dt} &= k \cdot i(t)\\ -\end{align*} -``` - -Systems of equations can be solved in a similar manner as a single ordinary differential equation, though adjustments are made to accommodate the multiple functions. - -We use a style that updates values in place, and note that `u` now holds ``3`` different functions at once: - -```julia -function sir!(du, u, p, t) - k, b = p - s, i, r = u[1], u[2], u[3] - - ds = -b * s * i - di = b * s * i - k * i - dr = k * i - - du[1], du[2], du[3] = ds, di, dr -end -``` - -The notation `du` is suggestive of both the derivative and a small increment. 
The mathematical formulation follows the derivative; the numeric solution uses a time step and increments the solution over this time step. The `Tsit5()` solver, used here, adaptively chooses a time step, `dt`; were the `Euler` method used, this time step would need to be explicit.
-
-!!! note "Mutation not re-binding"
-    The `sir!` function has the trailing `!` indicating -- by convention -- that it *mutates* its first argument, `du`. In this case, through an assignment, as in `du[1]=ds`. This could use some explanation. The *binding* `du` refers to the *container* holding the ``3`` values, whereas `du[1]` refers to the first value in that container. So `du[1]=ds` changes the first value, but not the *binding* of `du` to the container. That is, `du` mutates. This would be quite different were the call `du = [ds,di,dr]`, which would create a new *binding* to a new container and not mutate the values in the original container.
-
-With the update function defined, the problem is set up and a solution found in the same manner:
-
-```julia;
-p = (k=1/3, b=1/2)       # parameters
-time_span = (0.0, 150.0) # time span to solve over, 5 months
-
-prob = ODEProblem(sir!, u0, time_span, p)
-sol = solve(prob, Tsit5())
-
-plot(sol)
-plot!(x -> 0.5, linewidth=2) # mark 50% line
-```
-
-The lower graph shows the number of infected at each day over the five-month period displayed. The peak is around 6-7% of the population at any one time. However, over time the recovered part of the population reaches over 50%, meaning more than half the population is modeled as getting sick.
-
-
-Now we change the parameter ``b`` and observe the difference.
We passed in a value `p` holding our two parameters, so we just need to change that and run the model again:
-
-```julia; hold=true
-p = (k=1/2, b=2) # change b from 1/2 to 2 -- more daily contact
-prob = ODEProblem(sir!, u0, time_span, p)
-sol = solve(prob, Tsit5())
-
-plot(sol)
-```
-
-The graphs are somewhat similar, but the steady state is reached much more quickly and nearly everyone becomes infected.
-
-
-What if ``k`` were bigger?
-
-```julia; hold=true
-p = (k=2/3, b=1/2)
-prob = ODEProblem(sir!, u0, time_span, p)
-sol = solve(prob, Tsit5())
-
-plot(sol)
-```
-
-
-The graphs show that under these conditions the infections never take off; we have ``i' = (b\cdot s-k)i = k\cdot((b/k) s - 1) i``, which is always negative, since ``(b/k)s < 1``, so infections will only decay.
-
-
-The solution object is indexed by time and then has the `s`, `i`, `r` estimates. We use this structure below to return the estimated proportion of recovered individuals at the end of the time span.
-
-```julia
-function recovered(k,b)
-    prob = ODEProblem(sir!, u0, time_span, (k,b));
-    sol = solve(prob, Tsit5());
-    s,i,r = last(sol)
-    r
-end
-```
-
-This function makes it easy to see the impact of changing the parameters. For example, fixing ``k=1/3`` we have:
-
-```julia
-f(b) = recovered(1/3, b)
-plot(f, 0, 2)
-```
-
-This very clearly shows the sharp dependence on the value of ``b``: below some level, the proportion of people who are ever infected (the recovered cohort) remains near ``0``; above that level it can climb quickly towards ``1``.
-
-The function `recovered` is of two variables returning a single value. In subsequent sections we will see a few ``3``-dimensional plots that are common for such functions; here we skip ahead and show how to visualize multiple function plots at once using "`z`" values in a graph.
-
-```julia; hold=true
-k, ks = 0.1, 0.2:0.1:0.9 # first `k` and then the rest
-bs = range(0, 2, length=100)
-zs = recovered.(k, bs) # find values for fixed k, each of bs
-p = plot(bs, k*one.(bs), zs, legend=false) # k*one.(bs) is [k,k,...,k]
-for k in ks
-    plot!(p, bs, k*one.(bs), recovered.(k, bs))
-end
-p
-```
-
-The 3-dimensional graph with `plotly` can have its viewing angle
-adjusted with the mouse. When looking down on the ``x-y`` plane, whose
-axes encode `b` and `k`, we can see the rapid growth along a line related to
-``b/k``.
-
-
-Smith and Moore point out that ``k`` is roughly the reciprocal of the number of days an individual is sick enough to infect others. This can be estimated during a breakout. However, they go on to note that there is no direct way to observe ``b``, but there is an indirect way.
-
-The ratio ``c = b/k`` is the number of close contacts per day times the number of days infected, which is the number of close contacts per infected individual.
-
-This can be estimated from the curves once steady state has been reached (at the end of the pandemic).
-
-```math
-\frac{di}{ds} = \frac{di/dt}{ds/dt} = \frac{b \cdot s(t) \cdot i(t) - k \cdot i(t)}{-b \cdot s(t) \cdot i(t)} = -1 + \frac{1}{c \cdot s}
-```
-
-This equation does not depend on ``t``; ``s`` is the independent variable. It could be solved numerically, but in this case affords an algebraic solution: ``i = -s + (1/c) \log(s) + q``, where ``q`` is some constant. The quantity ``q = i + s - (1/c) \log(s)`` does not depend on time, so is the same at time ``t=0`` as it is as ``t \rightarrow \infty``. At ``t=0`` we have ``s(0) \approx 1`` and ``i(0) \approx 0``, whereas as ``t \rightarrow \infty``, ``i(t) \rightarrow 0`` and ``s(t)`` goes to the steady state value, which can be estimated. Solving with ``t=0``, we see ``q = 0 + 1 - (1/c)\log(1) = 1``. In the limit then, ``1 = 0 + s_{\infty} - (1/c)\log(s_\infty)``, or ``c = \log(s_\infty)/(s_\infty - 1)``.
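This last relation can be checked numerically with plain `Julia`. A minimal sketch (the helper names `contact_number` and `steady_state` are ours, not from any package used above):

```julia
# The quantity q = i + s - (1/c)*log(s) is conserved in time; with q = 1,
# the steady state s∞ satisfies 1 = s∞ - (1/c)*log(s∞). Solving that
# relation for c recovers the contact number from an observed s∞.
contact_number(s∞) = log(s∞) / (s∞ - 1)

# Conversely, for a given c, locate the nontrivial root of
# f(s) = s - (1/c)*log(s) - 1 on (0, 1) by bisection.
function steady_state(c; lo=1e-6, hi=0.999)
    f(s) = s - (1/c)*log(s) - 1
    for _ in 1:60
        mid = (lo + hi)/2
        f(lo) * f(mid) <= 0 ? (hi = mid) : (lo = mid)
    end
    (lo + hi)/2
end

s∞ = steady_state(3/2)  # c = b/k = (1/2)/(1/3) = 3/2, the original parameters
contact_number(s∞)      # recovers c ≈ 1.5
```

Round-tripping through `steady_state` and `contact_number` returns the contact number we started with, which is a useful consistency check on the derivation.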
-
-
-## Trajectory with drag
-
-We now solve numerically the problem of a trajectory with a drag force from air resistance.
-
-The general model is:
-
-```math
-\begin{align*}
-x''(t) &= - W(t,x(t), x'(t), y(t), y'(t)) \cdot x'(t)\\
-y''(t) &= -g - W(t,x(t), x'(t), y(t), y'(t)) \cdot y'(t)\\
-\end{align*}
-```
-
-with initial conditions: ``x(0) = y(0) = 0`` and ``x'(0) = v_0 \cos(\theta), y'(0) = v_0 \sin(\theta)``.
-
-This is turned into a first-order system by a standard trick. Here we define our function for updating a step. As can be seen, the vector `u` contains both ``\langle x,y \rangle``
-and ``\langle x',y' \rangle``:
-
-```julia
-function xy!(du, u, p, t)
-    g, γ = p.g, p.k
-    x, y = u[1], u[2]
-    x′, y′ = u[3], u[4] # unicode \prime[tab]
-
-    W = γ
-
-    du[1] = x′
-    du[2] = y′
-    du[3] = 0 - W * x′
-    du[4] = -g - W * y′
-end
-```
-
-This function ``W`` is just a constant above, but can be easily modified as desired.
-
-!!! note "A second-order ODE is a coupled first-order ODE"
-    The "standard" trick is to take a second-order ODE like ``u''(t)=u`` and turn this into two coupled ODEs by using a new name: ``v=u'(t)`` and then ``v'(t) = u(t)``. In this application, there are ``4`` equations, as we have *both* ``x''`` and ``y''`` being so converted. The first and second components of ``du`` express the derivatives of ``x`` and ``y`` in terms of the new variables; the third and fourth encode the original second-order equations.
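The order-lowering trick in the note can be seen in miniature, apart from the trajectory model. A sketch (the function names are ours) for ``u'' = -u``, rewritten as ``u' = v``, ``v' = -u``, and stepped with Euler's method:

```julia
# One Euler step for the coupled system u' = v, v' = -u,
# mirroring the way `xy!` fills du[1:4] for the trajectory model.
euler_step(u, v, h) = (u + h*v, v - h*u)

function integrate(u, v, h, n)
    for _ in 1:n
        u, v = euler_step(u, v, h)
    end
    (u, v)
end

h = 0.001
u, v = integrate(1.0, 0.0, h, round(Int, pi/h))  # integrate to t ≈ π
u   # ≈ cos(π) = -1, up to the first-order error of the method
```

Shrinking `h` brings the endpoint closer to ``\cos(\pi) = -1``; the trajectory code delegates exactly this kind of stepping, with an adaptively chosen `dt`, to `Tsit5()`.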
-
-The initial conditions are specified through:
-
-```julia
-θ = pi/4
-v₀ = 200
-xy₀ = [0.0, 0.0]
-vxy₀ = v₀ * [cos(θ), sin(θ)]
-INITIAL = vcat(xy₀, vxy₀)
-```
-
-The time span can be computed using an *upper* bound from the no-drag case, for which the classic physics formulas give (when ``y_0=0``) ``(0, 2v_{y0}/g)``:
-
-```julia
-g = 9.8
-TSPAN = (0, 2*vxy₀[2] / g)
-```
-
-This allows us to define an `ODEProblem`:
-
-```julia
-trajectory_problem = ODEProblem(xy!, INITIAL, TSPAN)
-```
-
-When ``\gamma = 0`` there should be no drag and we expect to see a parabola:
-
-```julia; hold=true
-ps = (g=9.8, k=0)
-SOL = solve(trajectory_problem, Tsit5(); p = ps)
-
-plot(t -> SOL(t)[1], t -> SOL(t)[2], TSPAN...; legend=false)
-```
-
-The plot is a parametric plot of the ``x`` and ``y`` parts of the solution over the time span. We can see the expected parabolic shape.
-
-On a *windy* day, the value of ``k`` would be positive. Repeating the above with ``k=1/4`` gives:
-
-```julia; hold=true
-ps = (g=9.8, k=1/4)
-SOL = solve(trajectory_problem, Tsit5(); p = ps)
-
-plot(t -> SOL(t)[1], t -> SOL(t)[2], TSPAN...; legend=false)
-```
-
-We see that the ``y`` values have gone negative. The `DifferentialEquations` package can adjust for that with a *callback* which terminates the problem once ``y`` has gone negative. This can be implemented as follows:
-
-
-```julia; hold=true
-condition(u,t,integrator) = u[2] # the callback triggers when this value crosses zero
-affect!(integrator) = terminate!(integrator) # stop the process
-cb = ContinuousCallback(condition, affect!)
-
-ps = (g=9.8, k=1/4)
-SOL = solve(trajectory_problem, Tsit5(); p = ps, callback=cb)
-
-plot(t -> SOL(t)[1], t -> SOL(t)[2], TSPAN...; legend=false)
-```
-
-
-Finally, we note that the `ModelingToolkit` package provides symbolic-numeric computing. This allows the equations to be set up symbolically, as in `SymPy`, before being passed off to `DifferentialEquations` to solve numerically.
The above example with no wind resistance could be translated into the following: - -```julia; hold=true -@parameters t γ g -@variables x(t) y(t) -D = Differential(t) - -eqs = [D(D(x)) ~ -γ * D(x), - D(D(y)) ~ -g - γ * D(y)] - -@named sys = ODESystem(eqs) -sys = ode_order_lowering(sys) # turn 2nd order into 1st - -u0 = [D(x) => vxy₀[1], - D(y) => vxy₀[2], - x => 0.0, - y => 0.0] - -p = [γ => 0.0, - g => 9.8] - -prob = ODEProblem(sys, u0, TSPAN, p, jac=true) -sol = solve(prob,Tsit5()) - -plot(t -> sol(t)[3], t -> sol(t)[4], TSPAN..., legend=false) -``` - -The toolkit will automatically generate fast functions and can perform transformations (such as is done by `ode_order_lowering`) before passing along to the numeric solves. diff --git a/CwJ/ODEs/euler.jmd b/CwJ/ODEs/euler.jmd deleted file mode 100644 index d126ff6..0000000 --- a/CwJ/ODEs/euler.jmd +++ /dev/null @@ -1,834 +0,0 @@ -# Euler's method - -This section uses these add-on packages: - -```julia -using CalculusWithJulia -using Plots -using SymPy -using Roots -``` - -```julia; echo=false; results="hidden" -using CalculusWithJulia.WeaveSupport - - -const frontmatter = ( - title = "Euler's method", - description = "Calculus with Julia: Euler's method", - tags = ["CalculusWithJulia", "odes", "euler's method"], -); -fig_size = (800, 600) -nothing -``` - ----- - -The following section takes up the task of numerically approximating solutions to differential equations. `Julia` has a huge set of state-of-the-art tools for this task starting with the [DifferentialEquations](https://github.com/SciML/DifferentialEquations.jl) package. We don't use that package in this section, focusing on simpler methods and implementations for pedagogical purposes, but any further exploration should utilize the tools provided therein. A brief introduction to the package follows in an upcoming [section](./differential_equations.html). 
- - ----- - - - -Consider the differential equation: - -```math -y'(x) = y(x) \cdot x, \quad y(1)=1, -``` - -which can be solved with `SymPy`: - -```julia; -@syms x, y, u() -D = Differential(x) -x0, y0 = 1, 1 -F(y,x) = y*x - -dsolve(D(u)(x) - F(u(x), x)) -``` - -With the given initial condition, the solution becomes: - -```julia; -out = dsolve(D(u)(x) - F(u(x),x), u(x), ics=Dict(u(x0) => y0)) -``` - - -Plotting this solution over the slope field - -```julia; -p = plot(legend=false) -vectorfieldplot!((x,y) -> [1, F(x,y)], xlims=(0, 2.5), ylims=(0, 10)) -plot!(rhs(out), linewidth=5) -``` - - -we see that the vectors that are drawn seem to be tangent to the graph -of the solution. This is no coincidence, the tangent lines to integral -curves are in the direction of the slope field. - - -What if the graph of the solution were not there, could we use this -fact to *approximately* reconstruct the solution? - -That is, if we stitched together pieces of the slope field, would we -get a curve that was close to the actual answer? - -```julia; hold=true; echo=false; cache=true -## {{{euler_graph}}} -function make_euler_graph(n) - x, y = symbols("x, y") - F(y,x) = y*x - x0, y0 = 1, 1 - - h = (2-1)/5 - xs = zeros(n+1) - ys = zeros(n+1) - xs[1] = x0 # index is off by 1 - ys[1] = y0 - for i in 1:n - xs[i + 1] = xs[i] + h - ys[i + 1] = ys[i] + h * F(ys[i], xs[i]) - end - - p = plot(legend=false) - vectorfieldplot!((x,y) -> [1, F(y,x)], xlims=(1,2), ylims=(0,6)) - - ## Add Euler soln - plot!(p, xs, ys, linewidth=5) - scatter!(p, xs, ys) - - ## add function - out = dsolve(u'(x) - F(u(x), x), u(x), ics=(u, x0, y0)) - plot!(p, rhs(out), x0, xs[end], linewidth=5) - - p -end - - - - -n = 5 -anim = @animate for i=1:n - make_euler_graph(i) -end - -imgfile = tempname() * ".gif" -gif(anim, imgfile, fps = 1) - - -caption = """ -Illustration of a function stitching together slope field lines to -approximate the answer to an initial-value problem. 
The other function drawn is the actual solution.
-"""
-
-ImageFile(imgfile, caption)
-```
-
-The illustration suggests the answer is yes; let's see. The solution
-is drawn over $x$ values $1$ to $2$. Let's try piecing together $5$
-pieces between $1$ and $2$ and see what we have.
-
-The slope-field vectors are *scaled* versions of the vector `[1, F(y,x)]`. The `1`
-is the part in the direction of the $x$ axis, so here we would like
-that to be $0.2$ (which is $(2-1)/5$). So our vectors would be `0.2 *
-[1, F(y,x)]`. To allow for generality, we use `h` in place of the
-specific value $0.2$.
-
-Then our first piece would be the line connecting $(x_0,y_0)$ to
-
-```math
-\langle x_0, y_0 \rangle + h \cdot \langle 1, F(y_0, x_0) \rangle.
-```
-
-The above uses vector notation to add the piece scaled by $h$ to the
-starting point. Rather than continue with that notation, we will use
-subscripts. Let $x_1$, $y_1$ be the position of the tip of the
-vector. Then we have:
-
-```math
-x_1 = x_0 + h, \quad y_1 = y_0 + h F(y_0, x_0).
-```
-
-With this notation, it is easy to see what comes next:
-
-```math
-x_2 = x_1 + h, \quad y_2 = y_1 + h F(y_1, x_1).
-```
-
-We just shifted the indices forward by $1$. But graphically what is
-this? It takes the tip of the first part of our "stitched" together
-solution, finds the slope field there (`[1, F(y,x)]`), and then uses
-this direction to stitch together one more piece.
-
-Clearly, we can repeat. The $n$th piece will end at:
-
-```math
-x_{n+1} = x_n + h, \quad y_{n+1} = y_n + h F(y_n, x_n).
-```
-
-For our example, we can do some numerics. We want $h=0.2$ and $5$
-pieces, so values of $y$ at $x_0=1, x_1=1.2, x_2=1.4, x_3=1.6,
-x_4=1.8,$ and $x_5=2$.
-
-Below we do this in a loop. We have to be a bit careful, as in `Julia`
-the vector of zeros we create to store our answers begins indexing at
-$1$, and not $0$.
- -```julia; -n=5 -h = (2-1)/n -xs = zeros(n+1) -ys = zeros(n+1) -xs[1] = x0 # index is off by 1 -ys[1] = y0 -for i in 1:n - xs[i + 1] = xs[i] + h - ys[i + 1] = ys[i] + h * F(ys[i], xs[i]) -end -``` - -So how did we do? Let's look graphically: - -```julia; -plot(exp(-1/2)*exp(x^2/2), x0, 2) -plot!(xs, ys) -``` - -Not bad. We wouldn't expect this to be exact - due to the concavity -of the solution, each step is an underestimate. However, we see it is -an okay approximation and would likely be better with a smaller $h$. A -topic we pursue in just a bit. - -Rather than type in the above command each time, we wrap it all up in -a function. The inputs are $n$, $a=x_0$, $b=x_n$, $y_0$, and, most -importantly, $F$. The output is massaged into a function through a -call to `linterp`, rather than two vectors. The `linterp` function we define below just -finds a function that linearly interpolates between the points and is -`NaN` outside of the range of the $x$ values: - -```julia; -function linterp(xs, ys) - function(x) - ((x < xs[1]) || (x > xs[end])) && return NaN - for i in 1:(length(xs) - 1) - if xs[i] <= x < xs[i+1] - l = (x-xs[i]) / (xs[i+1] - xs[i]) - return (1-l) * ys[i] + l * ys[i+1] - end - end - ys[end] - end -end -``` - -With that, here is our function to find an approximate solution to $y'=F(y,x)$ with initial condition: - -```julia; -function euler(F, x0, xn, y0, n) - h = (xn - x0)/n - xs = zeros(n+1) - ys = zeros(n+1) - xs[1] = x0 - ys[1] = y0 - for i in 1:n - xs[i + 1] = xs[i] + h - ys[i + 1] = ys[i] + h * F(ys[i], xs[i]) - end - linterp(xs, ys) -end -``` - -With `euler`, it becomes easy to explore different values. - -For example, we thought the solution would look better with a smaller $h$ (or larger $n$). 
Instead of $n=5$, let's try $n=50$:
-
-```julia;
-u₁₂ = euler(F, 1, 2, 1, 50)
-plot(exp(-1/2)*exp(x^2/2), x0, 2)
-plot!(u₁₂, x0, 2)
-```
-
-It is more work for the computer, but not for us, and clearly a much better approximation to the actual answer is found.
-
-
-## The Euler method
-
-
-```julia; hold=true; echo=false
-imgfile = "figures/euler.png"
-caption = """
-Figure from first publication of Euler's method. From [Gander and Wanner](http://www.unige.ch/~gander/Preprints/Ritz.pdf).
-"""
-
-ImageFile(:ODEs, imgfile, caption)
-```
-
-
-The name of our function reflects the [mathematician](https://en.wikipedia.org/wiki/Leonhard_Euler) associated with the iteration:
-
-```math
-x_{n+1} = x_n + h, \quad y_{n+1} = y_n + h \cdot F(y_n, x_n),
-```
-
-to approximate a solution to the first-order ordinary differential
-equation with initial values: $y'(x) = F(y,x)$.
-
-
-[The Euler method](https://en.wikipedia.org/wiki/Euler_method) uses
-linearization. Each "step" is just an approximation of the function
-value $y(x_{n+1})$ with the value from the tangent line at the
-point $(x_n, y_n)$.
-
-
-Each step introduces an error. The error in one step is known as the
-*local truncation error* and can be shown to be about equal to $1/2
-\cdot h^2 \cdot y''(x_{n})$, assuming $y$ has ``3`` or more derivatives.
-
-The total error, or more commonly, *global truncation error*, is the
-error between the actual answer and the approximate answer at the end
-of the process. It reflects an accumulation of these local errors. This
-error is *bounded* by a constant times $h$. Since it gets smaller as
-$h$ gets smaller in direct proportion, the Euler method is called
-*first order*.
-
-Other, somewhat more complicated, methods have global truncation errors that
-involve higher powers of $h$ - that is, for the same size $h$, the
-error is smaller.
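The first-order claim can be checked empirically. A self-contained sketch (restating the Euler recursion so this chunk runs on its own; the test equation ``y'=y`` and the helper names are ours):

```julia
# Endpoint value of Euler's method for y' = F(y, x) on [x0, xn].
function euler_endpoint(F, x0, xn, y0, n)
    h = (xn - x0)/n
    x, y = float(x0), float(y0)
    for _ in 1:n
        y += h * F(y, x)   # Euler update, using the old (x, y)
        x += h
    end
    y
end

G(y, x) = y   # y' = y, y(0) = 1 has exact solution e^x
errs = [abs(euler_endpoint(G, 0, 1, 1, n) - exp(1)) for n in (100, 200, 400)]
errs[1] ./ errs[2:end]   # ratios near (2, 4): halving h halves the error
```

That the error ratios track the step-size ratios is exactly what "first order" promises; a second-order method would show ratios near ``(4, 16)`` instead.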
By analogy, Riemann sums have
-error that depends on $h$, whereas other methods of approximating the
-integral have smaller errors; for example, Simpson's rule has error
-related to $h^4$. So, while the Euler method may not be the method
-employed when total resources (time, computing power, ...) are a
-concern, it is important for theoretical purposes, in a manner similar
-to the role of the Riemann integral.
-
-In the examples, we will see that for many problems the simple Euler
-method is satisfactory, but not always so. The task of numerically
-solving differential equations is not a one-size-fits-all one. In the
-following, a few different modifications are presented to the basic
-Euler method, but this just scratches the surface of the topic.
-
-#### Examples
-
-##### Example
-
-
-Consider the initial value problem $y'(x) = x + y(x)$ with initial
-condition $y(0)=1$. This problem can be solved exactly. Here we
-approximate over $[0,2]$ using Euler's method.
-
-```julia;
-𝑭(y,x) = x + y
-𝒙0, 𝒙n, 𝒚0 = 0, 2, 1
-𝒇 = euler(𝑭, 𝒙0, 𝒙n, 𝒚0, 25)
-𝒇(𝒙n)
-```
-
-We graphically compare our approximate answer with the exact one:
-
-```julia;
-𝒐ut = dsolve(D(u)(x) - 𝑭(u(x),x), u(x), ics = Dict(u(𝒙0) => 𝒚0))
-plot(rhs(𝒐ut), 𝒙0, 𝒙n)
-plot!(𝒇, 𝒙0, 𝒙n)
-```
-
-From the graph it appears our value for `𝒇(𝒙n)` will underestimate the
-actual value of the solution slightly.
-
-##### Example
-
-The equation $y'(x) = \sin(x \cdot y)$ is not separable, so need not have an
-easy solution. The default method will fail. Looking at the available methods with `sympy.classify_ode(𝐞qn, u(x))` shows a power series method, which
-can return a power series *approximation* (a Taylor polynomial). Let's
-compare an approximate answer given by the Euler method to
-the one returned by `SymPy`.
-
-First, the `SymPy` solution:
-
-```julia;
-𝐅(y,x) = sin(x*y)
-𝐞qn = D(u)(x) - 𝐅(u(x), x)
-𝐨ut = dsolve(𝐞qn, hint="1st_power_series")
-```
-
-
-
-If we assume $y(0) = 1$, we can continue:
-
-```julia;
-𝐨ut1 = dsolve(𝐞qn, u(x), ics=Dict(u(0) => 1), hint="1st_power_series")
-```
-
-The approximate solution given by the Euler method is computed and plotted below:
-
-```julia;
-𝐱0, 𝐱n, 𝐲0 = 0, 2, 1
-
-plot(legend=false)
-vectorfieldplot!((x,y) -> [1, 𝐅(y,x)], xlims=(𝐱0, 𝐱n), ylims=(0,5))
-plot!(rhs(𝐨ut1).removeO(), linewidth=5)
-
-𝐮 = euler(𝐅, 𝐱0, 𝐱n, 𝐲0, 10)
-plot!(𝐮, linewidth=5)
-```
-
-We see that the answer found from using a polynomial series matches that of Euler's method for a bit, but as time evolves, the approximate solution given by Euler's method more closely tracks the slope field.
-
-##### Example
-
-
-The
-[Brachistochrone problem](http://www.unige.ch/~gander/Preprints/Ritz.pdf)
-was posed by Johann Bernoulli in 1696. It asked for the curve between
-two points along which a falling object reaches the end point faster than
-along any other curve. For example, a bead sliding on a wire will take a certain amount of time to get from point $A$ to point $B$, the time depending on the shape of the wire. Which shape will take the least amount of time?
-
-
-```julia; hold=true; echo=false
-imgfile = "figures/bead-game.jpg"
-caption = """
-
-A child's bead game. What shape wire will produce the shortest time for a bead to slide from the top to the bottom?
-
-"""
-ImageFile(:ODEs, imgfile, caption)
-```
-
-We restrict our attention to the $x$-$y$ plane and consider a path
-between the point $(0,A)$ and $(B,0)$. Let $y(x)$ be the vertical distance
-dropped from $A$, so $y(0)=0$ and at the end $y$ will be $A$.
-
-
-[Galileo](http://www-history.mcs.st-and.ac.uk/HistTopics/Brachistochrone.html)
-knew the straight line was not the curve, but incorrectly thought the
-answer was part of a circle.
- -```julia; hold=true; echo=false -imgfile = "figures/galileo.gif" -caption = """ -As early as 1638, Galileo showed that an object falling along `AC` and then `CB` will fall faster than one traveling along `AB`, where `C` is on the arc of a circle. -From the [History of Math Archive](http://www-history.mcs.st-and.ac.uk/HistTopics/Brachistochrone.html). -""" -ImageFile(:ODEs, imgfile, caption) -``` - - -This simulation also suggests that a curved path is better than the shorter straight one: - -```julia; hold=true; echo=false; cache=true -##{{{brach_graph}}} - -function brach(f, x0, vx0, y0, vy0, dt, n) - m = 1 - g = 9.8 - - axs = Float64[0] - ays = Float64[-g] - vxs = Float64[vx0] - vys = Float64[vy0] - xs = Float64[x0] - ys = Float64[y0] - - for i in 1:n - x = xs[end] - vx = vxs[end] - - ax = -f'(x) * (f''(x) * vx^2 + g) / (1 + f'(x)^2) - ay = f''(x) * vx^2 + f'(x) * ax - - push!(axs, ax) - push!(ays, ay) - - push!(vxs, vx + ax * dt) - push!(vys, vys[end] + ay * dt) - push!(xs, x + vxs[end] * dt)# + (1/2) * ax * dt^2) - push!(ys, ys[end] + vys[end] * dt)# + (1/2) * ay * dt^2) - end - - [xs ys vxs vys axs ays] - -end - - -fs = [x -> 1 - x, - x -> (x-1)^2, - x -> 1 - sqrt(1 - (x-1)^2), - x -> - (x-1)*(x+1), - x -> 3*(x-1)*(x-1/3) - ] - - -MS = [brach(f, 1/100, 0, 1, 0, 1/100, 100) for f in fs] - - -function make_brach_graph(n) - - p = plot(xlim=(0,1), ylim=(-1/3, 1), legend=false) - for (i,f) in enumerate(fs) - plot!(f, 0, 1) - U = MS[i] - x = min(1.0, U[n,1]) - scatter!(p, [x], [f(x)]) - end - p - -end - - - -n = 4 -anim = @animate for i=[1,5,10,15,20,25,30,35,40,45,50,55,60] - make_brach_graph(i) -end - -imgfile = tempname() * ".gif" -gif(anim, imgfile, fps = 1) - - -caption = """ -The race is on. An illustration of beads falling along a path, as can be seen, some paths are faster than others. The fastest path would follow a cycloid. 
See [Bensky and Moelter](https://pdfs.semanticscholar.org/66c1/4d8da6f2f5f2b93faf4deb77aafc7febb43a.pdf) for details on simulating a bead on a wire.
-"""
-
-ImageFile(imgfile, caption)
-```
-
-
-
-
-Now, the natural question is which path is best? The solution can be
-[reduced](http://mathworld.wolfram.com/BrachistochroneProblem.html) to
-solving this equation for a positive $C$:
-
-```math
-1 + (y'(x))^2 = \frac{C}{y}, \quad C > 0.
-```
-
-Reexpressing, this becomes:
-
-```math
-\frac{dy}{dx} = \sqrt{\frac{C-y}{y}}.
-```
-
-This is a separable equation and can be solved, but even `SymPy` has
-trouble with this integral. However, the result has been known to be a piece of a cycloid since the insightful
-Johann Bernoulli used an analogy with light bending to approach the problem. The answer is best described parametrically
-through:
-
-```math
-x(u) = C\cdot u - \frac{C}{2}\sin(2u), \quad y(u) = \frac{C}{2}( 1- \cos(2u)), \quad 0 \leq u \leq U.
-```
-
-The values of $U$ and $C$ must satisfy $(x(U), y(U)) = (B, A)$.
-
-
-Rather than pursue this, we will solve it numerically for a fixed
-value of $C$ over a fixed interval to see the shape.
-
-
-The equation can be written in terms of $y'=F(y,x)$, where
-
-```math
-F(y,x) = \sqrt{\frac{C-y}{y}}.
-```
-
-But as $y_0 = 0$, we immediately would have a problem with the first step, as there would be division by $0$.
-
-This says that for the optimal solution, the bead picks up speed by first sliding straight down before heading off towards $B$. That's great for the physics, but runs roughshod over our Euler method, as the first step has an infinity.
-
-For this, we can try the *backward Euler* method, which uses the slope at $(x_{n+1}, y_{n+1})$ rather than $(x_n, y_n)$. The update step becomes:
-
-```math
-y_{n+1} = y_n + h \cdot F(y_{n+1}, x_{n+1}).
-```
-
-Seems innocuous, but the value we are trying to find, $y_{n+1}$, is now on both sides of the equation, so is only *implicitly* defined.
In this code, we use the `find_zero` function from the `Roots` package. The
caveat is that this function needs a good initial guess, and the one we
use below need not be widely applicable.


```julia;
function back_euler(F, x0, xn, y0, n)
    h = (xn - x0)/n
    xs = zeros(n+1)
    ys = zeros(n+1)
    xs[1] = x0
    ys[1] = y0
    for i in 1:n
        xs[i + 1] = xs[i] + h
        ## solve y[i+1] = y[i] + h * F(y[i+1], x[i+1])
        ys[i + 1] = find_zero(y -> ys[i] + h * F(y, xs[i + 1]) - y, ys[i] + h)
    end
    linterp(xs, ys)
end
```

We then have with $C=1$ over the interval $[0,1.2]$ the following:

```julia;
𝐹(y, x; C=1) = sqrt(C/y - 1)
𝑥0, 𝑥n, 𝑦0 = 0, 1.2, 0
cyc = back_euler(𝐹, 𝑥0, 𝑥n, 𝑦0, 50)
plot(x -> 1 - cyc(x), 𝑥0, 𝑥n)
```

Remember, $y$ is the displacement from the top, so it is
non-negative. Above we flipped the graph to make it look more like our
expectation. In general, the trajectory may actually dip below the
ending point and come back up. The above won't see this, for as
written $dy/dx \geq 0$, which need not be the case, as the defining
equation is in terms of $(dy/dx)^2$, so the derivative could have either
sign.



##### Example: stiff equations

The Euler method is *convergent*, in that as $h$ goes to $0$, the
approximate solution will converge to the actual answer. However, this
does not say that for a fixed step size $h$, the approximate value will be
good. For example, consider the differential equation $y'(x) =
-5y$. This has solution $y(x)=y_0 e^{-5x}$. However, if we try the
Euler method to get an answer over $[0,2]$ with $h=0.5$ we don't see
this:

```julia;
ℱ(y,x) = -5y
𝓍0, 𝓍n, 𝓎0 = 0, 2, 1
𝓊 = euler(ℱ, 𝓍0, 𝓍n, 𝓎0, 4) # n = 4 => h = 2/4
vectorfieldplot((x,y) -> [1, ℱ(y,x)], xlims=(0, 2), ylims=(-5, 5))
plot!(x -> 𝓎0 * exp(-5x), 0, 2, linewidth=5)
plot!(𝓊, 0, 2, linewidth=5)
```

What we see is that the value of $h$ is too big to capture the decay
scale of the solution.
A smaller $h$ can do much better:

```julia;
𝓊₁ = euler(ℱ, 𝓍0, 𝓍n, 𝓎0, 50) # n = 50 => h = 2/50
plot(x -> 𝓎0 * exp(-5x), 0, 2)
plot!(𝓊₁, 0, 2)
```

This is an example of a
[stiff equation](https://en.wikipedia.org/wiki/Stiff_equation). Such
equations cause problems for explicit methods like Euler's, as small
values of $h$ are needed to get good results.

The implicit, backward Euler method does not have this issue, as we can see here:

```julia;
𝓊₂ = back_euler(ℱ, 𝓍0, 𝓍n, 𝓎0, 4) # n = 4 => h = 2/4
vectorfieldplot((x,y) -> [1, ℱ(y,x)], xlims=(0, 2), ylims=(-1, 1))
plot!(x -> 𝓎0 * exp(-5x), 0, 2, linewidth=5)
plot!(𝓊₂, 0, 2, linewidth=5)
```


##### Example: The pendulum


The differential equation describing the simple pendulum is

```math
\theta''(t) = - \frac{g}{l}\sin(\theta(t)).
```

The typical approach to solving for $\theta(t)$ is to use the small-angle approximation that $\sin(x) \approx x$, and then the differential equation simplifies to
$\theta''(t) = -g/l \cdot \theta(t)$, which is easily solved.

Here we try to get an answer numerically. However, the problem, as stated, is not a first order equation due to the $\theta''(t)$ term. If we let $u(t) = \theta(t)$ and $v(t) = \theta'(t)$, then we get *two* coupled first order equations:

```math
v'(t) = -g/l \cdot \sin(u(t)), \quad u'(t) = v(t).
```

We can try the Euler method here. A simple approach might be this iteration scheme:

```math
\begin{align*}
x_{n+1} &= x_n + h,\\
u_{n+1} &= u_n + h v_n,\\
v_{n+1} &= v_n - h \cdot g/l \cdot \sin(u_n).
\end{align*}
```

Here we need *two* initial conditions: one for the initial value
$u(t_0)$ and one for the initial value of $u'(t_0)$. We have seen that if we start at an angle $a$ and release the bob from rest, so $u'(0)=0$, we get a sinusoidal answer to the linearized model. What happens here?
We take $l=5$ and $g=9.8$ and write a function to solve this, starting from $(x_0, y_0)$ and ending at $x_n$:

```julia;
function euler2(x0, xn, y0, yp0, n; g=9.8, l = 5)
    xs, us, vs = zeros(n+1), zeros(n+1), zeros(n+1)
    xs[1], us[1], vs[1] = x0, y0, yp0
    h = (xn - x0)/n
    for i = 1:n
        xs[i+1] = xs[i] + h
        us[i+1] = us[i] + h * vs[i]
        vs[i+1] = vs[i] + h * (-g / l) * sin(us[i])
    end
    linterp(xs, us)
end
```

Let's take $a = \pi/4$ as the initial angle; then the approximate
solution should be $(\pi/4)\cos(\sqrt{g/l}x)$ with period $T =
2\pi\sqrt{l/g}$. We first try to plot them over $4$ periods:

```julia;
𝗅, 𝗀 = 5, 9.8
𝖳 = 2pi * sqrt(𝗅/𝗀)
𝗑0, 𝗑n, 𝗒0, 𝗒p0 = 0, 4𝖳, pi/4, 0
plot(euler2(𝗑0, 𝗑n, 𝗒0, 𝗒p0, 20), 0, 4𝖳)
```

Something looks terribly amiss. The issue is the step size, $h$, is
too large to capture the oscillations. There are basically only $5$
steps to capture a full up and down motion. Instead, we aim for $20$ steps per unit of time,
so $n$ must be not $20$, but $n = 4 \cdot T \cdot 20 \approx 360$. To this
graph, we add the approximate one:

```julia;
plot(euler2(𝗑0, 𝗑n, 𝗒0, 𝗒p0, 360), 0, 4𝖳)
plot!(x -> pi/4*cos(sqrt(𝗀/𝗅)*x), 0, 4𝖳)
```

Even now, we still see that something seems amiss, though the issue is
not as dramatic as before. The oscillatory nature of the pendulum is
seen, but in the Euler solution, the amplitude grows, which would
necessarily mean energy is being put into the system. A familiar
instance of a pendulum would be a child on a swing. Without pumping
the legs - putting energy in the system - the height of the swing's
arc will not grow. Though we now have oscillatory motion, this growth
indicates the solution is still not quite right. The issue is likely
due to each step mildly overcorrecting and resulting in an overall
growth. One of the questions pursues this a bit further.
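The claim that the Euler iterates pump energy into the system can be checked directly. For the undamped pendulum the quantity $E = v^2/2 - (g/l)\cos(u)$ is constant along exact solutions; the sketch below (a re-implementation of the `euler2` update inline, with the same illustrative parameters) tracks it along the Euler iterates:

```julia
# Track the conserved quantity E = v^2/2 - (g/l)*cos(u) along the
# forward Euler iterates for the pendulum; E is constant for exact solutions.
function euler2_energy(x0, xn, y0, yp0, n; g=9.8, l=5)
    h = (xn - x0)/n
    u, v = y0, yp0
    E0 = v^2/2 - (g/l)*cos(u)        # initial (scaled) energy
    for i in 1:n
        u, v = u + h*v, v + h*(-g/l)*sin(u)   # same scheme as euler2
    end
    En = v^2/2 - (g/l)*cos(u)        # energy after n steps
    E0, En
end

E0, En = euler2_energy(0, 4*2pi*sqrt(5/9.8), pi/4, 0, 360)
En > E0    # the energy grows along the Euler iterates
```

A growing $E$ is consistent with each step mildly overcorrecting; the Euler-Cromer modification explored in the questions keeps this quantity nearly constant.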
- -## Questions - -##### Question - -Use Euler's method with $n=5$ to approximate $u(1)$ where - -```math -u'(x) = x - u(x), \quad u(0) = 1 -``` - -```julia; hold=true; echo=false -F(y,x) = x - y -x0, xn, y0 = 0, 1, 1 -val = euler(F, x0, xn, y0, 5)(1) -numericq(val) -``` - -##### Question - -Consider the equation - -```math -y' = x \cdot \sin(y), \quad y(0) = 1. -``` - -Use Euler's method with $n=50$ to find the value of $y(5)$. - -```julia; hold=true; echo=false -F(y, x) = x * sin(y) -x0, xn, y0 = 0, 5, 1 -n = 50 -u = euler(F, x0, xn, y0, n) -numericq(u(xn)) -``` - - -##### Question - -Consider the ordinary differential equation - -```math -\frac{dy}{dx} = 1 - 2\frac{y}{x}, \quad y(1) = 0. -``` - -Use Euler's method to solve for $y(2)$ when $n=50$. - -```julia; hold=true; echo=false -F(y, x) = 1 - 2y/x -x0, xn, y0 = 1, 2, 0 -n = 50 -u = euler(F, x0, xn, y0, n) -numericq(u(xn)) -``` - -##### Question - - -Consider the ordinary differential equation - -```math -\frac{dy}{dx} = \frac{y \cdot \log(y)}{x}, \quad y(2) = e. -``` - -Use Euler's method to solve for $y(3)$ when $n=25$. - -```julia; hold=true; echo=false -F(y, x) = y*log(y)/x -x0, xn, y0 = 2, 3, exp(1) -n = 25 -u = euler(F, x0, xn, y0, n) -numericq(u(xn)) -``` - - -##### Question - -Consider the first-order non-linear ODE - -```math -y' = y \cdot (1-2x), \quad y(0) = 1. -``` - -Use Euler's method with $n=50$ to approximate the solution $y$ over $[0,2]$. - -What is the value at $x=1/2$? - -```julia; hold=true; echo=false -F(y, x) = y * (1-2x) -x0, xn, y0 = 0, 2, 1 -n = 50 -u = euler(F, x0, xn, y0, n) -numericq(u(1/2)) -``` - -What is the value at $x=3/2$? - -```julia; hold=true; echo=false -F(y, x) = y * (1-2x) -x0, xn, y0 = 0, 2, 1 -n = 50 -u = euler(F, x0, xn, y0, n) -numericq(u(3/2)) -``` - -##### Question: The pendulum revisited. 
The issue with the pendulum's solution growing in amplitude can be
addressed using a modification to the Euler method attributed to
[Cromer](http://astro.physics.ncsu.edu/urca/course_files/Lesson14/index.html). The
fix is to replace the term `sin(us[i])` in the line `vs[i+1] = vs[i] + h * (-g / l) * sin(us[i])` of the `euler2` function with `sin(us[i+1])`, which uses the updated angle (available, as `us[i+1]` is computed before `vs[i+1]` in the loop) in place of the value before the step.

Modify the `euler2` function to implement the Euler-Cromer method. What do you see?

```julia; hold=true; echo=false
choices = [
"The same as before - the amplitude grows",
"The solution is identical to that of the approximation found by linearization of the sine term",
"The solution has a constant amplitude, but its period is slightly *shorter* than that of the approximate solution found by linearization",
"The solution has a constant amplitude, but its period is slightly *longer* than that of the approximate solution found by linearization"]
answ = 4
radioq(choices, answ, keep_order=true)
```
diff --git a/CwJ/ODEs/figures/bead-game.jpg b/CwJ/ODEs/figures/bead-game.jpg
deleted file mode 100644
index 72e2389..0000000
Binary files a/CwJ/ODEs/figures/bead-game.jpg and /dev/null differ
diff --git a/CwJ/ODEs/figures/euler.png b/CwJ/ODEs/figures/euler.png
deleted file mode 100644
index 78bf257..0000000
Binary files a/CwJ/ODEs/figures/euler.png and /dev/null differ
diff --git a/CwJ/ODEs/figures/galileo.gif b/CwJ/ODEs/figures/galileo.gif
deleted file mode 100644
index 07d18b2..0000000
Binary files a/CwJ/ODEs/figures/galileo.gif and /dev/null differ
diff --git a/CwJ/ODEs/figures/verrazano-narrows-bridge-anniversary-historic-photos-2.jpeg b/CwJ/ODEs/figures/verrazano-narrows-bridge-anniversary-historic-photos-2.jpeg
deleted file mode 100644
index 82c5258..0000000
Binary files a/CwJ/ODEs/figures/verrazano-narrows-bridge-anniversary-historic-photos-2.jpeg and /dev/null differ
diff --git a/CwJ/ODEs/odes.jmd
b/CwJ/ODEs/odes.jmd
deleted file mode 100644
index 5ec99e0..0000000
--- a/CwJ/ODEs/odes.jmd
+++ /dev/null
@@ -1,917 +0,0 @@
# ODEs

This section uses these add-on packages:

```julia
using CalculusWithJulia
using Plots
using SymPy
```

```julia; echo=false; results="hidden"
using CalculusWithJulia.WeaveSupport

const frontmatter = (
    title = "ODEs",
    description = "Calculus with Julia: ODEs",
    tags = ["CalculusWithJulia", "odes", "odes"],
);
nothing
```

----

Some relationships are easiest to describe in terms of rates or derivatives. For example:

* Knowing the speed of a car and how long it has been driving
  determines the car's location.

* One of Newton's famous laws, $F=ma$, describes the force on an
  object of mass $m$ in terms of the acceleration. The acceleration
  is the derivative of velocity, which in turn is the derivative of
  position. So if we know the velocity $v(t)$ or the position $x(t)$, we
  can differentiate to find $F$.

* Newton's law of [cooling](http://tinyurl.com/z4lmetp). This
  describes the temperature change in an object due to a difference in
  temperature with the object's surroundings. The formula is
  $T'(t) = -r \left(T(t) - T_a \right)$, where $T(t)$ is the temperature at time $t$
  and $T_a$ the ambient temperature.

* [Hooke's law](http://tinyurl.com/kbz7r8l) relates the force on an object
  to its position, through $F = -k x$. This is
  appropriate for many systems involving springs. Combined with
  Newton's law $F=ma$, this leads to an equation that $x$ must
  satisfy: $m x''(t) = -k x(t)$.

## Motion with constant acceleration

Let's consider the case of constant acceleration. This describes how nearby objects fall to earth, as the force due to gravity is assumed to be a constant, so the acceleration is the constant force divided by the constant mass.

With constant acceleration, what is the velocity?
- -As mentioned, we have $dv/dt = a$ for any velocity function $v(t)$, but in this case, the right hand side is assumed to be constant. How does this restrict the possible functions, $v(t)$, that the velocity can be? - -Here we can integrate to find that any answer must look like the following for some constant of integration: - -```math -v(t) = \int \frac{dv}{dt} dt = \int a dt = at + C. -``` - -If we are given the velocity at a fixed time, say $v(t_0) = v_0$, then we can use the definite integral to get: - -```math -v(t) - v(t_0) = \int_{t_0}^t a dt = at - a t_0. -``` - -Solving, gives: - -```math -v(t) = v_0 + a (t - t_0). -``` - -This expresses the velocity at time $t$ in terms of the initial velocity, the constant acceleration and the time duration. - -A natural question might be, is this the *only* possible answer? There are a few useful ways to think about this. - -First, suppose there were another, say $u(t)$. Then define $w(t)$ to be the difference: $w(t) = v(t) - u(t)$. We would have that $w'(t) = v'(t) - u'(t) = a - a = 0$. But from the mean value theorem, a function whose derivative is *continuously* $0$, will necessarily be a constant. So at most, $v$ and $u$ will differ by a constant, but if both are equal at $t_0$, they will be equal for all $t$. - -Second, since the derivative of any solution is a continuous function, it is true by the fundamental theorem of calculus that it *must* satisfy the form for the antiderivative. The initial condition makes the answer unique, as the indeterminate $C$ can take only one value. - -Summarizing, we have - -> If ``v(t)`` satisfies the equation: ``v'(t) = a``, ``v(t_0) = v_0,`` -> then the unique solution will be ``v(t) = v_0 + a (t - t_0)``. - - -Next, what about position? Here we know that the time derivative of position yields the velocity, so we should have that the unknown position function satisfies this equation and initial condition: - -```math -x'(t) = v(t) = v_0 + a (t - t_0), \quad x(t_0) = x_0. 
```

Again, we can integrate to get an answer for any value $t$:

```math
\begin{align*}
x(t) - x(t_0) &= \int_{t_0}^t \frac{dx}{dt} dt = \int_{t_0}^t v(t) dt \\
&= (v_0t + \frac{1}{2}a t^2 - at_0 t) \big|_{t_0}^t \\
&= (v_0 - at_0)(t - t_0) + \frac{1}{2} a (t^2 - t_0^2).
\end{align*}
```

There are three constants: the initial value for the independent variable, $t_0$, and the two initial values for the velocity and position, $v_0, x_0$. Assuming $t_0 = 0$, we can simplify the above to get a formula familiar from introductory physics:

```math
x(t) = x_0 + v_0 t + \frac{1}{2} at^2.
```

Again, the mean value theorem can show that, with the initial value specified, this is the only possible solution.

## First-order initial-value problems

The two problems just considered can be summarized by the following. We are looking for solutions to an equation of the form (taking $y$ and $x$ as the variables, in place of $x$ and $t$):

```math
y'(x) = f(x), \quad y(x_0) = y_0.
```

This is called an *ordinary differential equation* (ODE), as it is an equation involving the ordinary derivative of an unknown function, $y$.

This is called a first-order ordinary differential equation, as only the first derivative is involved.

This is called an initial-value problem, as the value at the initial point $x_0$ is specified as part of the problem.

#### Examples

Let's look at a few more examples, and then generalize.

##### Example: Newton's law of cooling

Consider the ordinary differential equation given by Newton's law of cooling:

```math
T'(t) = -r (T(t) - T_a), \quad T(0) = T_0.
```

This equation is also first order, as it involves just the first derivative, but notice that on the right hand side is the function $T$, not the variable being differentiated against, $t$.

As we have a difference on the right hand side, we rename the variable through $U(t) = T(t) - T_a$.
Then, as $U'(t) = T'(t)$, we have the equation:

```math
U'(t) = -r U(t), \quad U(0) = U_0.
```


This shows that the rate of change of $U$ depends on $U$. Large positive values indicate a negative rate of change - a push back towards the origin - and large negative values of $U$ indicate a positive rate of change - again, a push back towards the origin. We shouldn't be surprised, then, to see a steady decay towards the origin.

What will we find? This equation is different from the previous two
equations, as the function $U$ appears on both sides. However, we can
rearrange to get:

```math
\frac{dU}{dt}\frac{1}{U(t)} = -r.
```


This suggests integrating both sides, as before. Here we do the "$u$"-substitution $u = U(t)$, so $du = U'(t) dt$:

```math
-rt + C = \int \frac{dU}{dt}\frac{1}{U(t)} dt = \int \frac{1}{u}du = \log(u).
```

Solving gives: $u = U(t) = e^C e^{-rt}$. Using the initial condition forces $e^C = U(0) = T(0) - T_a$ and so our solution in terms of $T(t)$ is:


```math
T(t) - T_a = (T_0 - T_a) e^{-rt}.
```

In words, the initial difference in temperature of the object and the environment exponentially decays to $0$.

That is, as $t > 0$ goes to $\infty$, the right-hand side will go to $0$ for $r > 0$, so $T(t) \rightarrow T_a$ - the temperature of the object will reach the ambient temperature. The rate of this is largest when the difference between $T(t)$ and $T_a$ is largest, so when objects are cooling the statement "hotter things cool faster" is appropriate.


A graph of the solution for $T_0=200$, $T_a=72$, and $r=1/2$ is made
as follows. We've added a few line segments from the defining formula,
and see that they are indeed tangent to the solution found for the differential equation.
```julia; echo=false
let
    T0, Ta, r = 200, 72, 1/2
    f(u, t) = -r*(u - Ta)
    v(t) = Ta + (T0 - Ta) * exp(-r*t)
    p = plot(v, 0, 6, linewidth=4, legend=false)
    [plot!(p, x -> v(a) + f(v(a), a) * (x-a), 0, 6) for a in 1:2:5]
    p
end
```


The above implicitly assumes that there could be no other
solution than the one we found. Is that really the case? We will see
that there is a theorem that can answer this, but in this case, the
trick of taking the difference of two solutions of the
equation leads to the equation $W'(t) = -r W(t), \text{ and } W(0) =
0$. This equation has a general solution of $W(t) = Ce^{-rt}$ and the
initial condition forces $C=0$, so $W(t) = 0$, as before. Hence, the
initial-value problem for Newton's law of cooling has a unique
solution.



In general, the equation could be written as (again using $y$ and $x$ as the variables):

```math
y'(x) = g(y), \quad y(x_0) = y_0.
```


This is called an *autonomous*, first-order ODE, as the right-hand side does not depend on $x$ (except through ``y(x)``).

Let $F(y) = \int_{y_0}^y du/g(u)$; then a solution to the above is $F(y) = x - x_0$, assuming $1/g(u)$ is integrable.


##### Example: Torricelli's law

[Torricelli's Law](http://tinyurl.com/hxvf3qp) describes the speed at which a jet of water will leave a vessel through an opening below the surface of the water. The formula is $v=\sqrt{2gh}$, where $h$ is the height of the water above the hole and $g$ the gravitational constant. This arises from equating the kinetic energy gained, $1/2 mv^2$, and the potential energy lost, $mgh$, for the exiting water.

An application of Torricelli's law is to describe the volume of water in a tank over time, $V(t)$. Imagine a cylinder of cross-sectional area $A$ with a hole of cross-sectional area $a$ at the bottom. Then $V(t) = A h(t)$, with $h$ giving the height.
The change in volume over $\Delta t$ units of time must be given by the value $a v(t) \Delta t$, or - -```math -V(t+\Delta t) - V(t) = -a v(t) \Delta t = -a\sqrt{2gh(t)}\Delta t -``` - -This suggests the following formula, written in terms of $h(t)$ should apply: - -```math -A\frac{dh}{dt} = -a \sqrt{2gh(t)}. -``` - -Rearranging, this gives an equation - -```math -\frac{dh}{dt} \frac{1}{\sqrt{h(t)}} = -\frac{a}{A}\sqrt{2g}. -``` - -Integrating both sides yields: - -```math -2\sqrt{h(t)} = -\frac{a}{A}\sqrt{2g} t + C. -``` - -If $h(0) = h_0 = V(0)/A$, we can solve for $C = 2\sqrt{h_0}$, or - -```math -\sqrt{h(t)} = \sqrt{h_0} -\frac{1}{2}\frac{a}{A}\sqrt{2g} t. -``` - - -Setting $h(t)=0$ and solving for $t$ shows that the time to drain the tank would be $(2A)/(a\sqrt{2g})\sqrt{h_0}$. - - -##### Example - -Consider now the equation - -```math -y'(x) = y(x)^2, \quad y(x_0) = y_0. -``` - -This is called a *non-linear* ordinary differential equation, as the $y$ variable on the right hand side presents itself in a non-linear form (it is squared). These equations may have solutions that are not defined for all times. - -This particular problem can be solved as before by moving the $y^2$ to the left hand side and integrating to yield: - -```math -y(x) = - \frac{1}{C + x}, -``` - -and with the initial condition: - -```math -y(x) = \frac{y_0}{1 - y_0(x - x_0)}. -``` - -This answer can demonstrate *blow-up*. That is, in a finite range for $x$ values, the $y$ value can go to infinity. For example, if the initial conditions are $x_0=0$ and $y_0 = 1$, then $y(x) = 1/(1-x)$ is only defined for $x \geq x_0$ on $[0,1)$, as at $x=1$ there is a vertical asymptote. - - -## Separable equations - -We've seen equations of the form $y'(x) = f(x)$ and $y'(x) = g(y)$ both solved by integrating. The same tricks will work for equations of the form $y'(x) = f(x) \cdot g(y)$. Such equations are called *separable*. 
Basically, we equate, up to constants,

```math
\int \frac{dy}{g(y)} = \int f(x) dx.
```

For example, suppose we have the equation

```math
\frac{dy}{dx} = x \cdot y(x), \quad y(x_0) = y_0.
```

Then we can find a solution, $y(x)$, through:

```math
\int \frac{dy}{y} = \int x dx,
```

or

```math
\log(y) = \frac{x^2}{2} + C,
```

which yields:

```math
y(x) = e^C e^{\frac{1}{2}x^2}.
```

Substituting in $x_0$ yields a value for $C$ in terms of the initial information $y_0$ and $x_0$.


## Symbolic solutions

Differential equations are classified according to their type. Different types have different methods for solution, when a solution exists.

The first-order initial value equations we have seen can be described generally by

```math
\begin{align*}
y'(x) &= F(y,x),\\
y(x_0) &= y_0.
\end{align*}
```

Special cases include:

* *linear* if the function $F$ is linear in $y$;
* *autonomous* if $F(y,x) = G(y)$ (a function of $y$ alone);
* *separable* if $F(y,x) = G(y)H(x)$.

As seen, separable equations are approached by moving the "$y$" terms to one side, the "$x$" terms to the other, and integrating. This applies to autonomous equations as well. There are other families of equation types that have exact solutions, and techniques for solution, summarized at this [Wikipedia page](http://tinyurl.com/zywzz4q).

Rather than go over these various families, we demonstrate that `SymPy` can solve many of these equations symbolically.


The `solve` function in `SymPy` solves equations for unknown
*variables*. As a differential equation involves an unknown *function*,
there is a different function, `dsolve`. The basic idea is to describe
the differential equation using a symbolic function and then call
`dsolve` to solve the expression.
- -Symbolic functions are defined by the `@syms` macro (also see `?symbols`) using parentheses to distinguish a function from a variable: - -```julia; -@syms x u() # a symbolic variable and a symbolic function -``` - - -We will solve the following, known as the *logistic equation*: - -```math -u'(x) = a u(1-u), \quad a > 0 -``` - -Before beginning, we look at the form of the equation. When $u=0$ or -$u=1$ the rate of change is $0$, so we expect the function might be -bounded within that range. If not, when $u$ gets bigger than $1$, then -the slope is negative and when $u$ gets less than $0$, the slope is -positive, so there will at least be a drift back to the range -$[0,1]$. Let's see exactly what happens. We define a parameter, -restricting `a` to be positive: - - - -```julia; -@syms a::positive -``` - - -To specify a derivative of `u` in our equation we can use `diff(u(x),x)` but here, for visual simplicity, use the `Differential` operator, as follows: - -```julia; -D = Differential(x) -eqn = D(u)(x) ~ a * u(x) * (1 - u(x)) # use l \Equal[tab] r, Eq(l,r), or just l - r -``` - -In the above, we evaluate the symbolic function at the variable `x` -through the use of `u(x)` in the expression. The equation above uses `~` to combine the left- and right-hand sides as an equation in `SymPy`. (A unicode equals is also available for this task). This is a shortcut for `Eq(l,r)`, but even just using `l - r` would suffice, as the default assumption for an equation is that it is set to `0`. - -The `Differential` operation is borrowed from the `ModelingToolkit` package, which will be introduced later. - - -To finish, we call `dsolve` to find a solution (if possible): - -```julia; -out = dsolve(eqn) -``` - -This answer - to a first-order equation - has one free constant, -`C_1`, which can be solved for from an initial condition. 
We can see
that when $a > 0$, as $x$ goes to positive infinity the solution goes
to $1$, and when $x$ goes to negative infinity, the solution goes to $0$,
and otherwise it is trapped in between, as expected.

The limits are confirmed by investigating the limits of the right-hand side:

```julia;
limit(rhs(out), x => oo), limit(rhs(out), x => -oo)
```

We can confirm that the solution is always increasing, hence trapped within ``[0,1]``, by observing that the derivative is positive when `C₁` is positive:

```julia;
diff(rhs(out), x)
```



Suppose that $u(0) = 1/2$. Can we solve for $C_1$ symbolically? We can use `solve`, but first we will need to get the symbol for `C_1`:

```julia;
eq = rhs(out) # just the right hand side
C1 = first(setdiff(free_symbols(eq), (x, a))) # fish out the constant; it is not x or a
c1 = solve(eq(x => 0) - 1//2, C1)
```

And we plug in with:

```julia;
eq(C1 => c1[1])
```

That's a lot of work. The `dsolve` function in `SymPy` allows initial conditions to be specified for some equations. In this case, ours are $x_0=0$ and $y_0=1/2$. The extra arguments are passed in through a dictionary to the `ics` argument:

```julia;
x0, y0 = 0, Sym(1//2)
dsolve(eqn, u(x), ics=Dict(u(x0) => y0))
```

(The one subtlety is the need to write the rational value as a symbolic expression, as otherwise it will get converted to a floating point value prior to being passed along.)

##### Example: Hooke's law


In the first example, we solved for position, $x(t)$, from an assumption of constant acceleration in two steps. The equation relating the two is a second-order equation, $x''(t) = a$, so two constants are generated. That a second-order equation could be reduced to two first-order equations is no happy circumstance, as it can always be done. Rather than show the technique though, we demonstrate that `SymPy` can also handle some second-order ODEs.
Hooke's law relates the force on an object to its position via $F=ma = -kx$, or $x''(t) = -(k/m)x(t)$.

Suppose $k > 0$. Then we can solve, similar to the above, with:

```julia;
@syms k::positive m::positive
D2 = D ∘ D # takes the second derivative through composition
eqnh = D2(u)(x) ~ -(k/m)*u(x)
dsolve(eqnh)
```

Here we find two constants, as anticipated, for we would guess that
two integrations are needed in the solution.

Suppose the spring were started by pulling it down to a bottom point and
releasing. The initial position at time $0$ would be $-a$, say, and the
initial velocity $0$. Here we get the solution specifying initial
conditions on the function and its derivative (expressed through
`u'`):

```julia;
dsolve(eqnh, u(x), ics = Dict(u(0) => -a, D(u)(0) => 0))
```


We get that the motion will follow
$u(x) = -a \cos(\sqrt{k/m}x)$. This is simple oscillatory behavior. As the spring stretches, the force gets large enough to pull it back, and as it compresses the force gets large enough to push it back. The amplitude of this oscillation is $a$ and the period is $2\pi/\sqrt{k/m}$. Larger $k$ values mean shorter periods; larger $m$ values mean longer periods.


##### Example: the pendulum

The simple gravity [pendulum](http://tinyurl.com/h8ys6ts) is an idealization of a physical pendulum that models a "bob" with mass $m$ swinging on a massless rod of length $l$ in a frictionless world governed only by the gravitational constant $g$. The motion can be described by this differential equation for the angle, $\theta$, made from the vertical:

```math
\theta''(t) + \frac{g}{l}\sin(\theta(t)) = 0
```

Can this second-order equation be solved by `SymPy`?

```julia;
@syms g::positive l::positive theta()=>"θ"
eqnp = D2(theta)(x) + g/l*sin(theta(x))
```

Trying to do so can cause `SymPy` to hang or simply give up and repeat its input; no easy answer is forthcoming for this equation.
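Even without a symbolic solution, specific quantities can be computed to high accuracy. For release from rest at angle $a$, the exact period is $T = 4\sqrt{l/g}\cdot K(\sin(a/2))$, where $K$ is the complete elliptic integral of the first kind, computable through the arithmetic-geometric mean. A sketch, with the illustrative values $l=5$ and $g=9.8$:

```julia
# Exact pendulum period T = 4*sqrt(l/g)*K(sin(a/2)), with K the complete
# elliptic integral of the first kind, computed via the AGM identity
# K(k) = pi / (2 * AGM(1, sqrt(1 - k^2))). The values of l, g are illustrative.
function pendulum_period(a; l=5, g=9.8)
    k = sin(a/2)
    x, y = 1.0, sqrt(1 - k^2)
    while abs(x - y) > 1e-14          # AGM converges quadratically
        x, y = (x + y)/2, sqrt(x*y)
    end
    K = pi / (2x)
    4 * sqrt(l/g) * K
end

pendulum_period(pi/4), 2pi*sqrt(5/9.8)   # exact period vs. small-angle period
```

For $a=\pi/4$ the period comes out about $4\%$ longer than the small-angle value $2\pi\sqrt{l/g}$, quantifying how far the linearization is off.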
In general, for the first-order initial value problem characterized by
$y'(x) = F(y,x)$, there are conditions
([Peano](http://tinyurl.com/h663wba) and
[Picard-Lindelof](http://tinyurl.com/3rbde5e)) that can guarantee the
existence (and uniqueness) of a solution locally, but there may not be
an accompanying method to actually find it. This particular problem
has a solution, but it can not be written in terms of elementary
functions.

However, as [Huygens](https://en.wikipedia.org/wiki/Christiaan_Huygens) first noted, if the angles involved are small, then we can approximate the solution through the linearization $\sin(\theta(t)) \approx \theta(t)$. The resulting equation for an approximate answer is just that of Hooke:


```math
\theta''(t) + \frac{g}{l}\theta(t) = 0
```

Here, the solution is in terms of sines and cosines, with period given by $T = 2\pi/\sqrt{k} = 2\pi\cdot\sqrt{l/g}$, where $k = g/l$. The answer does not depend on the mass, $m$, of the bob nor the amplitude of the motion, provided the small-angle approximation is valid.

If we pull the bob back an angle $a$ and release it, then the initial conditions are $\theta(0) = a$ and $\theta'(0) = 0$. This gives the solution:

```julia;
eqnp₁ = D2(u)(x) + g/l * u(x)
dsolve(eqnp₁, u(x), ics=Dict(u(0) => a, D(u)(0) => 0))
```


##### Example: hanging cables

A chain hangs between two supports a distance $L$ apart. What shape
will it take if there are no forces outside of gravity acting on it?
What if the force is uniform along the length of the chain, as with a
suspension bridge? How will the shape differ then?

Let $y(x)$ describe the chain at position $x$, with $0 \leq x \leq L$,
say. We consider first the case of the chain with no force save
gravity. Let $w(x)$ be the density of the chain at $x$, taken below to be a constant.

The chain is in equilibrium, so the tension, $T(x)$, in the chain will be
Let $V$ be the vertical component -and $H$ the horizontal component. With only gravity acting on the -chain, the value of $H$ will be a constant. The value of $V$ will vary -with position. - -At a point $x$, there is $s(x)$ amount of chain with weight $w \cdot s(x)$. The tension is in the direction of the tangent line, so: - -```math -\tan(\theta) = y'(x) = \frac{w s(x)}{H}. -``` - -In terms of an increment of chain, we have: - -```math -\frac{w ds}{H} = d(y'(x)). -``` - -That is, the ratio of the vertical and horizontal tensions in the increment are in balance with the differential of the derivative. - - -But $ds = \sqrt{dx^2 + dy^2} = \sqrt{dx^2 + y'(x)^2 dx^2} = \sqrt{1 + y'(x)^2}dx$, so we can simplify to: - - -```math -\frac{w}{H}\sqrt{1 + y'(x)^2}dx =y''(x)dx. -``` - -This yields the second-order equation: - -```math -y''(x) = \frac{w}{H} \sqrt{1 + y'(x)^2}. -``` - -We enter this into `Julia`: - -```julia; -@syms w::positive H::positive y() -eqnc = D2(y)(x) ~ (w/H) * sqrt(1 + y'(x)^2) -``` - -Unfortunately, `SymPy` needs a bit of help with this problem, by breaking the problem into -steps. - -For the first step we solve for the derivative. Let $u = y'$, -then we have $u'(x) = (w/H)\sqrt{1 + u(x)^2}$: - -```julia; -eqnc₁ = subs(eqnc, D(y)(x) => u(x)) -``` - -and can solve via: - -```julia; -outc = dsolve(eqnc₁) -``` - -So $y'(x) = u(x) = \sinh(C_1 + w \cdot x/H)$. This can be solved by direct -integration as there is no $y(x)$ term on the right hand -side. - -```julia; -D(y)(x) ~ rhs(outc) -``` - -We see a simple linear transformation involving the hyperbolic sine. To avoid, `SymPy` struggling with the above equation, and knowing the hyperbolic sine is the derivative of the hyperbolic cosine, we anticipate an answer and verify it: - -```julia; -yc = (H/w)*cosh(C1 + w*x/H) -diff(yc, x) == rhs(outc) # == not \Equal[tab] -``` - -The shape is a hyperbolic cosine, known as the catenary. 
```julia; echo=false
imgfile = "figures/verrazano-narrows-bridge-anniversary-historic-photos-2.jpeg"
caption = """
The cables of an unloaded suspension bridge have a different shape than a loaded suspension bridge. As seen, the cables in this [figure](https://www.brownstoner.com/brooklyn-life/verrazano-narrows-bridge-anniversary-historic-photos/) would be modeled by a catenary.
"""
ImageFile(:ODEs, imgfile, caption)
```


----

If the chain has a uniform load -- like a suspension bridge with a deck -- sufficient to make the weight of the chain negligible, then how does the above change? The vertical tension now comes from $U dx$ and not $w ds$, so the equation becomes instead:

```math
\frac{U dx}{H} = d(y'(x)).
```

This gives $y''(x) = U/H$, a constant, so the answer will be a parabola.


##### Example: projectile motion in a medium

The first example describes projectile motion without air resistance. If we use $(x(t), y(t))$ to describe position at time $t$, the functions satisfy:

```math
x''(t) = 0, \quad y''(t) = -g.
```

That is, the $x$ position - where no forces act - has $0$ acceleration, and the $y$ position - where the force of gravity acts - has constant acceleration, $-g$, where $g=9.8 m/s^2$ is the acceleration due to gravity. These equations can be solved to give:

```math
x(t) = x_0 + v_0 \cos(\alpha) t, \quad y(t) = y_0 + v_0\sin(\alpha)t - \frac{1}{2}g \cdot t^2.
```

Furthermore, we can solve for $t$ from $x(t)$, to get an equation describing $y(x)$.
Here are all the steps:

```julia; hold=true
@syms x0::real y0::real v0::real alpha::real g::real
@syms t x u()
a1 = dsolve(D2(u)(x) ~ 0, u(x), ics=Dict(u(0) => x0, D(u)(0) => v0 * cos(alpha)))
a2 = dsolve(D2(u)(x) ~ -g, u(x), ics=Dict(u(0) => y0, D(u)(0) => v0 * sin(alpha)))
ts = solve(t - rhs(a1), x)[1]
y = simplify(rhs(a2)(t => ts))
sympy.Poly(y, x).coeffs()
```

Though `y` is messy, it can be seen that the answer is a quadratic polynomial in $x$, yielding the familiar parabolic motion for a trajectory. The output shows the coefficients.

In a resistive medium, there are drag forces at play. If this force is proportional to the velocity, say with proportion $\gamma$, then the equations become:

```math
\begin{align*}
x''(t) &= -\gamma x'(t), & \quad y''(t) &= -\gamma y'(t) -g, \\
x(0) &= x_0, &\quad y(0) &= y_0,\\
x'(0) &= v_0\cos(\alpha),&\quad y'(0) &= v_0 \sin(\alpha).
\end{align*}
```

We now attempt to solve these.

```julia
@syms alpha::real, γ::positive, t::positive, v()
@syms x_0::real y_0::real v_0::real
Dₜ = Differential(t)
eq₁ = Dₜ(Dₜ(u))(t) ~ - γ * Dₜ(u)(t)
eq₂ = Dₜ(Dₜ(v))(t) ~ -g - γ * Dₜ(v)(t)

a₁ = dsolve(eq₁, ics=Dict(u(0) => x_0, Dₜ(u)(0) => v_0 * cos(alpha)))
a₂ = dsolve(eq₂, ics=Dict(v(0) => y_0, Dₜ(v)(0) => v_0 * sin(alpha)))

ts = solve(x - rhs(a₁), t)[1]
yᵣ = rhs(a₂)(t => ts)
```

This gives $y$ as a function of $x$.

There are a lot of symbols. Let's simplify by using the constants $x_0=y_0=0$:

```julia;
yᵣ₁ = yᵣ(x_0 => 0, y_0 => 0)
```

What is the trajectory? We see that the `log` function part will have issues when $-\gamma x + v_0 \cos(\alpha) = 0$.

If we fix some parameters, we can plot:

```julia;
v₀, γ₀, α = 200, 1/2, pi/4
soln = yᵣ₁(v_0=>v₀, γ=>γ₀, alpha=>α, g=>9.8)
plot(soln, 0, v₀ * cos(α) / γ₀ - 1/10, legend=false)
```

We can see that the resistance makes the path quite non-symmetric.
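The horizontal equation can also be integrated by hand: $x'' = -\gamma x'$ with $x(0) = 0$ and $x'(0) = v_0\cos(\alpha)$ gives $x(t) = \frac{v_0\cos(\alpha)}{\gamma}(1 - e^{-\gamma t})$, which explains the vertical asymptote in the trajectory at $x = v_0\cos(\alpha)/\gamma$. A quick standalone finite-difference sketch in base `Julia`, using the same sample parameters as the plot, checks this formula:

```julia
# Standalone check that x(t) = x₀ + (v₀*cos(α)/γ)*(1 - exp(-γt)) solves x'' = -γ x'.
x₀, v₀, α, γₓ = 0.0, 200.0, pi/4, 1/2     # sample parameters from the plot above
xdrag(t) = x₀ + v₀ * cos(α) / γₓ * (1 - exp(-γₓ * t))

h = 1e-5                                  # central-difference step
xd′(t) = (xdrag(t + h) - xdrag(t - h)) / (2h)
xd′′(t) = (xdrag(t + h) - 2*xdrag(t) + xdrag(t - h)) / h^2

abs(xd′′(1.0) + γₓ * xd′(1.0)) < 1e-2     # ≈ 0, up to finite-difference error
```

As $t\rightarrow\infty$, `xdrag(t)` approaches $v_0\cos(\alpha)/\gamma$, matching the issue with the `log` term noted above.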
## Visualizing a first-order initial value problem

The solution, $y(x)$, is known through its derivative. A useful tool to visualize the solution to a first-order differential equation is the [slope field](http://tinyurl.com/jspzfok) (or direction field) plot, which, at different values of $(x,y)$, plots a vector with slope given by $y'(x)$. The `vectorfieldplot` function of the `CalculusWithJulia` package can be used to produce these.

For example, in a previous example we found a solution to $y'(x) = x\cdot y(x)$, coded as

```julia
F(y, x) = y*x
```

Suppose $x_0=1$ and $y_0=1$. Then a direction field plot is drawn through:

```julia; hold=true
@syms x y
x0, y0 = 1, 1

plot(legend=false)
vectorfieldplot!((x,y) -> [1, F(y,x)], xlims=(x0, 2), ylims=(y0-5, y0+5))

f(x) = y0*exp(-x0^2/2) * exp(x^2/2)
plot!(f, linewidth=5)
```

In general, if the first-order equation is written as $y'(x) = F(y,x)$, then we plot a "function" that takes $(x,y)$ and returns an $x$ value of $1$ and a $y$ value of $F(y,x)$, so the slope is $F(y,x)$.

!!! note
    The order of variables in $F(y,x)$ is conventional with the equation $y'(x) = F(y(x),x)$.

The plots are also useful for illustrating solutions for different initial conditions:

```julia; hold=true
p = plot(legend=false)
x0, y0 = 1, 1

vectorfieldplot!((x,y) -> [1,F(y,x)], xlims=(x0, 2), ylims=(y0-5, y0+5))
for y0 in -4:4
    f(x) = y0*exp(-x0^2/2) * exp(x^2/2)
    plot!(f, x0, 2, linewidth=5)
end
p
```

Such solutions are called [integral curves](https://en.wikipedia.org/wiki/Integral_curve). These graphs illustrate the fact that the slope field is tangent to the graph of any integral curve.


## Questions

##### Question

Using `SymPy` to solve the differential equation

```math
u' = \frac{1-x}{u}
```

gives

```julia; hold=true
@syms x u()
dsolve(D(u)(x) - (1-x)/u(x))
```

The two answers track positive and negative solutions.
For the initial condition, $u(-1)=1$, the second is appropriate: $u(x) = \sqrt{C_1 - x^2 + 2x}$. At $-1$ this gives $1 = \sqrt{C_1-3}$, so $C_1 = 4$.

For what values of $x$ is this solution valid?

```julia; hold=true; echo=false
choices = [
"``[-1, \\infty)``",
"``[-1, 4]``",
"``[-1, 0]``",
"``[1-\\sqrt{5}, 1 + \\sqrt{5}]``"]
answ = 4
radioq(choices, answ)
```


##### Question

Suppose $y(x)$ satisfies

```math
y'(x) = y(x)^2, \quad y(1) = 1.
```

What is $y(3/2)$?

```julia; hold=true; echo=false
@syms x u()
out = dsolve(D(u)(x) - u(x)^2, u(x), ics=Dict(u(1) => 1))
val = N(rhs(out(3/2)))
numericq(val)
```

##### Question

Solve the initial value problem

```math
y' = 1 + x^2 + y(x)^2 + x^2 y(x)^2, \quad y(0) = 1.
```

Use your answer to find $y(1)$.

```julia; hold=true; echo=false
eqn = D(u)(x) - (1 + x^2 + u(x)^2 + x^2 * u(x)^2)
out = dsolve(eqn, u(x), ics=Dict(u(0) => 1))
val = N(rhs(out)(1).evalf())
numericq(val)
```

##### Question

A population is modeled by $y(x)$. The rate of population growth is generally proportional to the population ($k y(x)$), but as the population gets large, the rate is curtailed by the factor $(1 - y(x)/M)$.

Solve the initial value problem

```math
y'(x) = k\cdot y(x) \cdot (1 - \frac{y(x)}{M}),
```

when $k=1$, $M=100$, and $y(0) = 20$. Find the value of $y(5)$.

```julia; hold=true;echo=false
k, M = 1, 100
eqn = D(u)(x) - k * u(x) * (1 - u(x)/M)
out = dsolve(eqn, u(x), ics=Dict(u(0) => 20))
val = N(rhs(out)(5))
numericq(val)
```


##### Question

Solve the initial value problem

```math
y'(t) = \sin(t) - \frac{y(t)}{t}, \quad y(\pi) = 1.
```

Find the value of the solution at $t=2\pi$.

```julia; hold=true; echo=false
eqn = D(u)(x) - (sin(x) - u(x)/x)
out = dsolve(eqn, u(x), ics=Dict(u(PI) => 1))
val = N(rhs(out(2PI)))
numericq(val)
```


##### Question

Suppose $u(x)$ satisfies:

```math
\frac{du}{dx} = e^{-x} \cdot u(x), \quad u(0) = 1.
-``` - -Find $u(5)$ using `SymPy`. - -```julia; hold=true; echo=false -eqn = D(u)(x) - exp(-x)*u(x) -out = dsolve(eqn, u(x), ics=Dict(u(0) => 1)) -val = N(rhs(out)(5)) -numericq(val) -``` - -##### Question - -The differential equation with boundary values - -```math -\frac{r^2 \frac{dc}{dr}}{dr} = 0, \quad c(1)=2, c(10)=1, -``` - -can be solved with `SymPy`. What is the value of $c(5)$? - -```julia; hold=true; echo=false -@syms x u() -eqn = diff(x^2*D(u)(x), x) -out = dsolve(eqn, u(x), ics=Dict(u(1)=>2, u(10) => 1)) |> rhs -out(5) # 10/9 -choices = ["``10/9``", "``3/2``", "``9/10``", "``8/9``"] -answ = 1 -radioq(choices, answ) -``` - - -##### Question - -The example with projectile motion in a medium has a parameter -$\gamma$ modeling the effect of air resistance. If `y` is the -answer - as would be the case if the example were copy-and-pasted -in - what can be said about `limit(y, gamma=>0)`? - -```julia; hold=true; echo=false -choices = [ -"The limit is a quadratic polynomial in `x`, mirroring the first part of that example.", -"The limit does not exist, but the limit to `oo` gives a quadratic polynomial in `x`, mirroring the first part of that example.", -"The limit does not exist -- there is a singularity -- as seen by setting `gamma=0`." 
-] -answ = 1 -radioq(choices, answ) -``` diff --git a/CwJ/ODEs/process.jl b/CwJ/ODEs/process.jl deleted file mode 100644 index 067dff1..0000000 --- a/CwJ/ODEs/process.jl +++ /dev/null @@ -1,27 +0,0 @@ -using WeavePynb -using Mustache - -mmd(fname) = mmd_to_html(fname, BRAND_HREF="../toc.html", BRAND_NAME="Calculus with Julia") -## uncomment to generate just .md files -mmd(fname) = mmd_to_md(fname, BRAND_HREF="../toc.html", BRAND_NAME="Calculus with Julia") - -fnames = [ - "odes", - "euler" - ] - - - -function process_file(nm, twice=false) - include("$nm.jl") - mmd_to_md("$nm.mmd") - markdownToHTML("$nm.md") - twice && markdownToHTML("$nm.md") -end - -process_files(twice=false) = [process_file(nm, twice) for nm in fnames] - -""" -## TODO ODEs - -""" diff --git a/CwJ/ODEs/solve.jmd b/CwJ/ODEs/solve.jmd deleted file mode 100644 index 8b63729..0000000 --- a/CwJ/ODEs/solve.jmd +++ /dev/null @@ -1,248 +0,0 @@ -# The problem-algorithm-solve interface - -This section uses these add-on packages: - -```julia -using Plots -using MonteCarloMeasurements -``` - -```julia; echo=false; results="hidden" -using CalculusWithJulia.WeaveSupport - -const frontmatter = ( - title = "The problem-algorithm-solve interface", - description = "Calculus with Julia: The problem-algorithm-solve interface", - tags = ["CalculusWithJulia", "odes", "the problem-algorithm-solve interface"], -); -fig_size = (800, 600) -nothing -``` - ----- - - -The [DifferentialEquations.jl](https://github.com/SciML) package is an entry point to a suite of `Julia` packages for numerically solving differential equations in `Julia` and other languages. A common interface is implemented that flexibly adjusts to the many different problems and algorithms covered by this suite of packages. 
In this section, we review a very informative [post](https://discourse.julialang.org/t/function-depending-on-the-global-variable-inside-module/64322/10) by discourse user `@genkuroki` which very nicely demonstrates the usefulness of the problem-algorithm-solve approach used with `DifferentialEquations.jl`. We slightly modify the presentation below for our needs, but suggest a perusal of the original post. - -##### Example: FreeFall - -The motion of an object under a uniform gravitational field is of interest. - -The parameters that govern the equation of motions are the gravitational constant, `g`; the initial height, `y0`; and the initial velocity, `v0`. The time span for which a solution is sought is `tspan`. - -A problem consists of these parameters. Typical `Julia` usage would be to create a structure to hold the parameters, which may be done as follows: - -```julia -struct Problem{G, Y0, V0, TS} - g::G - y0::Y0 - v0::V0 - tspan::TS -end - -Problem(;g=9.80665, y0=0.0, v0=30.0, tspan=(0.0,8.0)) = Problem(g, y0, v0, tspan) -``` - -The above creates a type, `Problem`, *and* a default constructor with default values. (The original uses a more sophisticated setup that allows the two things above to be combined.) - -Just calling `Problem()` will create a problem suitable for the earth, passing different values for `g` would be possible for other planets. - - -To solve differential equations there are many different possible algorithms. Here is the construction of two types to indicate two algorithms: - -```julia -struct EulerMethod{T} - dt::T -end -EulerMethod(; dt=0.1) = EulerMethod(dt) - -struct ExactFormula{T} - dt::T -end -ExactFormula(; dt=0.1) = ExactFormula(dt) -``` - -The above just specifies a type for dispatch --- the directions indicating what code to use to solve the problem. As seen, each specifies a size for a time step with default of `0.1`. - -A type for solutions is useful for different `show` methods or other methods. 
One can be created through: - - -```julia -struct Solution{Y, V, T, P<:Problem, A} - y::Y - v::V - t::T - prob::P - alg::A -end -``` - -The different algorithms then can be implemented as part of a generic `solve` function. Following the post we have: - -```julia -solve(prob::Problem) = solve(prob, default_algorithm(prob)) -default_algorithm(prob::Problem) = EulerMethod() - -function solve(prob::Problem, alg::ExactFormula) - g, y0, v0, tspan = prob.g, prob.y0, prob.v0, prob.tspan - dt = alg.dt - t0, t1 = tspan - t = range(t0, t1 + dt/2; step = dt) - - y(t) = y0 + v0*(t - t0) - g*(t - t0)^2/2 - v(t) = v0 - g*(t - t0) - - Solution(y.(t), v.(t), t, prob, alg) -end - -function solve(prob::Problem, alg::EulerMethod) - g, y0, v0, tspan = prob.g, prob.y0, prob.v0, prob.tspan - dt = alg.dt - t0, t1 = tspan - t = range(t0, t1 + dt/2; step = dt) - - n = length(t) - y = Vector{typeof(y0)}(undef, n) - v = Vector{typeof(v0)}(undef, n) - y[1] = y0 - v[1] = v0 - - for i in 1:n-1 - v[i+1] = v[i] - g*dt # F*h step of Euler - y[i+1] = y[i] + v[i]*dt # F*h step of Euler - end - - Solution(y, v, t, prob, alg) -end -``` - -The post has a more elegant means to unpack the parameters from the structures, but for each of the above, the parameters are unpacked, and then the corresponding algorithm employed. As of version `v1.7` of `Julia`, the syntax `(;g,y0,v0,tspan) = prob` could also be employed. - - -The exact formulas, ` y(t) = y0 + v0*(t - t0) - g*(t - t0)^2/2` and `v(t) = v0 - g*(t - t0)`, follow from well-known physics formulas. Each answer is wrapped in a `Solution` type so that the answers found can be easily extracted in a uniform manner. 
For example, plots of each can be obtained through:

```julia
earth = Problem()
sol_euler = solve(earth)
sol_exact = solve(earth, ExactFormula())

plot(sol_euler.t, sol_euler.y;
     label="Euler's method (dt = $(sol_euler.alg.dt))", ls=:auto)
plot!(sol_exact.t, sol_exact.y; label="exact solution", ls=:auto)
title!("On the Earth"; xlabel="t", legend=:bottomleft)
```

Following the post, since the time step `dt = 0.1` is not small enough, the error of the Euler method is rather large. Next we change the algorithm parameter, `dt`, to be smaller:

```julia
earth₂ = Problem()
sol_euler₂ = solve(earth₂, EulerMethod(dt = 0.01))
sol_exact₂ = solve(earth₂, ExactFormula())

plot(sol_euler₂.t, sol_euler₂.y;
     label="Euler's method (dt = $(sol_euler₂.alg.dt))", ls=:auto)
plot!(sol_exact₂.t, sol_exact₂.y; label="exact solution", ls=:auto)
title!("On the Earth"; xlabel="t", legend=:bottomleft)
```

It is worth noting that only the call to `solve` is modified: the problem is unchanged; only the algorithm requires modification.

Were the moon to be considered, the value of `g` would need adjustment. This parameter is part of the problem, not the solution algorithm.

Such adjustments are made by passing different values to the `Problem` constructor:

```julia
moon = Problem(g = 1.62, tspan = (0.0, 40.0))
sol_eulerₘ = solve(moon)
sol_exactₘ = solve(moon, ExactFormula(dt = sol_euler.alg.dt))

plot(sol_eulerₘ.t, sol_eulerₘ.y;
     label="Euler's method (dt = $(sol_eulerₘ.alg.dt))", ls=:auto)
plot!(sol_exactₘ.t, sol_exactₘ.y; label="exact solution", ls=:auto)
title!("On the Moon"; xlabel="t", legend=:bottomleft)
```

The code above also adjusts the time span in addition to the gravitational constant. The exact-formula algorithm is set to use the same `dt` value as the Euler method, for easier comparison. Otherwise, outside of the labels, the patterns are the same. Only those things that need changing are changed; the rest comes from defaults.
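The effect of shrinking `dt` can be quantified. The following standalone sketch (plain `Julia`, not reusing the `Problem`/`Solution` types, with the same default parameters as `Problem()`) confirms that the Euler error at the final time scales linearly with `dt`, as expected for a first-order method:

```julia
# Euler's method for free fall (v' = -g, y' = v) has global error O(dt),
# so shrinking dt by a factor of 10 should shrink the error by about 10.
function euler_y(dt; g = 9.80665, y0 = 0.0, v0 = 30.0, T = 8.0)
    y, v = y0, v0
    t = 0.0
    while t < T - dt/2
        v, y = v - g*dt, y + v*dt   # same update order as the `solve` method above
        t += dt
    end
    y
end

exact_y(T; g = 9.80665, y0 = 0.0, v0 = 30.0) = y0 + v0*T - g*T^2/2

e₁ = abs(euler_y(0.1)  - exact_y(8.0))
e₂ = abs(euler_y(0.01) - exact_y(8.0))
e₁ / e₂    # ≈ 10, the hallmark of a first-order method
```

For this linear problem the error at time $T$ is exactly $gT\cdot dt/2$, so the ratio is essentially $10$.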
- -The above shows the benefits of using a common interface. Next, the post illustrates how *other* authors could extend this code, simply by adding a *new* `solve` method. For example, - -```julia -struct Symplectic2ndOrder{T} - dt::T -end -Symplectic2ndOrder(;dt=0.1) = Symplectic2ndOrder(dt) - -function solve(prob::Problem, alg::Symplectic2ndOrder) - g, y0, v0, tspan = prob.g, prob.y0, prob.v0, prob.tspan - dt = alg.dt - t0, t1 = tspan - t = range(t0, t1 + dt/2; step = dt) - - n = length(t) - y = Vector{typeof(y0)}(undef, n) - v = Vector{typeof(v0)}(undef, n) - y[1] = y0 - v[1] = v0 - - for i in 1:n-1 - ytmp = y[i] + v[i]*dt/2 - v[i+1] = v[i] - g*dt - y[i+1] = ytmp + v[i+1]*dt/2 - end - - Solution(y, v, t, prob, alg) -end -``` - -Had the two prior methods been in a package, the other user could still extend the interface, as above, with just a slight standard modification. - - -The same approach works for this new type: - -```julia - -earth₃ = Problem() -sol_sympl₃ = solve(earth₃, Symplectic2ndOrder(dt = 2.0)) -sol_exact₃ = solve(earth₃, ExactFormula()) - -plot(sol_sympl₃.t, sol_sympl₃.y; label="2nd order symplectic (dt = $(sol_sympl₃.alg.dt))", ls=:auto) -plot!(sol_exact₃.t, sol_exact₃.y; label="exact solution", ls=:auto) -title!("On the Earth"; xlabel="t", legend=:bottomleft) -``` - -Finally, the author of the post shows how the interface can compose with other packages in the `Julia` package ecosystem. 
This example uses the external package `MonteCarloMeasurements`, whose `±` number type propagates uncertainty, so the plots show the behavior of the system under perturbations of the initial values:

```julia

earth₄ = Problem(y0 = 0.0 ± 0.0, v0 = 30.0 ± 1.0)
sol_euler₄ = solve(earth₄)
sol_sympl₄ = solve(earth₄, Symplectic2ndOrder(dt = 2.0))
sol_exact₄ = solve(earth₄, ExactFormula())

ylim = (-100, 60)
P = plot(sol_euler₄.t, sol_euler₄.y;
         label="Euler's method (dt = $(sol_euler₄.alg.dt))", ls=:auto)
title!("On the Earth"; xlabel="t", legend=:bottomleft, ylim)

Q = plot(sol_sympl₄.t, sol_sympl₄.y;
         label="2nd order symplectic (dt = $(sol_sympl₄.alg.dt))", ls=:auto)
title!("On the Earth"; xlabel="t", legend=:bottomleft, ylim)

R = plot(sol_exact₄.t, sol_exact₄.y; label="exact solution", ls=:auto)
title!("On the Earth"; xlabel="t", legend=:bottomleft, ylim)

plot(P, Q, R; size=(720, 600))
```

The only change was in the problem, `Problem(y0 = 0.0 ± 0.0, v0 = 30.0 ± 1.0)`, where a different number type is used which accounts for uncertainty. The rest follows the same pattern.

This example shows the flexibility of the problem-algorithm-solve pattern while maintaining a consistent interface for execution.
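To demystify what the `±` type is doing, here is a crude base-`Julia` version of the same computation (a sketch of the idea only, not how `MonteCarloMeasurements` is implemented): sample the uncertain initial velocity, push each sample through the exact formula, and summarize the spread at $t = 8$:

```julia
using Statistics   # standard library

# v0 = 30.0 ± 1.0 approximated by sampling; y0, g, and T as in `Problem()`.
g, y0, T = 9.80665, 0.0, 8.0
v0s = [30.0 + randn() for _ in 1:10_000]
ys  = [y0 + v0*T - g*T^2/2 for v0 in v0s]

mean(ys), std(ys)   # mean near 240 - g*32 ≈ -73.8; std near 1.0 * T = 8.0
```

Because $y$ is linear in $v_0$, a spread of $\pm 1$ in the initial velocity becomes a spread of roughly $\pm T$ in the final height, which is what the uncertain plots display.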
diff --git a/CwJ/Project.toml b/CwJ/Project.toml deleted file mode 100644 index 81648c0..0000000 --- a/CwJ/Project.toml +++ /dev/null @@ -1 +0,0 @@ -[deps] diff --git a/CwJ/TODO/AD.md b/CwJ/TODO/AD.md deleted file mode 100644 index 8d6c12e..0000000 --- a/CwJ/TODO/AD.md +++ /dev/null @@ -1,3 +0,0 @@ -Good paper recommended here (https://discourse.julialang.org/t/learning-automatic-differentiation/56158/3) - -https://www.jmlr.org/papers/volume18/17-468/17-468.pdf diff --git a/CwJ/TODO/arrows.md b/CwJ/TODO/arrows.md deleted file mode 100644 index 1f9214a..0000000 --- a/CwJ/TODO/arrows.md +++ /dev/null @@ -1,61 +0,0 @@ -This is really just - -plot!([0,cos(θ)],[0,sin(θ)], arrow=true) - - -https://stackoverflow.com/questions/58219191/drawing-an-arrow-with-specified-direction-on-a-point-in-scatter-plot-in-julia - -https://github.com/m3g/CKP/blob/master/disciplina/codes/velocities.jl - - - - -using Plots -using LaTeXStrings - -function arch(θ₁,θ₂;radius=1.,Δθ=1.) - θ₁ = π*θ₁/180 - θ₂ = π*θ₂/180 - Δθ = π*Δθ/180 - l = round(Int,(θ₂-θ₁)/Δθ) - x = zeros(l) - y = zeros(l) - for i in 1:l - θ = θ₁ + i*Δθ - x[i] = radius*cos(θ) - y[i] = radius*sin(θ) - end - return x, y -end - -plot() - -x, y = arch(0,360) -plot(x,y,seriestype=:shape,label="",alpha=0.5) - -x, y = arch(0,360,radius=0.95) -plot!(x,y,seriestype=:shape,label="",fillcolor=:white) - -x, y = arch(0,360,radius=0.7) -plot!(x,y,seriestype=:shape,label="",alpha=0.5,fillcolor=:red) - -x, y = arch(0,360,radius=0.65) -plot!(x,y,seriestype=:shape,label="",fillcolor=:white) - -plot!([0,0],[0,1.1],arrow=true,color=:black,linewidth=2,label="") -plot!([0,1.1],[0,0],arrow=true,color=:black,linewidth=2,label="") - -x, y = arch(15,16,radius=0.65) -plot!([0,x[1]],[0,y[1]],arrow=true,color=:black,linewidth=1,label="") - -x, y = arch(35,36,radius=0.95) -plot!([0,x[1]],[0,y[1]],arrow=true,color=:black,linewidth=1,label="") - -plot!(draw_arrow=true) -plot!(showaxis=:no,ticks=nothing,xlim=[-0.1,1.1],ylim=[-0.1,1.1],) 
-plot!(xlabel="x",ylabel="y",size=(400,400)) - -annotate!(0.58,-0.07,text(L"\Delta v_1",10)) -annotate!(0.88,-0.07,text(L"\Delta v_2",10)) - -savefig("./velocities.pdf") diff --git a/CwJ/TODO/earth.jl b/CwJ/TODO/earth.jl deleted file mode 100644 index 252f273..0000000 --- a/CwJ/TODO/earth.jl +++ /dev/null @@ -1,30 +0,0 @@ -# Calculate the temperature of the earth using the simplest model -# @jake -# https://discourse.julialang.org/t/seven-lines-of-julia-examples-sought/50416/121 - -using Unitful, Plots -p_sun = 386e24u"W" # power output of the sun -radius_a = 6378u"km" # semi-major axis of the earth -radius_b = 6357u"km" # semi-minor axis of the earth -orbit_a = 149.6e6u"km" # distance from the sun to earth -orbit_e = 0.017 # eccentricity of r = a(1-e^2)/(1+ecos(θ)) & time ≈ 365.25 * θ / 360 where θ is in degrees -a = 0.75 # absorptivity of the sun's radiation -e = 0.6 # emmissivity of the earth (very dependent on cloud cover) -σ = 5.6703e-8u"W*m^-2*K^-4" # Stefan-Boltzman constant -temp_sky = 3u"K" # sky temperature - - - -t = (0:0.25:365.25)u"d" # day of year in 1/4 day increments -θ = 2*π/365.25u"d" .* t # approximate angle around the sun -r = orbit_a * (1-orbit_e^2) ./ (1 .+ orbit_e .* cos.(θ)) # distance from sun to earth -area_projected = π * radius_a * radius_b # area of earth facing the sun -ec = sqrt(1-radius_b^2/radius_a^2) # eccentricity of earth - -area_surface = 2*π*radius_a^2*(1 + radius_b^2/(ec*radius_b^2)*atanh(ec)) # surface area of the earth - -q_in = p_sun * a * area_projected ./ (4 * π .* r.^2) # total heat impacting the earth - -temp_earth = (q_in ./ (e*σ*area_surface) .+ temp_sky^4).^0.25 # temperature of the earth - -plot(t*u"d^-1", temp_earth*u"K^-1" .- 273.15, label = false, title = "Temperature of Earth", xlabel = "Day", ylabel = "Temperature [C]") diff --git a/CwJ/TODO/ladder-questions.md b/CwJ/TODO/ladder-questions.md deleted file mode 100644 index da1b3af..0000000 --- a/CwJ/TODO/ladder-questions.md +++ /dev/null @@ -1,115 +0,0 @@ 
###### Question (Ladder [questions](http://www.mathematische-basteleien.de/ladder.htm))


A ``7``-meter ladder leans against a wall, with its base ``1.5`` meters from the wall. At which height does the ladder touch the wall?

```julia; hold=true; echo=false
l = 7
adj = 1.5
opp = sqrt(l^2 - adj^2)
numericq(opp, 1e-3)
```


----

A ``7``-meter ladder leans against a wall. Between the ladder and the wall is a ``1``-meter cube box. The ladder touches the wall, the box, and the ground. There are two such positions; what is the height of the ladder in the more upright position?

You might find this code of help:

```julia; eval=false
@syms x y
l, b = 7, 1
eq = (b+x)^2 + (b+y)^2
eq = subs(eq, x => b*(b/y)) # x/b = b/y by similar triangles
solve(eq ~ l^2, y)
```

What is the value of `b+y` in the above?

```julia; echo=false
radioq(("The height of the ladder",
        "The height of the box plus ladder",
        "The distance from the base of the ladder to the box",
        "The distance from the base of the ladder to the base of the wall"
        ), 1)
```


What is the height of the ladder?

```julia; hold=true; echo=false
numericq(6.90162289514212, 1e-3)
```


----

A ladder of length ``c`` is to be moved through a 2-dimensional hallway of width ``b`` which has a right-angled bend. If ``4b=c``, when will the ladder get stuck?

Consider this picture:

```julia; hold=true; echo=false
p = plot(; axis=nothing, legend=false, aspect_ratio=:equal)
x,y=1,2
b = sqrt(x*y)
plot!(p, [0,0,b+x], [b+y,0,0], linestyle=:dot)
plot!(p, [0,b+x],[b,b], color=:black, linestyle=:dash)
plot!(p, [b,b],[0,b+y], color=:black, linestyle=:dash)
plot!(p, [b+x,0], [0, b+y], color=:black)
```


Suppose ``b=5``. Then, with ``b+x`` and ``b+y`` being the lengths on the walls where it is stuck, *and* by similar triangles ``b/x = y/b``, we can solve for ``x``. (In this case take the largest positive value.) The answer would be the angle ``\theta`` with ``\tan(\theta) = (b+y)/(b+x)``.
```julia; hold=true; echo=false
b = 5
l = 4*b
@syms x y
eq = (b+x)^2 + (b+y)^2
eq = subs(eq, y => b^2/x)
x₀ = N(maximum(filter(>(0), solve(eq ~ l^2, x))))
y₀ = b^2/x₀
θ₀ = Float64(atan((b+y₀)/(b+x₀)))
numericq(θ₀, 1e-2)
```


-----

Two ladders of lengths ``a`` and ``b`` criss-cross between two walls of width ``x``. They meet at a height of ``c``.

```julia; hold=true; echo=false
p = plot(; legend=false, axis=nothing, aspect_ratio=:equal)
ya,yb,x = 2,3,1
plot!(p, [0,x],[ya,0], color=:black)
plot!(p, [0,x],[0, yb], color=:black)
plot!(p, [0,0], [0,yb], color=:blue, linewidth=5)
plot!(p, [x,x], [0,yb], color=:blue, linewidth=5)
plot!(p, [0,x], [0,0], color=:blue, linewidth=5)
xc = ya/(ya+yb)
c = yb*xc
plot!(p, [xc,xc],[0,c])
p
```

Suppose ``c=1``, ``b=3``, and ``a=5``. Find ``x``.

Introduce ``x = z + y``, and let ``h`` and ``k`` be the heights of the ladders along the left wall and the right wall.

Then ``z/c = x/k`` and ``y/c = x/h`` by similar triangles. As ``z + y`` is ``x``, we can solve to get

```math
x = z + y = \frac{xc}{k} + \frac{xc}{h}
  = \frac{xc}{\sqrt{b^2 - x^2}} + \frac{xc}{\sqrt{a^2 - x^2}}
```

With ``a,b,c`` as given, this can be solved with:

```julia; hold=true; echo=false
a,b,c = 5, 3, 1
f(x) = x*c/sqrt(b^2 - x^2) + x*c/sqrt(a^2 - x^2) - x
find_zero(f, (0, b))
```

The answer is ``2.69\dots``.
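The `find_zero` call above comes from the `Roots` package. As a self-contained check, here is a sketch using only base `Julia`: a simple bisection over a bracket chosen inside ``(0, b)`` so as to avoid the trivial root at ``x = 0`` (where the equation is satisfied vacuously):

```julia
# Bisection for x*c/sqrt(b^2 - x^2) + x*c/sqrt(a^2 - x^2) = x with a, b, c = 5, 3, 1.
function bisect(f, lo, hi; n = 60)
    for _ in 1:n
        mid = (lo + hi) / 2
        f(lo) * f(mid) <= 0 ? (hi = mid) : (lo = mid)
    end
    (lo + hi) / 2
end

a, b, c = 5, 3, 1
fcross(x) = x*c/sqrt(b^2 - x^2) + x*c/sqrt(a^2 - x^2) - x
x̄ = bisect(fcross, 1.0, 2.99)   # fcross(1.0) < 0 < fcross(2.99) brackets the root
x̄                               # ≈ 2.698, i.e. the 2.69… above
```

Sixty bisections shrink the bracket well below floating-point resolution, so this agrees with `find_zero` to machine precision.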
diff --git a/CwJ/TODO/partialcircle.jl b/CwJ/TODO/partialcircle.jl
deleted file mode 100644
index 30a1186..0000000
--- a/CwJ/TODO/partialcircle.jl
+++ /dev/null
@@ -1,4 +0,0 @@
-plot([0,1,1,0], [0,0,1,0], aspect_ratio=:equal, legend=false)
-	plot!(Plots.partialcircle(0, pi/4,100, 0.25), arrow=true)
-	Δ = 0.05
-	plot!([1-Δ, 1-Δ, 1], [0,Δ,Δ])
diff --git a/CwJ/TODO/ti-30-image.png b/CwJ/TODO/ti-30-image.png
deleted file mode 100644
index 050f0a3..0000000
Binary files a/CwJ/TODO/ti-30-image.png and /dev/null differ
diff --git a/CwJ/alternatives/README b/CwJ/alternatives/README
deleted file mode 100644
index b40bdc4..0000000
--- a/CwJ/alternatives/README
+++ /dev/null
@@ -1,11 +0,0 @@
-# Alternatives
-
-There are many ways to do related things in `Julia`. This directory holds alternatives to some of the choices made within these notes:
-
-## Symbolics
-
-* needs writing
-
-## Makie
-
-* needs updating
diff --git a/CwJ/alternatives/SciML.jmd b/CwJ/alternatives/SciML.jmd
deleted file mode 100644
index e497340..0000000
--- a/CwJ/alternatives/SciML.jmd
+++ /dev/null
@@ -1,625 +0,0 @@
-# The SciML suite of packages
-
-
-The `Julia` ecosystem advances rapidly. For much of it, the driving force is the [SciML](https://github.com/SciML) organization (Scientific Machine Learning).
-
-In this section we describe some packages provided by this organization that could be used as alternatives to the ones utilized in these notes. Members of this organization created many packages for solving different types of differential equations and have branched out from there. Many newer efforts of this organization have been to write uniform interfaces to other packages in the ecosystem, some of which are discussed below. We don't discuss the promise of SciML: "Performance is considered a priority, and performance issues are considered bugs," as we don't pursue features like in-place modification, sparsity, etc. Interested readers should consult the relevant packages' documentation.
The basic structure to use these packages is the "problem-algorithm-solve" interface described in [The problem-algorithm-solve interface](../ODEs/solve.html). We also discussed this interface a bit in [ODEs](../ODEs/differential_equations.html).

!!! note
    These packages are in a process of rapid development, and change to them is expected. These notes were written using the following versions:

```julia
pkgs = ["Symbolics", "NonlinearSolve", "Optimization", "Integrals"]
import Pkg; Pkg.status(pkgs)
```

## Symbolic math (`Symbolics`)

The `Symbolics`, `SymbolicUtils`, and `ModelingToolkit` packages are provided by this organization. These can be viewed as an alternative to `SymPy`, which is used throughout this set of notes. See the section on [Symbolics](./symbolics.html) for some additional details, the package [documentation](https://symbolics.juliasymbolics.org/stable/), or the documentation for [SymbolicUtils](https://github.com/JuliaSymbolics/SymbolicUtils.jl).


## Solving equations

Solving one or more equations (simultaneously) differs between the linear case, where solutions are readily found (though performance can distinguish approaches), and the nonlinear case, where for most situations numeric approaches are required.

### `LinearSolve`

The `LinearSolve` package aims to generalize the solving of linear equations. For many cases these are simply represented as matrix equations of the form `Ax=b`, from which `Julia` (borrowing from MATLAB) offers the interface `A \ b` to yield `x`. There are scenarios that don't naturally fit this structure, and perhaps problems where different tolerances need to be specified, and the `LinearSolve` package aims to provide a common interface to handle these scenarios. As this set of notes doesn't bump into such scenarios, this package is not described here. In the symbolic case, the `Symbolics.solve_for` function was described in [Symbolics](./symbolics.html).
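For completeness, here is a small sketch of what the `LinearSolve` interface looks like for a basic problem (`LinearProblem`, `solve`, and the solution's `u` field are from the package's documented interface; the matrix and right-hand side below are arbitrary sample values):

```julia
using LinearSolve

A = [1.0 2.0; 3.0 4.0]    # sample coefficient matrix
b = [5.0, 6.0]            # sample right-hand side

prob = LinearProblem(A, b)
sol = solve(prob)
sol.u                     # the solution vector; the same as A \ b
```

The same problem-algorithm-solve pattern appears again: the problem carries the data, and algorithm choices (or defaults) are passed to `solve`.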
### `NonlinearSolve`

The `NonlinearSolve` package can be seen as an alternative to the use of the `Roots` package in this set of notes. The package presents itself as "Fast implementations of root finding algorithms in Julia that satisfy the SciML common interface."

The package is loaded through the following command:

```julia
using NonlinearSolve
```

Unlike `Roots`, the package handles problems beyond the univariate case, so even the simplest problems require a little extra setup.

For example, suppose we want to use this package to solve for zeros of ``f(x) = x^5 - x - 1``. We could do so in a few different ways.

First, we need to define a `Julia` function representing `f`. We do so with:

```julia
f(u, p) = @. (u^5 - u - 1)
```

The function definition expects a container for the "`x`" variables and allows the passing of a container to hold parameters. We could have written dotted operations for the power and each subtraction to allow vectorization of these basic math operations, as `u` is a container of values. The `@.` macro makes adding the "dots" quite easy, as illustrated above. It converts "every function call or operator in expr into a `dot call`."

A problem is set up with this function and an initial guess. The `@SVector` specification for the guess is for performance purposes and is provided by the `StaticArrays` package.

```julia
using StaticArrays
u0 = @SVector[1.0]
prob = NonlinearProblem(f, u0)
```

The problem is solved by calling `solve` with an appropriate method specified. Here we use Newton's method; the derivative of `f` is computed automatically.

```julia
soln = solve(prob, NewtonRaphson())
```

The basic interface for retrieving the solution from the solution object is to use indexing:

```julia
soln[]
```

----

!!! note
    This interface is more performant than `Roots`, though it isn't an apples-to-apples comparison, as different stopping criteria are used by the two.
In order to be so, we need to help out the call to `NonlinearProblem` to indicate the problem is non-mutating by adding a "`false`", as follows: - -```julia -using BenchmarkTools -@btime solve(NonlinearProblem{false}(f, @SVector[1.0]), NewtonRaphson()) -``` - -As compared to: - -```julia -import Roots -import ForwardDiff -g(x) = x^5 - x - 1 -gp(x) = ForwardDiff.derivative(g, x) -@btime solve(Roots.ZeroProblem((g, gp), 1.0), Roots.Newton()) -``` ----- - -This problem can also be solved using a bracketing method. The package has both `Bisection` and `Falsi` as possible methods. To use a bracketing method, the initial bracket must be specified. - -```julia -u0 = (1.0, 2.0) -prob = NonlinearProblem(f, u0) -``` - -And - -```julia -solve(prob, Bisection()) -``` - ----- - -Incorporating parameters is readily done. For example to solve ``f(x) = \cos(x) - x/p`` for different values of ``p`` we might have: - - -```julia -f(x, p) = @. cos(x) - x/p -u0 = (0, pi/2) -p = 2 -prob = NonlinearProblem(f, u0, p) -solve(prob, Bisection()) -``` - -!!! note - The *insignificant* difference in stopping criteria used by `NonlinearSolve` and `Roots` is illustrated in this example, where the value returned by `NonlinearSolve` differs by one floating point value: - -```julia -an = solve(NonlinearProblem{false}(f, u0, p), Bisection()) -ar = solve(Roots.ZeroProblem(f, u0), Roots.Bisection(); p=p) -nextfloat(an[]) == ar, f(an[], p), f(ar, p) -``` - - ----- - -We can solve for several parameters at once, by using an equal number of initial positions as follows: - -```julia -ps = [1, 2, 3, 4] -u0 = @SVector[1, 1, 1, 1] -prob = NonlinearProblem(f, u0, ps) -solve(prob, NewtonRaphson()) -``` - - -### Higher dimensions - - -We solve now for a point on the surface of the following `peaks` function where the gradient is ``0``. 
(The gradient here will be a vector-valued function from ``R^2 `` to ``R^2.``) First we define the function: - -```julia -function _peaks(x, y) - p = 3 * (1 - x)^2 * exp(-x^2 - (y + 1)^2) - p -= 10 * (x / 5 - x^3 - y^5) * exp(-x^2 - y^2) - p -= 1/3 * exp(-(x + 1)^2 - y^2) - p -end -peaks(u) = _peaks(u[1], u[2]) # pass container, take first two components -``` - -The gradient can be computed different ways within `Julia`, but here we use the fact that the `ForwardDiff` package is loaded by `NonlinearSolve`. Once the function is defined, the pattern is similar to above. We provide a starting point, create a problem, then solve: - -```julia -∇peaks(x, p=nothing) = NonlinearSolve.ForwardDiff.gradient(peaks, x) -u0 = @SVector[1.0, 1.0] -prob = NonlinearProblem(∇peaks, u0) -u = solve(prob, NewtonRaphson()) -``` - -We can see that this identified value is a "zero" through: - -```julia; error=true -∇peaks(u.u) -``` - -### Using Modeling toolkit to model the non-linear problem - -Nonlinear problems can also be approached symbolically using the `ModelingToolkit` package. There is one additional step necessary. - -As an example, we look to solve numerically for the zeros of ``x^5-x-\alpha`` for a parameter ``\alpha``. We can describe this equation as follows: - -```julia -using ModelingToolkit - -@variables x -@parameters α - -eq = x^5 - x - α ~ 0 -``` - -The extra step is to specify a "`NonlinearSystem`." It is a system, as in practice one or more equations can be considered. The `NonlinearSystem`constructor handles the details where the equation, the variable, and the parameter are specified. Below this is done using vectors with just one element: - -```julia -ns = NonlinearSystem([eq], [x], [α], name=:ns) -``` - -The `name` argument is special. The name of the object (`ns`) is assigned through `=`, but the system must also know this same name. However, the name on the left is not known when the name on the right is needed, so it is up to the user to keep them synchronized. 
The `@named` macro handles this behind the scenes by simply rewriting the syntax of the assignment: - -```julia -@named ns = NonlinearSystem([eq], [x], [α]) -``` - -With the system defined, we can pass this to `NonlinearProblem`, as was done with a function. The parameter is specified here, and in this case is `α => 1.0`. The initial guess is `[1.0]`: - -```julia -prob = NonlinearProblem(ns, [1.0], [α => 1.0]) -``` - -The problem is solved as before: - -```julia -solve(prob, NewtonRaphson()) -``` - -## Optimization (`Optimization.jl`) - -We describe briefly the `Optimization` package which provides a common interface to *numerous* optimization packages in the `Julia` ecosystem. We discuss only the interface for `Optim.jl` defined in `OptimizationOptimJL`. - -We begin with a simple example from first semester calculus: - -> Among all rectangles of fixed perimeter, find the one with the *maximum* area. - -If the perimeter is taken to be ``25``, the mathematical setup has a -constraint (``P=25=2x+2y``) and an objective (``A=xy``) to -maximize. In this case, the function to *maximize* is ``A(x) = x \cdot -(25-2x)/2``. This is easily done different ways, such as finding the -one critical point and identifying this as the point of maximum. - -To do this last step using `Optimization` we would have. - -```julia -height(x) = @. (25 - 2x)/2 -A(x, p=nothing) = @.(- x * height(x)) -``` - -The minus sign is needed here as optimization routines find *minimums*, not maximums. - -To use `Optimization` we must load the package **and** the underlying backend glue code we intend to use: - -```julia -using Optimization -using OptimizationOptimJL -``` - - -Next, we define an optimization function with information on how its derivatives will be taken. The following uses `ForwardDiff`, which is a good choice in the typical calculus setting, where there are a small number of inputs (just ``1`` here.) 
-
-```julia
-F = OptimizationFunction(A, Optimization.AutoForwardDiff())
-x0 = [4.0]
-prob = OptimizationProblem(F, x0)
-```
-
-The problem is solved through the common interface with a specified method, in this case `Newton`:
-
-```julia
-soln = solve(prob, Newton())
-```
-
-!!! note
-    We use `Newton`, not `NewtonRaphson` as above. Both methods are similar, but they serve different uses -- the latter for solving non-linear equations, the former for solving optimization problems.
-
-The solution is an object containing the identified answer and more. To get the value, use index notation:
-
-```julia
-soln[]
-```
-
-The corresponding ``y`` value and (negated) area are found by:
-
-```julia
-xstar = soln[]
-height(xstar), A(xstar)
-```
-
-The `minimum` property also holds the identified minimum:
-
-```julia
-soln.minimum # compare with A(soln[], nothing)
-```
-
-The package is a wrapper around other packages. The output of the underlying package is presented in the `original` property:
-
-```julia
-soln.original
-```
-
-----
-
-
-This problem can also be approached symbolically, using `ModelingToolkit`.
-
-For example, we set up the problem with:
-
-```julia
-using ModelingToolkit
-@parameters P
-@variables x
-y = (P - 2x)/2
-Area = - x*y
-```
-
-The above should be self-explanatory. To put this into a form to pass to `solve`, we define a "system" by specifying our objective function, the variables, and the parameters.
-
-```julia
-@named sys = OptimizationSystem(Area, [x], [P])
-```
-
-(This step is different, as before an `OptimizationFunction` was defined; we use `@named`, as above, to ensure the system has the same name as the identifier, `sys`.)
-
-
-This system is passed to `OptimizationProblem` along with a specification of the initial condition (``x=4``) and the perimeter (``P=25``).
A vector of pairs is used below:
-
-```julia
-prob = OptimizationProblem(sys, [x => 4.0], [P => 25.0]; grad=true, hess=true)
-```
-
-The keywords `grad=true` and `hess=true` instruct that automatic derivatives be taken as needed. These are needed for the choice of method, `Newton`, below.
-
-Solving this problem then follows the same pattern as before; again with `Newton` we have:
-
-```julia
-solve(prob, Newton())
-```
-
-(A derivative-free method like `NelderMead()` could be used, and then the `grad` and `hess` keywords above would be unnecessary, though not harmful.)
-
-----
-
-The related calculus problem:
-
-> Among all rectangles with a fixed area, find the one with *minimum* perimeter.
-
-
-could be similarly approached:
-
-```julia
-@parameters Area
-@variables x
-y = Area/x # from A = xy
-P = 2x + 2y
-@named sys = OptimizationSystem(P, [x], [Area])
-
-u0 = [x => 4.0]
-p = [Area => 25.0]
-
-prob = OptimizationProblem(sys, u0, p; grad=true, hess=true)
-soln = solve(prob, LBFGS())
-```
-
-We used an initial guess of ``x=4`` above. The `LBFGS` method is "a computationally efficient modification of the Broyden-Fletcher-Goldfarb-Shanno algorithm ... It is a quasi-Newton method that updates an approximation to the Hessian using past approximations as well as the gradient." On this problem it performs similarly to `Newton`, though in general it may be preferable for higher-dimensional problems.
-
-### Two-dimensional
-
-Scalar functions of two input variables can have their minimum value identified in the same manner using `Optimization.jl`.
-
-For example, consider the function
-
-```math
-f(x,y) = (x + 2y - 7)^2 + (2x + y - 5)^2
-```
-
-We wish to minimize this function.
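Before handing this to an optimizer, it is worth noting the minimizer can be computed exactly: both squared terms vanish when ``x + 2y = 7`` and ``2x + y = 5``, a ``2\times 2`` linear system. A quick hand check in base `Julia` (the helper name `fcheck` is ad hoc, not part of the `Optimization` interface):

```julia
using LinearAlgebra  # the backslash solver is in this standard library

# f is minimized (with value 0) exactly where both squared terms vanish:
#   x + 2y = 7  and  2x + y = 5
A = [1 2; 2 1]
b = [7.0, 5.0]
xy = A \ b   # the exact minimizer

fcheck(u) = (u[1] + 2u[2] - 7)^2 + (2u[1] + u[2] - 5)^2
xy, fcheck(xy)
```

This gives ``(x, y) = (1, 3)`` with a minimum value of ``0``, which the numeric methods below should reproduce.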
-
-
-We begin by defining a function in `Julia`:
-
-```julia
-function f(u, p)
-    x, y = u
-    (x + 2y - 7)^2 + (2x + y - 5)^2
-end
-```
-
-We turn this into an optimization function by specifying how derivatives will be taken, as we will use the `LBFGS` algorithm below:
-
-```julia
-ff = OptimizationFunction(f, Optimization.AutoForwardDiff())
-```
-
-We will begin our search at the origin. We have to be mindful to use floating point numbers here:
-
-```julia
-u0 = [0.0, 0.0] # or zeros(2)
-```
-
-```julia
-prob = OptimizationProblem(ff, u0)
-```
-
-Finally, we solve the problem:
-
-```julia
-solve(prob, LBFGS())
-```
-
-The identified value of ``(1, 3)`` agrees with the contour graph, as it is a point in the interior of the contour for the smallest values displayed.
-
-```julia
-using Plots
-
-xs = range(0, 2, length=100)
-ys = range(2, 4, length=100)
-contour(xs, ys, (x,y) -> f((x,y), nothing))
-```
-
-
-We could also use a *derivative-free* method, and skip a step:
-
-```julia
-prob = OptimizationProblem(f, u0) # skip making an OptimizationFunction
-solve(prob, NelderMead())
-```
-
-## Integration (`Integrals.jl`)
-
-The `Integrals` package provides a common interface to different numeric integration packages in the `Julia` ecosystem, for example `QuadGK` and `HCubature`. The value of this interface, over those two packages, is its uniform access to other packages, which for some uses may be more performant.
-
-The package follows the same `problem-algorithm-solve` interface, as already seen.
-
-The interface is designed for ``1``- and higher-dimensional integrals.
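Before turning to the package, a quick baseline helps show what such routines compute. A composite trapezoid rule -- a sketch only, not the adaptive Gauss-Kronrod scheme `QuadGK` actually implements -- already approximates ``\int_0^\pi \sin(x)dx = 2`` well:

```julia
# Composite trapezoid rule on n subintervals:
#   h * (f(x0)/2 + f(x1) + ... + f(x_{n-1}) + f(xn)/2)
function trapezoid(f, a, b; n=1000)
    xs = range(a, b, length=n+1)
    h = (b - a)/n
    h * (sum(f, xs) - (f(a) + f(b))/2)
end

trapezoid(sin, 0, pi)   # ≈ 2
```

The packaged routines refine such weighted sums adaptively, with error estimates.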
-
-The package is loaded with
-
-```julia
-using Integrals
-```
-
-
-For a simple definite integral, such as ``\int_0^\pi \sin(x)dx``, we have:
-
-```julia
-f(x, p) = sin(x)
-prob = IntegralProblem(f, 0.0, pi)
-soln = solve(prob, QuadGKJL())
-```
-
-To get access to the answer, we can use indexing notation:
-
-```julia
-soln[]
-```
-
-Comparing to just using `QuadGK`, the same definite integral would be estimated with:
-
-```julia
-using QuadGK
-quadgk(sin, 0, pi)
-```
-
-The estimated upper bound on the error from `QuadGK` is available through the `resid` property on the `Integrals` output:
-
-```julia
-soln.resid
-```
-
-
-The `Integrals` solution is a bit more verbose, but it is more flexible. For example, the `HCubature` package provides a similar means to compute ``n``-dimensional integrals. For this problem, the modifications would be:
-
-```julia
-f(x, p) = sin.(x)
-prob = IntegralProblem(f, [0.0], [pi])
-soln = solve(prob, HCubatureJL())
-```
-
-```julia
-soln[]
-```
-
-The estimated maximum error is also given by `resid`:
-
-```julia
-soln.resid
-```
-
-----
-
-Suppose, as well, that we wanted to parameterize our function and then differentiate.
-
-Consider ``d/dp \int_0^\pi \sin(px) dx``.
We can do this integral directly to get
-
-```math
-\begin{align*}
-\frac{d}{dp} \int_0^\pi \sin(px)dx
-&= \frac{d}{dp}\left( \frac{-1}{p} \cos(px)\Big\rvert_0^\pi\right)\\
-&= \frac{d}{dp}\left( -\frac{\cos(p\cdot\pi)-1}{p}\right)\\
-&= \frac{\cos(p\cdot \pi) - 1}{p^2} + \frac{\pi\cdot\sin(p\cdot\pi)}{p}
-\end{align*}
-```
-
-Using `Integrals` with `QuadGK` we have:
-
-```julia
-f(x, p) = sin(p*x)
-function ∫sinpx(p)
-    prob = IntegralProblem(f, 0.0, pi, p)
-    solve(prob, QuadGKJL())[] # index to return the scalar value
-end
-```
-
-We can compute values at both ``p=1`` and ``p=2``:
-
-```julia
-∫sinpx(1), ∫sinpx(2)
-```
-
-To find the derivative in ``p``, we have:
-
-```julia
-ForwardDiff.derivative(∫sinpx, 1), ForwardDiff.derivative(∫sinpx, 2)
-```
-
-
-(In `QuadGK`, the definition `∫sinpx(p) = quadgk(x -> sin(p*x), 0, pi)[1]` can be differentiated as well; `Integrals` gives a consistent interface.)
-
-
-### Higher dimension integrals
-
-The power of a common interface is the ability to swap backends and the uniformity for different dimensions. Here we discuss integrals of scalar-valued and vector-valued functions.
-
-#### ``f: R^n \rightarrow R``
-
-The volume under a surface generated by ``z=f(x,y)`` over a rectangular region ``[a,b]\times[c,d]`` can be readily computed. The coding requires ``f`` to be expressed as a function of a vector--*and* a parameter--and the interval to be expressed using two vectors, one for the left endpoints (`[a,c]`) and one for the right endpoints (`[b,d]`).
-
-For example, the volume under the function ``f(x,y) = 1 + x^2 + 2y^2`` over ``[-1/2, 1/2] \times [-1,1]`` is computed by:
-
-```julia
-f(x, y) = 1 + x^2 + 2y^2  # match math
-fxp(x, p) = f(x[1], x[2]) # prepare for IntegralProblem
-ls = [-1/2, -1]           # left endpoints
-rs = [1/2, 1]             # right endpoints
-prob = IntegralProblem(fxp, ls, rs)
-soln = solve(prob, HCubatureJL())
-```
-
-Of course, we could have directly defined the function (`fxp`) using indexing of the `x` variable.
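For this rectangular example the answer can be verified by hand: ``\iint 1 = 2``, ``\iint x^2 = 1/6``, and ``\iint 2y^2 = 4/3``, for a total of ``7/2``. A crude midpoint Riemann sum -- a sketch of the idea, not how `HCubatureJL` subdivides -- agrees:

```julia
# Midpoint Riemann sum for ∫∫ (1 + x² + 2y²) over [-1/2,1/2] × [-1,1];
# the exact value is 2 + 1/6 + 4/3 = 7/2.
f(x, y) = 1 + x^2 + 2y^2
n = 200
xs = range(-1/2, 1/2, length=n+1)
ys = range(-1, 1, length=n+1)
mids(t) = (t[1:end-1] + t[2:end])/2  # cell midpoints
Δ = step(xs) * step(ys)              # area of each cell
total = sum(f(x, y) * Δ for x in mids(xs), y in mids(ys))  # ≈ 3.5
```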
-
-----
-
-For non-rectangular domains a change of variable is required.
-
-For example, an integral to assist in finding the volume of a sphere might be
-
-```math
-V = 2 \iint_R \sqrt{\rho^2 - x^2 - y^2} dx dy
-```
-
-where ``R`` is the disc of radius ``\rho`` in the ``x-y`` plane.
-
-The usual approach is to change to polar coordinates and write this integral as
-
-```math
-V = 2 \int_0^{2\pi}\int_0^\rho \sqrt{\rho^2 - r^2} r dr d\theta
-```
-
-the latter being an integral over a rectangular domain.
-
-To compute this transformed integral, we might have:
-
-```julia
-function vol_sphere(ρ)
-    f(rθ, p) = 2 * sqrt(ρ^2 - rθ[1]^2) * rθ[1] # factor of 2 for the two hemispheres
-    ls = [0, 0]
-    rs = [ρ, 2pi]
-    prob = IntegralProblem(f, ls, rs)
-    solve(prob, HCubatureJL())
-end
-
-vol_sphere(2)
-```
-
-If it is possible to express the region of integration as ``G(R)`` where ``R`` is a rectangular region, then the change of variables formula,
-
-```math
-\iint_{G(R)} f(x) dA = \iint_R (f\circ G)(u) |\det(J_G(u))| dU
-```
-
-turns the integral over the non-rectangular domain ``G(R)`` into one over the rectangular domain ``R``. The key is to *identify* ``G`` and to compute the Jacobian. The latter is simply accomplished with `ForwardDiff.jacobian`.
-
-For an example, we find the moment of inertia about the ``y`` axis of the unit square rotated counter-clockwise by an angle ``0 \leq \alpha \leq \pi/2``.
-
-The counter-clockwise rotation of a unit square by angle ``\alpha`` is described by
-
-```math
-G(u, v) = \langle \cos(\alpha)\cdot u - \sin(\alpha)\cdot v, \sin(\alpha)\cdot u + \cos(\alpha)\cdot v \rangle
-```
-
-So ``\iint_{G(R)} x^2 dA`` is computed by the following with ``\alpha=\pi/4``:
-
-```julia
-import LinearAlgebra: det
-
-
-𝑓(uv) = uv[1]^2
-
-function G(uv)
-
-    α = pi/4 # could be made a parameter
-
-    u, v = uv
-    [cos(α)*u - sin(α)*v, sin(α)*u + cos(α)*v]
-end
-
-f(u, p) = (𝑓∘G)(u) * det(ForwardDiff.jacobian(G, u)) # det is 1 for a rotation, so no absolute value needed
-
-prob = IntegralProblem(f, [0,0], [1,1])
-solve(prob, HCubatureJL())
-```
-
-#### ``f: R^n \rightarrow R^m``
-
-The `Integrals` package provides an interface for vector-valued functions. By default, the number of dimensions in the output is assumed to be ``1``, but the `nout` argument can adjust that.
-
-
-Let ``f`` be vector-valued with components ``f_1, f_2, \dots, f_m``; then the output below is the vector with components ``\iint_R f_1 dV, \iint_R f_2 dV, \dots, \iint_R f_m dV``.
-
-
-
-For a trivial example, we have:
-
-```julia
-f(x, p) = [x[1], x[2]^2]
-prob = IntegralProblem(f, [0,0], [3,4], nout=2)
-solve(prob, HCubatureJL())
-```
diff --git a/CwJ/alternatives/interval_arithmetic.__jmd__ b/CwJ/alternatives/interval_arithmetic.__jmd__
deleted file mode 100644
index 78fd860..0000000
--- a/CwJ/alternatives/interval_arithmetic.__jmd__
+++ /dev/null
@@ -1,107 +0,0 @@
-# Using interval arithmetic
-
-Highlighted here is the use of interval arithmetic for calculus problems.
-
-Unlike floating point math, where floating point values are an *approximation* to real numbers, interval arithmetic uses *intervals* which are **guaranteed** to contain the given value. We use the `IntervalArithmetic` package and friends below, but note there is nothing magic about the concept.
-
-## Basic XXX
-
-
-
-## Using `IntervalRootFinding` to identify zeros of a function
-
-The `IntervalRootFinding` package provides a more *rigorous* alternative to `find_zeros`. This package leverages the interval arithmetic features of `IntervalArithmetic` and provides a function `roots`, with usage similar to `find_zeros`. Intervals are specified with the notation `a..b`. In the following, we *qualify* `roots` to not conflict with the `roots` function from `SymPy`, which has already been loaded:
-
-```julia
-import IntervalArithmetic
-import IntervalRootFinding
-```
-
-```julia
-u(x) = sin(x) - 0.1*x^2 + 1
-𝑱 = IntervalArithmetic.Interval(-10, 10) # cumbersome -10..10; needed here: .. means something in CalculusWithJulia
-rts = IntervalRootFinding.roots(u, 𝑱)
-```
-
-The "zeros" are returned with an enclosing interval and a flag, which for the above indicates a unique zero in the interval.
-
-The intervals with a unique answer can be filtered and refined with a construct like the following:
-
-```julia
-[find_zero(u, (IntervalArithmetic.interval(I).lo, IntervalArithmetic.interval(I).hi)) for I in rts if I.status == :unique]
-```
-
-The midpoint of the returned interval can be found by composing the `mid` function with the `interval` function of the package:
-
-```julia
-[(IntervalArithmetic.mid ∘ IntervalArithmetic.interval)(I) for I in rts if I.status == :unique]
-```
-
-
-
-For some problems, `find_zeros` is more direct, as with this one:
-
-
-```julia
-find_zeros(u, (-10, 10))
-```
-
-This directness can be useful if there is some prior understanding of the zeros expected to be found.
-However, `IntervalRootFinding` is more efficient computationally and *offers a guarantee* on the values found.
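There is, as noted, nothing magic going on. Here is a toy interval type -- purely illustrative, not how `IntervalArithmetic` is implemented -- showing how naive interval operations bound the range of ``f(x) = x(x-1)^2`` over ``[0,2]``:

```julia
# A toy interval type; real packages also handle directed rounding,
# special functions, etc. The names here are ad hoc for illustration.
struct Iv
    lo::Float64
    hi::Float64
end
Base.:+(a::Iv, b::Iv) = Iv(a.lo + b.lo, a.hi + b.hi)
Base.:-(a::Iv, b::Iv) = Iv(a.lo - b.hi, a.hi - b.lo)
function Base.:*(a::Iv, b::Iv)
    ps = (a.lo*b.lo, a.lo*b.hi, a.hi*b.lo, a.hi*b.hi)
    Iv(minimum(ps), maximum(ps))
end

x = Iv(0.0, 2.0)
c1 = Iv(1.0, 1.0)
bound = x * (x - c1) * (x - c1)   # naive bound for x(x-1)² over [0,2]
```

The bound `Iv(-2.0, 2.0)` does contain the true range ``[0, 2]``, but it overestimates it (the "dependency problem": the two `(x - c1)` factors are treated as independent). Subdividing the interval tightens such bounds, which is what `roots` exploits.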
- - - -For functions where roots are not "unique" a different output may appear: - -```julia; hold=true; -f(x) = x*(x-1)^2 -rts = IntervalRootFinding.roots(f, 𝑱) -``` - -The interval labeled `:unknown` contains a `0`, but it can't be proved by `roots`. - - -Interval arithmetic finds **rigorous** **bounds** on the range of `f` values over a closed interval `a..b` (the range is `f(a..b)`). "Rigorous" means the bounds are truthful and account for possible floating point issues. "Bounds" means the answer lies within, but the bound need not be the answer. - -This allows one -- for some functions -- to answer affirmatively questions like: - -* Is the function *always* positive on `a..b`? Negative? This can be done by checking if `0` is in the bound given by `f(a..b)`. If it isn't then one of the two characterizations is true. - -* Is the function *strictly increasing* on `a..b`? Strictly decreasing? These questions can be answered using the (upcoming) [derivative](../derivatives/derivatives.html). If the derivative is positive on `a..b` then `f` is strictly increasing, if negative on `a..b` then `f` is strictly decreasing. Finding the derivative can be done within the `IntervalArithmetic` framework using [automatic differentiation](../derivatives/numeric_derivatives.html), a blackbox operation denoted `f'` below. 
- -Combined, for some functions and some intervals these two questions can be answered affirmatively: - -* the interval does not contain a zero (`0 !in f(a..b)`) -* over the interval, the function crosses the `x` axis *once* (`f(a..a)` and `f(b..b)` are one positive and one negative *and* `f` is strictly monotone, or `0 !in f'(a..b)`) - -This allows the following (simplified) bisection-like algorithm to be used: - -* consider an interval `a..b` -* if the function is *always* positive or negative, it can be discarded as no zero can be in the interval -* if the function crosses the `x` axis *once* over this interval **then** there is a "unique" zero in the interval and the interval can be marked so and set aside -* if neither of the above *and* `a..b` is not too small already, then *sub-divide* the interval and repeat the above with *both* smaller intervals -* if `a..b` is too small, stop and mark it as "unknown" - -When terminated there will be intervals with unique zeros flagged and smaller intervals with an unknown status. - -Compared to the *bisection* algorithm -- which only knows for some intervals if that interval has one or more crossings -- this algorithm gives a more rigorous means to get all the zeros in `a..b`. - - - - - - -For a last example of the value of this package, this function, which appeared in our discussion on limits, is *positive* for **every** floating point number, but has two zeros snuck in at values within the floating point neighbors of $15/11$ - -```julia -t(x) = x^2 + 1 +log(abs( 11*x-15 ))/99 -``` - -The `find_zeros` function will fail on identifying any potential zeros of this function. Even the basic call of `roots` will fail, due to it relying on some smoothness of `f`. 
However, explicitly asking for `Bisection` shows the *potential* for one or more zeros near ``15/11``:
-
-```julia
-IntervalRootFinding.roots(t, 𝑱, IntervalRootFinding.Bisection)
-```
-
-(The basic algorithm above can be sped up using a variant of [Newton's](../derivatives/newton_method.html) method; this variant assumes some "smoothness" in the function, whereas `t` is not continuous at the point ``x=15/11``.)
diff --git a/CwJ/alternatives/makie_plotting.jmd b/CwJ/alternatives/makie_plotting.jmd
deleted file mode 100644
index 97b6605..0000000
--- a/CwJ/alternatives/makie_plotting.jmd
+++ /dev/null
@@ -1,1046 +0,0 @@
-# Calculus plots with Makie
-
-
-The [Makie.jl webpage](https://github.com/JuliaPlots/Makie.jl) says
-
-> From the Japanese word Maki-e, which is a technique to sprinkle lacquer with gold and silver powder. Data is basically the gold and silver of our age, so let's spread it out beautifully on the screen!
-
-`Makie` itself is a metapackage for a rich ecosystem. We show how to
-use the interface provided by the `GLMakie` backend to produce the
-familiar graphics of calculus.
-
-
-!!! note "Examples and tutorials"
-    `Makie` is a sophisticated plotting package, capable of an enormous range of plots (cf. [examples](https://makie.juliaplots.org/stable/examples/plotting_functions/).) `Makie` also has numerous [tutorials](https://makie.juliaplots.org/stable/tutorials/) to learn from. These are far more extensive than what is described herein, as this section focuses just on the graphics from calculus.
-
-## Figures
-
-Makie draws graphics onto a canvas termed a "scene" in the Makie documentation.
-A scene is an implementation detail; the basic (non-mutating) plotting commands described below return a `FigureAxisPlot` object, a compound object that combines a figure, an axis, and a plot object. The `show` method for these objects displays the figure.
For `Makie` there are the `GLMakie`, `WGLMakie`, and `CairoMakie` backends for different types of canvases. In the following, we have used `GLMakie`. `WGLMakie` is useful for incorporating `Makie` plots into web-based technologies.
-
-We begin by loading the main package and the `norm` function from the standard `LinearAlgebra` package:
-
-```julia
-using GLMakie
-import LinearAlgebra: norm
-```
-
-The `Makie` developers have workarounds for the delayed time to first plot, but without utilizing these the time to load the package is lengthy.
-
-
-## Points (`scatter`)
-
-The task of plotting the points, say ``(1,2)``, ``(2,3)``, ``(3,2)``, can be done in different ways. Most plotting packages, and `Makie` is no exception, allow the following: form vectors of the ``x`` and ``y`` values, then plot those with `scatter`:
-
-```julia
-xs = [1,2,3]
-ys = [2,3,2]
-scatter(xs, ys)
-```
-
-The `scatter` function creates and returns an object, which when displayed shows the plot.
-
-
-### `Point2`, `Point3`
-
-When learning about points on the Cartesian plane, a "`t`"-chart is often produced:
-
-```
-x | y
------
-1 | 2
-2 | 3
-3 | 2
-```
-
-The `scatter` usage above used the columns. The rows are associated with the points, and these too can be used to produce the same graphic.
-Rather than make vectors of ``x`` and ``y`` (and optionally ``z``) coordinates, it is more idiomatic to create a vector of "points." `Makie` utilizes a `Point` type to store a 2 or 3 dimensional point. The `Point2` and `Point3` constructors will be utilized.
-
-`Makie` uses a GPU, when present, to accelerate the graphic rendering. GPUs employ 32-bit numbers. Julia uses an `f0` suffix to indicate 32-bit floating point numbers. Hence the alternate types `Point2f0` to store 2D points and `Point3f0` to store 3D points as 32-bit numbers are seen in the documentation for Makie.
-
-
-We can plot a vector of points in as direct a manner as vectors of their coordinates:
-
-```julia
-pts = [Point2(1,2), Point2(2,3), Point2(3,2)]
-scatter(pts)
-```
-
-A typical usage is to generate points from some vector-valued
-function. Say we have a parameterized function `r` taking ``R`` into
-``R^2`` defined by:
-
-```julia
-r(t) = [sin(t), cos(t)]
-```
-
-
-Then broadcasting values gives a vector of vectors, each identified with a point:
-
-```julia
-ts = [1,2,3]
-r.(ts)
-```
-
-We can broadcast `Point2` over this to create a vector of `Point` objects:
-
-```julia
-pts = Point2.(r.(ts))
-```
-
-These then can be plotted directly:
-
-```julia
-scatter(pts)
-```
-
-
-The plotting of points in three dimensions is essentially the same, save the use of `Point3` instead of `Point2`.
-
-```julia
-r(t) = [sin(t), cos(t), t]
-ts = range(0, 4pi, length=100)
-pts = Point3.(r.(ts))
-scatter(pts; markersize=5)
-```
-
-
-----
-
-To plot points generated in terms of vectors of coordinates, the
-component vectors must be created. The "`t`"-table shows how: simply
-loop over each column and add the corresponding ``x`` or ``y`` (or ``z``)
-value. This utility function does exactly that, returning the vectors
-in a tuple.
-
-```julia
-unzip(vs) = Tuple([vs[j][i] for j in eachindex(vs)] for i in eachindex(vs[1]))
-```
-
-!!! note
-    In the `CalculusWithJulia` package, `unzip` is implemented using `SplitApplyCombine.invert`.
-
-We might have then:
-
-```julia
-scatter(unzip(r.(ts))...; markersize=5)
-```
-
-where splatting is used to specify the `xs`, `ys`, and `zs` to `scatter`.
-
-(Compare to `scatter(Point3.(r.(ts)))` or `scatter((Point3∘r).(ts))`.)
-
-### Attributes
-
-A point is drawn with a "marker" with a certain size and color. These attributes can be adjusted, as in the following:
-
-```julia
-scatter(xs, ys;
-        marker=[:x, :cross, :circle], markersize=25,
-        color=:blue)
-```
-
-Marker attributes include
-
-* `marker` a symbol or shape.
-* `marker_offset` offset coordinates
-* `markersize` size (radius pixels) of marker
-
-A single value will be repeated. A vector of values of a matching size will specify the attribute on a per-point basis.
-
-## Curves
-
-The curves of calculus are drawn with lines: the `lines` command of `Makie` will render a curve by connecting a series of points with straight-line segments. By taking a sufficient number of points the connect-the-dot figure can appear curved.
-
-
-### Plots of univariate functions
-
-The basic plot of univariate calculus is the graph of a function ``f`` over an interval ``[a,b]``. This is implemented using a familiar strategy: produce a series of representative values between ``a`` and ``b``; produce the corresponding ``f(x)`` values; plot these as points and connect the points with straight lines.
-
-
-To create regular values between `a` and `b` typically the `range` function or the range operator (`a:h:b`) are employed. The related `LinRange` function is also an option.
-
-
-For example:
-
-```julia
-f(x) = sin(x)
-a, b = 0, 2pi
-xs = range(a, b, length=250)
-lines(xs, f.(xs))
-```
-
-`Makie` also will read the interval notation of `IntervalSets` and select its own set of intermediate points:
-
-```julia
-lines(a..b, f)
-```
-
-
-As with `scatter`, `lines` returns an object that produces a graphic when displayed.
-
-As with `scatter`, `lines` can also be drawn using a vector of points:
-
-```julia
-pts = [Point2(x, f(x)) for x ∈ xs]
-lines(pts)
-```
-
-(Though the advantage isn't clear here, this will be useful when the points are generated in different manners.)
-
-When a `y` value is `NaN` or infinite, the connecting lines are not drawn:
-
-```julia
-xs = 1:5
-ys = [1, 2, NaN, 4, 5]
-lines(xs, ys)
-```
-
-As with other plotting packages, this is useful to represent discontinuous functions, such as occur at a vertical asymptote or with a step function.
-
-
-#### Adding to a figure (`lines!`, `scatter!`, ...)
- -To *add* or *modify* a scene can be done using a mutating version of a plotting primitive, such as `lines!` or `scatter!`. The names follow `Julia`'s convention of using an `!` to indicate that a function modifies an argument, in this case the underlying figure. - -Here is one way to show two plots at once: - -```julia -xs = range(0, 2pi, length=100) -lines(xs, sin.(xs)) -lines!(xs, cos.(xs)) -current_figure() -``` - -!!! note "Current figure" - The `current_figure` call is needed to have the figure display, as the returned value of `lines!` is not a figure object. (Figure objects display when shown as the output of a cell.) - - -We will see soon how to modify the line attributes so that the curves can be distinguished. - -The following shows the construction details in the graphic: - -```julia -xs = range(0, 2pi, length=10) -lines(xs, sin.(xs)) -scatter!(xs, sin.(xs); - markersize=10) -current_figure() -``` - - -As an example, this shows how to add the tangent line to a graph. The slope of the tangent line being computed by `ForwardDiff.derivative`. - -```julia -import ForwardDiff -f(x) = x^x -a, b= 0, 2 -c = 0.5 -xs = range(a, b, length=200) - -tl(x) = f(c) + ForwardDiff.derivative(f, c) * (x-c) - -lines(xs, f.(xs)) -lines!(xs, tl.(xs), color=:blue) -current_figure() -``` - -This example, modified from a [discourse](https://discourse.julialang.org/t/how-to-plot-step-functions-x-correctly-in-julia/84087/5) post by user `@rafael.guerra`, shows how to plot a step function (`floor`) using `NaN`s to create line breaks. The marker colors set for `scatter!` use `:white` to match the background color. 
- -```julia -x = -5:5 -δ = 5eps() # for rounding purposes; our interval is [i,i+1) ≈ [i, i+1-δ] -xx = Float64[] -for i ∈ x[1:end-1] - append!(xx, (i, i+1 - δ, NaN)) -end -yy = floor.(xx) - -lines(xx, yy) -scatter!(xx, yy, color=repeat([:black, :white, :white], length(xx)÷3)) - -current_figure() -``` - - - -### Text (`annotations`) - -Text can be placed at a point, as a marker is. To place text, the desired text and a position need to be specified along with any adjustments to the default attributes. - -For example: - -```julia -xs = 1:5 -pts = Point2.(xs, xs) -scatter(pts) -annotations!("Point " .* string.(xs), pts; - textsize = 50 .- 2*xs, - rotation = 2pi ./ xs) - -current_figure() -``` - -The graphic shows that `textsize` adjusts the displayed size and `rotation` adjusts the orientation. (The graphic also shows a need to manually override the limits of the `y` axis, as the `Point 5` is chopped off; the `ylims!` function to do so will be shown later.) - -Attributes for `text`, among many others, include: - -* `align` Specify the text alignment through `(:pos, :pos)`, where `:pos` can be `:left`, `:center`, or `:right`. -* `rotation` to indicate how the text is to be rotated -* `textsize` the font point size for the text -* `font` to indicate the desired font - - - -#### Line attributes - -In a previous example, we added the argument `color=:blue` to the `lines!` call. This was to set an attribute for the line being drawn. Lines have other attributes that allow different ones to be distinguished, as above where colors indicate the different graphs. - -Other attributes can be seen from the help page for `lines`, and include: - -* `color` set with a symbol, as above, or a string -* `label` a label for the line to display in a legend -* `linestyle` available styles are set by a symbol, one of `:dash`, `:dot`, `:dashdot`, or `:dashdotdot`. 
-* `linewidth` width of line -* `transparency` the `alpha` value, a number between ``0`` and ``1``, smaller numbers for more transparent. - - -#### Simple legends - -A simple legend displaying labels given to each curve can be produced by `axislegend`. For example: - -```julia -xs = 0..pi -lines(xs, x -> sin(x^2), label="sin(x^2)") -lines!(xs, x -> sin(x)^2, label = "sin(x)^2") -axislegend() - -current_figure() -``` - -Later, we will see how to control the placement of a legend within a figure. - -#### Titles, axis labels, axis ticks - -The basic plots we have seen are of type `FigureAxisPlot`. The "axis" part controls attributes of the plot such as titles, labels, tick positions, etc. These values can be set in different manners. On construction we can pass values to a named argument `axis` using a named tuple. - -For example: - -```julia -xs = 0..2pi -lines(xs, sin; - axis=(title="Plot of sin(x)", xlabel="x", ylabel="sin(x)") - ) -``` - -To access the `axis` element of a plot **after** the plot is constructed, values can be assigned to the `axis` property of the `FigureAxisPlot` object. For example: - -```julia -xs = 0..2pi -p = lines(xs, sin; - axis=(title="Plot of sin(x)", xlabel="x", ylabel="sin(x)") - ) -p.axis.xticks = MultiplesTicks(5, pi, "π") # label 5 times using `pi` - -current_figure() -``` - -The ticks are most easily set as a collection of values. Above, the `MultiplesTicks` function was used to label with multiples of ``\pi``. - -Later we will discuss how `Makie` allows for subsequent modification of several parts of the plot (not just the ticks) including the data. - -#### Figure resolution, ``x`` and ``y`` limits - -As just mentioned, the basic plots we have seen are of type `FigureAxisPlot`. The "figure" part can be used to adjust the background color or the resolution. 
As with attributes for the axis, these too can be passed to a simple constructor:

```julia
lines(xs, sin;
    axis=(title="Plot of sin(x)", xlabel="x", ylabel="sin(x)"),
    figure=(;resolution=(300, 300))
    )
```

The `;` in the tuple passed to `figure` is one way to create a *named* tuple with a single element. Alternatively, `(resolution=(300,300), )` -- with a trailing comma -- could have been used.


To set the limits of the graph there are shorthand functions `xlims!`, `ylims!`, and `zlims!`. This might prove useful if vertical asymptotes are encountered, as in this example:

```julia
f(x) = 1/x
a,b = -1, 1
xs = range(-1, 1, length=200)
lines(xs, f.(xs))
ylims!(-10, 10)

current_figure()
```

This still leaves an artifact, as the vertical asymptote at ``0`` has different limiting values from the left and the right.


### Plots of parametric functions

A space curve is a plot of a function ``f:R \rightarrow R^2`` or ``f:R \rightarrow R^3``.

To construct a curve from a set of points, we have a similar pattern in both ``2`` and ``3`` dimensions:

```julia
r(t) = [sin(2t), cos(3t)]
ts = range(0, 2pi, length=200)
pts = Point2.(r.(ts)) # or (Point2∘r).(ts)
lines(pts)
```

Or

```julia
r(t) = [sin(2t), cos(3t), t]
ts = range(0, 2pi, length=200)
pts = Point3.(r.(ts))
lines(pts)
```

Alternatively, vectors of the ``x``, ``y``, and ``z`` components can be produced and then plotted using the pattern `lines(xs, ys)` or `lines(xs, ys, zs)`. For example, using `unzip`, as above, we might have done the prior example with:

```julia
xs, ys, zs = unzip(r.(ts))
lines(xs, ys, zs)
```

#### Aspect ratio

A simple plot of a parametrically defined circle will show an ellipse, as the aspect ratio of the ``x`` and ``y`` axes is not ``1``. To enforce an equal aspect ratio, we can pass a value of `aspect=1` to the underlying "Axis" object.
For example: - -```julia -ts = range(0, 2pi, length=100) -xs, ys = sin.(ts), cos.(ts) -lines(xs, ys; axis=(; aspect = 1)) -``` - -#### Tangent vectors (`arrows`) - -A tangent vector along a curve can be drawn quite easily using the `arrows` function. There are different interfaces for `arrows`, but we show the one which uses a vector of positions and a vector of "vectors". For the latter, we utilize the `derivative` function from `ForwardDiff`: - -```julia -r(t) = [sin(t), cos(t)] # vector, not tuple -ts = range(0, 4pi, length=200) -lines(Point2.(r.(ts))) - -nts = 0:pi/4:2pi -us = r.(nts) -dus = ForwardDiff.derivative.(r, nts) - -arrows!(Point2.(us), Point2.(dus)) - -current_figure() -``` - - -In 3 dimensions the differences are minor: - -```julia -r(t) = [sin(t), cos(t), t] # vector, not tuple -ts = range(0, 4pi, length=200) -lines(Point3.(r.(ts))) - -nts = 0:pi/2:(4pi-pi/2) -us = r.(nts) -dus = ForwardDiff.derivative.(r, nts) - -arrows!(Point3.(us), Point3.(dus)) - -current_figure() -``` - - -#### Arrow attributes - -Attributes for `arrows` include - -* `arrowsize` to adjust the size -* `lengthscale` to scale the size -* `arrowcolor` to set the color -* `arrowhead` to adjust the head -* `arrowtail` to adjust the tail - - - -## Surfaces - -Plots of surfaces in ``3`` dimensions are useful to help understand the behavior of multivariate functions. - -#### Surfaces defined through ``z=f(x,y)`` - -The "`peaks`" function defined below has a few prominent peaks: - -```julia -function peaks(x, y) - p = 3*(1-x)^2*exp(-x^2 - (y+1)^2) - p -= 10(x/5-x^3-y^5)*exp(-x^2-y^2) - p -= 1/3*exp(-(x+1)^2-y^2) - p -end -``` - -Here we see how `peaks` can be visualized over the region ``[-5,5]\times[-5,5]``: - -```julia -xs = ys = range(-5, 5, length=25) -surface(xs, ys, peaks) -``` - -The calling pattern `surface(xs, ys, f)` implies a rectangular grid over the ``x``-``y`` plane defined by `xs` and `ys` with ``z`` values given by ``f(x,y)``. 
- - -Alternatively a "matrix" of ``z`` values can be specified. For a function `f`, this is conveniently generated by the pattern `f.(xs, ys')`, the `'` being important to get a matrix of all ``x``-``y`` pairs through `Julia`'s broadcasting syntax. - -```julia -zs = peaks.(xs, ys') -surface(xs, ys, zs) -``` - - -To see how this graph is constructed, the points ``(x,y,f(x,y))`` are plotted over the grid and displayed. - -Here we downsample to illustrate: - -```julia -xs = ys = range(-5, 5, length=5) -pts = [Point3(x, y, peaks(x,y)) for x in xs for y in ys] -scatter(pts, markersize=25) -``` - - -These points are then connected. The `wireframe` function illustrates -just the frame: - -```julia -wireframe(xs, ys, peaks.(xs, ys'); linewidth=5) -``` - -The `surface` call triangulates the frame and fills in the shading: - -```julia -surface!(xs, ys, peaks.(xs, ys')) -current_figure() -``` - - - - -#### Parametrically defined surfaces - -A surface may be parametrically defined through a function ``r(u,v) = (x(u,v), y(u,v), z(u,v))``. For example, the surface generated by ``z=f(x,y)`` is of the form with ``r(u,v) = (u,v,f(u,v))``. - -The `surface` function and the `wireframe` function can be used to display such surfaces. In previous usages, the `x` and `y` values were vectors from which a 2-dimensional grid is formed. For parametric surfaces, a grid for the `x` and `y` values must be generated. This function will do so: - -```julia -function parametric_grid(us, vs, r) - n,m = length(us), length(vs) - xs, ys, zs = zeros(n,m), zeros(n,m), zeros(n,m) - for (i, uᵢ) in pairs(us) - for (j, vⱼ) in pairs(vs) - x,y,z = r(uᵢ, vⱼ) - xs[i,j] = x - ys[i,j] = y - zs[i,j] = z - end - end - (xs, ys, zs) -end -``` - -With the data suitably massaged, we can directly plot either a `surface` or `wireframe` plot. 
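For instance, here is a sketch using `parametric_grid` to render a cylinder about the ``z`` axis (the radius and height are arbitrary choices for illustration):

```julia
r(u, v) = [cos(u), sin(u), v]  # a cylinder of radius 1
us = range(0, 2pi, length=25)
vs = range(0, 3, length=25)
xs, ys, zs = parametric_grid(us, vs, r)

surface(xs, ys, zs)
wireframe!(xs, ys, zs)
current_figure()
```

The pattern is always the same: choose grids for the two parameters, form the three matrices, then call `surface` or `wireframe`.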
----

As an aside, the above can be done more compactly with nested list comprehensions:

```
xs, ys, zs = [[pt[i] for pt in r.(us, vs')] for i in 1:3]
```

Or using the `unzip` function directly after broadcasting:

```
xs, ys, zs = unzip(r.(us, vs'))
```

----

For example, a sphere can be parameterized by ``r(u,v) = (\sin(u)\cos(v), \sin(u)\sin(v), \cos(u))`` and visualized through:

```julia
r(u,v) = [sin(u)*cos(v), sin(u)*sin(v), cos(u)]
us = range(0, pi, length=25)
vs = range(0, pi/2, length=25)
xs, ys, zs = parametric_grid(us, vs, r)

surface(xs, ys, zs)
wireframe!(xs, ys, zs)
current_figure()
```

A surface of revolution for ``g(u)`` revolved about the ``z`` axis can be visualized through:

```julia
g(u) = u^2 * exp(-u)
r(u,v) = (g(u)*sin(v), g(u)*cos(v), u)
us = range(0, 3, length=10)
vs = range(0, 2pi, length=10)
xs, ys, zs = parametric_grid(us, vs, r)

surface(xs, ys, zs)
wireframe!(xs, ys, zs)
current_figure()
```

A torus with big radius ``2`` and inner radius ``1/2`` can be visualized as follows:

```julia
r1, r2 = 2, 1/2
r(u,v) = ((r1 + r2*cos(v))*cos(u), (r1 + r2*cos(v))*sin(u), r2*sin(v))
us = vs = range(0, 2pi, length=25)
xs, ys, zs = parametric_grid(us, vs, r)

surface(xs, ys, zs)
wireframe!(xs, ys, zs)
current_figure()
```

A Möbius strip can be produced with:

```julia
ws = range(-1/4, 1/4, length=8)
thetas = range(0, 2pi, length=30)
r(w, θ) = ((1+w*cos(θ/2))*cos(θ), (1+w*cos(θ/2))*sin(θ), w*sin(θ/2))
xs, ys, zs = parametric_grid(ws, thetas, r)

surface(xs, ys, zs)
wireframe!(xs, ys, zs)
current_figure()
```

## Contour plots (`contour`, `contourf`, `heatmap`)

For a function ``z = f(x,y)``, an alternative to a surface plot is a contour plot. That is, for different values of ``c``, the level curves ``f(x,y)=c`` are drawn.

For a function ``f(x,y)``, the syntax for generating a contour plot follows that for `surface`.
For example, using the `peaks` function, previously defined, a contour plot over the region ``[-5,5]\times[-5,5]`` is generated through:

```julia
xs = ys = range(-5, 5, length=100)
contour(xs, ys, peaks)
```

The default of ``5`` levels can be adjusted using the `levels` keyword:

```julia
contour(xs, ys, peaks; levels = 20)
```

The `levels` argument can also specify precisely what levels are to be drawn.

The contour graph makes identification of peaks and valleys easy, as they appear as the limits of patterns of nested contour lines.

A *filled* contour plot is produced by `contourf`:

```julia
contourf(xs, ys, peaks)
```

A related, but alternative, visualization using color to represent magnitude is a heatmap, produced by the `heatmap` function. The calling syntax is similar to `contour` and `surface`:

```julia
heatmap(xs, ys, peaks)
```

This graph shows peaks and valleys through "hotspots" on the graph.


The `MakieGallery` package includes an example of a surface plot with both a wireframe and a 2D contour graph added. It is replicated here using the `peaks` function scaled by ``5``.

The function and domain to plot are described by:

```julia
xs = ys = range(-5, 5, length=51)
zs = peaks.(xs, ys') / 5;
```

The `zs` were generated beforehand, as `wireframe` does not provide an interface for passing a function.

The `surface` and `wireframe` are produced as follows. Here we manually create the figure and axis object so that we can set the viewing angle through the `elevation` argument to the axis object:

```julia
fig = Figure()
ax3 = Axis3(fig[1,1];
            elevation=pi/9, azimuth=pi/16)
surface!(ax3, xs, ys, zs)
wireframe!(ax3, xs, ys, zs;
           overdraw = true, transparency = true,
           color = (:black, 0.1))
current_figure()
```

To add the contour, a simple call via `contour!(scene, xs, ys, zs)` will place the contour at the ``z=0`` level, which will make it hard to read.
Rather, placing it at the "bottom" of the figure is desirable. To do so, the minimum value is identified (and rounded down) and the argument `transformation = (:xy, zmin)` is passed to `contour!`:

```julia
ezs = extrema(zs)
zmin, zmax = floor(first(ezs)), ceil(last(ezs))
contour!(ax3, xs, ys, zs;
         levels = 15, linewidth = 2,
         transformation = (:xy, zmin))
zlims!(zmin, zmax)
current_figure()
```

The `transformation` plot attribute sets the "plane" (one of `:xy`, `:yz`, or `:xz`) at a location, in this example `zmin`.


The manual construction of a figure and an axis object will be further discussed later.


### Three dimensional contour plots

The `contour` function can also plot ``3``-dimensional contour plots. Concentric spheres, the contours of ``x^2 + y^2 + z^2 = c`` for ``c > 0``, are presented by the following:

```julia
f(x,y,z) = x^2 + y^2 + z^2
xs = ys = zs = range(-3, 3, length=100)

contour(xs, ys, zs, f)
```

### Implicitly defined curves and surfaces

Suppose ``f`` is a scalar-valued function. If `f` takes two variables for its input, then the equation ``f(x,y) = 0`` implicitly defines ``y`` as a function of ``x``; the solutions can be visualized *locally* with a curve. If ``f`` takes three variables for its input, then the equation ``f(x,y,z)=0`` implicitly defines ``z`` as a function of ``x`` and ``y``; the solutions can be visualized *locally* with a surface.

#### Implicitly defined curves

The graph of an equation is the collection of all ``(x,y)`` values satisfying the equation. This is more general than the graph of a function, which can be viewed as the graph of the equation ``y=f(x)``. An equation in ``x``-``y`` can be graphed if the set of solutions to a related equation ``f(x,y)=0`` can be identified, as one can move all terms to one side of an equation and define ``f`` as the rule given by that side.
The implicit function theorem ensures that under some conditions, *locally* near a point ``(x, y)``, the value ``y`` can be represented as a function of ``x``. So, the graph of the equation ``f(x,y)=0`` can be produced by stitching together these local function representations.

The contour graph can produce these graphs by setting the `levels` argument to `[0]`.

```julia
f(x,y) = x^3 + x^2 + x + 1 - x*y # solve x^3 + x^2 + x + 1 = x*y
xs = range(-5, 5, length=100)
ys = range(-10, 10, length=100)

contour(xs, ys, f.(xs, ys'); levels=[0])
```

The `ImplicitPlots.jl` package uses the `Contour` package along with a `Plots` recipe to plot such graphs. Here we see how to use `Makie` in a similar manner:

```julia; eval=false
import Contour

function implicit_plot(xs, ys, f; kwargs...)
    fig = Figure()
    ax = Axis(fig[1,1])
    implicit_plot!(ax, xs, ys, f; kwargs...)
    fig
end

function implicit_plot!(ax, xs, ys, f; kwargs...)
    z = [f(x, y) for x in xs, y in ys]
    cs = Contour.contour(collect(xs), collect(ys), z, 0.0)
    ls = Contour.lines(cs)

    isempty(ls) && error("no contour lines found over the given region")

    for l ∈ ls
        us, vs = Contour.coordinates(l)
        lines!(ax, us, vs; kwargs...)
    end

end
```

#### Implicitly defined surfaces, ``F(x,y,z)=0``

To plot the equation ``F(x,y,z)=0``, for ``F`` a scalar-valued function, again the implicit function theorem says that, under conditions, near any solution ``(x,y,z)``, ``z`` can be represented as a function of ``x`` and ``y``, so the graph will look like surfaces stitched together. The `Implicit3DPlotting` package takes an approach like `ImplicitPlots` to represent these surfaces. It replaces the `Contour` package computation with a ``3``-dimensional alternative provided through the `Meshing` and `GeometryBasics` packages.
- - -The `Implicit3DPlotting` package needs some maintenance, so we borrow the main functionality and wrap it into a function: - -```julia -import Meshing -import GeometryBasics - -function make_mesh(xlims, ylims, zlims, f, - M = Meshing.MarchingCubes(); # or Meshing.MarchingTetrahedra() - samples=(35, 35, 35), - ) - - lims = extrema.((xlims, ylims, zlims)) - Δ = xs -> last(xs) - first(xs) - xs = Vec(first.(lims)) - Δxs = Vec(Δ.(lims)) - - GeometryBasics.Mesh(f, Rect(xs, Δxs), M; samples = samples) -end -``` - -The `make_mesh` function creates a mesh that can be visualized with the `wireframe` or `mesh` plotting functions. - - -This example, plotting an implicitly defined sphere, comes from the -documentation of `Implicit3DPlotting`. The `f` in `make_mesh` is a -scalar-valued function of a vector: - -```julia -f(x) = sum(x.^2) - 1 -xs = ys = zs = (-5, 5) -m = make_mesh(xs, ys, zs, f) -wireframe(m) -``` - -Here we visualize an intersection of a sphere with another figure: - -```julia -r₂(x) = sum(x.^2) - 5/4 # a sphere -r₄(x) = sum(x.^4) - 1 -xs = ys = zs = -2:2 -m2,m4 = make_mesh(xs, ys, zs, r₂), make_mesh(xs, ys, zs, r₄) - -wireframe(m4, color=:yellow) -wireframe!(m2, color=:red) -current_figure() -``` - - -This example comes from [Wikipedia](https://en.wikipedia.org/wiki/Implicit_surface) showing an implicit surface of genus ``2``: - -```julia -f(x,y,z) = 2y*(y^2 -3x^2)*(1-z^2) + (x^2 +y^2)^2 - (9z^2-1)*(1-z^2) -zs = ys = xs = range(-5/2, 5/2, length=100) -m = make_mesh(xs, ys, zs, x -> f(x...)) -wireframe(m) -``` - - -(This figure does not render well through `contour(xs, ys, zs, f, levels=[0])`, as the hole is not shown.) - - -For one last example from Wikipedia, we have the Cassini oval which "can be defined as the point set for which the *product* of the distances to ``n`` given points is constant." 
That is:

```julia
function cassini(λ, ps = ((1,0,0), (-1, 0, 0)))
    n = length(ps)
    x -> prod(norm(x .- p) for p ∈ ps) - λ^n
end
xs = ys = zs = range(-2, 2, length=100)
m = make_mesh(xs, ys, zs, cassini(1.05))
wireframe(m)
```

## Vector fields. Visualizations of ``f:R^2 \rightarrow R^2``

The vector field ``f(x,y) = \langle y, -x \rangle`` can be visualized as a set of vectors, ``f(x,y)``, positioned at a grid. These arrows can be visualized with the `arrows` function. The `arrows` function is passed a vector of points for the anchors and a vector of points representing the vectors.

We can generate these on a regular grid through:

```julia
f(x, y) = [y, -x]
xs = ys = -5:5
pts = vec(Point2.(xs, ys'))
dus = vec(Point2.(f.(xs, ys')));
first(pts), first(dus) # show an example
```

Broadcasting over `(xs, ys')` ensures each pair of possible values is encountered. The `vec` call reshapes an array into a vector.

Calling `arrows` on the prepared data produces the graphic:

```julia
arrows(pts, dus)
```

At first glance the graphic is confusing, and the rotational pattern is hard to pick out. This is due to the lengths of the vectors growing as the ``(x,y)`` values get farther from the origin. Plotting the *normalized* values (each will have length ``1``) can be done easily using `norm` (which is found in the standard `LinearAlgebra` library):

```julia
dvs = dus ./ norm.(dus)
arrows(pts, dvs)
```

The rotational pattern becomes much clearer now.

The `streamplot` function also illustrates this phenomenon. It implements an "algorithm [that] puts an arrow somewhere and extends the streamline in both directions from there. Then, it chooses a new position (from the remaining ones), repeating the exercise until the streamline gets blocked, at which point the process repeats from a new starting point."
The `streamplot` function expects a `Point`, not a pair of values, so we adjust `f` slightly and call the function using the pattern `streamplot(g, xs, ys)`:

```julia
f(x, y) = [y, -x]
g(xs) = Point2(f(xs...))

streamplot(g, -5..5, -5..5)
```

(We used interval notation to set the viewing range; a range could also be used.)

!!! note
    The calling pattern of `streamplot` is different from that of other functions, such as `surface`, in that the function comes first.


## Layoutables and Observables

### Layoutables

`Makie` makes it really easy to piece together figures from individual plots. To illustrate, we create a graphic consisting of a plot of a function, its derivative, and its second derivative. In our graphic, we also leave space for a label.

!!! note
    The Layout [Tutorial](https://makie.juliaplots.org/stable/tutorials/layout-tutorial/) has *much* more detail on this subject.

The basic plotting commands, like `lines`, return a `FigureAxisPlot` object. For laying out our own graphic, we manage the figure and axes manually. The commands below create a figure, then assign axes to portions of the figure:

```julia
F = Figure()
af = F[2,1:2] = Axis(F)
afp = F[3,1:end] = Axis(F)
afpp = F[4,:] = Axis(F)
```

The axes are named `af`, `afp`, and `afpp`, as they will hold the respective graphs. The key here is the use of matrix notation to lay out the graphic in a grid. The first axis occupies row 2 and columns 1 through 2; the second, row 3 and again all columns; the third, row 4 and all columns.

In this figure, we want the ``x``-axis for each of the three graphics to be linked. This command ensures that:

```julia
linkxaxes!(af, afp, afpp);
```

By linking axes, if one is updated, say through `xlims!`, the others will be as well.

We now plot our functions.
The key here is that the mutating form of `lines!` takes the axis object to mutate as its first argument:

```julia
f(x) = 8x^4 - 8x^2 + 1
fp(x) = 32x^3 - 16x
fpp(x) = 96x^2 - 16

xs = -1..1
lines!(af, xs, f)
lines!(afp, xs, fp)
lines!(afp, xs, zero, color=:blue)
lines!(afpp, xs, fpp)
lines!(afpp, xs, zero, color=:blue);
```

We can give title information to each axis:

```julia
af.title = "f"
afp.title = "fp"
afpp.title = "fpp";
```

Next, we add a label in the first row but, for illustration purposes, only use the first column.

```julia
Label(F[1,1], """
Plots of f and its first and second derivatives.
When the first derivative is zero, the function
f has relative extrema. When the second derivative
is zero, the function f has an inflection point.
""");
```

Finally, we display the figure:

```julia
F
```

### Observables

The basic components of a plot in `Makie` can be updated [interactively](https://makie.juliaplots.org/stable/documentation/nodes/index.html#observables_interaction). `Makie` uses the `Observables` package, which allows complicated interactions to be modeled quite naturally. In the following we give a simple example.

In Makie, an `Observable` is a structure that allows its value to be updated, similar to an array. When changed, observables can trigger an event. Observables can rely on other observables, so events can be cascaded.

This simple example shows how an observable `h` can be used to create a collection of points representing a secant line. The figure shows the value for `h=3/2`.

```julia
f(x) = sqrt(x)
c = 1
xs = 0..3
h = Observable(3/2)

points = lift(h) do h
    xs = [0,c,c+h,3]
    tl = x -> f(c) + (f(c+h)-f(c))/h * (x-c)
    [Point2(x, tl(x)) for x ∈ xs]
end

lines(xs, f)
lines!(points)
current_figure()
```

We can update the value of `h` using `setindex!` notation (square brackets).
For example, to see that the secant line is a good approximation to the tangent line as ``h \rightarrow 0``, we can set `h` to be `1/4` and replot:

```julia
h[] = 1/4
current_figure()
```

The line `h[] = 1/4` updated `h`, which then updated `points` (as `points` is lifted from `h`), which updated the graphic. (In these notes, we replot to see the change, but in an interactive session, the current *displayed* figure would be updated; no replotting would be necessary.)


Finally, this example shows how to add a slider to adjust the value of `h` with a mouse. The slider object is positioned along with a label using the grid reference, as before.


```julia
f(x) = sqrt(x)
c = 1
xs = 0..3

F = Figure()
ax = Axis(F[1,1:2])
h = Slider(F[2,2], range = 0.01:0.01:1.5, startvalue = 1.5)
Label(F[2,1], "Adjust slider to change `h`";
      justification = :left)

points = lift(h.value) do h
    xs = [0,c,c+h,3]
    tl = x -> f(c) + (f(c+h)-f(c))/h * (x-c)
    [Point2(x, tl(x)) for x ∈ xs]
end

lines!(ax, xs, f)
lines!(ax, points)
current_figure()
```

The slider value is "lifted" by its `value` component, as shown. Otherwise, the above is fairly similar to just using an observable for `h`.

# JavaScript based plotting libraries

!!! alert "Not working with quarto"
    Currently, the plots generated here are not rendering within quarto.
This section uses this add-on package:

```julia
using PlotlyLight
```

To avoid a dependence on the `CalculusWithJulia` package, we load two utility packages:

```julia
using PlotUtils
using SplitApplyCombine
```

----

`Julia` has different interfaces to a few JavaScript plotting libraries, notably the [vega](https://vega.github.io/vega/) and [vega-lite](https://vega.github.io/vega-lite/) libraries through the [VegaLite.jl](https://github.com/queryverse/VegaLite.jl) package, and [plotly](https://plotly.com/javascript/) through several interfaces: `Plots.jl`, `PlotlyJS.jl`, and `PlotlyLight.jl`. These all make web-based graphics, for display through a web browser.

The `Plots.jl` interface is a backend for the familiar `Plots` package, making the calling syntax familiar, as is used throughout these notes. The `plotly()` command, from `Plots`, switches to this backend.

The `PlotlyJS.jl` interface offers direct translation from `Julia` structures to the underlying `JSON` structures needed by plotly, and has mechanisms to call back into `Julia` from `JavaScript`. This allows complicated interfaces to be produced.

Here we discuss `PlotlyLight`, which conveniently provides the translation from `Julia` structures to the `JSON` structures needed in a light-weight package that plots quickly, without the delays due to compilation of the more complicated interfaces. Minor modifications would be needed to adjust the examples to work with `PlotlyJS` or `PlotlyBase`. The documentation for the `JavaScript` [library](https://plotly.com/javascript/) provides numerous examples which can easily be translated. The [one-page-reference](https://plotly.com/javascript/reference/) gives specific details, and is quoted from below, at times.


This discussion covers the basics of graphing for calculus purposes. It does not cover, for example, the faceting common in statistical usages, or the chart types common in business and statistics uses.
The `plotly` library is much more extensive than what is reviewed below.

## Julia dictionaries to JSON

`PlotlyLight` uses the `JavaScript` interface for the `plotly` libraries. Unlike more developed interfaces, like the one for `Python`, `PlotlyLight` only manages the translation from `Julia` structures to `JavaScript` structures and the display of the results.

The key to the translation is the mapping from `Julia`'s dictionaries to the nested `JSON` structures needed by the `JavaScript` library.

For example, an introductory [example](https://plotly.com/javascript/line-and-scatter/) for a scatter plot includes this `JSON` structure:

```julia; eval=false
var trace1 = {
    x: [1, 2, 3, 4],
    y: [10, 15, 13, 17],
    mode: 'markers',
    type: 'scatter'
};
```

The `{}` creates an object (a container of key-value pairs), the `[]` an array (or vector, as with `Julia`), and names like `x:` are the keys. The above is simply translated via:

```julia;
Config(x = [1,2,3,4],
       y = [10, 15, 13, 17],
       mode = "markers",
       type = "scatter"
       )
```

The `Config` constructor (from the `EasyConfig` package loaded with `PlotlyLight`) is an interface for a dictionary whose keys are symbols, which are produced by the named arguments passed to `Config`. By nesting `Config` statements, nested `JavaScript` structures can be built up. As well, these can be built on the fly using `.` notation, as in:

```julia; hold=true
cfg = Config()
cfg.key1.key2.key3 = "value"
cfg
```

Producing a figure with `PlotlyLight` is then fairly straightforward: data and, optionally, a layout are created using `Config`, then passed along to the `Plot` command, producing a `Plot` object which has `display` methods defined for it. This will be illustrated through the examples.
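The translation can also be checked directly by serializing a `Config` value; this sketch assumes the `JSON3` package that `PlotlyLight` uses internally is accessible as `PlotlyLight.JSON3`:

```julia; hold=true
cfg = Config(x = [1, 2, 3, 4],
             mode = "markers",
             marker = Config(size = 12, color = "blue"))

# serialize to the JSON string handed to the JavaScript library;
# nested `Config` values become nested JSON objects
PlotlyLight.JSON3.write(cfg)
```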
## Scatter plot

A basic scatter plot of points ``(x,y)`` is created as follows:

```julia; hold=true
xs = 1:5
ys = rand(5)
data = Config(x = xs,
              y = ys,
              type="scatter",
              mode="markers"
              )
Plot(data)
```

The symbols `x` and `y` (and later `z`) specify the data to `plotly`. Here the `mode` is specified to show markers.

The `type` key specifies the chart or trace type. The `mode` specification sets the drawing mode for the trace. Above it is "markers". It can be any combination of "lines", "markers", or "text", joined with a "+" if more than one is desired.

## Line plot

A line plot is very similar, save for a different `mode` specification:

```julia; hold=true
xs = 1:5
ys = rand(5)
data = Config(x = xs,
              y = ys,
              type="scatter",
              mode="lines"
              )
Plot(data)
```

The difference is solely the specification of the `mode` value: for a line plot it is "lines," for a scatter plot it is "markers." The `mode` "lines+markers" will plot both. The default for the "scatter" types is to use "lines+markers" for small data sets, and "lines" for others, so for this example, `mode` could be left off.


### Nothing

The line graph plays connect-the-dots with the points specified by paired `x` and `y` values. *Typically*, when an `x` value is `NaN`, that "dot" (or point) is skipped. However, `NaN` doesn't pass through the JSON conversion -- `nothing` can be used instead.

```julia; hold=true
data = Config(
    x=[0,1,nothing,3,4,5],
    y = [0,1,2,3,4,5],
    type="scatter", mode="markers+lines")
Plot(data)
```

## Multiple plots

More than one graph or layer can appear on a plot. The `data` argument can be a vector of `Config` values, each describing a plot.
For example, here we make a scatter plot and a line plot: - -```julia; hold=true -data = [Config(x = 1:5, - y = rand(5), - type = "scatter", - mode = "markers", - name = "scatter plot"), - Config(x = 1:5, - y = rand(5), - type = "scatter", - mode = "lines", - name = "line plot") - ] -Plot(data) -``` - -The `name` argument adjusts the name in the legend referencing the plot. This is produced by default. - - -### Adding a layer - -In `PlotlyLight`, the `Plot` object has a field `data` for storing a vector of configurations, as above. After a plot is made, this field can have values pushed onto it and the corresponding layers will be rendered when the plot is redisplayed. - -For example, here we plot the graphs of both the ``\sin(x)`` and ``\cos(x)`` over ``[0,2\pi]``. We used the utility `PlotUtils.adapted_grid` to select the points to use for the graph. - -```julia; hold=true -a, b = 0, 2pi - -xs, ys = PlotUtils.adapted_grid(sin, (a,b)) -p = Plot(Config(x=xs, y=ys, name="sin")) - -xs, ys = PlotUtils.adapted_grid(cos, (a,b)) -push!(p.data, Config(x=xs, y=ys, name="cos")) - -p # to display the plot -``` - -The values for `a` and `b` are used to generate the ``x``- and ``y``-values. These can also be gathered from the existing plot object. Here is one way, where for each trace with an `x` key, the extrema are consulted to update a list of left and right ranges. - -```julia; hold=true -xs, ys = PlotUtils.adapted_grid(x -> x^5 - x - 1, (0, 2)) # answer is (0,2) -p = Plot([Config(x=xs, y=ys, name="Polynomial"), - Config(x=xs, y=0 .* ys, name="x-axis", mode="lines", line=Config(width=5))] - ) -ds = filter(d -> !isnothing(get(d, :x, nothing)), p.data) -a=reduce(min, [minimum(d.x) for d ∈ ds]; init=Inf) -b=reduce(max, [maximum(d.x) for d ∈ ds]; init=-Inf) -(a, b) -``` - - - -## Interactivity - -`JavaScript` allows interaction with a plot as it is presented within a browser. (Not the `Julia` process which produced the data or the plot. 
For that interaction, `PlotlyJS` may be used.) The basic *default* features are: - -* The data producing a graphic are displayed on hover using flags. -* The legend may be clicked to toggle whether the corresponding graph is displayed. -* The viewing region can be narrowed using the mouse for selection. -* The toolbar has several features for panning and zooming, as well as adjusting the information shown on hover. - -Later we will see that ``3``-dimensional surfaces can be rotated interactively. - - -## Plot attributes - -Attributes of the markers and lines may be adjusted when the data configuration is specified. A selection is shown below. Consult the reference for the extensive list. - -### Marker attributes - -A marker's attributes can be adjusted by values passed to the `marker` key. Labels for each marker can be assigned through a `text` key and adding `text` to the `mode` key. For example: - -```julia; hold=true -data = Config(x = 1:5, - y = rand(5), - mode="markers+text", - type="scatter", - name="scatter plot", - text = ["marker $i" for i in 1:5], - textposition = "top center", - marker = Config(size=12, color=:blue) - ) -Plot(data) -``` - -The `text` mode specification is necessary to have text be displayed -on the chart, and not just appear on hover. The `size` and `color` -attributes are recycled; they can be specified using a vector for -per-marker styling. Here the symbol `:blue` is used to specify a -color, which could also be a name, such as `"blue"`. - -#### RGB Colors - -The `ColorTypes` package is the standard `Julia` package providing an -`RGB` type (among others) for specifying red-green-blue colors. To -make this work with `Config` and `JSON3` requires some type-piracy -(modifying `Base.string` for the `RGB` type) to get, say, `RGB(0.5, -0.5, 0.5)` to output as `"rgb(0.5, 0.5, 0.5)"`. (RGB values in -JavaScript are integers between ``0`` and ``255`` or floating point -values between ``0`` and ``1``.) 
A string with this content can be -specified. Otherwise, something like the following can be used to -avoid the type piracy: - -```julia; -struct rgb - r - g - b -end -PlotlyLight.JSON3.StructTypes.StructType(::Type{rgb}) = PlotlyLight.JSON3.StructTypes.StringType() -Base.string(x::rgb) = "rgb($(x.r), $(x.g), $(x.b))" -``` - - -With these defined, red-green-blue values can be used for colors. For example to give a range of colors, we might have: - -```julia; hold=true -cols = [rgb(i,i,i) for i in range(10, 245, length=5)] -sizes = [12, 16, 20, 24, 28] -data = Config(x = 1:5, - y = rand(5), - mode="markers+text", - type="scatter", - name="scatter plot", - text = ["marker $i" for i in 1:5], - textposition = "top center", - marker = Config(size=sizes, color=cols) - ) -Plot(data) -``` - -The `opacity` key can be used to control the transparency, with a value between ``0`` and ``1``. - -#### Marker symbols - -The `marker_symbol` key can be used to set a marker shape, with the basic values being: `circle`, `square`, `diamond`, `cross`, `x`, `triangle`, `pentagon`, `hexagram`, `star`, `diamond`, `hourglass`, `bowtie`, `asterisk`, `hash`, `y`, and `line`. Add `-open` or `-open-dot` modifies the basic shape. - -```julia; hold=true -markers = ["circle", "square", "diamond", "cross", "x", "triangle", "pentagon", - "hexagram", "star", "diamond", "hourglass", "bowtie", "asterisk", - "hash", "y", "line"] -n = length(markers) -data = [Config(x=1:n, y=1:n, mode="markers", - marker = Config(symbol=markers, size=10)), - Config(x=1:n, y=2 .+ (1:n), mode="markers", - marker = Config(symbol=markers .* "-open", size=10)), - Config(x=1:n, y=4 .+ (1:n), mode="markers", - marker = Config(symbol=markers .* "-open-dot", size=10)) - ] -Plot(data) -``` - - -### Line attributes - -The `line` key can be used to specify line attributes, such as `width` (pixel width), `color`, or `dash`. - -The `width` key specifies the line width in pixels. 
-
-The `color` key specifies the color of the line drawn.
-
-The `dash` key specifies the style for the drawn line. Values can be set by a string from "solid", "dot", "dash", "longdash", "dashdot", or "longdashdot", or by specifying a pattern in pixels, e.g. "5px,10px,2px,2px".
-
-The `shape` attribute determines how the points are connected. The default is `linear`, but other possibilities are `hv`, `vh`, `hvh`, `vhv`, and `spline` for various patterns of connectivity. The following example, from the plotly documentation, shows the differences:
-
-
-```julia; hold=true
-shapes = ["linear", "hv", "vh", "hvh", "vhv", "spline"]
-data = [Config(x = 1:5, y = 5*(i-1) .+ [1,3,2,3,1], mode="lines+markers", type="scatter",
-               name=shape,
-               line=Config(shape=shape)
-               ) for (i, shape) ∈ enumerate(shapes)]
-Plot(data)
-```
-
-### Text
-
-The text associated with each point can be drawn on the chart (when "text" is included in the `mode`) or shown on hover.
-
-
-The onscreen text is passed to the `text` attribute. The [`texttemplate`](https://plotly.com/javascript/reference/scatter/#scatter-texttemplate) key can be used to format the text; details are in the accompanying link.
-
-Similarly, the `hovertext` key specifies the text shown on hover, with [`hovertemplate`](https://plotly.com/javascript/reference/scatter/#scatter-hovertemplate) used to format the displayed text.
-
-
-
-
-### Filled regions
-
-The `fill` key for a chart of mode `line` specifies how the area
-around a chart should be colored, or filled. The specifications are
-declarative, with values in "none", "tozeroy", "tozerox", "tonexty",
-"tonextx", "toself", and "tonext". The value of "none" is the default, unless stacked traces are used.
-
-In the following, to highlight the difference between ``f(x) = \cos(x)`` and ``p(x) = 1 - x^2/2``, the area from ``f`` to the next ``y`` is declared; for ``p``, the area to ``0`` is declared.
-
-```julia; hold=true
-xs = range(-1, 1, 100)
-data = [
-    Config(
-        x=xs, y=cos.(xs),
-        fill = "tonexty",
-        fillcolor = "rgba(0,0,255,0.25)", # to get transparency
-        line = Config(color=:blue)
-    ),
-    Config(
-        x=xs, y=[1 - x^2/2 for x ∈ xs ],
-        fill = "tozeroy",
-        fillcolor = "rgba(255,0,0,0.25)", # to get transparency
-        line = Config(color=:red)
-    )
-]
-Plot(data)
-```
-
-The `toself` declaration is used below to fill in a polygon:
-
-```julia; hold=true
-data = Config(
-    x=[-1,1,1,-1,-1], y = [-1,1,-1,1,-1],
-    fill="toself",
-    type="scatter")
-Plot(data)
-```
-
-## Layout attributes
-
-The `title` key sets the main title; the `title` key in the `xaxis` configuration sets the ``x``-axis title (similarly for the ``y`` axis).
-
-
-The legend is shown when ``2`` or more charts are specified, by default. This can be adjusted with the `showlegend` key, as below. The legend shows the corresponding `name` for each chart.
-
-```julia; hold=true
-data = Config(x=1:5, y=rand(5), type="scatter", mode="markers", name="legend label")
-lyt = Config(title = "Main chart title",
-             xaxis = Config(title="x-axis label"),
-             yaxis = Config(title="y-axis label"),
-             showlegend=true
-             )
-Plot(data, lyt)
-```
-
-The `xaxis` and `yaxis` keys have many customizations. For example: `nticks` specifies the maximum number of ticks; `range` sets the range of the axis; `type` specifies the axis type from "linear", "log", "date", "category", or "multicategory"; and `visible` toggles whether the axis is displayed.
-
-The aspect ratio of the chart can be set to be equal through the `scaleanchor` key, which specifies another axis to take a value from.
For example, here is a parametric plot of a circle:
-
-```julia; hold=true
-ts = range(0, 2pi, length=100)
-data = Config(x = sin.(ts), y = cos.(ts), mode="lines", type="scatter")
-lyt = Config(title = "A circle",
-             xaxis = Config(title = "x"),
-             yaxis = Config(title = "y",
-                            scaleanchor = "x")
-             )
-Plot(data, lyt)
-```
-
-
-#### Annotations
-
-Text annotations may be specified as part of the layout object. Annotations may or may not show an arrow. Here is a simple example using a vector of annotations.
-
-```julia; hold=true
-data = Config(x = [0, 1], y = [0, 1], mode="markers", type="scatter")
-layout = Config(title = "Annotations",
-                xaxis = Config(title="x",
-                               range = (-0.5, 1.5)),
-                yaxis = Config(title="y",
-                               range = (-0.5, 1.5)),
-                annotations = [
-                    Config(x=0, y=0, text = "(0,0)"),
-                    Config(x=1, y=1.2, text = "(1,1)", showarrow=false)
-                ]
-                )
-Plot(data, layout)
-```
-
-
-The following example is a more complicated use of the elements previously described. It mimics an image from [Wikipedia](https://en.wikipedia.org/wiki/List_of_trigonometric_identities) for trigonometric identities. The use of ``\LaTeX`` does not seem to be supported through the `JavaScript` interface; unicode symbols are used instead. The `xanchor` and `yanchor` keys are used to position annotations away from the default. The `textangle` key is used to rotate text, as desired.
-
-```julia; hold=true
-alpha = pi/6
-beta = pi/5
-xₘ = cos(alpha)*cos(beta)
-yₘ = sin(alpha+beta)
-r₀ = 0.1
-
-data = [
-    Config(
-        x = [0,xₘ, xₘ, 0, 0],
-        y = [0, 0, yₘ, yₘ, 0],
-        type="scatter", mode="lines"
-    ),
-    Config(
-        x = [0, xₘ],
-        y = [0, sin(alpha)*cos(beta)],
-        fill = "tozeroy",
-        fillcolor = "rgba(100, 100, 100, 0.5)"
-    ),
-    Config(
-        x = [0, cos(alpha+beta), xₘ],
-        y = [0, yₘ, sin(alpha)*cos(beta)],
-        fill = "tonexty",
-        fillcolor = "rgba(200, 0, 100, 0.5)",
-    ),
-    Config(
-        x = [0, cos(alpha+beta)],
-        y = [0, yₘ],
-        line = Config(width=5, color=:black)
-    )
-]
-
-lyt = Config(
-    height=450,
-    showlegend=false,
-    xaxis=Config(visible=false),
-    yaxis = Config(visible=false, scaleanchor="x"),
-    annotations = [
-
-        Config(x = r₀*cos(alpha/2), y = r₀*sin(alpha/2),
-               text="α", showarrow=false),
-        Config(x = r₀*cos(alpha+beta/2), y = r₀*sin(alpha+beta/2),
-               text="β", showarrow=false),
-        Config(x = cos(alpha+beta) + r₀*cos(pi+(alpha+beta)/2),
-               y = yₘ + r₀*sin(pi+(alpha+beta)/2),
-               xanchor="center", yanchor="center",
-               text="α+β", showarrow=false),
-        Config(x = xₘ + r₀*cos(pi/2+alpha/2),
-               y = sin(alpha)*cos(beta) + r₀ * sin(pi/2 + alpha/2),
-               text="α", showarrow=false),
-        Config(x = 1/2 * cos(alpha+beta),
-               y = 1/2 * sin(alpha+beta),
-               text = "1"),
-        Config(x = xₘ/2*cos(alpha), y = xₘ/2*sin(alpha),
-               xanchor="center", yanchor="bottom",
-               text = "cos(β)",
-               textangle=-rad2deg(alpha),
-               showarrow=false),
-        Config(x = xₘ + sin(beta)/2*cos(pi/2 + alpha),
-               y = sin(alpha)*cos(beta) + sin(beta)/2*sin(pi/2 + alpha),
-               xanchor="center", yanchor="top",
-               text = "sin(β)",
-               textangle = rad2deg(pi/2-alpha),
-               showarrow=false),
-
-        Config(x = xₘ/2,
-               y = 0,
-               xanchor="center", yanchor="top",
-               text = "cos(α)⋅cos(β)", showarrow=false),
-        Config(x = 0,
-               y = yₘ/2,
-               xanchor="right", yanchor="center",
-               text = "sin(α+β)",
-               textangle=-90,
-               showarrow=false),
-        Config(x = cos(alpha+beta)/2,
-               y = yₘ,
-               xanchor="center", yanchor="bottom",
-               text = "cos(α+β)",
showarrow=false),
-        Config(x = cos(alpha+beta) + (xₘ - cos(alpha+beta))/2,
-               y = yₘ,
-               xanchor="center", yanchor="bottom",
-               text = "sin(α)⋅sin(β)", showarrow=false),
-        Config(x = xₘ, y=sin(alpha)*cos(beta) + (yₘ - sin(alpha)*cos(beta))/2,
-               xanchor="left", yanchor="center",
-               text = "cos(α)⋅sin(β)",
-               textangle=90,
-               showarrow=false),
-        Config(x = xₘ,
-               y = sin(alpha)*cos(beta)/2,
-               xanchor="left", yanchor="center",
-               text = "sin(α)⋅cos(β)",
-               textangle=90,
-               showarrow=false)
-    ]
-)
-
-Plot(data, lyt)
-```
-
-## Parameterized curves
-
-In ``2``-dimensions, the plotting of a parameterized curve is similar to that of plotting a function. In ``3``-dimensions, an extra ``z``-coordinate is included.
-
-To help, we define an `unzip` function as an interface to `SplitApplyCombine`'s `invert` function:
-
-```julia
-unzip(v) = SplitApplyCombine.invert(v)
-```
-
-Earlier we plotted a two-dimensional circle; here we plot the related helix.
-
-```julia; hold=true
-helix(t) = [cos(t), sin(t), t]
-
-ts = range(0, 4pi, length=200)
-
-xs, ys, zs = unzip(helix.(ts))
-
-data = Config(x=xs, y=ys, z=zs,
-              type = "scatter3d",  # <<- note the 3d
-              mode = "lines",
-              line=(width=2,
-                    color=:red)
-              )
-
-Plot(data)
-```
-
-The main difference is the chart type; as this is a ``3``-dimensional plot, "scatter3d" is used.
-
-### Quiver plots
-
-There is no `quiver` plot for `plotly` using JavaScript. In ``2``-dimensions a text-less annotation could be employed. In ``3``-dimensions, the following (from [stackoverflow.com](https://stackoverflow.com/questions/43164909/plotlypython-how-to-plot-arrows-in-3d)) is a possible workaround, where a line segment is drawn and capped with a small cone.
Somewhat opaquely, we use `NamedTuple` for an iterator to create the keys for the data below: - - -```julia; hold=true -helix(t) = [cos(t), sin(t), t] -helix′(t) = [-sin(t), cos(t), 1] -ts = range(0, 4pi, length=200) -xs, ys, zs = unzip(helix.(ts)) -helix_trace = Config(; NamedTuple(zip((:x,:y,:z), unzip(helix.(ts))))..., - type = "scatter3d", # <<- note the 3d - mode = "lines", - line=(width=2, - color=:red) - ) - -tss = pi/2:pi/2:7pi/2 -rs, r′s = helix.(tss), helix′.(tss) - -arrows = [ - Config(x = [x[1], x[1]+x′[1]], - y = [x[2], x[2]+x′[2]], - z = [x[3], x[3]+x′[3]], - mode="lines", type="scatter3d") - for (x, x′) ∈ zip(rs, r′s) -] - -tips = rs .+ r′s -lengths = 0.1 * r′s - -caps = Config(; - NamedTuple(zip([:x,:y,:z], unzip(tips)))..., - NamedTuple(zip([:u,:v,:w], unzip(lengths)))..., - type="cone", anchor="tail") - -data = vcat(helix_trace, arrows, caps) - -Plot(data) -``` - -If several arrows are to be drawn, it might be more efficient to pass multiple values in for the `x`, `y`, ... values. They expect a vector. In the above, we create ``1``-element vectors. - -## Contour plots - -A contour plot is created by the "contour" trace type. The data is prepared as a vector of vectors, not a matrix. The following has the interior vector corresponding to slices ranging over ``x`` for a fixed ``y``. 
With this, the construction is straightforward using a comprehension:
-
-```julia; hold=true
-f(x,y) = x^2 - 2y^2
-
-xs = range(0,2,length=25)
-ys = range(0,2, length=50)
-zs = [[f(x,y) for x in xs] for y in ys]
-
-data = Config(
-    x=xs, y=ys, z=zs,
-    type="contour"
-)
-
-Plot(data)
-```
-
-
-The same `zs` data can be achieved by broadcasting and then collecting as follows:
-
-```julia; hold=true
-f(x,y) = x^2 - 2y^2
-
-xs = range(0,2,length=25)
-ys = range(0,2, length=50)
-zs = collect(eachrow(f.(xs', ys)))
-
-data = Config(
-    x=xs, y=ys, z=zs,
-    type="contour"
-)
-
-Plot(data)
-```
-
-The use of just `f.(xs', ys)` or `f.(xs, ys')`, as would be done with other plotting packages, is not effective, as `JSON3` writes matrices as vectors (with linear indexing).
-
-## Surface plots
-
-The chart type "surface" allows surfaces in ``3`` dimensions to be plotted.
-
-### Surfaces defined by ``z = f(x,y)``
-
-Surfaces defined through a scalar-valued function are drawn quite naturally, save for needing to express the height data (``z`` axis) using a vector of vectors, and not a matrix.
-
-```julia; hold=true
-peaks(x,y) = 3 * (1-x)^2 * exp(-(x^2) - (y+1)^2) -
-    10*(x/5 - x^3 - y^5) * exp(-x^2-y^2) - 1/3 * exp(-(x+1)^2 - y^2)
-
-xs = range(-3,3, length=50)
-ys = range(-3,3, length=50)
-zs = [[peaks(x,y) for x in xs] for y in ys]
-
-data = Config(x=xs, y=ys, z=zs,
-              type="surface")
-
-Plot(data)
-```
-
-### Parametrically defined surfaces
-
-For parametrically defined surfaces, the ``x`` and ``y`` values also correspond to matrices. Here we see a pattern to plot a torus. The [`aspectmode`](https://plotly.com/javascript/reference/layout/scene/#layout-scene-aspectmode) instructs the scene's axes to be drawn in proportion with the axes' ranges.
-
-```julia; hold=true
-r, R = 1, 5
-X(theta,phi) = [(r*cos(theta)+R)*cos(phi), (r*cos(theta)+R)*sin(phi), r*sin(theta)]
-
-us = range(0, 2pi, length=25)
-vs = range(0, 2pi, length=25)
-
-xs = [[X(u,v)[1] for u in us] for v in vs]
-ys = [[X(u,v)[2] for u in us] for v in vs]
-zs = [[X(u,v)[3] for u in us] for v in vs]
-
-data = Config(
-    x = xs, y = ys, z = zs,
-    type="surface"
-)
-
-lyt = Config(scene=Config(aspectmode="data"))
-
-Plot(data, lyt)
-```
diff --git a/CwJ/alternatives/symbolics.jmd b/CwJ/alternatives/symbolics.jmd
deleted file mode 100644
index 5a43022..0000000
--- a/CwJ/alternatives/symbolics.jmd
+++ /dev/null
@@ -1,806 +0,0 @@
-# Symbolics.jl
-
-There are a few options in `Julia` for symbolic math, for example the `SymPy` package, which wraps a Python library. This section describes a collection of native `Julia` packages providing many features of symbolic math.
-
-## About
-
-The `Symbolics` package bills itself as a "fast and modern Computer Algebra System (CAS) for a fast and modern programming language." This package relies on the `SymbolicUtils` package and is built upon by the `ModelingToolkit` package, which is only briefly touched on here.
-
-
-We begin by loading the `Symbolics` package, which re-exports the `SymbolicUtils` package when loaded.
-
-```julia
-using Symbolics
-```
-
-## Symbolic variables
-
-Symbolic math at its core involves symbolic variables, which essentially defer evaluation until requested. The creation of symbolic variables differs between the two packages discussed here.
-
-`SymbolicUtils` creates variables which carry `Julia` type information (e.g. `Int`, `Float64`, ...). This type information carries through operations involving these variables. Symbolic variables can be created with the `@syms` macro.
For example:
-
-```julia
-@syms x y::Int f(x::Real)::Real
-```
-
-This creates `x`, a symbolic value with symbolic type `Number`; `y`, a symbolic variable holding integer values; and `f`, a symbolic function of a single real variable returning a real value.
-
-The non-exported `symtype` function reveals the underlying type:
-
-```julia
-import Symbolics.SymbolicUtils: symtype
-
-symtype(x), symtype(y)
-```
-
-For `y`, the symbolic type being a subtype of `Real` does not imply the type of `y` itself is a subtype of `Real`:
-
-```julia
-isa(y, Real)
-```
-
-
-We see that the function `f` when called with `y` would return a value of (symbolic) type `Real`:
-
-```julia
-f(y) |> symtype
-```
-
-As the symbolic type of `x` is `Number` -- which is not a subtype of `Real` -- the following will error:
-
-```julia; error=true
-f(x)
-```
-
-
-
-!!! note
-    The `SymPy` package also has an `@syms` macro to create variables. Though their names agree, they do different things. Using both packages together would require qualifying many shared method names. For `SymbolicUtils`, the `@syms` macro uses `Julia` types to parameterize the variables. In `SymPy` it is possible to specify *assumptions* on the variables, but that is different and not useful for dispatch without some extra effort.
-
-### Variables in Symbolics
-
-For `Symbolics`, symbolic variables are created using a wrapper around an underlying `SymbolicUtils` object. This wrapper, `Num`, is a subtype of `Real`. (The underlying `SymbolicUtils` object may have symbolic type `Real`, but it won't be a subtype of `Real`.)
-
-Symbolic values are created with the `@variables` macro.
For example:
-
-```julia
-@variables x y::Int z[1:3]::Int f(..)::Int
-```
-
-This creates
-* a symbolic value `x` of `symtype` `Real`
-* a symbolic value `y` of `symtype` `Int`
-* a vector `z` of symbolic values, each of `symtype` `Int`
-* a symbolic function `f` returning an object of `symtype` `Int`
-
-The symbolic type reflects that of the underlying object behind the `Num` wrapper:
-
-```julia
-typeof(x), symtype(x), typeof(Symbolics.value(x))
-```
-
-(The `value` method unwraps the `Num` wrapper.)
-
-### Variables in ModelingToolkit
-
-The `ModelingToolkit` package has a slightly different declaration for variables, described next. First the package is loaded:
-
-```julia
-using ModelingToolkit
-```
-
-`ModelingToolkit` re-exports all of the `Symbolics` package when loaded.
-
-The role of `ModelingToolkit` is that "it allows for users to give a high-level description of a model for symbolic preprocessing to analyze and enhance the model." This symbolic description allows for variables to be identified as "parameters" or "variables". For example, to parameterize a quadratic equation:
-
-```julia
-@parameters a b c
-@variables x
-y = a*x^2 + b*x + c
-```
-
-The numeric solution of the quadratic equation (solving for ``y=0``) would involve specifying values for the parameters and then numerically solving. This separation of parameters and variables is similar to the `f(x, p)` pattern of function definition.
-
-The typical usage is multi-variable. This example, from the package's documentation, describes a differential equation:
-
-```julia
-@parameters t σ ρ β
-@variables x(t) y(t) z(t)
-D = Differential(t)
-```
-
-The `D` will be described a bit later, but it formally specifies a derivative in the `t` variable. The `x(t)`, `y(t)`, `z(t)` are symbolic functions of `t`, so expressions like `D(x)` below mean the time derivative of an unknown function `x`.
Here is how the Lorenz equations are specified:
-
-```julia
-eqs = [D(x) ~ σ * (y - x),
-       D(y) ~ x * (ρ - z) - y,
-       D(z) ~ x * y - β * z]
-```
-
-## Symbolic expressions
-
-Symbolic expressions are built up from symbolic variables through natural `Julia` idioms. `SymbolicUtils` privileges a few key operations: `Add`, `Mul`, `Pow`, and `Div`. For example:
-
-```julia
-@syms x y
-typeof(x + y) # `Add`
-```
-
-```julia
-typeof(x * y) # `Mul`
-```
-
-In contrast, applying a function produces a different type:
-
-```julia
-typeof(sin(x))
-```
-
-The `Term` wrapper just represents the effect of calling a function (in this case `sin`) on its arguments (in this case `x`).
-
-This happens in the background with symbolic variables in `Symbolics`:
-
-```julia
-@variables x
-typeof(sin(x)), typeof(Symbolics.value(sin(x)))
-```
-
-### Tree structure to expressions
-
-The `TermInterface` package is used by `SymbolicUtils` to explore the tree structure of an expression. The main methods are (cf. [SymbolicUtils.jl](https://symbolicutils.juliasymbolics.org/#expression_interface)):
-
-* `istree(ex)`: `true` if `ex` is not a *leaf* node (like a symbol or numeric literal)
-* `operation(ex)`: the function being called (if `istree` returns `true`)
-* `arguments(ex)`: the arguments to the function being called
-* `symtype(ex)`: the inferred type of the expression
-
-In addition, the `issym` function, to determine if `x` is of type `Sym`, is useful to distinguish *leaf* nodes, as will be illustrated below.
-
-These methods can be used to "walk" the tree:
-
-```julia
-@syms x y
-ex = 1 + x^2 + y
-operation(ex) # the outer function is `+`
-```
-
-
-```julia
-arguments(ex) # `+` is n-ary, in this case with 3 arguments
-```
-
-```julia
-ex1 = arguments(ex)[3] # terms have been reordered
-operation(ex1) # operation for `x^2` is `^`
-```
-
-```julia
-a, b = arguments(ex1)
-```
-
-```julia
-istree(ex1), istree(a)
-```
-
-Here `a` is not a "tree," as it has no operation or arguments; it is just a variable (the `x` variable).
-
-The value of `symtype` is the *inferred* type of an expression, which may not match the actual type. For example,
-
-```julia
-@variables x::Int
-symtype(x), symtype(sin(x)), symtype(x/x), symtype(x / x^2)
-```
-
-The last one is not likely to be an integer, but that is the inferred type in this case.
-
-##### Example
-
-As an example, we write a function to find the free symbols in a symbolic expression comprised of `SymbolicUtils` variables. (The `Symbolics.get_variables` function also does this task.) Finding the symbols involves walking the expression tree until a leaf node is found and then adding that node to our collection if it matches `issym`.
-
-```julia
-import Symbolics.SymbolicUtils: issym
-free_symbols(ex) = (s=Set(); free_symbols!(s, ex); s)
-function free_symbols!(s, ex)
-    if istree(ex)
-        for a ∈ arguments(ex)
-            free_symbols!(s, a)
-        end
-    else
-        issym(ex) && push!(s, ex) # push new symbol onto set
-    end
-end
-```
-
-```julia
-@syms x y z
-ex = sin(x + 1)*cos(z)
-free_symbols(ex)
-```
-
-## Expression manipulation
-
-### Substitute
-
-The `substitute` command is used to replace values with other values. For example:
-
-```julia
-@variables x y z
-ex = 1 + x + x^2/2 + x^3/6
-substitute(ex, x=>1)
-```
-
-This defines a symbolic expression, then substitutes the value `1` in for `x`. The `Pair` notation is useful for a *single* substitution.
When there is more than one substitution, a dictionary is used:
-
-```julia
-w = x^3 + y^3 - 2z^3
-substitute(w, Dict(x=>2, y=>3))
-```
-
-The `fold` argument can be passed `false` to inhibit evaluation of values. Compare:
-
-```julia
-ex = 1 + sqrt(x)
-substitute(ex, x=>2), substitute(ex, x=>2, fold=false)
-```
-
-Or
-
-```julia
-ex = sin(x)
-substitute(ex, x=>π), substitute(ex, x=>π, fold=false)
-```
-
-### Simplify
-
-Algebraic operations with symbolic values can involve an exponentially increasing number of terms. As such, some simplification rules are applied after an operation to reduce the complexity of the computed value.
-
-For example, `0+x` should simplify to `x`; likewise `1*x`, `x^0`, and `x^1` should each simplify to some natural answer.
-
-`SymbolicUtils` also [simplifies](https://symbolicutils.juliasymbolics.org/#simplification) several other expressions, including:
-
-* `-x` becomes `(-1)*x`
-* `x * x` becomes `x^2` (and `x^n` if more terms), meaning this expression is represented as a power, not a product
-* `x + x` becomes `2*x` (and `n*x` if more terms); similarly, this is represented as a product, not a sum
-* `p/q * x` becomes `(p*x)/q`; similarly, `p/q * x/y` becomes `(p*x)/(q*y)` (division wraps multiplication)
-
-In `SymbolicUtils`, this *rewriting* is accomplished by means of *rewrite rules*. The package makes it easy to apply user-written rewrite rules.
-
-### Rewriting
-
-Many algebraic simplifications are done by the `simplify` command. For example, the basic trigonometric identities are applied:
-
-```julia
-@variables x
-ex = sin(x)^2 + cos(x)^2
-ex, simplify(ex)
-```
-
-The `simplify` function applies a series of rewriting rules until the expression stabilizes. The rewrite rules can be user generated, if desired.
For example, the Pythagorean identity of trigonometry, just used, can be implemented with this rule:
-
-```julia
-r = @acrule(sin(~x)^2 + cos(~x)^2 => one(~x))
-ex |> Symbolics.value |> r |> Num
-```
-
-The rewrite rule, `r`, is defined by the `@acrule` macro. The `a` is for associative, the `c` for commutative, assumptions made by the macro. (The `c` means `cos(x)^2 + sin(x)^2` will also simplify.) Rewrite rules are called on the underlying `SymbolicUtils` expression, so we first unwrap, then re-wrap after.
-
-The above expression for `r` is fairly easy to appreciate. The value `~x` matches the same variable or expression. So the above rule will also simplify more complicated expressions:
-
-```julia
-@variables y z
-ex1 = substitute(ex, x => sin(x + y + z))
-ex1 |> Symbolics.value |> r |> Num
-```
-
-Rewrite rules, when applied, return the rewritten expression if there is a match, or `nothing` otherwise.
-
-Rules involving two values are also easily created. This one, again, comes from the set of simplifications defined for trigonometric and exponential expressions:
-
-```julia
-r = @rule(exp(~x)^(~y) => exp(~x * ~y)) # (e^x)^y -> e^(x*y)
-ex = exp(-x+z)^y
-ex, ex |> Symbolics.value |> r |> Num
-```
-
-This rule is not commutative or associative, as `x^y` is not the same as `y^x` and `(x^y)^z` is not `x^(y^z)` in general.
-
-
-The application of rules can be filtered through qualifying predicates. This artificial example uses `iseven`, which returns `true` for even numbers. Here we subtract `1` when a number is not even and otherwise leave the number alone. We do this with two rules:
-
-```julia
-reven = @rule ~x::iseven => ~x
-rodd = @rule ~x::(!iseven) => ~x - 1
-r = SymbolicUtils.Chain([rodd, reven])
-r(2), r(3)
-```
-
-The `Chain` function conveniently allows the sequential application of rewrite rules.
-
-
-The notation `~x` is called a "slot variable" in the [documentation](https://symbolicutils.juliasymbolics.org/rewrite/) for `SymbolicUtils`.
It matches a single expression. To match more than one expression, a "segment variable," denoted with two `~`s, is used.
-
-### Creating functions
-
-By utilizing the tree-like nature of a symbolic expression, a `Julia` expression can be built from a symbolic expression easily enough. The `Symbolics.toexpr` function does this:
-
-```julia
-ex = exp(-x + z)^y
-Symbolics.toexpr(ex)
-```
-
-This output shows an internal representation of the steps for computing the value `ex` given different inputs. (The number `(-1)` multiplies `x`; this is added to `z` and the result passed to `exp`. That value is then used as the base for `^` with exponent `y`.)
-
-Such `Julia` expressions are one step away from building `Julia` functions for evaluating symbolic expressions fast (though with some technical details about "world age" to be reckoned with). The `build_function` function with the argument `expression=Val(false)` will compile a `Julia` function:
-
-```julia
-h = build_function(ex, x, y, z; expression=Val(false))
-h(1, 2, 3)
-```
-
-The above is *similar* to substitution:
-
-```julia
-substitute(ex, Dict(x=>1, y=>2, z=>3))
-```
-
-However, `build_function` will be **significantly** more performant, which, when many function calls are used -- such as with plotting -- is a big advantage.
-
-!!! note
-    The documentation colorfully says "`build_function` is kind of like if `lambdify` (from `SymPy`) ate its spinach."
-
-The above, through passing ``3`` variables after the expression, creates a function of ``3`` variables.
Functions of a vector of inputs can also be created, just by expressing the variables in that manner: - -```julia -h1 = build_function(ex, [x, y, z]; expression=Val(false)) -h1([1, 2, 3]) # not h1(1,2,3) -``` - -##### Example - -As an example, here we use the `Roots` package to find a zero of a function defined symbolically: - -```julia -import Roots -@variables x -ex = x^5 - x - 1 -λ = build_function(ex, x; expression=Val(false)) -Roots.find_zero(λ, (1, 2)) -``` - -### Plotting - -Using `Plots`, the plotting of symbolic expressions is similar to the plotting of a function, as there is a plot recipe that converts the expression into a function via `build_function`. - -For example, - -```julia -using Plots -@variables x -plot(x^x^x, 0, 2) -``` - -A parametric plot is easily defined: - -```julia -plot(sin(x), cos(x), 0, pi/4) -``` - -Expressions to be plotted can represent multivariate functions. - -```julia -@variables x y -ex = 3*(1-x)^2*exp(-x^2 - (y+1)^2) - 10(x/5-x^3-y^5)*exp(-x^2-y^2) - 1/3*exp(-(x+1)^2-y^2) -xs = ys = range(-5, 5, length=100) -surface(xs, ys, ex) -``` - -The ordering of the variables is determined by `Symbolics.get_variables`: - -```julia -Symbolics.get_variables(ex) -``` - - - -### Polynomial manipulations - -There are some facilities for manipulating polynomial expressions in `Symbolics`. A polynomial, mathematically, is an expression involving one or more symbols with coefficients from a collection that has, at a minimum, addition and multiplication defined. The basic building blocks of polynomials are *monomials*, which are comprised of products of powers of the symbols. Mathematically, monomials are often allowed to have a multiplying coefficient and may be just a coefficient (if each symbol is taken to the power ``0``), but here we consider just expressions of the type ``x_1^{a_1} \cdot x_2^{a_2} \cdots x_k^{a_k}`` with the ``a_i > 0`` as monomials. 
With this understanding, an expression can be broken up into monomials with a possible leading coefficient (possibly ``1``) *and* terms which are not monomials (such as a constant or a more complicated function of the symbols). This is what is returned by the `polynomial_coeffs` function.
-
-For example
-
-```julia
-@variables a b c x
-d, r = polynomial_coeffs(a*x^2 + b*x + c, (x,))
-```
-
-The first output is a dictionary with keys which are the monomials and values which are the coefficients. The second output, the residual, is all the remaining parts of the expression, in this case just the constant `c`.
-
-The expression can then be reconstructed through
-
-```julia
-r + sum(v*k for (k,v) ∈ d)
-```
-
-The above has `a,b,c` as parameters and `x` as the symbol. This separation is designated by passing the desired polynomial symbols to `polynomial_coeffs` as an iterable. (Above, as a ``1``-element tuple.)
-
-More complicated polynomials can be similarly decomposed:
-
-```julia
-@variables a b c x y z
-ex = a*x^2*y*z + b*x*y^2*z + c*x*y*z^2
-d, r = polynomial_coeffs(ex, (x, y, z))
-```
-
-The (sparse) decomposition of the polynomial is returned through `d`. The same pattern as above can be used to reconstruct the expression.
-To extract the coefficient for a monomial term, indexing can be used. Of note, an expression like `x^2*y*z` might seem to differ from the algebraically equal `x*y*z*x`, as they are only equal after some simplification, but the keys are stored in a canonical form, so this is not a concern:
-
-```julia
-d[x*y*z*x], d[z*y*x^2]
-```
-
-The residual term will capture any non-polynomial terms:
-
-```julia
-ex = sin(x) - x + x^3/6
-d, r = polynomial_coeffs(ex, (x,))
-r
-```
-
-To find the degree of a monomial expression, the `degree` function is available.
Here it is applied to each monomial in `d`: - -```julia -[degree(k) for (k,v) ∈ d] -``` - -The `degree` function will also identify the degree of more complicated terms: - -```julia -degree(1 + x + x^2) -``` - -Mathematically the degree of the ``0`` polynomial may be ``-1`` or undefined, but here it is ``0``: - -```julia -degree(0), degree(1), degree(x), degree(x^a) -``` - -The coefficients are returned as *values* of a dictionary, and dictionaries are unsorted. To have a natural map between polynomials of a single symbol in the standard basis and a vector, we could use a pattern like this: - -```julia -@variables x a0 as[1:10] -p = a0 + sum(as[i]*x^i for i ∈ eachindex(collect(as))) -d, r = polynomial_coeffs(p, (x,)) -d -``` - -To sort the values we can use a pattern like the following: - -```julia -vcat(r, [d[k] for k ∈ sort(collect(keys(d)), by=degree)]) -``` - ----- - -Rational expressions can be decomposed into a numerator and denominator using the following idiom, which ensures the outer operation is division (a binary operation): - -```julia -@variables x -ex = (1 + x + x^2) / (1 + x + x^2 + x^3) -function nd(ex) - ex1 = Symbolics.value(ex) - (operation(ex1) == /) || return (ex, one(ex)) - Num.(arguments(ex1)) -end -nd(ex) -``` - -With this, the study of asymptotic behaviour of a univariate rational expression would involve an investigation like the following: - -```julia -m,n = degree.(nd(ex)) -m > n ? "limit is infinite" : m < n ? 
"limit is 0" : "limit is a constant" -``` - -### Vectors and matrices - -Symbolic vectors and matrices can be created with a specified size: - -```julia -@variables v[1:3] M[1:2, 1:3] N[1:3, 1:3] -``` - -Computations, like finding the determinant below, are lazy unless the values are `collect`ed: - -```julia -using LinearAlgebra -det(N) -``` - -```julia -det(collect(N)) -``` - -Similarly, with `norm`: - -```julia -norm(v) -``` - -and - -```julia -norm(collect(v)) -``` - -Matrix multiplication is also deferred, but the size compatability of the matrices and vectors is considered early: - -```julia -M*N, N*N, M*v -``` - -This errors, as the matrix dimensions are not compatible for multiplication: - -```julia; error=true -N*M -``` - -Similarly, linear solutions can be symbolically specified: - -```julia -@variables R[1:2, 1:2] b[1:2] -R \ b -``` - -```julia -collect(R \ b) -``` - - - -### Algebraically solving equations - -The `~` operator creates a symbolic equation. For example - -```julia -@variables x y -x^5 - x ~ 1 -``` - -or - -```julia -eqs = [5x + 2y, 6x + 3y] .~ [1, 2] -``` - -The `Symbolics.solve_for` function can solve *linear* equations. For example, - -```julia -Symbolics.solve_for(eqs, [x, y]) -``` - -The coefficients can be symbolic. Two examples could be: - -```julia -@variables m b x y -eq = y ~ m*x + b -Symbolics.solve_for(eq, x) -``` - - -```julia -@variables a11 a12 a22 x y b1 b2 -R,X,b = [a11 a12; 0 a22], [x; y], [b1, b2] -eqs = R*X .~ b -``` - -```julia -Symbolics.solve_for(eqs, [x,y]) -``` - -### Limits - -As of writing, there is no extra functionality provided by `Symbolics` for computing limits. 
- -### Derivatives - -`Symbolics` provides the `derivative` function to compute the derivative of a function with respect to a variable: - -```julia -@variables a b c x -y = a*x^2 + b*x + c -yp = Symbolics.derivative(y, x) -``` - -Or to find a critical point: - -```julia -Symbolics.solve_for(yp ~ 0, x) # linear equation to solve -``` - - -The derivative computation can also be broken up into an expression indicating the derivative and then a function to apply the derivative rules: - -```julia -D = Differential(x) -D(y) -``` - -and then - -```julia -expand_derivatives(D(y)) -``` - - -Using `Differential`, differential equations can be specified. An example was given in [ODEs](../ODEs/differential_equations.html), using `ModelingToolkit`. - -Higher order derivatives can be done through composition: - -```julia -D(D(y)) |> expand_derivatives -``` - - -Differentials can also be multiplied to create operators for taking higher-order derivatives: - -```julia -@variables x y -ex = (x - y^2)/(x^2 + y^2) -Dx, Dy = Differential(x), Differential(y) -Dxx, Dxy, Dyy = Dx*Dx, Dx*Dy, Dy*Dy -[Dxx(ex) Dxy(ex); Dxy(ex) Dyy(ex)] .|> expand_derivatives -``` - -In addition to `Symbolics.derivative` there are also the helper functions, such as `hessian` which performs the above - -```julia -Symbolics.hessian(ex, [x,y]) -``` - -The `gradient` function is also defined - -```julia -@variables x y z -ex = x^2 - 2x*y + z*y -Symbolics.gradient(ex, [x, y, z]) -``` - -The `jacobian` function takes an array of expressions: - -```julia -@variables x y -eqs = [ x^2 - y^2, 2x*y] -Symbolics.jacobian(eqs, [x,y]) -``` - - -### Integration - -The `SymbolicNumericIntegration` package provides a means to integrate *univariate* expressions through its `integrate` function. - - -Symbolic integration can be approached in different ways. SymPy implements part of the Risch algorithm in addition to other algorithms. Rules-based algorithms could also be implemented. 
-
-For a trivial example, here is a rule that could be used to compute a simple integral
-
-```julia
-@syms x ∫(x)
-
-is_var(x) = (xs = Symbolics.get_variables(x); length(xs) == 1 && xs[1] === x)
-r = @rule ∫(~x::is_var) => x^2/2
-
-r(∫(x))
-```
-
-
-The `SymbolicNumericIntegration` package includes many more predicates for doing rules-based integration, but it primarily approaches the task in a different manner.
-
-If ``f(x)`` is to be integrated, a set of *candidate* answers is generated. The following is **proposed** as an answer: ``\sum q_i \Theta_i(x)``. Differentiating the proposed answer leads to a *linear system of equations* that can be solved.
-
-The example in the [paper](https://arxiv.org/pdf/2201.12468v2.pdf) describing the method is with ``f(x) = x \sin(x)`` and the candidate thetas are ``\{x, \sin(x), \cos(x), x\sin(x), x\cos(x)\}`` so that the proposed answer is:
-
-```math
-\int f(x) dx = q_1 x + q_2 \sin(x) + q_3 \cos(x) + q_4 x \sin(x) + q_5 x \cos(x)
-```
-
-We differentiate the right hand side:
-
-```julia
-@variables q[1:5] x
-ΣqᵢΘᵢ = dot(collect(q), (x, sin(x), cos(x), x*sin(x), x*cos(x)))
-simplify(Symbolics.derivative(ΣqᵢΘᵢ, x))
-```
-
-This must match ``x\sin(x)``, so by
-equating coefficients of the respective terms we have:
-
-```math
-q_2 + q_5 = 0, \quad q_4 = 0, \quad q_1 = 0, \quad q_3 = 0, \quad q_5 = -1
-```
-
-That is ``q_2=1``, ``q_5=-1``, and the other coefficients are ``0``, giving
-an answer computed with:
-
-```julia
-d = Dict(q[i] => v for (i,v) ∈ enumerate((0,1,0,0,-1)))
-substitute(ΣqᵢΘᵢ, d)
-```
-
-The package provides an algorithm for the creation of candidates and the means to solve when possible. The `integrate` function is the main entry point. It returns three values: `solved`, `unsolved`, and `err`. The `unsolved` value is the part of the integrand which cannot be solved through this package. It is `0` for a given problem when `integrate` is successful in identifying an antiderivative, in which case `solved` is the answer.
The value of `err` is a bound on the numerical error introduced by the algorithm.
-
-To see this in action, we have:
-
-```julia
-using SymbolicNumericIntegration
-@variables x
-
-integrate(x * sin(x))
-```
-
-The second term is `0`, as this integrand has an identified antiderivative.
-
-```julia
-integrate(exp(x^2) + sin(x))
-```
-
-This returns `exp(x^2)` for the unsolved part, as this function has no simple antiderivative.
-
-Powers of trig functions have antiderivatives, as can be deduced using integration by parts. When the fifth power is used, there is a numeric aspect to the algorithm that is seen:
-
-```julia
-u,v,w = integrate(sin(x)^5)
-```
-
-The derivative of `u` matches up to some numeric tolerance:
-
-```julia
-Symbolics.derivative(u, x) - sin(x)^5
-```
-
-----
-
-The integration of rational functions (ratios of polynomials) can be done algorithmically, provided the underlying factorizations can be identified. The `SymbolicNumericIntegration` package has a function `factor_rational` that can identify factorizations.
-
-```julia
-import SymbolicNumericIntegration: factor_rational
-@variables x
-u = (1 + x + x^2)/ (x^2 -2x + 1)
-v = factor_rational(u)
-```
-
-Each summand in `v` is integrable. We can see that `v` is a re-expression of `u` through
-
-```julia
-simplify(u - v)
-```
-
-The algorithm is numeric, not symbolic.
This can be seen in these two factorizations: - -```julia -u = 1 / expand((x^2-1)*(x-2)^2) -v = factor_rational(u) -``` - -or - -```julia -u = 1 / expand((x^2+1)*(x-2)^2) -v = factor_rational(u) -``` - -As such, the integrals have numeric differences from their mathematical counterparts: - -```julia -a,b,c = integrate(u) -``` - -We can see a bit of how much through the following, which needs a tolerance set to identify the rational numbers of the mathematical factorization correctly: - -```julia -cs = [first(arguments(term)) for term ∈ arguments(a)] # pick off coefficients -``` - -```julia -rationalize.(cs; tol=1e-8) -``` diff --git a/CwJ/derivatives/Project.toml b/CwJ/derivatives/Project.toml deleted file mode 100644 index f6a9189..0000000 --- a/CwJ/derivatives/Project.toml +++ /dev/null @@ -1,17 +0,0 @@ -[deps] -DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0" -EllipsisNotation = "da5c29d0-fa7d-589e-88eb-ea29b0a81949" -ForwardDiff = "f6369f11-7733-5829-9624-2563aa707210" -ImplicitPlots = "55ecb840-b828-11e9-1645-43f4a9f9ace7" -IntervalArithmetic = "d1acc4aa-44c8-5952-acd4-ba5d80a2a253" -IntervalConstraintProgramming = "138f1668-1576-5ad7-91b9-7425abbf3153" -LaTeXStrings = "b964fa9f-0449-5b57-a5c2-d3ea65f4040f" -MDBM = "dd61e66b-39ce-57b0-8813-509f78be4b4d" -Plots = "91a5bcdd-55d7-5caf-9e0b-520d859cae80" -Printf = "de0858da-6303-5e67-8744-51eddeeeb8d7" -QuadGK = "1fd47b50-473d-5c70-9696-f719f8f3bcdc" -Roots = "f2b01f46-fcfa-551c-844a-d8ac1e96c665" -SymPy = "24249f21-da20-56a4-8eb1-6a02cf4ae2e6" -TaylorSeries = "6aa5eb33-94cf-58f4-a9d0-e4b2c4fc25ea" -TermInterface = "8ea1fca8-c5ef-4a55-8b96-4e9afe9c9a3c" -Unitful = "1986cc42-f94f-5a68-af5c-568840ba703d" diff --git a/CwJ/derivatives/curve_sketching.jmd b/CwJ/derivatives/curve_sketching.jmd deleted file mode 100644 index b41065a..0000000 --- a/CwJ/derivatives/curve_sketching.jmd +++ /dev/null @@ -1,543 +0,0 @@ -# Curve Sketching - -This section uses the following add-on packages: - -```julia -using 
CalculusWithJulia -using Plots -using SymPy -using Roots -using Polynomials # some name clash with SymPy -``` - - - -```julia; echo=false; results="hidden" -using CalculusWithJulia.WeaveSupport - -fig_size=(800, 600) -const frontmatter = ( - title = "Curve Sketching", - description = "Calculus with Julia: Curve Sketching", - tags = ["CalculusWithJulia", "derivatives", "curve sketching"], -); -nothing -``` - ----- - -The figure illustrates a means to *sketch* a sine curve - identify as -many of the following values as you can: - -* asymptotic behaviour (as ``x \rightarrow \pm \infty``), -* periodic behaviour, -* vertical asymptotes, -* the $y$ intercept, -* any $x$ intercept(s), -* local peaks and valleys (relative extrema). -* concavity - -With these, a sketch fills in between the -points/lines associated with these values. - - -```julia; hold=true; echo=false; cache=true -### {{{ sketch_sin_plot }}} - - -function sketch_sin_plot_graph(i) - f(x) = 10*sin(pi/2*x) # [0,4] - deltax = 1/10 - deltay = 5/10 - - zs = find_zeros(f, 0-deltax, 4+deltax) - cps = find_zeros(D(f), 0-deltax, 4+deltax) - xs = range(0, stop=4*(i-2)/6, length=50) - if i == 1 - ## plot zeros - title = "Plot the zeros" - p = scatter(zs, 0*zs, title=title, xlim=(-deltax,4+deltax), ylim=(-10-deltay,10+deltay), legend=false) - elseif i == 2 - ## plot extrema - title = "Plot the local extrema" - p = scatter(zs, 0*zs, title=title, xlim=(-deltax,4+deltax), ylim=(-10-deltay,10+deltay), legend=false) - scatter!(p, cps, f.(cps)) - else - ## sketch graph - title = "sketch the graph" - p = scatter(zs, 0*zs, title=title, xlim=(-deltax,4+deltax), ylim=(-10-deltay,10+deltay), legend=false) - scatter!(p, cps, f.(cps)) - plot!(p, xs, f.(xs)) - end - p -end - - -caption = L""" - -After identifying asymptotic behaviours, -a curve sketch involves identifying the $y$ intercept, if applicable; the $x$ intercepts, if possible; the local extrema; and changes in concavity. From there a sketch fills in between the points. 
In this example, the periodic function $f(x) = 10\cdot\sin(\pi/2\cdot x)$ is sketched over $[0,4]$.
-
-"""
-
-
-
-n = 8
-anim = @animate for i=1:n
-    sketch_sin_plot_graph(i)
-end
-
-imgfile = tempname() * ".gif"
-gif(anim, imgfile, fps = 1)
-
-ImageFile(imgfile, caption)
-```
-
-Though this approach is most useful for hand-sketches, the underlying
-concepts are important for properly framing graphs made with the
-computer.
-
-We can easily make a graph of a function over a specified
-interval. What is not always so easy is to pick an interval that shows
-off the features of interest. In the section on
-[rational](../precalc/rational_functions.html) functions there was a
-discussion about how to draw graphs for rational functions so that
-horizontal and vertical asymptotes can be seen. These are properties
-of the "large." In this section, we build on this, but concentrate now
-on more local properties of a function.
-
-##### Example
-
-Produce a graph of the function $f(x) = x^4 -13x^3 + 56x^2-92x + 48$.
-
-We identify this as a fourth-degree polynomial with positive leading
-coefficient. Hence it will eventually look $U$-shaped. If we graph
-over a too-wide interval, that is all we will see. Rather, we do some
-work to produce a graph that shows the zeros, peaks, and valleys of
-$f(x)$. To do so, we need to know the extent of the zeros. We can try
-some theory, but instead we just guess and, if that fails, will work harder:
-
-```julia;
-f(x) = x^4 - 13x^3 + 56x^2 -92x + 48
-rts = find_zeros(f, -10, 10)
-```
-
-As we found $4$ roots, we know by the fundamental theorem of algebra that we have them all. This means our graph need not focus on values much larger than $6$ or much smaller than $1$.
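As a check on the numeric root finding (the exact factorization below was worked out by hand, not produced by the code above), this quartic factors over the integers, so the four roots are exactly ``1``, ``2``, ``4``, and ``6``:

```julia
# Hand factorization: x^4 - 13x^3 + 56x^2 - 92x + 48 == (x-1)(x-2)(x-4)(x-6)
f(x) = x^4 - 13x^3 + 56x^2 - 92x + 48
g(x) = (x - 1) * (x - 2) * (x - 4) * (x - 6)
all(iszero(f(r)) for r in (1, 2, 4, 6))        # true: each candidate is a root
maximum(abs(f(x) - g(x)) for x in -10:0.5:10)  # the two forms agree on a grid
```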
-
-To know where the peaks and valleys are, we look for the critical points:
-
-```julia;
-cps = find_zeros(f', 1, 6)
-```
-
-Because we have $4$ distinct zeros, the peaks and
-valleys must appear in an interleaving manner, so a search over $[1,6]$
-finds all three critical points, and, without checking, they must
-correspond to relative extrema.
-
-Next we identify the *inflection points* which are among the zeros of the second derivative (when defined):
-
-```julia
-ips = find_zeros(f'', 1, 6)
-```
-
-If there is no sign change for either ``f'`` or ``f''`` over ``[a,b]`` then the sketch of ``f`` on this interval must be one of:
-
-* increasing and concave up (if ``f' > 0`` and ``f'' > 0``)
-* increasing and concave down (if ``f' > 0`` and ``f'' < 0``)
-* decreasing and concave up (if ``f' < 0`` and ``f'' > 0``)
-* decreasing and concave down (if ``f' < 0`` and ``f'' < 0``)
-
-This aids in sketching the graph between the critical points and inflection points.
-
-
-Finally, we check that if we were to use $[0,7]$ as a domain to
-plot over, the function doesn't get so large that it masks the
-oscillations. This could happen if the $y$ values at the end points
-are too much larger than the $y$ values at the peaks and valleys, as
-only so many pixels can be used within a graph. For this we have:
-
-```julia;
-f.([0, cps..., 7])
-```
-
-The values at $0$ and at $7$ are a bit large, as compared to the
-relative extrema, and since we know the graph is eventually
-$U$-shaped, this offers no insight. So we narrow the range a bit for
-the graph:
-
-```julia;
-plot(f, 0.5, 6.5)
-```
-
-
-----
-
-This sort of analysis can be automated. The plot "recipe" for polynomials from the `Polynomials` package makes similar considerations to choose a viewing window:
-
-```julia
-xₚ = variable(Polynomial)
-plot(f(xₚ)) # f(xₚ) of Polynomial type
-```
-
-
-
-
-##### Example
-
-Graph the function
-
-```math
-f(x) = \frac{(x-1)\cdot(x-3)^2}{x \cdot (x-2)}.
-```
-
-Not much to do here if you are satisfied with a graph that only gives insight into the asymptotes of this rational function:
-
-```julia;
-𝒇(x) = ( (x-1)*(x-3)^2 ) / (x * (x-2) )
-plot(𝒇, -50, 50)
-```
-
-We can see the slant asymptote and hints of vertical asymptotes, but
-we'd like to see more of the basic features of the graph.
-
-Previously, we have discussed rational functions and their
-asymptotes. This function has numerator of degree ``3`` and denominator of
-degree ``2``, so will have a slant asymptote. As well, the zeros of the
-denominator, $0$ and $2$, will lead to vertical asymptotes.
-
-To identify how wide a viewing window should be: for a rational
-function the asymptotic behaviour sets in after the concavity is
-done changing and we are past all relative extrema, so we should take
-an interval that includes all potential inflection points and critical
-points:
-
-```julia;
-𝒇cps = find_zeros(𝒇', -10, 10)
-poss_ips = find_zeros(𝒇'', (-10, 10))
-extrema(union(𝒇cps, poss_ips))
-```
-
-So a range over $[-5,5]$ should display the key features including the slant asymptote.
-
-Previously we used the `rangeclamp` function defined in `CalculusWithJulia` to avoid the distortion that vertical asymptotes can have:
-
-```julia;
-plot(rangeclamp(𝒇), -5, 5)
-```
-
-With this graphic, we can now clearly see in the graph the two zeros at $x=1$ and $x=3$, the vertical asymptotes at $x=0$ and $x=2$, and the slant asymptote.
-
-----
-
-Again, this sort of analysis can be systematized. The rational function type in the `Polynomials` package takes a stab at that, but isn't quite so good at capturing the slant asymptote:
-
-```julia
-xᵣ = variable(RationalFunction)
-plot(𝒇(xᵣ)) # f(x) of RationalFunction type
-```
-
-
-##### Example
-
-Consider the function ``V(t) = 170 \sin(2\pi\cdot 60 \cdot t)``, a model for the alternating current waveform for an outlet in the United States. Create a graph.
-
-Blindly trying to graph this, we will see immediate issues:
-
-```julia
-V(t) = 170 * sin(2*pi*60*t)
-plot(V, -2pi, 2pi)
-```
-
-Ahh, this periodic function is *too* rapidly oscillating to be plotted without care. We recognize this as being of the form ``V(t) = a\cdot\sin(c\cdot t)``, so where the sine function has a period of ``2\pi``, this will have a period of ``2\pi/c``, or ``1/60``. So instead of using ``(-2\pi, 2\pi)`` as the interval to plot over, we need something much smaller:
-
-
-```julia
-plot(V, -1/60, 1/60)
-```
-
-
-##### Example
-
-Plot the function ``f(x) = \ln(x/100)/x``.
-
-We guess that this function has a *vertical* asymptote at ``x=0+`` and a horizontal asymptote as ``x \rightarrow \infty``, which we verify through:
-
-```julia
-@syms x
-ex = log(x/100)/x
-limit(ex, x=>0, dir="+"), limit(ex, x=>oo)
-```
-
-The ``\ln(x/100)`` part of ``f`` goes to ``-\infty`` as ``x \rightarrow 0+``; yet ``f(x)`` is eventually positive (once ``x > 100``). So a graph should
-
-* not show too much of the vertical asymptote
-* capture the point where ``f(x)`` must cross ``0``
-* capture the point where ``f(x)`` has a relative maximum
-* show enough past this maximum to indicate to the reader the eventual horizontal asymptote.
-
-For that, we need to get the ``x`` intercepts and the critical points. The ``x/100`` means this graph has some scaling to it, so we first look between ``0`` and ``200``:
-
-```julia
-find_zeros(ex, 0, 200) # domain is (0, oo)
-```
-
-Trying the same for the critical points comes up empty. We know there is one, but it is past ``200``. Scanning wider, we see:
-
-```julia
-find_zeros(diff(ex,x), 0, 500)
-```
-
-
-So maybe graphing over ``[50, 300]`` will be a good start:
-
-```julia
-plot(ex, 50, 300)
-```
-
-But it isn't! The function takes its time getting back towards ``0``. We know that there must be a change of concavity as ``x \rightarrow \infty``, as there is a horizontal asymptote.
We look for the anticipated inflection point to ensure our graph includes that:
-
-```julia
-find_zeros(diff(ex, x, x), 1, 5000)
-```
-
-So a better plot is found by going well beyond that inflection point:
-
-```julia
-plot(ex, 75, 1500)
-```
-
-
-
-## Questions
-
-###### Question
-
-Consider this graph
-
-```julia; hold=true; echo=false
-f(x) = (x-2)* (x-2.5)*(x-3) / ((x-1)*(x+1))
-p = plot(f, -20, -1-.3, legend=false, xlim=(-15, 15), color=:blue)
-plot!(p, f, -1 + .2, 1 - .02, color=:blue)
-plot!(p, f, 1 + .05, 20, color=:blue)
-```
-
-What kind of *asymptotes* does it appear to have?
-
-```julia; hold=true; echo=false
-choices = [
-L"Just a horizontal asymptote, $y=0$",
-L"Just vertical asymptotes at $x=-1$ and $x=1$",
-L"Vertical asymptotes at $x=-1$ and $x=1$ and a horizontal asymptote $y=1$",
-L"Vertical asymptotes at $x=-1$ and $x=1$ and a slant asymptote"
-]
-answ = 4
-radioq(choices, answ)
-```
-
-###### Question
-
-Consider the function ``p(x) = x + 2x^2 + 3x^3 + 4x^4 + 5x^5 +6x^6``. Which interval shows more than a ``U``-shaped graph that dominates for large ``x`` due to the leading term being ``6x^6``?
-
-(Find an interval that contains the zeros, critical points, and inflection points.)
-
-
-```julia; hold=true; echo=false
-choices = ["``(-5,5)``, the default bounds of a calculator",
-"``(-3.5, 3.5)``, the bounds given by Cauchy for the real roots of ``p``",
-"``(-1, 1)``, as many special polynomials have their roots in this interval",
-"``(-1.1, .25)``, as this contains all the roots, the critical points, and inflection points and just a bit more"
-]
-radioq(choices, 4, keep_order=true)
-```
-###### Question
-
-Let ``f(x) = x^3/(9-x^2)``.
-
-What points are *not* in the domain of ``f``?
-
-```julia; echo=false
-qchoices = [
-    "The values of `find_zeros(f, -10, 10)`: `[-3, 0, 3]`",
-    "The values of `find_zeros(f', -10, 10)`: `[-5.19615, 0, 5.19615]`",
-    "The values of `find_zeros(f'', -10, 10)`: `[-3, 0, 3]`",
-    "The zeros of the numerator: `[0]`",
-    "The zeros of the denominator: `[-3, 3]`",
-    "The value of `f(0)`: `0`",
-    "None of these choices"
-]
-radioq(qchoices, 5, keep_order=true)
-```
-
-The ``x``-intercepts are:
-
-```julia; hold=true; echo=false
-radioq(qchoices, 4, keep_order=true)
-```
-
-The ``y``-intercept is:
-```julia; hold=true; echo=false
-radioq(qchoices, 6, keep_order=true)
-```
-
-There are *vertical asymptotes* at ``x=\dots``?
-
-```julia; hold=true; echo=false
-radioq(qchoices, 5)
-```
-
-The *slant* asymptote has slope?
-
-```julia; hold=true; echo=false
-numericq(-1)
-```
-
-The function has critical points at
-
-```julia; hold=true; echo=false
-radioq(qchoices, 2, keep_order=true)
-```
-
-The function has relative extrema at
-
-```julia; hold=true; echo=false
-radioq(qchoices, 7, keep_order=true)
-```
-
-The function has inflection points at
-
-```julia; hold=true; echo=false
-radioq(qchoices, 7, keep_order=true)
-```
-
-
-
-###### Question
-
-A function ``f`` has
-
-* zeros of ``\{-0.7548\dots, 2.0\}``,
-* critical points at ``\{-0.17539\dots, 1.0, 1.42539\dots\}``,
-* inflection points at ``\{0.2712\dots, 1.2287\}``.
-
-Is this a possible graph of ``f``?
-
-```julia; hold=true; echo=false
-f(x) = x^4 - 3x^3 + 2x^2 + x - 2
-plot(f, -1, 2.5, legend=false)
-```
-
-```julia; hold=true; echo=false
-yesnoq("yes")
-```
-
-###### Question
-
-Two models for population growth are *exponential* growth: $P(t) = P_0 a^t$ and
-[logistic growth](https://en.wikipedia.org/wiki/Logistic_function#In_ecology:_modeling_population_growth): $P(t) = K P_0 a^t / (K + P_0(a^t - 1))$. The exponential growth model has growth rate proportional to the current population.
The logistic model has growth rate depending on the current population *and* the available resources (which can limit growth).
-
-
-Letting $K=50$, $P_0=5$, and $a= e^{1/4}$, a plot over $[0,5]$ shows somewhat similar behaviour:
-
-```julia;
-K, P0, a = 50, 5, exp(1/4)
-exponential_growth(t) = P0 * a^t
-logistic_growth(t) = K * P0 * a^t / (K + P0*(a^t-1))
-
-plot(exponential_growth, 0, 5)
-plot!(logistic_growth)
-```
-
-Does a plot over $[0,50]$ show qualitatively similar behaviour?
-
-```julia; hold=true; echo=false
-yesnoq(true)
-```
-
-Exponential growth has $P''(t) = P_0 a^t \log(a)^2 > 0$, so has no inflection point. By plotting over a sufficiently wide interval, can you answer: does the logistic growth model have an inflection point?
-
-```julia; hold=true; echo=false
-yesnoq(true)
-```
-
-If yes, find it numerically:
-
-```julia; hold=true; echo=false
-val = find_zero(D(logistic_growth,2), (0, 20))
-numericq(val)
-```
-
-The available resources are quantified by $K$. As $K \rightarrow \infty$ what is the limit of the logistic growth model:
-
-```julia; hold=true; echo=false
-choices = [
-"The exponential growth model",
-"The limit does not exist",
-"The limit is ``P_0``"]
-answ = 1
-radioq(choices, answ)
-```
-
-##### Question
-
-The plotting algorithm for functions starts with a small
-initial set of points over the specified interval ($21$ points) and then
-refines those sub-intervals where the second derivative is determined
-to be large.
-
-Why are sub-intervals where the second derivative is large different than those where the second derivative is small?
-
-```julia; hold=true; echo=false
-choices = [
-"The function will increase (or decrease) rapidly when the second derivative is large, so there needs to be more points to capture the shape",
-"The function will have more curvature when the second derivative is large, so there needs to be more points to capture the shape",
-"The function will be much larger (in absolute value) when the second derivative is large, so there needs to be more points to capture the shape",
-]
-answ = 2
-radioq(choices, answ)
-```
-
-##### Question
-
-Is there a nice algorithm to identify what domain a function should be
-plotted over to produce an informative graph?
-[Wilkinson](https://www.cs.uic.edu/~wilkinson/Publications/plotfunc.pdf)
-has some suggestions. (Wilkinson is well known to the `R` community as
-the specifier of the grammar of graphics.) It is mentioned that
-"finding an informative domain for a given function depends on at least
-three features: periodicity, asymptotics, and monotonicity."
-
-Why would periodicity matter?
-
-```julia; hold=true; echo=false
-choices = [
-"An informative graph only needs to show one or two periods, as others can be inferred.",
-"An informative graph need only show a part of the period, as the rest can be inferred.",
-L"An informative graph needs to show several periods, as that will allow proper computation for the $y$ axis range."]
-answ = 1
-radioq(choices, answ)
-```
-
-Why should asymptotics matter?
-
-```julia; hold=true; echo=false
-choices = [
-L"A vertical asymptote can distort the $y$ range, so it is important to avoid too-large values",
-L"A horizontal asymptote must be plotted from $-\infty$ to $\infty$",
-"A slant asymptote must be plotted over a very wide domain so that it can be identified."
-]
-answ = 1
-radioq(choices, answ)
-```
-
-Monotonicity means increasing or decreasing. This is important for what reason?
- -```julia; hold=true; echo=false -choices = [ -"For monotonic regions, a large slope or very concave function might require more care to plot", -"For monotonic regions, a function is basically a straight line", -"For monotonic regions, the function will have a vertical asymptote, so the region should not be plotted" -] -answ = 1 -radioq(choices, answ) -``` diff --git a/CwJ/derivatives/derivatives.jmd b/CwJ/derivatives/derivatives.jmd deleted file mode 100644 index d59aaff..0000000 --- a/CwJ/derivatives/derivatives.jmd +++ /dev/null @@ -1,1527 +0,0 @@ -# Derivatives - -This section uses these add-on packages: - -```julia -using CalculusWithJulia -using Plots -using SymPy -``` - - -```julia; echo=false; results="hidden" -using CalculusWithJulia.WeaveSupport -using DataFrames - -const frontmatter = ( - title = "Derivatives", - description = "Calculus with Julia: Derivatives", - tags = ["CalculusWithJulia", "derivatives", "derivatives"], -); - -fig_size=(800, 600) - -nothing -``` - ----- - -Before defining the derivative of a function, let's begin with two -motivating examples. - -##### Example: Driving - -Imagine motoring along down highway ``61`` leaving Minnesota on the way to -New Orleans; though lost in listening to music, still mindful of the -speedometer and odometer, both prominently placed on the dashboard of -the car. - -The speedometer reads ``60`` miles per hour, what is the odometer doing? -Besides recording total distance traveled, it is incrementing -dutifully every hour by ``60`` miles. Why? Well, the well-known formula relating distance, time and rate of travel is - -```math -\text{distance} = \text{ rate } \times \text{ time.} -``` - -If the rate is a constant ``60`` miles/hour, then in one hour the distance traveled is ``60`` miles. - -Of course, the odometer isn't just incrementing once per hour, it is incrementing once every ``1/10``th of a mile. How much time does that take? 
Well, we would need to solve $1/10=60 \cdot t$ which means $t=1/600$ hours, better known as once every ``6`` seconds.
-
-Using some mathematical notation, this would give $x(t) = v\cdot t$, where
-$x$ is position at time $t$, $v$ is the *constant* velocity and $t$ the time
-traveled in hours. A simple graph of the first three hours of travel would show:
-
-```julia; hold=true;
-position(t) = 60 * t
-plot(position, 0, 3)
-```
-
-Oh no, we hit traffic. In the next ``30`` minutes we only traveled
-``15`` miles. We were so busy looking out for traffic, the speedometer was
-not checked. What would the average speed have been? Though in the ``30``
-minutes of stop-and-go traffic, the displayed speed may have varied, the *average speed*
-would simply be the change in distance over the change in time, or
-$\Delta x / \Delta t$. That is
-
-```julia
-15/(1/2)
-```
-
-
-Now suppose that after $6$ hours of travel the GPS in the car gives us a readout of distance traveled
-as a function of time. The graph looks like this:
-
-```julia; hold=true; echo=false
-function position(t)
-    t <= 3 ? 60*t :
-    t <= 3.5 ? position(3) + 30*(t-3) :
-    t <= 4 ? position(3.5) + 75 * (t-3.5) :
-    t <= 4.5 ? position(4) : position(4.5) + 60*(t-4.5)
-end
-plot(position, 0, 6)
-```
-
-We can see with some effort that the slope is steady for the first three hours, is slightly less between $3$ and
-$3.5$ hours, then is a bit steeper for the next half hour. After that, it is flat for
-about half an hour, then the slope continues on with the same value as in the first
-``3`` hours. What does that say about our speed during our trip?
-
-Based on the graph, what was the average speed over the first three hours? Well, we traveled ``180`` miles, and took ``3`` hours:
-
-```julia
-180/3
-```
-
-What about the next half hour? Squinting shows the amount traveled was ``15`` miles (``195 - 180``) and it took ``1/2`` an hour:
-
-```julia
-15/(1/2)
-```
-
-And the half hour after that?
The average speed is found from the distance traveled, ``37.5`` miles, divided by the time, ``1/2`` hour:
-
-```julia
-37.5 / (1/2)
-```
-
-Okay, so there was some speeding involved.
-
-The next half hour the car did not move. What was the average speed? Well the change in position was ``0``, but the time was ``1/2`` hour, so the average was ``0``.
-
-Perhaps a graph of the speed is a bit more clear. We can do this based on the above:
-
-```julia
-function speed(t)
-    0 < t <= 3 ? 60 :
-    t <= 3.5 ? 30 :
-    t <= 4 ? 75 :
-    t <= 4.5 ? 0 : 60
-end
-plot(speed, 0, 6)
-```
-
-The jumps, as discussed before, are artifacts of the graphing
-algorithm. What is interesting is that we could have derived the graph of
-`speed` from that of `x` by just finding the slopes of the line
-segments, and we could have derived the graph of `x` from that of
-`speed`, just using the simple formula relating distance, rate, and
-time.
-
-!!! note
-    We were pretty loose with some key terms. There is a
-    distinction between "speed" and "velocity", the speed
-    being the absolute value of the velocity. Velocity incorporates a
-    direction as well as a magnitude. Similarly, distance traveled and
-    change in position are not the same thing when there is
-    backtracking involved. The total distance traveled is computed with
-    the speed, the change in position is computed with the
-    velocity. When there is no change of sign, it is a bit more
-    natural, perhaps, to use the language of speed and distance.
-
-##### Example: Galileo's ball and ramp experiment
-
-One of history's most famous experiments was performed by
-[Galileo](http://en.wikipedia.org/wiki/History_of_experiments) where
-he rolled balls down inclined ramps, making note of distance traveled
-with respect to time. As Galileo had no ultra-accurate measuring device,
-he needed to slow movement down by controlling the angle of the
-ramp. With this, he could measure units of distance per units of time.
-(Click through to *Galileo and Perspective* [Dauben](http://www.mcm.edu/academic/galileo/ars/arshtml/mathofmotion1.html).)
-
-
-Suppose that, no matter what the incline was, Galileo observed that, in
-units of the distance traveled in the first second, the distance
-traveled in subsequent seconds was $3$ times, then $5$ times, then
-$7$ times that unit, ... This table summarizes.
-
-```julia; hold=true; echo=false
-ts = [0,1,2,3,4,5]
-dxs = [0,1,3, 5, 7, 9]
-ds = [0,1,4,9,16,25]
-d = DataFrame(t=ts, delta=dxs, distance=ds)
-table(d)
-```
-
-A graph of distance versus time could be found by interpolating between the measured points:
-
-```julia;
-ts = [0,1,2,3,4, 5]
-xs = [0,1,4,9,16,25]
-plot(ts, xs)
-```
-
-The graph looks almost quadratic. What would the following questions have yielded?
-
-* What is the average speed between $0$ and $3$?
-
-```julia
-(9-0) / (3-0) # (xs[4] - xs[1]) / (ts[4] - ts[1])
-```
-
-* What is the average speed between $2$ and $3$?
-
-```julia
-(9-4) / (3-2) # (xs[4] - xs[3]) / (ts[4] - ts[3])
-```
-
-From the graph, we can tell that the slope of the line connecting
-$(2,4)$ and $(3,9)$ will be greater than that connecting $(0,0)$ and
-$(3,9)$. In fact, given the shape of the graph (concave up), the line
-connecting $(0,0)$ with any point will have a slope less than or equal
-to any of the line segments.
-
-The average speed between $k$ and $k+1$ for this graph is:
-
-```julia
-xs[2]-xs[1], xs[3] - xs[2], xs[4] - xs[3], xs[5] - xs[4]
-```
-
-We see it increments by $2$. The acceleration is the rate of change of
-speed. We see the rate of change of speed is constant, as the speed
-increments by ``2`` each time unit.
-
-Based on this - and given Galileo's insight - it appears the
-acceleration for a falling body subject to gravity will be
-**constant** and the position as a function of time will be quadratic.
-
-## The slope of the secant line
-
-In the above examples, we see that the average speed is computed using
-the slope formula.
This can be generalized for any univariate function
-$f(x)$:
-
-> The average rate of change between $a$ and $b$ is $(f(b) - f(a)) /
-> (b - a)$. It is typical to express this as $\Delta y/ \Delta x$,
-> where $\Delta$ means "change".
-
-Geometrically, this is the slope of the line connecting the points
-$(a, f(a))$ and $(b, f(b))$. This line is called a
-[secant](http://en.wikipedia.org/wiki/Secant_line) line, which is just
-a line intersecting two specified points on a curve.
-
-
-Rather than parameterize this problem using $a$ and $b$, we let $c$ and $c+h$ represent the two values for $x$; then the secant-line-slope formula becomes
-
-```math
-m = \frac{f(c+h) - f(c)}{h}.
-```
-
-## The slope of the tangent line
-
-The slope of the secant line represents the average rate of change
-over a given period, $h$. What if this rate is so variable that it
-makes sense to take smaller and smaller periods $h$? In fact, what if
-$h$ goes to $0$?
-
-```julia; hold=true; echo=false; cache=true
-function secant_line_tangent_line_graph(n)
-    f(x) = sin(x)
-    c = pi/3
-    h = 2.0^(-n) * pi/4
-    m = (f(c+h) - f(c))/h
-
-    xs = range(0, stop=pi, length=50)
-    plt = plot(f, 0, pi, legend=false, size=fig_size)
-    plot!(plt, xs, f(c) .+ cos(c)*(xs .- c), color=:orange)
-    plot!(plt, xs, f(c) .+ m*(xs .- c), color=:black)
-    scatter!(plt, [c,c+h], [f(c), f(c+h)], color=:orange, markersize=5)
-
-    plot!(plt, [c, c+h, c+h], [f(c), f(c), f(c+h)], color=:gray30)
-    annotate!(plt, [(c+h/2, f(c), text("h", :top)), (c + h + .05, (f(c) + f(c + h))/2, text("f(c+h) - f(c)", :left))])
-
-    plt
-end
-caption = L"""
-
-The slope of each secant line represents the *average* rate of change between $c$ and $c+h$. As $h$ goes towards $0$, we recover the slope of the tangent line, which represents the *instantaneous* rate of change.
-
-"""
-
-
-
-n = 5
-anim = @animate for i=0:n
-    secant_line_tangent_line_graph(i)
-end
-
-imgfile = tempname() * ".gif"
-gif(anim, imgfile, fps = 1)
-
-ImageFile(imgfile, caption)
-```
-
-
-
-The graphic suggests that the slopes of the secant line converge to
-the slope of a "tangent" line. That is, for a given $c$, this
-limit exists:
-
-```math
-\lim_{h \rightarrow 0} \frac{f(c+h) - f(c)}{h}.
-```
-
-We will define the tangent line at $(c, f(c))$ to be the line through
-the point with the slope from the limit above - provided that limit
-exists. Informally, the tangent line is the line through the point
-that best approximates the function.
-
-```julia; hold=true; echo=false; cache=true
-function line_approx_fn_graph(n)
-    f(x) = sin(x)
-    c = pi/3
-    h = round(2.0^(-n) * pi/2, digits=2)
-    m = cos(c)
-
-    Delta = max(f(c) - f(c-h), f(min(c+h, pi/2)) - f(c))
-
-    p = plot(f, c-h, c+h, legend=false, xlims=(c-h,c+h), ylims=(f(c)-Delta,f(c)+Delta ))
-    plot!(p, x -> f(c) + m*(x-c))
-    scatter!(p, [c], [f(c)])
-    p
-end
-caption = L"""
-
-The tangent line is the best linear approximation to the function at the point $(c, f(c))$. As the viewing window zooms in on $(c,f(c))$ we
- can see how the graph and its tangent line get more similar.
-
-"""
-
-n = 6
-anim = @animate for i=1:n
-    line_approx_fn_graph(i)
-end
-
-imgfile = tempname() * ".gif"
-gif(anim, imgfile, fps = 1)
-
-ImageFile(imgfile, caption)
-```
-
-The tangent line is not just any line that intersects the graph in one
-point - nor does it need to intersect the graph in only one point.
-
-!!! note
-    This last point was certainly not obvious at
-    first. [Barrow](http://www.maa.org/sites/default/files/0746834234133.di020795.02p0640b.pdf),
-    who had Newton as a pupil, and was the first to sketch a proof of
-    part of the Fundamental Theorem of Calculus, understood a tangent
-    line to be a line that intersects a curve at only one point. 
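
The limit of secant slopes can also be explored numerically, before any differentiation rules are available. The following sketch (the helper name `secant_slope` and the particular values of `h` are our own choices) estimates the slope of the tangent line to ``\sin(x)`` at ``c=\pi/3``:

```julia
f(x) = sin(x)
c = pi/3
# slope of the secant line through (c, f(c)) and (c+h, f(c+h))
secant_slope(f, c, h) = (f(c + h) - f(c)) / h
# shrinking h: the secant slopes settle down near cos(pi/3) = 1/2
[secant_slope(f, c, h) for h in (0.1, 0.01, 0.001, 0.0001)]
```

The computed slopes approach ``\cos(\pi/3) = 1/2``, consistent with the limit defining the slope of the tangent line.
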
-
-
-##### Example
-
-What is the slope of the tangent line to $f(x) = \sin(x)$ at $c=0$?
-
-We need to compute the limit $(\sin(c+h) - \sin(c))/h$ with $c=0$, which is the
-limit as $h$ goes to $0$ of $\sin(h)/h.$ We know this limit to be ``1.``
-
-```julia; hold=true
-f(x) = sin(x)
-c = 0
-tl(x) = f(c) + 1 * (x - c)
-plot(f, -pi/2, pi/2)
-plot!(tl, -pi/2, pi/2)
-```
-
-## The derivative
-
-The limit of the slope of the secant line gives an operation: for each
-$c$ in the domain of $f$ there is either a number (the slope of the tangent
-line) or the limit does not exist. That is, there is a derived function from
-$f$. Call this function the *derivative* of $f$.
-
-
-There are many notations for the derivative; mostly we use the "prime" notation:
-
-```math
-f'(x) = \lim_{h \rightarrow 0} \frac{f(x+h) - f(x)}{h}.
-```
-
-The limit above is identical, only it uses $x$ instead of $c$ to
-emphasize that we are thinking of a function now, and not just a value
-at a point.
-
-
-The derivative is related to a function, but at times it is more convenient to write only the expression defining the rule of the function. In that case, we use this notation for the derivative ``[\text{expression}]'``.
-
-### Some basic derivatives
-
-- **The power rule**. What is the derivative of the monomial $f(x) = x^n$? We need to look
-  at $(x+h)^n - x^n$ for positive, integer-valued $n$. Let's look at a case, $n=5$
-
-```julia
-@syms x::real h::real
-n = 5
-ex = expand((x+h)^n - x^n)
-```
-
-All terms have an `h` in them, so we cancel it out:
-
-```julia
-cancel(ex/h, h)
-```
-
-We see the lone term `5x^4` without an $h$, so as we let $h$ go to $0$, this will be the limit. That is, $f'(x) = 5x^4$.
-
-
-For integer-valued, positive, $n$, the binomial theorem gives an
-expansion $(x+h)^n = x^n + nx^{n-1}\cdot h^1 + \frac{n(n-1)}{2}x^{n-2}\cdot h^2 + \cdots$. Subtracting $x^n$
-then dividing by $h$ leaves just the term $nx^{n-1}$ without a power
-of $h$, so the limit, in general, is just this term. 
That is:
-
-```math
-[x^n]' = nx^{n-1}.
-```
-
-
-The case $n=0$ isn't special: the above
-formula still applies, as $x^0$ is the constant $1$, and all constant
-functions have a derivative of $0$ at all $x$. We will see that in
-general, the power rule applies for any $n$ where $x^n$ is defined.
-
-- What is the derivative of $f(x) = \sin(x)$? We know that $f'(0)= 1$
-  by the earlier example with ``(\sin(0+h)-\sin(0))/h = \sin(h)/h``; here we solve the problem in general.
-
-We need to consider the difference $\sin(x+h) - \sin(x)$:
-
-```julia
-sympy.expand_trig(sin(x+h) - sin(x)) # expand_trig is not exposed in `SymPy`
-```
-
-That used the formula $\sin(x+h) = \sin(x)\cos(h) + \sin(h)\cos(x)$.
-
-We could then rearrange the secant line slope formula to become:
-
-```math
-\cos(x) \cdot \frac{\sin(h)}{h} + \sin(x) \cdot \frac{\cos(h) - 1}{h}
-```
-
-and take a limit. If the answer isn't clear, we can let `SymPy` do this work:
-
-```julia
-limit((sin(x+h) - sin(x))/ h, h => 0)
-```
-
-From the formula ``[\sin(x)]' = \cos(x)`` we can easily get the *slope* of the tangent line to ``f(x) = \sin(x)`` at ``x=0`` by simply evaluating ``\cos(0) = 1``.
-
-- Let's see what the derivative of $\ln(x) = \log(x)$ is (using base ``e`` for ``\log`` unless otherwise indicated). We have
-
-```math
-\frac{\log(x+h) - \log(x)}{h} = \frac{1}{h}\log(\frac{x+h}{x}) = \log((1+h/x)^{1/h}).
-```
-
-As noted earlier, Cauchy saw the limit as $u$ goes to $0$ of $f(u) = (1 +
-u)^{1/u}$ is $e$. Re-expressing the above we can get $1/x \cdot
-\log(f(h/x))$. The limit as $h$ goes to $0$ of this is found from
-the composition rules for limits: as $\lim_{h \rightarrow 0} f(h/x) =
-e$, and since $\log(x)$ is continuous at $e$, this expression has a limit of $1/x \cdot \log(e) = 1/x$.
-
-We verify through:
-
-```julia
-limit((log(x+h) - log(x))/h, h => 0)
-```
-
-- The derivative of ``f(x) = e^x`` can also be done from a limit. 
We have
-
-```math
-\frac{e^{x+h} - e^x}{h} = \frac{e^x \cdot(e^h -1)}{h}.
-```
-
-Earlier, we saw that $\lim_{h \rightarrow 0}(e^h - 1)/h = 1$. With this, we get
-$[e^x]' = e^x$; that is, it is a function satisfying $f'=f$.
-
-
-----
-
-There are several different
-[notations](http://en.wikipedia.org/wiki/Notation_for_differentiation)
-for derivatives. Some are historical, some just add
-flexibility. We use the prime notation of Lagrange: ``f'(x)``, ``u'`` and ``[\text{expr}]'``,
-where the first emphasizes that the derivative is a function with a
-value at ``x``, the second emphasizes the derivative operates on
-functions, the last emphasizes that we are taking the derivative of
-some expression.
-
-
-There are many other notations:
-
-- The Leibniz notation uses the infinitesimals: ``dy/dx`` to relate to
-  ``\Delta y/\Delta x``. This notation is very common, and especially
-  useful when more than one variable is involved. `SymPy` uses
-  Leibniz notation in some of its output, expressing some derivatives such
-  as:
-
-```math
-f'(x) = \frac{d}{d\xi}(f(\xi)) \big|_{\xi=x}.
-```
-
-  The notation
-  ``\big|``
-  on the right-hand side separates the tasks of finding the
-  derivative and evaluating the derivative at a specific value.
-
-- Euler used `D` for the operator `D(f)`. This was initially used by
-  [Arbogast](http://jeff560.tripod.com/calculus.html). The notation `D(f)(c)` would be needed to evaluate the derivative at a point.
-
-- Newton used a "dot" above the variable, ``\dot{x}(t)``, which is still widely used in physics to indicate a derivative in time. This indicates taking the derivative and then plugging in ``t``.
-
-- The notation ``[expr]'(c)`` or ``[expr]'\big|_{x=c}`` would similarly mean, take the derivative of the expression and **then** evaluate it at ``c``.
-
-
-
-
-## Rules of derivatives
-
-We could proceed in a similar manner -- using limits to find other
-derivatives, but let's not. 
If we have a function $f(x) = x^5
-\sin(x)$, it would be nice to leverage our previous work on the
-derivatives of $f(x) =x^5$ and $g(x) = \sin(x)$, rather than derive an
-answer from scratch.
-
-
-As with limits and continuity, it proves very useful to consider rules
-that make the process of finding derivatives of combinations of
-functions a matter of combining derivatives of the individual functions in some manner.
-
-We already have one such rule:
-
-### Power rule
-
-We have seen for integer ``n \geq 0`` the formula:
-
-```math
-[x^n]' = n x^{n-1}.
-```
-
-This will be shown true for all real exponents.
-
-### Sum rule
-
-Let's consider $k(x) = a\cdot f(x) + b\cdot g(x)$; what is its derivative? That is, in terms of $f$, $g$ and their derivatives, can we express $k'(x)$?
-
-We can rearrange $(k(x+h) - k(x))$ as follows:
-
-```math
-(a\cdot f(x+h) + b\cdot g(x+h)) - (a\cdot f(x) + b \cdot g(x)) =
-a\cdot (f(x+h) - f(x)) + b \cdot (g(x+h) - g(x)).
-```
-
-Dividing by $h$, we see that this becomes
-
-```math
-a\cdot \frac{f(x+h) - f(x)}{h} + b \cdot \frac{g(x+h) - g(x)}{h} \rightarrow a\cdot f'(x) + b\cdot g'(x).
-```
-
-That is, ``[a\cdot f(x) + b \cdot g(x)]' = a\cdot f'(x) + b\cdot g'(x)``.
-
-
-This encapsulates two rules: the derivative of a constant times a function is
-the constant times the derivative of the function; and the derivative
-of a sum of functions is the sum of the derivatives of the functions.
-
-This example shows a useful template:
-
-```math
-\begin{align*}
-[2x^2 - \frac{x}{3} + 3e^x]' & = 2[\square]' - \frac{[\square]'}{3} + 3[\square]'\\
-&= 2[x^2]' - \frac{[x]'}{3} + 3[e^x]'\\
-&= 2(2x) - \frac{1}{3} + 3e^x\\
-&= 4x - \frac{1}{3} + 3e^x
-\end{align*}
-```
-
-### Product rule
-
-Other rules can be similarly derived. `SymPy` can derive them for us as
-well. 
Here we define two symbolic functions `u` and `v` and let `SymPy`
-derive a formula for the derivative of a product of functions:
-
-```julia; hold=true;
-@syms u() v()
-f(x) = u(x) * v(x)
-limit((f(x+h) - f(x))/h, h => 0)
-```
-
-The output uses the Leibniz notation to represent that the derivative
-of $u(x) \cdot v(x)$ is $u$ times the derivative of $v$ evaluated
-at ``x`` plus $v$ times the derivative of $u$ evaluated at ``x``. A
-common shorthand is $[uv]' = u'v + uv'$.
-
-This example shows a useful template for the product rule:
-
-```math
-\begin{align*}
-[(x^2+1)\cdot e^x]' &= [\square]' \cdot (\square) + (\square) \cdot [\square]'\\
-&= [x^2 + 1]' \cdot (e^x) + (x^2+1) \cdot [e^x]'\\
-&= (2x)\cdot e^x + (x^2+1)\cdot e^x
-\end{align*}
-```
-
-
-
-### Quotient rule
-
-The derivative of $f(x) = u(x)/v(x)$ - a ratio of functions - can be
-similarly computed. The result will be $[u/v]' = (u'v - uv')/v^2$:
-
-```julia; hold=true;
-@syms u() v()
-f(x) = u(x) / v(x)
-limit((f(x+h) - f(x))/h, h => 0)
-```
-
-This example shows a useful template for the quotient rule:
-
-```math
-\begin{align*}
-[\frac{x^2+1}{e^x}]' &= \frac{[\square]' \cdot (\square) - (\square) \cdot [\square]'}{(\square)^2}\\
-&= \frac{[x^2 + 1]' \cdot (e^x) - (x^2+1) \cdot [e^x]'}{(e^x)^2}\\
-&= \frac{(2x)\cdot e^x - (x^2+1)\cdot e^x}{e^{2x}}
-\end{align*}
-```
-
-##### Examples
-
-Compute the derivative of ``f(x) = (1 + \sin(x)) + (1 + x^2)``.
-
-As written we can identify ``f(x) = u(x) + v(x)`` with
-``u=(1 + \sin(x))``, ``v=(1 + x^2)``. The sum rule immediately applies to give:
-
-```math
-f'(x) = (\cos(x)) + (2x).
-```
-
-----
-
-Compute the derivative of ``f(x) = (1 + \sin(x)) \cdot (1 + x^2)``.
-
-The same ``u`` and ``v`` may be identified. The product rule readily applies to yield:
-
-```math
-f'(x) = u'v + uv' = \cos(x) \cdot (1 + x^2) + (1 + \sin(x)) \cdot (2x).
-```
-
-----
-
-Compute the derivative of ``f(x) = (1 + \sin(x)) / (1 + x^2)``. 
-
-The same ``u`` and ``v`` may be identified. The quotient rule readily applies to yield:
-
-```math
-f'(x) = \frac{u'v - uv'}{v^2} = \frac{\cos(x) \cdot (1 + x^2) - (1 + \sin(x)) \cdot (2x)}{(1+x^2)^2}.
-```
-
-----
-
-Compute the derivative of ``f(x) = (x-1) \cdot (x-2)``.
-
-This can be done using the product rule *or* by expanding the polynomial and using the power and sum rule. As this polynomial is easy to expand, we do both and compare:
-
-```math
-[(x-1)(x-2)]' = [x^2 - 3x + 2]' = 2x -3.
-```
-
-Whereas the product rule gives:
-
-```math
-[(x-1)(x-2)]' = 1\cdot (x-2) + (x-1)\cdot 1 = 2x - 3.
-```
-
-----
-
-Find the derivative of $f(x) = (x-1)(x-2)(x-3)(x-4)(x-5)$.
-
-We could expand this, as above, but without computer assistance the potential for error is high. Instead we will use the product rule on the product of ``5`` terms.
-
-Let's first treat the case of $3$ products:
-
-```math
-[u\cdot v\cdot w]' =[ u \cdot (vw)]' = u' (vw) + u [vw]' = u'(vw) + u[v' w + v w'] =
-u' vw + u v' w + uvw'.
-```
-
-This pattern generalizes, clearly, to:
-```math
-[f_1\cdot f_2 \cdots f_n]' = f_1' \cdot f_2 \cdots f_n + f_1 \cdot f_2' \cdot (f_3 \cdots f_n) + \dots +
-f_1 \cdots f_{n-1} \cdot f_n'.
-```
-
-There are $n$ terms, and in each exactly one of the $f_i$s is differentiated. Were we to multiply and divide each term by $f_i$, each term would take the form $f \cdot f_i'/f_i$.
-
-With this, we can proceed. Each term $x-i$ has derivative $1$, so the answer to $f'(x)$, with $f$ as above, is $f'(x) = f(x)/(x-1) + f(x)/(x-2) + f(x)/(x-3) + f(x)/(x-4) + f(x)/(x-5)$, that is:
-
-```math
-f'(x) = (x-2)(x-3)(x-4)(x-5) + (x-1)(x-3)(x-4)(x-5) + (x-1)(x-2)(x-4)(x-5) + (x-1)(x-2)(x-3)(x-5) + (x-1)(x-2)(x-3)(x-4).
-```
-
-----
-
-Find the derivative of ``x\sin(x)`` evaluated at ``\pi``.
-
-```math
-[x\sin(x)]'\big|_{x=\pi} = (1\sin(x) + x\cos(x))\big|_{x=\pi} = (\sin(\pi) + \pi \cdot \cos(\pi)) = -\pi. 
-```
-
-### Chain rule
-
-Finally, the derivative of a composition of functions can be computed
-using pieces of each function. This gives a rule called the *chain
-rule*. Before deriving, let's give a slight motivation.
-
-
-Consider the output of a factory for some widget. It depends on two steps:
-an initial manufacturing step and a finishing step. The number of
-employees is important in how much is initially manufactured. Suppose
-$x$ is the number of employees and $g(x)$ is the amount initially
-manufactured. Adding more employees increases the amount made by the
-made-up rule $g(x) = \sqrt{x}$. The finishing step depends on how much
-is made by the employees. If $y$ is the amount made, then $f(y)$ is
-the number of widgets finished. Suppose for some reason that $f(y) =
-y^2.$
-
-How many widgets are made as a function of employees? The composition
-$u(x) = f(g(x))$ would provide that. Changes in the initial manufacturing step lead to changes in how much is initially made; changes in the initial amount made lead to changes in the finished products. Each change contributes to the overall change.
-
-What is the effect of adding employees on the rate of output of widgets?
-In this specific case we know the answer: as $(f \circ g)(x) = x$,
-the rate is just $1$.
-
-In general, we want to express $\Delta f / \Delta x$ in a form so that we can take a limit.
-
-But what do we know? We know $\Delta g / \Delta x$ and $\Delta f/\Delta y$. Using $y=g(x)$, this suggests that we might have luck with the right side of this equation:
-
-```math
-\frac{\Delta f}{\Delta x} = \frac{\Delta f}{\Delta y} \cdot \frac{\Delta y}{\Delta x}.
-```
-
-
-Interpreting this, we see the *average* rate of change of the
-composition can be thought of as a product: the *average* rate of
-change of the initial step ($\Delta y/ \Delta x$) times the *average*
-rate of change of the second step evaluated not at $x$, but at
-$y$, $\Delta f/ \Delta y$. 
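
Before any limits are taken, this product identity can be checked numerically for the factory example. A small sketch (the variable names are our own; ``g(x)=\sqrt{x}`` and ``f(y)=y^2`` are as above):

```julia
g(x) = sqrt(x)      # initial manufacturing step
f(y) = y^2          # finishing step
x, h = 100.0, 1.0   # one employee added to a workforce of 100
Δy = g(x + h) - g(x)
Δf = f(g(x + h)) - f(g(x))
# the average rate of the composition equals the product of the two average rates
Δf / h, (Δf / Δy) * (Δy / h)
```

Both values are ``1``, as expected, since here ``(f\circ g)(x) = x``.
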
-
-
-Re-expressing using derivative notation with $h$ would be:
-
-
-```math
-\frac{f(g(x+h)) - f(g(x))}{h} = \frac{f(g(x+h)) - f(g(x))}{g(x+h) - g(x)} \cdot \frac{g(x+h) - g(x)}{h}.
-```
-
-The left hand side will converge to the derivative of $u(x)$ or $[f(g(x))]'$.
-
-The rightmost part of the right side would have a limit $g'(x)$, were
-we to let $h$ go to $0$.
-
-It isn't obvious, but the left part of the right side has the limit
-$f'(g(x))$. This would be clear if *only* $g(x+h) = g(x) + h$, for
-then the expression would be exactly the limit expression with
-$c=g(x)$. But, alas, except to some hopeful students and in some special
-cases, it is definitely not the case in general that $g(x+h) = g(x) + h$ - those
-parentheses on the right actually mean something. However, it is *nearly*
-the case that $g(x+h) = g(x) + kh$ for some $k$, and this can be used to formulate a
-proof (one of the two detailed
-[here](http://en.wikipedia.org/wiki/Chain_rule#Proofs) and [here](http://kruel.co/math/chainrule.pdf)).
-
-
-Combined, we would end up with:
-
-
-> *The chain rule*: $[f(g(x))]' = f'(g(x)) \cdot g'(x)$. That is, the
-> derivative of the outer function evaluated at the inner function
-> times the derivative of the inner function.
-
-
-To see that this works in our specific case, we assume the general
-power rule that $[x^n]' = n x^{n-1}$ to get:
-
-```math
-\begin{align*}
-f(x) &= x^2 & g(x) &= \sqrt{x}\\
-f'(\square) &= 2(\square) & g'(x) &= \frac{1}{2}x^{-1/2}
-\end{align*}
-```
-
-We use ``\square`` for the argument of `f'` to emphasize that ``g(x)`` is the needed value, not just ``x``:
-
-```math
-\begin{align*}
-[(\sqrt{x})^2]' &= [f(g(x))]'\\
-&= f'(g(x)) \cdot g'(x) \\
-&= 2(\sqrt{x}) \cdot \frac{1}{2}x^{-1/2}\\
-&= \frac{2\sqrt{x}}{2\sqrt{x}}\\
-&=1
-\end{align*}
-```
-
-
-This is the same as the derivative of $x$ found by first evaluating the composition. 
For this problem, the chain rule is not necessary, but typically it is a needed rule to fully differentiate a function.
-
-##### Examples
-
-Find the derivative of ``f(x) = \sqrt{1 - x^2}``. We identify the composition of ``\sqrt{x}`` and ``(1-x^2)``. We set the functions and their derivatives into a pattern to emphasize the pieces in the chain-rule formula:
-
-```math
-\begin{align*}
-f(x) &=\sqrt{x} = x^{1/2} & g(x) &= 1 - x^2 \\
-f'(\square) &=(1/2)(\square)^{-1/2} & g'(x) &= -2x
-\end{align*}
-```
-
-Then:
-
-```math
-[f(g(x))]' = (1/2)(1-x^2)^{-1/2} \cdot (-2x).
-```
-
-----
-
-Find the derivative of ``\log(2 + \sin(x))``. This is a composition ``\log(x)`` -- with derivative ``1/x`` and ``2 + \sin(x)`` -- with derivative ``\cos(x)``. We get ``(1/(2 + \sin(x))) \cdot \cos(x)``.
-
-In general,
-
-```math
-[\log(f(x))]' = \frac{f'(x)}{f(x)}.
-```
-
-----
-
-Find the derivative of ``e^{f(x)}``. The inner function has derivative ``f'(x)``, the outer function has derivative ``e^x`` (the same as the outer function itself). We get for a derivative
-
-```math
-[e^{f(x)}]' = e^{f(x)} \cdot f'(x).
-```
-
-This is a useful rule to remember for expressions involving exponentials.
-
-----
-
-Find the derivative of ``\sin(x)\cos(2x)`` at ``x=\pi``.
-
-```math
-[\sin(x)\cos(2x)]'\big|_{x=\pi} =
-(\cos(x)\cos(2x) + \sin(x)(-\sin(2x)\cdot 2))\big|_{x=\pi} =
-((-1)(1) + (0)(-2 \cdot 0)) = -1.
-```
-
-##### Proof of the Chain Rule
-
-A function is *differentiable* at $a$ if the following limit exists: $\lim_{h \rightarrow 0}(f(a+h)-f(a))/h$. This can be re-expressed as $f(a+h) - f(a) - f'(a)h = \epsilon_f(h) h$ where, as $h\rightarrow 0$, $\epsilon_f(h) \rightarrow 0$. 
Then, we have:
-
-```math
-g(a+h) = g(a) + g'(a)h + \epsilon_g(h) h = g(a) + h',
-```
-
-where $h' = (g'(a) + \epsilon_g(h))h \rightarrow 0$ as $h \rightarrow 0$ will be used to simplify the following:
-
-
-
-```math
-\begin{align}
-f(g(a+h)) - f(g(a)) &=
-f(g(a) + g'(a)h + \epsilon_g(h)h) - f(g(a)) \\
-&= f(g(a)) + f'(g(a)) (g'(a)h + \epsilon_g(h)h) + \epsilon_f(h')(h') - f(g(a))\\
-&= f'(g(a)) g'(a)h + f'(g(a))(\epsilon_g(h)h) + \epsilon_f(h')(h').
-\end{align}
-```
-
-Rearranging:
-
-```math
-f(g(a+h)) - f(g(a)) - f'(g(a)) g'(a) h = f'(g(a))\epsilon_g(h)h + \epsilon_f(h')(h') =
-\left(f'(g(a)) \epsilon_g(h) + \epsilon_f(h')(g'(a) + \epsilon_g(h))\right)h =
-\epsilon(h)h,
-```
-
-where $\epsilon(h)$ combines the above terms which go to zero as $h\rightarrow 0$ into one. This is
-the alternative definition of the derivative, showing $(f\circ g)'(a) = f'(g(a)) g'(a)$ when $g$ is differentiable at $a$ and $f$ is differentiable at $g(a)$.
-
-##### The "chain" rule
-
-The chain rule name could also be simply the "composition rule," as that is the operation the rule works for. However, in practice, there are usually *multiple* compositions, and the "chain" rule is used to chain together the different pieces. To get a sense, consider a triple composition ``u(v(w(x)))``. This will have derivative:
-
-```math
-\begin{align*}
-[u(v(w(x)))]' &= u'(v(w(x))) \cdot [v(w(x))]' \\
-&= u'(v(w(x))) \cdot v'(w(x)) \cdot w'(x)
-\end{align*}
-```
-
-The answer can be viewed as a repeated peeling off of the outer
-function, a view with immediate application to many compositions. 
To -see that in action with an expression, consider this derivative -problem, shown in steps: - -```math -\begin{align*} -[\sin(e^{\cos(x^2-x)})]' -&= \cos(e^{\cos(x^2-x)}) \cdot [e^{\cos(x^2-x)}]'\\ -&= \cos(e^{\cos(x^2-x)}) \cdot e^{\cos(x^2-x)} \cdot [\cos(x^2-x)]'\\ -&= \cos(e^{\cos(x^2-x)}) \cdot e^{\cos(x^2-x)} \cdot (-\sin(x^2-x)) \cdot [x^2-x]'\\ -&= \cos(e^{\cos(x^2-x)}) \cdot e^{\cos(x^2-x)} \cdot (-\sin(x^2-x)) \cdot (2x-1)\\ -\end{align*} -``` - - -##### More examples of differentiation - -Find the derivative of $x^5 \cdot \sin(x)$. - -This is a product of functions, using $[u\cdot v]' = u'v + uv'$ we get: - -```math -5x^4 \cdot \sin(x) + x^5 \cdot \cos(x) -``` ----- - -Find the derivative of $x^5 / \sin(x)$. - -This is a quotient of functions. Using $[u/v]' = (u'v - uv')/v^2$ we get - -```math -(5x^4 \cdot \sin(x) - x^5 \cdot \cos(x)) / (\sin(x))^2. -``` - ----- - -Find the derivative of $\sin(x^5)$. This is a composition of -functions $u(v(x))$ with $v(x) = x^5$. The chain rule says find the -derivative of $u$ ($\cos(x)$) and evaluate at $v(x)$ ($\cos(x^5)$) -then multiply by the derivative of $v$: - -```math -\cos(x^5) \cdot 5x^4. -``` - ----- - -Similarly, but differently, find the derivative of $\sin(x)^5$. Now -$v(x) = \sin(x)$, so the derivative of $u(x)$ ($5x^4$) evaluated at -$v(x)$ is $5(\sin(x))^4$ so multiplying by $v'$ gives: - -```math -5(\sin(x))^4 \cdot \cos(x) -``` - ----- - -We can verify these with `SymPy`. Rather than take a limit, we will -use `SymPy`'s `diff` function to compute derivatives. - -```julia -diff(x^5 * sin(x)) -``` - -```julia -diff(x^5/sin(x)) -``` - -```julia -diff(sin(x^5)) -``` - -and finally, - -```julia -diff(sin(x)^5) -``` - -!!! note - The `diff` function can be called as `diff(ex)` when there is - just one free variable, as in the above examples; as `diff(ex, - var)` when there are parameters in the expression. 
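
As a check on the by-hand chain-rule computation of ``[\sin(e^{\cos(x^2-x)})]'`` shown above, `SymPy` can differentiate the same expression (this sketch assumes the symbolic variable `x` defined earlier in the section):

```julia
# differentiate the triple composition from the chain-rule example above
ex = sin(exp(cos(x^2 - x)))
diff(ex, x)
```

The output agrees with the peeled-off product ``\cos(e^{\cos(x^2-x)}) \cdot e^{\cos(x^2-x)} \cdot (-\sin(x^2-x)) \cdot (2x-1)``.
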
-
-----
-
-The general power rule: For any $n$ - not just integer values - we can re-express $x^n$ using $e$: $x^n = e^{n \log(x)}$. Now the chain rule can be applied:
-
-```math
-[x^n]' = [e^{n\log(x)}]' = e^{n\log(x)} \cdot (n \frac{1}{x}) = n x^n \cdot \frac{1}{x} = n x^{n-1}.
-```
-
-----
-
-Find the derivative of $f(x) = x^3 (1-x)^2$ using either the product rule or the sum rule.
-
-The product rule expresses $f=u\cdot v$. With $u(x)=x^3$ and $v(x)=(1-x)^2$ we get:
-
-```math
-u'(x) = 3x^2, \quad v'(x) = 2 \cdot (1-x)^1 \cdot (-1),
-```
-
-the last by the chain rule. Combining with $u' v + u v'$ we get:
-$f'(x) = (3x^2)\cdot (1-x)^2 + x^3 \cdot (-2) \cdot (1-x)$.
-
-Otherwise, the polynomial can be expanded to give
-$f(x)=x^5-2x^4+x^3$ which has derivative $f'(x) = 5x^4 - 8x^3 + 3x^2$.
-
-----
-
-Find the derivative of $f(x) = x \cdot e^{-x^2}$.
-
-Using the product rule and then the chain rule, we have:
-
-```math
-\begin{align}
-f'(x) &= [x \cdot e^{-x^2}]'\\
-&= [x]' \cdot e^{-x^2} + x \cdot [e^{-x^2}]'\\
-&= 1 \cdot e^{-x^2} + x \cdot (e^{-x^2}) \cdot [-x^2]'\\
-&= e^{-x^2} + x \cdot e^{-x^2} \cdot (-2x)\\
-&= e^{-x^2} (1 - 2x^2).
-\end{align}
-```
-
-----
-
-Find the derivative of $f(x) = e^{-ax} \cdot \sin(x)$.
-
-Using the product rule and then the chain rule, we have:
-
-```math
-\begin{align}
-f'(x) &= [e^{-ax} \cdot \sin(x)]'\\
-&= [e^{-ax}]' \cdot \sin(x) + e^{-ax} \cdot [\sin(x)]'\\
-&= e^{-ax} \cdot [-ax]' \cdot \sin(x) + e^{-ax} \cdot \cos(x)\\
-&= e^{-ax} \cdot (-a) \cdot \sin(x) + e^{-ax} \cos(x)\\
-&= e^{-ax}(\cos(x) - a\sin(x)).
-\end{align}
-```
-
-----
-
-Find the derivative of ``e^{-x^2/2}`` at ``x=1``.
-
-```math
-[e^{-x^2/2}]'\big|_{x=1} =
-(e^{-x^2/2} \cdot \frac{-2x}{2}) \big|_{x=1} =
-e^{-1/2} \cdot (-1) = -e^{-1/2}.
-```
-
-##### Example: derivative of inverse functions
-
-
-Suppose we knew that $\log(x)$ had derivative $1/x$, but didn't know the derivative of $e^x$. 
From their inverse relation, we have: $x=\log(e^x)$, so taking derivatives of both sides would yield:
-
-```math
-1 = (\frac{1}{e^x}) \cdot [e^x]'.
-```
-
-Or solving, $[e^x]' = e^x$. This is a general strategy to find the
-derivative of an *inverse* function.
-
-
-The graph of an inverse function is related to the graph of the function through the symmetry ``y=x``.
-
-For example, the graphs of ``e^x`` and ``\log(x)`` have this symmetry, emphasized below:
-
-```julia;hold=true;echo=false;
-f(x) = exp(x)
-f′(x) = exp(x)
-f⁻¹(x) = log(x) # using Unicode typed with "f^\-^\1"
-xs = range(0, 2, length=25)
-ys = f.(xs)
-plot(f, 0, 2, aspect_ratio=:equal, xlim=(0,8), ylim=(0,8), legend=false)
-scatter!(xs, ys)
-plot!(f⁻¹, extrema(ys)...)
-scatter!(ys, xs, color=:blue)
-plot!(identity, linestyle=:dot) # the line y=x
-x₀, y₀ = xs[13], ys[13]
-plot!([x₀, y₀],[y₀, x₀], linestyle=:dash)
-ys′ = @. y₀ + f(x₀)*(xs - x₀)
-plot!(xs, ys′, linestyle=:dash)
-g(y) = 1/f′(f⁻¹(y))
-xs′ = @. x₀ + g(y₀) * (ys - y₀)
-plot!(ys, xs′, linestyle=:dash)
-```
-
-The point ``(1, e)`` on the graph of ``e^x`` matches the point ``(e, 1)`` on the graph of the inverse function, ``\log(x)``. The slope of the tangent line at ``x=1`` to ``e^x`` is given by ``e`` as well. What is the slope of the tangent line to ``\log(x)`` at ``x=e``?
-
-As seen, the value can be computed, but how?
-
-Finding the derivative of the inverse function can be achieved from the chain rule using the identity ``f^{-1}(f(x)) = x`` for all ``x`` in the domain of ``f``.
-
-The chain rule, applied to both sides, yields:
-
-```math
-1 = [f^{-1}]'(f(x)) \cdot f'(x)
-```
-
-Solving, we see that ``[f^{-1}]'(f(x)) = 1/f'(x)``. To emphasize the evaluation of the derivative of the inverse function at ``f(x)`` we might write:
-
-```math
-\frac{d}{du} (f^{-1}(u)) \big|_{u=f(x)} = \frac{1}{f'(x)}
-```
-
-That is, the slope of the tangent line to ``f^{-1}`` is the reciprocal of the slope of the tangent line to ``f`` at the mirror-image point. 
In the above, we see that if the slope of the tangent line at ``(1,e)`` to ``f`` is ``e``, then the slope of the tangent line to ``f^{-1}(x)`` at ``(e,1)`` is ``1/e``.
-
-#### Rules of derivatives and some sample functions
-
-This table summarizes the rules of derivatives that allow derivatives
-of more complicated expressions to be computed with the derivatives of
-their pieces.
-
-```julia; hold=true; echo=false
-nm = ["Power rule", "constant", "sum/difference", "product", "quotient", "chain"]
-rule = [L"[x^n]' = n\cdot x^{n-1}",
-L"[cf(x)]' = c \cdot f'(x)",
-L"[f(x) \pm g(x)]' = f'(x) \pm g'(x)",
-L"[f(x) \cdot g(x)]' = f'(x)\cdot g(x) + f(x) \cdot g'(x)",
-L"[f(x)/g(x)]' = (f'(x) \cdot g(x) - f(x) \cdot g'(x)) / g(x)^2",
-L"[f(g(x))]' = f'(g(x)) \cdot g'(x)"]
-d = DataFrame(Name=nm, Rule=rule)
-table(d)
-```
-
-This table gives some useful derivatives:
-
-```julia; hold=true; echo=false
-fn = [L"x^n (\text{ all } n)",
-L"e^x",
-L"\log(x)",
-L"\sin(x)",
-L"\cos(x)"]
-a = [L"nx^{n-1}",
-L"e^x",
-L"1/x",
-L"\cos(x)",
-L"-\sin(x)"]
-d = DataFrame(Function=fn, Derivative=a)
-table(d)
-```
-
-## Higher-order derivatives
-
-The derivative of a function is an operator: it takes a function and
-returns a new, derived, function. We could repeat this
-operation. The result is called a higher-order derivative. The
-Lagrange notation uses additional "primes" to indicate how many. So
-$f''(x)$ is the second derivative and $f'''(x)$ the third. For even
-higher orders, sometimes the notation is $f^{(n)}(x)$ to indicate an
-$n$th derivative.
-
-
-##### Examples
-
-Find the first ``3`` derivatives of ``f(x) = ax^3 + bx^2 + cx + d``.
-
-Differentiating a polynomial is done with the sum rule; here we repeat it three times:
-
-```math
-\begin{align}
-f(x) &= ax^3 + bx^2 + cx + d\\
-f'(x) &= 3ax^2 + 2bx + c \\
-f''(x) &= 3\cdot 2 a x + 2b \\
-f'''(x) &= 6a
-\end{align}
-```
-
-We can see that the fourth derivative -- and all higher order ones -- would be identically ``0``. 
This is part of a general phenomenon: an ``n``th degree polynomial has only ``n`` non-zero derivatives.
-
-
-----
-
-Find the first ``5`` derivatives of ``\sin(x)``.
-
-```math
-\begin{align}
-f(x) &= \sin(x) \\
-f'(x) &= \cos(x) \\
-f''(x) &= -\sin(x) \\
-f'''(x) &= -\cos(x) \\
-f^{(4)}(x) &= \sin(x) \\
-f^{(5)}(x) &= \cos(x)
-\end{align}
-```
-
-We see the derivatives repeat themselves. (We also see alternative notation for higher order derivatives.)
-
-
-----
-
-Find the second derivative of $e^{-x^2}$.
-
-We need the chain rule *and* the product rule:
-
-```math
-[e^{-x^2}]'' = [e^{-x^2} \cdot (-2x)]' = \left(e^{-x^2} \cdot (-2x)\right) \cdot(-2x) + e^{-x^2} \cdot (-2) = e^{-x^2}(4x^2 - 2).
-```
-
-This can be verified:
-
-```julia
-diff(diff(exp(-x^2))) |> simplify
-```
-
-Having to iterate the use of `diff` is cumbersome. An alternate notation is either specifying the variable twice: `diff(ex, x, x)` or using a number after the variable: `diff(ex, x, 2)`:
-
-
-```julia
-diff(exp(-x^2), x, x) |> simplify
-```
-
-Higher-order derivatives can become complicated when the product or quotient rule is involved.
-
-
-## Questions
-
-###### Question
-
-The derivative at $c$ is the slope of the tangent line at $x=c$. Answer the following based on this graph:
-
-```julia
-fn = x -> -x*exp(x)*sin(pi*x)
-plot(fn, 0, 2)
-```
-
-At which of these points $c= 1/2, 1, 3/2$ is the derivative negative?
-
-```julia; hold=true; echo=false
-choices = ["``1/2``", "``1``", "``3/2``"]
-answ = 1
-radioq(choices, answ, keep_order=true)
-```
-
-Which value looks bigger from reading the graph:
-
-```julia; hold=true; echo=false
-choices = ["``f(1)``", "``f(3/2)``"]
-answ = 2
-radioq(choices, answ, keep_order=true)
-```
-
-At $0.708 \dots$ and $1.65\dots$ the derivative has a common value. What is it?
-
-```julia; hold=true; echo=false
-numericq(0, 1e-2)
-```
-
-###### Question
-
-Consider the graph of the `airyai` function (from `SpecialFunctions`) over $[-5, 5]$. 
-
-```julia; hold=true;echo=false
-plot(airyai, -5, 5)
-```
-
-At $x = -2.5$ the derivative is positive or negative?
-
-```julia; hold=true; echo=false
-choices = ["positive", "negative"]
-answ = 1
-radioq(choices, answ, keep_order=true)
-```
-
-
-At $x=0$ the derivative is positive or negative?
-
-```julia; hold=true; echo=false
-choices = ["positive", "negative"]
-answ = 2
-radioq(choices, answ, keep_order=true)
-```
-
-At $x = 2.5$ the derivative is positive or negative?
-
-```julia; hold=true; echo=false
-choices = ["positive", "negative"]
-answ = 2
-radioq(choices, answ, keep_order=true)
-```
-
-###### Question
-
-Compute the derivative of $e^x$ using `limit`. What do you get?
-
-```julia; hold=true; echo=false
-choices = ["``e^x``", "``x^e``", "``(e-1)x^e``", "``e x^{(e-1)}``", "something else"]
-answ = 1
-radioq(choices, answ, keep_order=true)
-```
-
-###### Question
-
-Compute the derivative of $x^e$ using `limit`. What do you get?
-
-```julia; hold=true; echo=false
-choices = ["``e^x``", "``x^e``", "``(e-1)x^e``", "``e x^{(e-1)}``", "something else"]
-answ = 5
-radioq(choices, answ, keep_order=true)
-```
-
-###### Question
-
-Compute the derivative of $e^{e\cdot x}$ using `limit`. What do you get?
-
-```julia; hold=true; echo=false
-choices = ["``e^x``", "``x^e``", "``(e-1)x^e``", "``e x^{(e-1)}``", "``e \\cdot e^{e\\cdot x}``", "something else"]
-answ = 5
-radioq(choices, answ, keep_order=true)
-```
-
-###### Question
-
-In the derivation of the derivative of $\sin(x)$, the following limit is needed:
-
-```math
-L = \lim_{h \rightarrow 0} \frac{\cos(h) - 1}{h}.
-```
-
-This is
-
-```julia; hold=true; echo=false
-choices = [
-L" $1$, as this is clearly the analog of the limit of $\sin(h)/h$.",
-L"Does not exist. The answer is $0/0$ which is undefined",
-L" $0$, as this expression is the derivative of cosine at $0$. 
The answer follows, as cosine clearly has a tangent line with slope $0$ at $x=0$."]
-answ = 3
-radioq(choices, answ)
-```
-
-###### Question
-
-Let $f(x) = (e^x + e^{-x})/2$ and $g(x) = (e^x - e^{-x})/2$. Which is true?
-
-```julia; hold=true; echo=false
-choices = [
-"``f'(x) = g(x)``",
-"``f'(x) = -g(x)``",
-"``f'(x) = f(x)``",
-"``f'(x) = -f(x)``"
-]
-answ= 1
-radioq(choices, answ)
-```
-
-
-
-###### Question
-
-Let $f(x) = (e^x + e^{-x})/2$ and $g(x) = (e^x - e^{-x})/2$. Which is true?
-
-```julia; hold=true; echo=false
-choices = [
-"``f''(x) = g(x)``",
-"``f''(x) = -g(x)``",
-"``f''(x) = f(x)``",
-"``f''(x) = -f(x)``"]
-answ= 3
-radioq(choices, answ)
-```
-
-
-
-
-
-###### Question
-
-Consider the function $f$ and its transformation $g(x) = a + f(x)$
-(shift up by $a$). Do $f$ and $g$ have the same derivative?
-
-```julia; hold=true; echo=false
-yesnoq("yes")
-```
-
-Consider the function $f$ and its transformation $g(x) = f(x - a)$
-(shift right by $a$). Do $f$ and $g$ have the same derivative?
-
-```julia; hold=true; echo=false
-yesnoq("no")
-```
-
-
-
-Consider the function $f$ and its transformation $g(x) = f(x - a)$
-(shift right by $a$). Is $f'$ at $x$ equal to $g'$ at $x+a$?
-
-```julia; hold=true; echo=false
-yesnoq("yes")
-```
-
-Consider the function $f$ and its transformation $g(x) = c f(x)$, $c >
-1$. Do $f$ and $g$ have the same derivative?
-
-```julia; hold=true; echo=false
-yesnoq("no")
-```
-
-Consider the function $f$ and its transformation $g(x) = f(x/c)$, $c >
-1$. Do $f$ and $g$ have the same derivative?
-
-```julia; hold=true; echo=false
-yesnoq("no")
-```
-
-Which of the following is true?
-
-```julia; hold=true; echo=false
-choices = [
-L"If the graphs of $f$ and $g$ are translations up and down, the tangent line at corresponding points is unchanged.",
-L"If the graphs of $f$ and $g$ are rescalings of each other through $g(x)=f(x/c)$, $c > 1$.
Then the tangent line for corresponding points is the same.",
-L"If the graphs of $f$ and $g$ are rescalings of each other through $g(x)=cf(x)$, $c > 1$. Then the tangent line for corresponding points is the same."
-]
-answ = 1
-radioq(choices, answ)
-```
-
-
-
-###### Question
-
-The rate of change of volume with respect to height is $3h$. The rate of change of height with respect to time is $2t$. If at $t=3$ the height is $h=14$, what is the rate of change of volume with respect to time when $t=3$?
-
-```julia; hold=true; echo=false
-## dv/dt = dv/dh * dh/dt = 3h * 2t
-h = 14; t=3
-val = (3*h) * (2*t)
-numericq(val)
-```
-
-
-###### Question
-
-Which equation below is $f(x) = \sin(k\cdot x)$ a solution of ($k > 1$)?
-
-```julia; hold=true; echo=false
-choices = [
-"``f'(x) = k^2 \\cdot f(x)``",
-"``f'(x) = -k^2 \\cdot f(x)``",
-"``f''(x) = k^2 \\cdot f(x)``",
-"``f''(x) = -k^2 \\cdot f(x)``"]
-answ = 4
-radioq(choices, answ)
-```
-
-###### Question
-
-Let $f(x) = e^{k\cdot x}$, $k > 1$. Which equation below is $f(x)$ a solution of?
-
-
-```julia; hold=true; echo=false
-choices = [
-"``f'(x) = k^2 \\cdot f(x)``",
-"``f'(x) = -k^2 \\cdot f(x)``",
-"``f''(x) = k^2 \\cdot f(x)``",
-"``f''(x) = -k^2 \\cdot f(x)``"]
-answ = 3
-radioq(choices, answ)
-```
-
-###### Question
-
-There are ``6`` trig functions. The derivatives of ``\sin(x)`` and ``\cos(x)`` should be memorized. The others can be derived, if not memorized, using the quotient rule or chain rule.
-
-What is ``[\tan(x)]'``? (Use ``\tan(x) = \sin(x)/\cos(x)``.)
-
-```julia; echo=false
-trig_choices = [
-"``\\sec^2(x)``",
-"``\\sec(x)\\tan(x)``",
-"``-\\csc^2(x)``",
-"``-\\csc(x)\\cot(x)``"
-]
-radioq(trig_choices, 1)
-```
-
-What is ``[\cot(x)]'``? (Use ``\cot(x) = \cos(x)/\sin(x)``.)
-
-```julia; echo=false
-radioq(trig_choices, 3)
-```
-
-What is ``[\sec(x)]'``? (Use ``\sec(x) = 1/\cos(x)``.)
-
-```julia; echo=false
-radioq(trig_choices, 2)
-```
-
-What is ``[\csc(x)]'``? (Use ``\csc(x) = 1/\sin(x)``.)
-
-```julia; echo=false
-radioq(trig_choices, 4)
-```
-
-###### Question
-
-Consider this picture of composition:
-
-```julia; hold=true; echo=false
-f(x) = sin(x)
-g(x) = exp(x)
-a,b = 0, 1.55
-
-xs = range(a, b, length=100)
-ys = g.(xs)
-us = range(extrema(ys)..., length=100)
-vs = f.(us)
-pf = plot(vs, us, ylim=extrema(ys), ymirror=true, legend=false)
-xs′ = range(0.5, 1.5, length=100)
-us′ = g.(xs′)
-vs′ = [f(g(1) + g'(1)*(x-1)) for x ∈ xs′]
-plot!(pf, vs′, us′)
-
-plot!(pf, [1, f(g(1)), f(g(1))], [g(1), g(1), 1])
-quiver!(pf, [.75, f(g(1))], [g(1), 2], quiver=([-.01, 0],[0, -0.1]))
-pg = plot(xs, ys, ylim=extrema(ys), legend=false)
-plot!(pg, [1,1,0],[1, g(1),g(1)])
-quiver!(pg, [1, 0.5], [2, g(1)], quiver=([0, -0.1],[0.1, 0]))
-plot!(tangent(g,1))
-l = @layout [a b]
-plot(pf, pg, layout=l)
-```
-
-The right graph is of ``g(x) = \exp(x)`` at ``x=1``, the left graph of ``f(x) = \sin(x)`` *rotated* ``90`` degrees counter-clockwise. Chasing the arrows shows graphically how ``f(g(1))`` can be computed. The nearby values ``f(g(1+h))`` are -- using the tangent line of ``g`` at ``x=1`` -- approximated by ``f(g(1) + g'(1)\cdot h)``, as shown in the graph segment on the left.
-
-Assuming the approximation gets better for ``h`` close to ``0``, as it visually does, the derivative at ``1`` for ``f(g(x))`` should be given by this limit:
-
-
-```math
-\begin{align*}
-\frac{d(f\circ g)}{dx}\mid_{x=1}
-&= \lim_{h\rightarrow 0} \frac{f(g(1) + g'(1)h)-f(g(1))}{h}\\
-&= \lim_{h\rightarrow 0} \frac{f(g(1) + g'(1)h)-f(g(1))}{g'(1)h} \cdot g'(1)\\
-&= f'(g(1)) \cdot g'(1).
-\end{align*}
-```
-
-What limit law, described below and assuming all limits exist, allows the last equals sign?
- -```julia; hold=true; echo=false -choices = [ -raw""" -The limit of a sum is the sum of the limits: -``\lim_{x\rightarrow c}(au(x)+bv(x)) = a\lim_{x\rightarrow c}u(x) + b\lim_{x\rightarrow c}v(x)`` -""", -raw""" -The limit of a product is the product of the limits: -``\lim_{x\rightarrow c}(u(x)\cdot v(x)) = \lim_{x\rightarrow c}u(x) \cdot \lim_{x\rightarrow c}v(x)`` -""", -raw""" -The limit of a composition (under assumptions on ``v``): -``\lim_{x \rightarrow c}u(v(x)) = \lim_{w \rightarrow \lim_{x \rightarrow c}v(x)} u(w)``. -""" -] -radioq(choices, 3, keep_order=true) -``` diff --git a/CwJ/derivatives/figures/extrema-ladder.png b/CwJ/derivatives/figures/extrema-ladder.png deleted file mode 100644 index 8654ffe..0000000 Binary files a/CwJ/derivatives/figures/extrema-ladder.png and /dev/null differ diff --git a/CwJ/derivatives/figures/extrema-rectangles.png b/CwJ/derivatives/figures/extrema-rectangles.png deleted file mode 100644 index 4951e04..0000000 Binary files a/CwJ/derivatives/figures/extrema-rectangles.png and /dev/null differ diff --git a/CwJ/derivatives/figures/extrema-ring-string.R b/CwJ/derivatives/figures/extrema-ring-string.R deleted file mode 100644 index 61ed1a5..0000000 --- a/CwJ/derivatives/figures/extrema-ring-string.R +++ /dev/null @@ -1,17 +0,0 @@ -## Used to make ring figure. Redo in Julia?? 
-plot.new() -plot.window(xlim=c(0,1), ylim=c(-5, 1.1)) -x <- seq(.1, .9, length=9) -y <- c(-4.46262,-4.46866, -4.47268, -4.47469, -4.47468, -4.47267, -4.46864, -4.4626 , -4.45454) -lines(c(0, x[3], 1), c(0, y[3], 1)) -points(c(0,1), c(0,1), pch=16, cex=2) -text(c(0,1), c(0,1), c("(0,0)", c("(a,b)")), pos=3) - -lines(c(0, x[3], x[3]), c(0, 0, y[3]), cex=2, col="gray") -lines(c(1, x[3], x[3]), c(1, 1, y[3]), cex=2, col="gray") -text(x[3]/2, 0, "x", pos=1) -text(x[3], y[3]/2, "|y|", pos=2) -text(x[3], (1 + y[3])/2, "b-y", pos=4) -text((x[3] + 1)/2, 1, "a-x", pos=1) - -text(x[3], y[3], "0", cex=4, col="gold") diff --git a/CwJ/derivatives/figures/extrema-ring-string.png b/CwJ/derivatives/figures/extrema-ring-string.png deleted file mode 100644 index bd422b1..0000000 Binary files a/CwJ/derivatives/figures/extrema-ring-string.png and /dev/null differ diff --git a/CwJ/derivatives/figures/fcarc-may2016-fig35-350.gif b/CwJ/derivatives/figures/fcarc-may2016-fig35-350.gif deleted file mode 100644 index e97c11a..0000000 Binary files a/CwJ/derivatives/figures/fcarc-may2016-fig35-350.gif and /dev/null differ diff --git a/CwJ/derivatives/figures/fcarc-may2016-fig40-300.gif b/CwJ/derivatives/figures/fcarc-may2016-fig40-300.gif deleted file mode 100644 index 5655e05..0000000 Binary files a/CwJ/derivatives/figures/fcarc-may2016-fig40-300.gif and /dev/null differ diff --git a/CwJ/derivatives/figures/fcarc-may2016-fig43-250.gif b/CwJ/derivatives/figures/fcarc-may2016-fig43-250.gif deleted file mode 100644 index 94dccb2..0000000 Binary files a/CwJ/derivatives/figures/fcarc-may2016-fig43-250.gif and /dev/null differ diff --git a/CwJ/derivatives/figures/lhopital-32.png b/CwJ/derivatives/figures/lhopital-32.png deleted file mode 100644 index 6561eab..0000000 Binary files a/CwJ/derivatives/figures/lhopital-32.png and /dev/null differ diff --git a/CwJ/derivatives/figures/long-shadow-noir.png b/CwJ/derivatives/figures/long-shadow-noir.png deleted file mode 100644 index ce4ac6d..0000000 Binary 
files a/CwJ/derivatives/figures/long-shadow-noir.png and /dev/null differ diff --git a/CwJ/derivatives/first_second_derivatives.jmd b/CwJ/derivatives/first_second_derivatives.jmd deleted file mode 100644 index 08ca32c..0000000 --- a/CwJ/derivatives/first_second_derivatives.jmd +++ /dev/null @@ -1,998 +0,0 @@ -# The first and second derivatives - -This section uses these add-on packages: - - -```julia -using CalculusWithJulia -using Plots -using SymPy -using Roots -``` - - -```julia; echo=false; results="hidden" -using CalculusWithJulia.WeaveSupport - -const frontmatter = ( - title = "The first and second derivatives", - description = "Calculus with Julia: The first and second derivatives", - tags = ["CalculusWithJulia", "derivatives", "the first and second derivatives"], -); - -nothing -``` - ----- - -This section explores properties of a function, ``f(x)``, that are described by properties of its first and second derivatives, ``f'(x)`` and ``f''(x)``. As part of the conversation two tests are discussed that characterize when a critical point is a relative maximum or minimum. (We know that any relative maximum or minimum occurs at a critical point, but it is not true that *any* critical point will be a relative maximum or minimum.) - - - -## Positive or increasing on an interval - -We start with some vocabulary: - -> A function $f$ is **positive** on an interval $I$ if for any $a$ in $I$ it must be that $f(a) > 0$. - -Of course, we define *negative* in a parallel manner. The intermediate value theorem says a continuous function can not change from positive to negative without crossing $0$. This is not the case for functions with jumps, of course. - -Next, - -> A function, $f$, is (strictly) **increasing** on an interval $I$ if for any $a < b$ it must be that $f(a) < f(b)$. - -The word strictly is related to the inclusion of the $<$ precluding the possibility of a function being flat over an interval that the $\leq$ inequality would allow. 
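The inequality in this definition can be spot-checked numerically by sampling. A minimal sketch -- the `is_increasing` helper is made up here for illustration and is not part of the `CalculusWithJulia` package; sampling a grid is evidence, not a proof:

```julia
# Sample f at n points of [a, b] and check that consecutive sampled
# values increase. A numeric spot-check of strict increase -- sampling
# can miss behavior between grid points, so this is not a proof.
function is_increasing(f, a, b; n=1000)
    xs = range(a, b, length=n)
    all(f(xs[i]) < f(xs[i+1]) for i in 1:n-1)
end

is_increasing(x -> x^3, -1, 1)  # true: x^3 is strictly increasing
is_increasing(sin, 0, pi)       # false: sin decreases after pi/2
```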
-
-A parallel definition with $a < b$ implying $f(a) > f(b)$ would be used for a *strictly decreasing* function.
-
-
-We can try and prove these properties for a function algebraically --
-we'll see both are related to the zeros of some function. However,
-before proceeding to that, it is usually helpful to get an idea of
-where the answer is using exploratory graphs.
-
-We will use a helper function, `plotif(f, g, a, b)`, that plots the function `f` over `[a,b]`, coloring it red when `g` is positive (and blue otherwise).
-Such a function is defined for us in the accompanying `CalculusWithJulia` package, which has previously been loaded.
-
-
-To see where a function is positive, we simply pass the function
-object in for *both* `f` and `g` above. For example, let's look at
-where $f(x) = \sin(x)$ is positive:
-
-```julia; hold=true;
-f(x) = sin(x)
-plotif(f, f, -2pi, 2pi)
-```
-
-
-Let's graph with `cos` in the masking spot and see what happens:
-
-```julia;
-plotif(sin, cos, -2pi, 2pi)
-```
-
-Maybe surprisingly, we see that the increasing parts of the sine curve are now
-highlighted. Of course, the cosine is the derivative of the sine
-function; as we now discuss, this is no coincidence.
-
-For the sequel, we will use `f'` notation to find numeric derivatives, with the notation being defined in the `CalculusWithJulia` package using the `ForwardDiff` package.
-
-
-
-
-## The relationship of the derivative and increasing
-
-The derivative, $f'(x)$, computes the slope of the tangent line to the
-graph of $f(x)$ at the point $(x,f(x))$. If the derivative is
-positive, the tangent line will have a positive slope. Clearly, if
-we see an increasing function and mentally layer on a tangent line, it will
-have a positive slope. Intuitively then, increasing functions and
-positive derivatives are related concepts. But there are some
-technicalities.
-
-Suppose $f(x)$ has a derivative on $I$.
Then
-
-> If $f'(x)$ is positive on an interval $I=(a,b)$, then $f(x)$ is strictly increasing on $I$.
-
-Meanwhile,
-
-> If a function $f(x)$ is increasing on $I$, then $f'(x) \geq 0$.
-
-The technicality is the matter of equality: in the second statement we
-can only conclude the derivative is non-negative, as we can't guarantee it is
-positive, even if we considered just strictly increasing functions.
-
-We can see by the example of $f(x) = x^3$ that strictly increasing
-functions can have a zero derivative at a point.
-
-The mean value theorem provides the reasoning behind the first statement: on
-$I$, the slope of any secant line between $d < e$ (both in $I$) is matched by the slope of some
-tangent line, which by assumption will always be positive. If the
-secant line slope is written as $(f(e) - f(d))/(e - d)$ with $d < e$,
-then it is clear that $f(e) - f(d) > 0$, or $d < e$ implies $f(d) < f(e)$.
-
-The second part follows from the secant line equation. The derivative
-can be written as a limit of secant-line slopes, each of which is
-positive. The limit of positive things can only be non-negative,
-though there is no guarantee the limit will be positive.
-
-So, to visualize where a function is increasing, we can just pass in
-the derivative as the masking function in our `plotif` function, as long as we are wary about places with $0$ derivative (flat spots).
-
-For example, here, with a more complicated function, the intervals where the function is
-increasing are highlighted by passing in the function's derivative to `plotif`:
-
-```julia; hold=true;
-f(x) = sin(pi*x) * (x^3 - 4x^2 + 2)
-plotif(f, f', -2, 2)
-```
-
-### First derivative test
-
-
-When a function changes from increasing to decreasing, or decreasing to increasing, it will have a peak or a valley. More formally, such points are relative extrema.
When discussing the mean value theorem, we defined *relative
-extrema*:
-
-> * The function $f(x)$ has a *relative maximum* at $c$ if the value $f(c)$ is an *absolute maximum* for some *open* interval containing $c$.
-> * Similarly, ``f(x)`` has a *relative minimum* at ``c`` if the value ``f(c)`` is an absolute minimum for *some* open interval about ``c``.
-
-
-We know since [Fermat](http://tinyurl.com/nfgz8fz) that:
-
-> Relative maxima and minima *must* occur at *critical* points.
-
-Fermat says that *critical points* -- where the function is defined, but its derivative is either ``0`` or undefined -- are *interesting* points, however:
-
-> A critical point need not indicate a relative maximum or minimum.
-
-Again, $f(x)=x^3$ provides the example at $x=0$. This is a critical point, but clearly not a
-relative maximum or minimum - it is just a slight pause for a
-strictly increasing function.
-
-This leaves the question:
-
-> When will a critical point correspond to a relative maximum or minimum?
-
-This question can be answered by considering the first derivative.
-
-> *The first derivative test*: If $c$ is a critical point for $f(x)$ and
-> *if* $f'(x)$ changes sign at $x=c$, then $f(c)$ will be either a
-> relative maximum or a relative minimum.
-> * It will be a relative maximum if the derivative changes sign from $+$ to $-$.
-> * It will be a relative minimum if the derivative changes sign from $-$ to $+$.
-> * If $f'(x)$ does not change sign at $c$, then $f(c)$ is *not* a relative maximum or minimum.
-
-The classification part should be clear: e.g., if the derivative is positive then
-negative, the function $f$ will increase to $(c,f(c))$ then decrease
-from $(c,f(c))$ -- so ``f`` will have a local maximum at ``c``.
-
-Our definition of critical point *assumes* $f(c)$ exists, as $c$ is in
-the domain of $f$. With this assumption, vertical asymptotes are
-avoided. However, it need not be that $f'(c)$ exists.
The absolute value function at $x=0$ provides an example: this point is a critical
-point where the derivative changes sign, but ``f'(x)`` is not defined at exactly
-$x=0$. Regardless, it is guaranteed that $f(c)$ will be a relative
-minimum by the first derivative test.
-
-##### Example
-
-Consider the function $f(x) = e^{-\lvert x\rvert} \cos(\pi x)$ over $[-3,3]$:
-
-```julia;
-𝐟(x) = exp(-abs(x)) * cos(pi * x)
-plotif(𝐟, 𝐟', -3, 3)
-```
-
-We can see the first derivative test in action: at the peaks and
-valleys -- the relative extrema -- the color changes. This is because ``f'`` is changing sign as the function
-changes from increasing to decreasing or vice versa.
-
-This function has a critical point at ``0``, as can be seen. It corresponds to a point where the derivative does not exist. It is still identified through `find_zeros`, which picks up zeros and, in the case of discontinuous functions like `f'`, zero crossings:
-
-```julia
-find_zeros(𝐟', -3, 3)
-```
-
-##### Example
-
-Find all the relative maxima and minima of the function $f(x) =
-\sin(\pi \cdot x) \cdot (x^3 - 4x^2 + 2)$ over the interval $[-2, 2]$.
-
-We will do so numerically. For
-this task we first need to gather the critical points. As each of the
-pieces of $f$ is everywhere differentiable and no quotients are
-involved, the function $f$ will be everywhere differentiable. As such,
-only zeros of $f'(x)$ can be critical points. We find these with
-
-```julia;
-𝒇(x) = sin(pi*x) * (x^3 - 4x^2 + 2)
-𝒇cps = find_zeros(𝒇', -2, 2)
-```
-
-We should be careful though, as `find_zeros` may miss zeros that are not
-simple or are too close together. A critical point will correspond to a
-relative extremum if the derivative crosses the axis, so these can not be
-"pauses."
As this is exactly the case we are screening for, we double
-check that all the critical points are accounted for by graphing the
-derivative:
-
-```julia;
-plot(𝒇', -2, 2, legend=false)
-plot!(zero)
-scatter!(𝒇cps, 0*𝒇cps)
-```
-
-We see the six zeros as stored in `𝒇cps` and note that at each the
-derivative clearly crosses the $x$ axis.
-
-From this last graph of the derivative we can also characterize the
-graph of $f$: The left-most critical point coincides with a relative minimum
-of $f$, as the derivative changes sign from negative to
-positive. The critical points then alternate relative maximum,
-relative minimum, relative maximum, relative minimum, and finally relative maximum.
-
-##### Example
-
-Consider the function $g(x) = \sqrt{\lvert x^2 - 1\rvert}$. Find the critical
-points and characterize them as relative extrema or not.
-
-We will apply the same approach, but need to get a handle on how large
-the values can be. The function is a composition of three
-functions. We should expect that the only critical points will occur
-when the interior polynomial, $x^2-1$, has values of interest, which is
-around the interval $(-1, 1)$. So we look to the slightly wider interval $[-2, 2]$:
-
-```julia;
-g(x) = sqrt(abs(x^2 - 1))
-gcps = find_zeros(g', -2, 2)
-```
-
-We see the three values $-1$, $0$, $1$ that correspond to the two
-zeros and the relative minimum of $x^2 - 1$. We could graph things,
-but instead we characterize these values using a sign chart. A
-piecewise continuous function can only change sign when it crosses $0$ or jumps over ``0``. The
-derivative will be continuous, except possibly at the three values
-above, so is piecewise continuous.
-
-A sign chart picks convenient values between crossing points to test if the function is positive or negative over those intervals. When computing by hand, these would ideally be values for which the function is easily computed.
On the computer, this isn't a concern; below, the midpoint is chosen:
-
-```julia;
-pts = sort(union(-2, gcps, 2)) # this includes the endpoints (a, b) and the critical points
-test_pts = pts[1:end-1] + diff(pts)/2 # midpoints of intervals between pts
-[test_pts sign.(g'.(test_pts))]
-```
-
-Such values are often summarized graphically on a number line using a *sign chart*:
-
-```julia; eval=false
-
-   -  ∞  +  0  -  ∞  +   g'
-<---- -1 ----- 0 ----- 1 ---->
-```
-
-(The values where the function is ``0`` or could jump over ``0`` are shown on the number line, and the sign between these points is indicated. So the first minus sign shows ``g'(x)`` is *negative* on ``(-\infty, -1)``, the second minus sign shows ``g'(x)`` is negative on ``(0,1)``.)
-
-
-Reading this we have:
-
-- the derivative changes sign from negative to positive at $x=-1$, so $g(x)$ will have a relative minimum.
-
-- the derivative changes sign from positive to negative at $x=0$, so $g(x)$ will have a relative maximum.
-
-- the derivative changes sign from negative to positive at $x=1$, so $g(x)$ will have a relative minimum.
-
-In the `CalculusWithJulia` package there is a `sign_chart` function that will do such work for us, though with a different display:
-
-```julia
-sign_chart(g', -2, 2)
-```
-
-(This function numerically identifies ``x``-values for the specified function which are zeros, infinities, or points where the function jumps over ``0``. It then shows the resulting sign pattern of the function from left to right.)
-
-We did this all without graphs. But, let's look at the graph of the derivative:
-
-```julia;
-plot(g', -2, 2)
-```
-
-We see asymptotes at $x=-1$ and $x=1$! These aren't zeroes of $g'(x)$,
-but rather where $g'(x)$ does not exist. The conclusion is correct -
-each of $-1$, $0$ and $1$ are critical points with the identified characterization - but not for the
-reason that they are all zeros.
-
-```julia;
-plot(g, -2, 2)
-```
-
-
-Finally, why does `find_zeros` find these values that are not zeros of
-$g'(x)$? As discussed briefly above, it uses the bisection algorithm
-on bracketing intervals to find zeros which are guaranteed by the
-intermediate value theorem, but when applied to discontinuous functions, as `g'` is, it will also identify values where the function jumps over ``0``.
-
-
-##### Example
-
-Consider the function $f(x) = \sin(x) - x$. Characterize the critical points.
-
-We will work symbolically for this example.
-
-```julia;
-@syms x
-fx = sin(x) - x
-fp = diff(fx, x)
-solve(fp)
-```
-
-We get values of $0$ and $2\pi$. Let's look at the derivative at these points:
-
-At $x=0$, the signs to the left and to the right are found by
-
-```julia;
-fp(-pi/2), fp(pi/2)
-```
-
-Both are negative. The derivative does not change sign at $0$, so the critical point is neither a relative minimum nor a relative maximum.
-
-What about at $2\pi$? We do something similar:
-
-```julia;
-fp(2pi - pi/2), fp(2pi + pi/2)
-```
-
-Again, both are negative. The function $f(x)$ is just decreasing near
-$2\pi$, so again the critical point is neither a relative minimum nor a relative maximum.
-
-A graph verifies this:
-
-```julia;
-plot(fx, -3pi, 3pi)
-```
-
-We see that at $0$ and $2\pi$ there are "pauses" as the function
-decreases. We should also see that this pattern repeats. The critical
-points found by `solve` are only those within a certain domain. Any
-value that satisfies $\cos(x) - 1 = 0$ will be a critical point, and
-there are infinitely many of these of the form $n \cdot 2\pi$ for $n$
-an integer.
-
-
-As a comment, the `solveset` function, which is replacing `solve`,
-returns the entire collection of zeros:
-
-```julia;
-solveset(fp)
-```
-
-----
-
-Of course, `sign_chart` also does this, only numerically.
We just need to pick an interval wide enough to contain ``[0,2\pi]``:
-
-```julia
-sign_chart((x -> sin(x)-x)', -3pi, 3pi)
-```
-
-
-
-##### Example
-
-Suppose you know $f'(x) = (x-1)\cdot(x-2)\cdot (x-3) = x^3 - 6x^2 +
-11x - 6$ and $g'(x) = (x-1)\cdot(x-2)^2\cdot(x-3)^3 = x^6 -14x^5
-+80x^4-238x^3+387x^2-324x+108$.
-
-How would the graphs of $f(x)$ and $g(x)$ differ, as they share identical critical points?
-
-The graph of $f(x)$ - a function we do not have a formula for - can have its critical points characterized by the first derivative test. As the derivative changes sign at each, all critical points correspond to relative extrema. The sign pattern is negative/positive/negative/positive, so we have from left to right a relative minimum, a relative maximum, and then a relative minimum. This is consistent with a ``4``th degree polynomial with ``3`` relative extrema.
-
-For the graph of $g(x)$ we can apply the same analysis. Thinking for a
-moment, we see as the factor $(x-2)^2$ comes as a power of $2$, the
-derivative of $g(x)$ will not change sign at $x=2$, so there is no
-relative extreme value there. However, at $x=3$ the factor has an odd
-power, so the derivative will change sign at $x=3$. So, as $g'(x)$ is
-positive for large *negative* values, there will be a relative maximum
-at $x=1$ and, as $g'(x)$ is positive for large *positive* values, a
-relative minimum at $x=3$.
-
-This is consistent with a $7$th degree polynomial with positive leading coefficient. It is intuitive that since $g'(x)$ is a $6$th degree polynomial, $g(x)$ will be a $7$th degree one, as the power rule applied to a polynomial results in a polynomial of lesser degree by one.
-
-
-Here is a simple schematic that illustrates the above considerations.
-
-```julia; eval=false
-f'     -   0   +   0   -   0   +    f'-sign
-         ↘      ↗      ↘      ↗     f-direction
-            ∪      ∩      ∪         f-shape
-
-g'     +   0   -   0   -   0   +    g'-sign
-         ↗      ↘      ↘      ↗     g-direction
-            ∩      ~      ∪         g-shape
-<------ 1 ----- 2 ----- 3 ------>
-```
-
-## Concavity
-
-Consider the function $f(x) = x^2$. Over this function we draw some
-secant lines for a few pairs of $x$ values:
-
-```julia; echo=false
-let
-    f(x) = x^2
-    seca(f,a,b) = x -> f(a) + (f(b) - f(a)) / (b-a) * (x-a)
-    p = plot(f, -2, 3, legend=false, linewidth=5, xlim=(-2,3), ylim=(-2, 9))
-    plot!(p,seca(f, -1, 2))
-    a,b = -1, 2; xs = range(a, stop=b, length=50)
-    plot!(xs, seca(f, a, b).(xs), linewidth=5)
-    plot!(p,seca(f, 0, 3/2))
-    a,b = 0, 3/2; xs = range(a, stop=b, length=50)
-    plot!(xs, seca(f, a, b).(xs), linewidth=5)
-    p
-end
-```
-
-The graph attempts to illustrate that for this function the secant
-line between any two points $a < b$ will lie above the graph over $[a,b]$.
-
-This is a special property not shared by all functions. Let $I$ be an open interval.
-
-> **Concave up**: A function $f(x)$ is concave up on $I$ if for any $a < b$ in $I$, the secant line between $a$ and $b$ lies above the graph of $f(x)$ over $[a,b]$.
-
-A similar definition exists for *concave down*, where the secant lines
-lie below the graph. Notationally, concave up says for any $x$ in $[a,b]$:
-
-```math
-f(a) + \frac{f(b) - f(a)}{b-a} \cdot (x-a) \geq f(x) \quad\text{ (concave up) }
-```
-
-Replacing
-$\geq$ with $\leq$ defines *concave down*, and using either $>$ or $<$
-will add the prefix "strictly." These definitions are useful for a
-general definition of
-[convex functions](https://en.wikipedia.org/wiki/Convex_function).
-
-We won't work with these definitions in this section; rather, we will characterize
-concavity for functions which have either a first or second
-derivative:
-
-> * If $f'(x)$ exists and is *increasing* on $(a,b)$, then $f(x)$ is concave up on $(a,b)$.
-> * If ``f'(x)`` is *decreasing* on ``(a,b)``, then ``f(x)`` is concave *down*.
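The secant-line inequality defining concave up can be spot-checked numerically. A minimal sketch for ``f(x) = x^2`` -- a grid check for illustration, not a proof:

```julia
# For a concave-up function the chord over [a, b] sits on or above the
# graph. Check this on a grid of points for f(x) = x^2 with [a, b] = [-1, 2].
f(x) = x^2
a, b = -1, 2
secant(x) = f(a) + (f(b) - f(a)) / (b - a) * (x - a)
all(secant(x) >= f(x) for x in range(a, b, length=100))  # true
```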
A proof of this makes use of the same trick used to establish the mean
value theorem from Rolle's theorem. Assume ``f'`` is increasing and let
-$g(x) = f(x) - (f(a) + M \cdot (x-a))$, where $M$ is the slope of
-the secant line between $a$ and $b$. By construction $g(a) = g(b) =
-0$. If $f'(x)$ is increasing, then so is $g'(x) = f'(x) - M$. By its
-definition above, showing ``f`` is concave up is the same as showing $g(x) \leq
-0$. Suppose to the contrary that there is a value where $g(x) > 0$ in
-$[a,b]$. We show this can't be. Assuming $g'(x)$ always exists, after
-some work, Rolle's theorem will ensure there is a value where $g'(c) =
-0$ and $(c,g(c))$ is a relative maximum, and as we know there is at
-least one positive value, it must be $g(c) > 0$. The first derivative
-test then ensures that $g(x)$ will increase to the left of $c$ and
-decrease to the right of $c$, since $c$ is at a critical point and not
-an endpoint. That is, $g'(x)$ changes sign from positive to negative at
-$c$. But this can't happen, as $g'(x)$ is assumed to be
-increasing on the interval.
-
-
-The relationship between increasing functions and their derivatives --
-if $f'(x) > 0 $ on $I$, then ``f`` is increasing on $I$ --
-gives this second characterization of concavity when the second
-derivative exists:
-
-> * If $f''(x)$ exists and is positive on $I$, then $f(x)$ is concave up on $I$.
-> * If $f''(x)$ exists and is negative on $I$, then $f(x)$ is concave down on $I$.
-
-This follows, as we can think of $f''(x)$ as just the first derivative
-of the function $f'(x)$, so the assumption will force $f'(x)$ to exist and be
-increasing, and hence $f(x)$ to be concave up.
-
-##### Example
-
-Let's look at the function $x^2 \cdot e^{-x}$ for positive $x$.
A quick graph shows the function is concave up, then down, then up in
-the region plotted:
-
-```julia;
-h(x) = x^2 * exp(-x)
-plotif(h, h'', 0, 8)
-```
-
-From the graph, we would expect that the second derivative - which is continuous - would have two zeros on $[0,8]$:
-
-```julia;
-ips = find_zeros(h'', 0, 8)
-```
-
-As well, between the zeros we should have the sign pattern `+`, `-`, and `+`, as we verify:
-
-```julia;
-sign_chart(h'', 0, 8)
-```
-
-### Second derivative test
-
-Concave up functions are "opening" up, and often clearly $U$-shaped, though that is not necessary. At a
-relative minimum, where there is a ``U``-shape, the graph will be concave up; conversely,
-at a relative maximum, where the graph has a downward ``\cap``-shape, the function will be concave down. This observation becomes:
-
-> The **second derivative test**: If $c$ is a critical point of $f(x)$
-> with $f''(c)$ existing in a neighborhood of $c$, then
-> * The value $f(c)$ will be a relative minimum if $f''(c) > 0$,
-> * The value $f(c)$ will be a relative maximum if $f''(c) < 0$, and
-> * *if* ``f''(c) = 0`` the test is *inconclusive*.
-
-
-If $f''(c)$ is positive in an interval about $c$, then the function is
-concave up near $x=c$. In turn, concave up implies the derivative is increasing,
-so it must go from negative to positive at the critical point, yielding a relative minimum.
-
-The second derivative test is **inconclusive** when $f''(c)=0$. No such
-general statement exists, as there isn't enough information. For
-example, the function $f(x) = x^3$ has $0$ as a critical point,
-$f''(0)=0$, and the value does not correspond to a relative maximum or minimum. On the
-other hand, $f(x)=x^4$ has $0$ as a critical point with $f''(0)=0$, and $0$ is a
-relative minimum.
-
-##### Example
-
-Use the second derivative test to characterize the critical points of $j(x) = x^5 - 2x^4 + x^3$.
-
-```julia;
-j(x) = x^5 - 2x^4 + x^3
-jcps = find_zeros(j', -3, 3)
-```
-
-We can check the sign of the second derivative for each critical point:
-
-```julia;
-[jcps j''.(jcps)]
-```
-
-That $j''(0.6) < 0$ implies that at $0.6$, $j(x)$ will have a relative
-maximum. As $j''(1) > 0$, the second derivative test says at $x=1$
-there will be a relative minimum. That $j''(0) = 0$ says only
-that there **may** be a relative maximum or minimum at $x=0$, as the second
-derivative test does not speak to this situation. (This last check, requiring a function evaluation to be `0`, is susceptible to floating point errors, so isn't very robust as a general tool.)
-
-This should be consistent with
-this graph, where $-0.25$ and $1.25$ are chosen to capture the zero at
-$0$ and the two relative extrema:
-
-```julia;
-plotif(j, j'', -0.25, 1.25)
-```
-
-From the graph we see that $0$ **is not** a relative maximum or minimum. We could have seen this numerically by checking the first derivative test, and noting there is no sign change:
-
-```julia;
-sign_chart(j', -3, 3)
-```
-
-##### Example
-
-One way to visualize the second derivative test is to *locally* overlay on a critical point a parabola. For example, consider ``f(x) = \sin(x) + \sin(2x) + \sin(3x)`` over ``[0,2\pi]``. It has ``6`` critical points over ``[0,2\pi]``. In this graphic, we *locally* layer on ``6`` parabolas:
-
-```julia; hold=true;
-f(x) = sin(x) + sin(2x) + sin(3x)
-p = plot(f, 0, 2pi, legend=false, color=:blue, linewidth=3)
-cps = find_zeros(f', (0, 2pi))
-Δ = 0.5
-for c in cps
-    parabola(x) = f(c) + (f''(c)/2) * (x-c)^2
-    plot!(parabola, c - Δ, c + Δ, color=:red, linewidth=5, alpha=0.6)
-end
-p
-```
-
-
-The graphic shows that for this function near the relative extrema the parabolas *approximate* the function well, so that the relative extrema are characterized by the relative extrema of the parabolas.

At each critical point ``c``, the parabolas have the form

```math
f(c) + \frac{f''(c)}{2}(x-c)^2.
```

The ``2`` is a mystery to be answered in the section on [Taylor series](../taylor_series_polynomials.html); the focus here is on the *sign* of ``f''(c)``:

* if ``f''(c) > 0`` then the approximating parabola opens upward and the critical point is a point of relative minimum for ``f``,
* if ``f''(c) < 0`` then the approximating parabola opens downward and the critical point is a point of relative maximum for ``f``, and
* were ``f''(c) = 0`` then the approximating parabola is just a line - the tangent line at a critical point - and is non-informative about extrema.

That is, the parabola picture is just the second derivative test in a different light.

### Inflection points

An inflection point is a value where the *second* derivative of $f$
changes sign. At an inflection point the derivative will change from
increasing to decreasing (or vice versa) and the function will change
from concave up to down (or vice versa).

We can use the `find_zeros` function to identify potential inflection
points by passing in the second derivative function. For example,
consider the bell-shaped function

```math
k(x) = e^{-x^2/2}.
```

A graph suggests a relative maximum at $x=0$, a horizontal asymptote of $y=0$,
and two inflection points:

```julia;
k(x) = exp(-x^2/2)
plotif(k, k'', -3, 3)
```

The inflection points can be found directly, if desired, or numerically with:

```julia;
find_zeros(k'', -3, 3)
```

(The `find_zeros` function may return points which are not inflection points. It primarily returns points where $k''(x)$ changes sign, but *may* also find points where $k''(x)$ is $0$ yet does not change sign at $x$.)



##### Example

A car travels from a stop for 1 mile in 2 minutes.
A graph of its
position as a function of time might look like any of these graphs:

```julia; echo=false
let
    v(t) = 30/60*t
    w(t) = t < 1/2 ? 0.0 : (t > 3/2 ? 1.0 : (t-1/2))
    y(t) = 1 / (1 + exp(-t))
    y1(t) = y(2(t-1))
    y2(t) = y1(t) - y1(0)
    y3(t) = 1/y2(2) * y2(t)
    plot(v, 0, 2, label="f1")
    plot!(w, label="f2")
    plot!(y3, label="f3")
end
```

All three graphs have the same *average* velocity, which is just
$1/2$ mile per minute (``30`` miles an hour). But the instantaneous
velocity - which is given by the derivative of the position function -
varies.

The graph `f1` has constant velocity, so the position is a straight
line with slope equal to the average velocity. The graph `f2` is similar, though for the first and
last 30 seconds the car does not move, so it must move faster during the
time it does move. A more realistic graph would be `f3`. The position
increases continuously, as do the others, but the velocity changes
more gradually. The initial velocity is less than the average, but
eventually gets to be more than the average, then the velocity starts to
increase less. At no point does the position stop increasing for `f3`,
the way it does for `f2` after a minute and a half.

The rate of change of the velocity is the acceleration. For `f1` this
is zero; for `f2` it is zero as well - when it is defined. However,
for `f3` we see the increase in velocity is positive in the first
minute, but negative in the second minute. This fact relates to the
concavity of the graph. As acceleration is the derivative of velocity,
it is the second derivative of position - the graph we see. Where the
acceleration is *positive*, the position graph will be concave *up*;
where the acceleration is *negative* the graph will be concave
*down*. The point $t=1$ is an inflection point, and
would be felt by most riders.
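
The smooth position curve `f3` above is built from a rescaled logistic function. As a quick numerical check - a sketch only, using a central-difference approximation to the second derivative rather than the `''` notation used elsewhere in this section - we can confirm the acceleration is positive during the first minute and negative during the second:

```julia
# reconstruct the position function `f3` from the plotting code above
y(t)  = 1 / (1 + exp(-t))
y1(t) = y(2*(t - 1))
y2(t) = y1(t) - y1(0)
y3(t) = y2(t) / y2(2)          # scaled so y3(0) == 0 and y3(2) == 1

# central-difference approximation to the second derivative (acceleration)
accel(f, t; h=1e-4) = (f(t + h) - 2f(t) + f(t - h)) / h^2

accel(y3, 0.5) > 0, accel(y3, 1.5) < 0   # concave up, then concave down
```

The sign change between these two times is the inflection point at ``t=1`` discussed above.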



## Questions


###### Question

Consider this graph:

```julia;
plot(airyai, -5, 0) # airyai in `SpecialFunctions` loaded with `CalculusWithJulia`
```

On what intervals (roughly) is the function positive?

```julia; hold=true; echo=false
choices=[
"``(-3.2,-1)``",
"``(-5, -4.2)``",
"``(-5, -4.2)`` and ``(-2.5, 0)``",
"``(-4.2, -2.5)``"]
answ = 3
radioq(choices, answ)
```


###### Question

Consider this graph:

```julia; hold=true; echo=false
import SpecialFunctions: besselj
p = plot(x->besselj(x, 1), -5,-3)
```

On what intervals (roughly) is the function negative?

```julia; hold=true; echo=false
choices=[
"``(-5.0, -4.0)``",
"``(-25.0, 0.0)``",
"``(-5.0, -4.0)`` and ``(-4, -3)``",
"``(-4.0, -3.0)``"]
answ = 4
radioq(choices, answ)
```

###### Question

Consider this graph:

```julia; hold=true; echo=false
plot(x->besselj(x, 21), -5,-3)
```

On what interval(s) is this function increasing?


```julia; hold=true; echo=false
choices=[
"``(-5.0, -3.8)``",
"``(-3.8, -3.0)``",
"``(-4.7, -3.0)``",
"``(-0.17, 0.17)``"
]
answ = 3
radioq(choices, answ)
```

###### Question


Consider this graph:

```julia; hold=true; echo=false
p = plot(x -> 1 / (1+x^2), -3, 3)
```

On what interval(s) is this function concave up?


```julia; hold=true; echo=false
choices=[
"``(0.1, 1.0)``",
"``(-3.0, 3.0)``",
"``(-0.6, 0.6)``",
" ``(-3.0, -0.6)`` and ``(0.6, 3.0)``"
]
answ = 4
radioq(choices, answ)
```


###### Question

If it is known that:

* A function $f(x)$ has critical points at $x=-1, 0, 1$

* at $-2$ and $-1/2$ the values are: $f'(-2) = 1$ and $f'(-1/2) = -1$.

What can be concluded?
- -```julia; hold=true; echo=false -choices = [ - "Nothing", - "That the critical point at ``-1`` is a relative maximum", - "That the critical point at ``-1`` is a relative minimum", - "That the critical point at ``0`` is a relative maximum", - "That the critical point at ``0`` is a relative minimum" -] -answ = 2 -radioq(choices, answ, keep_order=true) -``` - -###### Question - -Mystery function $f(x)$ has $f'(2) = 0$ and $f''(0) = 2$. What is the *most* you can say about $x=2$? - -```julia; hold=true; echo=false -choices = [ -" ``f(x)`` is continuous at ``2``", -" ``f(x)`` is continuous and differentiable at ``2``", -" ``f(x)`` is continuous and differentiable at ``2`` and has a critical point", -" ``f(x)`` is continuous and differentiable at ``2`` and has a critical point that is a relative minimum by the second derivative test" -] -answ = 3 -radioq(choices, answ, keep_order=true) -``` - - -###### Question - -Find the smallest critical point of $f(x) = x^3 e^{-x}$. - -```julia; echo=false -let - f(x)= x^3*exp(-x) - cps = find_zeros(D(f), -5, 10) - val = minimum(cps) - numericq(val) -end -``` - -###### Question - -How many critical points does $f(x) = x^5 - x + 1$ have? - -```julia; echo=false -let - f(x) = x^5 - x + 1 - cps = find_zeros(D(f), -3, 3) - val = length(cps) - numericq(val) -end -``` - -###### Question - -How many inflection points does $f(x) = x^5 - x + 1$ have? - - -```julia; echo=false -let - f(x) = x^5 - x + 1 - cps = find_zeros(D(f,2), -3, 3) - val = length(cps) - numericq(val) -end -``` - -###### Question - -At $c$, $f'(c) = 0$ and $f''(c) = 1 + c^2$. Is $(c,f(c))$ a relative maximum? ($f$ is a "nice" function.) - -```julia; hold=true; echo=false -choices = [ -"No, it is a relative minimum", -"No, the second derivative test is possibly inconclusive", -"Yes" -] -answ = 1 -radioq(choices, answ) -``` - - -###### Question - -At $c$, $f'(c) = 0$ and $f''(c) = c^2$. Is $(c,f(c))$ a relative minimum? ($f$ is a "nice" function.) 

```julia; hold=true; echo=false
choices = [
"No, it is a relative maximum",
"No, the second derivative test is possibly inconclusive if ``c=0``, but otherwise yes",
"Yes"
]
answ = 2
radioq(choices, answ)
```

###### Question

```julia; echo=false
let
    f(x) = exp(-x) * sin(pi*x)
    plot(D(f), 0, 3)
end
```

The graph shows $f'(x)$. Is it possible that $f(x) = e^{-x} \sin(\pi x)$?

```julia; hold=true; echo=false
yesnoq(true)
```

(Plot ``f(x)`` and compare features like critical points and increasing/decreasing behavior to that indicated by ``f'`` through the graph.)


###### Question

```julia; hold=true; echo=false
f(x) = x^4 - 3x^3 - 2x + 4
plot(D(f,2), -2, 4)
```


The graph shows $f'(x)$. Is it possible that $f(x) = x^4 - 3x^3 - 2x + 4$?

```julia; hold=true; echo=false
yesnoq("no")
```



###### Question

```julia; hold=true; echo=false
f(x) = (1+x)^(-2)
plot(D(f,2), 0,2)
```


The graph shows $f''(x)$. Is it possible that $f(x) = (1+x)^{-2}$?

```julia; hold=true; echo=false
yesnoq("yes")
```

###### Question

```julia; hold=true; echo=false
f_p(x) = (x-1)*(x-2)^2*(x-3)^2
plot(f_p, 0.75, 3.5)
```

This plot shows the graph of $f'(x)$. What is true about the critical points and their characterization?

```julia; hold=true; echo=false
choices = [
"The critical points are at ``x=1`` (a relative minimum), ``x=2`` (not a relative extrema), and ``x=3`` (not a relative extrema).",
"The critical points are at ``x=1`` (a relative maximum), ``x=2`` (not a relative extrema), and ``x=3`` (not a relative extrema).",
"The critical points are at ``x=1`` (a relative minimum), ``x=2`` (not a relative extrema), and ``x=3`` (a relative minimum).",
"The critical points are at ``x=1`` (a relative minimum), ``x=2`` (a relative minimum), and ``x=3`` (a relative minimum).",
]
answ=1
radioq(choices, answ)
```

##### Question

You know $f''(x) = (x-1)^3$. What do you know about $f(x)$?

```julia; hold=true; echo=false
choices = [
"The function is concave down over ``(-\\infty, 1)`` and concave up over ``(1, \\infty)``",
"The function is decreasing over ``(-\\infty, 1)`` and increasing over ``(1, \\infty)``",
"The function is negative over ``(-\\infty, 1)`` and positive over ``(1, \\infty)``",
]
answ = 1
radioq(choices, answ)
```

##### Question

While driving we accelerate to get through a light before it turns red. However, at time $t_0$ a car cuts in front of us and we are forced to brake. If $s(t)$ represents position, what is $t_0$?

```julia; hold=true; echo=false
choices = ["A zero of the function",
"A critical point for the function",
"An inflection point for the function"]
answ = 3
radioq(choices, answ, keep_order=true)
```

###### Question

The [investopedia](https://www.investopedia.com/terms/i/inflectionpoint.asp) website describes:

"An **inflection point** is an event that results in a significant change in the progress of a company, industry, sector, economy, or geopolitical situation and can be considered a turning point after which a dramatic change, with either positive or negative results, is expected to result."

This accurately summarizes how the term is used outside of math books. Does it also describe how the term is used *inside* math books?

```julia; hold=true; echo=false
choices = ["Yes. Same words, same meaning",
    """No, but it is close. An inflection point is when the *acceleration* changes from positive to negative, so if "results" are about how a company's rate of change is changing, then it is in the ballpark."""]
radioq(choices, 2)
```

###### Question

The function ``f(x) = x^3 + x^4`` has a critical point at ``0`` and a second derivative of ``0`` at ``x=0``.
Without resorting to the first derivative test, and only considering that *near* ``x=0`` the function ``f(x)`` is essentially ``x^3``, as ``f(x) = x^3(1+x)``, what can you say about whether the critical point is a relative extrema?

```julia; hold=true; echo=false
choices = ["As ``x^3`` has no extrema at ``x=0``, neither will ``f``",
    "As ``x^4`` is of higher degree than ``x^3``, ``f`` will be ``U``-shaped, as ``x^4`` is."]
radioq(choices, 1)
```
diff --git a/CwJ/derivatives/implicit_differentiation.jmd b/CwJ/derivatives/implicit_differentiation.jmd
deleted file mode 100644
index 1d2df38..0000000
--- a/CwJ/derivatives/implicit_differentiation.jmd
+++ /dev/null
@@ -1,1014 +0,0 @@
# Implicit Differentiation

This section uses these add-on packages:

```julia
using CalculusWithJulia
using Plots
using ImplicitPlots
using Roots
using SymPy
```



```julia; echo=false; results="hidden"
using CalculusWithJulia.WeaveSupport

const frontmatter = (
    title = "Implicit Differentiation",
    description = "Calculus with Julia: Implicit Differentiation",
    tags = ["CalculusWithJulia", "derivatives", "implicit differentiation"],
);
nothing
```

----

## Graphs of equations


An **equation** in ``y`` and ``x`` is an algebraic expression involving an
equality with two (or more) variables. An example might be ``x^2 + y^2 = 1``.

The **solutions** to an equation in the variables ``x`` and ``y`` are all
points ``(x,y)`` which satisfy the equation.


The **graph** of an equation is just the set of solutions to the equation represented in the Cartesian plane.

With this definition, the graph of a function ``f(x)`` is just the graph of the equation ``y = f(x)``.
In general, graphing an equation is more complicated than graphing a
function. For a function, we know for a given value of ``x`` what
the corresponding value of ``f(x)`` is through evaluation of the function.
For
equations, we may have ``0``, ``1``, or more ``y`` values for a given ``x``, and,
even more problematic, we may have no rule to find these values.


There are a few options for plotting equations in `Julia`. We will use `ImplicitPlots` in this section, but note both `ImplicitEquations` and `IntervalConstraintProgramming` offer alternatives that are a bit more flexible.


To plot an implicit equation using `ImplicitPlots` requires expressing the relationship in terms of a function, and then plotting the equation `f(x,y) = 0`. In practice this simply requires all the terms to be moved to one side of an equals sign.

To plot the circle of radius ``2``, or the equation ``x^2 + y^2 = 2^2``, we would move all terms to one side, ``x^2 + y^2 - 2^2 = 0``, and then express the left-hand side through a function:

```julia;
f(x,y) = x^2 + y^2 - 2^2
```


This function is then passed to the `implicit_plot` function, which works with `Plots` to render the graphic:

```julia;
implicit_plot(f)
```


!!! note
    The `f` is a function of *two* variables, used here to express one side of an equation. `Julia` makes this easy to do - just make sure two variables are in the signature of `f` when it is defined. Using functions like this, we can express our equation in the form ``f(x,y) = c`` or, more generally, as ``f(x,y) = g(x,y)``. The latter can be expressed as ``h(x,y) = f(x,y) - g(x,y) = 0``. That is, only the form ``f(x,y)=0`` is needed to represent an equation.

!!! note
    There are two different styles in `Julia` to add simple plot recipes. `ImplicitPlots` adds a new plotting function (`implicit_plot`); alternatively, many packages add a new recipe for the generic `plot` method using new types. (For example, `SymPy` has a plot recipe for symbolic types.)


Of course, more complicated equations are possible and the steps are
similar - only the function definition is more involved.
For example,
the [Devil's
curve](http://www-groups.dcs.st-and.ac.uk/~history/Curves/Devils.html)
has the form

```math
y^4 - x^4 + ay^2 + bx^2 = 0.
```

Here we draw the curve for a particular choice of ``a`` and ``b``. For
illustration purposes, a narrower viewing window is specified below using `xlims` and `ylims`:

```julia; hold=true
a,b = -1,2
f(x,y) = y^4 - x^4 + a*y^2 + b*x^2
implicit_plot(f; xlims=(-3,3), ylims=(-3,3), legend=false)
```

## Tangent lines, implicit differentiation


The graph of ``x^2 + y^2 = 1`` has well-defined tangent lines at all points except
``(-1,0)`` and ``(1, 0)``, and even at these two points we could call the vertical lines
``x=-1`` and ``x=1`` tangent lines. However, to recover the slope of these tangent lines would
require us to express ``y`` as a function of ``x`` and then differentiate
that function. Of course, in this example, we would need two functions,
``f(x) = \sqrt{1-x^2}`` and ``g(x) = - \sqrt{1-x^2}``, to do this
completely.

In general though, we may not be able to solve for ``y`` in terms of ``x``. What then?

The idea is to *assume* that ``y`` is representable by some function of
``x``. This makes sense: moving on the curve from ``(x,y)`` to some nearby
point means changing ``x`` will cause some change in ``y``. This
assumption is only made *locally* - basically meaning a complicated
graph is reduced to just a small, well-behaved section of its graph.


With this assumption, asking what ``dy/dx`` is has an obvious meaning -
what is the slope of the tangent line to the graph at ``(x,y)``? (The assumption eliminates the question of what a tangent line would mean when a graph self intersects.)

The method of implicit differentiation allows this question to be
investigated. It begins by differentiating both sides of the equation,
assuming ``y`` is a function of ``x``, to derive a new equation involving ``dy/dx``.

For example, starting with ``x^2 + y^2 = 1``, differentiating both sides in ``x`` gives:

```math
2x + 2y\cdot \frac{dy}{dx} = 0.
```

The chain rule was used to find ``(d/dx)(y^2) = [y(x)^2]' = 2y \cdot dy/dx``. From this we can solve for ``dy/dx`` (the resulting equations are linear in ``dy/dx``, so can always be solved explicitly):

```math
\frac{dy}{dx} = -\frac{x}{y}.
```

This says the slope of the tangent line depends on the point ``(x,y)`` through the formula ``-x/y``.


As a check, we compare to what we would have found had we solved for
``y= \sqrt{1 - x^2}`` (for ``(x,y)`` with ``y \geq 0``). We would have
found ``dy/dx = 1/2 \cdot 1/\sqrt{1 - x^2} \cdot (-2x)``, which can be
simplified to ``-x/y``. This should show that the method
above - assuming ``y`` is a function of ``x`` and differentiating - is not
only more general, but can even be easier.

The name - *implicit differentiation* - comes from the assumption that
``y`` is implicitly defined in terms of ``x``. According to the
[Implicit Function Theorem](http://en.wikipedia.org/wiki/Implicit_function_theorem) the
above method will work provided the curve has sufficient smoothness
near the point ``(x,y)``.

##### Examples

Consider the [serpentine](http://www-history.mcs.st-and.ac.uk/Curves/Serpentine.html) equation

```math
x^2y + a\cdot b \cdot y - a^2 \cdot x = 0, \quad a\cdot b > 0.
```

For ``a = 2, b=1`` we have the graph:

```julia;hold=true
a, b = 2, 1
f(x,y) = x^2*y + a * b * y - a^2 * x
implicit_plot(f)
```

We can see that at each point in the viewing window the tangent line
exists due to the smoothness of the curve. Moreover, at a point
``(x,y)`` the tangent will have slope ``dy/dx`` satisfying:

```math
2xy + x^2 \frac{dy}{dx} + a\cdot b \frac{dy}{dx} - a^2 = 0.
```

Solving yields:

```math
\frac{dy}{dx} = \frac{a^2 - 2xy}{ab + x^2}.
```


In particular, the point ``(0,0)`` is always on this graph, and the tangent line there will have positive slope ``a^2/(ab) = a/b``.

----

The [eight](http://www-history.mcs.st-and.ac.uk/Curves/Eight.html) curve has representation

```math
x^4 = a^2(x^2-y^2), \quad a \neq 0.
```


A graph for ``a=3`` shows why it has the name it does:

```julia;hold=true
a = 3
f(x,y) = x^4 - a^2*(x^2 - y^2)
implicit_plot(f)
```

The tangent line at ``(x,y)`` will have slope ``dy/dx`` satisfying:

```math
4x^3 = a^2 \cdot (2x - 2y \frac{dy}{dx}).
```

Solving gives:

```math
\frac{dy}{dx} = -\frac{4x^3 - a^2 \cdot 2x}{a^2 \cdot 2y}.
```

The point ``(3,0)`` can be seen to be a solution to the equation and
should have a vertical tangent line. This also is reflected in the
formula, as the denominator is ``a^2\cdot 2 y``, which is ``0`` at this point, whereas the numerator is not.


##### Example

The quotient rule can be hard to remember, unlike the product rule. No
reason to despair, though: the product rule plus implicit differentiation can
be used to recover the quotient rule. Suppose ``y=f(x)/g(x)``; then we
could also write ``y g(x) = f(x)``. Differentiating implicitly gives:

```math
\frac{dy}{dx} g(x) + y g'(x) = f'(x).
```

Solving for ``dy/dx`` gives:

```math
\frac{dy}{dx} = \frac{f'(x) - y g'(x)}{g(x)}.
```

Not quite what we expect, perhaps, but substituting in ``f(x)/g(x)`` for ``y`` gives us the usual formula:


```math
\frac{dy}{dx} = \frac{f'(x) - \frac{f(x)}{g(x)} g'(x)}{g(x)} = \frac{f'(x) g(x) - f(x) g'(x)}{g(x)^2}.
```

!!! note
    In this example we mix notations, using ``g'(x)`` to
    represent a derivative of ``g`` with respect to ``x`` and ``dy/dx`` to
    represent the derivative of ``y`` with respect to ``x``. This is done to
    emphasize the value that we are solving for. It is just a convention
    though; we could just as well have used the "prime" notation for each.
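
Before moving on, it can be reassuring to check an implicit derivative against a direct computation. This is a sketch, not part of the text: for the circle ``x^2 + y^2 = 1`` we found ``dy/dx = -x/y``, and on the upper half we can also differentiate ``y = \sqrt{1 - x^2}`` directly (here with a simple central difference, though the automatic derivatives used elsewhere in this text would also work):

```julia
# upper half of the circle x^2 + y^2 = 1
yupper(x) = sqrt(1 - x^2)

# slope from implicit differentiation: dy/dx = -x/y
implicit_slope(x, y) = -x / y

# slope from a numeric (central-difference) derivative of yupper
h = 1e-6
direct_slope(x) = (yupper(x + h) - yupper(x - h)) / (2h)

x0 = 0.5
isapprox(direct_slope(x0), implicit_slope(x0, yupper(x0)); atol=1e-6)
```

The two slopes agree to within the accuracy of the finite difference, as the implicit formula predicts.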


##### Example: Graphing a tangent line

Let's see how to add a graph of a tangent line to the graph of an
equation. Tangent lines are tangent at a point, so we need a point to
discuss.

Returning to the equation for a circle, ``x^2 + y^2 = 1``, let's look at
``(\sqrt{2}/2, - \sqrt{2}/2)``. The derivative is ``-x/y``, so the slope
at this point is ``1``. The line itself has equation ``y = b + m \cdot
(x-a)``. The following represents this in `Julia`:

```julia;hold=true
F(x,y) = x^2 + y^2 - 1

a,b = sqrt(2)/2, -sqrt(2)/2

m = -a/b
tl(x) = b + m * (x-a)

implicit_plot(F, xlims=(-2, 2), ylims=(-2, 2), aspect_ratio=:equal)
plot!(tl)
```

We added *both* the implicit plot of ``F`` and the tangent line to the graph at the given point.


##### Example

When we assume ``y`` is a function of ``x``, it may not be feasible to
actually find the function algebraically. However, in many cases one
can be found numerically. Suppose ``G(x,y) = c`` describes the
equation. Then for a fixed ``x``, ``y(x)`` solves ``G(x, y(x)) - c = 0``, so
``y(x)`` is a zero of a known function. As long as we can piece together
which ``y`` goes with which, we can find the function.

For example, the [folium](http://www-history.mcs.st-and.ac.uk/Curves/Foliumd.html) of Descartes has the equation

```math
x^3 + y^3 = 3axy.
```

Setting ``a=1`` we have the graph:

```julia;
𝒂 = 1
G(x,y) = x^3 + y^3 - 3*𝒂*x*y
implicit_plot(G)
```

We can solve for the lower curve, ``y``, as a function of ``x``, as follows:

```julia;
y1(x) = minimum(find_zeros(y->G(x,y), -10, 10)) # find_zeros from `Roots`
```

This gives the lower part of the curve, which we can plot with:

```julia;
plot(y1, -5, 5)
```

Though, in this case, the cubic equation would admit a closed-form solution, the approach illustrated applies more generally.


## Using SymPy for computation

`SymPy` can be used to perform implicit differentiation.
The three
steps are similar: we assume ``y`` is a function of ``x``, *locally*;
differentiate both sides; solve the result for ``dy/dx``.


Let's do so for the [Trident of
Newton](http://www-history.mcs.st-and.ac.uk/Curves/Trident.html), which
is represented in Cartesian form as follows:

```math
xy = ax^3 + bx^2 + cx + d.
```



To approach this task in `SymPy`, we begin by defining our symbolic expression. For now, we keep the parameters as symbolic values:

```julia;
@syms a b c d x y
ex = x*y - (a*x^3 + b*x^2 + c*x + d)
```

To express that `y` is locally a function of `x`, we use a "symbolic function" object:

```julia;
@syms u()
```

The object `u` is the symbolic function, and `u(x)` a symbolic expression
involving a symbolic function. This is what we will use to refer to `y`.


Assuming ``y`` is a function of ``x``, called `u(x)`, this substitution is just a renaming:

```julia;
ex1 = ex(y => u(x))
```

At this point, we differentiate in `x`:

```julia;
ex2 = diff(ex1, x)
```

The next step is to solve for ``dy/dx`` - the lone answer to the linear equation - which is done as follows:

```julia;
dydx = diff(u(x), x)
ex3 = solve(ex2, dydx)[1] # pull out lone answer with [1] indexing
```

As this represents an answer in terms of `u(x)`, we replace that term with the original variable:

```julia;
dydx₁ = ex3(u(x) => y)
```

If `x` and `y` are the variable names, this function will combine the steps above:

```julia;
function dy_dx(eqn, x, y)
    @syms u()
    eqn1 = eqn(y => u(x))
    eqn2 = solve(diff(eqn1, x), diff(u(x), x))[1]
    eqn2(u(x) => y)
end
```


Let ``a = b = c = d = 1``; then ``(1,4)`` is a point on the curve.
We can draw a tangent line to this point with these commands:

```julia;
H = ex(a=>1, b=>1, c=>1, d=>1)
x0, y0 = 1, 4
𝒎 = dydx₁(x=>1, y=>4, a=>1, b=>1, c=>1, d=>1)
implicit_plot(lambdify(H); xlims=(-5,5), ylims=(-5,5), legend=false)
plot!(y0 + 𝒎 * (x-x0))
```

Basically this includes all the same steps as if done "by hand." Some effort could have been saved in plotting had
values for the parameters been substituted initially, but not doing so
shows their dependence in the derivative.

!!! warning
    The use of `lambdify(H)` is needed to turn the symbolic expression, `H`, into a function.

!!! note
    While `SymPy` itself has the `plot_implicit` function for plotting implicit equations, this works only with `PyPlot`, not `Plots`, so we use the `ImplicitPlots` package in these examples.


## Higher order derivatives

Implicit differentiation can be used to find ``d^2y/dx^2`` or other higher-order derivatives. At each stage, the same technique is applied. The
only "trick" is that some simplifications can be made.

For example, consider ``x^3 - y^3=3``. To find ``d^2y/dx^2``, we first find ``dy/dx``:

```math
3x^2 - (3y^2 \frac{dy}{dx}) = 0.
```

We could solve for ``dy/dx`` at this point - it always appears as a linear factor - to get:

```math
\frac{dy}{dx} = \frac{3x^2}{3y^2} = \frac{x^2}{y^2}.
```

However, we differentiate the first equation, as we generally try to avoid the quotient rule:

```math
6x - (6y \frac{dy}{dx} \cdot \frac{dy}{dx} + 3y^2 \frac{d^2y}{dx^2}) = 0.
```

Again, it must be that ``d^2y/dx^2`` appears as a linear factor, so we can solve for it:

```math
\frac{d^2y}{dx^2} = \frac{6x - 6y (\frac{dy}{dx})^2}{3y^2}.
```

One last substitution for ``dy/dx`` gives:

```math
\frac{d^2y}{dx^2} = \frac{6x - 6y (\frac{x^2}{y^2})^2}{3y^2} = 2\frac{x}{y^2} - 2\frac{x^4}{y^5} = 2\frac{x}{y^2}(1 - \frac{x^3}{y^3}) = 2\frac{x}{y^5}(y^3 - x^3) = 2 \frac{x}{y^5}(-3).
```

It isn't so pretty, but that's all it takes.



To visualize, we plot implicitly and notice that:

* as we change quadrants from the third to the fourth to the first the
  concavity changes from down to up to down, as the sign of the second
  derivative changes from negative to positive to negative;

* and that at these inflection points, the "tangent" line is vertical
  when ``y=0`` and flat when ``x=0``.

```julia;
K(x,y) = x^3 - y^3 - 3
implicit_plot(K, xlims=(-3, 3), ylims=(-3, 3))
```


The same problem can be done symbolically. The steps are similar, though the last step (replacing ``x^3 - y^3`` with ``3``) isn't done without explicitly asking.

```julia; hold=true
@syms x y u()

eqn = K(x,y)
eqn1 = eqn(y => u(x))
dydx = solve(diff(eqn1,x), diff(u(x), x))[1] # 1 solution
d2ydx2 = solve(diff(eqn1, x, 2), diff(u(x),x, 2))[1] # 1 solution
eqn2 = d2ydx2(diff(u(x), x) => dydx, u(x) => y)
simplify(eqn2)
```

## Inverse functions

As [mentioned](../precalc/inversefunctions.html), an [inverse](http://en.wikipedia.org/wiki/Inverse_function) function for ``f(x)`` is a function ``g(x)`` satisfying:
``y = f(x)`` if and only if ``g(y) = x`` for all ``x`` in the domain of ``f`` and ``y`` in the range of ``f``.

In short, both ``f \circ g`` and ``g \circ f`` are identity functions on their respective domains.
As inverses are unique, their notation, ``f^{-1}(x)``, reflects the name of the related function.

The chain rule can be used to give the derivative of an inverse
function when applied to ``f(f^{-1}(x)) = x``. Solving gives
``[f^{-1}(x)]' = 1 / f'(f^{-1}(x))``.

This is great - if we can remember the rules. If not, sometimes implicit
differentiation can also help.

Consider the inverse function for the tangent, which exists when the domain of the tangent function is restricted to ``(-\pi/2, \pi/2)``. The function solves ``y = \tan^{-1}(x)`` or ``\tan(y) = x``.
Differentiating this yields:

```math
\sec(y)^2 \frac{dy}{dx} = 1.
```

Or ``dy/dx = 1/\sec^2(y)``.

But ``\sec(y)^2 = 1 + \tan(y)^2 = 1 + x^2``, as can be seen by right-triangle trigonometry. This yields the formula ``dy/dx = [\tan^{-1}(x)]' = 1 / (1 + x^2)``.

##### Example

For a more complicated example, suppose we have a moving trajectory ``(x(t),
y(t))``. The angle it makes with the origin satisfies

```math
\tan(\theta(t)) = \frac{y(t)}{x(t)}.
```

Suppose ``\theta(t)`` can be defined in terms of the inverse to some
function (``\tan^{-1}(x)``). We can differentiate implicitly to find ``\theta'(t)`` in
terms of derivatives of ``y`` and ``x``:

```math
\sec^2(\theta(t)) \cdot \theta'(t) = \frac{y'(t) x(t) - y(t) x'(t)}{x(t)^2}.
```

But ``\sec^2(\theta(t)) = (r(t)/x(t))^2 = (x(t)^2 + y(t)^2) / x(t)^2``, so moving the secant term to the other side gives an explicit, albeit complicated, expression for the derivative of ``\theta`` in terms of the functions ``x`` and ``y``:

```math
\theta'(t) = \frac{x^2(t)}{x^2(t) + y^2(t)} \cdot \frac{y'(t) x(t) - y(t) x'(t)}{x(t)^2} = \frac{y'(t) x(t) - y(t) x'(t)}{x^2(t) + y^2(t)}.
```

This could have been made easier had we leveraged the result of the previous example.

#### Example: from physics

Many problems are best done with implicit derivatives. A video showing
such a problem along with how to do it analytically is
[here](http://ocw.mit.edu/courses/mathematics/18-01sc-single-variable-calculus-fall-2010/unit-2-applications-of-differentiation/part-b-optimization-related-rates-and-newtons-method/session-32-ring-on-a-string/).

This video starts with a simple question:


> If you have a rope and heavy ring, where will the ring position itself
> due to gravity?


Well, suppose you hold the rope in two places, which we can take to be
``(0,0)`` and ``(a,b)``. Then let ``(x,y)`` be all the possible positions of
the ring that hold the rope taut.
Then we have this picture:


```julia; hold=true; echo=false
P = (4,1)
Q = (1, -3)
scatter([0,4], [0,1], legend=false, xaxis=nothing, yaxis=nothing)
plot!([0,1,4],[0,-3,1])
a,b = .05, .25
ts = range(0, 2pi, length=100)
plot!(1 .+ a*sin.(ts), -3 .+ b*cos.(ts), color=:gold)
annotate!((4-0.3,1,"(a,b)"))
plot!([0,1,1],[0,0,-3], color=:gray, alpha=0.25)
plot!([1,1,4],[0,1,1], color=:gray, alpha=0.25)
Δ = 0.15
annotate!([(1/2, 0-Δ, "x"), (5/2, 1 - Δ, "a-x"), (1-Δ, -1, "|y|"), (1+Δ, -1, "b-y")])
```



Since the length of the rope does not change, we must have for any admissible ``(x,y)`` that:

```math
L = \sqrt{x^2 + y^2} + \sqrt{(a-x)^2 + (b-y)^2},
```

where these terms come from the two hypotenuses in the figure, as computed through
the Pythagorean theorem.


> If we assume that the ring will minimize the value of y subject to
> this constraint, can we solve for y?


We create a function to represent the equation:

```julia;
F₀(x, y, a, b) = sqrt(x^2 + y^2) + sqrt((a-x)^2 + (b-y)^2)
```

To illustrate, we need specific values of ``a``, ``b``, and ``L``:

```julia;
𝐚, 𝐛, 𝐋 = 3, 3, 10 # L > sqrt(a^2 + b^2)
F₀(x, y) = F₀(x, y, 𝐚, 𝐛)
```

Our values ``(x,y)`` must satisfy ``F₀(x,y) = L``. Let's graph:

```julia;
implicit_plot((x,y) -> F₀(x,y) - 𝐋, xlims=(-5, 7), ylims=(-5, 7))
```

The graph is an ellipse, though slightly tilted.

Okay, now to find the lowest point. This will be when the derivative
is ``0``. We solve by assuming ``y`` is a function of ``x``, called `u`.
We have already defined symbolic variables `a`, `b`, `x`, and `y`; here we define `L`:

```julia;
@syms L
```

Then

```julia;
eqn = F₀(x,y,a,b) - L
```

```julia;
eqn_1 = diff(eqn(y => u(x)), x)
eqn_2 = solve(eqn_1, diff(u(x), x))[1]
dydx₂ = eqn_2(u(x) => y)
```

We are looking for when the tangent line has ``0`` slope, or when
`dydx₂` is ``0``:

```julia;
cps = solve(dydx₂, x)
```

There are two answers, as we could guess from the graph, but we want the one for the smallest value of ``y``, which is the second.

The values of `dydx₂` depend on any pair ``(x,y)``, but our solution must
also satisfy the equation. That is, for our value of ``x``, we need to find
the corresponding ``y``. This should be possible by substituting:

```julia;
eqn1 = eqn(x => cps[2])
```

We would try to solve `eqn1` for `y` with `solve(eqn1, y)`, but
`SymPy` can't complete this problem. Instead, we will approach this
numerically using `find_zero` from the `Roots` package. We make the above a function of `y` alone:

```julia;
eqn2 = eqn1(a=>3, b=>3, L=>10)
ystar = find_zero(eqn2, -3)
```

Okay, now we need to put this value back into our expression for the `x` value and also substitute in for the parameters:

```julia;
xstar = N(cps[2](y => ystar, a => 3, b => 3, L => 10))
```

Our minimum is at `(xstar, ystar)`, as this graphic shows:

```julia;
tl(x) = ystar + 0 * (x - xstar)
implicit_plot((x,y) -> F₀(x,y,3,3) - 10, xlims=(-4, 7), ylims=(-10, 10))
plot!(tl)
```



If you watch the video linked to above, you will see that the
surprising fact here is the resting point is such that the angles
formed by the rope are the same. Basically this makes the tension in
both parts of the rope equal, so there is a static position (if not
static, the ring would move and not end in the final position).
We can verify this fact numerically by showing the arctangents of the two triangles are the same up to a sign:

```julia;
a0, b0 = 0, 0 # the foci of the ellipse are (0,0) and (3,3)
a1, b1 = 3, 3
atan((b0 - ystar)/(a0 - xstar)) + atan((b1 - ystar)/(a1 - xstar)) # ≈ 0
```

Now, were we lucky and just happened to take ``a=3``, ``b=3`` in such a way to make this work? Well, no. But convince yourself by doing the above for different values of ``b``.

----

In the above, we started with ``F(x,y) = L`` and solved symbolically for ``y=f(x)`` so that ``F(x,f(x)) = L``. Then we took a derivative of ``f(x)`` and set this equal to ``0`` to solve for the minimum ``y`` values.

Here we try the same problem numerically, using a zero-finding approach to identify ``f(x)``.

Starting with ``F(x,y) = \sqrt{x^2 + y^2} + \sqrt{(x-1)^2 + (y-2)^2}`` and ``L=3``, we have:

```julia
F₁(x,y) = F₀(x,y, 1, 2) - 3 # a,b,L = 1,2,3
implicit_plot(F₁)
```

From the graph, the lowest ``y`` value appears to occur near ``x=0.1``. We can do better.

First, we could try to solve for ``f`` using `find_zero`. Here is one way:

```julia
f₀(x) = find_zero(y -> F₁(x, y), 0)
```

We use ``0`` as an initial guess, as the ``y`` value is near ``0``. More on this later. We could then just sample many ``x`` values between ``-0.5`` and ``1.5`` and find the one corresponding to the smallest ``y`` value:

```julia
findmin([f₀(x) for x ∈ range(-0.5, 1.5, length=100)])
```

This shows the smallest value is around ``-0.414`` and occurs in the ``33``rd position of the sampled ``x`` values. Pretty good, but we can do better. We just need to differentiate ``f``, solve ``f'(x) = 0``, and then put that value back into ``f`` to find the smallest ``y``.

**However**, there is one subtle point. Using automatic differentiation, as implemented in `ForwardDiff`, with `find_zero` requires the `x0` initial value to have a certain type.
In this case, the same type as the "`x`" passed into ``f(x)``. So rather than use an initial value of ``0``, we must use the initial value `zero(x)`! (Otherwise, there will be an error "`no method matching Float64(::ForwardDiff.Dual{...`".)

With this slight modification, we have:

```julia
f₁(x) = find_zero(y -> F₁(x, y), zero(x))
plot(f₁', -0.5, 1.5)
```

The zero of `f₁'` is a bit to the right of ``0``, say ``0.2``; we use `find_zero` again to find it:

```julia
xstar₁ = find_zero(f₁', 0.2)
xstar₁, f₁(xstar₁)
```

It is important to note that the above uses of `find_zero` required *good* initial guesses, which we were fortunate enough to identify.

## Questions

###### Question

Is ``(1,1)`` on the graph of

```math
x^2 - 2xy + y^2 = 1?
```

```julia; hold=true; echo=false
x,y=1,1
yesnoq(x^2 - 2x*y + y^2 ==1)
```

###### Question

For the equation

```math
x^2y + 2y - 4 x = 0,
```

if ``x=4``, what is a value for ``y`` such that ``(x,y)`` is a point on the graph of the equation?

```julia; hold=true; echo=false
@syms x y
eqn = x^2*y + 2y - 4x
val = float(N(solve(subs(eqn, (x,4)), y)[1]))
numericq(val)
```

###### Question

For the equation

```math
(y-5)\cdot \cos(4\cdot \sqrt{(x-4)^2 + y^2}) = x\cdot\sin(2\sqrt{x^2 + y^2}),
```

is the point ``(5,0)`` a solution?

```julia; hold=true; echo=false
yesnoq(false)
```

###### Question

Let ``(x/3)^2 + (y/2)^2 = 1``. Find the slope of the tangent line at the point ``(3/\sqrt{2}, 2/\sqrt{2})``.

```julia; hold=true; echo=false
@syms x y u()
eqn = (x/3)^2 + (y/2)^2 - 1
dydx = SymPy.solve(SymPy.diff(SymPy.subs(eqn, y, u(x)), x), SymPy.diff(u(x), x))[1]
val = float(SymPy.N(SymPy.subs(dydx, (u(x), y), (x, 3/sqrt(2)), (y, 2/sqrt(2)))))
numericq(val)
```

###### Question

The [Lamé](http://www-history.mcs.st-and.ac.uk/Curves/Lame.html) curves satisfy:

```math
\left(\frac{x}{a}\right)^n + \left(\frac{y}{b}\right)^n = 1.
```

An ellipse is when ``n=2``. Take ``n=3``, ``a=1``, and ``b=2``.

Find a *positive* value of ``y`` when ``x=1/2``.

```julia; hold=true; echo=false
a,b,n=1,2,3
val = b*(1 - ((1/2)/a)^n)^(1/n)
numericq(val)
```

What expression gives ``dy/dx``?

```julia; hold=true; echo=false
choices = [
"``-(y/x) \\cdot (x/a)^n \\cdot (y/b)^{-n}``",
"``b \\cdot (1 - (x/a)^n)^{1/n}``",
"``-(x/a)^n / (y/b)^n``"
]
answ = 1
radioq(choices, answ)
```

###### Question

Let ``y - x^2 = -\log(x)``. At the point ``(1/2, 0.9431...)``, the graph has a tangent line. Find this line, then find its intersection point with the ``y`` axis.

This intersection is:

```julia; hold=true; echo=false
f(x) = x^2 - log(x)
x0 = 1/2
tl(x) = f(x0) + f'(x0) * (x - x0)
numericq(tl(0))
```

###### Question

The [witch](http://www-history.mcs.st-and.ac.uk/Curves/Witch.html) of [Agnesi](http://www.maa.org/publications/periodicals/convergence/mathematical-treasures-maria-agnesis-analytical-institutions) is the curve given by the equation

```math
y(x^2 + a^2) = a^3.
```

If ``a=1``, numerically find a value of ``y`` when ``x=2``.

```julia; hold=true; echo=false
a = 1
f(x,y) = y*(x^2 + a^2) - a^3
val = find_zero(y->f(2,y), 1)
numericq(val)
```

What expression yields ``dy/dx`` for this curve:

```julia; hold=true; echo=false
choices = [
"``-2xy/(x^2 + a^2)``",
"``2xy / (x^2 + a^2)``",
"``a^3/(x^2 + a^2)``"
]
answ = 1
radioq(choices, answ)
```

###### Question

```julia; hold=true; echo=false
### {{{lhopital_35}}}
imgfile = "figures/fcarc-may2016-fig35-350.gif"
caption = """
Image number 35 from L'Hospital's calculus book (the first). Given a description of the curve, identify the point ``E`` which maximizes the height.
"""
ImageFile(:derivatives, imgfile, caption)
```

The figure above shows a problem appearing in L'Hospital's first calculus book.
Given a function defined implicitly by ``x^3 + y^3 = axy`` (with ``AP=x``, ``AM=y`` and ``AB=a``), find the point ``E`` that maximizes the height. In the [AMS feature column](http://www.ams.org/samplings/feature-column/fc-2016-05) this problem is illustrated and solved in the historical manner, with the comment that the concept of implicit differentiation wouldn't have occurred to L'Hospital.

Using implicit differentiation, find when ``dy/dx = 0``.

```julia; hold=true; echo=false
choices = ["``y^2 = 3x/a``", "``y=3x^2/a``", "``y=a/(3x^2)``", "``y^2=a/(3x)``"]
answ = 2
radioq(choices, answ)
```

Substituting the correct value of ``y``, above, into the defining equation gives what value for ``x``:

```julia; hold=true; echo=false
choices=[
"``x=(1/2) a 2^{1/2}``",
"``x=(1/3) a 2^{1/3}``",
"``x=(1/2) a^3 3^{1/3}``",
"``x=(1/3) a^2 2^{1/2}``"
]
answ = 2
radioq(choices, answ)
```

###### Question

For the equation of an ellipse:

```math
\left(\frac{x}{a}\right)^2 + \left(\frac{y}{b}\right)^2 = 1,
```

compute ``d^2y/dx^2``. Is this the answer?

```math
\frac{d^2y}{dx^2} = -\frac{b^2}{a^2\cdot y} - \frac{b^4\cdot x^2}{a^4\cdot y^3} = -\frac{1}{y}\frac{b^2}{a^2}\left(1 + \frac{b^2 x^2}{a^2 y^2}\right).
```

```julia; hold=true; echo=false
yesnoq(true)
```

If ``y>0``, is the sign positive or negative?

```julia; hold=true; echo=false
choices = ["positive", "negative", "Can be both"]
answ = 2
radioq(choices, answ, keep_order=true)
```

If ``x>0``, is the sign positive or negative?

```julia; hold=true; echo=false
choices = ["positive", "negative", "Can be both"]
answ = 3
radioq(choices, answ, keep_order=true)
```

When ``x>0``, the graph of the equation is...

```julia; hold=true; echo=false
choices = ["concave up", "concave down", "both concave up and down"]
answ = 3
radioq(choices, answ, keep_order=true)
```

## Appendix

There are other packages in the `Julia` ecosystem that can plot implicit equations.

### The ImplicitEquations package

The `ImplicitEquations` package can plot equations and inequalities. Its use is somewhat similar to the examples above, but the object plotted is a predicate, not a function. These predicates are created with functions like `Eq` or `Lt`.

For example, the `ImplicitPlots` manual plots the function ``f(x,y) = (x^4 + y^4 - 1) \cdot (x^2 + y^2 - 2) + x^5 \cdot y`` as an example. Using `ImplicitEquations`, the equation ``f(x,y) = 0`` would be plotted with:

```julia; hold=true
using ImplicitEquations
f(x,y) = (x^4 + y^4 - 1) * (x^2 + y^2 - 2) + x^5 * y
r = Eq(f, 0) # the equation f(x,y) = 0
plot(r)
```

Unlike `ImplicitPlots`, inequalities may be displayed:

```julia; hold=true
f(x,y) = (x^4 + y^4 - 1) * (x^2 + y^2 - 2) + x^5 * y
r = Lt(f, 0) # the inequality f(x,y) < 0
plot(r; M=10, N=10) # less blocky
```

The rendered plots look "blocky" due to the algorithm used to plot the equations. As there is no rule defining ``(x,y)`` pairs to plot, a search by regions is done. A region is initially labeled undetermined. If it can be shown that for any value in the region the equation is true (equations can also be inequalities), the region is colored black. If it can be shown it will never be true, the region is dropped. If a black-and-white answer is not clear, the region is subdivided and each subregion is similarly tested. This continues until the remaining undecided regions are smaller than some threshold. Such regions comprise a boundary, and these are also colored black. Only regions are plotted - not ``(x,y)`` pairs - so the results are blocky. Pass larger values of ``M`` and ``N`` (each with a default of ``8``) to `plot` to lower the threshold, at the cost of longer computation times, as seen in the last example.

### The IntervalConstraintProgramming package

The `IntervalConstraintProgramming` package can also be used to graph implicit equations.
For certain problem descriptions it is significantly faster and makes better graphs. The usage is slightly more involved. We show the commands, but don't run them here, as there are minor conflicts with the `CalculusWithJulia` package.

We specify a problem using the `@constraint` macro. Using a macro allows expressions to involve free symbols, so the problem is specified in an equation-like manner:

```julia; eval=false
S = @constraint x^2 + y^2 <= 2
```

The right-hand side must be a number.

The area to plot over must be specified as an `IntervalBox`, basically a pair of intervals. The interval ``[a,b]`` is expressed through `a..b`:

```julia; eval=false
J = -3..3
X = IntervalArithmetic.IntervalBox(J, J)
```

The `pave` command does the heavy lifting:

```julia; eval=false
region = IntervalConstraintProgramming.pave(S, X)
```

A plot can be made of either the boundary, the interior, or both:

```julia; eval=false
plot(region.inner) # plot interior; use region.boundary for the boundary
```
diff --git a/CwJ/derivatives/lhospitals_rule.jmd b/CwJ/derivatives/lhospitals_rule.jmd
deleted file mode 100644
index a7df189..0000000
--- a/CwJ/derivatives/lhospitals_rule.jmd
+++ /dev/null
@@ -1,770 +0,0 @@

# L'Hospital's Rule

This section uses these add-on packages:

```julia
using CalculusWithJulia
using Plots
using SymPy
```

```julia; echo=false; results="hidden"
using CalculusWithJulia.WeaveSupport
using Roots

fig_size=(800, 600)
const frontmatter = (
    title = "L'Hospital's Rule",
    description = "Calculus with Julia: L'Hospital's Rule",
    tags = ["CalculusWithJulia", "derivatives", "l'hospital's rule"],
);

nothing
```

----

Let's return to limits of the form ``\lim_{x \rightarrow c}f(x)/g(x)`` which have an indeterminate form of ``0/0`` if both are evaluated at ``c``. A typical example is the limit considered by Euler:

```math
\lim_{x\rightarrow 0} \frac{\sin(x)}{x}.
```

We know this is ``1`` using a bound from geometry, but we might also guess this is ``1``, as we know from linearization near ``0`` that ``\sin(x) \approx x`` or, more specifically:

```math
\sin(x) = x - \sin(\xi)x^2/2, \quad 0 < \xi < x.
```

This would yield:

```math
\lim_{x \rightarrow 0} \frac{\sin(x)}{x} = \lim_{x\rightarrow 0} \frac{x -\sin(\xi) x^2/2}{x} = \lim_{x\rightarrow 0} \left(1 - \sin(\xi) \cdot x/2\right) = 1.
```

This is because ``\sin(\xi) x/2`` has a limit of ``0`` when ``|\xi| \leq |x|``.

That doesn't look any easier, as we worried about the error term; but if we just mentally replace ``\sin(x)`` with ``x`` - which it basically is near ``0`` - then we can see that the limit should be the same as that of ``x/x``, which we know is ``1`` without thinking.

Basically, we found that in terms of limits, if both ``f(x)`` and ``g(x)`` are ``0`` at ``c``, we *might* be able to just take this limit: ``(f(c) + f'(c) \cdot(x-c)) / (g(c) + g'(c) \cdot (x-c))``, which is just ``f'(c)/g'(c)``.

Wouldn't that be nice? We could find difficult limits just by differentiating the top and the bottom at ``c`` (and not use the messy quotient rule).

Well, in fact that is more or less true, a fact that dates back to [L'Hospital](http://en.wikipedia.org/wiki/L%27H%C3%B4pital%27s_rule) - who wrote the first textbook on differential calculus - though this result is likely due to one of the Bernoulli brothers.

> *L'Hospital's rule*: Suppose:
> * that ``\lim_{x\rightarrow c+} f(x) = 0`` and ``\lim_{x\rightarrow c+} g(x) = 0``,
> * that ``f`` and ``g`` are differentiable in ``(c,b)``, and
> * that ``g'(x)`` exists and is non-zero for *all* ``x`` in ``(c,b)``,
> then **if** the following limit exists:
> ``\lim_{x\rightarrow c+}f'(x)/g'(x)=L`` it follows that
> ``\lim_{x \rightarrow c+}f(x)/g(x) = L``.
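Before looking at more examples, here is a quick numeric illustration of what the rule asserts, for ``f(x) = \sin(x)`` and ``g(x) = x`` at ``c = 0`` (the derivatives are entered by hand; this illustrates the statement, it does not prove it):

```julia
f(x)  = sin(x);  fp(x) = cos(x)   # fp is f', written out by hand
g(x)  = x;       gp(x) = one(x)   # gp is g'
xs = (0.1, 0.01, 0.001)
ratios  = [f(x) / g(x)   for x in xs]   # the indeterminate ratio
dratios = [fp(x) / gp(x) for x in xs]   # the ratio of derivatives
```

Both columns head to the same value ``L = 1``, as the rule predicts for this pair.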
That is, *if* the right limit of ``f(x)/g(x)`` is indeterminate of the form ``0/0``, but the right limit of ``f'(x)/g'(x)`` is known, possibly by simple continuity, then the right limit of ``f(x)/g(x)`` exists and is equal to that of ``f'(x)/g'(x)``.

The rule applies equally to *left limits* and to *limits* at ``c``. Later we will see there are other generalizations.

To apply this rule to Euler's example, ``\sin(x)/x``, we just need to consider that:

```math
L = 1 = \lim_{x \rightarrow 0}\frac{\cos(x)}{1}.
```

So, as well, ``\lim_{x \rightarrow 0} \sin(x)/x = 1``.

This is due to ``\cos(x)`` being continuous at ``0``, so this limit is just ``\cos(0)/1``. (More importantly, the tangent-line expansion of ``\sin(x)`` at ``0`` is ``\sin(0) + \cos(0)x``, so ``\cos(0)`` is why this answer is as it is; but we don't need to think in terms of ``\cos(0)``, rather in terms of the tangent-line expansion, which is ``\sin(x) \approx x``, as ``\cos(0)`` appears as the coefficient.)

!!! note
    In [Gruntz](http://www.cybertester.com/data/gruntz.pdf), in a reference attributed to Speiss, we learn that L'Hospital was a French Marquis who was taught in ``1692`` the calculus of Leibniz by Johann Bernoulli. They made a contract obliging Bernoulli to leave his mathematical inventions to L'Hospital in exchange for a regular compensation. This result was discovered in ``1694`` and appeared in L'Hospital's book of ``1696``.

##### Examples

- Consider this limit at ``0``: ``(a^x - 1)/x``. The numerator ``f(x) = a^x-1`` has ``f(0) = 0``, so this limit is indeterminate of the form ``0/0``. The derivative of ``f(x)`` is ``f'(x) = a^x \log(a)``, which has ``f'(0) = \log(a)``. The derivative of the bottom, ``g(x) = x``, is ``1`` at ``0``, so we have:

```math
\log(a) = \frac{\log(a)}{1} = \frac{f'(0)}{g'(0)} = \lim_{x \rightarrow 0}\frac{f'(x)}{g'(x)} = \lim_{x \rightarrow 0}\frac{f(x)}{g(x)}
= \lim_{x \rightarrow 0}\frac{a^x - 1}{x}.
```

!!! note
    Why rewrite in the "opposite" direction? Because the theorem's result -- ``L`` is the limit -- is only true if the related limit involving the derivative exists. We don't do this in the following, but did so here to emphasize the need for the limit of the ratio of the derivatives to exist.

- Consider this limit:

```math
\lim_{x \rightarrow 0} \frac{e^x - e^{-x}}{x}.
```

It too is of the indeterminate form ``0/0``. The derivative of the top is ``e^x + e^{-x}``, which is ``2`` when ``x=0``, so the ratio ``f'(0)/g'(0)`` is seen to be ``2``. By continuity, the limit of the ratio of the derivatives is ``2``. Then by L'Hospital's rule, the limit above is ``2``.

- Sometimes, L'Hospital's rule must be applied twice. Consider this limit:

```math
\lim_{x \rightarrow 0} \frac{1 - \cos(x)}{x^2}.
```

By L'Hospital's rule, *if* the following limit exists, the two will be equal:

```math
\lim_{x \rightarrow 0} \frac{\sin(x)}{2x}.
```

But if we didn't guess the answer, we see that this new problem is *also* indeterminate of the form ``0/0``. So, repeating the process, this new limit will exist and be equal to the following limit, should it exist:

```math
\lim_{x \rightarrow 0} \frac{\cos(x)}{2} = 1/2.
```

As ``L = 1/2`` for this related limit, it must also be the limit of the original problem, by L'Hospital's rule.

- Our "intuitive" limits can bump into issues. Take for example the limit of ``(\sin(x)-x)/x^2`` as ``x`` goes to ``0``. Using ``\sin(x) \approx x`` makes this look like ``0/x^2``, which is still indeterminate. (Because the difference ``\sin(x) - x`` is of higher order than ``x``.) Using L'Hospital's rule, this limit will exist (and be equal) if the following one does:

```math
\lim_{x \rightarrow 0} \frac{\cos(x) - 1}{2x}.
```

This particular limit is indeterminate of the form ``0/0``, so we again try L'Hospital's rule and consider

```math
\lim_{x \rightarrow 0} \frac{-\sin(x)}{2} = 0.
```

So as this limit exists, working backwards, the original limit in question will also be ``0``.

- This example comes from the Wikipedia page. It "proves" a discrete approximation for the second derivative.

Show that if ``f''(x)`` exists at ``c`` and is continuous at ``c``, then

```math
f''(c) = \lim_{h \rightarrow 0} \frac{f(c + h) - 2f(c) + f(c-h)}{h^2}.
```

This will follow from two applications of L'Hospital's rule to the right-hand side. The first says the limit on the right is equal to this limit, should it exist:

```math
\lim_{h \rightarrow 0} \frac{f'(c+h) - 0 - f'(c-h)}{2h}.
```

We have to be careful, as we differentiate in the ``h`` variable, not the ``c`` one, so the chain rule brings out the minus sign. But again, as we still have an indeterminate form ``0/0``, this limit will equal the following limit should it exist:

```math
\lim_{h \rightarrow 0} \frac{f''(c+h) - 0 - (-f''(c-h))}{2} =
\lim_{h \rightarrow 0}\frac{f''(c+h) + f''(c-h)}{2} = f''(c).
```

That last equality follows, as it is assumed that ``f''(x)`` exists at ``c`` and is continuous, that is, ``f''(c \pm h) \rightarrow f''(c)``.

The expression above finds use when second derivatives are numerically approximated. (The middle expression is the basis of the central finite-difference approximation to the derivative.)

- L'Hospital himself was interested in this limit for ``a > 0`` ([math overflow](http://mathoverflow.net/questions/51685/how-did-bernoulli-prove-lh%C3%B4pitals-rule)):

```math
\lim_{x \rightarrow a} \frac{\sqrt{2a^3\cdot x-x^4} - a\cdot(a^2\cdot x)^{1/3}}{ a - (a\cdot x^3)^{1/4}}.
```

These derivatives can be done by hand, but to avoid any minor mistakes we utilize `SymPy`, taking care to use rational numbers for the fractional powers, so as not to lose precision through floating-point roundoff:

```julia;
@syms a::positive x::positive
f(x) = sqrt(2a^3*x - x^4) - a * (a^2*x)^(1//3)
g(x) = a - (a*x^3)^(1//4)
```

We can see that at ``x=a`` we have the indeterminate form ``0/0``:

```julia;
f(a), g(a)
```

What about the derivatives?

```julia;
fp, gp = diff(f(x),x), diff(g(x),x)
fp(x=>a), gp(x=>a)
```

Their ratio will not be indeterminate, so the limit in question is just the ratio:

```julia;
fp(x=>a) / gp(x=>a)
```

Of course, we could have just relied on `limit`, which knows about L'Hospital's rule:

```julia;
limit(f(x)/g(x), x, a)
```

## Idea behind L'Hospital's rule

A first proof of L'Hospital's rule takes advantage of Cauchy's [generalization](http://en.wikipedia.org/wiki/Mean_value_theorem#Cauchy.27s_mean_value_theorem) of the mean value theorem to two functions. Suppose ``f(x)`` and ``g(x)`` are continuous on ``[c,b]`` and differentiable on ``(c,b)``. On ``(c,x)``, ``c < x < b``, there exists a ``\xi`` with ``f'(\xi) \cdot (g(x) - g(c)) = g'(\xi) \cdot (f(x) - f(c))``. In our formulation, both ``f(c)`` and ``g(c)`` are zero, so we have, provided we know that ``g(x)`` is non-zero, that ``f(x)/g(x) = f'(\xi)/g'(\xi)`` for some ``\xi``, ``c < \xi < x``. That the right-hand side has a limit as ``x \rightarrow c+`` is true by the assumption that the limit of the ratio of the derivatives exists. (The ``\xi`` part can be removed by considering it as a composition of a function going to ``c``.) Thus the right limit of the ratio ``f/g`` is known.
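The Cauchy mean value theorem invoked above can itself be illustrated numerically. Here is a minimal sketch for ``f(x) = \sin(x)`` and ``g(x) = x^3`` on ``[0, 1]`` (the grid search for ``\xi`` is our own shortcut, not part of the proof):

```julia
f(x) = sin(x);  fp(x) = cos(x)
g(x) = x^3;     gp(x) = 3x^2
a, b = 0.0, 1.0

# Cauchy's theorem guarantees a ξ with f'(ξ)(g(b) - g(a)) = g'(ξ)(f(b) - f(a));
# locate it by scanning a fine grid for the smallest residual.
h(ξ) = fp(ξ) * (g(b) - g(a)) - gp(ξ) * (f(b) - f(a))
ξs = range(a + 1e-6, b - 1e-6, length=100_000)
_, i = findmin(abs.(h.(ξs)))
ξ = ξs[i]
```

At this ``\xi`` the secant ratio ``(f(b)-f(a))/(g(b)-g(a))`` matches ``f'(\xi)/g'(\xi)``, which is exactly the identity the proof sketch uses.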
----

```julia; echo=false; cache=true
let
## {{{lhopitals_picture}}}

function lhopitals_picture_graph(n)

    g = (x) -> sqrt(1 + x) - 1 - x^2
    f = (x) -> x^2
    ts = range(-1/2, stop=1/2, length=50)

    a, b = 0, 1/2^n * 1/2
    m = (f(b)-f(a)) / (g(b)-g(a))

    ## get bounds for the secant line, drawn in the (g, f) plane
    tl = (x) -> f(a) + m * (x - g(a))

    lx = max(find_zero(x -> tl(x) - (-0.05), (-1000, 1000)), -0.6)
    rx = min(find_zero(x -> tl(x) - (0.25), (-1000, 1000)), 0.2)
    xs = [lx, rx]
    ys = map(tl, xs)

    plt = plot(g, f, -1/2, 1/2, legend=false, size=fig_size, xlim=(-.6, .5), ylim=(-.1, .3))
    plot!(plt, xs, ys, color=:orange)
    scatter!(plt, [g(a),g(b)], [f(a),f(b)], markersize=5, color=:orange)
    plt
end

caption = L"""

Geometric interpretation of ``L=\lim_{x \rightarrow 0} x^2 / (\sqrt{1 +
x} - 1 - x^2)``. At ``0`` this limit is indeterminate of the form
``0/0``. The value for a fixed ``x`` can be seen as the slope of a secant
line of a parametric plot of the two functions, plotted as ``(g,
f)``. In this figure, the limiting "tangent" line has ``0`` slope,
corresponding to the limit ``L``. In general, L'Hospital's rule is
nothing more than a statement about slopes of tangent lines.

"""

n = 6
anim = @animate for i=1:n
    lhopitals_picture_graph(i)
end

imgfile = tempname() * ".gif"
gif(anim, imgfile, fps = 1)

plotly()
ImageFile(imgfile, caption)
end
```

## Generalizations

L'Hospital's rule generalizes to other indeterminate forms; in particular, the indeterminate form ``\infty/\infty`` can be proved at the same time as ``0/0`` with a more careful [proof](http://en.wikipedia.org/wiki/L%27H%C3%B4pital%27s_rule#General_proof).

The value ``c`` in the limit can also be infinite.
Consider this case with ``c=\infty``:

```math
\begin{align*}
\lim_{x \rightarrow \infty} \frac{f(x)}{g(x)} &=
\lim_{x \rightarrow 0} \frac{f(1/x)}{g(1/x)}.
\end{align*}
```

L'Hospital's rule applies to the limit as ``x \rightarrow 0``, so we differentiate to get:

```math
\begin{align*}
\lim_{x \rightarrow 0} \frac{[f(1/x)]'}{[g(1/x)]'}
&= \lim_{x \rightarrow 0} \frac{f'(1/x)\cdot(-1/x^2)}{g'(1/x)\cdot(-1/x^2)}\\
&= \lim_{x \rightarrow 0} \frac{f'(1/x)}{g'(1/x)}\\
&= \lim_{x \rightarrow \infty} \frac{f'(x)}{g'(x)}.
\end{align*}
```

*Assuming* the latter limit exists, L'Hospital's rule assures the equality

```math
\lim_{x \rightarrow \infty} \frac{f(x)}{g(x)} =
\lim_{x \rightarrow \infty} \frac{f'(x)}{g'(x)}.
```

##### Examples

For example, consider

```math
\lim_{x \rightarrow \infty} \frac{x}{e^x}.
```

We see it is of the form ``\infty/\infty``. Taking advantage of the fact that L'Hospital's rule applies to limits at ``\infty``, we have that this limit will exist and be equal to this one, should it exist:

```math
\lim_{x \rightarrow \infty} \frac{1}{e^x}.
```

This limit is, of course, ``0``, as it is of the form ``1/\infty``. It is not hard to build up from here to show that for any integer value of ``n>0``:

```math
\lim_{x \rightarrow \infty} \frac{x^n}{e^x} = 0.
```

This is an expression of the fact that exponential functions grow faster than polynomial functions.

Similarly, powers grow faster than logarithms, as this limit shows, which is indeterminate of the form ``\infty/\infty``:

```math
\lim_{x \rightarrow \infty} \frac{\log(x)}{x} =
\lim_{x \rightarrow \infty} \frac{1/x}{1} = 0,
```

the first equality by L'Hospital's rule, as the second limit exists.
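These growth comparisons are easy to see numerically; a small sketch (the sample points are arbitrary choices of ours):

```julia
# x^n/e^x and log(x)/x both decay to 0 as x grows
poly_over_exp(x, n) = x^n / exp(x)
log_over_x(x) = log(x) / x

p = [poly_over_exp(x, 5) for x in (10.0, 50.0, 100.0)]
l = [log_over_x(x) for x in (10.0, 1e3, 1e6)]
```

Both sequences of ratios shrink toward ``0``, with the exponential comparison collapsing especially quickly.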

## Other indeterminate forms

Indeterminate forms of the type ``0 \cdot \infty``, ``0^0``, ``\infty^0``, ``\infty - \infty`` can be re-expressed to be in the form ``0/0`` or ``\infty/\infty``, and then L'Hospital's theorem can be applied.

###### Example: rewriting ``0 \cdot \infty``

What is the limit of ``x \log(x)`` as ``x \rightarrow 0+``? The form is ``0\cdot \infty``; rewriting, we see this is just:

```math
\lim_{x \rightarrow 0+}\frac{\log(x)}{1/x}.
```

L'Hospital's rule clearly applies to one-sided limits as well as two-sided ones (our proof sketch used one-sided limits), so this limit will equal the following, should it exist:

```math
\lim_{x \rightarrow 0+}\frac{1/x}{-1/x^2} = \lim_{x \rightarrow 0+} -x = 0.
```

###### Example: rewriting ``0^0``

What is the limit of ``x^x`` as ``x \rightarrow 0+``? The expression is of the form ``0^0``, which is indeterminate. (Even though floating point math defines the value as ``1``.) We can rewrite this by taking a log:

```math
x^x = \exp(\log(x^x)) = \exp(x \log(x)) = \exp(\log(x)/(1/x)).
```

We just saw that ``\lim_{x \rightarrow 0+}\log(x)/(1/x) = 0``. So by the rules for limits of compositions and the fact that ``e^x`` is continuous, we see ``\lim_{x \rightarrow 0+} x^x = e^0 = 1``.

###### Example: rewriting ``\infty - \infty``

A limit ``\lim_{x \rightarrow c} f(x) - g(x)`` of indeterminate form ``\infty - \infty`` can be re-expressed to be of the form ``0/0`` through the transformation:

```math
\begin{align*}
f(x) - g(x) &= f(x)g(x) \cdot \left(\frac{1}{g(x)} - \frac{1}{f(x)}\right) \\
&= \frac{\frac{1}{g(x)} - \frac{1}{f(x)}}{\frac{1}{f(x)g(x)}}.
\end{align*}
```

Applying this to

```math
L = \lim_{x \rightarrow 1} \left(\frac{x}{x-1} - \frac{1}{\log(x)}\right),
```

we get that ``L`` is equal to the following limit:

```math
\lim_{x \rightarrow 1} \frac{\log(x) - \frac{x-1}{x}}{\frac{x-1}{x} \log(x)}
=
\lim_{x \rightarrow 1} \frac{x\log(x)-(x-1)}{(x-1)\log(x)}.
```

In `SymPy` we have:

```julia
𝒇 = x*log(x) - (x-1)
𝒈 = (x-1)*log(x)
𝒇(1), 𝒈(1)
```

L'Hospital's rule applies to the form ``0/0``, so we try:

```julia
𝒇′ = diff(𝒇, x)
𝒈′ = diff(𝒈, x)
𝒇′(1), 𝒈′(1)
```

Again, we get the indeterminate form ``0/0``, so we try again with second derivatives:

```julia
𝒇′′ = diff(𝒇, x, x)
𝒈′′ = diff(𝒈, x, x)
𝒇′′(1), 𝒈′′(1)
```

From this we see the limit is ``1/2``, as could have been done directly:

```julia
limit(𝒇/𝒈, x=>1)
```

## The assumptions are necessary

##### Example: the limit existing is necessary

The following limit is *easily* seen by comparing terms of largest growth:

```math
1 = \lim_{x \rightarrow \infty} \frac{x - \sin(x)}{x}.
```

However, the limit of the ratio of the derivatives does *not* exist:

```math
\lim_{x \rightarrow \infty} \frac{1 - \cos(x)}{1},
```

as the function just oscillates. This shows that L'Hospital's rule does not apply when the limit of the ratio of the derivatives does not exist.

##### Example: the assumptions matter

This example comes from the thesis of Gruntz to highlight possible issues when computer systems do simplifications.

Consider:

```math
\lim_{x \rightarrow \infty} \frac{1/2\sin(2x) +x}{\exp(\sin(x))\cdot(\cos(x)\sin(x)+x)}.
```

If we apply L'Hospital's rule using simplification, we have:

```julia
u(x) = 1//2*sin(2x) + x
v(x) = exp(sin(x))*(cos(x)*sin(x) + x)
up, vp = diff(u(x),x), diff(v(x),x)
limit(simplify(up/vp), x => oo)
```

However, this answer is incorrect. The reason is subtle.
The simplification cancels a term of ``\cos(x)`` that appears in both the numerator and denominator. Before cancellation, `vp` has infinitely many zeros as ``x`` approaches ``\infty``, so L'Hospital's rule won't apply (the limit won't exist, as every ``2\pi`` the ratio is undefined, so the function is never eventually close to some ``L``).

This ratio has no limit, as it oscillates, as confirmed by `SymPy`:

```julia
limit(u(x)/v(x), x=> oo)
```

## Questions

###### Question

This function ``f(x) = \sin(5x)/x`` is *indeterminate* at ``x=0``. What type?

```julia; echo=false
lh_choices = [
"``0/0``",
"``\\infty/\\infty``",
"``0^0``",
"``\\infty - \\infty``",
"``0 \\cdot \\infty``"
]
nothing
```

```julia; hold=true; echo=false
answ = 1
radioq(lh_choices, answ, keep_order=true)
```

###### Question

This function ``f(x) = \sin(x)^{\sin(x)}`` is *indeterminate* at ``x=0``. What type?

```julia; hold=true; echo=false
answ = 3
radioq(lh_choices, answ, keep_order=true)
```

###### Question

This function ``f(x) = (x-2)/(x^2 - 4)`` is *indeterminate* at ``x=2``. What type?

```julia; hold=true; echo=false
answ = 1
radioq(lh_choices, answ, keep_order=true)
```

###### Question

This function ``f(x) = (g(x+h) - g(x-h)) / (2h)`` (``g`` is continuous) is *indeterminate* at ``h=0``. What type?

```julia; hold=true; echo=false
answ = 1
radioq(lh_choices, answ, keep_order=true)
```

###### Question

This function ``f(x) = x \log(x)`` is *indeterminate* at ``x=0``. What type?

```julia; hold=true; echo=false
answ = 5
radioq(lh_choices, answ, keep_order=true)
```

###### Question

Does L'Hospital's rule apply to this limit:

```math
\lim_{x \rightarrow \pi} \frac{\sin(\pi x)}{\pi x}.
```

```julia; hold=true; echo=false
choices = [
"Yes. It is of the form ``0/0``",
"No. It is not indeterminate"
]
answ = 2
radioq(choices, answ)
```
###### Question

Use L'Hospital's rule to find the limit

```math
L = \lim_{x \rightarrow 0} \frac{4x - \sin(x)}{x}.
```

What is ``L``?

```julia; hold=true; echo=false
f(x) = (4x - sin(x))/x
L = float(N(limit(f, 0)))
numericq(L)
```

###### Question

Use L'Hospital's rule to find the limit

```math
L = \lim_{x \rightarrow 0} \frac{\sqrt{1+x} - 1}{x}.
```

What is ``L``?

```julia; hold=true; echo=false
f(x) = (sqrt(1+x) - 1)/x
L = float(N(limit(f, 0)))
numericq(L)
```

###### Question

Use L'Hospital's rule *one* or more times to find the limit

```math
L = \lim_{x \rightarrow 0} \frac{x - \sin(x)}{x^3}.
```

What is ``L``?

```julia; hold=true; echo=false
f(x) = (x - sin(x))/x^3
L = float(N(limit(f, 0)))
numericq(L)
```

###### Question

Use L'Hospital's rule *one* or more times to find the limit

```math
L = \lim_{x \rightarrow 0} \frac{1 - x^2/2 - \cos(x)}{x^3}.
```

What is ``L``?

```julia; hold=true; echo=false
f(x) = (1 - x^2/2 - cos(x))/x^3
L = float(N(limit(f, 0)))
numericq(L)
```

###### Question

Use L'Hospital's rule *one* or more times to find the limit

```math
L = \lim_{x \rightarrow \infty} \frac{\log(\log(x))}{\log(x)}.
```

What is ``L``?

```julia; hold=true; echo=false
f(x) = log(log(x))/log(x)
L = N(limit(f(x), x=> oo))
numericq(L)
```

###### Question

By using a common denominator to rewrite this expression, use L'Hospital's rule to find the limit

```math
L = \lim_{x \rightarrow 0} \frac{1}{x} - \frac{1}{\sin(x)}.
```

What is ``L``?

```julia; hold=true; echo=false
f(x) = 1/x - 1/sin(x)
L = float(N(limit(f, 0)))
numericq(L)
```

###### Question

Use L'Hospital's rule to find the limit

```math
L = \lim_{x \rightarrow \infty} \log(x)/x.
```

What is ``L``?

```julia; hold=true; echo=false
L = float(N(limit(log(x)/x, x=>oo)))
numericq(L)
```

###### Question

Using L'Hospital's rule, does

```math
\lim_{x \rightarrow 0+} x^{\log(x)}
```

exist?

Consider ``x^{\log(x)} = e^{\log(x)\log(x)}``.

```julia; hold=true; echo=false
yesnoq(false)
```

###### Question

Using L'Hospital's rule, find the limit of

```math
\lim_{x \rightarrow 1} (2-x)^{\tan(\pi/2 \cdot x)}.
```

(Hint: express this as ``e^{\tan(\pi/2 \cdot x) \cdot \log(2-x)}`` and take the limit of the resulting exponent.)

```julia; hold=true; echo=false
choices = [
"``e^{2/\\pi}``",
"``{2\\pi}``",
"``1``",
"``0``",
"It does not exist"
]
answ = 1
radioq(choices, answ)
```
diff --git a/CwJ/derivatives/linearization.jmd b/CwJ/derivatives/linearization.jmd
deleted file mode 100644
index 8570a79..0000000
--- a/CwJ/derivatives/linearization.jmd
+++ /dev/null
@@ -1,806 +0,0 @@

# Linearization

This section uses these add-on packages:

```julia
using CalculusWithJulia
using Plots
using SymPy
using TaylorSeries
using DualNumbers
```

```julia; echo=false; results="hidden"
using CalculusWithJulia.WeaveSupport

const frontmatter = (
    title = "Linearization",
    description = "Calculus with Julia: Linearization",
    tags = ["CalculusWithJulia", "derivatives", "linearization"],
);

nothing
```

----

The derivative of $f(x)$ can be interpreted as the slope of the tangent line, the line that best approximates the function at the point.

Using the point-slope form of a line, we see that the tangent line to the graph of $f(x)$ at $(c,f(c))$ is given by:

```math
y = f(c) + f'(c) \cdot (x - c).
```

This is written as an equation, though we prefer to work with functions within `Julia`. Here we write such a function as an operator - it takes a function `f` and returns a function representing the tangent line.
- -```julia; eval=false -tangent(f, c) = x -> f(c) + f'(c) * (x - c) -``` - -(Recall, the `->` indicates that an anonymous function is being generated.) - -This function along with the `f'` notation for automatic derivatives is defined in the -`CalculusWithJulia` package. - - -We make some graphs with tangent lines: - -```julia;hold=true -f(x) = x^2 -plot(f, -3, 3) -plot!(tangent(f, -1)) -plot!(tangent(f, 2)) -``` - -The graph shows that near the point, the line and function are close, -but this need not be the case away from the point. We can express this informally as - -```math -f(x) \approx f(c) + f'(c) \cdot (x-c) -``` - -with the understanding this applies for $x$ "close" to $c$. - -Usually for the applications herein, instead of ``x`` and ``c`` the two points are ``x+\Delta_x`` and ``x``. This gives: - -> *Linearization*: ``\Delta_y = f(x +\Delta_x) - f(x) \approx f'(x) \Delta_x``, for small ``\Delta_x``. - - -This section gives some implications of this fact and quantifies what -"close" can mean. - -##### Example - -There are several approximations that are well known in physics, due to their widespread usage: - -* That $\sin(x) \approx x$ around $x=0$: - -```julia;hold=true -plot(sin, -pi/2, pi/2) -plot!(tangent(sin, 0)) -``` - -Symbolically: - -```julia; hold=true -@syms x -c = 0 -f(x) = sin(x) -f(c) + diff(f(x),x)(c) * (x - c) -``` - - -* That $\log(1 + x) \approx x$ around $x=0$: - -```julia; hold=true; -f(x) = log(1 + x) -plot(f, -1/2, 1/2) -plot!(tangent(f, 0)) -``` - - -Symbolically: - -```julia; hold=true -@syms x -c = 0 -f(x) = log(1 + x) -f(c) + diff(f(x),x)(c) * (x - c) -``` - -(The `log1p` function implements a more accurate version of this function when numeric values are needed.) 
- - -* That $1/(1-x) \approx 1 + x$ around $x=0$: - -```julia; hold=true; -f(x) = 1/(1-x) -plot(f, -1/2, 1/2) -plot!(tangent(f, 0)) -``` - -Symbolically: - -```julia; hold=true -@syms x -c = 0 -f(x) = 1 / (1 - x) -f(c) + diff(f(x),x)(c) * (x - c) -``` - - - - -* That ``(1+x)^n \approx 1 + nx`` around ``x = 0``. For example, with ``n=5`` - -```julia; hold=true; -n = 5 -f(x) = (1+x)^n # f'(0) = n = n(1+x)^(n-1) at x=0 -plot(f, -1/2, 1/2) -plot!(tangent(f, 0)) -``` - - -Symbolically: - -```julia; hold=true -@syms x, n::real -c = 0 -f(x) = (1 + x)^n -f(c) + diff(f(x),x)(x=>c) * (x - c) -``` - ----- - -In each of these cases, a more complicated non-linear function -is well approximated in a region of interest by a simple linear -function. - -## Numeric approximations - -```julia; hold=true; echo=false -f(x) = sin(x) -a, b = -1/4, pi/2 - -p = plot(f, a, b, legend=false); -plot!(p, x->x, a, b); -plot!(p, [0,1,1], [0, 0, 1], color=:brown); -plot!(p, [1,1], [0, sin(1)], color=:green, linewidth=4); -annotate!(p, collect(zip([1/2, 1+.075, 1/2-1/8], [.05, sin(1)/2, .75], ["Δx", "Δy", "m=dy/dx"]))); -p -``` - -The plot shows the tangent line with slope $dy/dx$ and the actual -change in $y$, $\Delta y$, for some specified $\Delta x$. The small -gap above the sine curve is the error that would result were the value of the sine approximated using the drawn tangent line. We can see that approximating -the value of $\Delta y = \sin(c+\Delta x) - \sin(c)$ with the often -easier to compute $(dy/dx) \cdot \Delta x = f'(c)\Delta x$ is not going to be too far off provided $\Delta x$ is not too large. - -This approximation is known as linearization. It can be used both in -theoretical computations and in practical applications. To see how -effective it is, we look at some examples.
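Before turning to the worked examples, these classic approximations can be checked numerically (a quick sketch, not in the original text; the value `x = 0.05` is an arbitrary choice of a point near `0`):

```julia
# Each classic tangent-line approximation should be close to its function
# near x = 0; the differences below are all small and shrink as x -> 0.
x = 0.05
(sin(x) - x,               # tiny
 log(1 + x) - x,           # small
 1/(1 - x) - (1 + x),      # small
 (1 + x)^5 - (1 + 5x))     # largest of the four, but still small
```

The differences grow with the size of the second derivative at `0`, as the error analysis later in this section explains.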
- -##### Example - -If $f(x) = \sin(x)$, $c=0$ and $\Delta x= 0.1$ then the values for the actual change in the function values and the value of $\Delta y$ are: - -```julia; -f(x) = sin(x) -c, deltax = 0, 0.1 -f(c + deltax) - f(c), f'(c) * deltax -``` - -The values are pretty close. But what is $0.1$ radians? Let's use degrees. Suppose we have $\Delta x = 10^\circ$: - -```julia; -deltax⁰ = 10*pi/180 -actual = f(c + deltax⁰) - f(c) -approx = f'(c) * deltax⁰ -actual, approx -``` - - -They agree until the third decimal value. The *percentage error* is just $1/2$ a percent: - -```julia; -(approx - actual) / actual * 100 -``` - - -### Relative error or relative change - -The relative error is defined by - -```math -\big| \frac{\text{actual} - \text{approximate}}{\text{actual}} \big|. -``` - -However, typically with linearization, we talk about the *relative change*, not relative error, as the denominator is easier to compute. This is - -```math -\frac{f(x + \Delta_x) - f(x)}{f(x)} = \frac{\Delta_y}{f(x)} \approx -\frac{f'(x) \cdot \Delta_x}{f(x)} -``` - -The *percentage change* multiplies by ``100``. - - -##### Example - -What is the relative change in surface area of a sphere if the radius changes from ``r`` to ``r + dr``? - -We have ``S = 4\pi r^2`` so the approximate relative change, ``dS/S``, is given, using the derivative ``dS/dr = 8\pi r``, by - -```math -\frac{8\pi\cdot r\cdot dr}{4\pi r^2} = \frac{2\cdot dr}{r}. -``` - - -##### Example - -We are traveling ``60`` miles. At ``60`` miles an hour, we will take ``60`` minutes (or one hour). How long will it take at ``70`` miles an hour? (Assume you can't divide, but, instead, can only multiply!) - - -Well, the answer is $60/70$ hours or $60/70 \cdot 60$ minutes. But we -can't divide, so we turn this into a multiplication problem via some algebra: - -```math -\frac{60}{70} = \frac{60}{60 + 10} = \frac{1}{1 + 10/60} = \frac{1}{1 + 1/6}. -``` - -Okay, so far no calculator was needed.
We wrote $70 = 60 + 10$, as we -know that $60/60$ is just $1$. This almost gets us there. If we really -don't want to divide, we can get an answer by using the tangent line -approximation for $1/(1+x)$ around $x=0$. This is $1/(1+x) \approx 1 - -x$. (You can check by finding that $f'(0) = -1$.) Thus, our answer is -approximately $5/6$ of an hour or 50 minutes. - -How much in error are we? - -```julia; -abs(50 - 60/70*60) / (60/70*60) * 100 -``` - -That's about $3$ percent. Not bad, considering we could have done all -the above in our head while driving without taking our eyes off the -road to use the calculator on our phone for a division. - -##### Example - -A ``10``cm by ``10``cm by ``10``cm cube will contain ``1`` liter -(``1000``cm``^3``). In manufacturing such a cube, the side lengths are -actually $10.1$ cm. What will be the volume in liters? Compute this -with a linear approximation to $(10.1)^3$. - -Here $f(x) = x^3$ and we are asked to approximate $f(10.1)$. Letting $c=10$, we have: - -```math -f(c + \Delta) \approx f(c) + f'(c) \cdot \Delta = 1000 + f'(c) \cdot (0.1) -``` - -Computing the derivative can be done easily; we get for our answer: - -```julia; -fp(x) = 3*x^2 -c₀, Delta = 10, 0.1 -approx₀ = 1000 + fp(c₀) * Delta -``` - -As a percentage, the relative error of this approximation is: - -```julia; -actual₀ = 10.1^3 -(actual₀ - approx₀)/actual₀ * 100 -``` - -The manufacturer may be interested instead in comparing the volume of the actual object to the $1$ liter target. They might use the approximate value for this comparison, which would yield: - -```julia; -(1000 - approx₀)/approx₀ * 100 -``` - -This is off by about $3$ percent. Not so bad for some applications, devastating for others. - - -##### Example: Eratosthenes and the circumference of the earth - -[Eratosthenes](https://en.wikipedia.org/wiki/Eratosthenes) is said to have been the first person to estimate the radius (or by relation the circumference) of the earth.
The basic idea is based on the difference of shadows cast by the sun. Suppose Eratosthenes sized the circumference as ``252,000`` *stadia*. Taking ``1`` stadion as ``160`` meters and the actual radius of the earth as ``6378.137`` kilometers, we can convert to see that Eratosthenes estimated the radius as ``6417`` kilometers. - - -If Eratosthenes were to have estimated the volume of a spherical earth, what would be the approximate percentage change between his estimate and the actual value? - -Using ``V = 4/3 \pi r^3`` we get ``V' = 4\pi r^2``: - -```julia -rₑ = 6417 -rₐ = 6378.137 -Δᵣ = rₑ - rₐ -Vₛ(r) = 4/3 * pi * r^3 -Δᵥ = Vₛ'(rₑ) * Δᵣ -Δᵥ / Vₛ(rₑ) * 100 -``` - -##### Example: a simple pendulum - -A *simple* pendulum is composed of a massless "bob" on a rigid "rod" -of length $l$. The rod swings back and forth making an angle $\theta$ -with the perpendicular. At rest, $\theta=0$; here we have $\theta$ swinging with $\lvert\theta\rvert \leq \theta_0$ -for some $\theta_0$. - -According to [Wikipedia](http://tinyurl.com/yz5sz7e) - and many -introductory physics books - while swinging, the angle $\theta$ varies -with time following this equation: - -```math -\theta''(t) + \frac{g}{l} \sin(\theta(t)) = 0. -``` - -That is, the second derivative of $\theta$ is proportional to the sine -of $\theta$ where the proportionality constant involves $g$ from -gravity and the length of the "rod." - -This would be much easier if the second derivative were proportional to the angle $\theta$ and not its sine. - -[Huygens](http://en.wikipedia.org/wiki/Christiaan_Huygens) used the -approximation of $\sin(x) \approx x$, noted above, to say that when -the angle is not too big, we have the pendulum's swing obeying -$\theta''(t) = -g/l \cdot \theta(t)$. Without getting too involved in why, -we can verify by taking two derivatives that $\theta_0\sin(\sqrt{g/l}\cdot t)$ will be a solution to this -modified equation.
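That claim can be checked numerically with a central-difference approximation to the second derivative (a quick sketch, not in the original; the values of `g`, `l`, and `θ₀` below are arbitrary illustrative choices):

```julia
# θ(t) = θ₀ sin(√(g/l)·t) should satisfy the linearized pendulum equation
# θ''(t) + (g/l)·θ(t) = 0; approximate θ'' with a central second difference.
g, l, θ₀ = 9.8, 2.0, 0.3           # arbitrary illustrative values
θ(t) = θ₀ * sin(sqrt(g/l) * t)
θpp(t; h=1e-4) = (θ(t + h) - 2θ(t) + θ(t - h)) / h^2
θpp(1.0) + (g/l) * θ(1.0)           # ≈ 0, up to finite-difference error
```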
- -With this solution, the motion is periodic with constant amplitude (assuming frictionless behaviour), as -the sine function is. More surprisingly, the period is found from $T = -2\pi/(\sqrt{g/l}) = 2\pi \sqrt{l/g}$. It depends on $l$ - longer -"rods" take more time to swing back and forth - but does not depend -on how widely the pendulum swings (provided $\theta_0$ -is not so big the approximation of $\sin(x) \approx x$ fails). This -latter fact may be surprising, though not to Galileo, who discovered -it. - -## Differentials - -The Leibniz notation for a derivative is ``dy/dx`` indicating the -change in ``y`` as ``x`` changes. It proves convenient to decouple -this using *differentials* ``dx`` and ``dy``. What do these notations -mean? They measure change along the tangent line in the same way -``\Delta_x`` and ``\Delta_y`` measure change for the function. The differential ``dy`` depends on both ``x`` and ``dx``, it being defined by ``dy=f'(x)dx``. As tangent lines locally represent a function, ``dy`` and ``dx`` are often associated with an *infinitesimal* difference. - -Taking ``dx = \Delta_x``, as in the previous graphic, we can compare ``dy`` -- the change along the tangent line given by ``dy/dx \cdot dx`` -- and ``\Delta_y`` -- the change along the function given by ``f(x + \Delta_x) - f(x)``. The linear approximation, ``f(x + \Delta_x) - f(x)\approx f'(x)dx``, says that - -```math -\Delta_y \approx dy; \quad \text{ when } \Delta_x = dx -``` - - - - -## The error in approximation - -How good is the approximation? Graphically we can see it is pretty -good for the graphs we choose, but are there graphs out there for -which the approximation is not so good? Of course. However, we can -say this (the -[Lagrange](http://en.wikipedia.org/wiki/Taylor%27s_theorem) form of a -more general Taylor remainder theorem): - -> Let ``f(x)`` be twice differentiable on ``I=(a,b)``, ->continuous on ``[a,b]``, and let -> ``a < c < b``. 
Then for any ``x`` in ``I``, there exists some value ``\xi`` between ``c`` and ``x`` such that -> ``f(x) = f(c) + f'(c)(x-c) + (f''(\xi)/2)\cdot(x-c)^2``. - - -That is, the error is basically a constant depending on the concavity -of $f$ times a quadratic function centered at $c$. - -For $\sin(x)$ at $c=0$ we get $\lvert\sin(x) - x\rvert = \lvert-\sin(\xi)\cdot x^2/2\rvert$. -Since $\lvert\sin(\xi)\rvert \leq 1$, we must have this bound: -$\lvert\sin(x) - x\rvert \leq x^2/2$. - - -Can we verify? Let's do so graphically: - -```julia; hold=true -h(x) = abs(sin(x) - x) -g(x) = x^2/2 -plot(h, -2, 2, label="h") -plot!(g, -2, 2, label="g") -``` - -The graph shows a tight bound near ``0`` and a looser bound over the rest of this viewing window. - - -Similarly, for $f(x) = \log(1 + x)$ we have the following at $c=0$: - -```math -f'(x) = 1/(1+x), \quad f''(x) = -1/(1+x)^2. -``` - -So, as $f(c)=0$ and $f'(c) = 1$, we have - -```math -\lvert f(x) - x\rvert \leq \lvert f''(\xi)\rvert \cdot \frac{x^2}{2} -``` - -We see that $\lvert f''(x)\rvert$ is decreasing for $x > -1$. So if $-1 < x < c$ we have - -```math -\lvert f(x) - x\rvert \leq \lvert f''(x)\rvert \cdot \frac{x^2}{2} = \frac{x^2}{2(1+x)^2}. -``` - -And for $c=0 < x$, we have - -```math -\lvert f(x) - x\rvert \leq \lvert f''(0)\rvert \cdot \frac{x^2}{2} = x^2/2. -``` - - -Plotting, we verify the bound on ``|\log(1+x)-x|``: - -```julia; hold=true -h(x) = abs(log(1+x) - x) -g(x) = x < 0 ? x^2/(2*(1+x)^2) : x^2/2 -plot(h, -0.5, 2, label="h") -plot!(g, -0.5, 2, label="g") -``` - -Again, we see the very close bound near ``0``, which widens at the edges of the viewing window. - -### Why is the remainder term as it is? - -To see formally why the remainder is as it is, we recall the mean value -theorem in the extended form of Cauchy. Suppose $c=0$, $x > 0$, and let $h(x) = f(x) - (f(0) + -f'(0) x)$ and $g(x) = x^2$. 
Then we have that there exists an $e$ with -$0 < e < x$ such that - -```math -\text{error} = h(x) - h(0) = (g(x) - g(0)) \frac{h'(e)}{g'(e)} = x^2 \cdot \frac{1}{2} \cdot \frac{f'(e) - f'(0)}{e} = -x^2 \cdot \frac{1}{2} \cdot f''(\xi). -``` - -The value of $\xi$, from the mean value theorem applied to $f'(x)$, -satisfies $0 < \xi < e < x$, so is in $[0,x].$ - -### The big (and small) "oh" - -`SymPy` can find the tangent line expression as a special case of its `series` function (which implements [Taylor series](../taylor_series_polynomials.html)). The `series` function needs an expression to approximate; a specified variable, as there may be parameters in the expression; a value ``c`` for *where* the expansion is taken, with default ``0``; and a number of terms, for this example ``2`` for a constant and linear term. (There is also an optional `dir` argument for one-sided expansions.) - - - - -Here we see the answer provided for $e^{\sin(x)}$: - -```julia; -@syms x -series(exp(sin(x)), x, 0, 2) -``` - -The expression $1 + x$ comes from the fact that `exp(sin(0))` is $1$, and the derivative `exp(sin(0)) * cos(0)` is *also* $1$. But what is the $\mathcal{O}(x^2)$? - -We know the answer is *precisely* $f''(\xi)/2 \cdot x^2$ for some ``\xi``, but if we are only concerned about the *scale* of the error as $x$ goes to zero, it is enough to know that, when ``f''`` is continuous, the error divided by ``x^2`` goes to some finite value (``f''(0)/2``). More generally, if the error divided by ``x^2`` is *bounded* as ``x`` goes to ``0``, then we say the error is "big oh" of ``x^2``. - -The [big](http://en.wikipedia.org/wiki/Big_O_notation) "oh" notation, -``f(x) = \mathcal{O}(g(x))``, says that the ratio ``f(x)/g(x)`` is -bounded as ``x`` goes to ``0`` (or some other value ``c``, depending -on the context). A little "oh" (e.g., ``f(x) = \mathcal{o}(g(x))``) -would mean that the limit ``f(x)/g(x)`` would be ``0``, as -``x\rightarrow 0``, a much stronger assertion.
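The boundedness can be illustrated numerically (a sketch, not in the original text; for ``f(x) = e^{\sin(x)}`` we have ``f''(0) = 1``, so the ratio should settle near ``1/2``):

```julia
# The error in the tangent-line approximation 1 + x to exp(sin(x)),
# divided by x^2, stays bounded as x -> 0, approaching f''(0)/2 = 1/2.
f(x) = exp(sin(x))
err(x) = f(x) - (1 + x)
[err(x) / x^2 for x in (0.1, 0.01, 0.001)]   # each ratio ≈ 0.5
```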
- -Big "oh" and little "oh" give us a sense of how good an approximation -is without being bogged down in the details of the exact value. As -such they are useful guides in focusing on what is primary and what is -secondary. Applying this to our case, we have this rough form of the -tangent line approximation valid for functions having a continuous second -derivative at ``c``: - -```math -f(x) = f(c) + f'(c)(x-c) + \mathcal{O}((x-c)^2). -``` - -##### Example: the algebra of tangent line approximations - -Suppose $f(x)$ and $g(x)$ are represented by their tangent lines about $c$, respectively: - - -```math -\begin{align*} -f(x) &= f(c) + f'(c)(x-c) + \mathcal{O}((x-c)^2), \\ -g(x) &= g(c) + g'(c)(x-c) + \mathcal{O}((x-c)^2). -\end{align*} -``` - -Consider the sum; after rearranging we have: - -```math -\begin{align*} -f(x) + g(x) &= \left(f(c) + f'(c)(x-c) + \mathcal{O}((x-c)^2)\right) + \left(g(c) + g'(c)(x-c) + \mathcal{O}((x-c)^2)\right)\\ -&= \left(f(c) + g(c)\right) + \left(f'(c)+g'(c)\right)(x-c) + \mathcal{O}((x-c)^2). -\end{align*} -``` - -The two big "Oh" terms become just one, as a constant times $(x-c)^2$ plus a constant times $(x-c)^2$ is just some other constant times $(x-c)^2$. What we can read off from this is that the term multiplying $(x-c)$ is just the derivative of $f(x) + g(x)$ (from the sum rule), so this too is a tangent line approximation. - - -Is it a coincidence that a basic algebraic operation with tangent line approximations produces a tangent line approximation?
Let's try multiplication: - -```math -\begin{align*} -f(x) \cdot g(x) &= [f(c) + f'(c)(x-c) + \mathcal{O}((x-c)^2)] \cdot [g(c) + g'(c)(x-c) + \mathcal{O}((x-c)^2)]\\ -&= [f(c) + f'(c)(x-c)] \cdot [g(c) + g'(c)(x-c)] + [f(c) + f'(c)(x-c)] \cdot \mathcal{O}((x-c)^2) + [g(c) + g'(c)(x-c)] \cdot \mathcal{O}((x-c)^2) + [\mathcal{O}((x-c)^2)]^2\\ -&= [f(c) + f'(c)(x-c)] \cdot [g(c) + g'(c)(x-c)] + \mathcal{O}((x-c)^2)\\ -&= f(c) \cdot g(c) + [f'(c)\cdot g(c) + f(c)\cdot g'(c)] \cdot (x-c) + [f'(c)\cdot g'(c) \cdot (x-c)^2 + \mathcal{O}((x-c)^2)] \\ -&= f(c) \cdot g(c) + [f'(c)\cdot g(c) + f(c)\cdot g'(c)] \cdot (x-c) + \mathcal{O}((x-c)^2) -\end{align*} -``` - -The big "oh" notation just sweeps up many things, including any products of it *and* the term $f'(c)\cdot g'(c) \cdot (x-c)^2$. Again, we see from the product rule that this is just a tangent line approximation for $f(x) \cdot g(x)$. - -The basic mathematical operations involving tangent lines can be computed just using the tangent lines when the desired accuracy is at the tangent line level. This is even true for composition, though there the outer and inner functions may have different "$c$"s. - -Knowing this can simplify the task of finding tangent line approximations of compound expressions. - -For example, suppose we know that at $c=0$ we have these formulas, where $a \approx b$ is a shorthand for the more formal $a=b + \mathcal{O}(x^2)$: - -```math -\sin(x) \approx x, \quad e^x \approx 1 + x, \quad \text{and}\quad 1/(1+x) \approx 1 - x. -``` - -Then we can immediately see these tangent line approximations about $x=0$: - - -```math -e^x \cdot \sin(x) \approx (1+x) \cdot x = x + x^2 \approx x, -``` - -and - -```math -\frac{\sin(x)}{e^x} \approx \frac{x}{1 + x} \approx x \cdot(1-x) = x-x^2 \approx x. -``` - - -Since $\sin(0) = 0$, we can use these to find the tangent line approximation of - -```math -e^{\sin(x)} \approx e^x \approx 1 + x.
-``` - - -Note that $\sin(\exp(x))$ is approximately $\sin(1+x)$ but not approximately $1+x$, as the expansion for $\sin$ about $1$ is not simply $x$. - - -### The TaylorSeries package - -The `TaylorSeries` package will do these calculations in a manner similar to how `SymPy` transforms a function and a symbolic variable into a symbolic expression. - -For example, we have - -```julia -t = Taylor1(Float64, 1) -``` - -The number type and the order is specified to the constructor. Linearization is order ``1``; other orders will be discussed later. This variable can now be composed with mathematical functions and the linearization of the function will be returned: - -```julia -sin(t), exp(t), 1/(1+t) -``` - -```julia -sin(t)/exp(t), exp(sin(t)) -``` - - -##### Example: Automatic differentiation - -Automatic differentiation (forward mode) essentially uses this technique. A "dual" is introduced which has terms ``a + b\epsilon`` where ``\epsilon^2 = 0``. -The ``\epsilon`` is like ``x`` in a linear expansion, so the `a` coefficient encodes the value and the `b` coefficient reflects the derivative at the value. The input value is treated like a variable, so its "b coefficient" is a `1`. Here then is how `0` is encoded: - -```julia; -Dual(0, 1) -``` - -Then what is ``\sin(x)`` at this value? It should reflect both ``(\sin(0), \cos(0))``, the latter being the derivative of ``\sin`` at ``0``. We can see this is *almost* what is computed behind the scenes through: - -```julia; hold=true -x = Dual(0, 1) -@code_lowered sin(x) -``` - -This output of `@code_lowered` can be confusing, but this simple case needn't be. Working from the end, we see an assignment to a variable named `%7` of `Dual(%3, %6)`. The value of `%3` is `sin(x)` where `x` is the value `0` above. The value of `%6` is `cos(x)` *times* the value `1` above (the `xp`), which reflects the *chain* rule being used. (The derivative of `sin(u)` is `cos(u)*du`.)
So this dual number encodes both the function value at `0` and the derivative of the function at `0`.) - -Similarly, we can see what happens to `log(x)` at `1` (encoded by `Dual(1,1)`): - -```julia; hold=true -x = Dual(1, 1) -@code_lowered log(x) -``` - -We can see the derivative again reflects the chain rule, it being given by `1/x * xp` where `xp` acts like `dx` (from assignments `%5` and `%4`). Comparing the two outputs, we see only the assignment to `%4` differs, it reflecting the derivative of the function. - - - - -## Questions - -###### Question - -What is the right linear approximation for $\sqrt{1 + x}$ near $0$? - -```julia; hold=true; echo=false -choices = [ -"``1 + 1/2``", -"``1 + x^{1/2}``", -"``1 + (1/2) \\cdot x``", -"``1 - (1/2) \\cdot x``"] -answ = 3 -radioq(choices, answ) -``` - - -###### Question - - -What is the right linear approximation for $(1 + x)^k$ near $0$? - -```julia; hold=true; echo=false -choices = [ -"``1 + k``", -"``1 + x^k``", -"``1 + k \\cdot x``", -"``1 - k \\cdot x``"] -answ = 3 -radioq(choices, answ) -``` - -###### Question - -What is the right linear approximation for $\cos(\sin(x))$ near $0$? - -```julia; hold=true; echo=false -choices = [ -"``1``", -"``1 + x``", -"``x``", -"``1 - x^2/2``" -] -answ = 1 -radioq(choices, answ) -``` - - -###### Question - -What is the right linear approximation for $\tan(x)$ near $0$? - -```julia; hold=true; echo=false -choices = [ -"``1``", -"``x``", -"``1 + x``", -"``1 - x``" -] -answ = 2 -radioq(choices, answ) -``` - - - -###### Question - -What is the right linear approximation of $\sqrt{25 + x}$ near $x=0$? - -```julia; hold=true; echo=false -choices = [ -"``5 \\cdot (1 + (1/2) \\cdot (x/25))``", -"``1 - (1/2) \\cdot x``", -"``1 + x``", -"``25``" -] -answ = 1 -radioq(choices, answ) -``` - - - -###### Question - - -Let $f(x) = \sqrt{x}$. Find the actual error in approximating $f(26)$ by the -value of the tangent line at $(25, f(25))$ at $x=26$. 
- -```julia; hold=true; echo=false -tgent(x) = 5 + x/10 -answ = tgent(1) - sqrt(26) -numericq(answ) -``` - -###### Question - -An estimate of some quantity was $12.34$; the actual value was $12$. What was the *percentage error*? - -```julia; hold=true; echo=false -est = 12.34 -act = 12.0 -answ = (est -act)/act * 100 -numericq(answ) -``` - - -###### Question - -Find the percentage error in estimating $\sin(5^\circ)$ by $5 \pi/180$. - -```julia; hold=true; echo=false -tl(x) = x -x0 = 5 * pi/180 -est = x0 -act = sin(x0) -answ = (est -act)/act * 100 -numericq(answ) -``` - -###### Question - -The side length of a square is measured roughly to be $2.0$ cm. The actual length is $2.2$ cm. What is the difference in area (in absolute value) as *estimated* by a tangent line approximation? - -```julia; hold=true; echo=false -tl(x) = 4 + 4x -answ = tl(.2) - 4 -numericq(abs(answ)) -``` - - -###### Question - -The [Birthday problem](https://en.wikipedia.org/wiki/Birthday_problem) computes the probability that in a group of ``n`` people, under some assumptions, no two share a birthday. Without trying to spoil the problem, we focus on the calculus-specific part of the problem below: - -```math -\begin{align*} -p -&= \frac{365 \cdot 364 \cdot \cdots (365-n+1)}{365^n} \\ -&= \frac{365(1 - 0/365) \cdot 365(1 - 1/365) \cdot 365(1-2/365) \cdot \cdots \cdot 365(1-(n-1)/365)}{365^n}\\ -&= (1 - \frac{0}{365})\cdot(1 -\frac{1}{365})\cdot \cdots \cdot (1-\frac{n-1}{365}). -\end{align*} -``` - -Taking logarithms, we have ``\log(p)`` is - -```math -\log(1 - \frac{0}{365}) + \log(1 -\frac{1}{365})+ \cdots + \log(1-\frac{n-1}{365}). 
-``` - -Now, use the tangent line approximation for ``\log(1 - x)`` and the sum formula for ``0 + 1 + 2 + \dots + (n-1)`` to simplify the value of ``\log(p)``: - -```julia; hold=true; echo=false -choices = ["``-n(n-1)/2/365``", - "``-n(n-1)/2\\cdot 365``", - "``-n^2/(2\\cdot 365)``", - "``-n^2 / 2 \\cdot 365``"] -radioq(choices, 1, keep_order=true) -``` - - -If ``n = 10``, what is the approximation for ``p`` (not ``\log(p)``)? - -```julia; hold=true; echo=false -n=10 -val = exp(-n*(n-1)/2/365) -numericq(val) -``` - -If ``n=100``, what is the approximation for ``p`` (not ``\log(p)``? - -```julia; hold=true; echo=false -n=100 -val = exp(-n*(n-1)/2/365) -numericq(val, 1e-2) -``` diff --git a/CwJ/derivatives/mean-value.js b/CwJ/derivatives/mean-value.js deleted file mode 100644 index 05550a5..0000000 --- a/CwJ/derivatives/mean-value.js +++ /dev/null @@ -1,23 +0,0 @@ -// https://jsxgraph.uni-bayreuth.de/wiki/index.php?title=Mean_Value_Theorem -var board = JXG.JSXGraph.initBoard('jsxgraph', {boundingbox: [-5, 10, 7, -6], axis:true}); -board.suspendUpdate(); -var p = []; -p[0] = board.create('point', [-1,-2], {size:2}); -p[1] = board.create('point', [6,5], {size:2}); -p[2] = board.create('point', [-0.5,1], {size:2}); -p[3] = board.create('point', [3,3], {size:2}); -var f = JXG.Math.Numerics.lagrangePolynomial(p); -var graph = board.create('functiongraph', [f,-10, 10]); - -var g = function(x) { - return JXG.Math.Numerics.D(f)(x)-(p[1].Y()-p[0].Y())/(p[1].X()-p[0].X()); -}; - -var r = board.create('glider', [ - function() { return JXG.Math.Numerics.root(g,(p[0].X()+p[1].X())*0.5); }, - function() { return f(JXG.Math.Numerics.root(g,(p[0].X()+p[1].X())*0.5)); }, - graph], {name:' ',size:4,fixed:true}); -board.create('tangent', [r], {strokeColor:'#ff0000'}); -line = board.create('line',[p[0],p[1]],{strokeColor:'#ff0000',dash:1}); - -board.unsuspendUpdate(); diff --git a/CwJ/derivatives/mean_value_theorem.jmd b/CwJ/derivatives/mean_value_theorem.jmd deleted file mode 100644 
index af71977..0000000 --- a/CwJ/derivatives/mean_value_theorem.jmd +++ /dev/null @@ -1,710 +0,0 @@ -# The mean value theorem for differentiable functions. - -This section uses these add-on packages: - -```julia -using CalculusWithJulia -using Plots -using Roots - -``` - -```julia; echo=false; results="hidden" -using CalculusWithJulia.WeaveSupport -using Printf -using SymPy - -fig_size = (800, 600) - -const frontmatter = ( - title = "The mean value theorem for differentiable functions.", - description = "Calculus with Julia: The mean value theorem for differentiable functions.", - tags = ["CalculusWithJulia", "derivatives", "the mean value theorem for differentiable functions."], -); - -nothing -``` - ----- - -A function is *continuous* at $c$ if $f(c+h) - f(c) \rightarrow 0$ as $h$ goes to $0$. We can write that as ``f(c+h) - f(c) = \epsilon_h``, with ``\epsilon_h`` denoting a function going to ``0`` as ``h \rightarrow 0``. With this notion, differentiability could be written as ``f(c+h) - f(c) - f'(c)h = \epsilon_h \cdot h``. This is clearly a more demanding requirement than mere continuity at ``c``. - -We defined a function to be *continuous* on an interval $I=(a,b)$ if -it was continuous at each point $c$ in $I$. Similarly, we define a -function to be *differentiable* on the interval $I$ if it is differentiable -at each point $c$ in $I$. - -This section looks at properties of differentiable functions. As the definition is more stringent, perhaps more properties follow from it. - -## Differentiable is more restrictive than continuous. - -Let $f$ be a differentiable function on $I=(a,b)$. We see that -``f(c+h) - f(c) = f'(c)h + \epsilon_h\cdot h = h(f'(c) + \epsilon_h)``. The right hand side will clearly go to ``0`` as ``h\rightarrow 0``, so ``f`` will be continuous. In short: - -> A differentiable function on $I=(a,b)$ is continuous on $I$. - -Is it possible that all continuous functions are differentiable?
- -The fact that the derivative is related to the tangent line's slope -might give an indication that this won't be the case - we just need a -function which is continuous but has a point with no tangent line. The -usual suspect is $f(x) = \lvert x\rvert$ at $0$. - -```julia; hold=true -f(x) = abs(x) -plot(f, -1,1) -``` - -We can see formally that the secant line expression will not have a -limit when $c=0$ (the left limit is $-1$, the right limit $1$). But -more insight is gained by looking at the shape of the graph. At the origin, the graph -is vee-shaped. There is no linear function that approximates this function -well. The function is just not smooth enough, as it has a kink. - - -There are other functions that have kinks. These are often associated -with powers. For example, at $x=0$ this function will not have a -derivative: - -```julia; hold=true; -f(x) = (x^2)^(1/3) -plot(f, -1, 1) -``` - - -Other functions have tangent lines that become vertical. The natural slope would be $\infty$, but this isn't a limiting answer (except in the extended sense we don't apply to the definition of derivatives). A candidate for this case is the cube root function: - -```julia; -plot(cbrt, -1, 1) -``` - -The derivative at $0$ would need to be $+\infty$ to match the -graph. This is implied by the formula for the derivative from the -power rule: $f'(x) = 1/3 \cdot x^{-2/3}$, which has a vertical -asymptote at $x=0$. - - -!!! note - The `cbrt` function is used above, instead of `f(x) = x^(1/3)`, as the - latter is not defined for negative `x`. Though it can be for the exact - power `1/3`, it can't be for an exact power like `1/2`. This means the - value of the argument is important in determining the type of the - output - and not just the type of the argument. Having type-stable - functions is part of the magic to making `Julia` run fast, so `x^c` is - not defined for negative `x` and most floating point exponents.
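The note can be illustrated directly (a small sketch, not in the original text):

```julia
# `cbrt` is defined for negative values, but a floating-point power is not:
cbrt(-8.0)           # -2.0
val = try
    (-8.0)^(1/3)     # 1/3 is a Float64, so this throws a DomainError
catch err
    err isa DomainError
end
val                  # true
```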
- - -Lest you think that continuous functions always have derivatives -except perhaps at exceptional points, this isn't the case. The -functions used to -[model](http://tinyurl.com/cpdpheb) the -stock market are continuous but have no points where they are -differentiable. - - - - - - -## Derivatives and maxima. - -We have defined an *absolute maximum* of $f(x)$ over an interval to be -a value $f(c)$ for a point $c$ in the interval that is as large as any -other value in the interval. Just specifying a function and an -interval does not guarantee an absolute maximum, but specifying a -*continuous* function and a *closed* interval does, by the extreme value theorem. - -> *A relative maximum*: We say $f(x)$ has a *relative maximum* at $c$ -> if there exists *some* interval $I=(a,b)$ with $a < c < b$ for which -> $f(c)$ is an absolute maximum for $f$ over $I$. - -The difference is a bit subtle: for an absolute maximum the interval -must also be specified; for a relative maximum there just needs to -exist some interval, possibly really small, though it must be bigger -than a point. - -!!! note - A hiker can appreciate the difference. A relative maximum would be the - crest of any hill, but an absolute maximum would be the summit. - -What does this have to do with derivatives? - -[Fermat](http://science.larouchepac.com/fermat/fermat-maxmin.pdf), -perhaps with insight from Kepler, was interested in maxima of -polynomial functions. As a warm-up, he considered a line segment $AC$ and a point $E$ -with the task of choosing $E$ so that $(E-A) \times (C-E)$ is a maximum. We might recognize this as -finding the maximum of $f(x) = (x-A)\cdot(C-x)$ for some $A < -C$. Geometrically, we know this to be at the midpoint, as the equation -is a parabola, but Fermat was interested in an algebraic solution that -led to more generality. - -He takes $b=AC$ and $a=AE$. Then the product is $a \cdot (b-a) = -ab - a^2$.
He then perturbs this, writing $AE=a+e$; the new -product is $(a+e) \cdot (b - a - e)$. Equating the two, and canceling -like terms, gives $be = 2ae + e^2$. He cancels the $e$ and basically -comments that this must be true for all $e$ even as $e$ goes to $0$, -so $b = 2a$ and the value is at the midpoint. - -In a more modern approach, this would be the same as looking at this expression: - -```math -\frac{f(x+e) - f(x)}{e} = 0. -``` - -Working on the left hand side, for non-zero $e$ we can cancel the -common $e$ terms, and then let $e$ become $0$. This becomes a problem -in solving $f'(x)=0$. Fermat could compute the derivative for any -polynomial by taking a limit, a task we would do now by the power -rule and the sum and difference of function rules. - - -This insight holds for other types of functions: - -> If $f(c)$ is a relative maximum then either $f'(c) = 0$ or the -> derivative at $c$ does not exist. - -When the derivative exists, this says the tangent line is flat. (If it -had a slope, then the function would increase by moving left or -right, as appropriate, a point we pursue later.) - - -For a continuous function $f(x)$, call a point $c$ in the domain of -$f$ where either $f'(c)=0$ or the derivative does not exist a **critical** -**point**. - - -We can combine Bolzano's extreme value theorem with Fermat's insight to get the following: - -> A continuous function on $[a,b]$ has an absolute maximum that occurs -> at a critical point $c$, $a < c < b$, or an endpoint, $a$ or $b$. - -A similar statement holds for an absolute minimum. This gives a -restricted set of places to look for absolute maximum and minimum values - all the critical points and the endpoints. - -It is also the case that all relative extrema occur at a critical point, *however* not all critical points correspond to relative extrema. We will see *derivative tests* that help characterize when that occurs.
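Fermat's warm-up problem can be explored numerically (a hypothetical sketch with ``A=0`` and ``C=4``; the finite-difference slope and bisection helper below are our own illustration, not from the text):

```julia
# Maximize f(x) = (x - A)(C - x) by locating the zero of an approximate
# derivative; the answer should be the midpoint (A + C)/2 = 2.
A, C = 0.0, 4.0
f(x) = (x - A) * (C - x)
fp(x; h=1e-6) = (f(x + h) - f(x - h)) / (2h)   # central-difference slope

function bisect_zero(g, a, b; n=60)            # simple bisection for g(x) = 0
    for _ in 1:n
        m = (a + b) / 2
        g(a) * g(m) <= 0 ? (b = m) : (a = m)
    end
    (a + b) / 2
end

bisect_zero(fp, 0.5, 3.5)   # ≈ 2.0, the midpoint of [A, C]
```

The same zero-finding step is done more robustly later with `find_zeros` from the `Roots` package.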

```julia;hold=true; echo=false;
### {{{lhopital_32}}}
imgfile = "figures/lhopital-32.png"
caption = L"""
Image number ``32`` from L'Hopital's calculus book (the first) showing that
at a relative minimum, the tangent line is parallel to the
$x$-axis. This of course is true when the tangent line is well defined,
by Fermat's observation.
"""
ImageFile(:derivatives, imgfile, caption)
```

### Numeric derivatives

The `ForwardDiff` package provides a means to numerically compute derivatives without approximations at a point. In `CalculusWithJulia` this is extended to find derivatives of functions, and the `'` notation is overloaded for function objects. Hence these two give nearly identical answers, the difference being only the type of number used:

```julia;hold=true
f(x) = 3x^3 - 2x
fp(x) = 9x^2 - 2
f'(3), fp(3)
```

##### Example

For the function $f(x) = x^2 \cdot e^{-x}$ find the absolute maximum over the interval $[0, 5]$.

We have that $f(x)$ is continuous on the closed interval of the
question, and in fact differentiable on $(0,5)$, so any critical point
will be a zero of the derivative. We can check for these with:

```julia;
f(x) = x^2 * exp(-x)
cps = find_zeros(f', -1, 6) # find_zeros in `Roots`
```

We get $0$ and $2$ are critical points. The endpoints are $0$ and
$5$. So the absolute maximum over this interval is either at $0$, $2$,
or $5$:

```julia;
f(0), f(2), f(5)
```

We see that $f(2)$ is then the maximum.

A few things. First, `find_zeros` can miss some roots, in particular
endpoints and roots that just touch $0$. We should graph to verify it
didn't miss any. Second, it can be easier sometimes to check the values using
the "dot" notation.
If `f`, `a`, and `b` are the function and the interval's endpoints,
then this would typically follow this pattern:

```julia
a, b = 0, 5
critical_pts = find_zeros(f', a, b)
f.(critical_pts), f(a), f(b)
```

For this problem, we have the left endpoint repeated, but in general
this won't be a point where the derivative is zero.

As an aside, the output above is not a single container. To achieve that, the values can be combined before the broadcasting:

```julia
f.(vcat(a, critical_pts, b))
```

##### Example

For the function $g(x) = e^x\cdot(x^3 - x)$ find the absolute maximum over the interval $[0, 2]$.

We follow the same pattern. Since $g(x)$ is continuous on the closed interval and differentiable on the open interval, we know that the absolute maximum must occur at an endpoint ($0$ or $2$) or a critical point where $g'(c)=0$. To solve for these, we have again:

```julia;
g(x) = exp(x) * (x^3 - x)
gcps = find_zeros(g', 0, 2)
```

And checking values gives:

```julia;
g.(vcat(0, gcps, 2))
```

Here the maximum occurs at an endpoint. The critical point $c=0.67\dots$
does not produce a maximum value. Rather, $g(0.67\dots)$ is an absolute
minimum.

!!! note
    **Absolute minimum** We haven't discussed the parallel problem of
    absolute minima over a closed interval. By considering the function
    $h(x) = - f(x)$, we see that anything true for an absolute
    maximum should hold in a related manner for an absolute minimum; in
    particular, an absolute minimum on a closed interval will only occur
    at a critical point or an endpoint.

## Rolle's theorem

Let $f(x)$ be differentiable on $(a,b)$ and continuous on
$[a,b]$. Then the absolute maximum occurs at an endpoint or where the
derivative is ``0`` (as the derivative is always defined).
This gives rise to:

> *[Rolle's](http://en.wikipedia.org/wiki/Rolle%27s_theorem) theorem*: For $f$ differentiable on ``(a,b)`` and continuous on ``[a,b]``, if $f(a)=f(b)$, then there exists some $c$ in $(a,b)$ with $f'(c) = 0$.

This modest observation opens the door to many relationships between a function and its derivative, as it ties the two together in one statement.

To see why Rolle's theorem is true, we assume that $f(a)=0$; otherwise,
consider $g(x)=f(x)-f(a)$. By the extreme value theorem, there must be
an absolute maximum and minimum. If $f(x)$ is ever positive, then the
absolute maximum occurs in $(a,b)$ - not at an endpoint - so at a
critical point where the derivative is $0$. Similarly if $f(x)$ is
ever negative. Finally, if $f(x)$ is just $0$, then take any $c$ in
$(a,b)$.

The statement in Rolle's theorem speaks to existence. It doesn't give
a recipe to find $c$. It just guarantees that there is *one* or *more*
values in the interval $(a,b)$ where the derivative is $0$, if we
assume differentiability on $(a,b)$ and continuity on $[a,b]$.

##### Example

Let $j(x) = e^x \cdot x \cdot (x-1)$. We know $j(0)=0$ and $j(1)=0$,
so on $[0,1]$ Rolle's theorem
guarantees that we can find *at least* one answer (unless numeric
issues arise):

```julia;
j(x) = exp(x) * x * (x-1)
find_zeros(j', 0, 1)
```

This graph illustrates the lone value for $c$ for this problem

```julia; echo=false
x0 = find_zero(j', (0, 1))
plot([j, x->j(x0) + 0*(x-x0)], 0, 1)
```

## The mean value theorem

We are driving south and in one hour cover 70 miles. If the speed
limit is 65 miles per hour, were we ever speeding? Well, we averaged
more than the speed limit, so we know the answer is yes, but why?
Speeding would mean our instantaneous speed was more than the speed
limit, yet we only know for sure our *average* speed was more than the
speed limit.
The mean value theorem tells us that if some conditions are met,
then at some point (possibly more than one) we must have that our
instantaneous speed is equal to our average speed.

The mean value theorem is a direct generalization of Rolle's theorem.

> *Mean value theorem*: Let $f(x)$ be differentiable on $(a,b)$ and
> continuous on $[a,b]$. Then there exists a value $c$ in $(a,b)$
> where $f'(c) = (f(b) - f(a)) / (b - a)$.

This says for any secant line between $a < b$ there will
be a parallel tangent line at some $c$ with $a < c < b$ (all provided $f$
is differentiable on $(a,b)$ and continuous on $[a,b]$).

This graph illustrates the theorem. The orange line is the secant
line. A parallel line tangent to the graph is guaranteed by the mean
value theorem. In this figure, there are two such lines, rendered
in red.

```julia; hold=true; echo=false
f(x) = x^3 - x
a, b = -2, 1.75
m = (f(b) - f(a)) / (b-a)
cps = find_zeros(x -> f'(x) - m, a, b)

p = plot(f, a-1, b+1, linewidth=3, legend=false)
plot!(x -> f(a) + m*(x-a), a-1, b+1, linewidth=3, color=:orange)
scatter!([a,b], [f(a), f(b)])

for cp in cps
    plot!(x -> f(cp) + f'(cp)*(x-cp), a-1, b+1, color=:red)
end
p
```

Like Rolle's theorem, this is a guarantee that something exists, not a
recipe to find it. In fact, the mean value theorem is just Rolle's
theorem applied to:

```math
g(x) = f(x) - (f(a) + (f(b) - f(a)) / (b-a) \cdot (x-a))
```

That is, the function $f(x)$ minus the secant line between $(a,f(a))$ and $(b, f(b))$.

```julia; hold=true; echo=false
# Need to bring jsxgraph into PLUTO
#caption = """
#Illustration of the mean value theorem from
#[jsxgraph](https://jsxgraph.uni-bayreuth.de/).
#The polynomial function interpolates the points ``A``,``B``,``C``, and ``D``.
#Adjusting these creates different functions.
#Regardless of the
#function -- which as a polynomial will always be continuous and
#differentiable -- the slope of the secant line between ``A`` and ``B`` is always
#matched by **some** tangent line between the points ``A`` and ``B``.
#"""
#JSXGraph(:derivatives, "mean-value.js", caption)
nothing
```

```=html
<div id="jsxgraph"></div>
```

```ojs
//| echo: false
//| output: false

JXG = require("jsxgraph");

board = JXG.JSXGraph.initBoard('jsxgraph', {boundingbox: [-5, 10, 7, -6], axis:true});
p = [
    board.create('point', [-1,-2], {size:2}),
    board.create('point', [6,5], {size:2}),
    board.create('point', [-0.5,1], {size:2}),
    board.create('point', [3,3], {size:2})
];
f = JXG.Math.Numerics.lagrangePolynomial(p);
graph = board.create('functiongraph', [f,-10, 10]);

g = function(x) {
    return JXG.Math.Numerics.D(f)(x)-(p[1].Y()-p[0].Y())/(p[1].X()-p[0].X());
};

r = board.create('glider', [
    function() { return JXG.Math.Numerics.root(g,(p[0].X()+p[1].X())*0.5); },
    function() { return f(JXG.Math.Numerics.root(g,(p[0].X()+p[1].X())*0.5)); },
    graph], {name:' ',size:4,fixed:true});
board.create('tangent', [r], {strokeColor:'#ff0000'});
line = board.create('line',[p[0],p[1]],{strokeColor:'#ff0000',dash:1});
```

This interactive example can also be found at [jsxgraph](http://jsxgraph.uni-bayreuth.de/wiki/index.php?title=Mean_Value_Theorem). It shows a cubic polynomial fit to the ``4`` adjustable points labeled A through D. The secant line is drawn between points A and B with a dashed line. A tangent line -- with the same slope as the secant line -- is identified at a point ``(\alpha, f(\alpha))`` where ``\alpha`` is between the points A and B. That this can always be done is a consequence of the mean value theorem.

##### Example

The mean value theorem is an extremely useful tool to relate properties of a function with properties of its derivative, as, like Rolle's theorem, it includes both ``f`` and ``f'`` in its statement.

For example, suppose we have a function $f(x)$ and we know that the
derivative is **always** $0$. What can we say about the function?

Well, constant functions have derivatives that are constantly $0$.
But do any others? We will see the answer is no: if a function has a zero derivative in ``(a,b)`` it must be a constant.
We can readily see that if ``f`` is a polynomial function this is the case, as we can differentiate a polynomial function, and the derivative is identically zero only if **all** of its coefficients are ``0``, which would mean the polynomial has no non-constant terms. But polynomials are not representative of all functions, and so a proof requires a bit more effort.

Suppose it is known that $f'(x)=0$ on some interval ``I`` and we take any ``a < b`` in ``I``. Since $f'(x)$ always exists, $f(x)$ is always differentiable, and
hence always continuous. So on $[a,b]$ the conditions of the mean
value theorem apply. That is, there is a $c$ in ``(a,b)`` with $(f(b) - f(a)) / (b-a) =
f'(c) = 0$. But this would imply $f(b) - f(a)=0$. That is, $f(x)$ is a
constant, as for any $a$ and $b$, we see $f(a)=f(b)$.

### The Cauchy mean value theorem

[Cauchy](http://en.wikipedia.org/wiki/Mean_value_theorem#Cauchy.27s_mean_value_theorem)
offered an extension to the mean value theorem above. Suppose both $f$
and $g$ satisfy the conditions of the mean value theorem on $[a,b]$ with $g(b)-g(a) \neq 0$;
then there exists at least one $c$ with $a < c < b$ such that

```math
f'(c) = g'(c) \cdot \frac{f(b) - f(a)}{g(b) - g(a)}.
```

The proof follows by considering $h(x) = f(x) - r\cdot g(x)$, with $r$ chosen so that $h(a)=h(b)$. Then Rolle's theorem applies, so that there is a $c$ with $h'(c)=0$, giving $f'(c) = r g'(c)$; but $r$ can be seen to be $(f(b)-f(a))/(g(b)-g(a))$, which proves the theorem.

Letting $g(x) = x$ demonstrates that the mean value theorem is a special case.

##### Example

Suppose $f(x)$ and $g(x)$ satisfy the Cauchy mean value theorem on
$[0,x]$, $g'(x)$ is non-zero on $(0,x)$, and $f(0)=g(0)=0$. Then we have:

```math
\frac{f(x) - f(0)}{g(x) - g(0)} = \frac{f(x)}{g(x)} = \frac{f'(c)}{g'(c)},
```

for some $c$ in $(0,x)$.
If $\lim_{x \rightarrow 0} f'(x)/g'(x) = L$,
then the right hand side will have a limit of $L$, and hence the left
hand side will too. That is, when the limit exists, we have under
these conditions that $\lim_{x\rightarrow 0}f(x)/g(x) =
\lim_{x\rightarrow 0}f'(x)/g'(x)$.

This could be used to prove the limit of $\sin(x)/x$ as $x$ goes to
$0$ just by showing the limit of $\cos(x)/1$ is $1$, as is known by
continuity.

### Visualizing the Cauchy mean value theorem

The Cauchy mean value theorem can be visualized in terms of a tangent
line and a *parallel* secant line in a similar manner as the mean
value theorem, as long as a *parametric* graph is used. A parametric
graph plots the points $(g(t), f(t))$ for some range of $t$. That is,
it graphs *both* functions at the same time. The following illustrates
the construction of such a graph:

```julia; hold=true; echo=false; cache=true
### {{{parametric_fns}}}

function parametric_fns_graph(n)
    f = (x) -> sin(x)
    g = (x) -> x

    ns = (1:10)/10
    ts = range(-pi/2, stop=-pi/2 + ns[n] * pi, length=100)

    plt = plot(f, g, -pi/2, -pi/2 + ns[n] * pi, legend=false, size=fig_size,
               xlim=(-1.1,1.1), ylim=(-pi/2-.1, pi/2+.1))
    scatter!(plt, [f(ts[end])], [g(ts[end])], color=:orange, markersize=5)
    val = @sprintf("% 0.2f", ts[end])
    annotate!(plt, [(0, 1, "t = $val")])
end
caption = L"""

Illustration of parametric graph of $(g(t), f(t))$ for $-\pi/2 \leq t
\leq \pi/2$ with $g(x) = \sin(x)$ and $f(x) = x$. Each point on the
graph is from some value $t$ in the interval. We can see that the
graph goes through $(0,0)$ as that is when $t=0$. As well, it must go
through $(1, \pi/2)$ as that is when $t=\pi/2$.

"""

n = 10
anim = @animate for i=1:n
    parametric_fns_graph(i)
end

imgfile = tempname() * ".gif"
gif(anim, imgfile, fps = 1)

ImageFile(imgfile, caption)
```

With $g(x) = \sin(x)$ and $f(x) = x$, we can take $I=[a,b] =
[0, \pi/2]$.
In the figure below, the *secant line*, drawn in red, connects
$(g(a), f(a))$ with the point $(g(b), f(b))$, and hence
has slope $\Delta f/\Delta g$. The parallel lines drawn show the *tangent* lines with slope $f'(c)/g'(c)$. Two exist for this problem; the Cauchy mean value theorem guarantees at least one will.

```julia; hold=true; echo=false
g(x) = sin(x)
f(x) = x
ts = range(-pi/2, stop=pi/2, length=50)
a,b = 0, pi/2
m = (f(b) - f(a))/(g(b) - g(a))
cps = find_zeros(x -> f'(x)/g'(x) - m, -pi/2, pi/2)
c = cps[1]
Delta = (0 + m * (c - 0)) - (g(c))

p = plot(g, f, -pi/2, pi/2, linewidth=3, legend=false)
plot!(x -> f(a) + m * (x - g(a)), -1, 1, linewidth=3, color=:red)
scatter!([g(a),g(b)], [f(a), f(b)])
for c in cps
    plot!(x -> f(c) + m * (x - g(c)), -1, 1, color=:orange)
end

p
```

## Questions

###### Question

Rolle's theorem is a guarantee of a value, but does not provide a recipe to find it. For the function $1 - x^2$ over the interval $[-5,5]$, find a value $c$ that satisfies the result.

```julia; hold=true; echo=false
c = 0
numericq(c)
```

###### Question

The extreme value theorem is a guarantee of a value, but does not provide a recipe to find it. For the function $f(x) = \sin(x)$ on $I=[0, \pi]$ find a value $c$ satisfying the theorem for an absolute maximum.

```julia; hold=true; echo=false
c = pi/2
numericq(c)
```

###### Question

The extreme value theorem is a guarantee of a value, but does not provide a recipe to find it. For the function $f(x) = \sin(x)$ on $I=[\pi, 3\pi/2]$ find a value $c$ satisfying the theorem for an absolute maximum.

```julia; hold=true; echo=false
c = pi
numericq(c)
```

###### Question

The mean value theorem is a guarantee of a value, but does not provide a recipe to find it. For $f(x) = x^2$ on $[0,2]$ find a value of $c$ satisfying the theorem.

```julia; hold=true; echo=false
c = 1
numericq(c)
```

###### Question

The Cauchy mean value theorem is a guarantee of a value, but does not provide a recipe to find it. For $f(x) = x^3$ and $g(x) = x^2$, find a value $c$ in the interval $(1, 2)$ satisfying its conclusion.

```julia; hold=true; echo=false
c,x = symbols("c, x", real=true)
val = solve(3c^2 / (2c) - (2^3 - 1^3) / (2^2 - 1^2), c)[1]
numericq(float(val))
```

###### Question

Will the function $f(x) = x + 1/x$ satisfy the conditions of the mean value theorem over $[-1/2, 1/2]$?

```julia; hold=true; echo=false
radioq(["Yes", "No"], 2)
```

###### Question

Just as it is a fact that $f'(x) = 0$ (for all $x$ in $I$) implies
$f(x)$ is a constant, so too is it a fact that if $f'(x) = g'(x)$ then
$f(x) - g(x)$ is a constant. What function would you consider, if you
wanted to prove this with the mean value theorem?

```julia; hold=true; echo=false
choices = [
"``h(x) = f(x) - (f(b) - f(a)) / (b - a)``",
"``h(x) = f(x) - (f(b) - f(a)) / (b - a) \\cdot g(x)``",
"``h(x) = f(x) - g(x)``",
"``h(x) = f'(x) - g'(x)``"
]
answ = 3
radioq(choices, answ)
```

###### Question

Suppose $f''(x) > 0$ on $I$. Why is it impossible that $f'(x) = 0$ at more than one value in $I$?

```julia; hold=true; echo=false
choices = [
L"It isn't. The function $f(x) = x^2$ has two zeros and $f''(x) = 2 > 0$",
"By Rolle's theorem, there is at least one, and perhaps more",
L"By the mean value theorem, we must have $f'(b) - f'(a) > 0$ whenever $b > a$. This means $f'(x)$ is increasing and can't double back to have more than one zero."
]
answ = 3
radioq(choices, answ)
```

###### Question

Let $f(x) = 1/x$. For $0 < a < b$, find $c$ so that $f'(c) = (f(b) - f(a)) / (b-a)$.

```julia; hold=true; echo=false
choices = [
"``c = (a+b)/2``",
"``c = \\sqrt{ab}``",
"``c = 1 / (1/a + 1/b)``",
"``c = a + (\\sqrt{5} - 1)/2 \\cdot (b-a)``"
]
answ = 2
radioq(choices, answ)
```

###### Question

Let $f(x) = x^2$. For $0 < a < b$, find $c$ so that $f'(c) = (f(b) - f(a)) / (b-a)$.

```julia; hold=true; echo=false
choices = [
"``c = (a+b)/2``",
"``c = \\sqrt{ab}``",
"``c = 1 / (1/a + 1/b)``",
"``c = a + (\\sqrt{5} - 1)/2 \\cdot (b-a)``"
]
answ = 1
radioq(choices, answ)
```

###### Question

In an example, we used the fact that if $0 < c < x$, for some $c$ given by the mean value theorem, and $f(x)$ goes to $0$ as $x$ goes to zero, then $f(c)$ will also go to zero. Suppose we say that $c=g(x)$ for some function $g$.

Why is it known that $g(x)$ goes to $0$ as $x$ goes to zero (from the right)?

```julia; hold=true; echo=false
choices = [L"The squeeze theorem applies, as $0 < g(x) < x$.",
L"As $f(x)$ goes to zero by Rolle's theorem it must be that $g(x)$ goes to $0$.",
L"This follows by the extreme value theorem, as there must be some $c$ in $[0,x]$."]
answ = 1
radioq(choices, answ)
```

Since $g(x)$ goes to zero, why is it true that if $f(x)$ goes to $L$ as $x$ goes to zero that $f(g(x))$ must also have a limit $L$?

```julia; hold=true; echo=false
choices = ["It isn't true. The limit must be 0",
L"The squeeze theorem applies, as $0 < g(x) < x$",
"This follows from the limit rules for composition of functions"]
answ = 3
radioq(choices, answ)
```
diff --git a/CwJ/derivatives/more_zeros.jmd b/CwJ/derivatives/more_zeros.jmd
deleted file mode 100644
index c6c1f98..0000000
--- a/CwJ/derivatives/more_zeros.jmd
+++ /dev/null
@@ -1,530 +0,0 @@
# Derivative-free alternatives to Newton's method

This section uses these add-on packages:

```julia
using CalculusWithJulia
using Plots
using ImplicitEquations
using Roots
using SymPy
```

```julia; echo=false; results="hidden"
using CalculusWithJulia.WeaveSupport
const frontmatter = (
    title = "Derivative-free alternatives to Newton's method",
    description = "Calculus with Julia: Derivative-free alternatives to Newton's method",
    tags = ["CalculusWithJulia", "derivatives", "derivative-free alternatives to newton's method"],
);

nothing
```

----

Newton's method is not the only algorithm of its kind for identifying zeros of a function. In this section we discuss some alternatives.

## The `find_zero(f, x0)` function

The function `find_zero` from the `Roots` package provides several different algorithms for finding a zero of a function, including some derivative-free
algorithms for finding zeros when started with an initial
guess. The default method is similar to Newton's method in that only a good initial
guess is needed. However, the algorithm, while possibly slower in terms of
function evaluations and steps, is engineered to be a bit more
robust to the choice of initial estimate than Newton's method. (If it
finds a bracket, it will use a bisection algorithm which is guaranteed to
converge, but can be slower to do so.)
Here we see how to call the
function:

```julia;
f(x) = cos(x) - x
x₀ = 1
find_zero(f, x₀)
```

Compare to this related call which uses the bisection method:

```julia;
find_zero(f, (0, 1)) ## [0,1] must be a bracketing interval
```

For this example both give the same answer, but the bisection method
is a bit less convenient as a bracketing interval must be pre-specified.

## The secant method

The default `find_zero` method above uses a secant-like method unless a bracketing interval is found. The secant method is historic, dating back over ``3000`` years. Here we discuss the secant method in a more general framework.

One way to view Newton's method is through the inverse of ``f`` (assuming it exists): if ``f(\alpha) = 0`` then ``\alpha = f^{-1}(0)``.

If ``f`` has a simple zero at ``\alpha`` and is locally invertible (that is, some ``f^{-1}`` exists) then the update step for Newton's method can be identified with:

* fitting a polynomial to the local inverse function of ``f`` going through the point ``(f(x_0),x_0)``,
* and matching the slope of ``f^{-1}`` (the reciprocal of the slope of ``f``) at the same point.

That is, we can write ``g(y) = h_0 + h_1 (y-f(x_0))``. Then ``g(f(x_0)) = x_0 = h_0``, so ``h_0 = x_0``. From ``g'(f(x_0)) = 1/f'(x_0)``, we get ``h_1 = 1/f'(x_0)``. That is, ``g(y) = x_0 + (y-f(x_0))/f'(x_0)``. At ``y=0,`` we get the update step ``x_1 = g(0) = x_0 - f(x_0)/f'(x_0)``.

A similar viewpoint can be used to create derivative-free methods.

For example, the [secant method](https://en.wikipedia.org/wiki/Secant_method) can be seen as the result of fitting a degree-``1`` polynomial approximation for ``f^{-1}`` through two points, ``(f(x_0),x_0)`` and ``(f(x_1), x_1)``.

Again, expressing this approximation as ``g(y) = h_0 + h_1(y-f(x_1))`` leads to ``g(f(x_1)) = x_1 = h_0``.
Substituting ``f(x_0)`` gives ``g(f(x_0)) = x_0 = x_1 + h_1(f(x_0)-f(x_1))``. Solving for ``h_1`` leads to ``h_1=(x_1-x_0)/(f(x_1)-f(x_0))``.
Then ``x_2 = g(0) = x_1 - (x_1-x_0)/(f(x_1)-f(x_0)) \cdot f(x_1)``. This is the first step of the secant method:

```math
x_{n+1} = x_n - f(x_n) \frac{x_n - x_{n-1}}{f(x_n) - f(x_{n-1})}.
```

That is, where the next step of Newton's method comes from the intersection of the tangent line at ``x_n`` with the ``x``-axis, the next step of the secant method comes from the intersection of the secant line defined by ``x_n`` and ``x_{n-1}`` with the ``x``-axis. That is, the secant method simply replaces ``f'(x_n)`` with the slope of the secant line between ``x_n`` and ``x_{n-1}``.

We code the update step as `λ2`:

```julia;
λ2(f0,f1,x0,x1) = x1 - f1 * (x1-x0) / (f1-f0)
```

Then we can run a few steps to identify the zero of sine starting at ``3`` and ``4``

```julia; hold=true; term=true
x0,x1 = 4,3
f0,f1 = sin.((x0,x1))
@show x1,f1

x0,x1 = x1, λ2(f0,f1,x0,x1)
f0,f1 = f1, sin(x1)
@show x1,f1

x0,x1 = x1, λ2(f0,f1,x0,x1)
f0,f1 = f1, sin(x1)
@show x1,f1

x0,x1 = x1, λ2(f0,f1,x0,x1)
f0,f1 = f1, sin(x1)
@show x1,f1

x0,x1 = x1, λ2(f0,f1,x0,x1)
f0,f1 = f1, sin(x1)
x1,f1
```

Like Newton's method, the secant method converges quickly for this problem (though its rate is less than the quadratic rate of Newton's method).

This method is included in `Roots` as `Secant()` (or `Order1()`):

```julia;
find_zero(sin, (4,3), Secant())
```

Though the slope of the secant line is related to the derivative, the two agree only in the limit. The convergence of the secant method is not as fast as Newton's method, though at each step of the secant method only one new function evaluation is needed, so it can be more efficient for functions that are expensive to compute or differentiate.

Let ``\epsilon_{n+1} = x_{n+1}-\alpha``, where ``\alpha`` is assumed to be the *simple* zero of ``f(x)`` that the secant method converges to.
A [calculation](https://math.okstate.edu/people/binegar/4513-F98/4513-l08.pdf) shows that

```math
\begin{align*}
\epsilon_{n+1} &\approx \frac{x_n-x_{n-1}}{f(x_n)-f(x_{n-1})} \frac{(1/2)f''(\alpha)(\epsilon_n-\epsilon_{n-1})}{x_n-x_{n-1}} \epsilon_n \epsilon_{n-1}\\
& \approx \frac{f''(\alpha)}{2f'(\alpha)} \epsilon_n \epsilon_{n-1}\\
&= C \epsilon_n \epsilon_{n-1}.
\end{align*}
```

The constant ``C`` is similar to that for Newton's method, and reveals potential troubles for the secant method similar to those of Newton's method: a poor initial guess (the initial error is too big), the second derivative too large, the first derivative too flat near the answer.

Assuming the error term has the form ``\epsilon_{n+1} = A|\epsilon_n|^\phi`` and substituting into the above leads to the equation

```math
\frac{A^{1+1/\phi}}{C} = |\epsilon_n|^{1 - \phi + 1/\phi}.
```

The left side being a constant suggests ``\phi`` solves ``1 - \phi + 1/\phi = 0``, or ``\phi^2 -\phi - 1 = 0``. The positive solution is the golden ratio, ``(1 + \sqrt{5})/2 \approx 1.618\dots``.

### Steffensen's method

Steffensen's method is a secant-like method that converges with ``|\epsilon_{n+1}| \approx C |\epsilon_n|^2``. The secant is taken between the points ``(x_n,f(x_n))`` and ``(x_n + f(x_n), f(x_n + f(x_n)))``. Like Newton's method this requires ``2`` function evaluations per step. Steffensen's method is implemented through `Roots.Steffensen()`. Steffensen's method is more sensitive to the initial guess than other methods, so in practice it must be used with care, though it is a starting point for many higher-order derivative-free methods.

## Inverse quadratic interpolation

Inverse quadratic interpolation fits a quadratic polynomial through three points, not just two like the secant method, the third being ``(f(x_2), x_2)``.

For example, here is the inverse quadratic function, ``g(y)``, going through three points marked with red dots. The blue dot is found from ``(g(0), 0)``.

```julia; hold=true; echo=false

a,b,c = 1,2,3
fa,fb,fc = -1,1/4,1
g(y) = (y-fb)*(y-fa)/(fc-fb)/(fc-fa)*c + (y-fc)*(y-fa)/(fb-fc)/(fb-fa)*b + (y-fc)*(y-fb)/(fa-fc)/(fa-fb)*a
ys = range(-2,2, length=100)
xs = g.(ys)
plot(xs, ys, legend=false)
scatter!([a,b,c],[fa,fb,fc], color=:red, markersize=5)
scatter!([g(0)],[0], color=:blue, markersize=5)
plot!(zero, color=:blue)
```

Here we use `SymPy` to identify the degree-``2`` polynomial as a function of ``y``, then evaluate it at ``y=0`` to find the next step:

```julia
@syms y hs[0:2] xs[0:2] fs[0:2]
H(y) = sum(hᵢ*(y - fs[end])^i for (hᵢ,i) ∈ zip(hs, 0:2))

eqs = [H(fᵢ) ~ xᵢ for (xᵢ, fᵢ) ∈ zip(xs, fs)]
ϕ = solve(eqs, hs)
hy = subs(H(y), ϕ)
```

The value of `hy` at ``y=0`` yields the next guess based on the past three, and is given by:

```julia;
q⁻¹ = hy(y => 0)
```

Though the above can be simplified quite a bit when computed by hand, here we simply make this a function with `lambdify` which we will use below.

```julia;
λ3 = lambdify(q⁻¹) # fs, then xs
```

(`SymPy`'s `lambdify` function, by default, picks the order of its arguments lexicographically; in this case they will be the `f` values then the `x` values.)

An inverse quadratic step is utilized by Brent's method, when possible, to yield a rapidly convergent bracketing algorithm implemented as a default zero finder in many software languages. `Julia`'s `Roots` package implements the method in `Roots.Brent()`. An inverse cubic interpolation is utilized by [Alefeld, Potra, and Shi](https://dl.acm.org/doi/10.1145/210089.210111), which gives an asymptotically even more rapidly convergent algorithm than Brent's (implemented in `Roots.AlefeldPotraShi()` and also `Roots.A42()`). This is used as a finishing step in many cases by the default hybrid `Order0()` method of `find_zero`.

In a bracketing algorithm, the next step should reduce the size of the bracket, so the next iterate should be inside the current bracket.
However, an inverse quadratic step does not guarantee this will happen. As such, sometimes a substitute method must be chosen.

[Chandrapatla's](https://www.google.com/books/edition/Computational_Physics/cC-8BAAAQBAJ?hl=en&gbpv=1&pg=PA95&printsec=frontcover) method is a bracketing method utilizing an inverse quadratic step as the centerpiece. The key insight is the test to choose between this inverse quadratic step and a bisection step. This is done in the following based on values of ``\xi`` and ``\Phi`` defined within:

```julia;
function chandrapatla(f, u, v, λ; verbose=false)
    a,b = promote(float(u), float(v))
    fa,fb = f(a),f(b)
    @assert fa * fb < 0

    if abs(fa) < abs(fb)
        a,b,fa,fb = b,a,fb,fa
    end

    c, fc = a, fa

    maxsteps = 100
    for ns in 1:maxsteps

        Δ = abs(b-a)
        m, fm = (abs(fa) < abs(fb)) ? (a, fa) : (b, fb)
        ϵ = eps(m)
        if Δ ≤ 2ϵ
            return m
        end

        iszero(fm) && return m

        ξ = (a-b)/(c-b)
        Φ = (fa-fb)/(fc-fb)

        if Φ^2 < ξ < 1 - (1-Φ)^2
            xt = λ(fa,fc,fb, a,c,b) # inverse quadratic step
        else
            xt = a + (b-a)/2        # bisection step
        end

        ft = f(xt)

        isnan(ft) && break

        if sign(fa) == sign(ft)
            c,fc = a,fa
            a,fa = xt,ft
        else
            c,b,a = b,a,xt
            fc,fb,fa = fb,fa,ft
        end

        verbose && @show ns, a, fa

    end
    error("no convergence: [a,b] = $(sort([a,b]))")
end
```

Like bisection, this method ensures that ``a`` and ``b`` form a bracket, but it moves ``a`` to the newest estimate, so it does not maintain that ``a < b`` throughout.

We can see it in action on the sine function. Here we pass in ``\lambda``, but in a real implementation (as in `Roots.Chandrapatla()`) we would have programmed the algorithm to compute the inverse quadratic value.

```julia; term=true
chandrapatla(sin, 3, 4, λ3, verbose=true)
```

The condition `Φ^2 < ξ < 1 - (1-Φ)^2` can be visualized. Assume `a,b = 0,1` and `fa,fb = -1/2,1`. Then `c < a < b`, and `fc` has the same sign as `fa`, but what values of `fc` will satisfy the inequality?

```julia;
ξ(c,fc) = (a-b)/(c-b)
Φ(c,fc) = (fa-fb)/(fc-fb)
Φl(c,fc) = Φ(c,fc)^2
Φr(c,fc) = 1 - (1-Φ(c,fc))^2
a,b = 0, 1
fa,fb = -1/2, 1
region = Lt(Φl, ξ) & Lt(ξ,Φr)
plot(region, xlims=(-2,a), ylims=(-3,0))
```

When `(c,fc)` is in the shaded area, the inverse quadratic step is chosen. We can see that `fc < fa` is needed.

For these values, this area is within the area where an inverse quadratic step will result in a value between `a` and `b`:

```julia;
l(c,fc) = λ3(fa,fb,fc,a,b,c)
region₃ = ImplicitEquations.Lt(l,b) & ImplicitEquations.Gt(l,a)
plot(region₃, xlims=(-2,0), ylims=(-3,0))
```

There are values in the parameter space where this does not occur.

## Tolerances

The `chandrapatla` algorithm typically waits until `abs(b-a) <= 2eps(m)` (where ``m`` is either ``b`` or ``a``, depending on the size of ``f(a)`` and ``f(b)``) is satisfied. Informally this means the algorithm stops when the two bracketing values are no more than a small amount apart. What is a "small amount?"

To understand, we start with the fact that floating point numbers are an approximation to real numbers.

Floating point numbers effectively represent a number in scientific notation in terms of

* a sign (plus or minus),
* a *mantissa* (a number in ``[1,2)``, in binary), and
* an exponent (to represent a power of ``2``).

The mantissa is of the form `1.xxxxx...xxx` where there are ``m``
different `x`s, each possibly a `0` or `1`. The `i`th `x` indicates if the term `1/2^i` should be
included in the value. The mantissa is the sum of `1` plus the
indicated values of `1/2^i` for `i` in `1` to `m`. So the last `x` represents if `1/2^m` should be
included in the sum. As such, the
mantissa represents a discrete set of values, separated by `1/2^m`, as
that is the smallest difference possible.

For example, if `m=2` then the possible values for the mantissa are `11 => 1 + 1/2 + 1/4 = 7/4`,
`10 => 1 + 1/2 = 6/4`, `01 => 1 + 1/4 = 5/4`,
and `00 => 1 = 4/4`, values separated by `1/4 = 1/2^m`.
-
-For ``64``-bit floating point numbers `m=52`, so the values in the mantissa differ by `1/2^52 = 2.220446049250313e-16`. This is the value of `eps()`.
-
-However, this "gap" between numbers is for values when the exponent is `0`. That is, the numbers in `[1,2)`. For values in `[2,4)` the gap is twice as large; between `[1/2,1)` it is half as large. That is, the gap depends on the size of the number. The gap between `x` and the next largest floating point number is given by `eps(x)`, and that always satisfies `eps(x) <= eps() * abs(x)`.
-
-One way to think about this is that the difference between `x` and the next largest floating point value is *basically* `x*(1+eps()) - x` or `x*eps()`.
-
-For the specific example, `abs(b-a) <= 2eps(m)` means that the gap between `a` and `b` is essentially at most two floating point values away from the ``x`` value with the smaller ``|f(x)|`` value.
-
-
-For bracketing methods that is about as good as you can get. However, once floating point values are understood, the absolute best you can get for a bracketing interval would be
-* along the way, a value `f(c)` is found which is *exactly* `0.0`
-* the endpoints of the bracketing interval are *adjacent* floating point values, meaning the interval cannot be bisected and `f` changes sign between the two values.
-
-
-There can be problems requiring engineering around when the stopping criterion is `abs(b-a) <= 2eps(m)` and the answer is `0.0`. For example, the algorithm above for the function `f(x) = -40*x*exp(-x)` does not converge when started with `[-9,1]`, even though `0.0` is an obvious zero.
-
-
-
-```julia; hold=true
-fu(x) = -40*x*exp(-x)
-chandrapatla(fu, -9, 1, λ3)
-```
-
-Here the issue is that `abs(b-a)` is tiny (on the order of `1e-119`) but `eps(m)` is even smaller.
-
-
-
-
-
-For non-bracketing methods, like Newton's method or the secant method, different criteria are useful.
-There may not be a bracketing interval for `f` (for example, `f(x) = (x-1)^2`), so the second criterion above might need to be restated in terms of the last two iterates, ``x_n`` and ``x_{n-1}``. Calling this difference ``\Delta = |x_n - x_{n-1}|``, we might stop if ``\Delta`` is small enough. As there are scenarios where this can happen even though the function is not at a zero, a check on the size of ``f`` is also needed.
-
-However, there may be no floating point value where ``f`` is exactly `0.0`, so checking the size of `f(x_n)` requires some convention.
-
-First, if `f(x_n)` is `0.0` then it makes sense to call `x_n` an *exact zero* of ``f``, even though this may hold when `x_n`, a floating point value, is not mathematically an *exact* zero of ``f``. (Consider `f(x) = x^2 - 2x + 1`. Mathematically, this is identical to `g(x) = (x-1)^2`, but `f(1 + eps())` is zero, while `g(1+eps())` is `4.930380657631324e-32`.)
-
-However, there may never be a value with `f(x_n)` exactly `0.0`. (The value of `sin(pi)` is not zero, for example, as `pi` is an approximation to ``\pi``; as well, the `sin` of values adjacent to `float(pi)` does not produce `0.0` exactly.)
-
-
-Suppose `x_n` is the closest floating point number to ``\alpha``, the zero. Then `x_n` ``= \alpha \cdot (1 + \delta)``, where the relative rounding error, ``\delta = (`` `x_n` ``- \alpha)/\alpha``, is less than `eps()` in absolute value.
-
-How far then can `f(x_n)` be from ``0 = f(\alpha)``?
-
-```math
-f(x_n) = f(x_n - \alpha + \alpha) = f(\alpha + \alpha \cdot \delta) = f(\alpha \cdot (1 + \delta)).
-```
-
-Assuming ``f`` has a derivative, the linear approximation gives:
-
-```math
-f(x_n) \approx f(\alpha) + f'(\alpha) \cdot (\alpha\delta) = f'(\alpha) \cdot \alpha \delta
-```
-
-So we should consider `f(x_n)` an *approximate zero* when it is on the scale of
-``f'(\alpha) \cdot \alpha \delta``.
-
-That ``\alpha`` factor means we consider a *relative* tolerance for `f`.
-Also important, when `x_n` is close to `0`, is the need for an *absolute* tolerance, one not dependent on the size of `x`. So a good condition to check if `f(x_n)` is small is
-
-`abs(f(x_n)) <= abs(x_n) * rtol + atol`, or `abs(f(x_n)) <= max(abs(x_n) * rtol, atol)`,
-
-where the relative tolerance, `rtol`, would absorb an estimate for ``f'(\alpha)``.
-
-
-Now, in Newton's method the update step is ``f(x_n)/f'(x_n)``. Naturally, when ``f(x_n)`` is close to ``0``, the update step is small and ``\Delta`` will be close to ``0``. *However*, should ``f'(x_n)`` be large, then ``\Delta`` can also be small and the algorithm will possibly stop, as ``x_{n+1} \approx x_n`` -- but not necessarily ``x_{n+1} \approx \alpha``. So termination on ``\Delta`` alone can be premature. Checking if ``f(x_{n+1})`` is an approximate zero is also useful to include in a stopping criterion.
-
-One thing to keep in mind is that the right-hand side of the rule `abs(f(x_n)) <= abs(x_n) * rtol + atol`, as a function of `x_n`, goes to `Inf` as `x_n` increases. So if `f` has `0` as an asymptote (like `e^(-x)`), for large enough `x_n` the rule will be `true` and `x_n` could be counted as an approximate zero, despite it not being one.
-
-So modified criteria for convergence might look like:
-
-* stop if ``\Delta`` is small and `f` is an approximate zero with some tolerances
-* stop if `f` is an approximate zero with some tolerances, but be mindful that this rule can identify mathematically erroneous answers.
-
-It is not uncommon to assign `rtol` a value like `sqrt(eps())` to account for accumulated floating point errors and the factor of ``f'(\alpha)``, though in the `Roots` package it is set smaller by default.
-
-
-## Questions
-
-###### Question
-
-Let `f(x) = tanh(x)` (the hyperbolic tangent) and `fp(x) = sech(x)^2`, its derivative.
-
-Does *Newton's* method (using `Roots.Newton()`) converge starting at `1.0`?
-
-```julia; hold=true; echo=false
-yesnoq("yes")
-```
-
-Does *Newton's* method (using `Roots.Newton()`) converge starting at `1.3`?
-
-```julia; hold=true; echo=false
-yesnoq("no")
-```
-
-Does the secant method (using `Roots.Secant()`) converge starting at `1.3`? (A second starting value will automatically be chosen, if not directly passed in.)
-
-```julia; hold=true; echo=false
-yesnoq("yes")
-```
-
-###### Question
-
-For the function `f(x) = x^5 - x - 1` both Newton's method and the secant method will converge to the one root when started from `1.0`. Using `verbose=true` as an argument to `find_zero` (e.g., `find_zero(f, x0, Roots.Secant(), verbose=true)`), how many *more* steps does the secant method need to converge?
-
-```julia; hold=true; echo=false
-numericq(2)
-```
-
-Do the two methods converge to the exact same value?
-
-```julia; hold=true; echo=false
-yesnoq("yes")
-```
-
-###### Question
-
-Let `f(x) = exp(x) - x^4` and `x0=8.0`. How many steps (iterations) does it take for the secant method to converge using the default tolerances?
-
-```julia; hold=true; echo=false
-numericq(10, 1)
-```
-
-###### Question
-
-Let `f(x) = exp(x) - x^4` and a starting bracket be `x0 = [8, 9]`. Then calling `find_zero(f, x0, verbose=true)` will show that 49 steps are needed for exact bisection to converge. What about with the `Roots.Brent()` algorithm, which uses inverse quadratic steps when it can?
-
-It takes how many steps?
-
-```julia; hold=true; echo=false
-numericq(36, 1)
-```
-
-The `Roots.A42()` method uses inverse cubic interpolation when possible. How many steps does this method take to converge?
-
-```julia; hold=true; echo=false
-numericq(3, 1)
-```
-
-The large difference is due to how the tolerances are set within `Roots`.
The `Brent` method gets pretty close in a few steps, but takes a much longer time to get close enough for the default tolerances.
-
-
-###### Question
-
-Consider this crazy function defined by:
-
-```julia; eval=false
-f(x) = cos(100*x)-4*erf(30*x-10)
-```
-
-(The `erf` function is the [error function](https://en.wikipedia.org/wiki/Error_function) and is in the `SpecialFunctions` package loaded with `CalculusWithJulia`.)
-
-Make a plot over the interval $[-3,3]$ to see why it is called "crazy".
-
-Does `find_zero` find a zero to this function starting from $0$?
-
-
-```julia; hold=true; echo=false
-yesnoq("yes")
-```
-
-If so, what is the value?
-
-```julia; hold=true; echo=false
-f(x) = cos(100*x)-4*erf(30*x-10)
-val = find_zero(f, 0)
-numericq(val)
-```
-
-If not, what is the reason?
-
-```julia; hold=true; echo=false
-choices = [
-"The zero is a simple zero",
-"The zero is not a simple zero",
-"The function oscillates too much to rely on the tangent line approximation far from the zero",
-"We can find an answer"
-]
-answ = 4
-radioq(choices, answ, keep_order=true)
-```
-
-
-Does `find_zero` find a zero to this function starting from $1$?
-
-
-```julia; hold=true; echo=false
-yesnoq(false)
-```
-
-If so, what is the value?
-
-```julia; hold=true; echo=false
-numericq(-999.999)
-```
-
-If not, what is the reason?
-
-```julia; hold=true; echo=false
-choices = [
-"The zero is a simple zero",
-"The zero is not a simple zero",
-"The function oscillates too much to rely on the tangent line approximations far from the zero",
-"We can find an answer"
-]
-answ = 3
-radioq(choices, answ, keep_order=true)
-```
diff --git a/CwJ/derivatives/newtons-method.js b/CwJ/derivatives/newtons-method.js
deleted file mode 100644
index a328710..0000000
--- a/CwJ/derivatives/newtons-method.js
+++ /dev/null
@@ -1,72 +0,0 @@
-// newton's method
-
-const b = JXG.JSXGraph.initBoard('jsxgraph', {
-    boundingbox: [-3,5,3,-5], axis:true
-});
-
-
-var f = function(x) {return x*x*x*x*x - x - 1};
-var fp = function(x) { return 5*x*x*x*x - 1};
-var x0 = 0.85;
-
-var nm = function(x) { return x - f(x)/fp(x);};
-
-var l = b.create('point', [-1.5,0], {name:'', size:0});
-var r = b.create('point', [1.5,0], {name:'', size:0});
-var xaxis = b.create('line', [l,r])
-
-
-var P0 = b.create('glider', [x0,0,xaxis], {name:'x0'});
-var P0a = b.create('point', [function() {return P0.X();},
-                             function() {return f(P0.X());}], {name:''});
-
-var P1 = b.create('point', [function() {return nm(P0.X());},
-                            0], {name:''});
-var P1a = b.create('point', [function() {return P1.X();},
-                             function() {return f(P1.X());}], {name:''});
-
-var P2 = b.create('point', [function() {return nm(P1.X());},
-                            0], {name:''});
-var P2a = b.create('point', [function() {return P2.X();},
-                             function() {return f(P2.X());}], {name:''});
-
-var P3 = b.create('point', [function() {return nm(P2.X());},
-                            0], {name:''});
-var P3a = b.create('point', [function() {return P3.X();},
-                             function() {return f(P3.X());}], {name:''});
-
-var P4 = b.create('point', [function() {return nm(P3.X());},
-                            0], {name:''});
-var P4a = b.create('point', [function() {return P4.X();},
-                             function() {return f(P4.X());}], {name:''});
-var P5 = b.create('point', [function() {return nm(P4.X());},
-                            0], {name:'x5', strokeColor:'black'});
-
-
-
-
-
-P0a.setAttribute({fixed:true});
-P1.setAttribute({fixed:true});
-P1a.setAttribute({fixed:true});
-P2.setAttribute({fixed:true});
-P2a.setAttribute({fixed:true});
-P3.setAttribute({fixed:true});
-P3a.setAttribute({fixed:true});
-P4.setAttribute({fixed:true});
-P4a.setAttribute({fixed:true});
-P5.setAttribute({fixed:true});
-
-var sc = '#000000';
-b.create('segment', [P0,P0a], {strokeColor:sc, strokeWidth:1});
-b.create('segment', [P0a, P1], {strokeColor:sc, strokeWidth:1});
-b.create('segment', [P1,P1a], {strokeColor:sc, strokeWidth:1});
-b.create('segment', [P1a, P2], {strokeColor:sc, strokeWidth:1});
-b.create('segment', [P2,P2a], {strokeColor:sc, strokeWidth:1});
-b.create('segment', [P2a, P3], {strokeColor:sc, strokeWidth:1});
-b.create('segment', [P3,P3a], {strokeColor:sc, strokeWidth:1});
-b.create('segment', [P3a, P4], {strokeColor:sc, strokeWidth:1});
-b.create('segment', [P4,P4a], {strokeColor:sc, strokeWidth:1});
-b.create('segment', [P4a, P5], {strokeColor:sc, strokeWidth:1});
-
-b.create('functiongraph', [f, -1.5, 1.5])
diff --git a/CwJ/derivatives/newtons_method.jmd b/CwJ/derivatives/newtons_method.jmd
deleted file mode 100644
index f4ddcc2..0000000
--- a/CwJ/derivatives/newtons_method.jmd
+++ /dev/null
@@ -1,1432 +0,0 @@
-# Newton's method
-
-
-This section uses these add-on packages:
-
-```julia
-using CalculusWithJulia
-using Plots
-using SymPy
-using Roots
-```
-
-```julia; echo=false; results="hidden"
-using CalculusWithJulia.WeaveSupport
-using ImplicitPlots
-
-fig_size = (800, 600)
-const frontmatter = (
-    title = "Newton's method",
-    description = "Calculus with Julia: Newton's method",
-    tags = ["CalculusWithJulia", "derivatives", "newton's method"],
-);
-
-nothing
-```
-
-----
-
-The Babylonian method is an algorithm to find an approximate value for $\sqrt{k}$.
-It was described by the first-century Greek mathematician
-[Hero of Alexandria](http://en.wikipedia.org/wiki/Babylonian_method).
-
-The method starts with some initial guess, called $x_0$.
It then
-applies a formula to produce an improved guess. This is repeated until
-the improved guess is accurate enough or it is clear the algorithm
-fails to work.
-
-For the Babylonian method, the next guess, $x_{i+1}$, is derived from the current guess, $x_i$. In mathematical notation, this is the updating step:
-
-
-```math
-x_{i+1} = \frac{1}{2}(x_i + \frac{k}{x_i})
-```
-
-
-We use this algorithm to approximate the square root of $2$, a value known to
-the Babylonians.
-
-Start with $x$, then form $x/2 + 1/x$; from this, again form $x/2 + 1/x$; repeat.
-
-We represent this step using a function
-
-```julia
-babylon(x) = x/2 + 1/x
-```
-
-Let's look starting with $x = 2$ as a rational number:
-
-```julia; hold=true
-x₁ = babylon(2//1)
-x₁, x₁^2.0
-```
-
-Our estimate improved from something which squared to $4$ down to something which squares to $2.25.$ A big improvement, but there is still more to come. Had we done one more step:
-
-
-```julia;
-x₂ = (babylon ∘ babylon)(2//1)
-x₂, x₂^2.0
-```
-
-We now see accuracy to the third decimal point.
-
-```julia;
-x₃ = (babylon ∘ babylon ∘ babylon)(2//1)
-x₃, x₃^2.0
-```
-
-This is now accurate to the sixth decimal point. That is about as far
-as we, or the Babylonians, would want to go by hand. Using rational
-numbers quickly grows out of hand. The next step shows the explosion.
-
-```julia;
-reduce((x,step) -> babylon(x), 1:4, init=2//1)
-```
-
-(In the above, we used `reduce` to repeat a function call ``4`` times, as an alternative to the composition operation. In this section we show a few styles to do this repetition before introducing a packaged function.)
-
-
-However, with the advent of floating point numbers, the method stays quite manageable:
-
-
-```julia; hold=true;
-xₙ = reduce((x, step) -> babylon(x), 1:6, init=2.0)
-xₙ, xₙ^2
-```
-
-We can see that the algorithm - to the precision offered by floating
-point numbers - has resulted in an answer `1.414213562373095`.
This
-answer is an *approximation* to the actual answer. Approximation is necessary,
-as $\sqrt{2}$ is an irrational number and so can never be exactly
-represented in floating point. That being said, we can see that the value
-of `xₙ^2` is accurate to the last decimal place, so our approximation
-is very close and is achieved in a few steps.
-
-## Newton's generalization
-
-Let $f(x) = x^3 - 2x - 5$. The value of ``2`` is almost a zero, but not quite, as $f(2) = -1$. We can check that there are no *rational* roots. Though there is
-a method to solve the cubic exactly, it may be difficult to compute, and it will
-not be as generally applicable as an algorithm like the Babylonian
-method for producing an approximate answer.
-
-Is there some generalization to the Babylonian method?
-
-We know that the tangent line is a good approximation to the function
-at the point. Looking at this graph gives a hint as to an algorithm:
-
-```julia; hold=true; echo=false
-f(x) = x^3 - 2x - 5
-fp(x) = 3x^2 - 2
-c = 2
-p = plot(f, 1.75, 2.25, legend=false)
-plot!(x->f(2) + fp(2)*(x-2))
-plot!(zero)
-scatter!(p, [c], [f(c)], color=:orange, markersize=3)
-p
-```
-
-The tangent line and the function nearly agree near $2$. So much so
-that the intersection point of the tangent line with the $x$ axis
-nearly hides the actual zero of $f(x)$ that is near $2.1$.
-
-That is, it seems that the intersection of the tangent line and the
-$x$ axis should be an improved approximation for the zero of the
-function.
-
-Let $x_0$ be $2$, and $x_1$ be the intersection point of the tangent line
-at $(x_0, f(x_0))$ with the $x$ axis. Then by the definition of the
-tangent line:
-
-```math
-f'(x_0) = \frac{\Delta y }{\Delta x} = \frac{f(x_0)}{x_0 - x_1}.
-```
-
-This can be solved for $x_1$ to give $x_1 = x_0 - f(x_0)/f'(x_0)$. In general, if we had $x_i$ and used the intersection point of the tangent line to produce $x_{i+1}$ we would have Newton's method:
-
-```math
-x_{i+1} = x_i - \frac{f(x_i)}{f'(x_i)}.
-```
-
-
-Using automatic derivatives, as brought in with the `CalculusWithJulia` package, we can implement this algorithm.
-
-
-The algorithm above starts at $2$ and then becomes:
-
-```julia;
-f(x) = x^3 - 2x - 5
-x0 = 2.0
-x1 = x0 - f(x0) / f'(x0)
-```
-
-We can see we are closer to a zero:
-
-```julia;
-f(x0), f(x1)
-```
-
-Trying again, we have
-
-```julia;
-x2 = x1 - f(x1)/ f'(x1)
-x2, f(x2), f(x1)
-```
-
-And again:
-
-```julia;
-x3 = x2 - f(x2)/ f'(x2)
-x3, f(x3), f(x2)
-```
-
-
-```julia;
-x4 = x3 - f(x3)/ f'(x3)
-x4, f(x4), f(x3)
-```
-
-We see now that $f(x_4)$ is within machine tolerance of $0$, so we
-call $x_4$ an *approximate zero* of $f(x)$.
-
-
-> **Newton's method:** Let $x_0$ be an initial guess for a zero of
-> $f(x)$. Iteratively define $x_{i+1}$ in terms of the just
-> generated $x_i$ by:
-> ```math
-> x_{i+1} = x_i - f(x_i) / f'(x_i).
-> ```
-> Then for
-> reasonable functions and reasonable initial guesses, the sequence of
-> points converges to a zero of $f$.
-
-On the computer, we know that actual convergence will likely never
-occur, but accuracy to a certain tolerance can often be achieved.
-
-
-
-In the example above, we kept track of the previous values. This is
-unnecessary if only the answer is sought. In that case, the update
-step could use the same variable. Here we use `reduce`:
-
-```julia;hold=true;
-xₙ = reduce((x, step) -> x - f(x)/f'(x), 1:4, init=2)
-xₙ, f(xₙ)
-```
-
-In practice, the algorithm is implemented not by repeating the update step a fixed number of times, but rather by repeating the step until either we
-converge or it is clear we won't converge. For good guesses and most
-functions, convergence happens quickly.
-
-
-
-!!! note
-    Newton looked at this same example in 1669 (B.T. Polyak, *Newton's
-    method and its use in optimization*, European Journal of Operational
-    Research. 02/2007; 181(3):1086-1096.)
though his technique was - slightly different as he did not use the derivative, *per se*, but - rather an approximation based on the fact that his function was a - polynomial (though identical to the derivative). Raphson (1690) - proposed the general form, hence the usual name of the Newton-Raphson - method. - -#### Examples - -##### Example: visualizing convergence - -This graphic demonstrates the method and the rapid convergence: - -```julia; echo=false -function newtons_method_graph(n, f, a, b, c) - - xstars = [c] - xs = [c] - ys = [0.0] - - plt = plot(f, a, b, legend=false, size=fig_size) - plot!(plt, [a, b], [0,0], color=:black) - - - ts = range(a, stop=b, length=50) - for i in 1:n - x0 = xs[end] - x1 = x0 - f(x0)/D(f)(x0) - push!(xstars, x1) - append!(xs, [x0, x1]) - append!(ys, [f(x0), 0]) - end - plot!(plt, xs, ys, color=:orange) - scatter!(plt, xstars, 0*xstars, color=:orange, markersize=5) - plt -end -nothing -``` - -```julia; hold=true; echo=false; cache=true -### {{{newtons_method_example}}} - -caption = """ - -Illustration of Newton's Method converging to a zero of a function. - -""" -n = 6 - -fn, a, b, c = x->log(x), .15, 2, .2 - -anim = @animate for i=1:n - newtons_method_graph(i-1, fn, a, b, c) -end - -imgfile = tempname() * ".gif" -gif(anim, imgfile, fps = 1) - -ImageFile(imgfile, caption) -``` - ----- - -This interactive graphic (built using [JSXGraph](https://jsxgraph.uni-bayreuth.de/wp/index.html)) allows the adjustment of the point `x0`, initially at ``0.85``. Five iterations of Newton's method are illustrated. Different positions of `x0` clearly converge, others will not. - -```=html -
-```
-
-```ojs
-//| echo: false
-//| output: false
-
-JXG = require("jsxgraph");
-
-// newton's method
-
-b = JXG.JSXGraph.initBoard('jsxgraph', {
-    boundingbox: [-3,5,3,-5], axis:true
-});
-
-
-f = function(x) {return x*x*x*x*x - x - 1};
-fp = function(x) { return 5*x*x*x*x - 1};
-x0 = 0.85;
-
-nm = function(x) { return x - f(x)/fp(x);};
-
-l = b.create('point', [-1.5,0], {name:'', size:0});
-r = b.create('point', [1.5,0], {name:'', size:0});
-xaxis = b.create('line', [l,r])
-
-
-P0 = b.create('glider', [x0,0,xaxis], {name:'x0'});
-P0a = b.create('point', [function() {return P0.X();},
-                         function() {return f(P0.X());}], {name:''});
-
-P1 = b.create('point', [function() {return nm(P0.X());},
-                        0], {name:''});
-P1a = b.create('point', [function() {return P1.X();},
-                         function() {return f(P1.X());}], {name:''});
-
-P2 = b.create('point', [function() {return nm(P1.X());},
-                        0], {name:''});
-P2a = b.create('point', [function() {return P2.X();},
-                         function() {return f(P2.X());}], {name:''});
-
-P3 = b.create('point', [function() {return nm(P2.X());},
-                        0], {name:''});
-P3a = b.create('point', [function() {return P3.X();},
-                         function() {return f(P3.X());}], {name:''});
-
-P4 = b.create('point', [function() {return nm(P3.X());},
-                        0], {name:''});
-P4a = b.create('point', [function() {return P4.X();},
-                         function() {return f(P4.X());}], {name:''});
-P5 = b.create('point', [function() {return nm(P4.X());},
-                        0], {name:'x5', strokeColor:'black'});
-
-
-
-
-
-P0a.setAttribute({fixed:true});
-P1.setAttribute({fixed:true});
-P1a.setAttribute({fixed:true});
-P2.setAttribute({fixed:true});
-P2a.setAttribute({fixed:true});
-P3.setAttribute({fixed:true});
-P3a.setAttribute({fixed:true});
-P4.setAttribute({fixed:true});
-P4a.setAttribute({fixed:true});
-P5.setAttribute({fixed:true});
-
-sc = '#000000';
-b.create('segment', [P0,P0a], {strokeColor:sc, strokeWidth:1});
-b.create('segment', [P0a, P1], {strokeColor:sc, strokeWidth:1});
-b.create('segment', [P1,P1a], {strokeColor:sc,
strokeWidth:1});
-b.create('segment', [P1a, P2], {strokeColor:sc, strokeWidth:1});
-b.create('segment', [P2,P2a], {strokeColor:sc, strokeWidth:1});
-b.create('segment', [P2a, P3], {strokeColor:sc, strokeWidth:1});
-b.create('segment', [P3,P3a], {strokeColor:sc, strokeWidth:1});
-b.create('segment', [P3a, P4], {strokeColor:sc, strokeWidth:1});
-b.create('segment', [P4,P4a], {strokeColor:sc, strokeWidth:1});
-b.create('segment', [P4a, P5], {strokeColor:sc, strokeWidth:1});
-
-b.create('functiongraph', [f, -1.5, 1.5])
-
-```
-
-
-
-##### Example: numeric not algebraic
-
-For the function $f(x) = \cos(x) - x$, we see that SymPy cannot solve symbolically for a zero:
-
-```julia;
-@syms x::real
-solve(cos(x) - x, x)
-```
-
-We can find a numeric solution, even though there is no closed-form answer. Here we try Newton's method:
-
-```julia; hold=true
-f(x) = cos(x) - x
-x = .5
-x = x - f(x)/f'(x) # 0.7552224171056364
-x = x - f(x)/f'(x) # 0.7391416661498792
-x = x - f(x)/f'(x) # 0.7390851339208068
-x = x - f(x)/f'(x) # 0.7390851332151607
-x = x - f(x)/f'(x)
-x, f(x)
-```
-
-To machine tolerance the answer is a zero, even though the exact answer is irrational and all finite floating point values can be represented as rational numbers.
-
-##### Example
-
-Use Newton's method to find the *largest* real solution to ``e^x = x^6``.
-
-A plot shows us roughly where the value lies:
-
-```julia; hold=true
-f(x) = exp(x)
-g(x) = x^6
-plot(f, 0, 25, label="f")
-plot!(g, label="g")
-```
-
-Clearly by ``20`` the two paths diverge. We know exponentials eventually grow faster than powers, and this is seen in the graph.
-
-We use Newton's method to find the intersection point, stopping when the increment ``f(x)/f'(x)`` is smaller than `1e-4`.
-We need to turn the solution to an equation into a value where a function is ``0``. Just moving the terms to one side of the equals sign gives ``e^x - x^6 = 0``, or the ``x`` we seek is a solution to ``h(x)=0`` with ``h(x) = e^x - x^6``.
-
-
-```julia; hold=true; term=true
-h(x) = exp(x) - x^6
-x = 20
-for step in 1:10
-    delta = h(x)/h'(x)
-    x = x - delta
-    @show step, x, delta
-end
-```
-
-So it takes ``8`` steps to get an increment that small and about `10` steps to get to full convergence.
-
-##### Example: division as multiplication
-
-[Newton-Raphson Division](http://tinyurl.com/kjj9w92) is a means to divide by multiplying.
-
-Why would you want to do that? Well, even for computers division is
-harder (read slower) than multiplication. The trick is that $p/q$ is
-simply $p \cdot (1/q)$, so finding a means to compute a reciprocal by
-multiplying will reduce division to multiplication.
-
-
-Suppose we have $q$. We could try to use Newton's method to find
-$1/q$, as it is a zero of $f(x) = x - 1/q$. The Newton update step
-simplifies to:
-
-```math
-x - f(x) / f'(x) \quad\text{or}\quad x - (x - 1/q)/ 1 = 1/q
-```
-
-That doesn't really help, as Newton's method is just $x_{i+1} = 1/q$.
-That is, it just jumps to the answer, the one we want to compute by some other means!
-
-
-Trying again, we simplify the update step for a related function:
-$f(x) = 1/x - q$ with $f'(x) = -1/x^2$ and then one step of the process is:
-
-```math
-x_{i+1} = x_i - (1/x_i - q)/(-1/x_i^2) = -qx^2_i + 2x_i.
-```
-
-Now for $q$ in the interval $[1/2, 1]$ we want to get a *good* initial
-guess. Here is a claim: we can use $x_0=48/17 - 32/17 \cdot q$. Let's check
-graphically that this is a reasonable initial approximation to $1/q$:
-
-```julia; hold=true
-
-plot(q -> 1/q, 1/2, 1, label="1/q")
-plot!(q -> 1/17 * (48 - 32q), label="linear approximation")
-```
-
-
-
-It can be shown that for any $q$ in $[1/2, 1]$, with initial guess $x_0 =
-48/17 - 32/17\cdot q$, Newton's method will converge to ``16`` digits in no more
-than this many steps:
-
-```math
-\log_2(\frac{53 + 1}{\log_2(17)}).
-```
-
-
-
-```julia;
-a = log2((53 + 1)/log2(17))
-ceil(Integer, a)
-```
-
-That is, ``4`` steps suffice.
-
-For $q = 0.80$, to find $1/q$ using the above we have
-
-```julia; hold=true
-q = 0.80
-x = (48/17) - (32/17)*q
-x = -q*x*x + 2*x
-x = -q*x*x + 2*x
-x = -q*x*x + 2*x
-x = -q*x*x + 2*x
-```
-
-This method has basically $18$ multiplication and addition operations
-for one division, so it naively would seem slower, but timing this
-shows the method is competitive with a regular division.
-
-## Wrapping in a function
-
-In the previous examples, we saw fast convergence, guaranteed convergence in ``4`` steps, and an example where ``8`` steps were needed to get the requested level of approximation. Newton's method usually converges quickly, but it may converge slowly, and it may not converge at all. Automating the task to avoid repeatedly running the update step is
-a task best done by the computer.
-
-The `while` loop is a
-good way to repeat commands until some condition is met. With this, we
-present a simple function implementing Newton's method: we iterate
-until the update step gets really small (the `atol`) or the
-convergence takes more than ``50`` steps. (There are other, better choices that could be used to determine when the algorithm should stop; these are just easy to understand.)
-
-```julia;
-function nm(f, fp, x0)
-    atol = 1e-14
-    ctr = 0
-    delta = Inf
-    while (abs(delta) > atol) && (ctr < 50)
-        delta = f(x0)/fp(x0)
-        x0 = x0 - delta
-        ctr = ctr + 1
-    end
-
-    ctr < 50 ? x0 : NaN
-end
-```
-
-
-##### Examples
-
-
-- Find a zero of $\sin(x)$ starting at $x_0=3$:
-
-```julia;
-nm(sin, cos, 3)
-```
-
-This is an approximation for $\pi$ that historically found use, as the convergence is fast.
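The fast convergence here can be made concrete by tracking the error ``|x_i - \pi|`` at each step. This is a small added sketch, independent of `nm`; the helper name `newton_errors` is made up for illustration:

```julia
# Track the error of Newton iterates for f(x) = sin(x) from x0 = 3.
# (Since sin''(π) = 0, the error shrinks even faster than quadratically here.)
function newton_errors(f, fp, x0, target, nsteps)
    x = float(x0)
    errs = Float64[]
    for _ in 1:nsteps
        x = x - f(x) / fp(x)          # Newton update
        push!(errs, abs(x - target))  # distance to the known zero
    end
    errs
end

newton_errors(sin, cos, 3, float(pi), 4)
```

The first step already has error about `1e-3`, the next is near `1e-10`, and within another step or two the error is at the level of floating point precision.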
-
-- Find a solution to $x^5 = 5^x$ near $2$:
-
-Writing a function to handle this, we have:
-
-```julia;
-k(x) = x^5 - 5^x
-```
-
-We could find the derivative by hand, but use the automatic one instead:
-
-```julia;
-alpha = nm(k, k', 2)
-alpha, k(alpha)
-```
-
-### Functions in the Roots package
-
-Typing in the `nm` function might be okay once, but would be tedious
-if it was needed each time. Besides, it isn't as robust to different inputs as it could be. The `Roots` package provides a `Newton`
-method for `find_zero`.
-
-
-To use a different method with `find_zero`, the calling pattern is `find_zero(f, x, M)` where `f` represents the function(s), `x` the initial point(s), and `M` the method. Here we have:
-
-```julia
-find_zero((sin, cos), 3, Roots.Newton())
-```
-
-Or, if a derivative is not specified, one can be computed using automatic differentiation:
-
-```julia; hold=true
-f(x) = sin(x)
-find_zero((f, f'), 2, Roots.Newton())
-```
-
-The argument `verbose=true` will force a print out of a message summarizing the convergence and showing each step.
-
-```julia; hold=true
-f(x) = exp(x) - x^4
-find_zero((f,f'), 8, Roots.Newton(); verbose=true)
-```
-
-
-
-##### Example: intersection of two graphs
-
-Find the intersection point between $f(x) = \cos(x)$ and $g(x) = 5x$ near $0$.
-
-We have Newton's method to solve for zeros of $f(x)$, i.e. when $f(x) =
-0$. Here we want to solve for $x$ with $f(x) = g(x)$. To do so, we
-make a new function $h(x) = f(x) - g(x)$, that is $0$ when $f(x)$
-equals $g(x)$:
-
-```julia; hold=true
-f(x) = cos(x)
-g(x) = 5x
-h(x) = f(x) - g(x)
-x0 = find_zero((h,h'), 0, Roots.Newton())
-x0, h(x0), f(x0), g(x0)
-```
-
-----
-
-We redo the above using a *parameter* for the ``5``, as there are some options for how it can be done. We let `f(x,p) = cos(x) - p*x`.
Then we can use `Roots.Newton` by also defining a derivative:
-
-```julia; hold=true
-f(x,p) = cos(x) - p*x
-fp(x,p) = -sin(x) - p
-xn = find_zero((f,fp), pi/4, Roots.Newton(); p=5)
-xn, f(xn, 5)
-```
-
-To use automatic differentiation is not straightforward, as we must hold the `p` fixed. For this, we introduce a closure that fixes `p` and differentiates in the `x` variable (called `u` below):
-
-```julia; hold=true
-f(x,p) = cos(x) - p*x
-fp(x,p) = (u -> f(u,p))'(x)
-xn = find_zero((f,fp), pi/4, Roots.Newton(); p=5)
-```
-
-##### Example: Finding $c$ in Rolle's Theorem
-
-The function $r(x) = \sqrt{1 - \cos(x^2)^2}$ has a zero at $0$ and one at ``a`` near ``1.77``.
-
-```julia;
-r(x) = sqrt(1 - cos(x^2)^2)
-plot(r, 0, 1.77)
-```
-
-As $r(x)$ is differentiable between $0$ and $a$ and $r(0) = r(a) = 0$, Rolle's theorem says
-there will be a value where the derivative is $0$. Find that value.
-
-This value will be a zero of the derivative. A graph shows it should be near $1.2$, so we use that as a starting value to get the answer:
-
-```julia;
-find_zero((r',r''), 1.2, Roots.Newton())
-```
-
-
-
-
-## Convergence rates
-
-Newton's method is famously known to have "quadratic convergence." What
-does this mean? Let the error in the $i$th step be called $e_i = x_i -
-\alpha$. Then Newton's method satisfies a bound of the type:
-
-```math
-\lvert e_{i+1} \rvert \leq M_i \cdot e_i^2.
-```
-
-If $M$ were just a constant and we suppose $e_0 = 10^{-1}$ then $e_1$
-would be less than $M 10^{-2}$ and $e_2$ less than $M^2 10^{-4}$,
-$e_3$ less than $M^3 10^{-8}$ and $e_4$ less than $M^4 10^{-16}$ which
-for $M=1$ is basically the machine precision when values are near
-``1``. That is, for some problems, with a good initial guess it will
-take around ``4`` or so steps to converge.
-
-To identify ``M``, let ``\alpha`` be the zero of ``f`` to be approximated. Assume
-
-* The function ``f`` has a continuous second derivative in a neighborhood of ``\alpha``.
-* The value ``f'(\alpha)`` is *non-zero* in the neighborhood of ``\alpha``. - -Then this linearization holds at each $x_i$ in the above neighborhood: - -```math -f(x) = f(x_i) + f'(x_i) \cdot (x - x_i) + \frac{1}{2} f''(\xi) \cdot (x-x_i)^2. -``` - -The value $\xi$ is from the mean value theorem and is between $x$ and $x_i$. - -Dividing by ``f'(x_i)`` and setting ``x=\alpha`` (as $f(\alpha)=0$) leaves - -```math -0 = \frac{f(x_i)}{f'(x_i)} + (\alpha-x_i) + \frac{1}{2}\cdot \frac{f''(\xi)}{f'(x_i)} \cdot (\alpha-x_i)^2. -``` - -For this value, we have - -```math -\begin{align*} -x_{i+1} - \alpha -&= \left(x_i - \frac{f(x_i)}{f'(x_i)}\right) - \alpha\\ -&= \left(x_i - \alpha \right) - \frac{f(x_i)}{f'(x_i)}\\ -&= (x_i - \alpha) + \left( -(\alpha - x_i) + \frac{1}{2}\frac{f''(\xi) \cdot(\alpha - x_i)^2}{f'(x_i)} -\right)\\ -&= \frac{1}{2}\frac{f''(\xi)}{f'(x_i)} \cdot(x_i - \alpha)^2. -\end{align*} -``` - -That is - -```math -e_{i+1} = \frac{1}{2}\frac{f''(\xi)}{f'(x_i)} e_i^2. -``` - - -This convergence to ``\alpha`` will be quadratic *if*: - -- The initial guess $x_0$ is not too far from $\alpha$, so $e_0$ is - managed. - -- The derivative at $\alpha$ is not too close to $0$, hence, by continuity ``f'(x_i)`` is not too close to ``0``. (As it appears in - the denominator). That is, the function can't be too flat, which - should make sense, as then the tangent line is nearly parallel to - the $x$ axis and would intersect far away. - -- The function ``f`` has a continuous second derivative at ``\alpha``. - -- The second derivative is not too big (in absolute value) near ``\alpha``. - A large second derivative means the function is very concave, - which means it is "turning" a lot. In this case, the function turns - away from the tangent line quickly, so the tangent line's zero is - not necessarily a good approximation to the actual zero, $\alpha$. - - -!!! 
note - The basic tradeoff: methods like Newton's are faster than the - bisection method in terms of function calls, but are not guaranteed to - converge, as the bisection method is. - - -What can go wrong when one of these isn't the case is illustrated next: - -### Poor initial step - -```julia; hold=true; echo=false; cache=true -### {{{newtons_method_poor_x0}}} -caption = """ - -Illustration of Newton's Method converging to a zero of a function, -but slowly, as the initial guess is very poor and not close to the -zero. The algorithm does converge in this illustration, but not quickly and not to the nearest root from -the initial guess. - -""" - -fn, a, b, c = x -> sin(x) - x/4, -15, 20, 2pi - -n = 20 -anim = @animate for i=1:n - newtons_method_graph(i-1, fn, a, b, c) -end - -imgfile = tempname() * ".gif" -gif(anim, imgfile, fps = 2) - -ImageFile(imgfile, caption) -``` - -### The first derivative is too close to ``0`` - -```julia; hold=true; echo=false; cache=true -# {{{newtons_method_flat}}} -caption = L""" - -Illustration of Newton's method failing to converge, as for some $x_i$, -$f'(x_i)$ is too close to ``0``. In this instance after a few steps, the -algorithm just cycles around the local minimum near $0.66$. The values -of $x_i$ repeat in the pattern: $1.0002, 0.7503, -0.0833, 1.0002, -\dots$. This is also an illustration of a poor initial guess. If there -is a local minimum or maximum between the guess and the zero, such -cycles can occur. - -""" - -fn, a, b, c = x -> x^5 - x + 1, -1.5, 1.4, 0.0 - -n=7 -anim = @animate for i=1:n - newtons_method_graph(i-1, fn, a, b, c) -end -imgfile = tempname() * ".gif" -gif(anim, imgfile, fps = 1) - -ImageFile(imgfile, caption) -``` - -### The second derivative is too big - -```julia; hold=true; echo=false; cache=true -# {{{newtons_method_cycle}}} - -fn, a, b, c, = x -> abs(x)^(0.49), -2, 2, 1.0 -caption = L""" - -Illustration of Newton's Method not converging.
Here the second -derivative is too big near the zero - it blows up near $0$ - and the -convergence does not occur. Rather, the iterates increase in their -distance from the zero. - -""" - -n=10 -anim = @animate for i=1:n - newtons_method_graph(i-1, fn, a, b, c) -end - -imgfile = tempname() * ".gif" -gif(anim, imgfile, fps = 2) - -ImageFile(imgfile, caption) -``` - -### The tangent line at some xᵢ is flat - -```julia; hold=true; echo=false; cache=true -# {{{newtons_method_wilkinson}}} - -caption = L""" - -The function $f(x) = x^{20} - 1$ has two bad behaviours for Newton's -method: for $x < 1$ the derivative is nearly $0$ and for $x>1$ the -second derivative is very big. In this illustration, we have an -initial guess of $x_0=8/9$. As the tangent line is fairly flat, the -next approximation is far away, $x_1 = 1.313\dots$. As this guess is -much bigger than $1$, the ratio $f(x)/f'(x) \approx -x^{20}/(20x^{19}) = x/20$, so $x_{i+1} \approx (19/20)x_i$, -yielding slow, linear convergence until $f''(x_i)$ is moderate. For -this function, starting at $x_0=8/9$ takes 11 steps, at $x_0=7/8$ -takes 13 steps, at $x_0=3/4$ takes ``55`` steps, and at $x_0=1/2$ it takes -$204$ steps. - -""" - - -fn,a,b,c = x -> x^20 - 1, .7, 1.4, 8/9 -n = 10 - -anim = @animate for i=1:n - newtons_method_graph(i-1, fn, a, b, c) -end -imgfile = tempname() * ".gif" -gif(anim, imgfile, fps = 1) - -ImageFile(imgfile, caption) -``` - -###### Example - -Suppose $\alpha$ is a simple zero for $f(x)$. (The value $\alpha$ is -a zero of multiplicity $k$ if $f(x) = (x-\alpha)^kg(x)$ where -$g(\alpha)$ is not zero. A simple zero has multiplicity $1$. If -$f'(\alpha) \neq 0$ and the second derivative exists, then a zero -$\alpha$ will be simple.) Around $\alpha$, quadratic convergence should -apply. However, consider the function $g(x) = f(x)^k$ for some integer -$k \geq 2$. Then $\alpha$ is still a zero, but the derivative of $g$ -at $\alpha$ is zero, so the tangent line is basically flat.
This will -slow the convergence. We can see that the update step $g(x)/g'(x)$ -becomes $(1/k) f(x)/f'(x)$, so an extra factor is introduced. - -The calculation that produces the quadratic convergence now becomes: - -```math -x_{i+1} - \alpha = (x_i - \alpha) - \frac{1}{k}\left(x_i-\alpha - \frac{f''(\xi)}{2f'(x_i)}(x_i-\alpha)^2\right) = -\frac{k-1}{k} (x_i-\alpha) + \frac{f''(\xi)}{2kf'(x_i)}(x_i-\alpha)^2. -``` - -As $k > 1$, the $(x_i - \alpha)$ term dominates, and we see the -convergence is linear with $\lvert e_{i+1}\rvert \approx (k-1)/k -\lvert e_i\rvert$. - - - -## Questions - -###### Question - -Look at this graph with $x_0$ marked with a point: - -```julia; hold=true; echo=false -import SpecialFunctions: airyai -p = plot(airyai, -3.3, 0, legend=false); -plot!(p, zero, -3.3, 0); -scatter!(p, [-2.8], [0], color=:orange, markersize=5); -annotate!(p, [(-2.8, 0.2, "x₀")]) -p -``` - -If one step of Newton's method was used, what would be the value of $x_1$? - -```julia; hold=true; echo=false -choices = ["``-2.224``", "``-2.80``", "``-0.020``", "``0.355``"] -answ = 1 -radioq(choices, answ, keep_order=true) -``` - -###### Question - -Look at this graph of some increasing, concave up $f(x)$ with initial point $x_0$ marked. Let ``\alpha`` be the zero.
- -```julia; hold=true; echo=false -p = plot(x -> x^2 - 2, .75, 2.2, legend=false); -plot!(p, zero, .75, 2.2, color=:green); -scatter!(p, [2],[0], color=:orange, markersize=5); -annotate!(p, [(2,.25, "x₀"), (sqrt(2), .2, "α")]); -p -``` - - -What can be said about $x_1$? - -```julia; hold=true; echo=false -choices = [ -L"It must be $x_1 < \alpha$", -L"It must be $x_1 > x_0$", -L"It must be $\alpha < x_1 < x_0$" -] -answ = 3 -radioq(choices, answ) -``` - ----- - -Suppose $f(x)$ is increasing and concave up. From the tangent line representation: $f(x) = f(c) + f'(c)\cdot(x-c) + f''(\xi)/2 \cdot(x-c)^2$, explain why it must be that the graph of $f(x)$ lies on or *above* the tangent line. - -```julia; hold=true; echo=false -choices = [ -L"As $f''(\xi)/2 \cdot(x-c)^2$ is non-negative, we must have $f(x) - (f(c) + f'(c)\cdot(x-c)) \geq 0$.", -L"As $f''(\xi) < 0$ it must be that $f(x) - (f(c) + f'(c)\cdot(x-c)) \geq 0$.", -L"This isn't true. The function $f(x) = x^3$ at $x=0$ provides a counterexample" -] -answ = 1 -radioq(choices, answ) -``` - -This question can be used to give a proof for the previous two questions, which can be answered by considering the graphs alone. Combined, they say that if a function is increasing and concave up and ``\alpha`` is a zero, then if ``x_0 < \alpha`` it will be ``x_1 > \alpha``, and for any ``x_i > \alpha``, ``\alpha \leq x_{i+1} \leq x_i``, so the sequence in Newton's method is decreasing and bounded below; conditions which mathematically guarantee convergence. - - -###### Question - - -Let $f(x) = x^2 - 3^x$. This has derivative $2x - 3^x \cdot -\log(3)$. Starting with $x_0=0$, what does Newton's method converge on? - -```julia; hold=true; echo=false -f(x) = x^2 - 3^x; -fp(x) = 2x - 3^x*log(3); -val = Roots.newton(f, fp, 0); -numericq(val, 1e-14) -``` - -###### Question - - -Let $f(x) = \exp(x) - x^4$. There are 3 zeros for this function. Which one does Newton's method converge to when $x_0=2$?
- - - -```julia; hold=true; echo=false -f(x) = exp(x) - x^4; -fp(x) = exp(x) - 4x^3; -xstar= Roots.newton(f, fp, 2); -numericq(xstar, 1e-1) -``` - -###### Question - - - -Let $f(x) = \exp(x) - x^4$. As mentioned, there are 3 zeros for this function. Which one does Newton's method converge to when $x_0=8$? - - - -```julia; hold=true; echo=false -f(x) = exp(x) - x^4; -fp(x) = exp(x) - 4x^3; -xstar = Roots.newton(f, fp, 8); -numericq(xstar, 1e-1) -``` - -###### Question - - -Let $f(x) = \sin(x) - \cos(4\cdot x)$. - -Starting at $\pi/8$, solve for the root returned by Newton's method - - -```julia; hold=true; echo=false -k1=4 -f(x) = sin(x) - cos(k1*x); -fp(x) = cos(x) + k1*sin(k1*x); -val = Roots.newton(f, fp, pi/(2k1)); -numericq(val) -``` - - -###### Question - -Using Newton's method find a root to $f(x) = \cos(x) - x^3$ starting at $x_0 = 1/2$. - -```julia; hold=true; echo=false -f(x) = cos(x) - x^3 -val = Roots.newton(f,f', 1/2) -numericq(val) -``` - - - -###### Question - -Use Newton's method to find a root of $f(x) = x^5 + x -1$. Make a quick graph to find a reasonable starting point. - -```julia; hold=true; echo=false -f(x) = x^5 + x - 1 -val = Roots.newton(f,f', -1) -numericq(val) -``` - -###### Question - - - -```julia; hold=true;echo=false -##Consider the following illustration of Newton's method: -caption = """ -Illustration of Newton's method. Moving the point ``x_0`` shows different behaviours of the algorithm. -""" -## JSXGraph(:derivatives, "newtons-method.js", caption) -nothing -``` - -For the following graph, graphically consider the algorithm for a few different starting points. - -```julia; hold=true; echo=false -# placeholder until CWJ bumps up a version? -plot(x -> x^5 - x - 1, -1, 2) -``` - -If ``x_0`` is ``1`` what occurs? - -```julia;echo=false -nm_choices = [ -"The algorithm converges very quickly. A good initial point was chosen.", -"The algorithm converges, but slowly. 
The initial point is close enough to the answer to ensure decreasing errors.", -"The algorithm fails to converge, as it cycles about" -] -radioq(nm_choices, 1, keep_order=true) -``` - - -When ``x_0 = 1.0`` the following values are true for ``f``: - -```julia; echo=false -ff(x) = x^5 - x - 1 -α = find_zero(ff, 1) -function error_terms(x) - (e₀=x-α, f₀′= ff'(x), f̄₀′′=ff''(α), ē₁ = 1/2*ff''(α)/ff'(x)*(x-α)^2) -end -error_terms(1.0) -``` - -Where the values `f̄₀′′` and `ē₁` are worst-case estimates when ``\xi`` is between ``x_0`` and the zero. - -Does the magnitude of the error increase or decrease in the first step? - -```julia; hold=true; echo=false -radioq(["Appears to increase", "It decreases"],2,keep_order=true) -``` - - -If ``x_0`` is set near ``0.40`` what happens? - - -```julia; hold=true;echo=false -radioq(nm_choices, 3, keep_order=true) -``` - -When ``x_0 = 0.4`` the following values are true for ``f``: - -```julia; hold=true; echo=false -error_terms(0.4) -``` - -Where the values `f̄₀′′` and `ē₁` are worst-case estimates when ``\xi`` is between ``x_0`` and the zero. - -Does the magnitude of the error increase or decrease in the first step? - -```julia; hold=true;echo=false -radioq(["Appears to increase", "It decreases"],1,keep_order=true) -``` - - - -If ``x_0`` is set near ``0.75`` what happens? - - -```julia; hold=true;echo=false -radioq(nm_choices, 3, keep_order=true) -``` - - - - - -###### Question - -Will Newton's method converge for the function $f(x) = x^5 - x + 1$ starting at $x=1$? - -```julia; hold=true; echo=false -choices = [ -"Yes", -"No. The initial guess is not close enough", -"No. The second derivative is too big", -L"No. The first derivative gets too close to $0$ for one of the $x_i$"] -answ = 2 -radioq(choices, answ, keep_order=true) -``` - - - -###### Question - -Will Newton's method converge for the function $f(x) = 4x^5 - x + 1$ starting at $x=1$? - - -```julia; hold=true; echo=false -choices = [ -"Yes", -"No. 
The initial guess is not close enough", -"No. The second derivative is too big, or does not exist", -L"No. The first derivative gets too close to $0$ for one of the $x_i$"] -answ = 2 -radioq(choices, answ, keep_order=true) -``` - -###### Question - -Will Newton's method converge for the function $f(x) = x^{10} - 2x^3 - x + 1$ starting from $0.25$? - - -```julia; hold=true; echo=false -choices = [ -"Yes", -"No. The initial guess is not close enough", -"No. The second derivative is too big, or does not exist", -L"No. The first derivative gets too close to $0$ for one of the $x_i$"] -answ = 1 -radioq(choices, answ, keep_order=true) -``` - -###### Question - -Will Newton's method converge for $f(x) = 20x/(100 x^2 + 1)$ starting at $0.1$? - - -```julia; hold=true; echo=false -choices = [ -"Yes", -"No. The initial guess is not close enough", -"No. The second derivative is too big, or does not exist", -L"No. The first derivative gets too close to $0$ for one of the $x_i$"] -answ = 4 -radioq(choices, answ, keep_order=true) -``` - - - -###### Question - -Will Newton's method converge to a zero for $f(x) = \sqrt{(1 - x^2)^2}$? - - -```julia; hold=true; echo=false -choices = [ -"Yes", -"No. The initial guess is not close enough", -"No. The second derivative is too big, or does not exist", -L"No. The first derivative gets too close to $0$ for one of the $x_i$"] -answ = 3 -radioq(choices, answ, keep_order=true) -``` - -###### Question - -Use Newton's method to find a root of $f(x) = 4x^4 - 5x^3 + 4x^2 -20x -6$ starting at $x_0 = 0$. - -```julia; hold=true; echo=false -f(x) = 4x^4 - 5x^3 + 4x^2 -20x -6 -val = find_zero((f,f') , 0, Roots.Newton()) -numericq(val) -``` - -###### Question - -Use Newton's method to find a zero of $f(x) = \sin(x) - x/2$ that is *bigger* than $0$. 
- -```julia; hold=true; echo=false -f(x) = sin(x) - x/2 -val = find_zero((f,f'), 2, Roots.Newton()) -numericq(val) -``` - - -###### Question - -The Newton baffler (defined below) is so named, as Newton's method will fail to find the root for most starting points. - -```julia; -function newton_baffler(x) - if ( x - 0.0 ) < -0.25 - 0.75 * ( x - 0 ) - 0.3125 - elseif ( x - 0 ) < 0.25 - 2.0 * ( x - 0 ) - else - 0.75 * ( x - 0 ) + 0.3125 - end -end -``` - - -Will Newton's method find the zero at $0.0$ starting at $1$? - - -```julia; hold=true; echo=false -yesnoq("no") -``` - -Considering this plot: - -```julia; hold=true; -plot(newton_baffler, -1.1, 1.1) -``` - -Starting with $x_0=1$, you can see why Newton's method will fail. Why? - -```julia; hold=true; echo=false -choices = [ -L"It doesn't fail, it converges to $0$", -L"The tangent lines for $|x| > 0.25$ intersect at $x$ values with $|x| > 0.25$", -L"The first derivative is $0$ at $1$" -] -answ = 2 -radioq(choices, answ) -``` - - -This function does not have a small first derivative or a large second derivative, and the bump can be made as close to the origin as desired, so the starting point can be very close to the zero. Even so, the error-term analysis does not apply, as ``f`` is not continuously differentiable. - - -###### Question - -Let $f(x) = \sin(x) - x/4$. Starting at $x_0 = 2\pi$ Newton's method will converge to a value, but it will take many steps. Using the argument `verbose=true` for `find_zero`, how many steps does it take? - -```julia; hold=true; echo=false -f(x) = sin(x) - x/4 -x₀ = 2π -tracks = Roots.Tracks() -find_zero((f,f'), x₀, Roots.Newton(); tracks=tracks) -val = tracks.steps -numericq(val, 2) -``` - -What is the zero that is found? - -```julia; hold=true; echo=false -val = Roots.newton(f,f', 2pi) -numericq(val) -``` - -Is this the closest zero to the starting point, $x_0$?
- -```julia; hold=true; echo=false -yesnoq("no") -``` - -###### Question - -Quadratic convergence of Newton's method only applies to *simple* -roots. For example, we can see (using the `verbose=true` argument to -the `Roots` package's `newton` method) that it only takes $4$ steps to -find a zero to $f(x) = \cos(x) - x$ starting at $x_0 = 1$. But it takes -many more steps to find the same zero for $f(x) = (\cos(x) - x)^2$. - -How many? - -```julia; hold=true; echo=false -val = 24 -numericq(val, 2) -``` - -###### Question: Implicit equations - -The equation $x^2 + x\cdot y + y^2 = 1$ is a rotated ellipse. - -```julia; hold=true; echo=false - -f(x,y) = x^2 + x * y + y^2 - 1 -implicit_plot(f, xlims=(-2,2), ylims=(-2,2), legend=false) -``` - -Can we find which point on its graph has the largest $y$ value? - -This would be straightforward *if* we could write $y(x) = \dots$, for then we would simply find the critical points and investigate. But we can't so easily solve for $y$ in terms of $x$. However, we can use Newton's method to do so: - -```julia; -function findy(x) - fn = y -> (x^2 + x*y + y^2) - 1 - fp = y -> (x + 2y) - find_zero((fn, fp), sqrt(1 - x^2), Roots.Newton()) -end -``` - -For a *fixed* $x$, this solves for $y$ in the equation: $F(y) = x^2 + x \cdot y + y^2 - 1 = 0$. It should be that $(x,y)$ is a solution: - -```julia; hold=true; -x = .75 -y = findy(x) -x^2 + x*y + y^2 ## is this 1? -``` - -So we have a means to find $y(x)$, but it is implicit. - -Using `find_zero`, find the value $x$ which maximizes `y` by finding a zero of `y'`. Use this to find the point $(x,y)$ with largest $y$ value.
- -```julia; hold=true; echo=false -xstar = find_zero(findy', 0.5) -ystar = findy(xstar) -choices = ["``(-0.57735, 1.15470)``", - "``(0,0)``", - "``(0, -0.57735)``", - "``(0.57735, 0.57735)``"] -answ = 1 -radioq(choices, answ) -``` - -(Using automatic derivatives works for values identified with `find_zero` *as long as* the initial point has its type the same as that of `x`.) - -###### Question - -In the last problem we used an *approximate* derivative (forward difference) in place of the derivative. -This can introduce an error due to the approximation. Would Newton's method still converge if the derivative in the algorithm were replaced with an approximate derivative? In general, this can often be done *but* the convergence can be *slower* and the sensitivity to a poor initial guess even greater. - -Three common approximations are given by the -difference quotient for a fixed $h$: $f'(x_i) \approx (f(x_i+h)-f(x_i))/h$; -the secant line approximation: $f'(x_i) \approx (f(x_i) - f(x_{i-1})) / (x_i - x_{i-1})$; and the -Steffensen approximation $f'(x_i) \approx (f(x_i + f(x_i)) - f(x_i)) / f(x_i)$ (using $h=f(x_i)$). - - -Let's revisit the $4$-step convergence of Newton's method to the root of $f(x) = 1/x - q$ when $q=0.8$. Will these methods be as fast? 
- - -Let's define the above approximations for a given function, here called `fq`: - -```julia; -q₀ = 0.8 -fq(x) = 1/x - q₀ -secant_approx(x0,x1) = (fq(x1) - fq(x0)) / (x1 - x0) -diffq_approx(x0, h) = secant_approx(x0, x0+h) -steff_approx(x0) = diffq_approx(x0, fq(x0)) -``` - -Then using the difference quotient would look like: - -```julia; hold=true; -Δ = 1e-6 -x1 = 42/17 - 32/17 * q₀ -x1 = x1 - fq(x1) / diffq_approx(x1, Δ) # |x1 - xstar| = 0.06511395862036995 -x1 = x1 - fq(x1) / diffq_approx(x1, Δ) # |x1 - xstar| = 0.003391809999860218; etc -``` - -The Steffensen method would look like: - -```julia; hold=true; -x1 = 42/17 - 32/17 * q₀ -x1 = x1 - fq(x1) / steff_approx(x1) # |x1 - xstar| = 0.011117056291670258 -x1 = x1 - fq(x1) / steff_approx(x1) # |x1 - xstar| = 3.502579696146313e-5; etc. -``` - -And the secant method like: - -```julia; hold=true; -Δ = 1e-6 -x1 = 42/17 - 32/17 * q₀ -x0 = x1 - Δ # we need two initial values -x0, x1 = x1, x1 - fq(x1) / secant_approx(x0, x1) # |x1 - xstar| = 8.222358365284066e-6 -x0, x1 = x1, x1 - fq(x1) / secant_approx(x0, x1) # |x1 - xstar| = 1.8766323799379592e-6; etc. -``` - -Repeat each of the above algorithms until `abs(x1 - 1.25)` is `0` (which will happen for this problem, though not in general). Record the steps. - -* Does the difference quotient need *more* than $4$ steps? - -```julia; hold=true; echo=false -yesnoq(false) -``` - -* Does the secant method need *more* than $4$ steps? - -```julia; hold=true; echo=false -yesnoq(true) -``` - -* Does the Steffensen method need *more* than 4 steps? - -```julia; hold=true; echo=false -yesnoq(false) -``` - -All methods work quickly with this well-behaved problem. In general -the convergence rates are slightly different for each, with the -Steffensen method matching Newton's method and the difference quotient -method being slower in general. All can be more sensitive to the initial guess.
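The iteration at the heart of all of these variants can be written in a few lines. The following is a minimal sketch of the textbook Newton update (not the implementation behind `Roots.Newton()`), applied to $f(x) = \cos(x) - x$, the example used above:

```julia
# A bare-bones Newton iteration, keeping each xᵢ so the errors can be
# inspected afterwards.
function newton_steps(f, fp, x0; nsteps=5)
    xs = [float(x0)]
    for _ in 1:nsteps
        x = xs[end]
        push!(xs, x - f(x) / fp(x))   # the Newton update xᵢ₊₁ = xᵢ - f(xᵢ)/f′(xᵢ)
    end
    xs
end

f(x) = cos(x) - x
fp(x) = -sin(x) - 1
xs = newton_steps(f, fp, 1.0)
# with this good starting point the residuals f(xᵢ) shrink quadratically,
# reaching machine precision in about 4 steps
```

Printing `abs.(f.(xs))` shows the number of correct digits roughly doubling each step, matching the quadratic-convergence analysis above.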
diff --git a/CwJ/derivatives/numeric_derivatives.jmd b/CwJ/derivatives/numeric_derivatives.jmd deleted file mode 100644 index 65e4547..0000000 --- a/CwJ/derivatives/numeric_derivatives.jmd +++ /dev/null @@ -1,347 +0,0 @@ -# Numeric derivatives - -This section uses these add-on packages: - -```julia -using CalculusWithJulia -using Plots -using ForwardDiff -using SymPy -using Roots -``` - -```julia; echo=false; results="hidden" -using CalculusWithJulia.WeaveSupport - -const frontmatter = ( - title = "Numeric derivatives", - description = "Calculus with Julia: Numeric derivatives", - tags = ["CalculusWithJulia", "derivatives", "numeric derivatives"], -); -nothing -``` - ----- - -`SymPy` returns symbolic derivatives. Up to choices of simplification, these answers match those that would be derived by hand. This is useful when comparing with known answers and for seeing the structure of the answer. However, there are times we just want to work with the answer numerically. For that we have other options within `Julia`. We discuss approximate derivatives and automatic derivatives. The latter will find wide usage in these notes. - -### Approximate derivatives - -By approximating the limit of the secant line with a value for a small, but positive, $h$, we get an approximation to the derivative. That is - -```math -f'(x) \approx \frac{f(x+h) - f(x)}{h}. -``` - -This is the forward-difference approximation. The central difference approximation looks both ways: - -```math -f'(x) \approx \frac{f(x+h) - f(x-h)}{2h}. -``` - -Though in general they are different, they are both -approximations. The central difference is usually more accurate for the -same size $h$. However, both are susceptible to round-off errors. The -numerator is a subtraction of like-size numbers - a perfect -opportunity to lose precision. - -As such there is a balancing act: - -* if $h$ is too small the round-off errors are problematic, -* if $h$ is too big, the approximation to the limit is not good. 
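This balancing act can be seen numerically. A small sketch (plain `Julia`, no packages) compares the two approximations for $f(x) = \sin(x)$ at $x=1$, where the exact derivative, $\cos(1)$, is known:

```julia
f(x) = sin(x)
c = 1.0
exact = cos(c)                # the known derivative of sin at c

# scan h: the error first shrinks with h, then grows again once
# floating-point cancellation in f(c+h) - f(c) dominates
errs = map([1e-2, 1e-4, 1e-6, 1e-8, 1e-10, 1e-12]) do h
    fwd = (f(c + h) - f(c)) / h          # forward difference
    cen = (f(c + h) - f(c - h)) / (2h)   # central difference
    (h = h, forward = abs(fwd - exact), central = abs(cen - exact))
end
```

For this function the forward-difference error bottoms out near $h \approx 10^{-8}$ and the central-difference error near $h \approx 10^{-6}$, in line with the rules of thumb given for each method.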
- -For the forward -difference, $h$ values around $10^{-8}$ are typically good; for the central -difference, values around $10^{-6}$ are typically good. - -##### Example - -Let's verify that the forward difference isn't too far off. - -```julia; -f(x) = exp(-x^2/2) -c = 1 -h = 1e-8 -fapprox = (f(c+h) - f(c)) / h -``` - -We can compare to the actual with: - -```julia; -@syms x -df = diff(f(x), x) -factual = N(df(c)) -abs(factual - fapprox) -``` - -The error is about ``1`` part in ``100`` million. - -The central difference is better here: - -```julia; hold=true -h = 1e-6 -cdapprox = (f(c+h) - f(c-h)) / (2h) -abs(factual - cdapprox) -``` - ----- - -The [FiniteDifferences](https://github.com/JuliaDiff/FiniteDifferences.jl) and [FiniteDiff](https://github.com/JuliaDiff/FiniteDiff.jl) packages provide performant interfaces for differentiation based on finite differences. - -### Automatic derivatives - -There are some other ways to compute derivatives numerically that give -much more accuracy at the expense of slightly increased computing -time. Automatic differentiation is the general name for a few -different approaches. These approaches promise less complexity - in -some cases - than symbolic derivatives and more accuracy than -approximate derivatives; the accuracy is on the order of -machine precision. - -The `ForwardDiff` package provides one of [several](https://juliadiff.org/) ways for `Julia` to compute automatic derivatives. `ForwardDiff` is well suited for functions encountered in these notes, which depend on at most a few variables and output no more than a few values at once. - - -The `ForwardDiff` package was loaded in this section; in general its features are available when the `CalculusWithJulia` package is loaded, as that package provides a more convenient interface. -The `derivative` function is not exported by `ForwardDiff`, so its usage requires qualification.
To illustrate, to find the derivative of $f(x)$ at a *point* we have this syntax: - -```julia -ForwardDiff.derivative(f, c) # derivative is qualified by a module name -``` - -The `CalculusWithJulia` package defines an operator `D` which goes from finding a derivative at a point with `ForwardDiff.derivative` to defining a function which evaluates the derivative at each point. It is defined along the lines of `D(f) = x -> ForwardDiff.derivative(f,x)` in parallel to how the derivative operation for a function is defined mathematically from the definition for its value at a point. - - -Here we see the error in estimating $f'(1)$: - -```julia; -fauto = D(f)(c) # D(f) is a function, D(f)(c) is the function called on c -abs(factual - fauto) -``` - -In this case, it is exact. - - -The `D` operator is defined for most all functions in `Julia`, though, like the `diff` operator in `SymPy` there are some for which it won't work. - -##### Example - -For $f(x) = \sqrt{1 + \sin(\cos(x))}$ compare the difference between the forward derivative with $h=1e-8$ and that computed by `D` at $x=\pi/4$. - -The forward derivative is found with: - -```julia; -𝒇(x) = sqrt(1 + sin(cos(x))) -𝒄, 𝒉 = pi/4, 1e-8 -fwd = (𝒇(𝒄+𝒉) - 𝒇(𝒄))/𝒉 -``` - -That given by `D` is: - -```julia; -ds_value = D(𝒇)(𝒄) -ds_value, fwd, ds_value - fwd -``` - -Finally, `SymPy` gives an exact value we use to compare: - -```julia; -𝒇𝒑 = diff(𝒇(x), x) -``` - -```julia -actual = N(𝒇𝒑(PI/4)) -actual - ds_value, actual - fwd -``` - -#### Convenient notation - -`Julia` allows the possibility of extending functions to different -types. Out of the box, the `'` notation is not employed for functions, -but is used for matrices. It is used in postfix position, as with -`A'`. We can define it to do the same thing as `D` for functions and -then, we can evaluate derivatives with the familiar `f'(x)`. -This is done in `CalculusWithJulia` along the lines of `Base.adjoint(f::Function) = D(f)`. 
- - -Then, we have, for example: - -```julia; hold=true; -f(x) = sin(x) -f'(pi), f''(pi) -``` - - - - - -##### Example - -Suppose our task is to find a zero of the second derivative of $k(x) = -e^{-x^2/2}$ in $[0, 10]$, a known bracket. The `D` function takes a second argument to indicate the order of the derivative (e.g., `D(f,2)`), but we use the more familiar notation: - - -```julia; hold=true -k(x) = exp(-x^2/2) -find_zero(k'', 0..10) -``` - -We pass in the function object, `k''`, and not the evaluated function. - - -## Recap on derivatives in Julia - -A quick summary for finding derivatives in `Julia`, as there are $3$ different manners: - -* Symbolic derivatives are found using `diff` from `SymPy` -* Automatic derivatives are found using the notation `f'` using `ForwardDiff.derivative` -* approximate derivatives at a point, `c`, for a given `h` are found with `(f(c+h)-f(c))/h`. - - -For example, here all three are computed and compared: - -```julia; hold=true -f(x) = exp(-x)*sin(x) - -c = pi -h = 1e-8 - -fp = diff(f(x),x) - -fp, fp(c), f'(c), (f(c+h) - f(c))/h -``` - -!!! note - The use of `'` to find derivatives provided by `CalculusWithJulia` is convenient, and used extensively in these notes, but it needs to be noted that it does **not conform** with the generic meaning of `'` within `Julia`'s wider package ecosystem and may cause issue with linear algebra operations; the symbol is meant for the adjoint of a matrix. 
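The pattern behind `D` and the `'` notation is compact enough to sketch. The snippet below mirrors the definitions described above, but substitutes a central difference for `ForwardDiff.derivative` so that it is self-contained and needs no packages; the names `Dapprox` and `derivative_at` are illustrative, not part of `CalculusWithJulia` (whose `D` is accurate to machine precision):

```julia
# central difference standing in for ForwardDiff.derivative
derivative_at(f, x; h=1e-6) = (f(x + h) - f(x - h)) / (2h)

# derivative at a point -> derivative *function*, mirroring D(f) = x -> ...
Dapprox(f) = x -> derivative_at(f, x)

# overloading adjoint is what makes the f'(x) notation work for functions
Base.adjoint(f::Function) = Dapprox(f)

g(x) = sin(x)
g'(pi)    # approximately cos(pi), i.e. close to -1
```

The one-line `Base.adjoint` overload is the entire mechanism: after it, `g'` is an ordinary function that can be passed to `plot`, `find_zero`, and so on.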
- - -## Questions - - -##### Question - -Find the derivative using a forward difference approximation of $f(x) = x^x$ at the point $x=2$ using `h=0.1`: - -```julia; hold=true; echo=false -f(x) = x^x -c, h = 2, 0.1 -val = (f(c+h) - f(c))/h -numericq(val) -``` - -Using `D` or `f'` find the value using automatic differentiation: - -```julia; hold=true; echo=false -f(x) = x^x -c = 2 -val = f'(c) -numericq(val) -``` - - - -##### Question - -Mathematically, as the value of `h` in the forward difference gets -smaller the forward difference approximation gets better. On the -computer, this is thwarted by floating point representation issues (in -particular the error in subtracting two like-sized numbers in forming -$f(x+h)-f(x)$). - -For `1e-16` what is the error (in absolute value) in finding the forward difference -approximation for the derivative of $\sin(x)$ at $x=0$? - -```julia; hold=true; echo=false -f(x) = sin(x) -h = 1e-16 -c = 0 -approx = (f(c+h)-f(c))/h -val = abs(cos(0) - approx) -numericq(val) -``` - -Repeat for $x=\pi/4$: - - -```julia; hold=true; echo=false -f(x) = sin(x) -h = 1e-16 -c = pi/4 -approx = (f(c+h)-f(c))/h -val = abs(cos(c) - approx) -numericq(val) -``` - - - - - -###### Question - -Let $f(x) = x^x$. Using `D`, find $f'(3)$. - -```julia; hold=true; echo=false -f(x) = x^x -val = D(f)(3) -numericq(val) -``` - -###### Question - - -Let $f(x) = \lvert 1 - \sqrt{1 + x}\rvert$. Using `D`, find $f'(3)$. - -```julia; hold=true; echo=false -f(x) = abs(1 - sqrt(1 + x)) -val = D(f)(3) -numericq(val) -``` - - -###### Question - - -Let $f(x) = e^{\sin(x)}$. Using `D`, find $f'(3)$. - -```julia; hold=true; echo=false -f(x) = exp(sin(x)) -val = D(f)(3) -numericq(val) -``` - - - - -###### Question - -For the `airyai` function (available through the `SpecialFunctions` package), find the forward-difference approximation to $f'(3)$ using $c=3$ and $h=10^{-8}$.
- -```julia; hold=true; echo=false -h = 1e-8 -c = 3 -val = (airyai(c+h) - airyai(c))/h -numericq(val) -``` - - -###### Question - -Find the rate of change with respect to time of the function $f(t)= 64 - 16t^2$ at $t=1$. - -```julia; hold=true; echo=false -fp_(t) = -16*2*t -c = 1 -numericq(fp_(c)) -``` - -###### Question - -Find the rate of change with respect to height, $h$, of $f(h) = 32h^3 - 62 h + 12$ at $h=2$. - -```julia; hold=true; echo=false -fp_(h) = 3*32h^2 - 62 -c = 2 -numericq(fp_(2)) -``` diff --git a/CwJ/derivatives/optimization-trapezoid.js b/CwJ/derivatives/optimization-trapezoid.js deleted file mode 100644 index eae3b4e..0000000 --- a/CwJ/derivatives/optimization-trapezoid.js +++ /dev/null @@ -1,36 +0,0 @@ -// inscribe trapezoid -var R = 5; -var Delta = 0.5 -const b = JXG.JSXGraph.initBoard('jsxgraph', { - boundingbox: [-R-Delta,R+Delta,R+Delta,-1], axis:true -}); - -var xax = b.create("segment", [[0,0],[R,0]]); - -var P4 = b.create("glider", [R/2,0, xax], {name: "P_4=(r,0)"}); - - - -var CL = b.create('point', [function() {return -P4.X()},0], {name:''}); -var CR = b.create('point', [function() {return P4.X()},0], {name:''}); -var C = b.create('semicircle', [CL,CR]); - -var Crestricted = b.create("functiongraph", - [function(x) { - r = P4.X(); - y = Math.sqrt(r*r - x*x); - return y; - }, 0, function() {return P4.X()}]); - -var P3 = b.create("glider", [ - P4.X()/2, - Math.sqrt(P4.X()*P4.X()*(1 - 1/4)), - Crestricted], {name:"P_3=(x,y)"}); - -var P1 = b.create('point', [function() {return -Math.abs(P4.X());}, - function() {return P4.Y();}], {name:'P_1'}); -var P2 = b.create('point', [function() {return -P3.X();}, - function() {return P3.Y();}], {name:'P_2'}); - -var poly = b.create('polygon',[P1, P2, P3, P4], { borders:{strokeColor:'black'} }); -b.create('text',[-1.5,.25, function(){ return 'Area='+ poly.Area().toFixed(1); }]); diff --git a/CwJ/derivatives/optimization.jmd b/CwJ/derivatives/optimization.jmd deleted file mode 100644 index 
8ba8471..0000000 --- a/CwJ/derivatives/optimization.jmd +++ /dev/null @@ -1,1402 +0,0 @@ -# Optimization - -This section uses these add-on packages: - -```julia -using CalculusWithJulia -using Plots -using Roots -using SymPy -``` - -```julia; echo=false; results="hidden" -using CalculusWithJulia.WeaveSupport - -fig_size = (800, 600) -frontmatter = ( - title = "Optimization", - description = "Calculus with Julia: Optimization", - tags = ["CalculusWithJulia", "derivatives", "optimization"], -); - -nothing -``` - ----- - -A basic application of calculus is to answer -questions which relate to the largest or smallest value a function can attain given -some constraints. - - -For example, - - -> Of all rectangles with perimeter ``20``, which has the largest area? - - -The main tool is the extreme value theorem of Bolzano and Fermat's -theorem about critical points, which combined say: - -> If the function $f(x)$ is continuous on $[a,b]$ and differentiable -> on $(a,b)$, then the extrema exist and must -> occur at either an end point or a critical point. - - -Though not all of our problems lend themselves to a description of a -continuous function on a closed interval, if they do, we have an -algorithmic prescription to find the absolute extrema of a function: - -1) Find the critical points. For this we can use a root-finding algorithm like `find_zero`. - -2) Evaluate the function values at the critical points and at the end points. - -3) Identify the largest and smallest values. - -With the computer we can take some shortcuts, as we will be able to -graph our function to see where the extreme values will be, and in particular if they occur at end points or critical points. - -## Fixed perimeter and area - -The simplest way to investigate the maximum or minimum value of a -function over a closed interval is to just graph it and look. - -We began with the question of which rectangle of perimeter ``20`` has -the largest area.
The figure shows a few different rectangles with -this perimeter and their respective areas. - -```julia; hold=true; echo=false; cache=true -### {{{perimeter_area_graphic}}} - - -function perimeter_area_graphic_graph(n) - h = 1 + 2n - w = 10-h - plt = plot([0,0,w,w,0], [0,h,h,0,0], legend=false, size=fig_size, - xlim=(0,10), ylim=(0,10)) - scatter!(plt, [w], [h], color=:orange, markersize=5) - annotate!(plt, [(w/2, h/2, "Area=$(round(w*h,digits=1))")]) - plt -end - -caption = """ - -Some possible rectangles that satisfy the constraint on the perimeter and their area. - -""" -n = 6 -anim = @animate for i=1:n - perimeter_area_graphic_graph(i-1) -end - -imgfile = tempname() * ".gif" -gif(anim, imgfile, fps = 1) - -ImageFile(imgfile, caption) -``` - - -The basic mathematical approach is to find a function of a single -variable to maximize or minimize. In this case we have two variables -describing a rectangle: a base $b$ and height $h$. Our formulas are the area of a rectangle: - -```math -A = bh, -``` - -and the formula for the perimeter of a rectangle: - -```math -P = 2b + 2h = 20. -``` - -From this last one, we see that $b$ can be no bigger than ``10`` and no -smaller than ``0`` from the restriction put in place through the -perimeter. Solving for $h$ in terms of $b$ then yields this -restatement of the problem: - -Maximize $A(b) = b \cdot (10 - b)$ over the interval $[0,10]$. - -This is exactly the form needed to apply our theorem about the -existence of extrema (a continuous function on a closed -interval). Rather than solve analytically by taking a derivative, we -simply graph to find the value: - -```julia; -Area(b) = b * (10 - b) -plot(Area, 0, 10) -``` - -You should see the maximum occurs at $b=5$ by symmetry, so $h=5$ as -well, and the maximum area is then $25$. This gives the satisfying -answer that among all rectangles of fixed perimeter, that with the -largest area is a square. 
As well, this indicates a common result: there is often some underlying symmetry in the answer.

### Exploiting polymorphism

Before moving on, let's see a slightly different way to do this problem with `Julia`, where we trade off some algebra for a bit of abstraction. This technique was discussed in the section on [functions](../precalc/functions.html).

Let's first write area as a function of both base and height:

```julia;
A(b, h) = b * h
```

From the constraint given by the perimeter being a fixed value, we can solve for `h` in terms of `b`. We write this as a function:

```julia;
h(b) = (20 - 2b) / 2
```

To get `A(b)` we simply need to substitute `h(b)` into our formula for the area, `A`. However, instead of doing the substitution ourselves using algebra, we let `Julia` do it through composition of functions:

```julia;
A(b) = A(b, h(b))
```

Now we can solve graphically as before, or numerically, such as here where we search for zeros of the derivative:

```julia;
find_zeros(A', 0, 10)  # `find_zeros` is from the `Roots` package
```

(As a reminder, the notation `A'` is defined in `CalculusWithJulia` using the `derivative` function from the `ForwardDiff` package.)

!!! note
    Look at the last definition of `A`. The function `A` appears on both sides, though on the left side with one argument and on the right with two. These are two "methods" of a *generic* function, `A`. `Julia` allows multiple definitions for the same name as long as the arguments (their number and type) can disambiguate which to use. In this instance, when one argument is passed in, the last definition (`A(b, h(b))`) is used, whereas when two are passed in, the method that multiplies both arguments is used. This illustrates the advantage of multiple dispatch: the same concept - area - has one function name, though there may be different ways to compute the area, so there is more than one implementation.
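The multiple-dispatch pattern just described can be seen in a small, self-contained sketch. The names `area` and `height` below are new, chosen here only to avoid clashing with the section's `A` and `h`:

```julia
# A sketch of the two-method pattern: one generic function, two methods.
area(b, h) = b * h                 # two-argument method: the general formula
height(b) = (20 - 2b) / 2          # height from the perimeter constraint
area(b) = area(b, height(b))       # one-argument method: area along the constraint

@assert length(methods(area)) == 2  # the generic function `area` has two methods
@assert area(5) == 25               # b = 5 gives the maximal area of 25
@assert area(2, 3) == 6             # the two-argument method is still available
```

Which method runs is decided by the number of arguments at the call site, so the constrained, one-variable version and the general, two-variable version coexist under one name.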
- - -#### Example: Norman windows - -Here is a similar, though more complicated, example where the analytic -approach can be a bit more tedious, but the graphical one mostly -satisfying, though we do use a numerical algorithm to find -an exact final answer. - -Let a "[Norman](https://en.wikipedia.org/wiki/Norman_architecture)" -window consist of a rectangular window of top length $x$ and side -length $y$ and a half circle on top. The goal is to maximize the area -for a fixed value of the perimeter. Again, assume this perimeter is ``20`` -units. - -This figure shows two such windows, one with base length given by -$x=3$, the other with base length given by $x=4$. The one with base -length $4$ seems to have much bigger area, what value of $x$ will lead to the largest area? - -```julia; hold=true; echo=false -ts = range(0, stop=pi, length=50) -x1,y1 = 4, 4.85840 -x2,y2 = 3, 6.1438 -delta = 4 -p = plot(delta .+ x1*[0, 1,1,0], y1*[0,0,1,1], linetype=:polygon, fillcolor=:blue, legend=false) -plot!(p, x2*[0, 1,1,0], y2*[0,0,1,1], linetype=:polygon, fillcolor=:blue) - -plot!(p, delta .+ x1/2 .+ x1/2*cos.(ts), y1.+x1/2*sin.(ts), linetype=:polygon, fillcolor=:red) -plot!(p, x2/2 .+ x2/2*cos.(ts), y2 .+ x2/2*sin.(ts), linetype=:polygon, fillcolor=:red) -p -``` - -For this problem, we have two equations. - -The area is the area of the rectangle plus the area of the half circle ($\pi r^2/2$ with $r=x/2$). - -```math -A = xy + \pi(x/2)^2/2 -``` - -In `Julia` this is - -```julia; -Aᵣ(x, y) = x*y + pi*(x/2)^2 / 2 -``` - - -The perimeter consists of ``3`` sides of the rectangle and the perimeter -of half a circle ($\pi r$, with $r=x/2$): - -```math -P = 2y + x + \pi(x/2) = 20 -``` - -We solve for $y$ in the first with $y = (20 - x - \pi(x/2))/2$ so that in `Julia` we have: - -```julia; -y(x) = (20 - x - pi * x/2) / 2 -``` - -And then we substitute in `y(x)` for `y` in the area formula through: - -```julia; -Aᵣ(x) = Aᵣ(x, y(x)) -``` - -Of course both $x$ and $y$ are non-negative. 
The latter forces $x$ to -be no more than $x=20/(1+\pi/2)$. - -This leaves us the calculus problem of finding an absolute maximum of -a continuous function over the closed interval -$[0, 20/(1+\pi/2)]$. Our theorem tells us this maximum must occur, we -now proceed to find it. - -We begin by simply graphing and estimating the values of the maximum and -where it occurs. - -```julia; -plot(Aᵣ, 0, 20/(1+pi/2)) -``` - -The naked eye sees that maximum value is somewhere around $27$ and -occurs at $x\approx 5.6$. Clearly from the graph, we know the maximum -value happens at the critical point and there is only one such -critical point. - -As reading the maximum from the graph is more difficult than reading a -$0$ of a function, we plot the derivative using our approximate -derivative. - -```julia; -plot(Aᵣ', 5.5, 5.7) -``` - -We confirm that the critical point is around $5.6$. - -#### Using `find_zero` to locate critical points. - -Rather than zoom in graphically, we now use a root-finding algorithm, -to find a more precise value of the zero of ``A'``. We know that the -maximum will occur at a critical point, a zero of the derivative. The -`find_zero` function from the `Roots` package provides a non-linear -root-finding algorithm based on the bisection method. The only thing -to keep track of is that solving $f'(x) = 0$ means we use the -derivative and not the original function. - -We see from the graph that $[0, 20/(1+\pi/2)]$ will provide a bracket, as there is only one relative maximum: - -```julia; -x′ = find_zero(Aᵣ', (0, 20/(1+pi/2))) -``` - -This value is the lone critical point, and in this case gives -the position of the value that will maximize the function. The value and maximum area are then given by: - -```julia; -(x′, Aᵣ(x′)) -``` - -(Compare this answer to the previous, is the square the figure of -greatest area for a fixed perimeter, or just the figure amongst all -rectangles? 
See [Isoperimetric inequality](https://en.wikipedia.org/wiki/Isoperimetric_inequality) for an answer.)

### Using `argmax` to identify where a function is maximized

The value that maximizes a function is sometimes referred to as the *argmax*, or argument which maximizes the function. In `Julia` the `argmax(f, domain)` function is defined to "Return a value ``x`` in the domain of ``f`` for which ``f(x)`` is maximized. If there are multiple maximal values for ``f(x)`` then the first one will be found." The domain is some iterable collection. In the mathematical world this would be an interval ``[a,b]``, but on the computer it is a discrete approximation, such as is returned by `range` below. Without having to take a derivative, as above, but sacrificing some accuracy, the task of identifying an `x` for which `Aᵣ` is maximized could be done with

```julia
argmax(Aᵣ, range(0, 20/(1+pi/2), length=10000))
```

#### A symbolic approach

We could also do the above problem symbolically with the aid of `SymPy`. Here are the steps:

```julia;
@syms 𝒘::real 𝒉::real

𝑨₀ = 𝒘 * 𝒉 + pi * (𝒘/2)^2 / 2
𝑷erim = 2*𝒉 + 𝒘 + pi * 𝒘/2
𝒉₀ = solve(𝑷erim - 20, 𝒉)[1]
𝑨₁ = 𝑨₀(𝒉 => 𝒉₀)
𝒘₀ = solve(diff(𝑨₁,𝒘), 𝒘)[1]
```

We know that `𝒘₀` is the maximum in this example from our previous work. We shall see soon that just knowing the second derivative is negative at `𝒘₀` would suffice. Here we check that condition:

```julia;
diff(𝑨₁, 𝒘, 𝒘)(𝒘 => 𝒘₀)
```

As an aside, compare the steps involved above for a symbolic solution to those of the previous numeric solution:

```julia; hold=true
Aᵣ(w, h) = w*h + pi*(w/2)^2 / 2
h(w) = (20 - w - pi * w/2) / 2
Aᵣ(w) = Aᵣ(w, h(w))
find_zero(Aᵣ', (0, 20/(1+pi/2))) # 40 / (pi + 4)
```

They are similar, except we solved for `𝒉₀` symbolically, rather than by hand, as when we defined `h(w)`.
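The commented value `40 / (pi + 4)` can be checked by hand: with $y = 10 - x/2 - \pi x/4$, the composed area is $10x - x^2/2 - \pi x^2/8$, whose derivative $10 - (1 + \pi/4)x$ vanishes exactly at $x = 40/(\pi + 4)$. A quick self-contained sketch (the names `area`, `yfromx` are new, for illustration only):

```julia
# Check the closed-form critical point x = 40/(pi + 4) for the Norman window.
area(x, y) = x*y + pi*(x/2)^2 / 2        # rectangle plus half-disk
yfromx(x) = (20 - x - pi*x/2) / 2        # height from the perimeter constraint
area(x) = area(x, yfromx(x))

xstar = 40 / (pi + 4)                    # claimed critical point

# The derivative of the composed area is 10 - (1 + pi/4)x; it vanishes at xstar:
@assert abs(10 - (1 + pi/4) * xstar) < 1e-12

# A crude grid search agrees with the closed form:
xs = range(0.01, 20/(1 + pi/2) - 0.01, length=10_000)
xgrid = xs[argmax(area.(xs))]
@assert abs(xgrid - xstar) < 1e-2
```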
##### Example

```julia; hold=true; echo=false
caption = """
The figure shows a trapezoid inscribed in a circle. By adjusting the point ``P_3 = (x,y)`` on the upper-half circle, the area of the trapezoid changes. What value of ``x`` will produce the maximum area for a given ``r`` (from ``P_4``, which can also be adjusted)? By playing around with different values of ``P_3`` and ``P_4`` the answer can be guessed.
"""

#JSXGraph(:derivatives, "optimization-trapezoid.js", caption)
nothing
```

```julia; hold=true; echo=false
function trapezoid(r)
    plot(x -> sqrt(1 - x^2), -1, 1, legend=false, aspect_ratio=:equal)
    plot!([-1,1,r,-r,-1], [0,0,sqrt(1-r^2), sqrt(1-r^2), 0], lw=3, color=:red)
end
trapezoid(0.75)
```

A trapezoid is *inscribed* in the upper-half circle of radius ``r``. The trapezoid is formed by connecting the points ``(x,y)`` (in the first quadrant) with ``(r, 0)``, ``(-r,0)``, and ``(-x, y)``. Find the maximum area. (The above figure has ``x=0.75`` and ``r=1``.)

Here the constraint is simply ``r^2 = x^2 + y^2`` with ``x`` and ``y`` non-negative. The area is then the average of the two parallel lengths times the height. Using `hₜ` for the height ``y``, we have:

```julia
@syms x::positive r::positive
hₜ = sqrt(r^2 - x^2)
aₜ = (2x + 2r)/2 * hₜ
possible_sols = solve(diff(aₜ, x) ~ 0, x)  # possibly many solutions
x0 = first(possible_sols)                  # here the lone solution; `[1]` indexing also works
```

The other values of interest can be found through substitution. For example:

```julia
hₜ(x => x0)
```

## Trigonometry problems

Many maximization and minimization problems involve triangles, which in turn use trigonometry in their description. Here is an example, the "ladder corner problem." (There are many other [ladder](http://www.mathematische-basteleien.de/ladder.htm) problems.)

A ladder is to be moved through a two-dimensional hallway which has a bend and gets narrower after the bend.
The hallway is ``8`` feet wide then -``5`` feet wide. What is the longest such ladder that can be navigated -around the corner? - -The figure shows a ladder of length $l_1 + l_2$ that got stuck - it -was too long. - - -```julia; hold=true; echo=false -p = plot([0, 0, 15], [15, 0, 0], color=:blue, legend=false) -plot!(p, [5, 5, 15], [15, 8, 8], color=:blue) -plot!(p, [0,14.53402874075368], [12.1954981558864, 0], linewidth=3) -plot!(p, [0,5], [8,8], color=:orange) -plot!(p, [5,5], [0,8], color=:orange) -annotate!(p, [(13, 1/2, "θ"), - (2.5, 11, "l₂"), (10, 5, "l₁"), (2.5, 7.0, "l₂ ⋅ cos(θ)"), - (5.1, 4, "l₁ ⋅ sin(θ)")]) -``` - - - -We approach this problem in reverse. It is easy to see when a ladder -is too long. It gets stuck at some angle $\theta$. So for each -$\theta$ we find that ladder length that is just too long. Then we -find the minimum length of all these ladders that are too long. If a -ladder is this length or more it will get stuck for some -angle. However, if it is less than this length it will not get stuck. -So to maximize a ladder length, we minimize a different -function. Neat. - -Now, to find the length $l = l_1 + l_2$ as a function of $\theta$. - -We need to brush off our trigonometry, in particular right triangle -trigonometry. We see from the figure that $l_1$ is the hypotenuse of a -right triangle with opposite side $8$ and $l_2$ is the hypotenuse of a -right triangle with adjacent side $5$. So, $8/l_1 = \sin\theta$ and -$5/l_2 = \cos\theta$. - - -That is, we have - -```julia; -l(l1, l2) = l1 + l2 -l1(t) = 8/sin(t) -l2(t) = 5/cos(t) - -l(t) = l(l1(t), l2(t)) # or simply l(t) = 8/sin(t) + 5/cos(t) -``` - -Our goal is to minimize this function for all angles between $0$ and $90$ degrees, or $0$ and $\pi/2$ radians. - -This is not a continuous function on a closed interval - it is -undefined at the endpoints. 
That being said, a quick plot will convince us that the minimum occurs at a critical point and that there is only one critical point in $(0, \pi/2)$.

```julia;
delta = 0.2
plot(l, delta, pi/2 - delta)
```

The graph shows the minimum occurs between ``0.50`` and ``1.00`` radians, giving a bracket for the derivative. Here we find the critical point and the minimum value:

```julia; hold=true;
x = find_zero(l', (0.5, 1.0))
x, l(x)
```

That is, any ladder less than this length can get around the hallway.

## Rate times time problems

Ethan Hunt, a top secret spy, has a mission to chase a bad guy. Here is what we know:

* Ethan likes to run. He can run at ``10`` miles per hour.
* He can drive a car - usually some concept car by BMW - at ``30`` miles per hour, but only on the road.

For his mission, he needs to go ``10`` miles west and ``5`` miles north. He can do this by:

* just driving ``10`` miles west then ``5`` miles north, or
* just running the diagonal distance, or
* driving $0 < x < 10$ miles west, then running on the diagonal.

A quick analysis says:

* It would take $(10+5)/30$ hours to just drive
* It would take $\sqrt{10^2 + 5^2}/10$ hours to just run

Now, if he drives $x$ miles west ($0 < x < 10$) he would run an amount given by the hypotenuse of a triangle with legs $5$ and $10-x$. His time driving would be $x/30$ and his time running would be $\sqrt{5^2+(10-x)^2}/10$ for a total of:

```math
T(x) = x/30 + \sqrt{5^2 + (10-x)^2}/10, \quad 0 < x < 10
```

with the endpoints handled separately by the pure strategies: $T(0) = \sqrt{10^2 + 5^2}/10$ (all running) and $T(10) = (10 + 5)/30$ (all driving).

Let's plot $T(x)$ over the interval $(0,10)$ and look:

```julia;
T(x) = x/30 + sqrt(5^2 + (10-x)^2)/10
```

```julia;
plot(T, 0, 10)
```

The minimum happens way out near ``8``. We zoom in a bit:

```julia;
plot(T, 7, 9)
```

It appears to be around ``8.3``.
We now use `find_zero` to refine our guess at the critical point, using the bracket $[7,9]$:

```julia;
α = find_zero(T', (7, 9))
```

Okay, got it: around ``8.23``. So our minimum time is

```julia;
T(α)
```

We know this is a relative minimum, but not that it is the global minimum over the closed interval. For that we must also check the endpoints:

```julia;
sqrt(10^2 + 5^2)/10, T(α), (10+5)/30
```

Ahh, we see that $T(x)$ is not continuous on $[0, 10]$, as it jumps at $x=10$ down to an even smaller amount of $1/2$. It may not look as impressive as a miles-long sprint, but Mr. Hunt is advised by Benji to drive the whole way.

### Rate times time ... the origin story

```julia;hold=true; echo=false
### {{{lhopital_43}}}

imgfile = "figures/fcarc-may2016-fig43-250.gif"
caption = L"""

Image number $43$ from l'Hospital's calculus book (the first). A
traveler leaving location $C$ to go to location $F$ must cross two
regions separated by the straight line $AEB$. We suppose that in the
region on the side of $C$, he covers distance $a$ in time $c$, and
that on the other, on the side of $F$, distance $b$ in the same time
$c$. We ask through which point $E$ on the line $AEB$ he should pass,
so as to take the least possible time to get from $C$ to $F$? (From
http://www.ams.org/samplings/feature-column/fc-2016-05.)


"""
ImageFile(:derivatives, imgfile, caption)
```

The last example is a modern-day illustration of a problem of calculus dating back to l'Hospital. His parameterization is a bit different. Let's change it by taking two points, $(0, a)$ and $(L,-b)$, with $a, b, L$ positive values. Above the $x$ axis travel happens at rate $r_0$, and below, travel happens at rate $r_1$, both also positive. What value $x$ in $[0,L]$ will minimize the total travel time?
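Before any symbolic work, the setup can be probed numerically to see what to expect. The parameter values below are illustrative choices, not taken from the text:

```julia
# Numeric sketch of the two-rate travel problem with illustrative values
# a = b = 1, L = 2, with travel above the axis half as fast (r0 = 1, r1 = 2).
a, b, L = 1.0, 1.0, 2.0
r0, r1 = 1.0, 2.0

# total travel time when crossing the x-axis at (x, 0)
t(x) = sqrt(x^2 + a^2)/r0 + sqrt((L - x)^2 + b^2)/r1

# crude minimization over a grid of candidate crossing points
xs = range(0, L, length=20_001)
xstar = xs[argmin(t.(xs))]

@assert 0 <= xstar <= L
# Since travel is slower above the axis, the best crossing point sits
# closer to the start than the straight-line crossing at x = L/2:
@assert xstar < L/2
```

The grid search suggests the fast medium should carry more of the horizontal distance, the same qualitative behavior the symbolic analysis makes precise.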
- -We approach this symbolically with `SymPy`: - -```julia; -@syms x::positive a::positive b::positive L::positive r0::positive r1::positive - -d0 = sqrt(x^2 + a^2) -d1 = sqrt((L-x)^2 + b^2) - -t = d0/r0 + d1/r1 # time = distance/rate -dt = diff(t, x) # look for critical points -``` - -The answer will occur at a critical point or an endpoint, either $x=0$ or $x=L$. - -The structure of `dt` is too complicated for simply calling `solve` to find the critical points. Instead we help `SymPy` out a bit. We are solving an equation of the form $a/b + c/d = 0$. These solutions will also be solutions of $(a/b)^2 - (c/d)^2=0$ or even $a^2d^2 - c^2b^2 = 0$. This follows as solutions to $u+v=0$, also solve $(u+v)\cdot(u-v)=0$, or $u^2 - v^2=0$. Setting $u=a/b$ and $v=c/d$ completes the comparison. - -We can get these terms - $a$, $b$, $c$, and $d$ - as follows: - -```julia; -t1, t2 = dt.args # the `args` property returns the arguments to the outer function (+ in this case) -``` - -The equivalent of $a^2d^2 - c^2 b^2$ is found using the generic functions `numerator` and `denominator` to access the numerator and denominator of the fractions: - -```julia; -ex = numerator(t1^2)*denominator(t2^2) - denominator(t1^2)*numerator(t2^2) -``` - -This is a polynomial in the `x` variable of degree $4$, as seen here where the `sympy.Poly` function is used to identify the symbols of the polynomial from the parameters: - -```julia; -p = sympy.Poly(ex, x) # a0 + a1⋅x + a2⋅x^2 + a3⋅x^3 + a4⋅x^4 -p.coeffs() -``` - -Fourth degree polynomials can be solved. The critical points of the -original equation will be among the ``4`` solutions given. However, the result -is complicated. 
The -[article](http://www.ams.org/samplings/feature-column/fc-2016-05) -- from -which the figure came -- states that "In today's textbooks the problem, -usually involving a river, involves walking along one bank and then -swimming across; this corresponds to setting $g=0$ in l'Hospital's -example, and leads to a quadratic equation." Let's see that case, -which we can get in our notation by taking $b=0$: - -```julia; -q = ex(b=>0) -factor(q) -``` - -We see two terms: one with $x=L$ and another quadratic. For the simple -case $r_0=r_1$, a straight line is the best solution, and this -corresponds to $x=L$, which is clear from the formula above, as we -only have one solution to the following: - -```julia; -solve(q(r1=>r0), x) -``` - -Well, not so fast. We need to check the other endpoint, $x=0$: - -```julia; -ta = t(b=>0, r1=>r0) -ta(x=>0), ta(x=>L) -``` - -The value at $x=L$ is smaller, as $L^2 + a^2 \leq (L+a)^2$. (Well, that was a bit pedantic. The travel rates being identical means the fastest path will also be the shortest path and that is clearly ``x=L`` and not ``x=0``.) - - -Now, if, say, travel above the line is half as slow as travel along, then $2r_0 = r_1$, and the critical points will be: - -```julia; -out = solve(q(r1 => 2r0), x) -``` - -It is hard to tell which would minimize time without more work. To check a case ($a=1, L=2, r_0=1$) we might have - -```julia; -x_straight = t(r1 =>2r0, b=>0, x=>out[1], a=>1, L=>2, r0 => 1) # for x=L -``` - -Compared to the smaller ($x=\sqrt{3}a/3$): - -```julia; -x_angle = t(r1 =>2r0, b=>0, x=>out[2], a=>1, L=>2, r0 => 1) -``` - -What about $x=0$? - -```julia; -x_bent = t(r1 =>2r0, b=>0, x=>0, a=>1, L=>2, r0 => 1) -``` - -The value of $x=\sqrt{3}a/3$ minimizes time: - -```julia; -min(x_straight, x_angle, x_bent) -``` - -The traveler in this case is advised to head to the $x$ axis at $x=\sqrt{3}a/3$ and then travel along the $x$ axis. - - - -Will this approach always be true? 
Consider different parameters, say we switch the values of $a$ and $L$ so that $a > L$:

```julia;
pts = [0, out...]
m,i = findmin([t(r1 =>2r0, b=>0, x=>u, a=>2, L=>1, r0 => 1) for u in pts]) # min, index
m, pts[i]
```

Here traveling directly to the point $(L,0)$ is fastest. Though travel is slower, the route is more direct and there is no time saved by taking the longer route with faster travel for part of it.

## Unbounded domains

Maximize the function $xe^{-x^2}$ over the interval $[0, \infty)$.

Here the extreme value theorem doesn't technically apply, as we don't have a closed interval. However, **if** we can eliminate the endpoints as candidates, then we should be able to convince ourselves that the maximum must occur at a critical point of $f(x)$. (If not, then convince yourself that for all sufficiently large $M$ the maximum over $[0,M]$ occurs at a critical point, not an endpoint. Then let $M$ go to infinity. In general, for an optimization problem of a continuous function on the interval ``(a,b)``, if the right limit at ``a`` and the left limit at ``b`` can be ruled out as candidates, the optimal value must occur at a critical point.)

So to approach this problem we first graph it over a wide interval.

```julia;
f(x) = x * exp(-x^2)
plot(f, 0, 100)
```

Clearly the action is nearer to ``1`` than ``100``. We try graphing the derivative near that area:

```julia;
plot(f', 0, 5)
```

This shows the value of interest near $0.7$ for a critical point. We use `find_zero` with $[0,1]$ as a bracket:

```julia;
c = find_zero(f', (0, 1))
```

The maximum is then at

```julia;
f(c)
```

##### Example: Minimize the surface area of a can

For a more applied problem of this type (infinite domain), consider a can of some soft drink that is to contain ``355``ml, which is ``355`` cubic centimeters. We use metric units, as the relationship between volume (cubic centimeters) and fluid amount (ml) is clear.
A can to hold this amount is produced in the shape of a cylinder with radius $r$ and height $h$. The materials involved give the surface area, which is:

```math
SA = h \cdot 2\pi r + 2 \cdot \pi r^2.
```

The volume satisfies:

```math
V = 355 = h \cdot \pi r^2.
```

Find the values of $r$ and $h$ which minimize the surface area.

First, the surface area in both variables is given by

```julia;
SA(h, r) = h * 2pi * r + 2pi * r^2
```

Solving the volume constraint for `h` in terms of `r` yields:

```julia;
canheight(r) = 355 / (pi * r^2)
```

Composing gives a function of `r` alone:

```julia;
SA(r) = SA(canheight(r), r)
```

This is minimized subject to the constraint that $r > 0$. A quick glance shows that as $r$ gets close to $0$, the can must get infinitely tall to contain the fixed volume, and the surface area becomes infinite, as the first term, which behaves like $1/r$ after composition, implies. On the other hand, as $r$ goes to infinity, the height must go to ``0`` to make a really flat can. Again, we would have infinite surface area, as the $r^2$ term at the end indicates. With this observation, we can rule out the endpoints as possible minima, so any minimum must occur at a critical point.

We start by making a graph, with an educated guess that the answer is somewhere near a real-life answer, or around ``3``-``5`` cm in radius:

```julia;
plot(SA, 2, 10)
```

The minimum looks to be around $4$cm and is clearly between $2$cm and $6$cm. We can use `find_zero` to zoom in on the value of the critical point:

```julia;
rₛₐ = find_zero(SA', (2, 6))
```

Okay, $3.837...$ is our answer for $r$. Use this to get $h$:

```julia;
canheight(rₛₐ)
```

This produces a can which is about square in profile. This is not how most cans look, though. Perhaps our model is too simple, or the cans are optimized for some other purpose than minimizing materials.
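The "square in profile" observation can be made exact. Composing gives $SA(r) = 710/r + 2\pi r^2$; setting its derivative to zero yields $r^3 = 355/(2\pi)$, and then $h = 355/(\pi r^2) = 2r$, so the optimal height always equals the diameter. A quick self-contained check of this closed form:

```julia
# Closed-form check for the minimal-surface-area can with V = 355:
# SA(r) = 710/r + 2*pi*r^2, so SA'(r) = -710/r^2 + 4*pi*r = 0 gives
# r = cbrt(355/(2*pi)), and the height h = 355/(pi*r^2) equals 2r.
V = 355
r = cbrt(V / (2pi))            # critical radius
h = V / (pi * r^2)             # corresponding height

@assert abs(r - 3.8372) < 1e-3                       # matches the numeric answer above
@assert isapprox(h, 2r, atol=1e-10)                  # height equals diameter
@assert isapprox(-710/r^2 + 4pi*r, 0, atol=1e-10)    # derivative vanishes at r
```

Since $h = 2r$, the rectangle seen in a side view of the optimal can is exactly a square, echoing the fixed-perimeter rectangle result from earlier.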
## Questions

###### Question

A geometric figure has area given in terms of two measurements by $A=\pi a b$ and perimeter $P = \pi (a + b)$. If the perimeter is fixed to be ``20`` units long, what is the maximal area the figure can have?

```julia; hold=true; echo=false
A(a,b) = pi*a*b
P = 20
b1(a) = 20/pi - a
A(a) = A(a, b1(a))
x = find_zero(A', (0, 10))
val = A(x)
numericq(val)
```

###### Question

A geometric figure has area given in terms of two measurements by $A=\pi a b$ and perimeter $P=\pi \cdot \sqrt{a^2 + b^2}/2$. If the perimeter is ``20`` units long, what is the maximal area?

```julia; hold=true; echo=false
A(a,b) = pi*a*b
P = 20
b1(a) = sqrt((P*2/pi)^2 - a^2)
A(a) = A(a, b1(a))
x = find_zero(A', (0, 10))
val = A(x)
numericq(val)
```

###### Question

A rancher with ``10`` meters of fence wishes to make a pen adjacent to an existing fence. The pen will be a rectangle with one edge using the existing fence. Say that edge has length $x$; then $10 = 2y + x$, with $y$ the other dimension of the pen. What is the maximum area that can be enclosed?

```julia; hold=true; echo=false
p = plot(; legend=false, aspect_ratio=:equal, axis=nothing, border=:none)
plot!([0,10, 10, 0, 0], [0,0,10,10,0]; linewidth=3)
plot!(p, [10,14,14,10], [2, 2, 8,8]; linewidth = 1)
annotate!(p, [(15, 5, "x"), (12,1, "y")])
p
```

```julia; hold=true; echo=false
Ar(y) = (10-2y)*y;
val = Ar(find_zero(Ar', 5))
numericq(val, 1e-3)
```

Is there "symmetry" in the answer between $x$ and $y$?

```julia; hold=true; echo=false
yesnoq("no")
```

What if you were to build two such pens back to back? Then the answer would involve a rectangle. Is there symmetry in the answer now?

```julia; hold=true; echo=false
yesnoq("yes")
```

###### Question

A rectangle of sides $w$ and $h$ has fixed area $20$. What is the *smallest* perimeter it can have?
```julia; hold=true; echo=false
Prim(x,y) = 2x + 2y
Prim(x) = Prim(x, 20/x)
xstar = find_zero(Prim', 5)
val = Prim(xstar)
numericq(val)
```

###### Question

A rectangle of sides $w$ and $h$ has fixed area $20$. What is the *largest* perimeter it can have?

```julia; hold=true; echo=false
choices = [
"It can be infinite",
"It is also 20",
"``17.888``"
]
answ = 1
radioq(choices, answ)
```

###### Question

A cardboard box is to be designed with a square base and an open top holding a fixed volume ``V``. What dimensions yield the minimal surface area?

If this problem were approached symbolically, we might see the following code. First:

```julia;eval=false
@syms V::positive x::positive z::positive
SA = 1 * x * x + 4 * x * z
```

What does this express?

```julia; hold=true; echo=false
radioq((
"The box has a square base with open top, so `x*x` is the amount of material in the base; the 4 sides each have `x*z` area.",
"The volume is a fixed amount, so is `x*x*z`, with sides suitably labeled",
"The surface area of a box is `6x*x`, so this is wrong."
), 1)
```

What does this command express?

```julia; eval=false
SA = subs(SA, z => V / x^2)
```

```julia; hold=true; echo=false
radioq((
"This command replaces `z` with an expression in `x`, using the constraint of fixed volume `V`",
"This command replaces `z`, reparameterizing in `V` instead.",
"This command is merely algebraic simplification"
), 1)
```

What does this command find?

```julia; eval=false
solve(diff(SA, x) ~ 0, x)
```

```julia; hold=true; echo=false
radioq((
"This solves ``SA'=0``; that is, it finds critical points of a continuously differentiable function",
"This solves for ``V``, the fixed but unknown volume",
"This checks the values of `SA` at the end points of the domain"
), 1)
```

What do these commands do?
```julia; eval=false
cps = solve(diff(SA, x) ~ 0, x)
xx = filter(isreal, cps)[1]
diff(SA, x, x)(xx) > 0
```

```julia; hold=true; echo=false
radioq((
"This applies the second derivative test to the lone *real* critical point, showing there is a local minimum at that point.",
"This applies the first derivative test to the lone *real* critical point, showing there is a local minimum at that point.",
"This finds the ``4``th derivative of `SA`"
), 1)
```

###### Question

A rain gutter is constructed from a 30" wide sheet of tin by bending it into thirds. If the sides are bent 90 degrees, then the cross-sectional area would be $100 = 10^2$. This is not the largest possible amount. For example, if the sides are bent by 45 degrees, the cross-sectional area is:

```julia; hold=true; echo=false
2 * (1/2 * 10*cos(pi/4) * 10 * sin(pi/4)) + 10*sin(pi/4) * 10
```

Find a value in degrees that gives the maximum. (The first task is to write the area in terms of $\theta$.)

```julia; hold=true; echo=false
function Ar(t)
    opp = 10 * sin(t)
    adj = 10 * cos(t)
    2 * opp * adj/2 + opp * 10
end
t = find_zero(Ar', pi/4); ## Has issues with order=8 algorithm, tol > 1e-14 is needed
val = t * 180/pi;
numericq(val, 1e-3)
```

###### Question: Non-Norman windows

Suppose our new "Norman" window has half-circular tops at both the top and the bottom. The perimeter is fixed at $20$, and the dimensions of the rectangle are $x$ for the width and $y$ for the height.

What is the value of $y$ that maximizes the area?

```julia; hold=true; echo=false
P = 20
A(x,y) = x*y + pi * (x/2)^2
y(x) = (P - pi*x)/2 # P = 2y + 2pi*x/2
A(x) = A(x,y(x))
x0 = find_zero(D(A), (0, 10))
val = y(x0)
numericq(val) # 0
```

###### Question (Thanks https://www.math.ucdavis.edu/~kouba)

A movie screen projects on a wall ``20`` feet high beginning ``10`` feet above the floor.
This figure shows $\theta$ for $x=30$:

```julia; hold=true; echo=false
p = plot([0, 30,30], [0,0,10], xlim=(0, 32), color=:blue, legend=false)
plot!(p, [30, 30], [10, 30], color=:blue, linewidth=4)
plot!(p, [0, 30,30,0], [0,10,30,0], color=:orange)
annotate!(p, [(x,y,l) for (x,y,l) in zip([15, 5, 31, 31], [1.5, 3.5, 5, 20], ["x=30", "θ", "10", "20"])])
```

What value of $x$ gives the largest angle $\theta$? (In degrees.)

```julia; hold=true; echo=false
theta(x) = atan(30/x) - atan(10/x)
val = find_zero(D(theta), 20); ## careful where one starts
val = theta(val) * 180/pi
numericq(val, 1e-1)
```

###### Question

A maximum likelihood estimator is a value derived by maximizing a function. For example, if

```julia;
Likhood(t) = t^3 * exp(-3t) * exp(-2t) * exp(-4t) ## 0 <= t <= 10
```

Then `Likhood(t)` is continuous and has a single peak, so the maximum occurs at the lone critical point. It turns out that this problem is a bit sensitive to the initial condition, so we bracket:

```julia
find_zero(Likhood', (0.1, 0.5))
```

Now if $Likhood(t) = \exp(-3t) \cdot \exp(-2t) \cdot \exp(-4t), \quad 0 \leq t \leq 10$, explain, by graphing, why the same approach won't work:

```julia; hold=true; echo=false
choices=["It does work and the answer is x = 2.27...",
         L" $Likhood(t)$ is not continuous on $0$ to $10$",
         L" $Likhood(t)$ takes its maximum at a boundary point - not a critical point"];
answ = 3;
radioq(choices, answ)
```

###### Question

Let $x_1, x_2, \dots, x_n$ be an unspecified set of numbers in a data set. Form the expression $s(x) = (x-x_1)^2 + \cdots + (x-x_n)^2$. For what value of $x$ is this smallest?

We approach this using `SymPy` and $n=10$:

```julia; hold=true; eval=false
@syms s xs[1:10]
s(x) = sum((x-xi)^2 for xi in xs)
cps = solve(diff(s(x), x), x)
```

Run the above code. Based on the critical points found, what do you guess will be the minimizer in terms of the values $x_1$, $x_2, \dots$?
- -```julia; hold=true; echo=false -choices=[ -"The mean, or average, of the values", -"The median, or middle number, of the values", -L"The square roots of the values squared, $(x_1^2 + \cdots x_n^2)^2$" -] -answ = 1 -radioq(choices, answ) -``` - -###### Question - -Minimize the function $f(x) = 2x + 3/x$ over $(0, \infty)$. - -```julia; hold=true; echo=false -f(x) = 2x + 3/x; -val = find_zero(f', 1); -numericq(val, 1e-3) -``` - - -###### Question - -Of all rectangles of area 4, find the one with smallest perimeter. What is the perimeter? - -```julia; hold=true; echo=false -# 4 = xy -Prim(x) = 2x + 2*(4/x); -val = find_zero(D(Prim), 1); -numericq(Prim(val), 1e-3) ## a square! -``` - - -###### Question - -A running track is in the shape of two straight aways and two half -circles. The total distance (perimeter) is 400 meters. Suppose $w$ is -the width (twice the radius of the circles) and $h$ is the -height. What dimensions minimize the sum $w + h$? - -You have $P(w, h) = 2\pi \cdot (w/2) + 2\cdot(h-w)$. - - -```julia; hold=true; echo=false -Ar(w,h) = w + h -h(w) = (400 - 2pi*w/2 + 2w) / 2 -Ar(w) = Ar(w, h(w)) ## linear -val = Ar(0) -numericq(val) -``` - -###### Question - -A cell phone manufacturer wishes to make a rectangular phone with -total surface area of 12,000 $mm^2$ and maximal screen area. The -screen is surrounded by bezels with sizes of 8$mm$ on the long sides -and 32$mm$ on the short sides. (So, for example, the screen width is -shorter by $2\cdot 8$ mm than the phone width.) - -What are the dimensions (width and -height) that allow the maximum screen area? - -The width is: - -```julia; hold=true; echo=false -#A = w*h = 12000 -w(h) = 12_000 / h -S(w, h) = (w- 2*8) * (h - 2*32) -S(h) = S(w(h), h) -hstar =find_zero(D(S), 500) -wstar = w(hstar) -numericq(wstar) -``` - -The height is? 
- -```julia; hold=true; echo=false -w(h) = 12_000 / h -S(w, h) = (w- 2*8) * (h - 2*32) -S(h) = S(w(h), h) -hstar =find_zero(D(S), 500) -numericq(hstar) -``` - -###### Question - -Find the value $x > 0$ which minimizes the distance from the graph of -$f(x) = \log_e(x) - x$ to the origin $(0,0)$. - -```julia; hold=true; echo=false -f(x) = log(x) - x -p = plot(f, 0.2, 2, ylim=(-2,0.25), legend=false, linewidth=3) -plot!(p, [0,0], [-2, 0.25], color=:blue) -plot!(p, [0,2],[0,0], color=:blue) -xs = [0,1]; ys = [0, f(1)] -scatter!(p, xs,ys, color=:orange) -plot!(p, xs, ys, color=:orange, linewidth=3) -annotate!(p, [(.75, f(.5)/2, "d = $(round(sqrt(.5^2 + f(.5)^2), digits=2))")]) -p -``` - -```julia; hold=true; echo=false -d2(x) = sqrt((0-x)^2 + (0 - f(x))^2) -xstar = find_zero(D(d2), 1) -val = d2(xstar) -numericq(val) -``` - - -###### Question - -```julia; hold=true; echo=false -### {{{lhopital_40}}} -imgfile ="figures/fcarc-may2016-fig40-300.gif" -caption = L""" - -Image number $40$ from l'Hospital's calculus book (the first calculus book). Among all the cones that can be inscribed in a sphere, determine which one has the largest lateral area. (From http://www.ams.org/samplings/feature-column/fc-2016-05) - -""" -ImageFile(:derivatives, imgfile, caption) -``` - -The figure above poses a problem about cones in spheres, which can be reduced to a two-dimensional problem. Take a sphere of radius $r=1$, and imagine a secant line of length $l$ connecting $(-r, 0)$ to another point $(x,y)$ with $y>0$. Rotating that line around the $x$ axis produces a cone and its lateral surface is given by $SA=\pi \cdot y \cdot l$. Write $SA$ as a function of $x$ and solve. - -The largest lateral surface area is: - -```julia; hold=true; echo=false -r = 1 -SA(r,l) = pi * r * l -y(x) = sqrt(1 - x^2) -l(x) = sqrt((x-(-1))^2 + y(x)^2) -SA(x) = SA(y(x), l(x)) -cp = find_zero(SA', (-1/2, 1/2)) -val = SA(cp) -numericq(val) -``` - -The surface area of a sphere of radius $1$ is $4\pi r^2 = 4 \pi$. 
This is how many times greater than that of the largest cone?
-
-```julia; hold=true; echo=false
-choices = ["exactly four times",
-L"exactly $\pi$ times",
-L"about $2.6$ times as big",
-"about the same"]
-answ = 3
-radioq(choices, answ)
-```
-
-###### Question
-
-In the examples the functions `argmax(f, itr)` and `findmin(collection)` were used. These have mathematical analogs. What is `argmax(f, itr)` in terms of math notation, where ``vs`` is the iterable collection of values:
-
-```julia; hold=true; echo=false
-choices = [
-    raw"``\{v \mid v \text{ in } vs, f(v) = \max(f(vs))\}``",
-    raw"``\{f(v) \mid v \text{ in } vs, f(v) = \max(f(vs))\}``",
-    raw"``\{i \mid v_i \text{ in } vs, f(v_i) = \max(f(vs))\}``"
-    ]
-radioq(choices, 1)
-```
-
-The functions are related: `findmax` returns the maximum value and an index in the collection for which the value will be largest; `argmax` returns an element of the set for which the function is largest, so `argmax(identity, itr)` should correspond to the element found by `findmax` (through `itr[findmax(itr)[2]]`)
-
-###### Question
-
-Let ``f(x) = (a/x)^x`` for ``a, x > 0``. When is this maximized? The following might be helpful
-
-```julia; hold=true
-@syms x::positive a::positive
-diff((a/x)^x, x)
-```
-
-This can be `solve`d to discover the answer.
-
-```julia; hold=true; echo=false
-choices = [
-"``e``",
-"``a/e``",
-"``e/a``",
-"``a \\cdot e``"
-]
-answ=2
-radioq(choices, answ)
-```
-
-###### Question
-
-The ladder problem has a trigonometry-free solution. We show one attributed to [Asma](http://www.mathematische-basteleien.de/ladder.htm). 
-
-```julia; hold=true; echo=false
-plt = plot(; axis=nothing, border=:none, legend=false, aspect_ratio=:equal)
-a,b = 1, 2
-p = 1/2
-x = a/p
-plot!(plt, [0, b*(1+p), 0, 0], [0, 0, a+x, 0])
-plot!(plt, [b,b,0,0],[0,a,a,0])
-annotate!(plt, [(b/2,0, "b"), (0,a/2,"a"), (0,a+x/2,"x"), (b+b*p/2,0,"bp")])
-plt
-```
-
-Introducing a variable ``p``, we get, following the above figure, a ladder of length ``c`` reaching ``b + bp`` along the floor and ``a + x`` up the wall.
-
-Using similar triangles, we have:
-
-```julia; hold=true
-@syms a::positive b::positive p::positive x::positive
-solve(x/b ~ (x+a)/(b + b*p), x)
-```
-
-With ``x = a/p`` we get by the Pythagorean theorem that
-
-```math
-\begin{align*}
-c^2 &= (a + a/p)^2 + (b + bp)^2 \\
-    &= a^2(1 + \frac{1}{p})^2 + b^2(1+p)^2.
-\end{align*}
-```
-
-The ladder problem minimizes ``c`` or equivalently ``c^2``.
-
-Why is the following set of commands useful in this task:
-
-```julia; eval=false
-c2 = a^2*(1 + 1/p)^2 + b^2*(1+p)^2
-c2p = diff(c2, p)
-eq = numer(together(c2p))
-solve(eq ~ 0, p)
-```
-
-```julia; hold=true; echo=false
-choices = ["It finds the critical points",
-           "It finds the minimal value of `c`",
-           "It finds the minimal value of `p`"]
-radioq(choices, 1)
-```
-
-The polynomial `eq` is what degree in `p`?
-
-```julia; hold=true; echo=false
-numericq(4)
-```
-
-The only positive real solution for ``p`` from ``eq`` is
-
-```julia; hold=true;echo=false
-choices = [
-raw" ``(a/b)^{2/3}``",
-raw" ``1``",
-raw" ``\sqrt{3}/2 \cdot (a/b)^{2/3}``"
-]
-radioq(choices, 1)
-```
-
-###### Question
-
-In [Hall](https://www.maa.org/sites/default/files/hall03010308158.pdf) we can find a dozen optimization problems related to the following figure of the parabola ``y=x^2``, a point ``P=(a,a^2)``, ``a > 0``, and its normal line. We will do two.
-
-
-```julia; hold=true; echo=false
-p = plot(; legend=false, aspect_ratio=:equal, axis=nothing, border=:none)
- b = 2. 
- plot!(p, x -> x^2, -b, b)
- plot!(p, [-b,b], [0,0])
- plot!(p, [0,0], [0, b^2])
- a = 1
- scatter!(p, [a],[a^2])
- m = 2a
- plot!(p, x -> a^2 + m*(x-a), 1/2, b)
- mₚ = -1/m
- plot!(p, x -> a^2 + mₚ*(x-a))
- scatter!(p, [-3/2], [(3/2)^2])
- annotate!(p, [(1+1/4, 1+1/8, "P"), (-3/2-1/4, (-3/2)^2-1/4, "Q")])
-p
-```
-
-What do these commands do?
-
-```julia; hold=true;
-@syms x::real, a::real
-mₚ = - 1 / diff(x^2, x)(a)
-solve(x^2 - (a^2 + mₚ*(x-a)) ~ 0, x)
-```
-
-```julia; hold=true; echo=false
-radioq((
-"It finds the ``x`` value of the intersection points of the normal line and the parabola",
-"It finds the tangent line",
-"It finds the point ``P``"
-), 1)
-```
-
-Numerically, find the value of ``a`` that minimizes the ``y`` coordinate of ``Q``.
-
-```julia; hold=true; echo=false
-y(a) = (-a - 1/(2a))^2
-a = find_zero(y', 1)
-numericq(a)
-```
-
-
-Numerically find the value of ``a`` that minimizes the length of the line segment ``PQ``.
-
-```julia; hold=true; echo=false
-x(a) = -a - 1/(2a)
-d(a) = (a-x(a))^2 + (a^2 - x(a)^2)^2
-a = find_zero(d', 1)
-numericq(a)
-```
diff --git a/CwJ/derivatives/process.jl b/CwJ/derivatives/process.jl
deleted file mode 100644
index 3ef5ee5..0000000
--- a/CwJ/derivatives/process.jl
+++ /dev/null
@@ -1,44 +0,0 @@
-using WeavePynb
-
-using CwJWeaveTpl
-
-
-
-fnames = [
-    "derivatives", ## more questions
-    "numeric_derivatives",
-
-    "mean_value_theorem",
-    "optimization",
-    "curve_sketching",
-
-    "linearization",
-    "newtons_method",
-    "lhopitals_rule", ## Okay - but could beef up questions..
-
-
-    "implicit_differentiation", ## add more questions? 
-    "related_rates",
-    "taylor_series_polynomials"
-]
-
-
-process_file(nm; cache=:off) = CwJWeaveTpl.mmd(nm * ".jmd", cache=cache)
-
-function process_files(;cache=:user)
-    for f in fnames
-        @show f
-        process_file(f, cache=cache)
-    end
-end
-
-
-
-
-"""
-## TODO derivatives
-
-tangent lines intersect at average for a parabola
-
-Should we have derivative results: inverse functions, logarithmic differentiation...
-"""
diff --git a/CwJ/derivatives/related_rates.jmd b/CwJ/derivatives/related_rates.jmd
deleted file mode 100644
index ea96a82..0000000
--- a/CwJ/derivatives/related_rates.jmd
+++ /dev/null
@@ -1,781 +0,0 @@
-# Related rates
-
-This section uses these add-on packages:
-
-```julia
-using CalculusWithJulia
-using Plots
-using Roots
-using SymPy
-```
-
-```julia; echo=false; results="hidden"
-using CalculusWithJulia.WeaveSupport
-
-fig_size=(800, 600)
-const frontmatter = (
-    title = "Related rates",
-    description = "Calculus with Julia: Related rates",
-    tags = ["CalculusWithJulia", "derivatives", "related rates"],
-);
-nothing
-```
-
-----
-
-Related rates problems involve two (or more) unknown quantities that
-are related through an equation. As the two variables depend on each
-other, so too do their rates of change with respect to some other
-variable, which is often time, though exactly how remains to be
-discovered. Hence the name "related rates."
-
-#### Examples
-
-The following is a typical "book" problem:
-
-> A screen saver displays the outline of a ``3`` cm by ``2`` cm rectangle and
-> then expands the rectangle in such a way that the ``2`` cm side is
-> expanding at the rate of ``4`` cm/sec and the proportions of the
-> rectangle never change. How fast is the area of the rectangle
-> increasing when its dimensions are ``12`` cm by ``8`` cm?
-> [Source.](http://oregonstate.edu/instruct/mth251/cq/Stage9/Practice/ratesProblems.html)
-
-```julia; hold=true; echo=false; cache=true
-### {{{growing_rects}}}
-## Secant line approaches tangent line... 
-function growing_rects_graph(n) - w = (t) -> 2 + 4t - h = (t) -> 3/2 * w(t) - t = n - 1 - - w_2 = w(t)/2 - h_2 = h(t)/2 - - w_n = w(5)/2 - h_n = h(5)/2 - - plt = plot(w_2 * [-1, -1, 1, 1, -1], h_2 * [-1, 1, 1, -1, -1], xlim=(-17,17), ylim=(-17,17), - legend=false, size=fig_size) - annotate!(plt, [(-1.5, 1, "Area = $(round(Int, 4*w_2*h_2))")]) - plt - - -end -caption = L""" - -As $t$ increases, the size of the rectangle grows. The ratio of width to height is fixed. If we know the rate of change in time for the width ($dw/dt$) and the height ($dh/dt$) can we tell the rate of change of *area* with respect to time ($dA/dt$)? - -""" -n=6 - -anim = @animate for i=1:n - growing_rects_graph(i) -end - -imgfile = tempname() * ".gif" -gif(anim, imgfile, fps = 1) - -ImageFile(imgfile, caption) -``` - -Here we know $A = w \cdot h$ and we know some things about how $w$ and -$h$ are related *and* about the rate of how both $w$ and $h$ grow in -time $t$. That means that we could express this growth in terms of -some functions $w(t)$ and $h(t)$, then we can figure out that the area - as a function of $t$ - will be expressed as: - -```math -A(t) = w(t) \cdot h(t). -``` - -We would get by the product rule that the *rate of change* of area with respect to time, $A'(t)$ is just: - -```math -A'(t) = w'(t) h(t) + w(t) h'(t). -``` - -As an aside, it is fairly conventional to suppress the $(t)$ part of -the notation $A=wh$ and to use the Leibniz notation for derivatives: - -```math -\frac{dA}{dt} = \frac{dw}{dt} h + w \frac{dh}{dt}. -``` - -This relationship is true for all $t$, but the problem discusses a -certain value of $t$ - when $w(t)=8$ and $h(t) = 12$. At this same -value of $t$, we have $w'(t) = 4$ and so $h'(t) = 6$. Substituting these 4 values into the 4 unknowns in the formula for $A'(t)$ gives: - -```math -A'(t) = 4 \cdot 12 + 8 \cdot 6 = 96. 
-``` - -Summarizing, from the relationship between $A$, $w$ and $t$, there is -a relationship between their rates of growth with respect to $t$, a -time variable. Using this and known values, we can compute. In this -case, $A'$ at the specific $t$. - - -We could also have done this differently. We would recognize the following: - -- The area of a rectangle is just: - -```julia; -A(w,h) = w * h -``` - -- The width - expanding at a rate of $4t$ from a starting value of $2$ - must satisfy: - -```julia; -w(t) = 2 + 4*t -``` - -- The height is a constant proportion of the width: - -```julia; -h(t) = 3/2 * w(t) -``` - -This means again that area depends on $t$ through this formula: - -```julia; -A(t) = A(w(t), h(t)) -``` - - -This is why the rates of change are related: as $w$ and $h$ change in -time, the functional relationship with $A$ means $A$ also changes in -time. - - - -Now to answer the question, when the width is 8, we must have that $t$ is: - -```julia; -tstar = find_zero(x -> w(x) - 8, [0, 4]) # or solve by hand to get 3/2 -``` - -The question is to find the rate the area is increasing at the given -time $t$, which is $A'(t)$ or $dA/dt$. We get this by performing the -differentiation, then substituting in the value. - -Here we do so with the aid of `Julia`, though this problem could readily be done "by hand." - -We have expressed $A$ as a function of $t$ by composition, so can differentiate that: - -```julia; -A'(tstar) -``` - - ----- - -Now what? Why is ``96`` of any interest? It is if the value at a specific -time is needed. But in general, a better question might be to -understand if there is some pattern to the numbers in the figure, -these being $6, 54, 150, 294, 486, 726$. 
Their differences are the
-*average* rate of change:
-
-```julia;
-xs = [6, 54, 150, 294, 486, 726]
-ds = diff(xs)
-```
-
-Those seem to be increasing by a fixed amount each time, which we can see by one more application of `diff`:
-
-```julia;
-diff(ds)
-```
-
-How can this relationship be summarized? Well, let's go back to what we know, though this time using symbolic math:
-
-```julia;
-@syms t
-diff(A(t), t)
-```
-
-This should be clear: the rate of change, $dA/dt$, is increasing
-linearly, hence the second derivative, $d^2A/dt^2$, would be constant,
-just as we saw for the average rate of change.
-
-So, for this problem, a constant rate of change in width and height
-leads to a linear rate of change in area; put otherwise, linear growth
-in both width and height leads to quadratic growth in area.
-
-##### Example
-
-A ladder, with length $l$, is leaning against a wall. We parameterize
-this problem so that the top of the ladder is at $(0,h)$ and the
-bottom at $(b, 0)$. Then $l^2 = h^2 + b^2$ is a constant.
-
-If the ladder starts to slip away at the base, but remains in contact
-with the wall, express the rate of change of $h$ with respect to $t$
-in terms of $db/dt$.
-
-We have from implicitly differentiating in $t$ the equation $l^2 = h^2 + b^2$, noting that $l$ is a constant, that:
-
-```math
-0 = 2h \frac{dh}{dt} + 2b \frac{db}{dt}.
-```
-
-
-Solving yields:
-
-```math
-\frac{dh}{dt} = -\frac{b}{h} \cdot \frac{db}{dt}.
-```
-
-
-* If when $l = 12$ it is known that $db/dt = 2$ when $b=4$, find $dh/dt$.
-
-We just need to find $h$ for this value of $b$, as the other two quantities in the last equation are known.
-
-But $h = \sqrt{l^2 - b^2}$, so the answer is:
-
-
-```julia;
-length, bottom, dbdt = 12, 4, 2
-height = sqrt(length^2 - bottom^2)
--bottom/height * dbdt
-```
-
-* What happens to the rate as $b$ goes to $l$?
-
-As $b$ goes to $l$, $h$ goes to ``0``, so $b/h$ blows up. Unless $db/dt$
-goes to $0$, the expression will become $-\infty$.
-
-!!! 
note
-    Often, this problem is presented with ``db/dt`` having a constant rate. In this case, the ladder problem defies physics, as ``dh/dt`` eventually is faster than the speed of light as ``h \rightarrow 0+``. In practice, were ``db/dt`` kept constant, the ladder would necessarily come away from the wall. The trajectory would follow that of a tractrix were there no gravity to account for.
-
-
-##### Example
-
-```julia; hold=true; echo=false
-caption = "A man and woman walk towards the light."
-
-imgfile = "figures/long-shadow-noir.png"
-ImageFile(:derivatives, imgfile, caption)
-```
-
-Shadows are a staple of film noir. In the photo, suppose a man and a woman walk towards a street light. As they approach the light the length of their shadow changes.
-
-Suppose we focus on the ``5`` foot tall woman. Her shadow comes from a streetlight ``12`` feet high. She is walking at ``3`` feet per second towards the light. What is the rate of change of her shadow?
-
-The setup for this problem involves drawing a right triangle with height ``12`` and base given by the distance ``x`` from the light the woman is *plus* the length ``l`` of the shadow. There is a similar triangle formed by the woman's height with length ``l``. Equating the ratios of the sides gives:
-
-```math
-\frac{5}{l} = \frac{12}{x + l}
-```
-
-As we need to take derivatives, we work with the reciprocal relationship:
-
-```math
-\frac{l}{5} = \frac{x + l}{12}
-```
-
-Differentiating in ``t`` gives:
-
-```math
-\frac{l'}{5} = \frac{x' + l'}{12}
-```
-
-Or
-
-```math
-l' \cdot (\frac{1}{5} - \frac{1}{12}) = \frac{x'}{12}
-```
-
-Solving for ``l'`` gives an answer in terms of ``x'``, the rate the woman is walking. In this description ``x`` is getting shorter, so ``x'`` would be ``-3`` feet per second and the shadow length would be decreasing at a rate proportional to the walking speed. 
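
The relation above can be checked numerically with a short sketch in base `Julia` (the helper name `shadow` and the central-difference step are ours, introduced only for illustration):

```julia
# From 5/l = 12/(x + l) it follows that l = 5x/7, so l' = (5/7)⋅x'.
shadow(x) = 5x / 7                 # shadow length for a given distance x

# central-difference derivative confirms dl/dx = 5/7
h = 1e-6
dldx = (shadow(1 + h) - shadow(1 - h)) / (2h)

dxdt = -3.0                        # walking toward the light at 3 ft/sec
dldt = dldx * dxdt                 # chain rule: dl/dt = (dl/dx)⋅(dx/dt)
```

So the shadow shortens at ``5 \cdot 3/7 \approx 2.14`` feet per second, in agreement with the symbolic computation.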
-
-##### Example
-
-```julia; hold=true; echo=false
-p = plot(; axis=nothing, border=:none, legend=false, aspect_ratio=:equal)
-scatter!(p, [0],[50], color=:yellow, markersize=50)
-plot!(p, [0, 50], [0,0], linestyle=:dash)
-plot!(p, [0,50], [50,0], linestyle=:dot)
-plot!(p, [25,25],[25,0], linewidth=5, color=:black)
-plot!(p, [25,50], [0,0], linewidth=2, color=:black)
-```
-
-The sun is setting at the rate of ``1/20`` radian/min, and appears to be dropping perpendicular to the horizon, as depicted in the figure. How fast is the shadow of a ``25`` meter wall lengthening at the moment when the shadow is ``25`` meters long?
-
-Let the shadow length be labeled ``x``, as it appears on the ``x`` axis above. Then we have by right-angle trigonometry:
-
-```math
-\tan(\theta) = \frac{25}{x}
-```
-
-or ``x\tan(\theta) = 25``.
-
-As ``t`` evolves, we know ``d\theta/dt`` but what is ``dx/dt``? Using implicit differentiation yields:
-
-```math
-\frac{dx}{dt} \cdot \tan(\theta) + x \cdot (\sec^2(\theta)\cdot \frac{d\theta}{dt}) = 0
-```
-
-Substituting known values and identifying ``\theta=\pi/4`` when the shadow length, ``x``, is ``25`` gives:
-
-```math
-\frac{dx}{dt} \cdot \tan(\pi/4) + 25 \cdot \left(\sec^2(\pi/4) \cdot \frac{-1}{20}\right) = 0
-```
-
-As ``\sec^2(\pi/4) = 2``, this can be solved for the unknown: ``dx/dt = 50/20``.
-
-
-
-##### Example
-
-A batter hits a ball toward third base at ``75`` ft/sec and runs toward first base at a rate of ``24`` ft/sec. At what rate does the distance between the ball and the batter change when ``2`` seconds have passed?
-
-
-We will answer this with `SymPy`. First we create some symbols for the movement of the ball towards third base, `b(t)`, the runner toward first base, `r(t)`, and the two velocities. We use symbolic functions for the movements, as we will be differentiating them in time:
-
-```julia
-@syms b() r() v_b v_r
-d = sqrt(b(t)^2 + r(t)^2)
-```
-
-The distance formula applies to give ``d``. 
As the ball and runner are moving in a perpendicular direction, the formula is easy to apply. - -We can differentiate `d` in terms of `t` and in process we also find the derivatives of `b` and `r`: - -```julia -db, dr = diff(b(t),t), diff(r(t),t) # b(t), r(t) -- symbolic functions -dd = diff(d,t) # d -- not d(t) -- an expression -``` - -The slight difference in the commands is due to `b` and `r` being symbolic functions, whereas `d` is a symbolic expression. Now we begin substituting. First, from the problem `db` is just the velocity in the ball's direction, or `v_b`. Similarly for `v_r`: - -```julia -ddt = subs(dd, db => v_b, dr => v_r) -``` - -Now, we can substitute in for `b(t)`, as it is `v_b*t`, etc.: - -```julia -ddt₁ = subs(ddt, b(t) => v_b * t, r(t) => v_r * t) -``` - -This finds the rate of change of time for any `t` with symbolic values of the velocities. (And shows how the answer doesn't actually depend on ``t``.) The problem's answer comes from a last substitution: - -```julia -ddt₁(t => 2, v_b => 75, v_r => 24) -``` - -Were this done by "hand," it would be better to work with distance squared to avoid the expansion of complexity from the square root. That is, using implicit differentiation: - -```math -\begin{align*} -d^2 &= b^2 + r^2\\ -2d\cdot d' &= 2b\cdot b' + 2r\cdot r'\\ -d' &= (b\cdot b' + r \cdot r')/d\\ -d' &= (tb'\cdot b' + tr' \cdot r')/d\\ -d' &= \left((b')^2 + (r')^2\right) \cdot \frac{t}{d}. -\end{align*} -``` - -##### Example - -```julia; hold=true; echo=false; cache=true -###{{{baseball_been_berry_good}}} -## Secant line approaches tangent line... -function baseball_been_berry_good_graph(n) - - v0 = 15 - x = (t) -> 50t - y = (t) -> v0*t - 5 * t^2 - - - ns = range(.25, stop=3, length=8) - - t = ns[n] - ts = range(0, stop=t, length=50) - xs = map(x, ts) - ys = map(y, ts) - - degrees = atand(y(t)/(100-x(t))) - degrees = degrees < 0 ? 
180 + degrees : degrees - - plt = plot(xs, ys, legend=false, size=fig_size, xlim=(0,150), ylim=(0,15)) - plot!(plt, [x(t), 100], [y(t), 0.0], color=:orange) - annotate!(plt, [(55, 4,"θ = $(round(Int, degrees)) degrees"), - (x(t), y(t), "($(round(Int, x(t))), $(round(Int, y(t))))")]) - -end -caption = L""" - -The flight of the ball as being tracked by a stationary outfielder. This ball will go over the head of the player. What can the player tell from the quantity $d\theta/dt$? - -""" -n = 8 - - -anim = @animate for i=1:n - baseball_been_berry_good_graph(i) -end - - -imgfile = tempname() * ".gif" -gif(anim, imgfile, fps = 1) - -ImageFile(imgfile, caption) -``` - - -A baseball player stands ``100`` meters from home base. A batter hits the -ball directly at the player so that the distance from home plate is -$x(t)$ and the height is $y(t)$. - -The player tracks the flight of the ball in terms of the angle -$\theta$ made between the ball and the player. This will satisfy: - -```math -\tan(\theta) = \frac{y(t)}{100 - x(t)}. -``` - -What is the rate of change of $\theta$ with respect to $t$ in terms of that of $x$ and $y$? - -We have by the chain rule and quotient rule: - -```math -\sec^2(\theta) \theta'(t) = \frac{y'(t) \cdot (100 - x(t)) - y(t) \cdot (-x'(t))}{(100 - x(t))^2}. -``` - -If we have $x(t) = 50t$ and $y(t)=v_{0y} t - 5 t^2$ when is the rate of change of the angle happening most quickly? - - -The formula for $\theta'(t)$ is - -```math -\theta'(t) = \cos^2(\theta) \cdot \frac{y'(t) \cdot (100 - x(t)) - y(t) \cdot (-x'(t))}{(100 - x(t))^2}. -``` - - -This question requires us to differentiate *again* in $t$. Since we -have fairly explicit function for $x$ and $y$, we will use `SymPy` to -do this. 
- -```julia; -@syms theta() - -v0 = 5 -x(t) = 50t -y(t) = v0*t - 5 * t^2 -eqn = tan(theta(t)) - y(t) / (100 - x(t)) -``` - -```julia; -thetap = diff(theta(t),t) -dtheta = solve(diff(eqn, t), thetap)[1] -``` - -We could proceed directly by evaluating: - -```julia; -d2theta = diff(dtheta, t)(thetap => dtheta) -``` - -That is not so tractable, however. - -It helps to simplify -$\cos^2(\theta(t))$ using basic right-triangle trigonometry. Recall, $\theta$ comes from a right triangle with -height $y(t)$ and length $(100 - x(t))$. The cosine of this angle will -be $100 - x(t)$ divided by the length of the hypotenuse. So we can -substitute: - -```julia; -dtheta₁ = dtheta(cos(theta(t))^2 => (100 -x(t))^2/(y(t)^2 + (100-x(t))^2)) -``` - - -Plotting reveals some interesting things. For $v_{0y} < 10$ we have graphs that look like: - -```julia; -plot(dtheta₁, 0, v0/5) -``` - -The ball will drop in front of the player, and the change in $d\theta/dt$ is monotonic. - - - -But let's rerun the code with $v_{0y} > 10$: - -```julia; hold=true -v0 = 15 -x(t) = 50t -y(t) = v0*t - 5 * t^2 -eqn = tan(theta(t)) - y(t) / (100 - x(t)) -thetap = diff(theta(t),t) -dtheta = solve(diff(eqn, t), thetap)[1] -dtheta₁ = subs(dtheta, cos(theta(t))^2, (100 - x(t))^2/(y(t)^2 + (100 - x(t))^2)) -plot(dtheta₁, 0, v0/5) -``` - - -In the second case we have a different shape. The graph is not -monotonic, and before the peak there is an inflection point. Without -thinking too hard, we can see that the greatest change in the angle is -when it is just above the head ($t=2$ has $x(t)=100$). - -That these two graphs differ so, means that the player may be able to -read if the ball is going to go over his or her head by paying -attention to the how the ball is being tracked. - -##### Example - -Hipster pour-over coffee is made with a conical coffee filter. The -cone is actually a [frustum](http://en.wikipedia.org/wiki/Frustum) of -a cone with small diameter, say $r_0$, chopped off. 
We will parameterize
-our cone by a value $h \geq 0$ on the $y$ axis and an angle $\theta$
-formed by a side and the $y$ axis. Then the coffee filter is the part
-of the cone between some $h_0$ (related by $r_0 = h_0 \tan(\theta)$) and $h$.
-
-The volume of a cone of height $h$ is $V(h) = \pi/3 \cdot h \cdot
-R^2$. From the geometry, $R = h\tan(\theta)$. The volume of the
-filter then is:
-
-```math
-V = V(h) - V(h_0).
-```
-
-What is $dV/dh$ in terms of $dR/dh$?
-
-Differentiating implicitly gives:
-
-
-```math
-\frac{dV}{dh} = \frac{\pi}{3} ( R(h)^2 + h \cdot 2 R \frac{dR}{dh}).
-```
-
-We see that it depends on $R$ and the change in $R$ with respect to $h$. However, we visualize $h$ - the height - so it is better to re-express. Clearly, $dR/dh = \tan\theta$ and using $R(h) = h \tan(\theta)$ we get:
-
-```math
-\frac{dV}{dh} = \pi h^2 \tan^2(\theta).
-```
-
-The rate of change goes down as $h$ gets smaller ($h \geq h_0$) and gets bigger for bigger $\theta$.
-
-How do the quantities vary in time?
-
-For an incompressible fluid, balancing the volume of fluid leaving against the flow through the opening gives that $dh/dt$ is the ratio of the cross-sectional
-area at the bottom over that at the height of the fluid, $(\pi \cdot (h_0\tan(\theta))^2) /
-(\pi \cdot (h\tan(\theta))^2)$, times the outward velocity of the fluid.
-
-That is $dh/dt = (h_0/h)^2 \cdot v$. Which makes sense - larger openings
-($h_0$) mean more fluid lost per unit time so the height change
-follows; higher levels ($h$) mean the change in height is slower, as
-the cross-sections have more volume.
-
-
-By [Torricelli's](http://en.wikipedia.org/wiki/Torricelli's_law) law,
-the exit velocity follows $v = \sqrt{2g(h-h_0)}$. This gives:
-
-```math
-\frac{dh}{dt} = \frac{h_0^2}{h^2} \cdot v = \frac{h_0^2}{h^2} \sqrt{2g(h-h_0)}.
-```
-
-If $h \gg h_0$, then $\sqrt{h-h_0} = \sqrt{h}\sqrt{1 - h_0/h} \approx \sqrt{h}(1 - (1/2)(h_0/h)) \approx \sqrt{h}$. So the rate of change of height in time is like $1/h^{3/2}$. 
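
These rate formulas can be sanity-checked numerically. The following sketch uses only base `Julia`; the values of ``g``, ``h_0``, and ``h`` are made up for illustration:

```julia
g, h0 = 9.8, 0.05                      # gravity (m/s²) and opening height h0 (assumed values)

dhdt(h) = (h0/h)^2 * sqrt(2g * (h - h0))   # dh/dt = (h0/h)² ⋅ √(2g(h - h0))
approx(h) = h0^2 * sqrt(2g) / h^(3/2)      # the h ≫ h0 approximation, proportional to 1/h^(3/2)

h = 2.0
dhdt(h), approx(h)                     # nearly equal once h is much larger than h0
```

The two values agree to within a few percent here, and both shrink as the fluid level ``h`` rises, as the discussion predicts.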
- - -Now, by the chain rule, we have then the rate of change of volume with respect to time, $dV/dt$, is: - -```math -\begin{align*} -\frac{dV}{dt} &= -\frac{dV}{dh} \cdot \frac{dh}{dt}\\ -&= \pi h^2 \tan^2(\theta) \cdot \frac{h_0^2}{h^2} \sqrt{2g(h-h_0)} \\ -&= \pi \sqrt{2g} \cdot (r_0)^2 \cdot \sqrt{h-h_0} \\ -&\approx \pi \sqrt{2g} \cdot r_0^2 \cdot \sqrt{h}. -\end{align*} -``` - - - -This rate depends on the square of the size of the -opening ($r_0^2$) and the square root of the height ($h$), but not the -angle of the cone. - - -## Questions - -###### Question - -Supply and demand. Suppose demand for product $XYZ$ is $d(x)$ and supply -is $s(x)$. The excess demand is $d(x) - s(x)$. Suppose this is positive. How does this influence -price? Guess the "law" of economics that applies: - -```julia; hold=true; echo=false -choices = [ -"The rate of change of price will be ``0``", -"The rate of change of price will increase", -"The rate of change of price will be positive and will depend on the rate of change of excess demand." -] -answ = 3 -radioq(choices, answ, keep_order=true) -``` - -(Theoretically, when demand exceeds supply, prices increase.) - -###### Question - -Which makes more sense from an economic viewpoint? - -```julia; hold=true; echo=false -choices = [ -"If the rate of change of unemployment is negative, the rate of change of wages will be negative.", -"If the rate of change of unemployment is negative, the rate of change of wages will be positive." -] -answ = 2 -radioq(choices, answ, keep_order=true) -``` - -(Colloquially, "the rate of change of unemployment is negative" means the unemployment rate is going down, so there are fewer workers available to fill new jobs.) - -###### Question - -In chemistry there is a fundamental relationship between pressure -($P$), temperature ($T)$ and volume ($V$) given by $PV=cT$ where $c$ -is a constant. Which of the following would be true with respect to time? 
- -```julia; hold=true; echo=false -choices = [ -L"The rate of change of pressure is always increasing by $c$", -"If volume is constant, the rate of change of pressure is proportional to the temperature", -"If volume is constant, the rate of change of pressure is proportional to the rate of change of temperature", -"If pressure is held constant, the rate of change of pressure is proportional to the rate of change of temperature"] -answ = 3 -radioq(choices, answ, keep_order=true) -``` - -###### Question - -A pebble is thrown into a lake causing ripples to form expanding -circles. Suppose one of the circles expands at a rate of ``1`` foot per second and -the radius of the circle is ``10`` feet, what is the rate of change of -the area enclosed by the circle? - -```julia; hold=true; echo=false -# a = pi*r^2 -# da/dt = pi * 2r * drdt -r = 10; drdt = 1 -val = pi * 2r * drdt -numericq(val, units=L"feet$^2$/second") -``` - -###### Question - -A pizza maker tosses some dough in the air. The dough is formed in a -circle with radius ``10``. As it rotates, its area increases at a rate of -``1`` inch$^2$ per second. What is the rate of change of the radius? - -```julia; hold=true; echo=false -# a = pi*r^2 -# da/dt = pi * 2r * drdt -r = 10; dadt = 1 -val = dadt /( pi * 2r) -numericq(val, units="inches/second") -``` - -###### Question - - -An FBI agent with a powerful spyglass is located in a boat anchored -400 meters offshore. A gangster under surveillance is driving along -the shore. Assume the shoreline is straight and that the gangster is 1 -km from the point on the shore nearest to the boat. If the spyglasses -must rotate at a rate of $\pi/4$ radians per minute to track -the gangster, how fast is the gangster moving? (In kilometers per minute.) 
-[Source.](http://oregonstate.edu/instruct/mth251/cq/Stage9/Practice/ratesProblems.html)
-
-
-```julia; hold=true; echo=false
-## tan(theta) = x/y
-## sec^2(theta) dtheta/dt = 1/y dx/dt (y is constant)
-## dxdt = y sec^2(theta) dtheta/dt
-dthetadt = pi/4
-y0 = .4; x0 = 1.0
-theta = atan(x0/y0)
-val = y0 * sec(theta)^2 * dthetadt
-numericq(val, units="kilometers/minute")
-```
-
-
-###### Question
-
-A flood lamp is installed on the ground 200 feet from a vertical
-wall. A six foot tall man is walking towards the wall at the rate of
-4 feet per second. How fast is the tip of his shadow moving down the
-wall when he is 50 feet from the wall?
-[Source.](http://oregonstate.edu/instruct/mth251/cq/Stage9/Practice/ratesProblems.html)
-(As the question is written the answer should be positive.)
-
-```julia; hold=true; echo=false
-## y/200 = 6/x
-## dydt = 200 * 6 * -1/x^2 dxdt
-x0 = 200 - 50
-dxdt = 4
-val = 200 * 6 * (1/x0^2) * dxdt
-numericq(val, units="feet/second")
-```
-
-
-###### Question
-
-
-Consider the hyperbola $y = 1/x$ and think of it as a slide. A
-particle slides along the hyperbola so that its x-coordinate is
-increasing at a rate of $f(x)$ units/sec. If its $y$-coordinate is
-decreasing at a constant rate of $1$ unit/sec, what is $f(x)$?
-[Source.](http://oregonstate.edu/instruct/mth251/cq/Stage9/Practice/ratesProblems.html)
-
-```julia; hold=true; echo=false
-choices = [
-"``f(x) = 1/x``",
-"``f(x) = x^0``",
-"``f(x) = x``",
-"``f(x) = x^2``"
-]
-answ = 4
-radioq(choices, answ, keep_order=true)
-```
-
-###### Question
-
-A balloon is in the shape of a sphere which, fortunately, gives
-a known formula, $V=4/3 \pi r^3$, for the volume. If the balloon is being filled so that the rate of
-change of volume per unit time is $2$ and the radius is $3$, what is the
-rate of change of radius per unit time? 
-
-```julia; hold=true; echo=false
-r, dVdt = 3, 2
-drdt = dVdt / (4 * pi * r^2)
-numericq(drdt, units="units per unit time")
-```
-
-###### Question
-
-Consider the curve $f(x) = x^2 - \log(x)$. For a given $x$, the tangent line intersects the $y$ axis. Where?
-
-```julia; hold=true; echo=false
-choices = [
-"``y = 1 - x^2 - \\log(x)``",
-"``y = 1 - x^2``",
-"``y = 1 - \\log(x)``",
-"``y = x(2x - 1/x)``"
-]
-answ = 1
-radioq(choices, answ)
-```
-
-If $dx/dt = -1$, what is $dy/dt$?
-
-```julia; hold=true; echo=false
-choices = [
-"``dy/dt = 2x + 1/x``",
-"``dy/dt = 1 - x^2 - \\log(x)``",
-"``dy/dt = -2x - 1/x``",
-"``dy/dt = 1``"
-]
-answ=1
-radioq(choices, answ)
-```
diff --git a/CwJ/derivatives/symbolic_derivatives.jmd b/CwJ/derivatives/symbolic_derivatives.jmd
deleted file mode 100644
index f579056..0000000
--- a/CwJ/derivatives/symbolic_derivatives.jmd
+++ /dev/null
@@ -1,218 +0,0 @@
-# Symbolic derivatives
-
-This section uses this add-on package:
-
-```julia
-using TermInterface
-```
-
-```julia; echo=false
-const frontmatter = (
-    title = "Symbolic derivatives",
-    description = "Calculus with Julia: Symbolic derivatives",
-    tags = ["CalculusWithJulia", "derivatives", "symbolic derivatives"],
-);
-```
-
-----
-
-The ability to break down an expression into operations and their
-arguments is necessary when trying to apply the differentiation
-rules. Such rules are applied from the outside in. Identifying
-the proper "outside" function is usually most of the battle when finding derivatives.
-
-In the following example, we provide a sketch of a framework to differentiate expressions by a chosen symbol to illustrate how the outer function drives the task of differentiation.
-
-The `Symbolics` package provides native symbolic manipulation abilities for `Julia`, similar to `SymPy`, though without the dependence on `Python`.
The `TermInterface` package, used by `Symbolics`, provides a generic interface for expression manipulation that is *also* implemented for `Julia`'s expressions and symbols.
-
-An expression is an unevaluated portion of code that for our purposes
-below contains other expressions, symbols, and numeric literals. They
-are held in the `Expr` type. A symbol, such as `:x`, is distinct from
-a string (e.g. `"x"`) and is useful to the programmer to distinguish
-the contents a variable points to from the name of the
-variable. Symbols are fundamental to metaprogramming in `Julia`. An
-expression is a specification of some set of statements to execute. A
-numeric literal is just a number.
-
-The three main functions from `TermInterface` we leverage are `istree`, `operation`, and `arguments`. The `operation` function returns the "outside" function of an expression. For example:
-
-```julia
-operation(:(sin(x)))
-```
-
-We see the `sin` function, referred to by a symbol (`:sin`).
-The `:(...)` above *quotes* the argument, and does not evaluate it, hence `x` need not be defined above. (The `:` notation is used to create both symbols and expressions.)
-
-
-The arguments are the terms that the outside function is called on. For our purposes there may be ``1`` (*unary*), ``2`` (*binary*), or more than ``2`` (*nary*) arguments. (We ignore zero-argument functions.) For example:
-
-```julia
-arguments(:(-x)), arguments(:(pi^2)), arguments(:(1 + x + x^2))
-```
-
-(The last one may be surprising, but all three arguments are passed to the `+` function.)
-
-
-Here we define a function to decide the *arity* of an expression based on the number of arguments it is called with:
-
-```julia
-function arity(ex)
-    n = length(arguments(ex))
-    n == 1 ? Val(:unary) :
-    n == 2 ? Val(:binary) : Val(:nary)
-end
-```
-
-
-Differentiation must distinguish between expressions, variables, and
-numbers.
Mathematically, expressions have an "outer" function, whereas variables and numbers can be directly differentiated. The `istree`
-function in `TermInterface` returns `true` when passed an expression,
-and `false` when passed a symbol or numeric literal. The latter two
-may be distinguished by `isa(..., Symbol)`.
-
-Here we create a function, `D`, that *dispatches* to a specific method of `D` based on the outer operation and arity when it encounters an expression; otherwise, for a symbol or a numeric literal, it does the differentiation directly:
-
-```julia
-function D(ex, var=:x)
-    if istree(ex)
-        op, args = operation(ex), arguments(ex)
-        D(Val(op), arity(ex), args, var)
-    elseif isa(ex, Symbol) && ex == var
-        1
-    else
-        0
-    end
-end
-```
-
-Now we develop methods for `D` for different "outside" functions and arities.
-
-Addition can be unary (`:(+x)` is a valid quoting, even if it might simplify to the symbol `:x` when evaluated), *binary*, or *nary*. Here we implement the *sum rule*:
-
-```julia
-D(::Val{:+}, ::Val{:unary}, args, var) = D(first(args), var)
-
-function D(::Val{:+}, ::Val{:binary}, args, var)
-    a′, b′ = D.(args, var)
-    :($a′ + $b′)
-end
-
-function D(::Val{:+}, ::Val{:nary}, args, var)
-    a′s = D.(args, var)
-    :(+($a′s...))
-end
-```
-
-The `args` are always held in a container, so the unary method must pull out the first one. The binary case should read as: apply `D` to each of the two arguments, and then create a quoted expression containing the sum of the results. The dollar signs interpolate into the quoting. (The "primes" are unicode notation achieved through `\prime[tab]` and not operations.) The *nary* case does something similar, only uses splatting to produce the sum.
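The quoting-and-splatting step is worth seeing in isolation. This is a small standalone demonstration using only plain `Julia` quoting, no `TermInterface` required; the splice form `$(a′s...)` used here splats when the quoted expression is built:

```julia
# Pretend these are the derivatives of three summands:
a′s = [1, 0, :(2x)]
# Splice them into a single quoted call to `+`:
ex = :(+($(a′s...)))
@assert ex == :(1 + 0 + 2x)       # the same parsed expression
@assert ex.args == [:+; a′s]      # a `:call` holding `+` and the three terms
```

The resulting `Expr` is indistinguishable from one produced by parsing the sum directly.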
-
-Subtraction must also be implemented in a similar manner, though only for the unary and binary cases:
-
-```julia
-function D(::Val{:-}, ::Val{:unary}, args, var)
-    a′ = D(first(args), var)
-    :(-$a′)
-end
-function D(::Val{:-}, ::Val{:binary}, args, var)
-    a′, b′ = D.(args, var)
-    :($a′ - $b′)
-end
-```
-
-The *product rule* is similar to addition, in that ``3`` cases are considered:
-
-```julia
-D(op::Val{:*}, ::Val{:unary}, args, var) = D(first(args), var)
-
-function D(::Val{:*}, ::Val{:binary}, args, var)
-    a, b = args
-    a′, b′ = D.(args, var)
-    :($a′ * $b + $a * $b′)
-end
-
-function D(op::Val{:*}, ::Val{:nary}, args, var)
-    a, bs... = args
-    b = :(*($(bs...)))
-    a′ = D(a, var)
-    b′ = D(b, var)
-    :($a′ * $b + $a * $b′)
-end
-```
-
-The *nary* case above just peels off the first factor and then uses the binary product rule.
-
-Division is only a binary operation, so here we have the *quotient rule*:
-
-```julia
-function D(::Val{:/}, ::Val{:binary}, args, var)
-    u,v = args
-    u′, v′ = D(u, var), D(v, var)
-    :( ($u′*$v - $u*$v′)/$v^2 )
-end
-```
-
-Powers are handled a bit differently. The power rule would require checking that the exponent does not contain the variable of differentiation; exponential derivatives would require checking that the base does not contain the variable of differentiation. Trying to implement both would be tedious, so we use the fact that ``x = \exp(\log(x))`` (for `x` in the domain of `log`; more care is necessary if `x` is negative) to differentiate:
-
-```julia
-function D(::Val{:^}, ::Val{:binary}, args, var)
-    a, b = args
-    D(:(exp($b*log($a))), var) # a > 0 assumed here
-end
-```
-
-
-That leaves the task of defining a rule to differentiate both `exp` and `log`.
-We do so with *unary* definitions.
In the following we also implement `sin` and `cos` rules:
-
-```julia
-function D(::Val{:exp}, ::Val{:unary}, args, var)
-    a = first(args)
-    a′ = D(a, var)
-    :(exp($a) * $a′)
-end
-
-function D(::Val{:log}, ::Val{:unary}, args, var)
-    a = first(args)
-    a′ = D(a, var)
-    :(1/$a * $a′)
-end
-
-function D(::Val{:sin}, ::Val{:unary}, args, var)
-    a = first(args)
-    a′ = D(a, var)
-    :(cos($a) * $a′)
-end
-
-function D(::Val{:cos}, ::Val{:unary}, args, var)
-    a = first(args)
-    a′ = D(a, var)
-    :(-sin($a) * $a′)
-end
-```
-
-The pattern is similar for each. The `$a′` factor is needed due to the *chain rule*. The above illustrates the simple pattern necessary to add a derivative rule for a function. More rules could be added, but for this example the above will suffice, as now the system is ready to be put to work.
-
-
-```julia
-ex₁ = :(x + 2/x)
-D(ex₁, :x)
-```
-
-The output does not simplify, so some work is needed to identify `1 - 2/x^2` as the answer.
-
-
-```julia
-ex₂ = :( (x + sin(x))/sin(x))
-D(ex₂, :x)
-```
-
-Again, simplification is not performed.
-
-Finally, we have a second derivative taken below:
-
-```julia
-ex₃ = :(sin(x) - x - x^3/6)
-D(D(ex₃, :x), :x)
-```
-
-
-The length of the expression should lead to further appreciation for the simplification steps taken when doing such a computation by hand.
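To close the loop, here is a condensed, self-contained variant of the same idea that uses the `head`/`args` fields of `Julia`'s `Expr` type directly rather than `TermInterface`. The helper `d` and its small rule set are illustrative only; the numeric check at the end compares against the hand-computed derivative:

```julia
# A miniature of the framework above, using `Expr` fields directly.
d(ex::Number, var) = 0
d(ex::Symbol, var) = ex == var ? 1 : 0
function d(ex::Expr, var)
    op, args = ex.args[1], ex.args[2:end]
    if op == :+                                # sum rule
        Expr(:call, :+, (d(a, var) for a in args)...)
    elseif op == :* && length(args) == 2       # binary product rule
        a, b = args
        :($(d(a, var)) * $b + $a * $(d(b, var)))
    elseif op == :sin                          # chain rule through sin
        :(cos($(args[1])) * $(d(args[1], var)))
    else
        error("no rule for $op")
    end
end

dex = d(:(x * sin(x)), :x)    # an unsimplified product-rule expression
x = 2.0
@assert eval(dex) ≈ sin(x) + x * cos(x)   # agrees with the hand derivative
```

As with `D`, no simplification is attempted; the point is only that the recursion dispatches on the outer operation.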
diff --git a/CwJ/derivatives/taylor_series_polynomials.jmd b/CwJ/derivatives/taylor_series_polynomials.jmd
deleted file mode 100644
index 4258ab8..0000000
--- a/CwJ/derivatives/taylor_series_polynomials.jmd
+++ /dev/null
@@ -1,1217 +0,0 @@
-# Taylor Polynomials and other Approximating Polynomials
-
-This section uses these add-on packages:
-
-```julia
-using CalculusWithJulia
-using Plots
-using SymPy
-using Unitful
-```
-
-
-```julia; echo=false; results="hidden"
-using CalculusWithJulia.WeaveSupport
-using Roots
-
-fig_size = (800, 600)
-const frontmatter = (
-    title = "Taylor Polynomials and other Approximating Polynomials",
-    description = "Calculus with Julia: Taylor Polynomials and other Approximating Polynomials",
-    tags = ["CalculusWithJulia", "derivatives", "taylor polynomials and other approximating polynomials"],
-);
-nothing
-```
-
-The tangent line was seen to be the "best" linear approximation to a
-function at a point $c$. Approximating a function by a linear function
-gives an easier-to-use approximation at the expense of accuracy. It
-suggests a tradeoff between ease and accuracy. Is there a way to gain more accuracy at the expense of ease?
-
-Quadratic functions are still fairly easy to work with. Is it possible to find the best "quadratic"
-approximation to a function at a point $c$?
-
-More generally, for a given $n$, what would be the best polynomial of
-degree $n$ to approximate $f(x)$ at $c$?
-
-We will see in this section how the Taylor polynomial answers these
-questions, and is the appropriate generalization of the tangent
-line approximation.
-
-
-```julia; hold=true; echo=false; cache=true
-###{{{taylor_animation}}}
-taylor(f, x, c, n) = series(f, x, c, n+1).removeO()
-function make_taylor_plot(u, a, b, k)
-    k = 2k
-    plot(u, a, b, title="plot of T_$k", linewidth=5, legend=false, size=fig_size, ylim=(-2,2.5))
-    if k == 1
-        plot!(zero, range(a, stop=b, length=100))
-    else
-        plot!(taylor(u, x, 0, k), range(a, stop=b, length=100))
-    end
-end
-
-
-
-@syms x
-u = 1 - cos(x)
-a, b = -2pi, 2pi
-n = 8
-anim = @animate for i=1:n
-    make_taylor_plot(u, a, b, i)
-end
-
-imgfile = tempname() * ".gif"
-gif(anim, imgfile, fps = 1)
-
-
-caption = L"""
-
-Illustration of the Taylor polynomial of degree $k$, $T_k(x)$, at $c=0$ and its graph overlayed on that of the function $1 - \cos(x)$.
-
-"""
-
-ImageFile(imgfile, caption)
-```
-
-## The secant line and the tangent line
-
-We approach this general problem **much** more indirectly than is needed. We introduce notation attributed to Newton and proceed from there. By leveraging `SymPy` we avoid tedious computations and *hopefully* gain some insight.
-
-Suppose ``f(x)`` is a function which is defined in a neighborhood of
-$c$ and has as many continuous derivatives as we care to take at $c$.
-
-
-We have two related formulas:
-
-* The *secant line* connecting $(c, f(c))$ and $(c+h, f(c+h))$ for a
-  value of $h>0$ is given in point-slope form by
-
-```math
-sl(x) = f(c) + \frac{(f(c+h) - f(c))}{h} \cdot (x-c).
-```
-
-The slope is the familiar approximation to the derivative: $(f(c+h)-f(c))/h$.
-
-* The *tangent line* to the graph of $f(x)$ at $x=c$ is described by
-  the function
-
-```math
-tl(x) = f(c) + f'(c) \cdot(x - c).
-```
-
-The key is that the term multiplying ``(x-c)`` for the secant line is an approximation to the related term for the tangent line.
-That is, the secant line approximates the tangent
-line, which is the linear function that
-best approximates the function at the point $(c, f(c))$.
-This is
-quantified by the *mean value theorem* which states under our
-assumptions on ``f(x)`` that there exists some $\xi$ between $x$ and
-$c$ for which:
-
-```math
-f(x) - tl(x) = \frac{f''(\xi)}{2} \cdot (x-c)^2.
-```
-
-
-The term "best" is deserved, as any other straight line will differ
-at least in an $(x-c)$ term, which in general is larger than an
-$(x-c)^2$ term for $x$ "near" $c$.
-
-
-(This is a consequence of Cauchy's mean value theorem with ``F(c) = f(c) - f'(c)\cdot(c-x)`` and ``G(c) = (c-x)^2``
-
-```math
-\begin{align*}
-\frac{F'(\xi)}{G'(\xi)} &=
-\frac{f'(\xi) - f''(\xi)(\xi-x) - f'(\xi)\cdot 1}{2(\xi-x)} \\
-&= -f''(\xi)/2\\
-&= \frac{F(c) - F(x)}{G(c) - G(x)}\\
-&= \frac{f(c) - f'(c)(c-x) - (f(x) - f'(x)(x-x))}{(c-x)^2 - (x-x)^2} \\
-&= \frac{f(c) + f'(c)(x-c) - f(x)}{(x-c)^2}
-\end{align*}
-```
-
-That is, ``f(x) = f(c) + f'(c)(x-c) + f''(\xi)/2\cdot(x-c)^2``, or ``f(x)-tl(x)`` is as described.)
-
-
-The secant line also has an interpretation that will generalize - it is the smallest order polynomial that goes through, or *interpolates*, the points $(c,f(c))$ and $(c+h, f(c+h))$. This is obvious from the construction - as this is how the slope is derived - but from the formula itself requires showing $sl(c) = f(c)$ and $sl(c+h) = f(c+h)$. The former is straightforward, as $(c-c) = 0$, so clearly $sl(c) = f(c)$. The latter requires a bit of algebra.
-
-
-We have:
-
-> The best *linear* approximation at a point ``c`` is related to the *linear* polynomial interpolating the points ``c`` and ``c+h`` as ``h`` goes to ``0``.
-
-This is the relationship we seek to generalize through our roundabout approach below:
-
-> The best approximation at a point ``c`` by a polynomial of degree ``n`` or less is related to the polynomial interpolating through the points ``c, c+h, \dots, c+nh`` as ``h`` goes to ``0``.
-
-As in the linear case, there is flexibility in the exact points chosen for the interpolation.
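The limiting behavior just described is easy to check numerically. This standalone check uses an arbitrarily chosen function and point:

```julia
# The secant slope (f(c+h) - f(c))/h tends to the tangent slope f'(c) = cos(c).
f(x) = sin(x)
c = 0.5
secant_slope(h) = (f(c + h) - f(c)) / h
@assert abs(secant_slope(1e-6) - cos(c)) < 1e-5
# and the error shrinks as h does:
@assert abs(secant_slope(1e-3) - cos(c)) > abs(secant_slope(1e-6) - cos(c))
```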
- ----- - -Now, we take a small detour to define some notation. Instead of -writing our two points as $c$ and $c+h,$ we use $x_0$ and -$x_1$. For any set of points $x_0, x_1, \dots, x_n$, -define the **divided differences** of $f$ inductively, as follows: - -```math -\begin{align} -f[x_0] &= f(x_0) \\ -f[x_0, x_1] &= \frac{f[x_1] - f[x_0]}{x_1 - x_0}\\ -\cdots &\\ -f[x_0, x_1, x_2, \dots, x_n] &= \frac{f[x_1, \dots, x_n] - f[x_0, x_1, x_2, \dots, x_{n-1}]}{x_n - x_0}. -\end{align} -``` - - -We see the first two values look familiar, and to generate more we just take certain ratios akin to those formed when finding a secant line. - - -With this notation the secant line can be re-expressed as: - -```math -sl(x) = f[c] + f[c, c+h] \cdot (x-c). -``` - -If we think of $f[c, c+h]$ as an approximate *first* derivative, we -have an even stronger parallel between a secant line $x=c$ and the -tangent line at $x=c$: ``tl(x) = f(c) + f'(c)\cdot (x-c)``. - -We use `SymPy` to investigate. First we create a *recursive* function to compute the divided differences: - -```julia; -divided_differences(f, x) = f(x) - -function divided_differences(f, x, xs...) - xs = sort(vcat(x, xs...)) - (divided_differences(f, xs[2:end]...) - divided_differences(f, xs[1:end-1]...)) / (xs[end] - xs[1]) -end -``` - -In the following, by adding a `getindex` method, we enable the `[]` notation of Newton to work with symbolic functions, like `u()` defined below, which is used in place of ``f``: - -```julia; -Base.getindex(u::SymFunction, xs...) = divided_differences(u, xs...) - -@syms x::real c::real h::positive u() -ex = u[c, c+h] -``` - -We can take a limit and see the familiar (yet differently represented) value of $u'(c)$: - -```julia; -limit(ex, h => 0) -``` - -The choice of points is flexible. Here we use ``c-h`` and ``c+h``: - -```julia -limit(u[c-h, c+h], h=>0) -``` - - -Now, let's look at: - -```julia; -ex₂ = u[c, c+h, c+2h] -simplify(ex₂) -``` - -Not so bad after simplification. 
The limit shows this to be an approximation to the second derivative divided by $2$: - -```julia; -limit(ex₂, h => 0) -``` - -(The expression is, up to a divisor of $2$, the second order forward -[difference equation](http://tinyurl.com/n4235xy), a well-known -approximation to $f''$.) - - -This relationship between higher-order divided differences and higher-order derivatives generalizes. This is expressed in this -[theorem](http://tinyurl.com/zjogv83): - -> Suppose $m=x_0 < x_1 < x_2 < \dots < x_n=M$ are distinct points. If $f$ has $n$ -> continuous derivatives then there exists a value $\xi$, where $m < \xi < M$, satisfying: - -```math -f[x_0, x_1, \dots, x_n] = \frac{1}{n!} \cdot f^{(n)}(\xi). -``` - -This immediately applies to the above, where we parameterized by $h$: -$x_0=c, x_1=c+h, x_2 = c+2h$. For then, as $h$ goes to $0$, it must be that $m, M -\rightarrow c$, and so the limit of the divided differences must -converge to $(1/2!) \cdot f^{(2)}(c)$, as $f^{(2)}(\xi)$ converges to $f^{(2)}(c)$. - -A proof based on Rolle's theorem appears in the appendix. - - -## Quadratic approximations; interpolating polynomials - -Why the fuss? The answer comes from a result of Newton on -*interpolating* polynomials. Consider a function $f$ and $n+1$ points -$x_0$, $x_1, \dots, x_n$. Then an interpolating polynomial is a -polynomial of least degree that goes through each point $(x_i, -f(x_i))$. The [Newton form](https://en.wikipedia.org/wiki/Newton_polynomial) of such a -polynomial can be written as: - -```math -\begin{align*} -f[x_0] &+ f[x_0,x_1] \cdot (x-x_0) + f[x_0, x_1, x_2] \cdot (x-x_0) \cdot (x-x_1) + \\ -& \cdots + f[x_0, x_1, \dots, x_n] \cdot (x-x_0)\cdot \cdots \cdot (x-x_{n-1}). -\end{align*} -``` - -The case $n=0$ gives the value $f[x_0] = f(c)$, which can be interpreted as the slope-$0$ line that goes through the point $(c,f(c))$. 
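The theorem can be spot-checked numerically. This standalone snippet re-implements the divided-difference recursion for three points (as a hypothetical helper `dd`, to keep it independent of the symbolic code above) and compares against $f''(c)/2!$:

```julia
# For closely spaced points, f[x0, x1, x2] ≈ f''(ξ)/2! with ξ nearby.
f(x) = exp(x)                      # convenient, as f'' = exp as well
dd(f, a)       = f(a)
dd(f, a, b)    = (dd(f, b) - dd(f, a)) / (b - a)
dd(f, a, b, c) = (dd(f, b, c) - dd(f, a, b)) / (c - a)
c, h = 1.0, 1e-3
@assert abs(dd(f, c, c + h, c + 2h) - exp(c)/2) < 1e-2
```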
We are familiar with the case $n=1$: with $x_0=c$ and $x_1=c+h$, this becomes our secant-line formula:
-
-```math
-f[c] + f[c, c+h](x-c).
-```
-
-As mentioned, we can verify directly that it
-interpolates the points $(c,f(c))$ and $(c+h, f(c+h))$. Here we let `SymPy` do the algebra:
-
-```julia;
-p₁ = u[c] + u[c, c+h] * (x-c)
-p₁(x => c) - u(c), p₁(x => c+h) - u(c+h)
-```
-
-
-Now for something new. Take the $n=2$ case with
-$x_0 = c$, $x_1 = c + h$, and $x_2 = c+2h$. Then the interpolating polynomial is:
-
-```math
-f[c] + f[c, c+h](x-c) + f[c, c+h, c+2h](x-c)(x-(c+h)).
-```
-
-We add the next term to our previous polynomial and simplify:
-
-```julia;
-p₂ = p₁ + u[c, c+h, c+2h] * (x-c) * (x-(c+h))
-simplify(p₂)
-```
-
-We can check that this interpolates the three points. Notice that at
-$x_0=c$ and $x_1=c+h$, the last term, $f[x_0, x_1,
-x_2]\cdot(x-x_0)(x-x_1)$, vanishes, so we already have the polynomial
-interpolating there. Only the
-value $x_2=c+2h$ remains to be checked:
-
-```julia;
-p₂(x => c+2h) - u(c+2h)
-```
-
-Hmm, doesn't seem correct - that was supposed to be $0$. The issue isn't the math, it is that SymPy needs to be encouraged to simplify:
-
-```julia;
-simplify(p₂(x => c+2h) - u(c+2h))
-```
-
-By contrast, at the point $x=c+3h$ we have no guarantee of interpolation, and indeed don't, as this expression is not always zero:
-
-```julia;
-simplify(p₂(x => c+3h) - u(c+3h))
-```
-
-Interpolating polynomials are of interest in their own right, but for now we want to use them as motivation for the best polynomial approximation of a certain degree for a function. Motivated by how the secant line leads to the tangent line, we note that coefficients of the quadratic interpolating polynomial above have limits as $h$ goes to $0$, leaving this polynomial:
-
-```math
-f(c) + f'(c) \cdot (x-c) + \frac{1}{2!} \cdot f''(c) (x-c)^2.
-```
-
-This is clearly related to the tangent line approximation of $f(x)$ at
-$x=c$, but carrying an extra quadratic term.
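The quality of this quadratic approximation can be quantified numerically. For $f(x) = \cos(x)$ and $c=0$ it is $1 - x^2/2$, and the error should shrink at least at cubic order (for cosine, in fact, at quartic order, as the odd terms vanish):

```julia
# The error cos(x) - (1 - x^2/2) is bounded by |x|^3 for the x values below.
T2(x) = 1 - x^2/2
for x in (0.5, 0.1, 0.01)
    @assert abs(cos(x) - T2(x)) <= abs(x)^3
end
```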
- -Here we visualize the approximations with -the function $f(x) = \cos(x)$ at $c=0$. - -```julia; hold=true -f(x) = cos(x) -a, b = -pi/2, pi/2 -c = 0 -h = 1/4 - -fp = -sin(c) # by hand, or use diff(f), ... -fpp = -cos(c) - - -p = plot(f, a, b, linewidth=5, legend=false, color=:blue) -plot!(p, x->f(c) + fp*(x-c), a, b; color=:green, alpha=0.25, linewidth=5) # tangent line is flat -plot!(p, x->f(c) + fp*(x-c) + (1/2)*fpp*(x-c)^2, a, b; color=:green, alpha=0.25, linewidth=5) # a parabola -p -``` - -This graph illustrates that the extra quadratic term can track the -curvature of the function, whereas the tangent line itself can't. So, -we have a polynomial which is a "better" approximation, is it the best -approximation? - - -The Cauchy mean value theorem, as in the case of the tangent line, will guarantee the existence of $\xi$ between $c$ and $x$, for which - -```math -f(x) - \left(f(c) + f'(c) \cdot(x-c) + \frac{1}{2}\cdot f''(c) \cdot (x-c)^2 \right) = -\frac{1}{3!}f'''(\xi) \cdot (x-c)^3. -``` - -In this sense, the above quadratic polynomial, called the Taylor Polynomial of degree 2, is the best *quadratic* approximation to $f$, as the difference goes to $0$ at a rate of ``(x-c)^3``. - - -The graphs of the secant line and approximating parabola for $h=1/4$ are similar: - - -```julia; hold=true -f(x) = cos(x) -a, b = -pi/2, pi/2 -c = 0 -h = 1/4 - -x0, x1, x2 = c-h, c, c+h - -f0 = divided_differences(f, x0) -fd = divided_differences(f, x0, x1) -fdd = divided_differences(f, x0, x1, x2) - -p = plot(f, a, b, color=:blue, linewidth=5, legend=false) -plot!(p, x -> f0 + fd*(x-x0), a, b, color=:green, alpha=0.25, linewidth=5); -plot!(p, x -> f0 + fd*(x-x0) + fdd * (x-x0)*(x-x1), a,b, color=:green, alpha=0.25, linewidth=5); -p -``` - -Though similar, the graphs are **not** identical, as the interpolating - polynomials aren't the best approximations. 
For example, in the
-tangent-line graph the parabola only intersects the cosine graph at
-$x=0$, whereas for the secant-line graph - by definition - the
-parabola intersects the graph at least $2$ times and the
-interpolating polynomial $3$ times (at $x_0$, $x_1$, and $x_2$).
-
-
-
-
-##### Example
-
-Consider the function $f(t) = \log(1 + t)$. We have mentioned that for $t$ small, the value $t$ is a good approximation. A better one becomes:
-
-```math
-f(0) + f'(0) \cdot t + \frac{1}{2} \cdot f''(0) \cdot t^2 = 0 + 1t - \frac{t^2}{2}
-```
-
-A graph shows the difference:
-
-```julia; hold=true
-f(t) = log(1 + t)
-a, b = -1/2, 1
-plot(f, a, b, legend=false, linewidth=5)
-plot!(t -> t, a, b)
-plot!(t -> t - t^2/2, a, b)
-```
-
-Though we can see that the tangent line is a good approximation, the
-quadratic polynomial tracks the logarithm better farther from $c=0$.
-
-##### Example
-
-A wire is bent in the form of a half circle with radius $R$ centered
-at $(0,R)$, so the bottom of the wire is at the origin. A bead is
-released on the wire at angle $\theta$. As time evolves, the bead will
-slide back and forth. How? (Ignoring friction.)
-
-
-Let $U$ be the potential energy, $U=mgh = mgR \cdot (1 -
-\cos(\theta))$. The velocity of the object will depend on $\theta$ -
-it will be $0$ at the high point, and largest in magnitude at the
-bottom - and is given by $v(\theta) = R \cdot d\theta/ dt$. (The bead
-moves along the wire so its distance traveled is $R\cdot \Delta
-\theta$; this, then, is just the time derivative of distance.)
-
-By ignoring friction, the total energy is conserved giving:
-
-```math
-K = \frac{1}{2}m v^2 + mgR \cdot (1 - \cos(\theta)) =
-\frac{1}{2} m R^2 (\frac{d\theta}{dt})^2 + mgR \cdot (1 - \cos(\theta)).
-```
-
-The value of $1-\cos(\theta)$ inhibits further work which would be possible were there an easier formula there. In fact, we could replace $\cos(\theta)$ with its excellent quadratic approximation $1 - \theta^2/2$.
Then we have:
-
-```math
-K \approx \frac{1}{2} m R^2 (\frac{d\theta}{dt})^2 + mgR \cdot \theta^2/2.
-```
-
-Assuming equality and differentiating in $t$ gives by the chain rule:
-
-```math
-0 = \frac{1}{2} m R^2 2\frac{d\theta}{dt} \cdot \frac{d^2\theta}{dt^2} + mgR \theta\cdot \frac{d\theta}{dt}.
-```
-
-This can be solved to give this relationship:
-
-```math
-\frac{d^2\theta}{dt^2} = - \frac{g}{R}\theta.
-```
-
-The solution to this "equation" can be written (in some
-parameterization) as $\theta(t)=A\cos \left(\omega t+\phi
-\right)$. This motion is the well-studied simple [harmonic
-oscillator](https://en.wikipedia.org/wiki/Harmonic_oscillator), a
-model for a simple pendulum.
-
-#### Example: optimization
-
-Consider the following approach to finding the minimum or maximum of a function:
-
-* At ``x_k`` fit a quadratic polynomial to ``f(x)`` matching the derivative and second derivative of ``f``.
-* Let ``x_{k+1}`` be at the vertex of this fitted quadratic polynomial.
-* Iterate to convergence.
-
-
-The polynomial in question will be the Taylor polynomial of degree ``2``:
-
-```math
-T_2(x) = f(x_k) + f'(x_k)(x-x_k) + \frac{f''(x_k)}{2}(x - x_k)^2
-```
-
-The vertex of this quadratic polynomial will be where its derivative is ``0``, which can be solved for ``x_{k+1}`` giving:
-
-```math
-x_{k+1} = x_k - \frac{f'(x_k)}{f''(x_k)}.
-```
-
-This assumes ``f''(x_k)`` is non-zero.
-
-On inspection, it is seen that this is Newton's method applied to
-``f'(x)``. This method, when convergent, finds a zero of ``f'(x)``. We
-know that should the algorithm converge, it will have found a critical
-point, not necessarily a local extremum.
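A sketch of this iteration in code, using hand-coded derivatives for an assumed example function $f(x) = x^4/4 - x^2$ (so $f'(x) = x^3 - 2x$ and $f''(x) = 3x^2 - 2$):

```julia
# The quadratic-fit iteration x_{k+1} = x_k - f'(x_k)/f''(x_k), run a fixed
# number of times from a starting guess.
function quadfit(fp, fpp, x; iters = 20)
    for _ in 1:iters
        x -= fp(x) / fpp(x)
    end
    x
end

fp(x)  = x^3 - 2x     # f'(x)  for f(x) = x^4/4 - x^2
fpp(x) = 3x^2 - 2     # f''(x)
xstar = quadfit(fp, fpp, 1.5)
@assert abs(xstar - sqrt(2)) < 1e-10   # a critical point of f at √2
@assert fpp(xstar) > 0                 # here it happens to be a local minimum
```

Checking `fpp` at the limit is how one would distinguish a minimum from a maximum or neither, since, as noted, the iteration only finds critical points.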
-
-## The Taylor polynomial of degree ``n``
-
-
-Starting with the Newton form of the interpolating polynomial of smallest degree:
-
-```math
-\begin{align*}
-f[x_0] &+ f[x_0,x_1] \cdot (x - x_0) + f[x_0, x_1, x_2] \cdot (x - x_0)\cdot(x-x_1) + \\
-& \cdots + f[x_0, x_1, \dots, x_n] \cdot (x-x_0) \cdot \cdots \cdot (x-x_{n-1}).
-\end{align*}
-```
-
-and taking $x_i = c + i\cdot h$, for a given $n$, we have in the limit as $h > 0$ goes to zero that the coefficients of this polynomial converge to the coefficients of the *Taylor Polynomial of degree n*:
-
-```math
-f(c) + f'(c)\cdot(x-c) + \frac{f''(c)}{2!}(x-c)^2 + \cdots + \frac{f^{(n)}(c)}{n!} (x-c)^n.
-```
-
-
-
-This polynomial will be the best approximation of degree ``n`` or less
-to the function $f$, near $c$. The error will be given, again by an
-application of the Cauchy mean value theorem, by:
-
-```math
-\frac{1}{(n+1)!} \cdot f^{(n+1)}(\xi) \cdot (x-c)^{n+1}
-```
-
-for some $\xi$ between $c$ and $x$.
-
-
-
-The Taylor polynomial for $f$ about $c$ of degree $n$ can be computed
-by taking $n$ derivatives. For such a task, the computer is very
-helpful. In `SymPy` the `series` function will compute the Taylor
-polynomial for a given $n$. For example, here is the series expansion
-to 10 terms of the function $\log(1+x)$ about $c=0$:
-
-
-```julia; hold=true
-c, n = 0, 10
-l = series(log(1 + x), x, c, n+1)
-```
-
-A pattern can be observed.
-
-
-
-Using `series`, we can see Taylor polynomials for several familiar functions:
-
-
-```julia;
-series(1/(1-x), x, 0, 10) # sum x^i for i in 0:n
-```
-
-```julia;
-series(exp(x), x, 0, 10) # sum x^i/i! for i in 0:n
-```
-
-```julia;
-series(sin(x), x, 0, 10) # sum (-1)^i * x^(2i+1) / (2i+1)! for i in 0:n
-```
-
-
-```julia;
-series(cos(x), x, 0, 10) # sum (-1)^i * x^(2i) / (2i)! for i in 0:n
-```
-
-Each of these last three has a pattern that can be expressed quite succinctly if the denominator is recognized as $n!$.
The output of `series` includes a big "Oh" term, which identifies the
-scale of the error term, but also gets in the way of using the
-output. `SymPy` provides the `removeO` method to strip this. (It is called as `object.removeO()`, as it is a method of an object in SymPy.)
-
-
-
-!!! note
-    A Taylor polynomial of degree ``n`` consists of ``n+1`` terms and an error term. The "Taylor series" is an *infinite* collection of terms, the first ``n+1`` matching the Taylor polynomial of degree ``n``. The fact that series are *infinite* means care must be taken when even talking about their existence, unlike a Taylor polynomial, which is just a polynomial and exists as long as a sufficient number of derivatives are available.
-
-
-
-We define a function to compute Taylor polynomials from a function. The following returns a function, not a symbolic object, using `D`, from `CalculusWithJulia`, which is based on `ForwardDiff.derivative`, to find higher-order derivatives:
-
-```julia;
-function taylor_poly(f, c=0, n=2)
-    x -> f(c) + sum(D(f, i)(c) * (x-c)^i / factorial(i) for i in 1:n)
-end
-```
-
-With a function, we can compare values.
-For example, here we see the difference between the Taylor polynomial and the answer for a small value of $x$:
-
-```julia; hold=true
-a = .1
-f(x) = log(1+x)
-Tn = taylor_poly(f, 0, 5)
-Tn(a) - f(a)
-```
-
-
-### Plotting
-
-Let's now visualize a function and the two approximations - the Taylor
-polynomial and the interpolating polynomial. We use this function to
-generate the interpolating polynomial as a function:
-
-```julia;
-function newton_form(f, xs)
-    x -> begin
-        tot = divided_differences(f, xs[1])
-        for i in 2:length(xs)
-            tot += divided_differences(f, xs[1:i]...)
* prod([x-xs[j] for j in 1:(i-1)])
-        end
-        tot
-    end
-end
-```
-
-To see a plot, we have
-
-```julia;
-𝒇(x) = sin(x)
-𝒄, 𝒉, 𝒏 = 0, 1/4, 4
-int_poly = newton_form(𝒇, [𝒄 + i*𝒉 for i in 0:𝒏])
-tp = taylor_poly(𝒇, 𝒄, 𝒏)
-𝒂, 𝒃 = -pi, pi
-plot(𝒇, 𝒂, 𝒃; linewidth=5, label="f")
-plot!(int_poly; color=:green, label="interpolating")
-plot!(tp; color=:red, label="Taylor")
-```
-
-To get a better sense, we plot the residual differences here:
-
-```julia
-d1(x) = 𝒇(x) - int_poly(x)
-d2(x) = 𝒇(x) - tp(x)
-plot(d1, 𝒂, 𝒃; color=:blue, label="interpolating")
-plot!(d2; color=:green, label="Taylor")
-```
-
-The graph should be $0$ at each of the points in `xs`, which we
-can verify in the graph above. Plotting over a wider region shows a
-common phenomenon that these polynomials approximate the function near
-the values, but quickly deviate away:
-
-
-In this graph we make a plot of the Taylor polynomial for different sizes of $n$ for the function ``f(x) = 1 - \cos(x)``:
-
-
-```julia; hold=true
-f(x) = 1 - cos(x)
-a, b = -pi, pi
-plot(f, a, b, linewidth=5, label="f")
-plot!(taylor_poly(f, 0, 2), label="T₂")
-plot!(taylor_poly(f, 0, 4), label="T₄")
-plot!(taylor_poly(f, 0, 6), label="T₆")
-```
-
-Though all are good approximations near $c=0$, as more terms are
-included, the Taylor polynomial becomes a better approximation over a wider
-range of values.
-
-
-##### Example: period of an orbiting satellite
-
-Kepler's third [law](http://tinyurl.com/y7oa4x2g) of planetary motion states:
-
-> The square of the orbital period of a planet is directly proportional to the cube of the semi-major axis of its orbit.
-
-In formulas, $P^2 = a^3 \cdot (4\pi^2) / (G\cdot(M + m))$, where $M$ and $m$ are the respective masses. Suppose a satellite is in low earth orbit with a constant height. Use a Taylor polynomial to approximate the period using Kepler's third law to relate the quantities.
- -Suppose $R$ is the radius of the earth and $h$ the height above the earth assuming $h$ is much smaller than $R$. The mass $m$ of a satellite is negligible to that of the earth, so $M+m=M$ for this purpose. We have: - -```math -P = \frac{2\pi}{\sqrt{G\cdot M}} \cdot (h+R)^{3/2} = \frac{2\pi}{\sqrt{G\cdot M}} \cdot R^{3/2} \cdot (1 + h/R)^{3/2} = P_0 \cdot (1 + h/R)^{3/2}, -``` - -where $P_0$ collects terms that involve the constants. - -We can expand $(1+x)^{3/2}$ to fifth order, to get: - -```math -(1+x)^{3/2} \approx 1 + \frac{3x}{2} + \frac{3x^2}{8} - \frac{1x^3}{16} + \frac{3x^4}{128} -\frac{3x^5}{256} -``` - -Our approximation becomes: - -```math -P \approx P_0 \cdot (1 + \frac{3(h/R)}{2} + \frac{3(h/R)^2}{8} - \frac{(h/R)^3}{16} + \frac{3(h/R)^4}{128} - \frac{3(h/R)^5}{256}). -``` - -Typically, if $h$ is much smaller than $R$ the first term is enough giving a formula like $P \approx P_0 \cdot(1 + \frac{3h}{2R})$. - - -A satellite phone utilizes low orbit satellites to relay phone communications. The [Iridium](http://www.kddi.com/english/business/cloud-network-voice/satellite/iridium/mobile/) system uses satellites with an elevation ``h=780km``. The radius of the earth is $3,959 miles$, the mass of the earth is $5.972 × 10^{24} kg$, and the gravitational [constant](https://en.wikipedia.org/wiki/Gravitational_constant), $G$ is $6.67408 \cdot 10^{-11}$ $m^3/(kg \cdot s^2)$. - -Compare the approximate value with ``1`` term to the exact value. - -```julia; -G = 6.67408e-11 -H = 780 * 1000 -R = 3959 * 1609.34 # 1609 meters per mile -M = 5.972e24 -P0, HR = (2pi)/sqrt(G*M) * R^(3/2), H/R - -Preal = P0 * (1 + HR)^(3/2) -P1 = P0 * (1 + 3*HR/2) -Preal, P1 -``` - -With terms out to the fifth power, we get a better approximation: - -```julia; -P5 = P0 * (1 + 3*HR/2 + 3*HR^2/8 - HR^3/16 + 3*HR^4/128 - 3*HR^5/256) -``` - -The units of the period above are in seconds. 
That is about ``100`` minutes:
-
-```julia;
-Preal/60
-```
-
-When $H$ is much smaller than $R$, the ``5``th-order approximation is
-very good, and even the ``1``-term approximation is serviceable. Next we check whether this
-holds when $H$ is larger than $R$.
-
-----
-
-The height of a [GPS satellite](http://www.gps.gov/systems/gps/space/) is about $12,550$ miles. Compute the period of a circular orbit and compare with the estimates.
-
-```julia;
-Hₛ = 12550 * 1609.34 # 1609.34 meters per mile
-HRₛ = Hₛ/R
-
-Prealₛ = P0 * (1 + HRₛ)^(3/2)
-P1ₛ = P0 * (1 + 3*HRₛ/2)
-P5ₛ = P0 * (1 + 3*HRₛ/2 + 3*HRₛ^2/8 - HRₛ^3/16 + 3*HRₛ^4/128 - 3*HRₛ^5/256)
-
-Prealₛ, P1ₛ, P5ₛ
-```
-
-We see the Taylor polynomial underestimates badly in this case. A reminder
-that these approximations are locally good, but may not be good on all
-scales. Here $h \approx 3R$. We can see from this graph
-of $(1+x)^{3/2}$ and its ``5``th-degree Taylor polynomial $T_5$ that it is a bad approximation when $x > 2$.
-
-```julia; echo=false
-f1(x) = (1+x)^(3/2)
-p2(x) = 1 + 3x/2 + 3x^2/8 - x^3/16 + 3x^4/128 - 3x^5/256
-plot(f1, -1, 3, linewidth=4, legend=false)
-plot!(p2, -1, 3)
-```
-
-----
-
-Finally, we show how to use the `Unitful` package. This package allows us to define different units, carry these
-units through computations, and convert between similar units with
-`uconvert`. In this example, we define several units, then show how
-they can then be used as constants.
-
-```julia; hold=true
-using Unitful
-m, mi, kg, s, hr = u"m", u"mi", u"kg", u"s", u"hr"
-
-G = 6.67408e-11 * m^3 / kg / s^2
-H = uconvert(m, 12550 * mi) # unit-convert miles to meters
-R = uconvert(m, 3959 * mi)
-M = 5.972e24 * kg
-
-P0, HR = (2pi)/sqrt(G*M) * R^(3/2), H/R
-Preal = P0 * (1 + HR)^(3/2) # in seconds
-Preal, uconvert(hr, Preal) # ≈ 11.97 hours
-```
-
-We see `Preal` has the right units - the units of mass and distance cancel, leaving a measure of time - but it is hard to sense how long this is. 
Converting to hours helps us see that the satellite orbits about twice per day.
-
-
-
-##### Example: computing $\log(x)$
-
-Where exactly does the value assigned to $\log(5)$ come from? The
-value needs to be computed. At some level, many computations reduce
-to the basic operations of addition, subtraction, multiplication, and
-division - preferably not the latter, as division is slow. Polynomials
-require only the first three, so they are fast to compute, and computing
-logarithms using a polynomial becomes desirable.
-
-But how? One can see details of a possible
-way
-[here](https://github.com/musm/Amal.jl/blob/master/src/log.jl).
-
-First, there is usually a reduction stage. In this phase, the problem
-is transformed into one involving only a fixed interval of values. For this,
-values $k$ and $m$ are found so that $x = 2^k \cdot (1+m)$
-*and* $\sqrt{2}/2 < 1+m < \sqrt{2}$. If these are found, then $\log(x)$ can be computed with
-$k \cdot \log(2) + \log(1+m)$. The first term - a multiplication - can easily be
-computed using a pre-computed value of $\log(2)$; the second *reduces* the problem to a fixed interval.
-
-
-Now, for this problem a further
-trick is utilized, writing $s = m/(2+m)$ so that
-$\log(1+m)=\log(1+s)-\log(1-s)$ for some small range of $s$ values. Combined, these make it possible to compute $\log(x)$ for any real $x$.
-
-To compute $\log(1\pm s)$, we can find a Taylor polynomial. Let's go out to degree $19$ and use `SymPy` to do the work:
-
-```julia;
-@syms s
-aₗ = series(log(1 + s), s, 0, 19)
-bₗ = series(log(1 - s), s, 0, 19)
-a_b = (aₗ - bₗ).removeO() # remove"Oh" not remove"zero"
-```
-
-This is re-expressed as $2s + s \cdot p$ with $p$ given by:
-
-```julia;
-cancel((a_b - 2s)/s)
-```
-
-Now, $2s = m - s\cdot m$, so the above can be reworked to be $\log(1+m) = m - s\cdot(m-p)$.
-
-
-(For larger values of $m$, a similar but different approximation can be used to minimize floating point errors.)
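The reduction step described above can be sketched directly. This is a minimal illustration, not the library's actual implementation; the helper name `logreduce` is made up, and Julia's built-in `frexp` (which writes `x == f * 2^e` with `0.5 <= f < 1`) does the heavy lifting:

```julia
# Hypothetical sketch of the reduction x = 2^k * (1 + m), with
# sqrt(2)/2 < 1 + m < sqrt(2), so that log(x) = k*log(2) + log(1 + m).
function logreduce(x)
    f, e = frexp(x)            # x == f * 2^e with 0.5 <= f < 1
    if 2f < sqrt(2)
        return (e - 1, 2f - 1) # 1 + m = 2f lies in [1, sqrt(2))
    else
        return (e, f - 1)      # 1 + m = f lies in [sqrt(2)/2, 1)
    end
end

k, m = logreduce(5.0)          # (2, 0.25), since 5 = 2^2 * 1.25
k * log(2) + log(1 + m)        # recovers log(5)
```

For `x = 5.0` this returns `k = 2` and `m = 0.25`, matching the worked example below.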
-
-
-How big can the error be between this *approximation* and $\log(1+m)$? We plot to see how big $s$ can be:
-
-```julia;
-@syms v
-plot(v/(2+v), sqrt(2)/2 - 1, sqrt(2)-1)
-```
-
-This shows $s$ is at most
-
-```julia;
-Max = (v/(2+v))(v => sqrt(2) - 1)
-```
-
-The error in truncating the series is about the size of the first omitted term, $2s^{19}/19$, which is largest at this value `Max`. Large is relative - it is really small:
-
-```julia;
-(2/19)*Max^19
-```
-
-Basically, that is machine precision, which means that, as far as can be told on the computer, the value produced by $2s + s \cdot p$ is about as accurate as can be done.
-
-Let's try this out by computing $\log(5)$. We have $5 = 2^2(1+0.25)$, so $k=2$ and $m=0.25$.
-
-```julia
-k, m = 2, 0.25
-𝒔 = m / (2+m)
-pₗ = 2 * sum(𝒔^(2i)/(2i+1) for i in 1:8) # the polynomial part approximating the logarithm
-
-log(1 + m), m - 𝒔*(m-pₗ), log(1 + m) - ( m - 𝒔*(m-pₗ))
-
-```
-
-The two values differ by less than $10^{-16}$, as advertised. Reassembling, we compare the computed value with the built-in one:
-
-```julia;
-Δ = k * log(2) + (m - 𝒔*(m-pₗ)) - log(5)
-```
-
-
-The actual code is different, as the Taylor polynomial isn't
-used. The Taylor polynomial is a great approximation near a point, but
-there might be better polynomial approximations for all values in an interval.
-In this case there is, and that polynomial is used in the production
-setting. This makes things a bit more efficient, but the basic idea
-remains - for a prescribed accuracy, a polynomial approximation can
-be found over a given interval, which can be cleverly utilized to
-evaluate the function for all applicable values.
-
-
-##### Example: higher order derivatives of the inverse function
-
-For notational purposes, let ``g(x)`` be the inverse function for ``f(x)``. 
Assume *both* functions have a Taylor polynomial expansion:
-
-```math
-\begin{align*}
-f(x_0 + \Delta_x) &= f(x_0) + a_1 \Delta_x + a_2 (\Delta_x)^2 + \cdots + a_n (\Delta_x)^n + \dots\\
-g(y_0 + \Delta_y) &= g(y_0) + b_1 \Delta_y + b_2 (\Delta_y)^2 + \cdots + b_n (\Delta_y)^n + \dots
-\end{align*}
-```
-
-Then, using ``x = g(f(x))`` and expanding the terms, with ``\approx`` indicating that the ``\dots`` are dropped:
-
-```math
-\begin{align*}
-x_0 + \Delta_x &= g(f(x_0 + \Delta_x)) \\
-&\approx g(f(x_0) + \sum_{j=1}^n a_j (\Delta_x)^j) \\
-&\approx g(f(x_0)) + \sum_{i=1}^n b_i \left(\sum_{j=1}^n a_j (\Delta_x)^j \right)^i \\
-&\approx x_0 + \sum_{i=1}^{n-1} b_i \left(\sum_{j=1}^n a_j (\Delta_x)^j\right)^i + b_n \left(\sum_{j=1}^n a_j (\Delta_x)^j\right)^n
-\end{align*}
-```
-
-That is:
-
-```math
-b_n \left(\sum_{j=1}^n a_j (\Delta_x)^j \right)^n =
-(x_0 + \Delta_x) - \left( x_0 + \sum_{i=1}^{n-1} b_i \left(\sum_{j=1}^n a_j (\Delta_x)^j \right)^i \right)
-```
-
-Solving for ``b_n = g^{(n)}(y_0) / n!`` gives the formal expression:
-
-```math
-g^{(n)}(y_0) = n! \cdot \lim_{\Delta_x \rightarrow 0}
-\frac{\Delta_x - \sum_{i=1}^{n-1} b_i \left(\sum_{j=1}^n a_j (\Delta_x)^j \right)^i}{
-\left(\sum_{j=1}^n a_j (\Delta_x)^j\right)^n}
-```
-
-(This is following [Liptaj](https://vixra.org/pdf/1703.0295v1.pdf)).
-
-We will use `SymPy` to take this limit for the first `4` derivatives. 
Here is some code that expands ``x_0 + \Delta_x = g(f(x_0 + \Delta_x))`` and then uses `SymPy` to solve:
-
-```julia;
-@syms x₀ Δₓ f′[1:4] g′[1:4]
-
-as(i) = f′[i]/factorial(i)
-bs(i) = g′[i]/factorial(i)
-
-gᵏs = Any[]
-eqns = Any[]
-for n ∈ 1:4
-    Δy = sum(as(j) * Δₓ^j for j ∈ 1:n)
-    left = x₀ + Δₓ
-    right = x₀ + sum(bs(i)*Δy^i for i ∈ 1:n)
-
-    eqn = left ~ right
-    push!(eqns, eqn)
-
-    gⁿ = g′[n]
-    ϕ = solve(eqn, gⁿ)[1]
-
-    # replace g′ᵢs in terms of computed f′ᵢs
-    for j ∈ 1:n-1
-        ϕ = subs(ϕ, g′[j] => gᵏs[j])
-    end
-
-    L = limit(ϕ, Δₓ => 0)
-    push!(gᵏs, L)
-
-end
-gᵏs
-```
-
-We can see the expected ``g' = 1/f'`` (the point of evaluation, as in ``g'(y) = 1/f'(f^{-1}(y))``, is not written). In addition, we get 3 more formulas, hinting that the answers grow rapidly in complexity.
-
-For each `n`, the code above sets up the two sides, `left` and `right`, of an equation involving the higher-order derivatives of ``g``. For example, when `n=2` we have:
-
-```julia;
-eqns[2]
-```
-
-
-The `solve` function is used to identify ``g^{(n)}`` represented in terms of lower-order derivatives of ``g``. These values have been computed and stored and are then substituted into `ϕ`. Afterwards, a limit is taken and the answer recorded.
-
-
-
-
-
-## Questions
-
-###### Question
-
-Compute the Taylor polynomial of degree ``10`` for $\sin(x)$ about $c=0$ using `SymPy`. Based on the form, which formula seems appropriate:
-
-```julia; hold=true; echo=false
-choices = [
-"``\\sum_{k=0}^{10} x^k``",
-"``\\sum_{k=1}^{10} (-1)^{k+1} x^k/k``",
-"``\\sum_{k=0}^{4} (-1)^k/(2k+1)! \\cdot x^{2k+1}``",
-"``\\sum_{k=0}^{10} x^k/k!``"
-]
-answ = 3
-radioq(choices, answ)
-```
-
-###### Question
-
-Compute the Taylor polynomial of degree ``10`` for $e^x$ about $c=0$ using `SymPy`. 
Based on the form, which formula seems appropriate:
-
-```julia; hold=true; echo=false
-choices = [
-"``\\sum_{k=0}^{10} x^k``",
-"``\\sum_{k=1}^{10} (-1)^{k+1} x^k/k``",
-"``\\sum_{k=0}^{4} (-1)^k/(2k+1)! \\cdot x^{2k+1}``",
-"``\\sum_{k=0}^{10} x^k/k!``"
-]
-answ = 4
-radioq(choices, answ)
-```
-
-
-
-###### Question
-
-Compute the Taylor polynomial of degree ``10`` for $1/(1-x)$ about $c=0$ using `SymPy`. Based on the form, which formula seems appropriate:
-
-```julia; hold=true; echo=false
-choices = [
-"``\\sum_{k=0}^{10} x^k``",
-"``\\sum_{k=1}^{10} (-1)^{k+1} x^k/k``",
-"``\\sum_{k=0}^{4} (-1)^k/(2k+1)! \\cdot x^{2k+1}``",
-"``\\sum_{k=0}^{10} x^k/k!``"
-]
-answ = 1
-radioq(choices, answ)
-```
-
-###### Question
-
-Let $T_5(x)$ be the Taylor polynomial of degree ``5`` for the function $\sqrt{1+x}$ about $x=0$. What is the coefficient of the $x^5$ term?
-
-```julia; hold=true; echo=false
-choices = [
-"``7/256``",
-"``-5/128``",
-"``1/5!``",
-"``2/15``"
-]
-answ = 1
-radioq(choices, answ)
-```
-
-###### Question
-
-The ``5``th-order Taylor polynomial for $\sin(x)$ about $c=0$ is: $x - x^3/3! + x^5/5!$. Use this to find the first ``3`` terms of the Taylor polynomial of $\sin(x^2)$ about $c=0$.
-
-They are:
-
-```julia; hold=true; echo=false
-choices = [
-"``x^2 - x^6/3! + x^{10}/5!``",
-"``x^2``",
-"``x^2 \\cdot (x - x^3/3! + x^5/5!)``"
-]
-answ = 1
-radioq(choices, answ)
-```
-
-
-
-###### Question
-
-A more direct derivation of the form of the Taylor polynomial (here taken about $c=0$) is to *assume* a polynomial form that matches $f$:
-
-```math
-f(x) = a + bx + cx^2 + dx^3 + ex^4 + \cdots
-```
-
-If this is true, then formally evaluating at $x=0$ gives $f(0) = a$, so $a$ is determined. Similarly, formally differentiating and evaluating at $0$ gives $f'(0) = b$. What is the result of formally differentiating $4$ times and evaluating at $0$:
-
-```julia; hold=true; echo=false
-choices = ["``f''''(0) = e``",
-"``f''''(0) = 4 \\cdot 3 \\cdot 2 e = 4! 
e``", -"``f''''(0) = 0``"] -answ = 2 -radioq(choices, answ) -``` - -###### Question - -How big an error is there in approximating $e^x$ by its ``5``th degree Taylor polynomial about $c=0$, $1 + x + x^2/2! + x^3/3! + x^4/4! + x^5/5!$, over $[-1,1]$? - -The error is known to be $( f^{(6)}(\xi)/6!) \cdot x^6$ for some $\xi$ in $[-1,1]$. - - -* The ``6``th derivative of $e^x$ is still $e^x$: - -```julia; hold=true; echo=false -yesnoq(true) -``` - -* Which is true about the function $e^x$: - -```julia; hold=true; echo=false -choices =["It is increasing", "It is decreasing", "It both increases and decreases"] -answ = 1 -radioq(choices, answ) -``` - - -* The maximum value of $e^x$ over $[-1,1]$ occurs at - -```julia; hold=true; echo=false -choices=["A critical point", "An end point"] -answ = 2 -radioq(choices, answ) -``` - -* Which theorem tells you that for a *continuous* function over *closed* interval, a maximum value will exist? - -```julia; hold=true; echo=false -choices = [ -"The intermediate value theorem", -"The mean value theorem", -"The extreme value theorem"] -answ = 3 -radioq(choices, answ) -``` - -* What is the *largest* possible value of the error: - -```julia; hold=true; echo=false -choices = [ -"``1/6!\\cdot e^1 \\cdot 1^6``", -"``1^6 \\cdot 1 \\cdot 1^6``"] -answ = 1 -radioq(choices,answ) -``` - -###### Question - -The error in using $T_k(x)$ to approximate $e^x$ over the interval $[-1/2, 1/2]$ is $(1/(k+1)!) e^\xi x^{k+1}$, for some $\xi$ in the interval. This is *less* than $1/((k+1)!) e^{1/2} (1/2)^{k+1}$. - -* Why? 
-
-
-```julia; hold=true; echo=false
-choices = [
-L"The function $e^x$ is increasing, so takes on its largest value at the endpoint, and the function $|x^n| \leq |x|^n \leq (1/2)^n$",
-L"The function has a critical point at $x=1/2$",
-L"The function is monotonic in $k$, so achieves its maximum at $k+1$"
-]
-answ = 1
-radioq(choices, answ)
-```
-
-Assuming the above is right, find the smallest value $k$ guaranteeing an error of no more than $10^{-16}$.
-
-```julia; hold=true; echo=false
-f(k) = 1/factorial(k+1) * exp(1/2) * (1/2)^(k+1)
-(f(13) > 1e-16 && f(14) < 1e-16) && numericq(14)
-```
-
-* The function $f(x) = (1 - x + x^2) \cdot e^x$ has a Taylor polynomial about ``0`` such that all coefficients are rational numbers. Is it true that the numerators are all either ``1`` or prime? (From the 2014 [Putnam](http://kskedlaya.org/putnam-archive/2014.pdf) exam.)
-
-Here is one way to get all the values bigger than 1:
-
-```julia; hold=true;
-@syms x
-ex = (1 - x + x^2)*exp(x)
-Tn = series(ex, x, 0, 100).removeO()
-ps = sympy.Poly(Tn, x).coeffs()
-qs = numer.(ps)
-qs[qs .> 1] |> Tuple # format better for output
-```
-
-Verify by hand that each of the remaining values is a prime number to answer the question (or use `sympy.isprime.(qs)`).
-
-Are they all prime or $1$?
-
-```julia; hold=true; echo=false
-yesnoq(true)
-```
-
-## Appendix
-
-We mentioned two facts that could use a proof: the Newton form of the interpolating polynomial and the mean value theorem for divided differences. Our explanation tries to emphasize a parallel with the secant line's relationship to the tangent line. The standard way to discuss Taylor polynomials is different (also more direct), and so these two proofs are not in most calculus texts. 
-
-A [proof](https://www.math.uh.edu/~jingqiu/math4364/interpolation.pdf) of the Newton form can be done knowing that the interpolating polynomial is unique and can be expressed either as
-
-```math
-g(x)=a_0 + a_1 (x-x_0) + \cdots + a_n (x-x_0)\cdot\cdots\cdot(x-x_{n-1})
-```
-
-*or* in this reversed form
-
-```math
-h(x)=b_0 + b_1 (x-x_n) + b_2(x-x_n)(x-x_{n-1}) + \cdots + b_n (x-x_n)(x-x_{n-1})\cdot\cdots\cdot(x-x_1).
-```
-
-These two polynomials are of degree $n$ or less and have $u(x) = h(x)-g(x)=0$, by uniqueness. So the coefficients of $u(x)$ are $0$. The coefficient of $x^n$ is $b_n-a_n$, so $a_n=b_n$. Our goal is to express $a_n$ in terms of $a_{n-1}$ and $b_{n-1}$. Focusing on the $x^{n-1}$ term, we have:
-
-```math
-\begin{align*}
-b_n(x-x_n)(x-x_{n-1})\cdot\cdots\cdot(x-x_1)
-&- a_n\cdot(x-x_0)\cdot\cdots\cdot(x-x_{n-1}) \\
-&=
-a_n [(x-x_1)\cdot\cdots\cdot(x-x_{n-1})] [(x- x_n)-(x-x_0)] \\
-&=
--a_n \cdot(x_n - x_0) x^{n-1} + p_{n-2},
-\end{align*}
-```
-
-where $p_{n-2}$ is a polynomial of degree at most $n-2$. (The expansion of $(x-x_1)\cdot\cdots\cdot(x-x_{n-1})$ leaves $x^{n-1}$ plus some lower-degree polynomial.) Similarly, we have
-$a_{n-1}(x-x_0)\cdot\cdots\cdot(x-x_{n-2}) = a_{n-1}x^{n-1} + q_{n-2}$ and
-$b_{n-1}(x-x_n)\cdot\cdots\cdot(x-x_2) = b_{n-1}x^{n-1}+r_{n-2}$. Combining, we get that the $x^{n-1}$ term of $u(x)$ is
-
-```math
-(b_{n-1}-a_{n-1}) - a_n(x_n-x_0) = 0.
-```
-
-On rearranging, this yields $a_n = (b_{n-1}-a_{n-1}) / (x_n - x_0)$. By *induction* - that $a_i=f[x_0, x_1, \dots, x_i]$ and $b_i = f[x_n, x_{n-1}, \dots, x_{n-i}]$ (which has a trivial base case) - this is $(f[x_1, \dots, x_n] - f[x_0,\dots x_{n-1}])/(x_n-x_0)$.
-
-Now, assuming the Newton form is correct, a
-[proof](http://tinyurl.com/zjogv83) of the mean value theorem for
-divided differences comes down to Rolle's theorem. 
Starting from the
-Newton form of the polynomial and expanding in terms of
-$1, x, \dots, x^n$ we see that
-$g(x) = p_{n-1}(x) + f[x_0, x_1, \dots,x_n]\cdot x^n$,
-where now $p_{n-1}(x)$ is a
-polynomial of degree at most $n-1$. That is, the coefficient of
-$x^n$ is $f[x_0, x_1, \dots, x_n]$. Consider the function $h(x)=f(x) - g(x)$.
-It has zeros $x_0, x_1, \dots, x_n$.
-
-By Rolle's theorem, between any two such zeros $x_i, x_{i+1}$, $0 \leq i < n$, there must be a zero of the derivative of $h(x)$, say $\xi^1_i$. So $h'(x)$ has zeros $\xi^1_0 < \xi^1_1 < \dots < \xi^1_{n-1}$.
-
-
-We visualize this with $f(x) = \sin(x)$ and $x_i = i$ for $i=0, 1, 2, 3$. The $x_i$ values are indicated with circles, the $\xi^1_i$ values with squares:
-
-```julia; hold=true; echo=false
-f(x) = sin(x)
-xs = 0:3
-dd = divided_differences
-g(x) = dd(f,0) + dd(f, 0,1)*x + dd(f, 0,1,2)*x*(x-1) + dd(f, 0,1,2,3)*x*(x-1)*(x-2)
-h1(x) = f(x) - g(x)
-cps = find_zeros(D(h1), -1, 4)
-plot(h1, -1/4, 3.25, linewidth=3, legend=false)
-scatter!(xs, h1.(xs), markersize=5)
-scatter!(cps, h1.(cps), markersize=5, marker=:square)
-```
-
-
-
-Again by Rolle's theorem, between any pair of adjacent zeros $\xi^1_i, \xi^1_{i+1}$ there must be a zero $\xi^2_i$ of $h''(x)$. So there are $n-1$ zeros of $h''(x)$. Continuing, we see that there will be
-$n+1-3$ zeros of $h^{(3)}(x)$,
-$n+1-4$ zeros of $h^{(4)}(x)$, $\dots$,
-$n+1-(n-1)$ zeros of $h^{(n-1)}(x)$, and finally
-$n+1-n$ ($1$) zeros of $h^{(n)}(x)$. Call this last zero $\xi$. It satisfies $x_0 \leq \xi \leq x_n$. Further,
-$0 = h^{(n)}(\xi) = f^{(n)}(\xi) - g^{(n)}(\xi)$. But $g$ is a degree $n$ polynomial, so the $n$th derivative is the coefficient of $x^n$ times $n!$. In this case we have $0 = f^{(n)}(\xi) - f[x_0, \dots, x_n]\cdot n!$. Rearranging yields the result.
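The conclusion - that $n! \cdot f[x_0, \dots, x_n] = f^{(n)}(\xi)$ for some $\xi$ between $x_0$ and $x_n$ - can be sanity-checked numerically. The sketch below uses $f = \exp$, whose every derivative is again $\exp$; `divdiff` is a hypothetical stand-in for the `divided_differences` helper used above (note the different signature, taking a vector of nodes):

```julia
# Recursive divided differences over a vector of nodes; a hypothetical
# stand-in for the text's `divided_differences` helper.
divdiff(f, x) = length(x) == 1 ? f(x[1]) :
    (divdiff(f, x[2:end]) - divdiff(f, x[1:end-1])) / (x[end] - x[1])

f = exp
xs = [0.0, 0.5, 1.0, 1.5]
n = length(xs) - 1
val = factorial(n) * divdiff(f, xs)  # equals exp(ξ) for some ξ in [0, 1.5]
exp(xs[1]) <= val <= exp(xs[end])    # consistent with the theorem
```

Since `val` equals $e^\xi$ for a $\xi$ in the interval, it must land between $e^{x_0}$ and $e^{x_n}$, which the last line confirms.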
diff --git a/CwJ/differentiable_vector_calculus/Project.toml b/CwJ/differentiable_vector_calculus/Project.toml deleted file mode 100644 index c6d07d2..0000000 --- a/CwJ/differentiable_vector_calculus/Project.toml +++ /dev/null @@ -1,15 +0,0 @@ -[deps] -CSV = "336ed68f-0bac-5ca0-87d4-7b16caf5d00b" -Contour = "d38c429a-6771-53c6-b99e-75d170b6e991" -DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0" -DifferentialEquations = "0c46a032-eb83-5123-abaf-570d42b7fbaa" -ForwardDiff = "f6369f11-7733-5829-9624-2563aa707210" -IntervalSets = "8197267c-284f-5f27-9208-e0e47529a953" -JSON = "682c06a0-de6a-54ab-a142-c8b1cf79cde6" -LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e" -MDBM = "dd61e66b-39ce-57b0-8813-509f78be4b4d" -Plots = "91a5bcdd-55d7-5caf-9e0b-520d859cae80" -PyPlot = "d330b81b-6aea-500a-939a-2ce795aea3ee" -QuadGK = "1fd47b50-473d-5c70-9696-f719f8f3bcdc" -Roots = "f2b01f46-fcfa-551c-844a-d8ac1e96c665" -SymPy = "24249f21-da20-56a4-8eb1-6a02cf4ae2e6" diff --git a/CwJ/differentiable_vector_calculus/cache/polar_coordinates.cache b/CwJ/differentiable_vector_calculus/cache/polar_coordinates.cache deleted file mode 100644 index c60435f..0000000 Binary files a/CwJ/differentiable_vector_calculus/cache/polar_coordinates.cache and /dev/null differ diff --git a/CwJ/differentiable_vector_calculus/cache/scalar_functions.cache b/CwJ/differentiable_vector_calculus/cache/scalar_functions.cache deleted file mode 100644 index 99023b4..0000000 Binary files a/CwJ/differentiable_vector_calculus/cache/scalar_functions.cache and /dev/null differ diff --git a/CwJ/differentiable_vector_calculus/cache/scalar_functions_applications.cache b/CwJ/differentiable_vector_calculus/cache/scalar_functions_applications.cache deleted file mode 100644 index 5b88a36..0000000 Binary files a/CwJ/differentiable_vector_calculus/cache/scalar_functions_applications.cache and /dev/null differ diff --git a/CwJ/differentiable_vector_calculus/cache/vector_fields.cache 
b/CwJ/differentiable_vector_calculus/cache/vector_fields.cache deleted file mode 100644 index 94110da..0000000 Binary files a/CwJ/differentiable_vector_calculus/cache/vector_fields.cache and /dev/null differ diff --git a/CwJ/differentiable_vector_calculus/cache/vector_valued_functions.cache b/CwJ/differentiable_vector_calculus/cache/vector_valued_functions.cache deleted file mode 100644 index d8b1599..0000000 Binary files a/CwJ/differentiable_vector_calculus/cache/vector_valued_functions.cache and /dev/null differ diff --git a/CwJ/differentiable_vector_calculus/cache/vectors.cache b/CwJ/differentiable_vector_calculus/cache/vectors.cache deleted file mode 100644 index 1fead85..0000000 Binary files a/CwJ/differentiable_vector_calculus/cache/vectors.cache and /dev/null differ diff --git a/CwJ/differentiable_vector_calculus/data/hearts.mmd b/CwJ/differentiable_vector_calculus/data/hearts.mmd deleted file mode 100644 index e86a9b7..0000000 --- a/CwJ/differentiable_vector_calculus/data/hearts.mmd +++ /dev/null @@ -1,29 +0,0 @@ -From [bennedich](https://discourse.julialang.org/t/love-in-245-characters-code-golf/20771) - -``` - 0:2e-3:2π .|>d->(P= - fill(5<<11,64 ,25);z=8cis( -d)sin(.46d);P[ 64,:].=10;for -r=0:98,c=0 :5^3 x,y=@.mod(2- -$reim((.016c-r/49im-1-im)z), - 4)-2;4-x^2>√2(y+.5-√√x^2)^ - 2&&(P[c÷2+1,r÷4+1]|=Int( - ")*,h08H¨"[4&4c+1+r& - 3])-40)end;print( - "\e[H\e[1;31m", - join(Char.( - P))) - ); -``` - - - -[New York Times](https://www.nytimes.com/2019/02/14/science/math-algorithm-valentine.html) - -Süss — German for “sweet” — is an interactive widget that allows you to tweak the algebra and customize the heart to your souls’s delight. It was created for Valentine’s Day by Imaginary, a nonprofit organization in Berlin that designs open-source mathematics programs and exhibitions. - -You can stretch and squeeze the heart by moving the two left-most sliders, which change the “a” and “b” parameters; the right-most slider zooms in and out. 
Better yet, canoodle directly with Süss’s equation and engage in the dialectical interplay between algebra and geometry. (Change that final z³ to a z² to see the heart in its underwear.) - -``` -(x^2+((1+b)*y)^2+z^2-1)^3-x^2*z^3-a*y^2*z^3 -``` diff --git a/CwJ/differentiable_vector_calculus/data/lenape.csv b/CwJ/differentiable_vector_calculus/data/lenape.csv deleted file mode 100644 index d05e3a0..0000000 --- a/CwJ/differentiable_vector_calculus/data/lenape.csv +++ /dev/null @@ -1,72 +0,0 @@ -"","elevation","elev_units","longitude","latitude" -"1",126.85,"meters",-74.2986363,40.7541939 -"2",125.19,"meters",-74.298561,40.754122 -"3",123.52,"meters",-74.298505,40.754049 -"4",121.92,"meters",-74.298435,40.753972 -"5",119.86,"meters",-74.298402,40.753872 -"6",119.86,"meters",-74.298416,40.753818 -"7",119.86,"meters",-74.298393,40.753805 -"8",118.32,"meters",-74.298233,40.753717 -"9",118.48,"meters",-74.298113,40.753706 -"10",118.48,"meters",-74.298079,40.753714 -"11",110.65,"meters",-74.297548,40.753434 -"12",108.68,"meters",-74.297364,40.753392 -"13",108.68,"meters",-74.2973338,40.7533463 -"14",107.67,"meters",-74.2972265,40.7533169 -"15",107.54,"meters",-74.297087,40.753356 -"16",107.54,"meters",-74.2970438,40.7533584 -"17",106.74,"meters",-74.296979,40.753397 -"18",107.69,"meters",-74.29689,40.753533 -"19",108.01,"meters",-74.296812,40.753661 -"20",108.34,"meters",-74.296718,40.753785 -"21",108.93,"meters",-74.296627,40.753874 -"22",109.26,"meters",-74.296514,40.753973 -"23",109.44,"meters",-74.296377,40.754026 -"24",107.8,"meters",-74.296184,40.754049 -"25",108.14,"meters",-74.29596,40.754119 -"26",108.31,"meters",-74.295761,40.754191 -"27",107.08,"meters",-74.295542,40.754277 -"28",106.54,"meters",-74.295345,40.754276 -"29",105.18,"meters",-74.295177,40.754295 -"30",104.93,"meters",-74.2951,40.754358 -"31",103.79,"meters",-74.294976,40.754381 -"32",103.79,"meters",-74.294943,40.754379 -"33",103.62,"meters",-74.294873,40.754362 
-"34",103.46,"meters",-74.294805,40.754359 -"35",102.68,"meters",-74.294687,40.754349 -"36",102.78,"meters",-74.294537,40.754269 -"37",100.91,"meters",-74.294341,40.754248 -"38",101.24,"meters",-74.294228,40.754249 -"39",101.15,"meters",-74.294146,40.75427 -"40",100.73,"meters",-74.294043,40.754277 -"41",100.77,"meters",-74.293997,40.75418 -"42",97.54,"meters",-74.293672,40.75418 -"43",97.58,"meters",-74.293539,40.754324 -"44",97.41,"meters",-74.293442,40.754447 -"45",97.02,"meters",-74.29342,40.754555 -"46",96.78,"meters",-74.293397,40.754677 -"47",96.72,"meters",-74.293319,40.754787 -"48",96.98,"meters",-74.2933093,40.7549621 -"49",97.04,"meters",-74.2931914,40.7550903 -"50",95.89,"meters",-74.2931359,40.7552002 -"51",95.48,"meters",-74.293124,40.75528 -"52",95.43,"meters",-74.293142,40.755375 -"53",95.58,"meters",-74.293163,40.7554692 -"54",95.58,"meters",-74.2931806,40.7555174 -"55",95.31,"meters",-74.2930826,40.7555402 -"56",95.45,"meters",-74.2930283,40.7555572 -"57",94.19,"meters",-74.2929292,40.7555853 -"58",93.57,"meters",-74.2928114,40.7556067 -"59",92.9,"meters",-74.2927408,40.7556127 -"60",92.9,"meters",-74.2926921,40.7556257 -"61",91.46,"meters",-74.2926528,40.7556602 -"62",91.46,"meters",-74.2926104,40.7556888 -"63",88.42,"meters",-74.2925696,40.7557042 -"64",88.42,"meters",-74.2925272,40.7556876 -"65",85.62,"meters",-74.2924927,40.7556674 -"66",85.32,"meters",-74.2924503,40.755646 -"67",85.32,"meters",-74.2924377,40.7556222 -"68",85.32,"meters",-74.2924377,40.7555877 -"69",84.49,"meters",-74.2924346,40.7555365 -"70",84.49,"meters",-74.2924236,40.755502 -"71",84.36,"meters",-74.2923562,40.7554961 diff --git a/CwJ/differentiable_vector_calculus/data/somocon.json b/CwJ/differentiable_vector_calculus/data/somocon.json deleted file mode 100644 index 4bd5066..0000000 --- a/CwJ/differentiable_vector_calculus/data/somocon.json +++ /dev/null @@ -1 +0,0 @@ 
-{"ys":[40.7261855236006,40.72742629854822,40.728667073495835,40.72990784844345,40.73114862339107,40.73238939833869,40.73363017328631,40.734870948233926,40.736111723181544,40.73735249812916,40.73859327307678,40.7398340480244,40.74107482297202,40.742315597919635,40.74355637286725,40.74479714781487,40.74603792276249,40.74727869771011,40.748519472657726,40.749760247605344,40.75100102255296,40.75224179750058,40.7534825724482,40.75472334739582,40.755964122343435,40.75720489729106,40.75844567223868,40.7596864471863,40.760927222133915,40.762167997081534,40.76340877202915,40.76464954697677,40.76589032192439,40.767131096872006,40.768371871819625,40.76961264676724,40.77085342171486,40.77209419666248,40.7733349716101,40.774575746557716,40.775816521505334,40.77705729645295,40.77829807140057,40.77953884634819,40.78077962129581,40.782020396243425,40.78326117119104,40.78450194613866,40.78574272108628,40.7869834960339],"zs":[56.01,51.48,51.74,44.9,45.6,48.76,51.94,61.55,73.28,66.29,63.29,61.46,58.49,50.06,44.18,41.35,39.9,39.84,36.59,39.0,32.85,30.03,33.51,41.28,48.27,62.42,60.31,47.04,45.5,48.2,53.46,64.52,90.94,99.45,88.22,77.94,74.07,67.39,57.02,48.88,45.05,48.09,43.18,39.9,39.08,40.62,33.1,31.9,36.77,43.92,71.71,69.82,57.64,46.98,60.73,60.83,96.1,142.76,137.34,115.32,94.89,84.02,76.91,67.68,56.16,48.75,51.23,49.58,44.37,39.98,41.33,39.08,33.27,34.35,38.63,74.36,76.11,64.27,50.65,61.01,87.64,127.37,144.27,154.88,154.21,121.78,101.25,89.54,78.44,65.93,57.72,53.51,54.97,50.69,45.22,40.48,37.04,35.56,33.26,36.96,77.43,64.79,71.06,63.36,57.43,106.1,134.93,142.34,148.85,155.99,159.95,124.49,103.86,87.97,76.57,68.22,60.35,57.23,55.58,51.15,42.79,40.6,41.94,34.53,34.96,78.13,73.79,75.67,56.64,60.87,96.08,128.31,138.22,143.42,153.35,161.87,155.41,122.68,104.06,87.5,77.81,68.75,64.85,60.88,59.44,47.98,44.57,45.44,42.71,35.23,77.37,77.0,77.61,59.73,63.84,99.1,113.48,135.22,152.77,151.16,158.9,162.84,153.3,120.58,103.98,87.91,79.23,73.92,67.96,64.94,53.54,44.0,50.07,46.93,43.32,88.65,84.02
,80.83,65.49,59.07,77.84,103.68,125.53,147.66,160.47,159.49,164.66,165.91,146.24,118.18,102.01,88.47,80.25,74.83,70.48,62.1,49.47,48.1,53.33,48.34,99.44,88.01,84.27,82.97,60.09,69.83,92.51,113.41,135.15,149.8,164.15,160.75,163.8,167.33,147.47,118.41,101.95,88.65,82.18,78.29,69.59,55.74,49.71,51.8,57.88,102.64,104.12,97.82,74.39,68.53,60.14,77.6,99.22,117.22,130.03,153.71,162.09,161.09,164.47,167.19,139.18,114.62,97.97,87.68,83.73,76.03,60.99,51.93,49.41,59.15,114.76,115.62,99.65,95.47,95.31,70.6,60.14,75.41,100.54,122.61,141.39,157.32,160.93,163.17,167.0,158.25,129.02,112.59,96.6,89.03,80.11,65.83,56.0,49.52,54.51,121.11,111.04,114.21,111.8,105.23,102.43,66.8,74.98,99.48,111.59,126.23,144.81,163.02,164.44,165.56,165.01,157.87,129.18,108.67,96.9,87.28,74.81,60.53,53.12,52.44,123.96,126.43,127.51,124.19,107.58,92.65,67.27,74.09,95.48,102.63,123.05,143.73,153.72,166.16,164.96,166.27,166.6,156.49,122.54,104.98,94.24,81.41,70.53,60.0,55.06,132.49,128.75,142.22,132.8,116.22,95.17,78.05,65.54,73.44,94.77,122.63,150.45,159.95,162.54,164.74,165.55,167.95,167.5,141.39,116.28,102.84,88.36,78.59,68.04,59.33,135.9,129.63,135.65,149.65,131.83,109.12,92.13,67.31,71.28,95.2,105.78,133.74,155.48,169.13,167.16,163.63,164.41,168.11,159.05,128.09,109.31,95.72,85.54,79.65,67.48,139.88,133.57,135.32,141.51,145.89,119.2,96.72,77.69,69.21,80.55,95.18,129.15,144.98,165.06,158.96,167.14,159.53,164.18,156.31,133.83,121.74,108.38,93.46,86.54,79.24,142.73,152.05,156.37,143.59,147.61,126.53,103.08,90.34,92.32,71.97,83.82,113.32,135.91,156.39,150.75,161.93,157.2,155.83,154.93,150.37,140.2,119.8,102.33,94.4,90.57,136.21,140.78,156.4,162.33,152.52,138.31,127.93,95.01,89.0,83.32,74.11,102.61,125.18,134.58,135.15,151.46,152.13,149.58,153.91,156.0,162.27,132.0,115.84,105.2,99.94,133.21,139.49,147.1,157.23,167.48,154.91,135.91,111.99,94.57,90.76,78.29,83.11,102.76,112.19,120.68,136.88,137.72,144.85,153.46,161.94,169.27,165.91,133.39,118.98,107.7,136.68,139.87,152.59,158.63,167.04,165.88,161.25,126.56,1
02.07,93.71,79.93,81.49,96.21,93.82,101.26,124.57,146.39,150.26,147.96,158.73,168.95,173.35,156.12,130.89,116.13,131.8,135.34,149.18,158.94,158.41,153.19,153.96,152.89,109.61,101.17,103.41,91.12,79.35,83.68,90.6,118.7,139.47,152.39,155.23,154.36,163.94,171.28,171.87,141.11,123.49,128.26,127.93,136.81,138.07,139.58,140.93,137.3,139.21,119.46,103.23,100.02,95.77,81.51,93.65,103.9,120.91,126.51,147.94,160.93,159.58,158.97,170.02,173.4,171.77,133.46,129.34,123.57,125.8,127.17,134.63,140.14,141.37,132.18,121.88,112.02,104.23,98.86,81.28,97.91,96.43,113.3,139.2,143.14,155.76,166.2,158.24,165.4,171.77,172.06,140.75,112.75,119.41,120.86,124.88,134.42,145.88,159.11,160.31,147.24,125.13,112.38,100.79,87.49,88.87,99.02,114.24,130.88,155.17,156.39,166.16,161.29,161.34,161.94,158.02,158.28,106.87,113.99,118.24,122.56,133.41,145.66,155.71,165.71,165.42,155.14,127.78,111.46,91.2,91.16,97.31,106.22,134.23,149.23,164.3,163.19,170.52,162.95,168.76,152.24,161.13,121.83,121.15,119.0,121.56,129.72,140.09,153.12,162.86,168.34,168.92,161.0,128.69,108.79,89.1,87.66,99.02,125.32,149.07,162.35,168.22,170.28,171.39,167.62,172.27,174.12,124.29,121.09,122.74,121.68,128.9,135.39,145.94,155.36,157.98,166.38,171.83,161.26,117.38,101.24,92.25,90.26,111.61,132.57,145.7,165.92,169.35,170.59,169.76,175.52,175.58,127.46,123.56,122.96,122.32,125.0,129.83,139.32,144.75,145.12,151.16,159.37,163.38,125.29,108.04,102.34,100.38,100.38,118.72,144.25,162.56,168.8,175.06,173.27,175.9,176.65,129.23,123.85,124.3,124.41,126.73,127.33,130.66,135.53,143.09,163.18,161.45,160.31,146.71,123.03,104.09,100.38,100.38,104.73,136.26,158.26,162.47,172.58,172.73,171.67,176.73,120.55,127.56,125.67,126.14,127.67,129.82,129.3,132.26,143.3,161.39,171.13,173.89,171.66,131.8,107.61,100.38,100.38,102.62,126.85,144.66,154.41,172.86,175.48,172.4,173.72,116.01,126.43,125.59,126.5,131.19,129.04,134.07,136.27,135.63,142.09,159.11,175.01,179.52,138.05,101.75,102.8,100.38,100.38,112.51,127.4,152.04,170.15,172.66,169.62,171.11,110.18,130.36
,129.34,129.04,134.03,141.08,139.77,145.16,147.8,149.31,165.49,172.97,175.88,134.81,108.36,103.24,100.38,100.38,106.34,121.12,144.63,160.44,166.56,164.1,169.59,104.51,119.17,134.58,130.33,134.6,141.51,155.8,151.57,159.5,155.76,158.77,163.45,159.48,143.39,111.96,107.48,100.38,100.38,104.99,113.14,132.14,150.97,163.91,158.67,167.88,96.5,107.46,121.18,145.9,141.42,141.48,145.83,143.74,150.62,145.75,144.48,148.36,146.6,149.51,122.77,111.64,104.44,100.38,102.92,111.09,126.52,143.73,159.52,155.23,163.23,90.59,97.17,112.25,134.94,140.29,135.93,137.72,134.54,137.69,132.11,135.01,141.86,149.14,153.19,150.77,116.52,106.07,101.64,102.67,106.72,122.36,136.38,153.14,160.57,154.01,84.27,91.0,99.89,120.09,123.15,120.86,128.41,127.42,126.42,129.29,143.8,149.59,165.05,163.71,156.91,126.17,110.32,103.98,102.7,105.38,113.28,126.0,141.28,156.94,142.7,79.52,85.85,89.72,96.0,103.94,109.11,113.31,125.44,129.86,134.67,139.17,154.89,169.61,178.97,171.75,153.95,122.39,110.25,101.31,104.01,111.03,119.34,127.07,138.48,150.35,78.71,83.85,92.84,103.12,118.74,123.26,124.38,132.82,135.71,139.43,142.35,153.33,166.66,178.73,179.83,177.3,146.91,115.31,103.65,102.19,108.29,120.87,140.26,153.15,161.64,76.53,81.19,85.54,97.94,114.77,120.33,127.56,136.91,150.14,148.19,151.04,158.93,159.35,167.55,176.78,177.43,168.22,122.85,106.91,103.39,105.17,112.83,127.37,149.56,158.63,76.9,80.11,83.05,89.9,103.9,107.88,113.18,128.27,146.95,157.8,156.48,162.97,152.02,155.46,175.3,175.85,165.8,124.73,118.12,105.54,104.85,113.42,120.65,135.84,155.64,76.36,79.74,86.01,86.54,92.08,95.01,105.42,123.23,141.01,149.2,147.39,149.84,141.68,150.58,164.87,169.4,157.16,159.46,134.27,110.78,103.99,110.46,118.75,135.33,152.69,78.1,78.68,86.48,90.58,93.27,92.16,97.28,110.67,126.28,130.56,128.0,128.45,127.66,137.06,149.28,153.26,160.61,169.86,156.81,123.71,103.82,106.59,113.73,127.34,148.38,82.35,80.09,82.09,88.63,95.05,96.56,97.26,100.99,110.38,113.98,120.77,132.22,134.76,147.55,166.07,170.47,169.35,168.89,155.55,134.24,110.96,104.36,
108.32,118.05,136.24,84.15,84.23,82.32,87.76,91.64,93.12,96.96,106.33,111.31,106.79,119.35,132.95,147.96,153.55,165.8,178.43,178.35,170.88,165.74,140.36,119.17,107.65,106.75,113.56,127.58,82.99,85.54,84.77,84.53,90.08,95.19,95.25,105.43,118.46,119.85,116.2,129.13,142.15,149.56,155.14,171.6,179.44,174.73,161.75,142.72,129.9,115.26,107.99,111.72,121.99,83.01,84.0,87.7,86.06,85.61,94.36,100.3,105.68,117.92,125.39,115.72,129.22,138.43,141.11,151.63,160.55,168.46,164.25,159.16,150.23,136.74,122.5,113.66,108.45,114.78,84.11,84.63,86.98,90.22,88.1,91.68,100.12,113.41,115.57,121.86,128.81,118.67,129.66,133.23,141.51,145.97,148.93,150.71,162.12,166.27,141.86,130.13,115.63,109.05,112.53,85.03,84.97,87.49,89.39,93.67,94.55,102.32,113.85,127.45,126.0,130.58,124.19,132.07,132.58,131.1,139.71,148.52,162.21,157.78,164.7,149.31,142.46,128.04,109.0,110.09,85.93,87.73,89.92,90.55,93.51,98.52,103.9,118.72,129.55,133.86,136.06,137.44,136.47,146.6,143.25,136.06,157.12,172.07,169.4,168.79,158.26,144.45,138.0,111.86,109.03,87.34,90.39,91.4,93.1,91.44,93.54,97.58,115.07,124.02,123.47,137.17,141.91,149.48,150.24,151.28,139.57,144.66,162.92,170.62,177.08,172.96,151.89,129.57,120.07,112.54],"xs":[-74.3129825592041,-74.311283826828,-74.3095850944519,-74.3078863620758,-74.3061876296997,-74.30448889732361,-74.30279016494751,-74.3010914325714,-74.2993927001953,-74.2976939678192,-74.2959952354431,-74.294296503067,-74.2925977706909,-74.2908990383148,-74.2892003059387,-74.28750157356261,-74.28580284118651,-74.28410410881041,-74.2824053764343,-74.2807066440582,-74.2790079116821,-74.277309179306,-74.2756104469299,-74.2739117145538,-74.2722129821777]} \ No newline at end of file diff --git a/CwJ/differentiable_vector_calculus/data/xy_ys.jl b/CwJ/differentiable_vector_calculus/data/xy_ys.jl deleted file mode 100644 index c2f3d1c..0000000 --- a/CwJ/differentiable_vector_calculus/data/xy_ys.jl +++ /dev/null @@ -1,294 +0,0 @@ -## container of points into vectors n vectors of length N -## N points, each of 
size n - -## Lesson learned -- this is a very bad idea! -## better to handle the T a different way -evec(T,n) = Tuple(T[] for _ in 1:n) -evec(T, N, n) = Tuple(Vector{T}(undef, N) for _ in 1:n) - - -## julia> @btime xs_ys1(vs) setup=(vs=[randn(1000) for i in 1:3]); -## 83.308 μs (1013 allocations: 172.67 KiB) - -## julia> @btime xs_ys2(vs) setup=(vs=[randn(1000) for i in 1:3]); -## 222.371 μs (2016 allocations: 180.72 KiB) - -## julia> @btime xs_ys3(vs) setup=(vs=[randn(1000) for i in 1:3]); -## 1.003 ms (1019 allocations: 165.20 KiB) - -## julia> @btime xs_ys4(vs) setup=(vs=[randn(1000) for i in 1:3]); -## 1.115 ms (5474 allocations: 210.95 KiB) - -## julia> @btime xs_ys5(vs) setup=(vs=[randn(1000) for i in 1:3]); -## 1.120 ms (5474 allocations: 210.95 KiB) - -## julia> @btime xs_ys6(vs) setup=(vs=[randn(1000) for i in 1:3]); -## 76.604 μs (1008 allocations: 164.63 KiB) - -## julia> @btime xs_ys7(vs) setup=(vs=[randn(1000) for i in 1:3]); -## 74.306 μs (1008 allocations: 164.63 KiB) - -## julia> @btime xs_ys8(vs) setup=(vs=[randn(1000) for i in 1:3]); -## 36.098 μs (2006 allocations: 94.25 KiB) - -## julia> @btime xs_ys9(vs) setup=(vs=[randn(1000) for i in 1:3]); -## 85.732 μs (3006 allocations: 203.63 KiB) - -## .... - - ## THE WINNER, but we would use one with keywords -## julia> @btime xs_ys13a(vs) setup=(vs=[randn(1000) for i in 1:3]); -## 62.768 μs (1003 allocations: 117.28 KiB) - -## julia> @btime xs_ys13akw(vs) setup=(vs=[randn(1000) for i in 1:3]); -## 65.905 μs (1003 allocations: 117.28 KiB) - - -## make a matrix n x N, then go down 1:n -function xs_ys1(vs) - A=hcat(vs...) - Tuple([A[i,:] for i in eachindex(first(vs))]) -end - -## broadcast push! -function xs_ys2(vs) - u = first(vs); N = length(vs) - T = eltype(u); n = length(u) - v0 = evec(T,n) - for v in vs - push!.(v0, v) - end - v0 -end - - -## broadcast push! 
-function xs_ys2a(vs) - u = first(vs); N = length(vs) - n = length(u) - v0 = Tuple(eltype(u)[] for _ in eachindex(u)) - for v in vs - push!.(v0, v) - end - v0 -end - -## broadcast setindex! -function xs_ys3(vs) - u = first(vs); N = length(vs) - T = eltype(u); n = length(u) - v0 = evec(T,N,n) - for (i,v) in enumerate(vs) - setindex!.(v0, v, i) - end - - v0 -end - -## 10 times faster ~77mus avoiding passing T -function xs_ys3a(vs) - u = first(vs); N = length(vs) - n = length(u) - v0 = Tuple(Vector{eltype(u)}(undef, N) for _ in eachindex(u)) - for (i,v) in enumerate(vs) - setindex!.(v0, v, i) - end - - v0 -end - - -function xs_ys3b(vs) - u = first(vs); N = length(vs) - n = length(u) - v0 = ntuple(_ -> Vector{eltype(u)}(undef, N), n) - for (i,v) in enumerate(vs) - setindex!.(v0, v, i) - end - - v0 -end - -## loop N n -function xs_ys4(vs) - u = first(vs); N = length(vs) - T = eltype(u); n = length(u) - v0 = evec(T,N,n) - - for i in 1:N - for j in 1:n - v0[j][i] = vs[i][j] - end - end - v0 -end - - -## loop N n -function xs_ys4a(vs) - u = first(vs); N = length(vs) - T = eltype(u); n = length(u) - v0 = evec(T,N,n) - - for (i,v) in enumerate(vs) - for j in 1:n - v0[j][i] = v[j] - end - end - v0 -end - -## fast 67mus -function xs_ys4b(vs) - u = first(vs); N = length(vs) - n = length(u) - v0 = Tuple(Vector{eltype(u)}(undef, N) for _ in eachindex(u)) - - for (i,v) in enumerate(vs) - for j in 1:n - v0[j][i] = v[j] - end - end - v0 -end - -## loop n N -function xs_ys5(vs) - u = first(vs); N = length(vs) - T = eltype(u); n = length(u) - v0 = evec(T,N,n) - - for j in 1:n - for i in 1:N - v0[j][i] = vs[i][j] - end - end - v0 -end - -function xs_ys6(vs) - u = first(vs); N = length(vs) - T = eltype(u); n = length(u) - A = Matrix{T}(undef, (N,n)) - for (i,v) in enumerate(vs) - A[i,:] = v - end - Tuple(A[:,i] for i in 1:n) -end - -function xs_ys7(vs) - u = first(vs); N = length(vs) - T = eltype(u); n = length(u) - A = Matrix{T}(undef, (n, N)) - for (i,v) in enumerate(vs) - A[:,i] = v 
- end - Tuple(A[i, :] for i in 1:n) -end - -# faster but doesn't wotk with plot recipes -# and may be slower once realized -function xs_ys8(vs) - N = length(vs) - u = first(vs); T = eltype(u); n = length(u) - Tuple((vs[j][i] for j in 1:N) for i in 1:n) -end - -function xs_ys9(vs) - N = length(vs) - u = first(vs); T = eltype(u); n = length(u) - Tuple(collect(vs[j][i] for j in 1:N) for i in 1:n) -end - -function xs_ys10(vs) - N = length(vs) - u = first(vs); T = eltype(u); n = length(u) - v0 = evec(T,N, n) - for j in 1:n - v0[j][:] .= (v[j] for v in vs) - end - v0 -end - -# mauro3 https://github.com/JuliaDiffEq/ODE.jl/issues/80 -_pluck(y,i) = eltype(first(y))[el[i] for el in y] -xs_ys11(vs) = Tuple(_pluck(vs, i) for i in eachindex(first(vs))) - -# slower -xs_ys11a(vs) = ntuple(i->_pluck(vs, i), length(first(vs))) - -# one liner -xs_ys11b(vs) = Tuple(eltype(first(vs))[el[i] for el in vs] for i in eachindex(first(vs))) - - -function xs_ys11c(vs) - u = first(vs) - Tuple(eltype(u)[el[i] for el in vs] for i in eachindex(u)) -end -xs_ys11d(vs) = (u=first(vs); Tuple(eltype(u)[el[i] for el in vs] for i in eachindex(u))) -xs_ys11e(vs) = (u=first(vs); ntuple(i->eltype(u)[v[i] for v in vs], length(u))) -xs_ys11f(vs) = (u=first(vs);n::Int=length(u);T::DataType=eltype(u);ntuple(i->eltype(u)[v[i] for v in vs], n)) -xs_ys11g(vs::Vector{Vector{T}}) where {T} = (u=first(vs);n::Int=length(u);ntuple(i->T[v[i] for v in vs], n)) - - -@inline _pluck(T, y, i) = T[el[i] for el in y] -function xs_ys11b(vs) - T = eltype(first(vs)) - Tuple(_pluck(T, vs, i) for i in eachindex(first(vs))) -end - -function xs_ys12(vs) - N = length(vs) - u = first(vs); T = eltype(u); n = length(u) - Tuple(T[el[i] for el in vs] for i in eachindex(first(vs))) - end - -function xs_ys12a(vs) - N = length(vs) - u = first(vs); T = eltype(u); n = length(u) - ntuple( i -> T[el[i] for el in vs], n) -end - - - -function xs_ys11h(vs) - u = first(vs) - T = eltype(u) - Tuple(T[el[i] for el in vs] for i in eachindex(u)) - end - 
-function xs_ys11i(vs) - u = first(vs) - Tuple(eltype(u)[el[i] for el in vs] for i in eachindex(u)) -end - - -function _xs_ys12(vs, u::Vector{T}) where {T} - Tuple(T[el[i] for el in vs] for i in eachindex(u)) -end - -xs_ys13(vs, u::Vector{T}=first(vs)) where {T} = Tuple(T[el[i] for el in vs] for i in eachindex(u)) - -xs_ys13a(vs, u::Vector{T}=first(vs), n::Val{N}=Val(length(u))) where {T,N} = ntuple(i -> T[el[i] for el in vs], n) - - -## cleaned up -function xs_ys13a(vs, u::Vector{T}=first(vs), n::Val{N}=Val(length(u))) where {T,N} - plucki = i -> T[el[i] for el in vs] - ntuple(plucki, n) -end - - -function xs_ys13akw(vs; u::Vector{T}=first(vs), n::Val{N}=Val(length(u))) where {T,N} - plucki = i -> T[el[i] for el in vs] - ntuple(plucki, n) -end - -function xs_ys13b(vs, u::Vector{T}=first(vs), n::Val{N}=Val(length(u))) where {T,N} - Tuple(T[el[i] for el in vs] for i in eachindex(u)) -end - - -xs_ys14(vs) = Tuple(eltype(vs[1])[vs[i][j] for i in 1:length(vs)] for j in 1:length(vs[1])) - -xs_ys14a(vs) = Tuple([vs[i][j] for i in 1:length(vs)] for j in 1:length(first(vs))) diff --git a/CwJ/differentiable_vector_calculus/figures/200px-Cross_product_vector.svg.png b/CwJ/differentiable_vector_calculus/figures/200px-Cross_product_vector.svg.png deleted file mode 100644 index 19b0def..0000000 Binary files a/CwJ/differentiable_vector_calculus/figures/200px-Cross_product_vector.svg.png and /dev/null differ diff --git a/CwJ/differentiable_vector_calculus/figures/DailyWxMap-NCUS-012043.JPG b/CwJ/differentiable_vector_calculus/figures/DailyWxMap-NCUS-012043.JPG deleted file mode 100644 index afde4e8..0000000 Binary files a/CwJ/differentiable_vector_calculus/figures/DailyWxMap-NCUS-012043.JPG and /dev/null differ diff --git a/CwJ/differentiable_vector_calculus/figures/Ellipse-def0.svg.png b/CwJ/differentiable_vector_calculus/figures/Ellipse-def0.svg.png deleted file mode 100644 index edf5871..0000000 Binary files a/CwJ/differentiable_vector_calculus/figures/Ellipse-def0.svg.png and 
/dev/null differ diff --git a/CwJ/differentiable_vector_calculus/figures/australia.png b/CwJ/differentiable_vector_calculus/figures/australia.png deleted file mode 100644 index b25d133..0000000 Binary files a/CwJ/differentiable_vector_calculus/figures/australia.png and /dev/null differ diff --git a/CwJ/differentiable_vector_calculus/figures/daily-map.jpg b/CwJ/differentiable_vector_calculus/figures/daily-map.jpg deleted file mode 100644 index afde4e8..0000000 Binary files a/CwJ/differentiable_vector_calculus/figures/daily-map.jpg and /dev/null differ diff --git a/CwJ/differentiable_vector_calculus/figures/everest.png b/CwJ/differentiable_vector_calculus/figures/everest.png deleted file mode 100644 index c6864e0..0000000 Binary files a/CwJ/differentiable_vector_calculus/figures/everest.png and /dev/null differ diff --git a/CwJ/differentiable_vector_calculus/figures/magnetic-field.png b/CwJ/differentiable_vector_calculus/figures/magnetic-field.png deleted file mode 100644 index 30d433f..0000000 Binary files a/CwJ/differentiable_vector_calculus/figures/magnetic-field.png and /dev/null differ diff --git a/CwJ/differentiable_vector_calculus/figures/stelvio-pass.png b/CwJ/differentiable_vector_calculus/figures/stelvio-pass.png deleted file mode 100644 index ef3e401..0000000 Binary files a/CwJ/differentiable_vector_calculus/figures/stelvio-pass.png and /dev/null differ diff --git a/CwJ/differentiable_vector_calculus/figures/stelvio-pass.png-large b/CwJ/differentiable_vector_calculus/figures/stelvio-pass.png-large deleted file mode 100644 index ef3e401..0000000 Binary files a/CwJ/differentiable_vector_calculus/figures/stelvio-pass.png-large and /dev/null differ diff --git a/CwJ/differentiable_vector_calculus/plots_plotting.jmd b/CwJ/differentiable_vector_calculus/plots_plotting.jmd deleted file mode 100644 index ad17675..0000000 --- a/CwJ/differentiable_vector_calculus/plots_plotting.jmd +++ /dev/null @@ -1,371 +0,0 @@ -# 2D and 3D plots in Julia with Plots - -This section 
uses these add-on packages: - -```julia -using CalculusWithJulia -using Plots -import Contour: contours, levels, level, lines, coordinates -using LinearAlgebra -using ForwardDiff -``` - -```julia; echo=false; results="hidden" -using CalculusWithJulia.WeaveSupport -frontmatter = ( - title = "2D and 3D plots in Julia with Plots", - description = "Calculus with Julia: 2D and 3D plots in Julia with Plots", - tags = ["CalculusWithJulia", "differentiable_vector_calculus", "2d and 3d plots in julia with plots"], -); -nothing -``` - ----- - -This covers plotting the typical 2D and 3D plots in Julia with the `Plots` package. - -We will make use of some helper functions provided by the `CalculusWithJulia` package that simplify plotting. As well, we will need to manipulate contours directly, so pull in the `Contour` package, using `import` to avoid name collisions and explicitly listing the methods we will use. - - -## Parametrically described curves in space - -Let $r(t)$ be a vector-valued function with values in $R^d$, $d$ being $2$ or $3$. A familiar example is the equation for a line that travels in the direction of $\vec{v}$ and goes through the point $P$: $r(t) = P + t \cdot \vec{v}$. A *parametric plot* over $[a,b]$ is the collection of all points $r(t)$ for $a \leq t \leq b$. - - -In `Plots`, parameterized curves can be plotted through two interfaces, here illustrated for $d=2$: `plot(f1, f2, a, b)` or `plot(xs, ys)`. The former is convenient for some cases, but typically we will have a function `r(t)` which is vector-valued, as opposed to a vector of functions. As such, we only discuss the latter. - -An example helps illustrate. Suppose $r(t) = \langle \sin(t), 2\cos(t) \rangle$ and the goal is to plot the full ellipse by plotting over $0 \leq t \leq 2\pi$. As with plotting of curves, the goal would be to take many points between `a` and `b` and from there generate the $x$ values and $y$ values.
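Generating the $x$ values and $y$ values from a collection of points amounts to a reshaping, which can be sketched in a few lines. This is only an illustration (the `CalculusWithJulia` package provides a polished `unzip` function for this purpose; the name `unzip_sketch` below is ours):

```julia
# Turn a vector of points (each point an n-vector) into an
# n-tuple of coordinate vectors, one vector per coordinate.
unzip_sketch(vs) = Tuple([p[i] for p in vs] for i in eachindex(first(vs)))

unzip_sketch([[1, 2], [3, 4], [5, 6]])  # ([1, 3, 5], [2, 4, 6])
```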
- -Let's see this with 5 points, the first and last being identical, as the curve is closed: - -```julia -r₂(t) = [sin(t), 2cos(t)] -ts = range(0, stop=2pi, length=5) -``` - -Then we can create the $5$ points easily through broadcasting: - -```julia -vs = r₂.(ts) -``` - -This returns a vector of points (stored as vectors). The plotting function wants two collections: the set of $x$ values for the points and the set of $y$ values. The data needs to be generated differently or reshaped. The function `unzip` from `CalculusWithJulia` takes data in this style and returns the desired format, returning a tuple with the $x$ values and $y$ values pulled out: - -```julia -unzip(vs) -``` - -To plot this, we "splat" the tuple so that `plot` gets the arguments separately: - -```julia -plot(unzip(vs)...) -``` - -This basic plot is lacking, of course, as there are not enough points. Using more points initially is a remedy. - -```julia; hold=true -ts = range(0, 2pi, length=100) -plot(unzip(r₂.(ts))...) -``` - - -As a convenience, `CalculusWithJulia` provides `plot_parametric` to produce this plot. The interval is specified with the `a..b` notation of `IntervalSets` (which is available when the `CalculusWithJulia` package is loaded); the points to plot are adaptively chosen: - -```julia -plot_parametric(0..2pi, r₂) # interval first -``` - - -### Plotting a space curve in 3 dimensions - -A parametrically described curve in 3D is similarly created. For example, a helix is described mathematically by $r(t) = \langle \sin(t), \cos(t), t \rangle$. Here we graph two turns: - -```julia; -r₃(t) = [sin(t), cos(t), t] -plot_parametric(0..4pi, r₃) -``` - -### Adding a vector - -The tangent vector indicates the instantaneous direction one would travel were they walking along the space curve. We can add a tangent vector to the graph. The `quiver!` function would be used to add a 2D vector, but `Plots` does not currently have a `3D` analog.
In addition, `quiver!` has a somewhat cumbersome calling pattern when adding just one vector. The `CalculusWithJulia` package defines an `arrow!` function that uses `quiver` for 2D arrows and a simple line for 3D arrows. As a vector incorporates magnitude and direction, but not a position, `arrow!` needs both a point for the position and a vector. - -Here is how we can visualize the tangent vector at a few points on the helix: - -```julia; hold=true -plot_parametric(0..4pi, r₃, legend=false) -ts = range(0, 4pi, length=5) -for t in ts - arrow!(r₃(t), r₃'(t)) -end -``` - - -```julia; echo=false -note("""Adding many arrows this way would be inefficient.""") -``` - -### Setting a viewing angle for 3D plots - -For 3D plots, the viewing angle can make the difference in visualizing the key features. In `Plots`, some backends allow the viewing angle to be set with the mouse by clicking and dragging. Not all do. For such, the `camera` argument is used, as in `camera(azimuthal, elevation)` where the angles are given in degrees. If the $x$-$y$-$z$ coordinates are given, then `elevation`, or *inclination*, is the angle between the $z$ axis and the $x$-$y$ plane (so `90` is a top view) and `azimuthal` is the angle in the $x$-$y$ plane from the $x$ axis. - - -## Visualizing functions from $R^2 \rightarrow R$ - -If $f: R^2 \rightarrow R$, then a graph of $(x,y,f(x,y))$ can be represented in 3D. It will form a surface. Such graphs can be most simply made by specifying a set of $x$ values, a set of $y$ values and a function $f$, as with: - -```julia -xs = range(-2, stop=2, length=100) -ys = range(-pi, stop=pi, length=100) -f(x,y) = x*sin(y) -surface(xs, ys, f) -``` - -Rather than pass in a function, values can be passed in. Here they are generated with a list comprehension.
The `y` values are innermost to match the graphic when passing in a function object: - -```julia; hold=true -zs = [f(x,y) for y in ys, x in xs] -surface(xs, ys, zs) -``` - -Remembering whether the `ys` or `xs` go first in the above can be hard. Alternatively, broadcasting can be used. The command `f.(xs,ys)` would return a vector, as the `xs` and `ys` match in shape--they are both column vectors. But the *transpose* of `xs` looks like a *row* vector and `ys` looks like a column vector, so broadcasting will create a matrix of values, as desired here: - -```julia -surface(xs, ys, f.(xs', ys)) -``` - - -This graph shows the tessellation algorithm. Here the grid in the $x$-$y$ plane is just one cell: - -```julia; hold=true -xs = ys = range(-1, 1, length=2) -f(x,y) = x*y -surface(xs, ys, f) -``` - -A more accurate graph can be seen here: - -```julia; hold=true -xs = ys = range(-1, 1, length=100) -f(x,y) = x*y -surface(xs, ys, f) -``` - - -### Contour plots - -The contour plot of $f:R^2 \rightarrow R$ draws level curves, $f(x,y)=c$, for different values of $c$ in the $x$-$y$ plane. They are produced in a similar manner to the surface plots: - -```julia; hold=true -xs = ys = range(-2,2, length=100) -f(x,y) = x*y -contour(xs, ys, f) -``` - -The cross in the middle corresponds to $c=0$, as when $x=0$ or $y=0$ then $f(x,y)=0$. - -Similarly, computed values for $f(x,y)$ can be passed in. Here we change the function: - -```julia; hold=true -f(x,y) = 2 - (x^2 + y^2) -xs = ys = range(-2,2, length=100) - -zs = [f(x,y) for y in ys, x in xs] - -contour(xs, ys, zs) -``` - -The chosen levels can be specified by the user through the `levels` argument, as in: - -```julia; hold=true -f(x,y) = 2 - (x^2 + y^2) -xs = ys = range(-2,2, length=100) - -zs = [f(x,y) for y in ys, x in xs] - -contour(xs, ys, zs, levels = [-1.0, 0.0, 1.0]) -``` - -If only a single level is desired, a scalar value can be specified, though not with all backends for `Plots`.
For example, this next graphic shows the $0$-level of the [devil](http://www-groups.dcs.st-and.ac.uk/~history/Curves/Devils.html)'s curve. - -```julia; hold=true -a, b = -1, 2 -f(x,y) = y^4 - x^4 + a*y^2 + b*x^2 -xs = ys = range(-5, stop=5, length=100) -contour(xs, ys, f, levels=[0.0]) -``` - - -Contour plots are well known from the presence of contour lines on many maps. Contour lines indicate constant elevations. A peak is characterized by a series of nested closed paths. The following graph shows this for the peak at $(x,y)=(0,0)$. - -```julia; hold=true -xs = ys = range(-pi/2, stop=pi/2, length=100) -f(x,y) = sinc(sqrt(x^2 + y^2)) # Julia's sinc(x) is sin(pi*x)/(pi*x) -contour(xs, ys, f) -``` - -Contour plots can be filled with colors through the `contourf` function: - -```julia; hold=true -xs = ys = range(-pi/2, stop=pi/2, length=100) -f(x,y) = sinc(sqrt(x^2 + y^2)) - -contourf(xs, ys, f) -``` - - -### Combining surface plots and contour plots - - -In `PyPlot` it is possible to add contour lines to the surface, or project them onto an axis. To replicate something similar, though not as satisfying, in `Plots` we use the `Contour` package. - -```julia; hold=true -f(x,y) = 2 + x^2 + y^2 -xs = ys = range(-2, stop=2, length=100) -zs = [f(x,y) for y in ys, x in xs] - -p = surface(xs, ys, zs, legend=false, fillalpha=0.5) - -## we add to the graphic p, then plot -for cl in levels(contours(xs, ys, zs)) - lvl = level(cl) # the z-value of this contour level - for line in lines(cl) - _xs, _ys = coordinates(line) # coordinates of this line segment - _zs = 0 * _xs - plot!(p, _xs, _ys, lvl .+ _zs, alpha=0.5) # add on surface - plot!(p, _xs, _ys, _zs, alpha=0.5) # add on x-y plane - end -end -p -``` - -There is no hidden-line calculation; in its place, we give the contour lines a transparency through the argument `alpha=0.5`. - - -### Gradient and surface plots - -The surface plot of $f: R^2 \rightarrow R$ plots $(x, y, f(x,y))$ as a surface.
The *gradient* of $f$ is $\langle \partial f/\partial x, \partial f/\partial y\rangle$. It is a two-dimensional object indicating the direction at a point $(x,y)$ where the surface has the greatest ascent. Illustrating the gradient and the surface on the same plot requires embedding the 2D gradient into the 3D surface. This can be done by adding a constant $z$ value to the gradient, such as $0$. - -```julia; hold=true -f(x,y) = 2 - (x^2 + y^2) -xs = ys = range(-2, stop=2, length=100) -zs = [f(x,y) for y in ys, x in xs] - -surface(xs, ys, zs, camera=(40, 25), legend=false) -p = [-1, 1] # in the region graphed, [-2,2] × [-2, 2] - -f(x) = f(x...) -v = ForwardDiff.gradient(f, p) - - -# embed p and v into 3D (two styles); p is placed below the surface -push!(p, -15) -scatter!(unzip([p])..., markersize=3) - -v = vcat(v, 0) -arrow!(p, v) -``` - - -### The tangent plane - -Let $z = f(x,y)$ describe a surface, and $F(x,y,z) = z - f(x,y)$. Then the gradient of $F$ at a point $p$ on the surface, $\nabla F(p)$, will be normal to the surface, and the function $f(p) + \nabla f(p) \cdot (x-p)$ describes the tangent plane. We can visualize each as follows: - -```julia; hold=true -f(x,y) = 2 - x^2 - y^2 -f(v) = f(v...) -F(x,y,z) = z - f(x,y) -F(v) = F(v...) -p = [1/10, -1/10] -global p1 = vcat(p, f(p...)) # note F(p1) == 0 -global n⃗ = ForwardDiff.gradient(F, p1) -global tl(x) = f(p) + ForwardDiff.gradient(f, p) ⋅ (x - p) -tl(x,y) = tl([x,y]) - -xs = ys = range(-2, stop=2, length=100) -surface(xs, ys, f) -surface!(xs, ys, tl) -arrow!(p1, 5n⃗) -``` - -From some viewing angles, the normal does not look perpendicular to the tangent plane. This is a quick verification for a randomly chosen point in the $x-y$ plane: - -```julia -a, b = randn(2) -dot(n⃗, (p1 - [a,b, tl(a,b)])) -``` - - - - -### Parameterized surface plots - -As illustrated, we can plot surfaces of the form $(x,y,f(x,y))$. However, not all surfaces are so readily described.
For example, if $F(x,y,z)$ is a function from $R^3 \rightarrow R$, then $F(x,y,z)=c$ is a surface of interest. For instance, the sphere of radius one is a solution to $F(x,y,z)=1$ where $F(x,y,z) = x^2 + y^2 + z^2$. - -Plotting such generally described surfaces is not so easy, but *parameterized* surfaces can be represented. For example, the sphere as a surface is not represented as the surface of a function, but can be represented in spherical coordinates as parameterized by two angles, essentially an "azimuth" and an "elevation", as used with the `camera` argument. - -Here we define functions that represent $(x,y,z)$ coordinates in terms of the corresponding spherical coordinates $(r, \theta, \phi)$. - -```julia -# spherical: (radius r, inclination θ, azimuth φ) -X(r,theta,phi) = r * sin(theta) * sin(phi) -Y(r,theta,phi) = r * sin(theta) * cos(phi) -Z(r,theta,phi) = r * cos(theta) -``` - -We can parameterize the sphere by plotting values for $x$, $y$, and $z$ produced by a sequence of values for $\theta$ and $\phi$, holding $r=1$: - -```julia; hold=true -thetas = range(0, stop=pi, length=50) -phis = range(0, stop=pi/2, length=50) - -xs = [X(1, theta, phi) for theta in thetas, phi in phis] -ys = [Y(1, theta, phi) for theta in thetas, phi in phis] -zs = [Z(1, theta, phi) for theta in thetas, phi in phis] - -surface(xs, ys, zs) -``` - -```julia; echo=false -note("The above may not work with all backends for `Plots`, even for those that support 3D graphics.") -``` - -For convenience, the `plot_parametric` function from `CalculusWithJulia` can produce these plots using interval notation, `a..b`, and a function: - -```julia; hold=true -F(theta, phi) = [X(1, theta, phi), Y(1, theta, phi), Z(1, theta, phi)] -plot_parametric(0..pi, 0..pi/2, F) -``` - - -### Plotting F(x,y, z) = c - -There is no built-in functionality in `Plots` to create a surface described by $F(x,y,z) = c$.
An example of how to provide some such functionality for `PyPlot` appears [here](https://stackoverflow.com/questions/4680525/plotting-implicit-equations-in-3d). The non-exported `plot_implicit_surface` function can be used to approximate this. - - -To use it, we see what happens when a sphere is rendered: - -```julia; hold=true -f(x,y,z) = x^2 + y^2 + z^2 - 25 -CalculusWithJulia.plot_implicit_surface(f) -``` - - -This figure comes from a February 14, 2019 article in the [New York Times](https://www.nytimes.com/2019/02/14/science/math-algorithm-valentine.html). It shows an equation for a "heart," as the graphic will illustrate: - -```julia; hold=true -a,b = 1,3 -f(x,y,z) = (x^2+((1+b)*y)^2+z^2-1)^3-x^2*z^3-a*y^2*z^3 -CalculusWithJulia.plot_implicit_surface(f, xlim=-2..2, ylim=-1..1, zlim=-1..2) -``` diff --git a/CwJ/differentiable_vector_calculus/polar_coordinates.jmd b/CwJ/differentiable_vector_calculus/polar_coordinates.jmd deleted file mode 100644 index f9822bd..0000000 --- a/CwJ/differentiable_vector_calculus/polar_coordinates.jmd +++ /dev/null @@ -1,720 +0,0 @@ -# Polar Coordinates and Curves - -This section uses these add-on packages: - -```julia; -using CalculusWithJulia -using Plots -using SymPy -using Roots -using QuadGK -``` - -```julia; echo=false; results="hidden" -using CalculusWithJulia.WeaveSupport - -frontmatter = ( - title = "Polar Coordinates and Curves", - description = "Calculus with Julia: Polar Coordinates and Curves", - tags = ["CalculusWithJulia", "differentiable_vector_calculus", "polar coordinates and curves"], -); -using LaTeXStrings - -nothing -``` - ----- - -The description of the $x$-$y$ plane via Cartesian coordinates is not the only possible way, though it is the most familiar one. Here we discuss a different means. Instead of talking about over and up from an origin, we focus on a direction and a distance from the origin.
- - -## Definition of polar coordinates - -Polar coordinates parameterize the plane through an angle $\theta$ made from the positive ray of the $x$ axis and a radius $r$. - -```julia; hold=true; echo=false -theta = pi/6 -rr = 1 - -p = plot(xticks=nothing, yticks=nothing, border=:none, aspect_ratio=:equal, xlim=(-.1,1), ylim=(-.1,3/4)) -plot!([0,rr*cos(theta)], [0, rr*sin(theta)], legend=false, color=:blue, linewidth=2) -scatter!([rr*cos(theta)],[rr*sin(theta)], markersize=3, color=:blue) -arrow!([0,0], [0,3/4], color=:black) -arrow!([0,0], [1,0], color=:black) -ts = range(0, theta, length=50) -rr = 1/6 -plot!(rr*cos.(ts), rr*sin.(ts), color=:black) -plot!([cos(theta),cos(theta)],[0, sin(theta)], linestyle=:dash, color=:gray) -plot!([0,cos(theta)],[sin(theta), sin(theta)], linestyle=:dash, color=:gray) -annotate!([ - (1/5*cos(theta/2), 1/5*sin(theta/2), L"\theta"), - (1/2*cos(theta*1.2), 1/2*sin(theta*1.2), L"r"), - (cos(theta), sin(theta)+.05, L"(x,y)"), - (cos(theta),-.05, L"x"), - (-.05, sin(theta),L"y") - ]) -``` - - -To recover the Cartesian coordinates from the pair $(r,\theta)$, we have these formulas from [right](http://en.wikipedia.org/wiki/Polar_coordinate_system#Converting_between_polar_and_Cartesian_coordinates) triangle geometry: - -```math -x = r \cos(\theta),~ y = r \sin(\theta). -``` - -Each point $(x,y)$ corresponds to several possible values of $(r,\theta)$, as any integer multiple of $2\pi$ added to $\theta$ will describe the same point. Except for the origin, there is only one pair when we restrict to $r > 0$ and $0 \leq \theta < 2\pi$. - -For values in the first and fourth quadrants (the range of $\tan^{-1}(x)$), we have: - -```math -r = \sqrt{x^2 + y^2},~ \theta=\tan^{-1}(y/x). -``` - -For the other two quadrants, the signs of $y$ and $x$ must be considered. This is done with the function `atan` when two arguments are used.
- - -For example, $(-3, 4)$ would have polar coordinates: - -```julia; -x,y = -3, 4 -rad, theta = sqrt(x^2 + y^2), atan(y, x) -``` - -And reversing - -```julia; -rad*cos(theta), rad*sin(theta) -``` - -This figure illustrates: - -```julia; hold=true; echo=false - -p = plot([-5,5], [0,0], color=:blue, legend=false) -plot!([0,0], [-5,5], color=:blue) -plot!([-3,0], [4,0]) -scatter!([-3], [4]) -title!("(-3,4) Cartesian or (5, 2.21...) polar") - -p -``` - - -The case where $r < 0$ is handled by going ``180`` degrees in the opposite direction, in other -words the point $(r, \theta)$ can be described as well by $(-r,\theta+\pi)$. - -## Parameterizing curves using polar coordinates - -If $r=r(\theta)$, then the parameterized curve $(r(\theta), \theta)$ -is just the set of points generated as $\theta$ ranges over some set -of values. There are many examples of parameterized curves that -simplify what might be a complicated presentation in Cartesian coordinates. - -For example, a circle has the form $x^2 + y^2 = R^2$. Whereas -parameterized by polar coordinates it is just $r(\theta) = R$, or a -constant function. - -The circle centered at $(r_0, \gamma)$ (in polar coordinates) with -radius $R$ has a more involved description in polar coordinates: - -```math -r(\theta) = r_0 \cos(\theta - \gamma) + \sqrt{R^2 - r_0^2\sin^2(\theta - \gamma)}. -``` - -The case where $r_0 > R$ will not be defined for all values of $\theta$, only when $|\sin(\theta-\gamma)| \leq R/r_0$. - -#### Examples - -The `Plots.jl` package provides a means to visualize polar plots through `plot(thetas, rs, proj=:polar)`. 
For example, to plot a circle with $r_0=1/2$ and $\gamma=\pi/6$ we would have: - -```julia; hold=true -R, r0, gamma = 1, 1/2, pi/6 -r(theta) = r0 * cos(theta-gamma) + sqrt(R^2 - r0^2*sin(theta-gamma)^2) -ts = range(0, 2pi, length=100) -rs = r.(ts) -plot(ts, rs, proj=:polar, legend=false) -``` - -To avoid having to create values for $\theta$ and values for $r$, the `CalculusWithJulia` package provides a helper function, `plot_polar`. To distinguish it from other functions provided by `Plots`, the calling pattern is different. It specifies an interval to plot over by `a..b` and puts that first (this notation for closed intervals is from `IntervalSets`), followed by `r`. Other keyword arguments are passed onto a `plot` call. - -We will use this in the following, as the graphs are a bit more familiar and the calling pattern is similar to how we have plotted functions. - -As `Plots` will make a parametric plot when called as `plot(function, function, a, b)`, the `plot_polar` function creates two such functions using the relationships $x=r\cos(\theta)$ and $y=r\sin(\theta)$. - - -Using `plot_polar`, we can plot circles with the following. We have to be a bit careful for the general circle, as when the center is farther away from the origin than the radius ($R$), not all angles will be acceptable and two functions are needed to describe the radius, as this comes from a quadratic equation where both the "plus" and "minus" roots are used.
-
-```julia; hold=true
-R=4; r(t) = R;
-
-function plot_general_circle!(r0, gamma, R)
-    # by the law of cosines, when R < r0 the curve is only defined for |theta - gamma| <= asin(R/r0)
-    # R^2 = a^2 + r^2 - 2a*r*cos(theta); solve for a
-    r(t) = r0 * cos(t - gamma) + sqrt(R^2 - r0^2*sin(t-gamma)^2)
-    l(t) = r0 * cos(t - gamma) - sqrt(R^2 - r0^2*sin(t-gamma)^2)
-
-    if R < r0
-        theta = asin(R/r0)-1e-6                 # avoid round off issues
-        plot_polar!((gamma-theta)..(gamma+theta), r)
-        plot_polar!((gamma-theta)..(gamma+theta), l)
-    else
-        plot_polar!(0..2pi, r)
-    end
-end
-
-plot_polar(0..2pi, r, aspect_ratio=:equal, legend=false)
-plot_general_circle!(2, 0, 2)
-plot_general_circle!(3, 0, 1)
-```
-
-
-There are many interesting examples of curves described by polar coordinates. An interesting [compilation](http://www-history.mcs.st-and.ac.uk/Curves/Curves.html) of famous curves is found at the MacTutor History of Mathematics archive, many of which have formulas in polar coordinates.
-
-##### Example
-
-
-The [rhodonea](http://www-history.mcs.st-and.ac.uk/Curves/Rhodonea.html) curve has
-
-```math
-r(\theta) = a \sin(k\theta)
-```
-
-```julia; hold=true
-a, k = 4, 5
-r(theta) = a * sin(k * theta)
-plot_polar(0..pi, r)
-```
-
-This graph has radius $0$ whenever $\sin(k\theta) = 0$, or $k\theta
-=n\pi$. Solving, it is $0$ at integer multiples of
-$\pi/k$. In the above, with $k=5$, there will be $5$ zeroes in
-$[0,\pi]$. The entire curve is traced out over this interval; the
-values from $\pi$ to $2\pi$ yield negative values of $r$, so are
-related to values within $0$ to $\pi$ via the relation $(r,\pi
-+\theta) = (-r, \theta)$. 
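The relation $(r, \pi+\theta) = (-r, \theta)$ used above can be checked numerically in base `Julia`. This is a quick sketch; the helper `polar_to_cartesian` is our own name for the conversion, not a function from the packages used here:

```julia
# convert a polar point (r, θ) to Cartesian coordinates
polar_to_cartesian(r, θ) = (r * cos(θ), r * sin(θ))

a, k = 4, 5
r(θ) = a * sin(k * θ)                   # the rhodonea curve

θ = 1.2
p1 = polar_to_cartesian(r(θ), π + θ)    # the point (r, π + θ)
p2 = polar_to_cartesian(-r(θ), θ)       # the point (-r, θ)
maximum(abs.(p1 .- p2))                 # essentially 0: the same Cartesian point
```

So a point drawn for an angle in $(\pi, 2\pi)$, where $r$ is negative, retraces a point already produced by an angle in $(0, \pi)$.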
-
-##### Example
-
-The [folium](http://www-history.mcs.st-and.ac.uk/Curves/Folium.html)
-is a somewhat similar-looking curve, but has this description:
-
-```math
-r(\theta) = -b \cos(\theta) + 4a \cos(\theta) \sin(2\theta)
-```
-
-
-```julia;
-𝒂, 𝒃 = 4, 2
-𝒓(theta) = -𝒃 * cos(theta) + 4𝒂 * cos(theta) * sin(2theta)
-plot_polar(0..2pi, 𝒓)
-```
-
-The folium has radial part $0$ when $\cos(\theta) = 0$ or
-$\sin(2\theta) = b/(4a)$. This could be used to find out what values
-correspond to which loop. For our choice of $a$ and $b$ this gives $\pi/2$, $3\pi/2$ or, as
-$b/(4a) = 1/8$, when $\sin(2\theta) = 1/8$, which happens at
-$a_0=\sin^{-1}(1/8)/2=0.0626...$ and $\pi/2 - a_0$, $\pi+a_0$ and $3\pi/2 - a_0$. The first loop can be plotted with:
-
-```julia;
-𝒂0 = (1/2) * asin(1/8)
-plot_polar(𝒂0..(pi/2-𝒂0), 𝒓)
-```
-
-
-The second - which is too small to appear in the initial plot without zooming in - with
-
-```julia;
-plot_polar((pi/2 - 𝒂0)..(pi/2), 𝒓)
-```
-
-The third with
-
-```julia;
-plot_polar((pi/2)..(pi + 𝒂0), 𝒓)
-```
-
-The plot repeats from there, so the initial plot could have been made over $[0, \pi + a_0]$.
-
-##### Example
-
-The [Limacon of Pascal](http://www-history.mcs.st-and.ac.uk/Curves/Limacon.html) has
-
-```math
-r(\theta) = b + 2a\cos(\theta)
-```
-
-```julia; hold=true
-a,b = 4, 2
-r(theta) = b + 2a*cos(theta)
-plot_polar(0..2pi, r)
-```
-
-##### Example
-
-Some curves require a longer parameterization, such as this one, where we
-plot over $[0, 8\pi]$ so that the cosine term can range over an entire
-half period:
-
-```julia; hold=true
-r(theta) = sqrt(abs(cos(theta/8)))
-plot_polar(0..8pi, r)
-```
-
-## Area of polar graphs
-
-Consider the [cardioid](http://www-history.mcs.st-and.ac.uk/Curves/Cardioid.html) described by $r(\theta) = 2(1 + \cos(\theta))$:
-
-```julia; hold=true
-r(theta) = 2(1 + cos(theta))
-plot_polar(0..2pi, r)
-```
-
-How much area is contained in the graph? 
-
-In some cases it might be possible to translate back into Cartesian
-coordinates and compute from there. In practice, this is not usually the best
-solution.
-
-The area can be approximated by wedges (not rectangles). For example, here we see that the area over a given angle is well approximated by the wedge for each of the sectors:
-
-```julia; hold=true; echo=false
-r(theta) = 1/(1 + (1/3)cos(theta))
-p = plot_polar(0..pi/2, r, legend=false, linewidth=3, aspect_ratio=:equal)
-t0, t1, t2, t3 = collect(range(pi/12, pi/2 - pi/12, length=4))
-
-for s in (t0,t1,t2,t3)
-    plot!(p, [0, r(s)*cos(s)], [0, r(s)*sin(s)], linewidth=3)
-end
-
-for (s0,s1) in ((t0,t1), (t1, t2), (t2,t3))
-    s = (s0 + s1)/2
-    plot!(p, [0,r(s)*cos(s)], [0, r(s)*sin(s)])
-    ts = range(s0, s1, length=25)
-    xs, ys = r(s)*cos.(ts), r(s)*sin.(ts)
-    plot!(p, xs, ys)
-    plot!(p, [0,xs[1]],[0,ys[1]])
-end
-p
-```
-
-As well, see this part of a
-[Wikipedia](http://en.wikipedia.org/wiki/Polar_coordinate_system#Integral_calculus_.28area.29)
-page for a figure.
-
-Imagine we have $a < b$ and a partition $a=t_0 < t_1 < \cdots < t_n =
-b$. Let $\phi_i = (1/2)(t_{i-1} + t_{i})$ be the midpoint.
-Then the wedge of radius $r(\phi_i)$ with angle between $t_{i-1}$ and $t_i$ will have area $\pi r(\phi_i)^2 (t_i-t_{i-1}) / (2\pi) = (1/2) r(\phi_i)^2(t_i-t_{i-1})$, the ratio $(t_i-t_{i-1}) / (2\pi)$ being the fraction of the full angle of a circle.
- Summing the area of these wedges
-over the partition gives a Riemann sum approximation for the integral $(1/2)\int_a^b
-r(\theta)^2 d\theta$. The limit of this sum defines the area in polar coordinates.
-
-> *Area of polar regions*. 
Let $R$ denote the region bounded by the curve $r(\theta)$ and bounded by the rays
-> $\theta=a$ and $\theta=b$ with $b-a \leq 2\pi$; then the area of $R$ is given by:
->
-> ``A = \frac{1}{2}\int_a^b r(\theta)^2 d\theta.``
-
-So the area of the cardioid, which is parameterized over $[0, 2\pi]$, is found by:
-
-```julia; hold=true
-r(theta) = 2(1 + cos(theta))
-@syms theta
-(1//2) * integrate(r(theta)^2, (theta, 0, 2PI))
-```
-
-##### Example
-
-The folium has general formula $r(\theta) = -b \cos(\theta)
-+4a\cos(\theta)\sin(\theta)^2$. When $a=1$ and $b=1$ a leaf of the
-folium is traced out between $\pi/6$ and $\pi/2$. What is the area of
-that leaf?
-
-
-An antiderivative exists for arbitrary $a$ and $b$:
-
-```julia;
-@syms 𝐚 𝐛 𝐭heta
-𝐫(theta) = -𝐛*cos(theta) + 4𝐚*cos(theta)*sin(theta)^2
-integrate(𝐫(𝐭heta)^2, 𝐭heta) / 2
-```
-
-For our specific values, the answer can be computed with:
-
-```julia;
-ex = integrate(𝐫(𝐭heta)^2, (𝐭heta, PI/6, PI/2)) / 2
-ex(𝐚 => 1, 𝐛=>1)
-```
-
-
-##### Example
-
-Pascal's
-[limacon](http://www-history.mcs.st-and.ac.uk/Curves/Limacon.html) is
-like the cardioid, but contains an extra loop. When $a=1$ and $b=1$ we
-have this graph.
-
-```julia; hold=true; echo=false
-a,b = 1,1
-r(theta) = b + 2a*cos(theta)
-p = plot(t->r(t)*cos(t), t->r(t)*sin(t), 0, pi/2 + pi/6, legend=false, color=:blue)
-plot!(p, t->r(t)*cos(t), t->r(t)*sin(t), 3pi/2 - pi/6, pi/2 + pi/6, color=:orange)
-plot!(p, t->r(t)*cos(t), t->r(t)*sin(t), 3pi/2 - pi/6, 2pi, color=:blue)
-
-p
-```
-
-What is the area contained in the outer loop, that is not in the inner loop?
-
-To answer, we need to find out over what range of values in $[0, 2\pi]$ the
-inner and outer loops are traced. The transitions happen when $r(\theta) = 0$,
-which for this choice of $a$ and $b$ solves $1 + 2\cos(\theta) = 0$, or
-$\cos(\theta) = -1/2$. This is $\pi/2 + \pi/6$ and $3\pi/2 -
-\pi/6$. 
The inner loop is traversed between those values and has area:

-
-```julia;
-@syms 𝖺 𝖻 𝗍heta
-𝗋(theta) = 𝖻 + 2𝖺*cos(theta)
-𝖾x = integrate(𝗋(𝗍heta)^2 / 2, (𝗍heta, PI/2 + PI/6, 3PI/2 - PI/6))
-𝗂nner = 𝖾x(𝖺=>1, 𝖻=>1)
-```
-
-The outer area (including the inner loop) is the integral from $0$ to $\pi/2 + \pi/6$ plus that from $3\pi/2 - \pi/6$ to $2\pi$. These areas are equal, so we double the first:
-
-```julia;
-𝖾x1 = 2 * integrate(𝗋(𝗍heta)^2 / 2, (𝗍heta, 0, PI/2 + PI/6))
-𝗈uter = 𝖾x1(𝖺=>1, 𝖻=>1)
-```
-
-The answer is the difference:
-
-```julia;
-𝗈uter - 𝗂nner
-```
-
-## Arc length
-
-The length of the arc traced by a polar graph can also be expressed
-using an integral. Again, we partition the interval $[a,b]$ and
-consider the wedge from $(r(t_{i-1}), t_{i-1})$ to $(r(t_i),
-t_i)$. The curve this wedge approximates will have its arc length
-approximated by the line segment connecting the points. Expressing the
-points in Cartesian coordinates and simplifying gives the distance
-squared as:
-
-```math
-\begin{align}
-d_i^2 &= (r(t_i) \cos(t_i) - r(t_{i-1})\cos(t_{i-1}))^2 + (r(t_i) \sin(t_i) - r(t_{i-1})\sin(t_{i-1}))^2\\
-&= r(t_i)^2 - 2r(t_i)r(t_{i-1}) \cos(t_i - t_{i-1}) + r(t_{i-1})^2 \\
-&\approx r(t_i)^2 - 2r(t_i)r(t_{i-1}) (1 - \frac{(t_i - t_{i-1})^2}{2})+ r(t_{i-1})^2 \quad(\text{as } \cos(x) \approx 1 - x^2/2)\\
-&= (r(t_i) - r(t_{i-1}))^2 + r(t_i)r(t_{i-1}) (t_i - t_{i-1})^2. 
-\end{align}
-```
-
-As was done with the arc length of parameterized curves, we multiply $d_i$ by $(t_i - t_{i-1})/(t_i - t_{i-1})$
-and move the bottom factor under the square root:
-
-
-```math
-\begin{align}
-d_i
-&= d_i \frac{t_i - t_{i-1}}{t_i - t_{i-1}} \\
-&\approx \sqrt{\frac{(r(t_i) - r(t_{i-1}))^2}{(t_i - t_{i-1})^2} +
-\frac{r(t_i)r(t_{i-1}) (t_i - t_{i-1})^2}{(t_i - t_{i-1})^2}} \cdot (t_i - t_{i-1})\\
-&= \sqrt{(r'(\xi_i))^2 + r(t_i)r(t_{i-1})} \cdot (t_i - t_{i-1}).\quad(\text{by the mean value theorem})
-\end{align}
-```
-
-Summing the approximations to the $d_i$ gives what looks like a Riemann sum approximation to the
-integral $\int_a^b \sqrt{r'(\theta)^2 + r(\theta)^2} d\theta$ (with
-the extension to the Riemann sum formula needed to derive the arc
-length for a parameterized curve). That is:
-
-> *Arc length of a polar curve*. The arc length of the curve described in polar coordinates by $r(\theta)$ for $a \leq \theta \leq b$ is given by:
->
-> ``\int_a^b \sqrt{r'(\theta)^2 + r(\theta)^2} d\theta.``
-
-We test this out on a circle with $r(\theta) = R$, a constant. The
-integrand simplifies to just $\sqrt{R^2}$ and the integral is from $0$
-to $2\pi$, so the arc length is $2\pi R$, precisely the formula for
-the circumference.
-
-##### Example
-
-A cardioid is described by $r(\theta) = 2(1 + \cos(\theta))$. What is the arc length from $0$ to $2\pi$?
-
-The integrand is integrable, with antiderivative $4\sqrt{2\cos(\theta) + 2} \cdot \tan(\theta/2)$,
-but `SymPy` isn't able to find the integral. Instead we give a numeric answer:
-
-```julia; hold=true
-r(theta) = 2*(1 + cos(theta))
-quadgk(t -> sqrt(r'(t)^2 + r(t)^2), 0, 2pi)[1]
-```
-
-##### Example
-
-The [equiangular](http://www-history.mcs.st-and.ac.uk/Curves/Equiangular.html) spiral has polar representation
-
-```math
-r(\theta) = a e^{\theta \cot(b)}
-```
-
-With $a=1$ and $b=\pi/4$, find the arc length traced out from $\theta=0$ to $\theta=1$. 
-
-```julia; hold=true
-a, b = 1, PI/4
-@syms θ
-r(theta) = a * exp(theta * cot(b))
-ds = sqrt(diff(r(θ), θ)^2 + r(θ)^2)
-integrate(ds, (θ, 0, 1))
-```
-
-
-##### Example
-
-An Archimedean [spiral](http://en.wikipedia.org/wiki/Archimedean_spiral) is defined in polar form by
-
-```math
-r(\theta) = a + b \theta
-```
-
-That is, the radius increases linearly. The crossings of the positive $x$ axis occur at distances $a + b \cdot 2\pi n$ from the origin, so are evenly spaced out by $2\pi b$. These could be a model for such things as coils of material of uniform thickness.
-
-For example, a roll of toilet paper promises ``1000`` sheets with the
-[smaller](http://www.phlmetropolis.com/2011/03/the-incredible-shrinking-toilet-paper.php)
-$4.1 \times 3.7$ inch size. This $3700$ inch long connected sheet of
-paper is wrapped around a paper tube in an Archimedean spiral with
-$r(\theta) = d_{\text{inner}}/2 + b\theta$. The entire roll must fit in a standard
-dimension, so the outer diameter will be $d_{\text{outer}} = 5~1/4$ inches. Can we figure out
-$b$?
-
-Let $n$ be the number of windings, and assume the starting and ending points are on the positive $x$ axis, so
-$r(2\pi n) = d_{\text{outer}}/2 = d_{\text{inner}}/2 + b (2\pi n)$. Solving for $n$ in terms of $b$ we get
-$n = ( d_{\text{outer}} - d_{\text{inner}})/2 / (2\pi b)$. With this, the following must hold, as the total arc length is $3700$ inches:
-
-
-```math
-\int_0^{n\cdot 2\pi} \sqrt{r(\theta)^2 + r'(\theta)^2} d\theta = 3700.
-```
-
-Numerically then we have:
-
-```julia; hold=true
-
-dinner = 1 + 5/8
-douter = 5 + 1/4
-r(b,t) = dinner/2 + b*t
-rp(b,t) = b
-integrand(b,t) = sqrt((r(b,t))^2 + rp(b,t)^2)  # sqrt(r^2 + r'^2)
-n(b) = (douter - dinner)/2/(2*pi*b)
-b = find_zero(b -> quadgk(t->integrand(b,t), 0, n(b)*2*pi)[1] - 3700, (1/100000, 1/100))
-b, b*25.4
-```
-
-The first value is `b` in inches; the second converts it to millimeters.
-
-
-## Questions
-
-###### Question
-
-Let $r=3$ and $\theta=\pi/8$. In Cartesian coordinates what is $x$? 
-
-```julia; hold=true; echo=false
-x,y = 3 * [cos(pi/8), sin(pi/8)]
-numericq(x)
-```
-
-What is $y$?
-
-```julia; hold=true; echo=false
-x,y = 3 * [cos(pi/8), sin(pi/8)]
-numericq(y)
-```
-
-###### Question
-
-A point in Cartesian coordinates is given by $(-12, -5)$. It has a polar coordinate representation with an angle $\theta$ in $[0,2\pi]$ and $r > 0$. What is $r$?
-
-```julia; hold=true; echo=false
-x,y = -12, -5
-r1, theta1 = sqrt(x^2 + y^2), atan(y,x)
-numericq(r1)
-```
-
-What is $\theta$?
-
-```julia; hold=true; echo=false
-x,y = -12, -5
-r1, theta1 = sqrt(x^2 + y^2), atan(y,x) + 2pi  # atan(y,x) is in (-pi, pi]; shift into [0, 2pi]
-numericq(theta1)
-```
-
-###### Question
-
-Does $r(\theta) = a \sec(\theta - \gamma)$ describe a line when $a=3$ and $\gamma=\pi/4$?
-
-```julia; hold=true; echo=false
-yesnoq("yes")
-```
-
-If yes, what is the $y$ intercept?
-
-```julia; hold=true; echo=false
-r(theta) = 3 * sec(theta - pi/4)
-val = r(pi/2)
-numericq(val)
-```
-
-What is the slope of the line?
-
-```julia; hold=true; echo=false
-r(theta) = 3 * sec(theta - pi/4)
-val = (r(pi/2)*sin(pi/2) - r(pi/4)*sin(pi/4)) / (r(pi/2)*cos(pi/2) - r(pi/4)*cos(pi/4))
-numericq(val)
-```
-
-Does this seem likely: the slope is $-1/\tan(\gamma)$?
-
-```julia; hold=true; echo=false
-yesnoq("yes")
-```
-
-###### Question
-
-The polar curve $r(\theta) = 2\cos(\theta)$ has tangent lines at most points. This differential representation of the chain rule
-
-```math
-\frac{dy}{dx} = \frac{dy}{d\theta} / \frac{dx}{d\theta},
-```
-
-allows the slope to be computed when $y$ and $x$ are the Cartesian
-form of the polar curve. For this curve, we have
-
-```math
-\frac{dy}{d\theta} = \frac{d}{d\theta}(2\cos(\theta) \cdot \sin(\theta)),~ \text{ and }
-\frac{dx}{d\theta} = \frac{d}{d\theta}(2\cos(\theta) \cdot \cos(\theta)).
-```
-
-Numerically, what is the slope of the tangent line when $\theta = \pi/4$? 
-
-```julia; hold=true; echo=false
-r(theta) = 2cos(theta)
-g(theta) = r(theta)*cos(theta)   # x(theta)
-f(theta) = r(theta)*sin(theta)   # y(theta)
-c = pi/4
-val = D(f)(c) / D(g)(c)          # dy/dx = (dy/dtheta) / (dx/dtheta)
-numericq(val)
-```
-
-###### Question
-
-For different values $k > 0$ and $e > 0$ the polar equation
-
-```math
-r(\theta) = \frac{ke}{1 + e\cos(\theta)}
-```
-
-has a familiar form. The value of $k$ is just a scale factor, but different values of $e$ yield different shapes.
-
-When $0 < e < 1$ what is the shape of the curve? (Answer by making a plot and guessing.)
-
-```julia; hold=true; echo=false
-choices = [
-"an ellipse",
-"a parabola",
-"a hyperbola",
-"a circle",
-"a line"
-]
-answ = 1
-radioq(choices, answ, keep_order=true)
-```
-
-
-When $e = 1$ what is the shape of the curve?
-
-```julia; hold=true; echo=false
-choices = [
-"an ellipse",
-"a parabola",
-"a hyperbola",
-"a circle",
-"a line"
-]
-answ = 2
-radioq(choices, answ, keep_order=true)
-```
-
-
-When $1 < e$ what is the shape of the curve?
-
-```julia; hold=true; echo=false
-choices = [
-"an ellipse",
-"a parabola",
-"a hyperbola",
-"a circle",
-"a line"
-]
-answ = 3
-radioq(choices, answ, keep_order=true)
-```
-
-###### Question
-
-Find the area of a lobe of the
-[lemniscate](http://www-history.mcs.st-and.ac.uk/Curves/Lemniscate.html)
-curve traced out by $r(\theta) = \sqrt{\cos(2\theta)}$ between
-$-\pi/4$ and $\pi/4$. What is the answer?
-
-```julia; hold=true; echo=false
-choices = [
-"``1/2``",
-"``\\pi/2``",
-"``1``"
-]
-answ=1
-radioq(choices, answ)
-```
-
-###### Question
-
-Find the area of a lobe of the [eight](http://www-history.mcs.st-and.ac.uk/Curves/Eight.html) curve traced out by $r(\theta) = \sqrt{\cos(2\theta)\sec(\theta)^4}$ from $-\pi/4$ to $\pi/4$. Do this numerically. 
-
-```julia; hold=true; echo=false
-r(theta) = sqrt(cos(2theta) * sec(theta)^4)
-val, _ = quadgk(t -> r(t)^2/2, -pi/4, pi/4)
-numericq(val)
-```
-
-###### Question
-
-Find the arc length of a lobe of the
-[lemniscate](http://www-history.mcs.st-and.ac.uk/Curves/Lemniscate.html)
-curve traced out by $r(\theta) = \sqrt{\cos(2\theta)}$ between
-$-\pi/4$ and $\pi/4$. What is the answer (numerically)?
-
-```julia; hold=true; echo=false
-r(theta) = sqrt(cos(2theta))
-val, _ = quadgk(t -> sqrt(D(r)(t)^2 + r(t)^2), -pi/4, pi/4)
-numericq(val)
-```
-
-###### Question
-
-
-Find the arc length of a lobe of the [eight](http://www-history.mcs.st-and.ac.uk/Curves/Eight.html) curve traced out by $r(\theta) = \sqrt{\cos(2\theta)\sec(\theta)^4}$ from $-\pi/4$ to $\pi/4$. Do this numerically.
-
-```julia; hold=true; echo=false
-r(theta) = sqrt(cos(2theta) * sec(theta)^4)
-val, _ = quadgk(t -> sqrt(D(r)(t)^2 + r(t)^2), -pi/4, pi/4)
-numericq(val)
-```
diff --git a/CwJ/differentiable_vector_calculus/process.jl b/CwJ/differentiable_vector_calculus/process.jl
deleted file mode 100644
index aeee5ed..0000000
--- a/CwJ/differentiable_vector_calculus/process.jl
+++ /dev/null
@@ -1,36 +0,0 @@
-using WeavePynb
-using Mustache
-
-mmd(fname) = mmd_to_html(fname, BRAND_HREF="../toc.html", BRAND_NAME="Calculus with Julia")
-## uncomment to generate just .md files
-#mmd(fname) = mmd_to_md(fname, BRAND_HREF="../toc.html", BRAND_NAME="Calculus with Julia")
-
-
-
-
-fnames = ["polar_coordinates",
-          "vectors",
-          "vector_valued_functions",
-          "scalar_functions",
-          "scalar_functions_applications",
-          "vector_fields"
-]
-
-function process_file(nm, twice=false)
-    include("$nm.jl")
-    mmd_to_md("$nm.mmd")
-    markdownToHTML("$nm.md")
-    twice && markdownToHTML("$nm.md")
-end
-
-process_files(twice=false) = [process_file(nm, twice) for nm in fnames]
-
-
-"""
-## TODO differential_vector_calculus
-
-### Add questions for scalar_function_applications
-* Newton's method??
-* optimization. 
Find least squares for perpendicular distance using the same 3 points...??
-
-"""
diff --git a/CwJ/differentiable_vector_calculus/scalar_functions.jmd b/CwJ/differentiable_vector_calculus/scalar_functions.jmd
deleted file mode 100644
index d5cc3b7..0000000
--- a/CwJ/differentiable_vector_calculus/scalar_functions.jmd
+++ /dev/null
@@ -1,2129 +0,0 @@
-# Scalar functions
-
-This section uses these add-on packages:
-
-```julia
-using CalculusWithJulia
-using Plots
-using ForwardDiff
-using SymPy
-using Roots
-using QuadGK
-using JSON
-```
-
-Also, these methods from the `Contour` package:
-
-```julia
-import Contour: contours, levels, level, lines, coordinates
-```
-
-```julia; echo=false; results="hidden"
-using CalculusWithJulia.WeaveSupport
-using CSV, DataFrames
-
-const frontmatter = (
-    title = "Scalar functions",
-    description = "Calculus with Julia: Scalar functions",
-    tags = ["CalculusWithJulia", "differentiable_vector_calculus", "scalar functions"],
-);
-
-nothing
-```
-
-----
-
-Consider a function $f: R^n \rightarrow R$. It has multiple arguments for its input (an $x_1, x_2, \dots, x_n$) and only one, *scalar*, value for an output. Some simple examples might be:
-
-```math
-\begin{align}
-f(x,y) &= x^2 + y^2\\
-g(x,y) &= x \cdot y\\
-h(x,y) &= \sin(x) \cdot \sin(y)
-\end{align}
-```
-
-
-For two examples from real life, consider first the Elevation Point Query Service (of the [USGS](https://nationalmap.gov/epqs/)), which returns the elevation in international feet or meters for a specific latitude/longitude within the United States. The longitude can be associated to an $x$ coordinate, the latitude to a $y$ coordinate, and the elevation to a $z$ coordinate, and as long as the region is small enough, the $x$-$y$ coordinates can be thought to lie on a plane. (A flat earth assumption.)
-
-Similarly, a weather map, say of the United States, may show the maximum predicted temperature for a given day. 
This describes a function that takes a position ($x$, $y$) and returns a predicted temperature ($z$).
-
-
-
-Mathematically, we may describe the values $(x,y)$ in terms of a point, $P=(x,y)$, or a vector $\vec{v} = \langle x, y \rangle$, using the identification of a point with a vector. As convenient, we may write any of $f(x,y)$, $f(P)$, or $f(\vec{v})$ to describe the evaluation of $f$ at the values $x$ and $y$.
-
-
-
-----
-
-Returning to the task at hand,
-in `Julia`, defining a scalar function is straightforward, the syntax following mathematical notation:
-
-```julia;
-f(x,y) = x^2 + y^2
-g(x,y) = x * y
-h(x,y) = sin(x) * sin(y)
-```
-
-Calling a scalar function for specific values of $x$ and $y$ is also similar to the mathematical case:
-
-```julia;
-f(1,2), g(2, 3), h(3,4)
-```
-
-It may be advantageous to have the values as a vector or a point, as in `v=[x,y]`. Splatting can be used to turn a vector or tuple into two arguments:
-
-```julia;
-v = [1,2]
-f(v...)
-```
-
-Alternatively, the function may be defined using a vector argument:
-
-```julia;
-f(v) = v[1]^2 + v[2]^2
-```
-
-This style is required by other packages within the `Julia` ecosystem, as there are many advantages to passing containers of values: they can have arbitrary length, they can be modified inside a function, the functions can be more generic, etc.
-
-More verbosely, but avoiding index notation, we can use multiline functions:
-
-```julia;
-function g(v)
-    x, y = v
-    x * y
-end
-```
-
-Then we have
-
-```julia;
-f(v), g([2,3])
-```
-
-----
-
-More elegant, perhaps -- and the approach we will use in this section -- is to mirror the mathematical notation through multiple dispatch. If we define `j` for multiple variables, say with:
-
-```julia;
-j(x,y) = x^2 - 2x*y^2
-```
-
-Then we can define an alternative method with just a single variable and use splatting to turn it into multiple variables:
-
-```julia;
-j(v) = j(v...) 
-```
-
-Then we can call `j` with a vector or point:
-
-```julia;
-j([1,2])
-```
-
-or by passing in the individual components:
-
-```julia;
-j(1,2)
-```
-
-----
-
-Following a calculus perspective, we take up the questions of how to
-visualize scalar functions within `Julia` and how to describe
-the change in the function between nearby values.
-
-## Visualizing scalar functions
-
-Suppose for the moment that $f:R^2 \rightarrow R$. The equation $z = f(x,y)$ may be visualized by the set of points in ``3``-dimensions $\{(x,y,z): z = f(x,y)\}$. This will render as a surface, and that surface will pass a "vertical line test", in that each $(x,y)$ value corresponds to at most one $z$ value.
-We will see alternatives for describing surfaces beyond a function of the form $z=f(x,y)$. These are similar to how a curve in the $x$-$y$ plane can be described by a function of the form $y=f(x)$ but also through an equation of the form $F(x,y) = c$ or through a parametric description, such as is used for planar curves. For now though, we focus on the case where $z=f(x,y)$.
-
-
-
-In `Julia`, plotting such a surface requires a generalization of plotting a univariate function, where, typically, a grid of evenly spaced values is given between some $a$ and $b$, the corresponding $y$ or $f(x)$ values are found, and then the points are connected in a dot-to-dot manner.
-
-Here, a two-dimensional grid of $x$-$y$ values needs specifying, and the corresponding $z$ values found. As the grid is assumed to be regular, only the $x$ and $y$ values need specifying; the set of pairs can then be computed. The $z$ values, it will be seen, are easily computed. This cloud of points is plotted and each cell in the $x$-$y$ plane is plotted with a surface giving the $x$-$y$-$z$, ``3``-dimensional, view. One way to plot such a surface is to tessellate the cell and then for each triangle, represent a plane made up of the ``3`` boundary points. 
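The grid evaluation just described can be sketched in base `Julia`. This is only an illustration of the layout convention (a deliberately coarse grid; the sizes shown are our own choice for the sketch): the matrix of $z$ values has one row per $y$ value and one column per $x$ value.

```julia
f(x, y) = x^2 + y^2

xs = range(-2, 2, length=5)    # x values: one per column
ys = range(-1, 1, length=3)    # y values: one per row

# the comprehension varies `y` first, so rows follow `ys` and columns follow `xs`
zs = [f(x, y) for y in ys, x in xs]

size(zs)                       # (length(ys), length(xs))
```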
-
-
-Here is an example:
-
-```julia
-𝒇(x, y) = x^2 + y^2
-```
-
-```julia;
-xs = range(-2, 2, length=100)
-ys = range(-2, 2, length=100)
-
-surface(xs, ys, 𝒇)
-```
-
-The `surface` function will generate the surface.
-
-!!! note
-    The call `surface(xs, ys, 𝒇)` is equivalent to `plot(xs, ys, 𝒇, seriestype=:surface)`.
-
-We can also use `surface(xs, ys, zs)` where `zs` is not a vector, but
-rather a *matrix* of values corresponding to a grid described by the
-`xs` and `ys`. A matrix is a rectangular collection of values indexed
-by row and column through indices `i` and `j`. Here the values in `zs`
-should satisfy: the $i$th row and $j$th column entry should be $z_{ij}
-= f(x_j, y_i)$ where $x_j$ is the $j$th entry from the `xs` and $y_i$
-the $i$th entry from the `ys`. (The rows follow the `ys`; the columns follow the `xs`.)
-
-We can generate this using a comprehension:
-
-```julia; hold=true
-zs = [𝒇(x,y) for y in ys, x in xs]
-surface(xs, ys, zs)
-```
-
-If remembering that the $y$ values go first, and then the $x$ values, in the above is too hard, then an alternative can be used. Broadcasting `f.(xs,ys)` may not make sense, were the `xs` and `ys` not of commensurate lengths, and when it does make sense, this call pairs off `xs` and `ys` values and passes them to `f`. What is desired here is different, where for each `xs` value there are pairs for each of the `ys` values. The syntax `xs'` can be viewed as creating a *row* vector, where `xs` is a *column* vector. Broadcasting will create a *matrix* of values in this case. So the following is identical to the above:
-
-```julia;
-surface(xs, ys, 𝒇.(xs', ys))
-```
-
-(This is still subtle. Had the adjoint been applied to `ys` instead, the call would error were the grid not square, but would silently produce an incorrect, transposed surface were it square. It is best to simply pass the function and let `Plots` handle this detail; note that the convention for the alternative `Makie` package is reversed.)
-
-----
-
-An alternative to `surface` is `wireframe` -- which may not use shading in all backends. 
This displays a grid in the $x$-$y$ plane mapped to the surface:
-
-```julia; hold=true
-xs = ys = range(-2,2, length=10)  # downsample to see the frame
-wireframe(xs, ys, 𝒇)   # gr() or pyplot() wireplots render better than plotly()
-```
-
-
-
-##### Example
-
-The surface $f(x,y) = x^2 - y^2$ has a "saddle," as this shows:
-
-```julia; hold=true
-f(x,y) = x^2 - y^2
-xs = ys = range(-2, 2, length=100)
-surface(xs, ys, f)
-```
-
-##### Example
-
-As mentioned, in plots of univariate functions a dot-to-dot algorithm is followed. For surfaces, the two dots are replaced by four points, which overdetermine a plane. Some choice is made to partition that rectangle into two triangles, and for each triangle, the ``3`` resulting points determine a plane, which can be suitably rendered.
-
-We can see this in the default `gr` toolkit by forcing the surface to show just one cell, as the `xs` and `ys` below only contain $2$ values:
-
-```julia; hold=true
-xs = [-1,1]; ys = [-1,1]
-f(x,y) = x*y
-surface(xs, ys, f)
-```
-
-
-Compare this to the same region, but with many cells to represent the surface:
-
-```julia; hold=true
-xs = ys = range(-1, 1, length=100)
-f(x,y) = x*y
-surface(xs, ys, f)
-```
-
-
-### Contour plots and heatmaps
-
-```julia; echo=false
-#n,m = 25,50
-#xs = range(-74.3129825592041, -74.2722129821777, length=n)
-#ys = range(40.7261855236006, 40.7869834960339, length=m)
-#d = DataFrame(xs =reshape([m[1] for m in [(xi,yi) for xi in xs, yi in ys]], (n*m),),
-#              ys = reshape([m[2] for m in [(xi,yi) for xi in xs, yi in ys]], (n*m,)))
-# In RCall
-#using RCall
-#z = R"""
-#library(elevatr)
-#z = get_elev_point($d, prj="+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-#z = data.frame(z)
-#"""
-#elev = rcopy(DataFrame, z).elevation
-#zs = reshape(elev, m, n)
-#D = Dict(:xs => xs, :ys=>ys, :zs => elev)
-#io = open("data/somocon.json", "w")
-#JSON.print(io, D)
-#close(io)
-nothing
-```
-
-
-
-Consider the example of latitude, longitude, and elevation data describing 
a surface. The following graph is generated from such data, which was retrieved from the USGS website for a given area. The grid points are chosen about every ``150``m, so this is not too fine grained. - -```julia; echo=false -somocon = """ -{"ys":[40.7261855236006,40.72742629854822,40.728667073495835,40.72990784844345,40.73114862339107,40.73238939833869,40.73363017328631,40.734870948233926,40.736111723181544,40.73735249812916,40.73859327307678,40.7398340480244,40.74107482297202,40.742315597919635,40.74355637286725,40.74479714781487,40.74603792276249,40.74727869771011,40.748519472657726,40.749760247605344,40.75100102255296,40.75224179750058,40.7534825724482,40.75472334739582,40.755964122343435,40.75720489729106,40.75844567223868,40.7596864471863,40.760927222133915,40.762167997081534,40.76340877202915,40.76464954697677,40.76589032192439,40.767131096872006,40.768371871819625,40.76961264676724,40.77085342171486,40.77209419666248,40.7733349716101,40.774575746557716,40.775816521505334,40.77705729645295,40.77829807140057,40.77953884634819,40.78077962129581,40.782020396243425,40.78326117119104,40.78450194613866,40.78574272108628,40.7869834960339],"zs":[56.01,51.48,51.74,44.9,45.6,48.76,51.94,61.55,73.28,66.29,63.29,61.46,58.49,50.06,44.18,41.35,39.9,39.84,36.59,39.0,32.85,30.03,33.51,41.28,48.27,62.42,60.31,47.04,45.5,48.2,53.46,64.52,90.94,99.45,88.22,77.94,74.07,67.39,57.02,48.88,45.05,48.09,43.18,39.9,39.08,40.62,33.1,31.9,36.77,43.92,71.71,69.82,57.64,46.98,60.73,60.83,96.1,142.76,137.34,115.32,94.89,84.02,76.91,67.68,56.16,48.75,51.23,49.58,44.37,39.98,41.33,39.08,33.27,34.35,38.63,74.36,76.11,64.27,50.65,61.01,87.64,127.37,144.27,154.88,154.21,121.78,101.25,89.54,78.44,65.93,57.72,53.51,54.97,50.69,45.22,40.48,37.04,35.56,33.26,36.96,77.43,64.79,71.06,63.36,57.43,106.1,134.93,142.34,148.85,155.99,159.95,124.49,103.86,87.97,76.57,68.22,60.35,57.23,55.58,51.15,42.79,40.6,41.94,34.53,34.96,78.13,73.79,75.67,56.64,60.87,96.08,128.31,138.22,143.42,153.35,161.87,155.41,122
.68,104.06,87.5,77.81,68.75,64.85,60.88,59.44,47.98,44.57,45.44,42.71,35.23,77.37,77.0,77.61,59.73,63.84,99.1,113.48,135.22,152.77,151.16,158.9,162.84,153.3,120.58,103.98,87.91,79.23,73.92,67.96,64.94,53.54,44.0,50.07,46.93,43.32,88.65,84.02,80.83,65.49,59.07,77.84,103.68,125.53,147.66,160.47,159.49,164.66,165.91,146.24,118.18,102.01,88.47,80.25,74.83,70.48,62.1,49.47,48.1,53.33,48.34,99.44,88.01,84.27,82.97,60.09,69.83,92.51,113.41,135.15,149.8,164.15,160.75,163.8,167.33,147.47,118.41,101.95,88.65,82.18,78.29,69.59,55.74,49.71,51.8,57.88,102.64,104.12,97.82,74.39,68.53,60.14,77.6,99.22,117.22,130.03,153.71,162.09,161.09,164.47,167.19,139.18,114.62,97.97,87.68,83.73,76.03,60.99,51.93,49.41,59.15,114.76,115.62,99.65,95.47,95.31,70.6,60.14,75.41,100.54,122.61,141.39,157.32,160.93,163.17,167.0,158.25,129.02,112.59,96.6,89.03,80.11,65.83,56.0,49.52,54.51,121.11,111.04,114.21,111.8,105.23,102.43,66.8,74.98,99.48,111.59,126.23,144.81,163.02,164.44,165.56,165.01,157.87,129.18,108.67,96.9,87.28,74.81,60.53,53.12,52.44,123.96,126.43,127.51,124.19,107.58,92.65,67.27,74.09,95.48,102.63,123.05,143.73,153.72,166.16,164.96,166.27,166.6,156.49,122.54,104.98,94.24,81.41,70.53,60.0,55.06,132.49,128.75,142.22,132.8,116.22,95.17,78.05,65.54,73.44,94.77,122.63,150.45,159.95,162.54,164.74,165.55,167.95,167.5,141.39,116.28,102.84,88.36,78.59,68.04,59.33,135.9,129.63,135.65,149.65,131.83,109.12,92.13,67.31,71.28,95.2,105.78,133.74,155.48,169.13,167.16,163.63,164.41,168.11,159.05,128.09,109.31,95.72,85.54,79.65,67.48,139.88,133.57,135.32,141.51,145.89,119.2,96.72,77.69,69.21,80.55,95.18,129.15,144.98,165.06,158.96,167.14,159.53,164.18,156.31,133.83,121.74,108.38,93.46,86.54,79.24,142.73,152.05,156.37,143.59,147.61,126.53,103.08,90.34,92.32,71.97,83.82,113.32,135.91,156.39,150.75,161.93,157.2,155.83,154.93,150.37,140.2,119.8,102.33,94.4,90.57,136.21,140.78,156.4,162.33,152.52,138.31,127.93,95.01,89.0,83.32,74.11,102.61,125.18,134.58,135.15,151.46,152.13,149.58,153.91,156.0,162.27,132.0,115.
84,105.2,99.94,133.21,139.49,147.1,157.23,167.48,154.91,135.91,111.99,94.57,90.76,78.29,83.11,102.76,112.19,120.68,136.88,137.72,144.85,153.46,161.94,169.27,165.91,133.39,118.98,107.7,136.68,139.87,152.59,158.63,167.04,165.88,161.25,126.56,102.07,93.71,79.93,81.49,96.21,93.82,101.26,124.57,146.39,150.26,147.96,158.73,168.95,173.35,156.12,130.89,116.13,131.8,135.34,149.18,158.94,158.41,153.19,153.96,152.89,109.61,101.17,103.41,91.12,79.35,83.68,90.6,118.7,139.47,152.39,155.23,154.36,163.94,171.28,171.87,141.11,123.49,128.26,127.93,136.81,138.07,139.58,140.93,137.3,139.21,119.46,103.23,100.02,95.77,81.51,93.65,103.9,120.91,126.51,147.94,160.93,159.58,158.97,170.02,173.4,171.77,133.46,129.34,123.57,125.8,127.17,134.63,140.14,141.37,132.18,121.88,112.02,104.23,98.86,81.28,97.91,96.43,113.3,139.2,143.14,155.76,166.2,158.24,165.4,171.77,172.06,140.75,112.75,119.41,120.86,124.88,134.42,145.88,159.11,160.31,147.24,125.13,112.38,100.79,87.49,88.87,99.02,114.24,130.88,155.17,156.39,166.16,161.29,161.34,161.94,158.02,158.28,106.87,113.99,118.24,122.56,133.41,145.66,155.71,165.71,165.42,155.14,127.78,111.46,91.2,91.16,97.31,106.22,134.23,149.23,164.3,163.19,170.52,162.95,168.76,152.24,161.13,121.83,121.15,119.0,121.56,129.72,140.09,153.12,162.86,168.34,168.92,161.0,128.69,108.79,89.1,87.66,99.02,125.32,149.07,162.35,168.22,170.28,171.39,167.62,172.27,174.12,124.29,121.09,122.74,121.68,128.9,135.39,145.94,155.36,157.98,166.38,171.83,161.26,117.38,101.24,92.25,90.26,111.61,132.57,145.7,165.92,169.35,170.59,169.76,175.52,175.58,127.46,123.56,122.96,122.32,125.0,129.83,139.32,144.75,145.12,151.16,159.37,163.38,125.29,108.04,102.34,100.38,100.38,118.72,144.25,162.56,168.8,175.06,173.27,175.9,176.65,129.23,123.85,124.3,124.41,126.73,127.33,130.66,135.53,143.09,163.18,161.45,160.31,146.71,123.03,104.09,100.38,100.38,104.73,136.26,158.26,162.47,172.58,172.73,171.67,176.73,120.55,127.56,125.67,126.14,127.67,129.82,129.3,132.26,143.3,161.39,171.13,173.89,171.66,131.8,107.61,100.38,100.38
,102.62,126.85,144.66,154.41,172.86,175.48,172.4,173.72,116.01,126.43,125.59,126.5,131.19,129.04,134.07,136.27,135.63,142.09,159.11,175.01,179.52,138.05,101.75,102.8,100.38,100.38,112.51,127.4,152.04,170.15,172.66,169.62,171.11,110.18,130.36,129.34,129.04,134.03,141.08,139.77,145.16,147.8,149.31,165.49,172.97,175.88,134.81,108.36,103.24,100.38,100.38,106.34,121.12,144.63,160.44,166.56,164.1,169.59,104.51,119.17,134.58,130.33,134.6,141.51,155.8,151.57,159.5,155.76,158.77,163.45,159.48,143.39,111.96,107.48,100.38,100.38,104.99,113.14,132.14,150.97,163.91,158.67,167.88,96.5,107.46,121.18,145.9,141.42,141.48,145.83,143.74,150.62,145.75,144.48,148.36,146.6,149.51,122.77,111.64,104.44,100.38,102.92,111.09,126.52,143.73,159.52,155.23,163.23,90.59,97.17,112.25,134.94,140.29,135.93,137.72,134.54,137.69,132.11,135.01,141.86,149.14,153.19,150.77,116.52,106.07,101.64,102.67,106.72,122.36,136.38,153.14,160.57,154.01,84.27,91.0,99.89,120.09,123.15,120.86,128.41,127.42,126.42,129.29,143.8,149.59,165.05,163.71,156.91,126.17,110.32,103.98,102.7,105.38,113.28,126.0,141.28,156.94,142.7,79.52,85.85,89.72,96.0,103.94,109.11,113.31,125.44,129.86,134.67,139.17,154.89,169.61,178.97,171.75,153.95,122.39,110.25,101.31,104.01,111.03,119.34,127.07,138.48,150.35,78.71,83.85,92.84,103.12,118.74,123.26,124.38,132.82,135.71,139.43,142.35,153.33,166.66,178.73,179.83,177.3,146.91,115.31,103.65,102.19,108.29,120.87,140.26,153.15,161.64,76.53,81.19,85.54,97.94,114.77,120.33,127.56,136.91,150.14,148.19,151.04,158.93,159.35,167.55,176.78,177.43,168.22,122.85,106.91,103.39,105.17,112.83,127.37,149.56,158.63,76.9,80.11,83.05,89.9,103.9,107.88,113.18,128.27,146.95,157.8,156.48,162.97,152.02,155.46,175.3,175.85,165.8,124.73,118.12,105.54,104.85,113.42,120.65,135.84,155.64,76.36,79.74,86.01,86.54,92.08,95.01,105.42,123.23,141.01,149.2,147.39,149.84,141.68,150.58,164.87,169.4,157.16,159.46,134.27,110.78,103.99,110.46,118.75,135.33,152.69,78.1,78.68,86.48,90.58,93.27,92.16,97.28,110.67,126.28,130.56,128.0,128.
45,127.66,137.06,149.28,153.26,160.61,169.86,156.81,123.71,103.82,106.59,113.73,127.34,148.38,82.35,80.09,82.09,88.63,95.05,96.56,97.26,100.99,110.38,113.98,120.77,132.22,134.76,147.55,166.07,170.47,169.35,168.89,155.55,134.24,110.96,104.36,108.32,118.05,136.24,84.15,84.23,82.32,87.76,91.64,93.12,96.96,106.33,111.31,106.79,119.35,132.95,147.96,153.55,165.8,178.43,178.35,170.88,165.74,140.36,119.17,107.65,106.75,113.56,127.58,82.99,85.54,84.77,84.53,90.08,95.19,95.25,105.43,118.46,119.85,116.2,129.13,142.15,149.56,155.14,171.6,179.44,174.73,161.75,142.72,129.9,115.26,107.99,111.72,121.99,83.01,84.0,87.7,86.06,85.61,94.36,100.3,105.68,117.92,125.39,115.72,129.22,138.43,141.11,151.63,160.55,168.46,164.25,159.16,150.23,136.74,122.5,113.66,108.45,114.78,84.11,84.63,86.98,90.22,88.1,91.68,100.12,113.41,115.57,121.86,128.81,118.67,129.66,133.23,141.51,145.97,148.93,150.71,162.12,166.27,141.86,130.13,115.63,109.05,112.53,85.03,84.97,87.49,89.39,93.67,94.55,102.32,113.85,127.45,126.0,130.58,124.19,132.07,132.58,131.1,139.71,148.52,162.21,157.78,164.7,149.31,142.46,128.04,109.0,110.09,85.93,87.73,89.92,90.55,93.51,98.52,103.9,118.72,129.55,133.86,136.06,137.44,136.47,146.6,143.25,136.06,157.12,172.07,169.4,168.79,158.26,144.45,138.0,111.86,109.03,87.34,90.39,91.4,93.1,91.44,93.54,97.58,115.07,124.02,123.47,137.17,141.91,149.48,150.24,151.28,139.57,144.66,162.92,170.62,177.08,172.96,151.89,129.57,120.07,112.54],"xs":[-74.3129825592041,-74.311283826828,-74.3095850944519,-74.3078863620758,-74.3061876296997,-74.30448889732361,-74.30279016494751,-74.3010914325714,-74.2993927001953,-74.2976939678192,-74.2959952354431,-74.294296503067,-74.2925977706909,-74.2908990383148,-74.2892003059387,-74.28750157356261,-74.28580284118651,-74.28410410881041,-74.2824053764343,-74.2807066440582,-74.2790079116821,-74.277309179306,-74.2756104469299,-74.2739117145538,-74.2722129821777]} -""" -lenape_csv = """ -"","elevation","elev_units","longitude","latitude" 
-"1",126.85,"meters",-74.2986363,40.7541939 -"2",125.19,"meters",-74.298561,40.754122 -"3",123.52,"meters",-74.298505,40.754049 -"4",121.92,"meters",-74.298435,40.753972 -"5",119.86,"meters",-74.298402,40.753872 -"6",119.86,"meters",-74.298416,40.753818 -"7",119.86,"meters",-74.298393,40.753805 -"8",118.32,"meters",-74.298233,40.753717 -"9",118.48,"meters",-74.298113,40.753706 -"10",118.48,"meters",-74.298079,40.753714 -"11",110.65,"meters",-74.297548,40.753434 -"12",108.68,"meters",-74.297364,40.753392 -"13",108.68,"meters",-74.2973338,40.7533463 -"14",107.67,"meters",-74.2972265,40.7533169 -"15",107.54,"meters",-74.297087,40.753356 -"16",107.54,"meters",-74.2970438,40.7533584 -"17",106.74,"meters",-74.296979,40.753397 -"18",107.69,"meters",-74.29689,40.753533 -"19",108.01,"meters",-74.296812,40.753661 -"20",108.34,"meters",-74.296718,40.753785 -"21",108.93,"meters",-74.296627,40.753874 -"22",109.26,"meters",-74.296514,40.753973 -"23",109.44,"meters",-74.296377,40.754026 -"24",107.8,"meters",-74.296184,40.754049 -"25",108.14,"meters",-74.29596,40.754119 -"26",108.31,"meters",-74.295761,40.754191 -"27",107.08,"meters",-74.295542,40.754277 -"28",106.54,"meters",-74.295345,40.754276 -"29",105.18,"meters",-74.295177,40.754295 -"30",104.93,"meters",-74.2951,40.754358 -"31",103.79,"meters",-74.294976,40.754381 -"32",103.79,"meters",-74.294943,40.754379 -"33",103.62,"meters",-74.294873,40.754362 -"34",103.46,"meters",-74.294805,40.754359 -"35",102.68,"meters",-74.294687,40.754349 -"36",102.78,"meters",-74.294537,40.754269 -"37",100.91,"meters",-74.294341,40.754248 -"38",101.24,"meters",-74.294228,40.754249 -"39",101.15,"meters",-74.294146,40.75427 -"40",100.73,"meters",-74.294043,40.754277 -"41",100.77,"meters",-74.293997,40.75418 -"42",97.54,"meters",-74.293672,40.75418 -"43",97.58,"meters",-74.293539,40.754324 -"44",97.41,"meters",-74.293442,40.754447 -"45",97.02,"meters",-74.29342,40.754555 -"46",96.78,"meters",-74.293397,40.754677 
-"47",96.72,"meters",-74.293319,40.754787 -"48",96.98,"meters",-74.2933093,40.7549621 -"49",97.04,"meters",-74.2931914,40.7550903 -"50",95.89,"meters",-74.2931359,40.7552002 -"51",95.48,"meters",-74.293124,40.75528 -"52",95.43,"meters",-74.293142,40.755375 -"53",95.58,"meters",-74.293163,40.7554692 -"54",95.58,"meters",-74.2931806,40.7555174 -"55",95.31,"meters",-74.2930826,40.7555402 -"56",95.45,"meters",-74.2930283,40.7555572 -"57",94.19,"meters",-74.2929292,40.7555853 -"58",93.57,"meters",-74.2928114,40.7556067 -"59",92.9,"meters",-74.2927408,40.7556127 -"60",92.9,"meters",-74.2926921,40.7556257 -"61",91.46,"meters",-74.2926528,40.7556602 -"62",91.46,"meters",-74.2926104,40.7556888 -"63",88.42,"meters",-74.2925696,40.7557042 -"64",88.42,"meters",-74.2925272,40.7556876 -"65",85.62,"meters",-74.2924927,40.7556674 -"66",85.32,"meters",-74.2924503,40.755646 -"67",85.32,"meters",-74.2924377,40.7556222 -"68",85.32,"meters",-74.2924377,40.7555877 -"69",84.49,"meters",-74.2924346,40.7555365 -"70",84.49,"meters",-74.2924236,40.755502 -"71",84.36,"meters",-74.2923562,40.7554961 -""" -nothing -``` - -```julia; -SC = JSON.parse(somocon) # defined in a hidden cell -xsₛ, ysₛ, zsₛ = [float.(SC[i]) for i in ("xs", "ys","zs")] -zzsₛ = reshape(zsₛ, (length(xsₛ), length(ysₛ)))' # reshape to matrix -surface(xsₛ, ysₛ, zzsₛ) -``` - -This shows a bit of the topography. If we look at the region from directly above, the graph looks different: - -```julia; -surface(xsₛ, ysₛ, zzsₛ, camera=(0, 90)) -``` - -The rendering uses different colors to indicate height. A more typical graph, somewhat similar to this top-down view, is a *contour* map. - -For a scalar function, define a *level curve* as the set of solutions to the equation $f(x,y) = c$ for a given $c$. (Or, more generally, $f(\vec{x}) = c$ for $\vec{x}$ of dimension $2$ or more.) Plotting a selection of level curves yields a *contour* graph. These are produced with `contour`, which is called in the same manner as above.
For example, we have: - -```julia; -contour(xsₛ, ysₛ, zzsₛ) -``` - -Were one to walk along one of the contour lines, then there would be no change in elevation. The areas of greatest change in elevation - basically the hills - occur where the different contour lines are closest. In this particular area, there is a river that runs from the upper right through to the lower left, and it is flanked by hills. - - -The $c$ values for the levels drawn may be specified through the `levels` argument: - -```julia; -contour(xsₛ, ysₛ, zzsₛ, levels=[50,75,100, 125, 150, 175]) -``` - -That shows the ``50``m, ``75``m, ... contours. - -If a fixed number of evenly spaced levels is desirable, then the `nlevels` argument is available. - - -```julia; -contour(xsₛ, ysₛ, zzsₛ, nlevels = 5) -``` - - - -If a function describes the surface, then the function may be passed as the third argument: - -```julia; hold=true -f(x, y) = sin(x) - cos(y) -xs = range(0, 2pi, length=100) -ys = range(-pi, pi, length = 100) -contour(xs, ys, f) -``` - -##### Example - -An informative graphic combines a surface plot with a contour plot. The `PyPlot` package can be used to generate one, but such graphs are not readily -made within the `Plots` framework. Here is a workaround, where the contours are generated through the `Contour` package. At the beginning of this section several of its methods are imported. - -This example shows how to add a contour at a fixed level ($0$ below).
As no hidden-line algorithm is used to hide the contour line where the surface would cover it, transparency is specified through `alpha=0.5`: - - -```julia; hold=true - -function surface_contour(xs, ys, f; offset=0) - p = surface(xs, ys, f, legend=false, fillalpha=0.5) - - ## we add to the graphic p, then plot - zs = [f(x,y) for x in xs, y in ys] # reverse order for use with Contour package - for cl in levels(contours(xs, ys, zs)) - lvl = level(cl) # the z-value of this contour level - for line in lines(cl) - _xs, _ys = coordinates(line) # coordinates of this line segment - _zs = offset .+ (0 .* _xs) - plot!(p, _xs, _ys, _zs, alpha=0.5) # add curve on x-y plane - end - end - p -end - -xs = ys = range(-pi, stop=pi, length=100) -f(x,y) = 2 + sin(x) - cos(y) - -surface_contour(xs, ys, f) -``` - -We can see that at the minimum of the surface, the contour lines are nested closed loops with decreasing area. - -##### Example - - -The figure shows a weather map from ``1943`` with contour lines based on atmospheric pressure. These are also known as *isolines*. - -```julia; hold=true; echo=false -imgfile = "figures/daily-map.jpg" -caption = """ -Image from [weather.gov](https://www.weather.gov/unr/1943-01-22) of a contour map showing atmospheric pressures from January 22, 1943 in Rapid City, South Dakota. -""" -ImageFile(:differentiable_vector_calculus, imgfile, caption) -``` - -This day is highlighted as "The most notable temperature fluctuations occurred on January 22, 1943 when temperatures rose and fell almost 50 degrees in a few minutes. This phenomenon was caused when a frontal boundary separating extremely cold Arctic air from warmer Pacific air rolled like an ocean tide along the northern and eastern slopes of the Black Hills." - -This frontal boundary is marked with triangles and half circles along the thicker black line. The tight spacing of the contour lines above that marked line shows a big change in pressure in a short distance.
- - - -##### Example - - -Sea surface temperature varies with latitude and other factors, such as water depth. The following figure shows average temperatures for January 1982 around Australia. The filled contours allow for an easier identification of the ranges represented. - -```julia; hold=true; echo=false -imgfile = "figures/australia.png" -caption = """ -Image from [IRI](https://iridl.ldeo.columbia.edu/maproom/Global/Ocean_Temp/Monthly_Temp.html) shows mean sea surface temperature near Australia in January 1982. IRI has zoomable graphs for this measurement from 1981 to the present. The contour lines are in 2 degree Celsius increments. -""" -ImageFile(:differentiable_vector_calculus, imgfile, caption) -``` - - -##### Example - -The filled contour and the heatmap are figures related to a simple contour graph. The heatmap uses a color gradient to indicate the value at $(x,y)$: - -```julia; hold=true -f(x,y) = exp(-(x^2 + y^2)/5) * sin(x) * cos(y) -xs= ys = range(-pi, pi, length=100) -heatmap(xs, ys, f) -``` - -The filled contour overlays the contour lines onto a heatmap: - -```julia; hold=true -f(x,y) = exp(-(x^2 + y^2)/5) * sin(x) * cos(y) -xs= ys = range(-pi, pi, length=100) -contourf(xs, ys, f) -``` - - -This function has a prominent peak and a prominent valley around the middle of the viewing window. The nested contour lines indicate this, and the color key can be used to identify which is the peak and which the valley. - -## Limits - -The notion of a limit for a univariate function - as $x$ gets close to $c$, $f(x)$ gets close to $L$ - needs some modification: - -> Let $f: R^n \rightarrow R$ and $C$ be a point in $R^n$. Then $\lim_{P \rightarrow C}f(P) = L$ if for every $\epsilon > 0$ there exists a $\delta > 0$ such that $|f(P) - L| < \epsilon$ whenever $0 < \| P - C \| < \delta$. - -(If $P=(x_1, x_2, \dots, x_n)$ we use $f(P) = f(x_1, x_2, \dots, x_n)$.)
- -This says, informally, that for any scale about $L$ there is a "ball" about $C$ (not including ``C``) whose image under $f$ always sits within that scale about $L$. -Formally, we define a ball of radius $r$ about a point $C$ to be all points $P$ with distance between $P$ and $C$ less than $r$. A ball is an *open* set. An [open set](https://en.wikipedia.org/wiki/Open_set#Euclidean_space) is a set $U$ such that for any $x$ in $U$, there is a radius $r$ such that the ball of radius $r$ about $x$ is *still* within $U$. An open set generalizes an open interval. A *closed* set generalizes a *closed* interval. A closed set is [defined](https://en.wikipedia.org/wiki/Closed_set) as a set that contains its boundary points. Boundary points are points that can be approached in the limit both by points within the set and by points outside of it. - - - - -In the univariate case, it can be useful to characterize a limit at $x=c$ as existing if *both* the left and right limits exist and the two are equal. Generalizing to $R^n$ leads to the intuitive idea that a limit exists if, along *any* continuous "path" approaching $C$, the composed function has a limit and all such limits agree. Let $\gamma$ describe the path, with $\lim_{s \rightarrow t}\gamma(s) = C$. Then $f \circ \gamma$ will be a univariate function. If $f$ has a limit, $L$, then this composition will also have the same limit as $s \rightarrow t$. Conversely, if for *every* path this composition has the *same* limit, then $f$ will have a limit. - -The "two path corollary" is a trick to show a limit does *not* exist: find two paths along which the limits exist but differ; then the limit of $f$ does not exist. - -### Continuity of scalar functions - -Continuity is defined in a familiar manner: $f(P)$ is continuous at $C$ if $\lim_{P \rightarrow C} f(P) = f(C)$, where we interpret $P \rightarrow C$ in the sense of a ball about $C$.
- -As with univariate functions, continuity is preserved under function addition, subtraction, multiplication, and division (provided there is no dividing by $0$). With this, all these functions are continuous everywhere and so have limits everywhere: - -```math -f(x,y) = \sin(x + y), \quad -g(x,y,z) = x^2 + y^2 + z^2, \quad -h(w, x,y,z) = \sqrt{w^2 + x^2 + y^2 + z^2}. -``` - - - -Not all functions will have a limit though. Consider $f(x,y) = 2x^2/(x^2+y^2)$ and $C=(0,0)$. It is not defined at $C$ (dividing by ``0``), but may have a limit at ``C``. Consider the path $x=0$ (the $y$-axis) parameterized by $\vec\gamma(t) = \langle 0, t\rangle$. Along this path $(f\circ \vec\gamma)(t) = 0/t^2 = 0$, so the limit along this path is $0$. If the limit of $f$ exists, it must be $0$. But, along the line $y=0$ (the $x$ axis) parameterized by $\vec{\gamma}(t) = \langle t, 0 \rangle$, the function simplifies to $(f\circ\vec\gamma)(t)=2$, so the limit along this path is $2$. As the limits along different paths differ, this function has no limit in general. - - -##### Example - -It is not enough that a limit exists along many paths to say a limit exists in general. The limit must exist, and agree, along *all* paths. An example might be this function: - -```math -f(x,y) = -\begin{cases} -(x + y)/(x-y) & x \neq y,\\ -0 & x = y -\end{cases} -``` - -At $\vec{0}$ this will not have a limit. However, along any line $y=mx$ we have a limit. If $m=1$ the function is constantly $0$, and so has limit $0$. If $m \neq 1$, then we get $f(x, y) = f(x, mx) = (1 + m)/(1-m)$, a constant. So for each $m$ there is a different limit. Consequently, the scalar function does not have a limit. - - - - - -## Partial derivatives and the gradient - -The behaviour of a scalar function along a path is described mathematically through composition. If $\vec\gamma(t)$ is a path in $R^n$, then the composition $f \circ \vec\gamma$ will be a univariate function.
When $n=2$, we can visualize this composition directly, or as a ``3``-D path on the surface given by $\vec{r}(t) = \langle \gamma_1(t), \gamma_2(t), \dots, \gamma_n(t), (f \circ \vec\gamma)(t) \rangle$. - -```julia; -f₁(x,y) = 2 - x^2 - 3y^2 -f₁(x) = f₁(x...) -γ₁(t) = 2 * [t, -t^2] # use \gamma[tab] -x₁s = y₁s = range(-1, 1, length=100) -surface(x₁s, y₁s, f₁) -r3₁(t) = [γ₁(t)..., f₁(γ₁(t))] # to plot the path on the surface -plot_parametric!(0..1/2, r3₁, linewidth=5, color=:black) - -r2₁(t) = [γ₁(t)..., 0] -plot_parametric!(0..1/2, r2₁, linewidth=5, color=:black) # in the $x$-$y$ plane -``` - -The vector-valued function `r3(t) = [γ(t)..., f(γ(t))]` takes the ``2``-dimensional path specified by $\vec\gamma(t)$ and adds a third, $z$, direction by composing the position with `f`. In this way, a ``2``-D path is visualized with a ``3``-D path. This viewpoint can be reversed, as desired. - -However, the composition, $f\circ\vec\gamma$, is a univariate function, so this can also be visualized by - -```julia; -plot(f₁ ∘ γ₁, 0, 1/2) -``` - -With this graph, we might be led to ask about derivatives or rates of change. For this example, we can algebraically compute the composition: - -```math -(f \circ \vec\gamma)(t) = 2 - (2t)^2 - 3(-2t^2)^2 = 2 - 4t^2 - 12t^4 -``` - -From here we clearly have $(f \circ \vec\gamma)'(t) = -8t - 48t^3$. But could this be computed in terms of a "derivative" of $f$ and the derivative of $\vec\gamma$? - -Before answering this, we discuss *directional* derivatives along the simplified paths $\vec\gamma_x(t) = \langle t, c\rangle$ or $\vec\gamma_y(t) = \langle c, t\rangle$. - -If we compose $f \circ \vec\gamma_x$, we can visualize this as a curve on the surface of $f$ whose motion in the $x$-$y$ plane is along the line $y=c$.
The derivative of this curve will satisfy: - -```math -\begin{align} -(f \circ \vec\gamma_x)'(x) &= -\lim_{t \rightarrow x} \frac{(f\circ\vec\gamma_x)(t) - (f\circ\vec\gamma_x)(x)}{t-x}\\ -&= \lim_{t\rightarrow x} \frac{f(t, c) - f(x,c)}{t-x}\\ -&= \lim_{h \rightarrow 0} \frac{f(x+h, c) - f(x, c)}{h}. -\end{align} -``` - -The latter expresses this as the derivative of the function that holds the $y$ value fixed but lets the $x$ value vary. It is the rate of change in the $x$ direction. There is special notation for this: - -```math -\begin{align} -\frac{\partial f(x,y)}{\partial x} &= -\lim_{h \rightarrow 0} \frac{f(x+h, y) - f(x, y)}{h},\quad\text{and analogously}\\ -\frac{\partial f(x,y)}{\partial y} &= -\lim_{h \rightarrow 0} \frac{f(x, y+h) - f(x, y)}{h}. -\end{align} -``` - -These are called the *partial* derivatives of $f$. The symbol $\partial$, read as "partial", is reminiscent of "$d$", but indicates the derivative is only in a given direction. Other notations exist for this: - -```math -\frac{\partial f}{\partial x}, \quad f_x, \quad \partial_x f, -``` - -and more generally, when $n$ may be ``2`` or more, - -```math -\frac{\partial f}{\partial x_i}, \quad f_{x_i}, \quad f_i, \quad \partial_{x_i} f, \quad \partial_i f. -``` - - -The *gradient* of a scalar function $f$ is the vector composed of the partial derivatives: - -```math -\nabla f(x_1, x_2, \dots, x_n) = \langle -\frac{\partial f}{\partial x_1}, -\frac{\partial f}{\partial x_2}, \dots, -\frac{\partial f}{\partial x_n} \rangle. -``` - -As seen, the gradient is a vector-valued function, but one that also has multivariable inputs. It is a function from $R^n \rightarrow R^n$. - - - -##### Example - -Let $f(x,y) = x^2 - 2xy$. To compute the partials, we just treat the other variables like constants. (This is consistent with the view that the partial derivative is just a regular derivative along a line where all other variables are constant.)
- -Then - -```math -\begin{align} -\frac{\partial (x^2 - 2xy)}{\partial x} &= 2x - 2y\\ -\frac{\partial (x^2 - 2xy)}{\partial y} &= 0 - 2x = -2x. -\end{align} -``` - -Combining gives $\nabla{f} = \langle 2x -2y, -2x \rangle$. - - - -If $g(x,y,z) = \sin(x) + z\cos(y)$, then - -```math -\begin{align} -\frac{\partial g }{\partial x} &= \cos(x) + 0 = \cos(x),\\ -\frac{\partial g }{\partial y} &= 0 + z(-\sin(y)) = -z\sin(y),\\ -\frac{\partial g }{\partial z} &= 0 + \cos(y) = \cos(y). -\end{align} -``` - -Combining gives $\nabla{g} = \langle \cos(x), -z\sin(y), \cos(y) \rangle$. - - -### Finding partial derivatives in Julia - - -Two different methods are described: one for working with functions, the other for symbolic expressions. This mirrors our treatment for vector-valued functions, where `ForwardDiff.derivative` was used for functions, and `SymPy`'s `diff` function for symbolic expressions. - - -Suppose we consider $f(x,y) = x^2 - 2xy$. We may define it with `Julia` through: - -```julia; -f₂(x,y) = x^2 - 2x*y -f₂(v) = f₂(v...) # to handle vectors. Need not be defined each time -``` - -The numeric gradient at a point can be found from the function `ForwardDiff.gradient` through: - -```julia; -pt₂ = [1, 2] -ForwardDiff.gradient(f₂, pt₂) # uses the f(v) call above -``` - -This, of course, matches the computation above, where $\nabla f = \langle 2x - 2y, -2x \rangle$, so at $(1,2)$ the gradient is $\langle -2, -2 \rangle$, a vector in $R^2$. - -The `ForwardDiff.gradient` function expects a function that accepts a vector of values, so the method for `f(v)` is needed for the computation. - - -To go from evaluating the gradient at a specified point to a function that evaluates it at any point, we have the following definition. This takes advantage of `Julia`'s multiple dispatch to add a new method for the `gradient` generic.
This is done in the `CalculusWithJulia` package along the lines of: - -```julia; eval=false -ForwardDiff.gradient(f::Function) = x -> ForwardDiff.gradient(f, x) -``` - -It works as follows, where a vector of values is passed in for the point in question: - -```julia; -gradient(f₂)([1,2]), gradient(f₂)([3,4]) -``` - -This expects a point or vector for its argument, and not the expanded values. Were that desired, something like this would work: - -```julia; eval=false -ForwardDiff.gradient(f::Function) = (x, xs...) -> ForwardDiff.gradient(f, vcat(x, xs...)) -``` - -```julia; -gradient(f₂)([1,2]), gradient(f₂)(3,4) -``` - - -From the gradient, finding the partial derivatives involves extraction of the corresponding component. - - -For example, were it desirable, this function could be used to find the partial in $x$ for some constant $y$: - -```julia; -partial_x(f, y) = x -> ForwardDiff.gradient(f,[x,y])[1] # first component of gradient -``` - -Another alternative would be to hold one variable constant and use the `derivative` function, as in: - -```julia; hold=true -partial_x(f, y) = x -> ForwardDiff.derivative(u -> f(u,y), x) -``` - - -!!! note - For vector-valued functions, we can override the syntax `'` using `Base.adjoint`, as `'` is treated as a postfix operator in `Julia` for the `adjoint` operation. The symbol `\\nabla` is also available in `Julia`, but it is not an operator, so can't be used as mathematically written `∇f` (this could be used as a name though). In `CalculusWithJulia` a definition is made so essentially `∇(f) = x -> ForwardDiff.gradient(f, x)`. It does require parentheses to be called, as in `∇(f)`. - - -#### Symbolic expressions - -The partial derivatives are more directly found with `SymPy`.
As with univariate functions, the `diff` function is used by simply passing in the variable in which to find the partial derivative: - -```julia; -@syms x y -ex = x^2 - 2x*y -diff(ex, x) -``` - -And evaluation: - -```julia; -diff(ex,x)(x=>1, y=>2) -``` - -Or - -```julia; -diff(ex, y)(x=>1, y=>2) -``` - -The gradient would be found by combining the two: - -```julia; -[diff(ex, x), diff(ex, y)] -``` - -This can be simplified through broadcasting: - -```julia; -grad_ex = diff.(ex, [x,y]) -``` - - -To evaluate at a point we have: - -```julia; -subs.(grad_ex, x=>1, y=>2) -``` - -The above relies on broadcasting treating each pair as a single value, so the substitution is repeated for each entry of `grad_ex`. - - -The `gradient` function from `CalculusWithJulia` is defined to find the symbolic gradient. It uses `free_symbols` to determine the number and order of the variables, but that may be wrong, so here they are specified: - -```julia; -gradient(ex, [x, y]) # [∂f/∂x, ∂f/∂y] -``` - -To use `∇` and specify the variables, a tuple (grouping parentheses) is used: - -```julia; -∇((ex, [x,y])) -``` - ----- - -In computer science there are two related concepts: [Currying](https://en.wikipedia.org/wiki/Currying) and [Partial application](https://en.wikipedia.org/wiki/Partial_application). For a function $f(x,y)$, say, partial application is the process of fixing one of the variables, producing a new function of fewer variables. For example, fixing $y=c$, we get a new function (of just $x$ and not $(x,y)$) $g(x) = f(x,c)$. For partial derivatives, the partial derivative of $f(x,y)$ with respect to $x$ is the derivative of the function $g$, as defined above. - -Currying is related, but technically returns a function, so we think of the curried version of $f$ as a function, $h$, which takes $x$ and returns the function $y \rightarrow f(x,y)$ so that $h(x)(y) = f(x, y)$.
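These two ideas can be expressed directly in `Julia`. The following sketch illustrates the distinction; the names `partially_apply` and `curry` are made up for illustration and come from no package:

```julia
f(x, y) = x^2 - 2x*y

# partial application: fix y = c, producing a function of x alone
partially_apply(f, c) = x -> f(x, c)

# currying: h takes x and returns the function y -> f(x, y)
curry(f) = x -> (y -> f(x, y))

g = partially_apply(f, 2)   # g(x) = f(x, 2) = x^2 - 4x
h = curry(f)

g(1), h(1)(2)               # both recover f(1, 2) = -3
```

Here `g` plays the role of the function whose ordinary derivative is the partial derivative in $x$ along the line $y=2$.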
- - -### Visualizing the gradient - -The gradient is not a univariate function, a simple vector-valued function, or a scalar function, but rather a *vector field* (which will be discussed later). For the case $f: R^2 \rightarrow R$, the gradient will be a function which takes a point $(x,y)$ and returns a vector, $\langle \partial{f}/\partial{x}(x,y), \partial{f}/\partial{y}(x,y) \rangle$. We can visualize this by plotting a vector at several points on a grid. This task is made easier with a function like the following, which handles the task of vectorizing the values. It is provided within the `CalculusWithJulia` package: - -```julia; eval=false -function vectorfieldplot!(V; xlim=(-5,5), ylim=(-5,5), nx=10, ny=10, kwargs...) - - dx, dy = (xlim[2]-xlim[1])/nx, (ylim[2]-ylim[1])/ny - xs, ys = xlim[1]:dx:xlim[2], ylim[1]:dy:ylim[2] - - ps = [[x,y] for x in xs for y in ys] - vs = V.(ps) - λ = 0.9 * minimum([u/maximum(getindex.(vs,i)) for (i,u) in enumerate((dx,dy))]) - - quiver!(unzip(ps)..., quiver=unzip(λ * vs)) - -end -``` - - -Here we show the gradient for the scalar function $f(x,y) = 2 - x^2 - 3y^2$ over the region $[-2, 2]\times[-2,2]$ along with a contour plot: - -```julia; hold=true -f(x,y) = 2 - x^2 - 3y^2 -f(v) = f(v...) - -xs = ys = range(-2,2, length=50) - -p = contour(xs, ys, f, nlevels=12) -vectorfieldplot!(p, gradient(f), xlim=(-2,2), ylim=(-2,2), nx=10, ny=10) - -p -``` - -The figure suggests a potential geometric relationship between the gradient and the contour lines, to be explored later. - -## Differentiable - -We see here how the gradient of $f$, $\nabla{f} = \langle f_{x_1}, f_{x_2}, \dots, f_{x_n} \rangle$, plays a role similar to the one the derivative plays for univariate functions. - -First, we consider the role of the derivative for univariate functions. The main characterization - the derivative is the slope of the line that best approximates the function at a point - is quantified by Taylor's theorem.
For a function $f$ with a continuous second derivative: - -```math -f(c+h) = f(c) + f'(c)h + \frac{1}{2} f''(\xi) h^2, -``` - -for some $\xi$ between $c$ and $c+h$. - -We re-express this through: - -```math -(f(c+h) - f(c)) - f'(c)h = \frac{1}{2} f''(\xi) h^2. -``` - -The right-hand side is the *error* term between the function value at $c+h$ and, in this case, the linear approximation at the same value. - - -If the assumptions are relaxed, and $f$ is just assumed to be *differentiable* at $x=c$, then only this is known: - -```math -(f(c+h) - f(c)) - f'(c)h = \epsilon(h) h, -``` - -where $\epsilon(h) \rightarrow 0$ as $h \rightarrow 0$. - - -It is this characterization of differentiability that is generalized to define when a scalar function is *differentiable*. - -> *Differentiable*: Let $f$ be a scalar function. -> Then $f$ is [differentiable](https://tinyurl.com/qj8qcbb) at a point $C$ **if** the first order partial derivatives exist at $C$ **and** for $\vec{h}$ going to $\vec{0}$: -> -> ``\|f(C + \vec{h}) - f(C) - \nabla{f}(C) \cdot \vec{h}\| = \mathcal{o}(\|\vec{h}\|),`` -> -> where ``\mathcal{o}(\|\vec{h}\|)`` means that dividing the left-hand side by $\|\vec{h}\|$ and taking the limit as $\vec{h}\rightarrow 0$, the limit will be $0$. - - -The limits here are limits of scalar functions, so they are along *any* path going to $\vec{0}$, not just the straight-line paths used to define the partial derivatives. Hidden above is an assumption that there is some open set around $C$ on which $f$ is defined, so that $f(C + \vec{h})$ makes sense when $C+\vec{h}$ is in this open set. - - -The role of the derivative in the univariate case is played by the gradient in the scalar case, where $f'(c)h$ is replaced by $\nabla{f}(C) \cdot \vec{h}$.
-For the univariate case, being differentiable is simply a matter of the derivative existing, but saying a scalar function is differentiable at $C$ is a stronger statement than saying it has a gradient or, equivalently, partial derivatives at $C$: the existence of the partial derivatives is assumed in the statement, but the limit condition must hold as well. - -Later we will see how Taylor's theorem generalizes for scalar functions and interpret the gradient geometrically, as was done for the derivative (it being the slope of the tangent line). - - - -## The chain rule to evaluate $f\circ\vec\gamma$ - - -In finding a partial derivative, we restricted the function to a curve in the $x$-$y$ plane, in this case the curve $\vec{\gamma}(t)=\langle t, c\rangle$. In general, if we have a curve in the $x$-$y$ plane, $\vec{\gamma}(t)$, we can compose the scalar function $f$ with $\vec{\gamma}$ to create a univariate function. If the functions are "smooth" then this composed function should have a derivative, and some version of a "chain rule" should provide a means to compute the derivative in terms of the "derivative" of $f$ (the gradient) and the derivative of $\vec{\gamma}$ ($\vec{\gamma}'$). - -> *Chain rule*: Suppose $f$ is *differentiable* at $C$, and $\vec{\gamma}(t)$ is -> differentiable at $c$ with $\vec{\gamma}(c) = C$. Then -> $f\circ\vec{\gamma}$ is differentiable at $c$ with derivative -> $\nabla f(\vec{\gamma}(c)) \cdot \vec{\gamma}'(c)$. - - -This is similar to the chain rule for univariate functions $(f\circ g)'(u) = f'(g(u)) g'(u)$ or $df/dx = df/du \cdot du/dx$. However, when we write this out in components there are more terms. For example, for $n=2$ we have, with $\vec{\gamma} = \langle x(t), y(t) \rangle$: - -```math -\frac{d(f\circ\vec{\gamma})}{dt} = -\frac{\partial f}{\partial x} \frac{dx}{dt} + -\frac{\partial f}{\partial y} \frac{dy}{dt}. -``` - - -The proof is a consequence of the definition of differentiability and will be shown in more generality later.
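The component form of the chain rule can be sanity checked numerically with nothing more than finite differences. This is only a sketch - the function and path are arbitrary choices, and a centered difference quotient stands in for each derivative:

```julia
# check d(f∘γ)/dt = (∂f/∂x)(dx/dt) + (∂f/∂y)(dy/dt) at a point
f(x, y) = sin(x) + x*y
x(t) = t^2
y(t) = 3t

t0, h = 1.0, 1e-6
D(g, t) = (g(t + h) - g(t - h)) / (2h)   # centered difference quotient

lhs = D(t -> f(x(t), y(t)), t0)          # derivative of the composition
fx  = D(u -> f(u, y(t0)), x(t0))         # ∂f/∂x at (x(t0), y(t0))
fy  = D(v -> f(x(t0), v), y(t0))         # ∂f/∂y at (x(t0), y(t0))
rhs = fx * D(x, t0) + fy * D(y, t0)

isapprox(lhs, rhs; atol=1e-5)            # the two sides agree
```

For this choice, $(f\circ\vec\gamma)(t) = \sin(t^2) + 3t^3$, so the exact value of both sides at $t=1$ is $2\cos(1) + 9$.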
- - -##### Example - - -Consider the function $f(x,y) = 2 - x^2 - y^2$ and the curve $\vec\gamma(t) = t\langle \cos(t), -\sin(t) \rangle$ at $t=\pi/6$. We visualize this below: - - -```julia; -f₃(x,y) = 2 - x^2 - y^2 -f₃(x) = f₃(x...) -γ₃(t) = t*[cos(t), -sin(t)] -t0₃ = pi/6 -``` - -```julia; hold=true -xs = ys = range(-3/2, 3/2, length=100) -surface(xs, ys, f₃, legend=false) - -r(t) = [γ₃(t)..., (f₃∘γ₃)(t)] -plot_parametric!(0..1/2, r, linewidth=5, color=:black) - -arrow!(r(t0₃), r'(t0₃), linewidth=5, color=:black) -``` - -In three dimensions, the tangent line is seen, but the univariate function $f \circ \vec\gamma$ looks like: - -```julia; hold=true -plot(f₃ ∘ γ₃, 0, pi/2) -plot!(t -> (f₃ ∘ γ₃)(t0₃) + (f₃ ∘ γ₃)'(t0₃)*(t - t0₃), 0, pi/2) -``` - -From the graph, the slope of the tangent line looks to be about $-1$; using the chain rule gives the exact value: - -```julia; -ForwardDiff.gradient(f₃, γ₃(t0₃)) ⋅ γ₃'(t0₃) -``` - -We can compare this to taking the derivative after composition: - -```julia; -(f₃ ∘ γ₃)'(t0₃) -``` - - - -##### Example - - -Consider the following plot showing a hiking trail on a surface: - -```julia;echo=false -lenape = CSV.File(IOBuffer(lenape_csv)) |> DataFrame -nothing -``` - -```julia;hold=true;echo=false -xs, ys, zs = [float.(SC[i]) for i in ("xs", "ys","zs")] -zzs = reshape(zs, (length(xs), length(ys)))' # reshape to matrix -surface(xs, ys, zzs, legend=false) -plot!(lenape.longitude, lenape.latitude, lenape.elevation, linewidth=5, color=:black) -``` - -Though it is hard to see the trail rendered on the surface here, for the hiker such rendering issues are far from the mind. Rather, questions such as what is the steepest part of the trail come to mind. - -This question can be answered in terms of the sampled data in the `lenape` variable. The steepness is the change in elevation with respect to the distance traveled in the $x$-$y$ direction.
Treating the latitude and longitude coordinates as describing motion in a plane (as opposed to a very big sphere), we can compute the maximum steepness:
-
-```julia; hold=true
-xs, ys, zs = lenape.longitude, lenape.latitude, lenape.elevation
-dzs = zs[2:end] - zs[1:end-1]
-dxs, dys = xs[2:end] - xs[1:end-1], ys[2:end] - ys[1:end-1]
-deltas = sqrt.(dxs.^2 + dys.^2) * 69 / 1.6 * 1000 # in meters now
-global slopes = abs.(dzs ./ deltas) # to re-use
-m = maximum(slopes)
-atand(maximum(slopes)) # in degrees due to the `d`
-```
-
-This is certainly too steep for a trail, which should be at most $10$ to $15$ degrees or so, not $58$. This is due to the inaccuracy in the measurements. An average might be better:
-
-```julia;
-import Statistics: mean
-atand(mean(slopes))
-```
-
-This seems about right for a generally uphill section of trail, as this one is.
-
-
-In the above example, the data is given in terms of a sample, not a functional representation. Suppose instead the surface was generated by `f` and the path - in the $x$-$y$ plane - by $\gamma$. Then we could estimate the maximum and average steepness by a process like this:
-
-```julia;
-f₄(x,y) = 2 - x^2 - y^2
-f₄(x) = f₄(x...)
-γ₄(t) = t*[cos(t), -sin(t)]
-```
-
-```julia; hold=true
-xs = ys = range(-3/2, 3/2, length=100)
-
-surface(xs, ys, f₄, legend=false)
-r(t) = [γ₄(t)..., (f₄ ∘ γ₄)(t)]
-plot_parametric!(0..1/2, r, linewidth=5, color=:black)
-```
-
-
-```julia;
-plot(f₄ ∘ γ₄, 0, pi/2)
-slope(t) = abs((f₄ ∘ γ₄)'(t))
-
-1/(pi/2 - 0) * quadgk(t -> atand(slope(t)), 0, pi/2)[1] # the average
-```
-
-The average comes out to about $50$ degrees. As for the maximum slope:
-
-```julia; hold=true
-cps = find_zeros(slope, 0, pi/2) # critical points
-
-append!(cps, (0, pi/2)) # add end points
-unique!(cps)
-
-M, i = findmax(slope.(cps)) # max, index
-
-cps[i], slope(cps[i])
-```
-
-The maximum slope occurs at an endpoint.
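For this particular surface and path, the composition simplifies algebraically - $f(\vec\gamma(t)) = 2 - t^2(\cos^2(t) + \sin^2(t)) = 2 - t^2$ - so the absolute slope is just $2t$, which makes both computations transparent. A package-free check (fresh names avoid clashing with the cells above; a plain Riemann mean stands in for `quadgk` only to keep the sketch self-contained):

```julia
# the same surface and path as above, under fresh names
f₅(x, y) = 2 - x^2 - y^2
γ₅(t) = (t*cos(t), -t*sin(t))

# the composition reduces to 2 - t², so the absolute slope is 2t
comp₅(t) = f₅(γ₅(t)...)
slope₅(t) = 2t

# maximum steepness, in degrees, occurs at the right endpoint t = π/2
maxsteep = atand(slope₅(pi/2))  # atand(π), about 72 degrees

# average steepness (in degrees) via a simple Riemann mean over [0, π/2]
ts = range(0, pi/2, length=10_001)
avg = sum(atand(slope₅(t)) for t in ts) / length(ts)  # about 50 degrees
```

This agrees with the average and endpoint maximum found numerically above.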
-
-
-
-
-
-
-
-## Directional Derivatives
-
-The last example asked how steep the path is in the direction being walked. When the walk is along a straight line in the $x$-$y$ plane, the question has a simpler answer:
-
-Let $\vec\gamma(t) = C + t \langle a, b \rangle$ be a line through the point $C$ in the direction of $\vec{v} = \langle a, b \rangle$.
-
-Then the function $f \circ \vec\gamma(t)$ will have a derivative when $f$ is differentiable, and by the chain rule it will be:
-
-```math
-(f\circ\vec\gamma)'(t) = \nabla{f}(\vec\gamma(t)) \cdot \vec\gamma'(t) =
-\nabla{f}(\vec\gamma(t)) \cdot \langle a, b\rangle =
-\vec{v} \cdot \nabla{f}(\vec\gamma(t)).
-```
-
-At $t=0$, we see that $(f\circ\vec\gamma)'(0) = \nabla{f}(C)\cdot \vec{v}$.
-
-
-This defines the *directional derivative* at $C$ in the direction $\vec{v}$:
-
-```math
-\text{Directional derivative} = \nabla_{\vec{v}}(f) = \nabla{f} \cdot \vec{v}.
-```
-
-If $\vec{v}$ is a *unit* vector, then the value of the directional derivative is the rate of increase in $f$ in the direction of $\vec{v}$.
-
-
-This is a *natural* generalization of the partial derivatives, which, in two dimensions, are the directional derivatives in the $x$ and $y$ directions.
-
-The following figure shows $C = (1/2, -1/2)$ and the two curves.
Planes are added, as it is easiest to visualize these curves as the intersection of the surface generated by $f$ with the vertical planes $x=C_x$ and $y=C_y$.
-
-
-```julia; hold=true; echo=false
-f(x,y) = 2 - x^2 - y^2
-
-xs = ys = range(-3/2, 3/2, length=100)
-surface(xs, ys, f, legend=false)
-M=f(3/2,3/2)
-
-x0,y0 = 1/2, -1/2
-plot!([-3/2, 3/2, 3/2, -3/2, -3/2], y0 .+ [0,0,0,0, 0], [M,M,2,2,M], linestyle=:dash)
-r(x) = [x, y0, f(x,y0)]
-plot_parametric!(-3/2..3/2, r, linewidth=5, color=:black)
-
-
-plot!(x0 .+ [0,0,0,0, 0], [-3/2, 3/2, 3/2, -3/2, -3/2], [M,M,2,2,M], linestyle=:dash)
-r(y) = [x0, y, f(x0, y)]
-plot_parametric!(-3/2..3/2, r, linewidth=5, color=:black)
-
-
-scatter!([x0],[y0],[M])
-arrow!([x0,y0,M], [1,0,0], linewidth=3)
-arrow!([x0,y0,M], [0, 1,0], linewidth=3)
-```
-
-
-We can then visualize the directional derivative by a plane through $C$ in the direction $\vec{v}$. Here we take $C=(1/2, -1/2)$, as before, and $\vec{v} = \langle 1, 1\rangle$:
-
-```julia; hold=true; echo=false
-f(x,y) = 2 - x^2 - y^2
-f(x) = f(x...)
-xs = ys = range(-3/2, 3/2, length=100)
-p = surface(xs, ys, f, legend=false)
-M=f(3/2,3/2)
-
-x0,y0 = 1/2, -1/2
-vx, vy = 1, 1
-l1(t) = [x0, y0] .+ t*[vx, vy]
-llx, lly = l1(-1)
-rrx, rry = l1(1)
-plot!([llx, rrx, rrx, llx, llx], [lly, rry, rry, lly, lly], [M,M, 2, 2, M], linestyle=:dash)
-
-r(t) = [l1(t)..., f(l1(t))]
-plot_parametric!(-1..1, r, linewidth=5, color=:black)
-arrow!(r(0), r'(0), linewidth=5, color=:black)
-
-
-scatter!([x0],[y0],[M])
-arrow!([x0,y0,M], [vx, vy,0], linewidth=3)
-```
-
-In this figure, we see that the directional derivative appears to be $0$, unlike the partial derivatives in $x$ and $y$, which are negative and positive, respectively.
-
-
-##### Example
-
-Let $f(x,y) = \sin(x+2y)$ and $\vec{v} = \langle 2, 1\rangle$.
The directional derivative of $f$ in the direction of $\vec{v}$ (normalized to a unit vector) at $(x,y)$ is:
-
-```math
-\nabla{f}\cdot \frac{\vec{v}}{\|\vec{v}\|} = \langle \cos(x + 2y), 2\cos(x + 2y)\rangle \cdot \frac{\langle 2, 1 \rangle}{\sqrt{5}} = \frac{4}{\sqrt{5}} \cos(x + 2y).
-```
-
-##### Example
-
-Suppose $f(x,y)$ describes a surface, and $\vec\gamma(t)$ parameterizes a path in the $x$-$y$ plane. Then the vector-valued function $\vec{r}(t) = \langle \vec\gamma_1(t), \vec\gamma_2(t), (f\circ\vec\gamma)(t)\rangle$ describes a path on the surface. The maximum steepness of this path is found by maximizing the directional derivative in the direction of the unit tangent. This is the following function of $t$:
-
-```math
-\nabla{f}(\vec\gamma(t)) \cdot \vec{T}(t),
-```
-
-where $\vec{T}(t) = \vec\gamma'(t)/\|\vec\gamma'(t)\|$ is the unit tangent vector to $\vec\gamma$.
-
-Let $f(x,y) = 2 - x^2 - y^2$ and $\vec\gamma(t) = (\pi-t) \langle \cos(t), \sin(t) \rangle$. What is the maximum steepness?
-
-We have $\nabla{f} = \langle -2x, -2y \rangle$ and $\vec\gamma'(t) = -\langle \cos(t), \sin(t)\rangle + (\pi-t) \langle -\sin(t), \cos(t)\rangle$. We maximize over $[0, \pi]$:
-
-```julia; hold=true
-f(x,y) = 2 - x^2 - y^2
-f(v) = f(v...)
-gamma(t) = (pi-t) * [cos(t), sin(t)]
-T(t) = gamma'(t) / norm(gamma'(t))   # unit tangent to the path
-dd(t) = gradient(f)(gamma(t)) ⋅ T(t) # the steepness along the path
-
-cps = find_zeros(dd', 0, pi)         # critical points of the steepness
-unique!(append!(cps, (0, pi)))       # add endpoints
-M,i = findmax(dd.(cps))
-M
-```
-
-
-##### Example: The gradient indicates the direction of steepest ascent
-
-
-Consider this figure showing a surface and a level curve along with a contour line:
-
-```julia; hold=true; echo=false
-f(x,y) = sqrt(x^2 + y^2)
-f(v) = f(v...)
-
-xs = ys = range(-2, 2, length=100)
-p = surface(xs, ys, f, legend=false)
-
-γ(t) = [cos(t), sin(t), f(cos(t), sin(t))]
-plot_parametric!(0..2pi, γ, linewidth=5)
-
-t = 7pi/4
-scatter!(p, unzip([γ(t)])...)
-
-
-rad(t) = 1 * [cos(t), sin(t)]
-γ(t) = [rad(t)..., 0]
-plot_parametric!(0..2pi, γ, linestyle=:dash)
-
-
-
-arrow!(γ(t), γ'(t))
-arrow!(γ(t), [ForwardDiff.gradient(f, rad(t))..., 0])
-```
-
-We have the level curve for $f(x,y) = c$ represented, and a point $(x,y, f(x,y))$ drawn. At the point $(x,y)$, which sits on the level curve, we have indicated the gradient and the tangent to the level curve, or contour line. It is worth reiterating that the gradient is not on the surface; rather, it is a $2$-dimensional vector. But it does indicate a direction that can be taken on the surface, and we will see that this direction indicates the path of steepest ascent.
-
-The figure suggests a relationship between the gradient and the tangents to the contour lines.
-Let's parameterize the contour line by $\vec\gamma(t)$, assuming such a parameterization exists; let $C = (x,y) = \vec\gamma(t)$, for some $t$, be a point on the level curve; and let $\vec{T} = \vec\gamma'(t)/\|\vec\gamma'(t)\|$ be the unit tangent to the level curve at $C$. Then the directional derivative at $C$ in the direction of $\vec{T}$ must be $0$, as along the level curve the function $f\circ \vec\gamma = c$, a constant. But by the chain rule, this says:
-
-```math
-0 = (c)' = (f\circ\vec\gamma)'(t) = \nabla{f}(\vec\gamma(t)) \cdot \vec\gamma'(t)
-```
-
-That is, the gradient is *orthogonal* to $\vec{\gamma}'(t)$, and hence to the tangent vector $\vec{T}$ and to the level curve at any point, as the dot product is $0$.
-
-
-Now, consider a unit vector $\vec{v}$ describing some direction at $C$, and let $\hat{n} = \nabla{f}(C)/\|\nabla{f}(C)\|$. Since $\hat{n}$ and $\vec{T}$ are orthogonal unit vectors, we can express $\vec{v}$ uniquely as $a\hat{n} + b \vec{T}$ with $a^2 + b^2 = 1$. The directional derivative is then
-
-```math
-\nabla{f} \cdot \vec{v} = \nabla{f} \cdot (a\hat{n} + b \vec{T}) = a \| \nabla{f} \| + b \nabla{f} \cdot \vec{T} = a \| \nabla{f} \|.
-```
-
-This will be largest when $a=1$ and $b=0$.
That is, the direction of greatest ascent is indicated by the gradient. (It is smallest when $a=-1$ and $b=0$, the direction opposite the gradient.)
-
-In practical terms, if standing on a hill, walking in the direction of
-the gradient will go uphill the fastest possible way, while walking along a
-contour will not gain any elevation. The two directions are orthogonal.
-
-
-
-## Other types of compositions and the chain rule
-
-The chain rule we discussed was for a composition of $f:R^n \rightarrow R$ with $\vec\gamma:R \rightarrow R^n$ resulting in a function $f\circ\vec\gamma:R \rightarrow R$. There are other possible compositions.
-
-For example, suppose we have an economic model for beverage consumption based on temperature given by $c(T)$. But temperature depends on geographic location, so may be modeled through a function $T(x,y)$. The composition $c \circ T$ would be a function from $R^2 \rightarrow R$, so it should have partial derivatives with respect to $x$ and $y$ which should be expressible in terms of the derivative of $c$ and the partial derivatives of $T$.
-
-Consider a different situation: say we have $f(x,y)$, a scalar function, but want to consider the position in polar coordinates involving $r$ and $\theta$. We can think directly of $F(r,\theta) = f(r\cdot\cos(\theta), r\cdot\sin(\theta))$, but more generally, we have a function $G(r, \theta)$ that is vector valued: $G(r,\theta) = \langle r\cdot\cos(\theta), r\cdot\sin(\theta) \rangle$ ($G:R^2 \rightarrow R^2$). The composition $F=f\circ G$ is a scalar function of $r$ and $\theta$, and the partial derivatives with respect to these should be expressible in terms of the partial derivatives of $f$ and the partial derivatives of $G$.
-
-Finding the derivative of a composition in terms of the individual pieces involves some form of the chain rule, which will differ depending on the exact circumstances.
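Before detailing the cases, the flavor of these chain rules can be checked numerically. A minimal sketch in plain Julia (no add-on packages; the choice $f(x,y) = xy$ and the polar map are purely illustrative, with all partials computed by hand) compares the chain-rule combination for $\partial(f\circ G)/\partial r$ against a centered finite difference:

```julia
# f(x, y) = x*y, with partials f_x = y and f_y = x computed by hand
f₆(x, y) = x * y
f₆x(x, y) = y
f₆y(x, y) = x

# the polar-coordinate map G(r, θ) and its partials with respect to r
G₆(r, θ) = (r*cos(θ), r*sin(θ))
∂rG₆₁(r, θ) = cos(θ)  # ∂(r cos(θ))/∂r
∂rG₆₂(r, θ) = sin(θ)  # ∂(r sin(θ))/∂r

r, θ = 1.5, 0.8
x, y = G₆(r, θ)

# chain rule: ∂(f∘G)/∂r = f_x ⋅ ∂G₁/∂r + f_y ⋅ ∂G₂/∂r
chain = f₆x(x, y) * ∂rG₆₁(r, θ) + f₆y(x, y) * ∂rG₆₂(r, θ)

# centered finite difference of r -> (f∘G)(r, θ), for comparison
h = 1e-6
fd = (f₆(G₆(r + h, θ)...) - f₆(G₆(r - h, θ)...)) / 2h

chain, fd  # both equal r*sin(2θ), as (f∘G)(r, θ) = r²cos(θ)sin(θ)
```

The partial in $\theta$ can be checked the same way; the sections below give the general formulas.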
-
-
-### Chain rule for a univariate function composed with a scalar function
-
-If $f(t)$ is a univariate function and $G(x,y)$ a scalar function, then $F(x,y) = f(G(x,y))$ will be a scalar function and may have partial derivatives. If $f$ and $G$ are differentiable at a point $P$, then
-
-```math
-\frac{\partial F}{\partial x} = f'(G(x,y)) \frac{\partial G}{\partial x}, \quad
-\frac{\partial F}{\partial y} = f'(G(x,y)) \frac{\partial G}{\partial y},
-```
-
-and
-
-```math
-\nabla{F} = \nabla{(f \circ G)} = f'(G(x,y)) \nabla{G}(x,y).
-```
-
-The result is an immediate application of the univariate chain rule, when the partial functions are considered.
-
-##### Example
-
-Imagine a scenario where sales of some commodity (say ice) depend on the temperature, which in turn depends on location. Formally, we might have functions $S(T)$ and $T(x,y)$, and then sales would be the composition $S(T(x,y))$. How might sales go up or down if one moved west, or moved in the northwest direction? These are *directional* derivatives, answered by $\nabla{(S\circ T)}\cdot \hat{v}$, where $\hat{v}$ is a unit vector in the given direction. Of importance would be to compute $\nabla{(S\circ T)}$, which might best be done through the chain rule.
-
-For example, if $S(T) = \exp((T - 70)/10)$ and $T(x,y) = (1-x^2)\cdot y$, the gradient of $S(T(x,y))$ would be given by:
-
-```math
-S'(T(x,y)) \nabla{T}(x,y) = (S(T(x,y))/10) \langle -2xy, 1-x^2 \rangle.
-```
-
-
-
-
-### Chain rule for a scalar function, $f$, composed with a function $G: R^m \rightarrow R^n$.
-
-If $G(u_1, \dots, u_m) = \langle G_1, G_2,\dots, G_n\rangle$ is a function of $m$ inputs that returns $n$ outputs, we may view it as $G: R^m \rightarrow R^n$. The composition with a scalar function $f(v_1, v_2, \dots, v_n)=z$ from $R^n \rightarrow R$ creates a scalar function from $R^m \rightarrow R$, so the question of partial derivatives is of interest.
We have:
-
-```math
-\frac{\partial (f \circ G)}{\partial u_i} =
-\frac{\partial f}{\partial v_1} \frac{\partial G_1}{\partial u_i} +
-\frac{\partial f}{\partial v_2} \frac{\partial G_2}{\partial u_i} + \dots +
-\frac{\partial f}{\partial v_n} \frac{\partial G_n}{\partial u_i}.
-```
-
-The gradient is then:
-
-```math
-\nabla(f\circ G) =
-\frac{\partial f}{\partial v_1} \nabla{G_1} +
-\frac{\partial f}{\partial v_2} \nabla{G_2} + \dots +
-\frac{\partial f}{\partial v_n} \nabla{G_n} = \nabla(f) \cdot \langle \nabla{G_1}, \nabla{G_2}, \dots, \nabla{G_n} \rangle.
-```
-
-The last expression is only suggestive, as it is an abuse of previously used notation: the dot product isn't between vectors of the same type, as the rightmost vector represents a vector of vectors.
-The [Jacobian](https://en.wikipedia.org/wiki/Jacobian_matrix_and_determinant) matrix combines these vectors into a rectangular array, though with the vectors written as *row* vectors. If $G: R^m \rightarrow R^n$, then the Jacobian is the $n \times m$ matrix
-with $(i,j)$ entry given by $\partial G_i/\partial u_j$:
-
-```math
-J = \left[
-\begin{align}
-\frac{\partial G_1}{\partial u_1} & \frac{\partial G_1}{\partial u_2} & \dots & \frac{\partial G_1}{\partial u_m}\\
-\frac{\partial G_2}{\partial u_1} & \frac{\partial G_2}{\partial u_2} & \dots & \frac{\partial G_2}{\partial u_m}\\
-& \vdots & \\
-\frac{\partial G_n}{\partial u_1} & \frac{\partial G_n}{\partial u_2} & \dots & \frac{\partial G_n}{\partial u_m}
-\end{align}
-\right].
-```
-
-With this notation and matrix multiplication, we have $(\nabla(f\circ G))^t = \nabla(f)^t J$.
-
-
-(Later, we will see that the chain rule in general has a familiar form using matrices, not vectors, which will avoid the need for a transpose.)
-
-##### Example
-
-Let $f(x,y) = x^2 + y^2$ be a scalar function. If $G(r, \theta) = \langle r\cos(\theta), r\sin(\theta) \rangle$, then after simplification we have $(f \circ G)(r, \theta) = r^2$.
Clearly then $\partial(f\circ G)/\partial r = 2r$ and $\partial(f\circ G)/\partial \theta = 0$.
-
-Were this computed through the chain rule, we would have:
-
-```math
-\begin{align}
-\nabla G_1 &= \langle \frac{\partial r\cos(\theta)}{\partial r}, \frac{\partial r\cos(\theta)}{\partial \theta} \rangle =
-\langle \cos(\theta), -r \sin(\theta) \rangle,\\
-\nabla G_2 &= \langle \frac{\partial r\sin(\theta)}{\partial r}, \frac{\partial r\sin(\theta)}{\partial \theta} \rangle =
-\langle \sin(\theta), r \cos(\theta) \rangle.
-\end{align}
-```
-
-We have $\partial f/\partial x = 2x$ and $\partial f/\partial y = 2y$, which at $G$ are $2r\cos(\theta)$ and $2r\sin(\theta)$, so by the chain rule we should have
-
-```math
-\begin{align}
-\frac{\partial (f\circ G)}{\partial r} &=
-\frac{\partial{f}}{\partial{x}}\frac{\partial G_1}{\partial r} +
-\frac{\partial{f}}{\partial{y}}\frac{\partial G_2}{\partial r} =
-2r\cos(\theta) \cos(\theta) + 2r\sin(\theta) \sin(\theta) =
-2r (\cos^2(\theta) + \sin^2(\theta)) = 2r, \\
-\frac{\partial (f\circ G)}{\partial \theta} &=
-\frac{\partial f}{\partial x}\frac{\partial G_1}{\partial \theta} +
-\frac{\partial f}{\partial y}\frac{\partial G_2}{\partial \theta} =
-2r\cos(\theta)(-r\sin(\theta)) + 2r\sin(\theta)(r\cos(\theta)) = 0.
-\end{align}
-```
-
-
-## Higher order partial derivatives
-
-If $f:R^n \rightarrow R$, then $\partial f/\partial x_i$ also takes $R^n \rightarrow R$, so it too may have partial derivatives.
-
-Consider the case $f: R^2 \rightarrow R$. There are $4$ possible partial derivatives of order $2$: the partial in $x$ then $x$, the partial in $x$ then $y$, the partial in $y$ then $x$, and, finally, the partial in $y$ then $y$.
-
-
-The notation for the partial in $y$ *of* the partial in $x$ is:
-
-```math
-\frac{\partial^2 f}{\partial{y}\partial{x}} = \frac{\partial{\frac{\partial{f}}{\partial{x}}}}{\partial{y}} = \frac{\partial f_x}{\partial{y}} = f_{xy}.
-```
-
-The placement of $x$ and $y$ indicating the order is different in the two notations.
-
-
-We can compute these for an example easily enough:
-
-```julia; hold=true
-@syms x y
-f(x, y) = exp(x) * cos(y)
-ex = f(x,y)
-diff(ex, x, x), diff(ex, x, y), diff(ex, y, x), diff(ex, y, y)
-```
-
-In `SymPy` the variables to differentiate by are taken from left to right, so `diff(ex, x, y, x)` would first take the partial in $x$, then $y$, and finally $x$.
-
-We see that `diff(ex, x, y)` and `diff(ex, y, x)` are identical. This is not a coincidence, as by [Schwarz's Theorem](https://tinyurl.com/y7sfw9sx) (also known as Clairaut's theorem) this will always be the case under typical assumptions:
-
-> Theorem on mixed partials. If the mixed partials $\partial^2 f/\partial x \partial y$ and $\partial^2 f/\partial y \partial x$ exist and are continuous, then they are equal.
-
-For higher order mixed partials, something similar to Schwarz's theorem still holds. Say $f:R^n \rightarrow R$ is $C^k$ if $f$ is continuous and all partial derivatives of order $j \leq k$ are continuous. If $f$ is $C^k$, and $k=k_1+k_2+\cdots+k_n$ ($k_i \geq 0$), then
-
-```math
-\frac{\partial^k f}{\partial x_1^{k_1} \partial x_2^{k_2} \cdots \partial x_n^{k_n}}
-```
-
-is uniquely defined. That is, the order in which the partial derivatives are taken is unimportant if the function is sufficiently smooth.
-
-----
-
-The [Hessian](https://en.wikipedia.org/wiki/Hessian_matrix) matrix is the matrix of mixed partials defined (for $n=2$) by:
-
-```math
-H = \left[
-\begin{align}
-\frac{\partial^2 f}{\partial x \partial x} & \frac{\partial^2 f}{\partial x \partial y}\\
-\frac{\partial^2 f}{\partial y \partial x} & \frac{\partial^2 f}{\partial y \partial y}
-\end{align}
-\right].
-```
-
-For symbolic expressions, the Hessian may be computed directly in `SymPy` with its `hessian` function:
-
-```julia
-ex
-```
-
-```julia;
-hessian(ex, (x, y))
-```
-
-When the mixed partials are continuous, this will be a symmetric matrix. The Hessian matrix plays the role of the second derivative in the multivariate Taylor theorem.
-
-
-For numeric use, `ForwardDiff` has a `hessian` function. It expects a scalar function and a point and returns the Hessian matrix. For $f(x,y) = e^x\cos(y)$ at the point $(1,2)$, the Hessian matrix is:
-
-```julia; hold=true
-f(x,y) = exp(x) * cos(y)
-f(v) = f(v...)
-pt = [1, 2]
-
-ForwardDiff.hessian(f, pt) # symmetric
-```
-
-
-## Questions
-
-###### Question
-
-Consider the graph of a function $z= f(x,y)$ presented below:
-
-```julia; hold=true; echo=false
-f(x,y) = x * exp(-(x^2 + y^2))
-xs = ys = range(-2, stop=2, length=50)
-surface(xs, ys, f)
-```
-
-From the graph, is the value of $f(1/2, 1)$ positive or negative?
-
-```julia; hold=true; echo=false
-choices = ["positive", "negative"]
-answ = 1
-radioq(choices, answ, keep_order=true)
-```
-
-
-On which line is the function $0$?
-
-```julia; hold=true; echo=false
-choices = [
-L"The line $x=0$",
-L"The line $y=0$"
-]
-answ = 1
-radioq(choices, answ, keep_order=true)
-```
-
-Consider the contour plot:
-
-```julia; hold=true; echo=false
-f(x,y) = x * exp(-(x^2 + y^2))
-xs = ys = range(-2, stop=2, length=50)
-contour(xs, ys, f)
-```
-
-What is the value of $f(1, 0)$?
-
-```julia; hold=true; echo=false
-val = 0.367879
-numericq(val, 1/2)
-```
-
-From this graph, the minimum value over this region is
-
-```julia; hold=true; echo=false
-choices = [
-L"is around $(-0.7, 0)$ and with a value less than $-0.4$",
-L"is around $(0.7, 0)$ and with a value less than $-0.4$",
-L"is around $(-2.0, 0)$ and with a value less than $-0.4$",
-L"is around $(2.0, 0)$ and with a value less than $-0.4$"
-]
-answ = 1
-radioq(choices, answ)
-```
-
-From this graph, where is the surface steeper?
-
-```julia; hold=true; echo=false
-choices = [
-L"near $(1/4, 0)$",
-L"near $(1/2, 0)$",
-L"near $(3/4, 0)$",
-L"near $(1, 0)$"
-]
-answ = 1
-radioq(choices, answ, keep_order=true)
-```
-
-###### Question
-
-Consider the contour graph of a function below:
-
-```julia; hold=true; echo=false
-f(x,y)= sin(x)*cos(x*y)
-xs = ys = range(-3, stop=3, length=50)
-contour(xs, ys, f)
-```
-
-Are there any peaks or valleys (local extrema) indicated?
-
-```julia; hold=true; echo=false
-choices = [
-L"Yes, the closed loops near $(-1.5, 0)$ and $(1.5, 0)$ will contain these",
-L"No, the vertical lines parallel to $x=0$ show this function to be flat"
-]
-answ = 1
-radioq(choices, answ)
-```
-
-Imagine hiking on this surface within this region. Could you traverse from left to right without having to go up or down?
-
-```julia; hold=true; echo=false
-yesnoq(false)
-```
-
-Imagine hiking on this surface within this region. Could you traverse from top to bottom without having to go up or down?
-
-```julia; hold=true; echo=false
-yesnoq(true)
-```
-
-
-
-
-###### Question
-
-The figure (taken from [openstreetmap.org](https://www.openstreetmap.org/way/537938655#map=15/46.5308/10.4556&layers=C)) shows the [Stelvio](https://en.wikipedia.org/wiki/Stelvio_Pass) Pass in Northern Italy near the Swiss border.
-
-```julia; hold=true; echo=false
-ImageFile(:differentiable_vector_calculus, "figures/stelvio-pass.png", "Stelvio Pass")
-```
-
-The road through the pass (on the right) makes a series of switchbacks.
-
-Are these switchbacks:
-
-```julia; hold=true; echo=false
-choices = [
-"running essentially parallel to the contour lines",
-"running essentially perpendicular to the contour lines"
-]
-answ = 1
-radioq(choices, answ)
-```
-
-Why?
-
-```julia; hold=true; echo=false
-choices = [
-"By being essentially parallel, the steepness of the roadway can be kept to a passable level",
-"By being essentially perpendicular, the road can more quickly climb up the mountain"
-]
-answ = 1
-radioq(choices, answ)
-```
-
-The pass is at about 2700 meters. Towards the top and bottom of the figure the contour lines show increasing heights; to the left and right, decreasing heights. The shape of the [pass](https://en.wikipedia.org/wiki/Mountain_pass) would look like:
-
-```julia; hold=true; echo=false
-choices = [
-"A saddle-like shape, called a *col* or *gap*",
-"An upside-down bowl-like shape, like the top of a mountain"
-]
-answ = 1
-radioq(choices, answ)
-```
-
-###### Question
-
-Limits of scalar functions have the same set of rules as limits of univariate functions. These include limits of constants; limits of sums, differences, and scalar multiples; limits of products; and limits of ratios, the latter with the provision that division by $0$ does not occur at the point in question.
-
-Using these, identify any points where the following limit *may* not exist, knowing the limits of the individual functions exist at $\vec{c}$:
-
-```math
-\lim_{\vec{x} \rightarrow \vec{c}} \frac{af(\vec{x})g(\vec{x}) + bh(\vec{x})}{ci(\vec{x})}.
-```
-
-```julia; hold=true; echo=false
-choices = [
-L"When $i(\vec{x}) = 0$",
-L"When any of $f(\vec{x})$, $g(\vec{x})$, or $i(\vec{x})$ are zero",
-L"The limit exists everywhere, as the functions $f$, $g$, $h$, and $i$ have limits at $\vec{c}$ by assumption"
-]
-answ = 1
-radioq(choices, answ)
-```
-
-###### Question
-
-Let $f(x,y) = (x^2 - y^2) /(x^2 + y^2)$.
-
-Fix $y=0$. What is $\lim_{x \rightarrow 0} f(x,0)$?
-
-```julia; hold=true; echo=false
-numericq(1)
-```
-
-Fix $x=0$. What is $\lim_{y \rightarrow 0} f(0, y)$?
-
-```julia; hold=true; echo=false
-numericq(-1)
-```
-
-The two paths technique shows a limit does not exist by finding two paths with *different* limits as $\vec{x}$ approaches $\vec{c}$. Does this apply to
-$\lim_{\langle x,y\rangle \rightarrow\langle 0, 0 \rangle}f(x,y)$?
-
-```julia; hold=true; echo=false
-yesnoq(true)
-```
-
-
-###### Question
-
-Let $f(x,y) = \langle \sin(x)\cos(2y), \sin(2x)\cos(y) \rangle$.
-
-Compute $f_x$:
-
-```julia; hold=true; echo=false
-choices = [
-raw"`` \langle \cos(x)\cos(2y), 2\cos(2x)\cos(y)\rangle``",
-raw"`` \langle \cos(2y), \cos(y) \rangle``",
-raw"`` \langle \sin(x), \sin(2x) \rangle``",
-raw"`` \sin(x)\cos(2y)``"
-]
-answ = 1
-radioq(choices, answ)
-```
-
-Compute $f_y$:
-
-```julia; hold=true; echo=false
-choices = [
-raw"`` \langle -2\sin(x)\sin(2y), -\sin(2x)\sin(y) \rangle``",
-raw"`` \langle 2\sin(x), \sin(2x) \rangle``",
-raw"`` \langle -2\sin(2y), -\sin(y) \rangle``",
-raw"`` - \sin(2x)\sin(y)``"
-]
-answ = 1
-radioq(choices, answ)
-```
-
-
-
-###### Question
-
-Let $f(x,y) = x^{y\sin(xy)}$. Using `ForwardDiff`, at the point $(1/2, 1/2)$, compute the following.
-
-The value of $f_x$:
-
-```julia; hold=true; echo=false
-f(x,y) = x^(y*sin(x*y))
-f(v) = f(v...)
-pt = [1/2, 1/2]
-fx, fy = ForwardDiff.gradient(f, pt)
-numericq(fx)
-```
-
-The value of $\partial{f}/\partial{y}$:
-
-```julia; hold=true; echo=false
-f(x,y) = x^(y*sin(x*y))
-f(v) = f(v...)
-pt = [1/2, 1/2]
-fx, fy = ForwardDiff.gradient(f, pt)
-numericq(fy)
-```
-
-###### Question
-
-
-Let $z = f(x,y)$ have gradient $\langle f_x, f_y \rangle$.
-
-The gradient is:
-
-```julia; hold=true; echo=false
-choices = ["two dimensional", "three dimensional"]
-answ = 1
-radioq(choices, answ, keep_order=true)
-```
-
-The surface is:
-
-```julia; hold=true; echo=false
-choices = ["two dimensional", "three dimensional"]
-answ = 2
-radioq(choices, answ, keep_order=true)
-```
-
-The gradient points in the direction of greatest increase of $f$. If a person were on a hill described by $z=f(x,y)$, what three dimensional vector would they follow to go the steepest way up the hill?
-
-```julia; hold=true; echo=false
-choices = [
-    raw"`` \langle f_x, f_y, -1 \rangle``",
-    raw"`` \langle -f_x, -f_y, 1 \rangle``",
-    raw"`` \langle f_x, f_y \rangle``"
-]
-answ = 1
-radioq(choices, answ)
-```
-
-###### Question
-
-The figure shows climbers on their way to summit Mt. Everest:
-
-```julia; hold=true; echo=false
-imgfile = "figures/everest.png"
-caption = "Climbers en route to the summit of Mt. Everest"
-ImageFile(:differentiable_vector_calculus, imgfile, caption)
-```
-
-If the surface of the mountain is given by a function $z=f(x,y)$, then the climbers move along a single path parameterized, say, by $\vec{\gamma}(t) = \langle x(t), y(t)\rangle$, as set up by the Sherpas.
-
-Consider the composition $(f\circ\vec\gamma)(t)$.
-
-For a climber with GPS coordinates $(x,y)$, what describes her elevation?
-
-```julia; hold=true; echo=false
-choices = [
-    raw"`` f(x,y)``",
-    raw"`` (f\circ\vec\gamma)(x,y)``",
-    raw"`` \vec\gamma(x,y)``"
-]
-answ = 1
-radioq(choices, answ)
-```
-
-A climber leaves base camp at $t_0$. At time $t > t_0$, what describes her elevation?
-
-```julia; hold=true; echo=false
-choices = [
-    raw"`` (f\circ\vec\gamma)(t)``",
-    raw"`` \vec\gamma(t)``",
-    raw"`` f(t)``"
-]
-answ = 1
-radioq(choices, answ)
-```
-
-What does the vector-valued function $\vec{r}(t) = \langle x(t), y(t), (f\circ\vec\gamma)(t)\rangle$ describe?
-
-```julia; hold=true; echo=false
-choices = [
-    "The three dimensional position of the climber",
-    "The climber's gradient, pointing in the direction of greatest ascent"
-]
-answ = 1
-radioq(choices, answ)
-```
-
-In the figure, the climbers are making a switchback, so as to avoid the steeper direct ascent. Mathematically, $\nabla{f}(\vec\gamma(t)) \cdot \vec\gamma'(t)$ describes the directional derivative that they follow. Using $\vec{u}\cdot\vec{v} = \|\vec{u}\|\|\vec{v}\|\cos(\theta)$, does this route:
-
-```julia; hold=true; echo=false
-choices = [
-    L"Keep $\cos(\theta)$ smaller than $1$, so that the slope taken is not too great",
-    L"Keep $\cos(\theta)$ as close to $1$ as possible, so the slope taken is as big as possible",
-    L"Keep $\cos(\theta)$ as close to $0$ as possible, so that the climbers don't waste energy going up and down"
-]
-answ = 1
-radioq(choices, answ)
-```
-
-Suppose our climber reaches the top at time $t$. What would be $(f\circ\vec\gamma)'(t)$, assuming the derivative exists?
-
-```julia; hold=true; echo=false
-choices = [
-    L"It would be $0$, as the top would be a maximum for $f\circ\vec\gamma$",
-    L"It would be $\langle f_x, f_y\rangle$ and point towards the sky, the direction of greatest ascent",
-    "It would not exist, as there would not be enough oxygen to compute it"
-]
-answ = 1
-radioq(choices, answ)
-```
-
-###### Question
-
-Building sustainable hiking trails involves proper water management. Two rules of thumb are: 1) the trail should not be steeper than $10$ degrees; 2) the outward slope (in the steepest downhill direction) should be around 5%.
(A trail tread is not flat, but rather sloped downward, similar to the crown on a road, so that water runs to the downhill side of the tread rather than along it, which would cause erosion. In the best possible world, the outslope will exceed the downward slope.)
-
-Suppose a trail height is described parametrically by a composition $(f \circ \vec\gamma)(t)$, where $\vec\gamma(t) = \langle x(t),y(t)\rangle$. The vector $\vec{T}(t) = \langle x'(t), y'(t), \nabla{f}(\vec\gamma(t))\cdot\vec\gamma'(t) \rangle$ describes the tangent to the trail at the point $\vec\gamma(t)$. Let $\hat{T}(t)$ be the corresponding unit vector, and $\hat{P}(t)$ be a unit vector in the direction of the *projection* of $\vec{T}$ onto the $x$-$y$ plane. (Make the third component of $\vec{T}$ $0$, and then form a unit vector from that.)
-
-What expression below captures rule 1), that the steepness should be no more than $10$ degrees ($\pi/18$ radians)?
-
-```julia; hold=true; echo=false
-choices = [
-    raw"`` |\hat{T} \cdot \hat{P}| \geq \cos(\pi/18)``",
-    raw"`` |\hat{T} \cdot \hat{P}| \leq \sin(\pi/18)``",
-    raw"`` |\hat{T} \cdot \hat{P}| \leq \pi/18``"
-]
-answ = 1
-radioq(choices, answ)
-```
-
-The normal to the surface $z=f(x,y)$ is *not* the normal to the trail tread. Suppose $\vec{N}(t)$ is a function that returns the normal to the tread. At the same point $\vec\gamma(t)$, let $\vec{M} = \langle -f_x, -f_y, 0\rangle$ be a vector in $3$ dimensions pointing downhill. Let "hats" indicate unit vectors. The outward slope is $\pi/2$ minus the angle between $\hat{N}$ and $\hat{M}$. What condition will ensure this outward slope is at most $5$ degrees ($\pi/36$ radians)?
-
-```julia; hold=true; echo=false
-choices = [
-    raw"`` |\hat{N} \cdot \hat{M}| \leq \cos(\pi/2 - \pi/36)``",
-    raw"`` |\hat{N} \cdot \hat{M}| \leq \sin(\pi/2 - \pi/18)``",
-    raw"`` |\hat{N} \cdot \hat{M}| \leq \pi/2 - \pi/18``"
-]
-answ = 1
-radioq(choices, answ)
-```
-
-
-
-
-
-
-
-###### Question
-
-Let $f(x,y) = x^2 \cdot \cos(x - y^2)$. Let $\vec{v} = \langle 1, 2\rangle$.
Find the directional derivative in the direction of $\vec{v}$ at the point $(1,2)$.
-
-```julia; hold=true; echo=false
-choices = [
-    raw"`` \frac{\sqrt{5}}{5}\left(2 \cos{\left (3 \right )} - 7 \sin{\left (3 \right )}\right)``",
-    raw"`` 2 \cos{\left (3 \right )} - 7 \sin{\left (3 \right )}``",
-    raw"`` 4 x^{2} y \sin{\left (x - y^{2} \right )} - x^{2} \sin{\left (x - y^{2} \right )} + 2 x \cos{\left (x - y^{2} \right )}``"
-]
-answ = 1
-radioq(choices, answ)
-```
-
-
-
-
-###### Question
-
-Let $\vec{v}$ be any non-zero vector. Does $\nabla{f}(\vec{x})\cdot\vec{v}$ give the rate of increase of $f$ per unit of distance in the direction of $\vec{v}$?
-
-```julia; hold=true; echo=false
-choices = [
-    "Yes, by definition",
-    L"No, not unless $\vec{v}$ were a unit vector"
-]
-answ = 2
-radioq(choices, answ)
-```
-
-###### Question
-
-
-Let $f(x,y,z) = x^4 + 2xz + 2xy + y^4$ and $\vec\gamma(t) = \langle t, t^2, t^3\rangle$. Using the chain rule, compute $(f\circ\vec\gamma)'(t)$.
-
-
-The value of $\nabla{f}(x,y,z)$ is:
-
-```julia; hold=true; echo=false
-choices = [
-    raw"`` \langle 4x^3 + 2z + 2y, 2x + 4y^3, 2x \rangle``",
-    raw"`` \langle 4x^3, 2z, 2y\rangle``",
-    raw"`` \langle x^3 + 2x + 2x, 2y+ y^3, 2x\rangle``"
-]
-answ = 1
-radioq(choices, answ)
-```
-
-The value of $\vec\gamma'(t)$ is:
-
-```julia; hold=true; echo=false
-choices = [
-    raw"`` \langle 1, 2t, 3t^2\rangle``",
-    raw"`` 1 + 2t + 3t^2``",
-    raw"`` \langle 1,2, 3 \rangle``"
-]
-answ = 1
-radioq(choices, answ)
-```
-
-The value of $(f\circ\vec\gamma)'(t)$ is found by:
-
-```julia; hold=true; echo=false
-choices = [
-    L"Taking the dot product of $\nabla{f}(\vec\gamma(t))$ and $\vec\gamma'(t)$",
-    L"Taking the dot product of $\nabla{f}(\vec\gamma'(t))$ and $\vec\gamma(t)$",
-    L"Taking the dot product of $\nabla{f}(x,y,z)$ and $\vec\gamma'(t)$"
-]
-answ = 1
-radioq(choices, answ)
-```
-
-
-###### Question
-
-Let $z = f(x,y)$ be some unknown function.
-
-From the figure, which drawn vector is the gradient at $(1/2, -3/4)$?

```julia; hold=true; echo=false
f(x,y) = 2 + x^2 - y^2
f(v) = f(v...)
pt = [1/2, -3/4]
xs = ys = range(-1, stop=1, length=50)
uvec(x) = x/norm(x)

gradf = ForwardDiff.gradient(f, pt)
surface(xs, ys, f, legend=false) #, aspect_ratio=:equal)
arrow!([pt...,0], [uvec(gradf)...,0], color=:blue, linewidth=3)
arrow!([pt...,0], [-1,0,0], color=:green, linewidth=3)
arrow!([pt..., f(pt...)], uvec([(-gradf)..., 1]), color=:red, linewidth=3)
```


```julia; hold=true; echo=false
choices = [
    "The blue one",
    "The green one",
    "The red one"
]
answ = 1
radioq(choices, answ)
```


From the figure, which drawn vector is the gradient at $(1/2, -3/4)$?

```julia; hold=true; echo=false
f(x,y) = 2 + x^2 - y^2
f(v) = f(v...)

uvec(v)=v/norm(v)
pt = [1/2, -3/4]
gradf = ForwardDiff.gradient(f, pt)
xs = ys = range(-3/2, stop=3/2, length=50)

contour(xs, ys, f, aspect_ratio=:equal)
arrow!(pt, [uvec(gradf)...], color=:blue, linewidth=3)
arrow!(pt, [-1, 0], color=:green, linewidth=3)
arrow!([0,0], pt, color=:red, linewidth=3)
```

```julia; hold=true; echo=false
choices = [
    "The blue one",
    "The green one",
    "The red one"
]
answ = 1
radioq(choices, answ)
```




###### Question

For a function $f(x,y)$ and a point (as a vector, $\vec{c}$) we consider this derived function:

```math
g(\vec{x}) = f(\vec{c}) + \nabla{f}(\vec{c}) \cdot(\vec{x} - \vec{c}) + \frac{1}{2}(\vec{x} - \vec{c})^tH(\vec{c})(\vec{x} - \vec{c}),
```

where $H(\vec{c})$ is the Hessian.

Further, *suppose* $\nabla{f}(\vec{c}) = \vec{0}$, so in fact:

```math
g(\vec{x}) = f(\vec{c}) + \frac{1}{2}(\vec{x} - \vec{c})^tH(\vec{c})(\vec{x} - \vec{c}).
```

If $f$ is a linear function at $\vec{c}$, what does this say about $g$?

```julia; hold=true; echo=false
choices = [
    L"Linear means $H$ is the $0$ matrix, so $g(\vec{x})$ is the constant $f(\vec{c})$",
    L"Linear means $H$ is linear, so $g(\vec{x})$ describes a plane",
    L"Linear means $H$ is the $0$ matrix, so the gradient couldn't have been $\vec{0}$"
]
answ = 1
radioq(choices, answ)
```

Suppose $H$ has the magic property that $\vec{v}^tH\vec{v} > 0$ for *any* non-zero vector $\vec{v}$. What does this imply:

```julia; hold=true; echo=false
choices = [
    L"That $g(\vec{x}) \geq f(\vec{c})$",
    L"That $g(\vec{x}) = f(\vec{c})$",
    L"That $g(\vec{x}) \leq f(\vec{c})$"
]
answ = 1
radioq(choices, answ, keep_order=true)
```

###### Question

Let $f(x,y) = x^3y^3$. Which partial derivative is identically $0$?

```julia; hold=true; echo=false
choices = [
    raw"`` \partial^4{f}/\partial{x^4}``",
    raw"`` \partial^4{f}/\partial{x^3}\partial{y}``",
    raw"`` \partial^4{f}/\partial{x^2}\partial{y^2}``",
    raw"`` \partial^4{f}/\partial{x^1}\partial{y^3}``"
]
answ = 1
radioq(choices, answ, keep_order=true)
```

###### Question

Let $f(x,y) = 3x^2 y$.

Which value is greater at the point $(1/2,2)$?

```julia; hold=true; echo=false
choices = [
    raw"`` f_x``",
    raw"`` f_y``",
    raw"`` f_{xx}``",
    raw"`` f_{xy}``",
    raw"`` f_{yy}``"
]
x,y = 1/2, 2
val, answ = findmax([6x*y, 3x^2, 6*y, 6x, 0])
radioq(choices, answ, keep_order=true)
```

###### Question

The order of partial derivatives matters if the mixed partials are not continuous. Take

```math
f(x,y) = \frac{xy ( x^2 - y^2)}{x^2 + y^2}, \quad f(0,0) = 0.
```

Using the definition of the derivative from a limit, we have

```math
\frac{\partial \frac{\partial f}{\partial x}}{\partial y} =
\lim_{\Delta y \rightarrow 0} \lim_{\Delta x \rightarrow 0}
\frac{f(x+\Delta x, y + \Delta y) - f(x, y+\Delta{y}) - f(x+\Delta x,y) + f(x,y)}{\Delta x \Delta y}.
-``` - -Whereas, - -```math -\frac{\partial \frac{\partial f}{\partial y}}{ \partial x} = -\lim_{\Delta x \rightarrow 0} \lim_{\Delta y \rightarrow 0} -\frac{f(x+\Delta x, y + \Delta y) - f(x, y+\Delta{y}) - f(x+\Delta x,y) + f(x,y)}{\Delta x \Delta y}. -``` - -At $(0,0)$ what is $ \frac{\partial \frac{\partial f}{\partial x}}{ \partial y}$? - -```julia; hold=true; echo=false -answ = -1 -numericq(answ) -``` - -At $(0,0)$ what is $ \frac{\partial \frac{\partial f}{\partial y}}{ \partial x}$? - -```julia; hold=true; echo=false -answ = 1 -numericq(answ) -``` - -Away from $(0,0)$ the mixed partial is $\frac{x^{6} + 9 x^{4} y^{2} - 9 x^{2} y^{4} - y^{6}}{x^{6} + 3 x^{4} y^{2} + 3 x^{2} y^{4} + y^{6}}$. - -```julia; hold=true; echo=false -choices = [ - "As this is the ratio of continuous functions, it is continuous at the origin", - L"This is not continuous at $(0,0)$, still the limit along the two paths $x=0$ and $y=0$ are equivalent.", - L"This is not continuous at $(0,0)$, as the limit along the two paths $x=0$ and $y=0$ are not equivalent." -] -answ = 3 -radioq(choices, answ) -``` - -###### Question - -[Knill](http://www.math.harvard.edu/~knill/teaching/summer2018/handouts/week3.pdf). Clairaut's theorem is the name given to the fact that if the partial derivatives are continuous, the mixed partials are equal, $f_{xy} = f_{yx}$. - -Consider the following code which computes the mixed partials for the discrete derivative: - -```julia; hold=true; -@syms x::real y::real Δ::real G() - -Dx(f,h) = (subs(f, x=>x+h) - f)/h -Dy(f,h) = (subs(f, y=>y+h) - f)/h - -Dy(Dx(G(x,y), Δ), Δ) - Dx(Dy(G(x,y), Δ), Δ) -``` - -What does this simplify to? - -```julia; hold=true; echo=false -numericq(0) -``` - - - - -Is continuity required for this to be true? - -```julia; hold=true; echo=false -yesnoq(false) -``` - -###### Question - -(Examples and descriptions from Krill) - - -What equation does the function $f(x,y) = x^3 - 3xy^2$ satisfy? 

```julia; echo=false
# 4 questions, don't edit this order!
ode_choices = [
L"The wave equation: $f_{tt} = f_{xx}$; governs motion of light or sound",
L"The heat equation: $f_t = f_{xx}$; describes diffusion of heat",
L"The Laplace equation: $f_{xx} + f_{yy} = 0$; determines shape of a membrane",
L"The advection equation: $f_t = f_x$; is used to model transport in a wire",
L"The eiconal equation: $f_x^2 + f_y^2 = 1$; is used to model evolution of a wave front in optics",
L"The Burgers equation: $f_t + ff_x = f_{xx}$; describes waves at the beach which break",
L"The KdV equation: $f_t + 6ff_x+ f_{xxx} = 0$; models water waves in a narrow channel",
L"The Schrodinger equation: $f_t = (i\hbar/(2m))f_{xx}$; used to describe a quantum particle of mass $m$"
]
answ′ = 3
radioq(ode_choices, answ′, keep_order=true)
```




What equation does the function $f(t, x) = \sin(x-t) + \sin(x+t)$ satisfy?

```julia; hold=true; echo=false
answ = 1
radioq(ode_choices, answ, keep_order=true)
```




What equation does the function $f(t, x) = e^{-(x+t)^2}$ satisfy?

```julia; hold=true; echo=false
answ = 4
radioq(ode_choices, answ, keep_order=true)
```


What equation does the function $f(x,y) = \cos(x) + \sin(y)$ satisfy?

```julia; hold=true; echo=false
answ = 5
radioq(ode_choices, answ, keep_order=true)
```
diff --git a/CwJ/differentiable_vector_calculus/scalar_functions_applications.jmd b/CwJ/differentiable_vector_calculus/scalar_functions_applications.jmd
deleted file mode 100644
index 2a3edd8..0000000
--- a/CwJ/differentiable_vector_calculus/scalar_functions_applications.jmd
+++ /dev/null
@@ -1,2053 +0,0 @@
# Applications with scalar functions


This section uses these add-on packages:

```julia
using CalculusWithJulia
using Plots
using SymPy
using Roots
```

And the following from the `Contour` package:

```julia
import Contour: contours, levels, level, lines, coordinates
```

```julia; echo=false; results="hidden"
using CalculusWithJulia.WeaveSupport

const frontmatter = (
    title = "Applications with scalar functions",
    description = "Calculus with Julia: Applications with scalar functions",
    tags = ["CalculusWithJulia", "differentiable_vector_calculus", "applications with scalar functions"],
);

nothing
```

This section presents different applications of scalar functions.


## Tangent planes, linearization


Consider the case $f:R^2 \rightarrow R$. We visualize $z=f(x,y)$ through a surface. At a point $(a, b)$, this surface, if $f$ is sufficiently smooth, can be approximated by a flat area, or a plane. For example, the Northern hemisphere of the earth might be modeled simplistically by $z = \sqrt{R^2 - (x^2 + y^2)}$ for some $R$ and with the origin at the earth's core. The ancient view of a "flat earth" can be more generously seen as identifying this tangent plane with the sphere. More apt for current times is the use of GPS coordinates to describe location. The difference between any two coordinates is technically a distance on a curved, nearly spherical, surface.
But if the two points are reasonably close (miles, not tens of miles) and accuracy isn't of utmost importance (i.e., not used for self-driving cars), then the distance can be found from the Euclidean distance formula, $\sqrt{(\Delta\text{latitude})^2 + (\Delta\text{longitude})^2}$. That is, as if the points were on a plane, not a curved surface.

For the univariate case, the tangent line has many different uses. Here we see the tangent plane also does.


### Equation of the tangent plane

The partial derivatives have the geometric view of being the derivative of the univariate functions $f(\vec\gamma_x(t))$ and $f(\vec\gamma_y(t))$, where $\vec\gamma_x$ moves just parallel to the $x$ axis (e.g., $\langle t + a, b\rangle$) and $\vec\gamma_y$ moves just parallel to the $y$ axis. The partial derivatives then are slopes of tangent lines to each curve. The tangent plane, should it exist, should match both slopes at a given point. With this observation, we can identify it.

Consider $f(\vec\gamma_x)$ at a point $(a,b)$. The path has a tangent vector, which has "slope" $\frac{\partial f}{\partial x}$ and points in the direction of the $x$ axis, but not the $y$ axis, as does the vector $\langle 1, 0, \frac{\partial f}{\partial x} \rangle$. Similarly, the vector $\langle 0, 1, \frac{\partial f}{\partial y} \rangle$ describes the tangent line to $f(\vec\gamma_y)$ at the point.


These two vectors will lie in the plane. The normal vector is found by their cross product:

```julia;
@syms f_x f_y
n = [1, 0, f_x] × [0, 1, f_y]
```

Let $\vec{x} = \langle a, b, f(a,b) \rangle$. The tangent plane at $\vec{x}$ then is described by all vectors $\vec{v}$ with $\vec{n}\cdot(\vec{v} - \vec{x}) = 0$.
Using $\vec{v} = \langle x,y,z\rangle$, we have:

```math
[-\frac{\partial f}{\partial x}, -\frac{\partial f}{\partial y}, 1] \cdot [x-a, y-b, z - f(a,b)] = 0,
```

or,

```math
z = f(a,b) + \frac{\partial f}{\partial x} (x-a) + \frac{\partial f}{\partial y} (y-b),
```

which is more compactly expressed as

```math
z = f(a,b) + \nabla{f}(a,b) \cdot \langle x-a, y-b \rangle.
```

This form would then generalize to scalar functions from $R^n \rightarrow R$. This is consistent with the definition of $f$ being differentiable, where $\nabla{f}$ plays the role of the slope in the formulas.


The following figure illustrates the above for the function $f(x,y) = 6 - x^2 - y^2$:

```julia; hold=true
f(x,y) = 6 - x^2 - y^2
f(x) = f(x...)

a,b = 1, -1/2


# draw surface
xr = 7/4
xs = ys = range(-xr, xr, length=100)
surface(xs, ys, f, legend=false)

# visualize tangent plane as 3d polygon
pt = [a,b]
tplane(x) = f(pt) + gradient(f)(pt) ⋅ (x - [a,b])

pts = [[a-1,b-1], [a+1, b-1], [a+1, b+1], [a-1, b+1], [a-1, b-1]]
plot!(unzip([[pt..., tplane(pt)] for pt in pts])...)

# plot paths in x and y direction through (a,b)
γ_x(t) = pt + t*[1,0]
γ_y(t) = pt + t*[0,1]

plot_parametric!((-xr-a)..(xr-a), t -> [γ_x(t)..., (f∘γ_x)(t)], linewidth=3)
plot_parametric!((-xr-b)..(xr-b), t -> [γ_y(t)..., (f∘γ_y)(t)], linewidth=3)

# draw directional derivatives in 3d and normal
pt = [a, b, f(a,b)]
fx, fy = gradient(f)(a,b)
arrow!(pt, [1, 0, fx], linewidth=3)
arrow!(pt, [0, 1, fy], linewidth=3)
arrow!(pt, [-fx, -fy, 1], linewidth=3) # normal

# draw point in base, x-y, plane
pt = [a, b, 0]
scatter!(unzip([pt])...)
arrow!(pt, [1,0,0], linestyle=:dash)
arrow!(pt, [0,1,0], linestyle=:dash)
```

#### Alternate forms

The equation for the tangent plane is often expressed in a more explicit form.
For $n=2$, if we set $dx = x-a$ and $dy=y-b$, then the equation for the plane becomes:

```math
f(a,b) + \frac{\partial f}{\partial x} dx + \frac{\partial f}{\partial y} dy,
```

which is a common form for the equation, though possibly confusing, as $\partial x$ and $dx$ need to be distinguished. For $n > 2$, additional terms follow this pattern. This explicit form is helpful when doing calculations by hand, but much less so when working on the computer, say with `Julia`, as the representations using vectors (or matrices) can be readily implemented and their representation much closer to the formulas. For example, consider these two possible functions to find the tangent plane (returned as a function) at a point in ``2`` dimensions:

```julia;
function tangent_plane_1st_crack(f, pt)
    fx, fy = ForwardDiff.gradient(f, pt)
    x -> f(x...) + fx * (x[1]-pt[1]) + fy * (x[2]-pt[2])
end
```

It isn't so bad, but as written, we specialized to the number of dimensions, used indexing, and with additional dimensions, it clearly would get tedious to generalize. Using vectors, we might have:

```julia;
function tangent_plane(f, pt)
    ∇f = ForwardDiff.gradient(f, pt) # using a variable ∇f
    x -> f(pt) + ∇f ⋅ (x - pt)
end
```

This is much more like the compact formula and able to handle higher dimensions without rewriting.


### Tangent plane for level curves

Consider the surface described by $f(x,y,z) = c$, a constant. This is more general than surfaces described by $z = f(x,y)$. The concept of a tangent plane should still be applicable though. Suppose $\vec{\gamma}(t)$ is a curve in $x$-$y$-$z$ space lying on the surface. Then $(f\circ\vec\gamma)(t)$ has a derivative given by the chain rule through: $\nabla{f}(\vec\gamma(t))\cdot \vec\gamma'(t)$. But this composition is constantly the same value, $c$, so the derivative is $0$. This says that $\nabla{f}(\vec\gamma(t))$ is *orthogonal* to $\vec\gamma'(t)$ for any such curve.
As these tangential vectors to $\vec\gamma$ lie in the tangent plane, the tangent plane can be characterized by having $\nabla{f}$ as the normal.

This computation was previously done in two dimensions, and showed the gradient is orthogonal to the contour lines (and points in the direction of greatest ascent). It can be generalized to higher dimensions.

The surface $F(x,y,z) = z - f(x,y) = 0$ has gradient given by $\langle -\partial{f}/\partial{x}, -\partial{f}/\partial{y}, 1\rangle$, and as seen above, this vector is normal to the tangent plane, so this generalization agrees on the easier case.


For clarity:

* The scalar function $z = f(x,y)$ describes a surface, $(x,y,f(x,y))$; the gradient, $\nabla{f}$, is $2$ dimensional and points in the direction of greatest ascent for the surface.
* The scalar function $f(x,y,z)$ *also* describes a surface, through level curves $f(x,y,z) = c$, for some *constant* $c$. The gradient $\nabla{f}$ is $3$ dimensional and *orthogonal* to the surface.


##### Example

Let $z = f(x,y) = \sin(x)\cos(x-y)$. Find an equation for the tangent plane at $(\pi/4, \pi/3)$.

We have many possible forms to express this in, but we will use the functional description:

```julia
@syms x, y
```

```julia; hold=true
f(x,y) = sin(x) * cos(x-y)
f(x) = f(x...)
vars = [x, y]

gradf = diff.(f(x,y), vars) # or use gradient(f, vars) or ∇((f,vars))

pt = [PI/4, PI/3]
gradfa = subs.(gradf, x=>pt[1], y=>pt[2])

f(pt) + gradfa ⋅ (vars - pt)
```


##### Example

A cylinder $f(x,y,z) = (x-a)^2 + y^2 = a^2$ is intersected with a sphere $g(x,y,z) = x^2 + y^2 + z^2 = (2a)^2$. Let $V$ be the curve of intersection (Viviani's curve). Let $P$ be a point on the curve. Describe the tangent to the curve.

The curve of intersection has a tangent line lying in the tangent plane of both surfaces.
These two surfaces have normal vectors given by the gradient, or $\vec{n}_1 = \langle 2(x-a), 2y, 0 \rangle$ and $\vec{n}_2 = \langle 2x, 2y, 2z \rangle$. The cross product of these two vectors will lie in both tangent planes, so

```math
P + t (\vec{n}_1 \times \vec{n}_2)
```

will describe the tangent.

The curve may be described parametrically by $\vec\gamma(t) = a \langle 1 + \cos(t), \sin(t), 2\sin(t/2) \rangle$. Let's see that the above is correct by verifying that the cross product of the tangent vector computed two ways is $0$:

```julia; hold=true
a = 1
gamma(t) = a * [1 + cos(t), sin(t), 2sin(t/2)]
P = gamma(1/2)
n1(x,y,z) = [2*(x-a), 2y, 0]
n2(x,y,z) = [2x, 2y, 2z]
n1(x) = n1(x...)
n2(x) = n2(x...)

t = 1/2
(n1(gamma(t)) × n2(gamma(t))) × gamma'(t)
```



#### Plotting level curves of $F(x,y,z) = c$

The `wireframe` plot can be used to visualize a surface of the type `z=f(x,y)`, as previously illustrated. However we have no way of plotting $3$-dimensional implicit surfaces (of the type $F(x,y,z)=c$) as we do for $2$-dimensional implicit surfaces with `Plots`. (The `MDBM` or `IntervalConstraintProgramming` packages can be used along with the `Makie` plotting package to produce one.)

The `CalculusWithJulia` package provides a stop-gap function, `plot_implicit_surface`, for this task.
The basic idea is to slice along an axis, by default the $z$ axis, and for each level plot the contours of $(x,y) \rightarrow f(x,y,z)-c$, which becomes a $2$-dimensional problem. The function allows any of 3 different axes to be chosen to slice over, the default being just the $z$ axis.


We demonstrate with an example from a February 14, 2019 article in the [New York Times](https://www.nytimes.com/2019/02/14/science/math-algorithm-valentine.html).
It shows an equation for a "heart," as the graphic will illustrate: - -```julia; hold=true -a, b = 1, 3 -f(x,y,z) = (x^2 + ((1+b) * y)^2 + z^2 - 1)^3 - x^2 * z^3 - a * y^2 * z^3 - -CalculusWithJulia.plot_implicit_surface(f, xlim=-2..2, ylim=-1..1, zlim=-1..2) -``` - - - - -## Linearization - -The tangent plane is the best "linear approximation" to a function at a point. "Linear" refers to mathematical properties of the tangent plane, but at a practical level it means easy to compute, as it will involve only multiplication and addition. "Approximation" is useful in that if a bit of error is an acceptable tradeoff for computational ease, the tangent plane may be used in place of the function. In the univariate case, this is known as linearization, and the tradeoff is widely used in the derivation of theoretical relationships, as well as in practice to get reasonable numeric values. - -Formally, this is saying: - -```math -f(\vec{x}) \approx f(\vec{a}) + ∇f(\vec{a}) ⋅ (\vec{x} - \vec{a}). -``` - -The explicit meaning of $\approx$ will be made clear when the generalization of Taylor's theorem is to be stated. - - -##### Example: Linear approximation - -The volume of a cylinder is $V=\pi r^2 h$. It is thought a cylinder has $r=1$ and $h=2$. If instead, the amounts are $r=1.01, h=2.01$, what is the difference in volume? - -That is, if $V(r,h) = \pi r^2 h$, what is $V(1.01, 2.01) - V(1,2)$? - -We can use linear approximation to see that this difference is *approximately* $\nabla{V} \cdot \langle 0.01, 0.01 \rangle$. This is: - -```julia; -V(r, h) = pi * r^2 * h -V(v) = V(v...) -a₁ = [1,2] -dx₁ = [0.01, 0.01] -ForwardDiff.gradient(V, a₁) ⋅ dx₁ # or use ∇(V)(a) -``` - -The exact difference can be computed: - -```julia; -V(a₁ + dx₁) - V(a₁) -``` - - -##### Example - -Let $f(x,y) = \sin(\pi x y^2)$. Estimate $f(1.1, 0.9)$. 

Using linear approximation with $dx=0.1$ and $dy=-0.1$, this is

```math
f(1,1) + \nabla{f}(1,1) \cdot \langle 0.1, -0.1\rangle,
```

where $f(1,1) = \sin(\pi) = 0$ and $\nabla{f} = \langle \pi y^2\cos(\pi x y^2), 2\pi x y \cos(\pi x y^2)\rangle = \pi\cos(\pi x y^2)\langle y^2, 2xy\rangle$. So, the answer is:

```math
0 + \pi\cos(\pi) \langle 1,2\rangle\cdot \langle 0.1, -0.1 \rangle =
(-\pi)(0.1 - 2(0.1)) = 0.1\pi \approx 0.314.
```

##### Example

A [piriform](http://www.math.harvard.edu/~knill/teaching/summer2011/handouts/32-linearization.pdf) is described by the quartic surface $f(x,y,z) = x^4 -x^3 + y^2+z^2 = 0$. Find the tangent plane at the point $(2,2,2)$.

Here, $\nabla{f}$ describes a *normal* to the tangent plane. A plane may be described by $\hat{N}\cdot(\vec{x} - \vec{x}_0) = 0$, where $\vec{x}_0$ is identified with a point on the plane (the point $(2,2,2)$ here). With this, we have $\hat{N}\cdot\vec{x} = ax + by + cz = \hat{N}\cdot\langle 2,2,2\rangle = 2(a+b+c)$. For this problem, $\nabla{f}(2,2,2) = \langle a, b, c\rangle$ is given by:

```julia;hold=true
f(x,y,z) = x^4 -x^3 + y^2 + z^2
f(v) = f(v...)
a, b, c = ∇(f)(2,2,2)
"$a x + $b y + $c z = $([a,b,c] ⋅ [2,2,2])"
```

### Newton's method to solve $f(x,y) = 0$ and $g(x,y)=0$.


The level curve $f(x,y)=0$ and the level curve $g(x,y)=0$ may intersect. Solving algebraically for the intersection may be difficult in most cases, though the linear case is not. (The linear case being the intersection of two lines.)

To elaborate, consider two linear equations written in a general form:

```math
\begin{align}
ax + by &= u\\
cx + dy &= v
\end{align}
```

A method to solve this by hand would be to solve for $y$ from one equation, replace this expression into the second equation and then solve for $x$. From there, $y$ can be found. A more advanced method expresses the problem in a matrix formulation of the form $Mx=b$ and solves that equation.
This form of solving is implemented in `Julia`, through the "backslash" operator. Here is the general solution:

```julia; hold=true
@syms a b c d u v
M = [a b; c d]
B = [u, v]
M \ B .|> simplify
```

The term $\det(M) = ad-bc$ is important, as evidenced by its appearance in the denominator of each term. When this is zero there is no unique solution, unlike the typical case.



Newton's method to solve for intersection points uses
linearization of the surfaces to replace the problem with the
intersection of level curves of the tangent planes. This is the linear
case that can be readily solved. As with Newton's method for the
univariate case, the new answer is generally a better *approximation*
to the answer, and the process is iterated to get a *good enough*
approximation, as defined through some tolerance.

Consider the functions $f(x,y) = 2 - x^2 - y^2$ and
$g(x,y) = 3 - 2x^2 - (1/3)y^2$. These graphs show their surfaces with the level sets for $c=0$ drawn, and then just the level sets, showing they intersect in ``4`` places.


```julia; hold=true; echo=false
f(x,y) = 2 - x^2 - y^2
g(x,y) = 3 - 2x^2 - (1/3)y^2
xs = ys = range(-3, stop=3, length=100)
zfs = [f(x,y) for x in xs, y in ys]
zgs = [g(x,y) for x in xs, y in ys]


ps = Any[]
pf = surface(xs, ys, f, alpha=0.5, legend=false)

for cl in levels(contours(xs, ys, zfs, [0.0]))
    for line in lines(cl)
        _xs, _ys = coordinates(line)
        plot!(pf, _xs, _ys, 0*_xs, linewidth=3, color=:blue)
    end
end


pg = surface(xs, ys, g, alpha=0.5, legend=false)
for cl in levels(contours(xs, ys, zgs, [0.0]))
    for line in lines(cl)
        _xs, _ys = coordinates(line)
        plot!(pg, _xs, _ys, 0*_xs, linewidth=3, color=:red)
    end
end

pcnt = plot(legend=false)
for cl in levels(contours(xs, ys, zfs, [0.0]))
    for line in lines(cl)
        _xs, _ys = coordinates(line)
        plot!(pcnt, _xs, _ys, linewidth=3, color=:blue)
    end
end

for cl in levels(contours(xs, ys, zgs, [0.0]))
    for line in lines(cl)
        _xs, _ys = coordinates(line)
        plot!(pcnt, _xs, _ys, linewidth=3, color=:red)
    end
end

l = @layout([a b c])
plot(pf, pg, pcnt, layout=l)
```

We look to find the intersection point near $(1,1)$ using Newton's method.


We have by linearization:

```math
\begin{align}
f(x,y) &\approx f(x_n, y_n) + \frac{\partial f}{\partial x}\Delta x + \frac{\partial f}{\partial y}\Delta y \\
g(x,y) &\approx g(x_n, y_n) + \frac{\partial g}{\partial x}\Delta x + \frac{\partial g}{\partial y}\Delta y,
\end{align}
```
where $\Delta x = x - x_n$ and $\Delta y = y - y_n$. Setting $f(x,y)=0$ and $g(x,y)=0$ leaves these two linear equations in $\Delta x$ and $\Delta y$:

```math
\begin{align}
\frac{\partial f}{\partial x} \Delta x + \frac{\partial f}{\partial y} \Delta y &= -f(x_n, y_n)\\
\frac{\partial g}{\partial x} \Delta x + \frac{\partial g}{\partial y} \Delta y &= -g(x_n, y_n).
\end{align}
```


One step of Newton's method defines $(x_{n+1}, y_{n+1})$ to be the values $(x,y)$ that make the linearized functions about $(x_n, y_n)$ both equal to $0$.



As just described, we can use `Julia`'s `\` operation to solve the above system of equations, if we express them in matrix form. With this, one step of Newton's method can be coded as follows:


```julia;
function newton_step(f, g, xn)
    M = [ForwardDiff.gradient(f, xn)'; ForwardDiff.gradient(g, xn)']
    b = -[f(xn), g(xn)]
    Delta = M \ b
    xn + Delta
end
```

We investigate what happens starting at $(1,1)$ after one step:

```julia;
𝒇(x,y) = 2 - x^2 - y^2
𝒈(x,y) = 3 - 2x^2 - (1/3)y^2
𝒇(v) = 𝒇(v...); 𝒈(v) = 𝒈(v...)
𝒙₀ = [1,1]
𝒙₁ = newton_step(𝒇, 𝒈, 𝒙₀)
```

The new function values are

```julia;
𝒇(𝒙₁), 𝒈(𝒙₁)
```

We can get better approximations by iterating. Here we hard code ``4`` more steps:

```julia;
𝒙₂ = newton_step(𝒇, 𝒈, 𝒙₁)
𝒙₃ = newton_step(𝒇, 𝒈, 𝒙₂)
𝒙₄ = newton_step(𝒇, 𝒈, 𝒙₃)
𝒙₅ = newton_step(𝒇, 𝒈, 𝒙₄)
𝒙₅, 𝒇(𝒙₅), 𝒈(𝒙₅)
```

We see that at the new point, `𝒙₅`, both functions take basically the same value, $0$, so we have approximated the intersection point.

For nearby initial guesses and reasonable functions, Newton's method is *quadratic*, so should take few steps for convergence, as above.

Here is a simplistic method to iterate $n$ steps:

```julia;
function nm(f, g, x, n=5)
    for i in 1:n
        x = newton_step(f, g, x)
    end
    x
end
```


##### Example

Consider the [bicylinder](https://blogs.scientificamerican.com/roots-of-unity/a-few-of-my-favorite-spaces-the-bicylinder/), the intersection of two perpendicular cylinders of the same radius. If the radius is $1$, we might express these by the functions:

```math
f(x,y) = \sqrt{1 - y^2}, \quad g(x,y) = \sqrt{1 - x^2}.
-``` - -We see that $(1,1)$, $(-1,1)$, $(1,-1)$ and $(-1,-1)$ are solutions to $f(x,y)=0$, $g(x,y)=0$ *and* -$(0,0)$ is a solution to $f(x,y)=1$ and $g(x,y)=1$. What about a level like $1/2$, say? - -Rather than work with $f(x,y) = c$ we solve $f(x,y)^2 = c^2$, as that will be avoid issues with the square root not being defined. Here is one way to solve: - -```julia; hold=true -c = 1/2 -f(x,y) = 1 - y^2 - c^2 -g(x,y) = (1 - x^2) - c^2 -f(v) = f(v...); g(v) = g(v...) -nm(f, g, [1/2, 1/3]) -``` - -That $x=y$ is not so surprising, and in fact, this problem can more easily be solved analytically through $x^2 = y^2 = 1 - c^2$. - - - - - - -## Implicit differentiation - -Implicit differentiation of an equation of two variables (say $x$ and $y$) is performed by *assuming* $y$ is a function of $x$ and when differentiating an expression with $y$, use the chain rule. For example, the slope of the tangent line, $dy/dx$, for the general ellipse $x^2/a + y^2/b = 1$ can be found through this calculation: - -```math -\frac{d}{dx}(\frac{x^2}{a} + \frac{y^2}{b}) = -\frac{d}{dx}(1), -``` - -or, using $d/dx(y^2) = 2y dy/dx$: - -```math -\frac{2x}{a} + \frac{2y \frac{dy}{dx}}{b} = 0. -``` - -From this, solving for $dy/dx$ is routine, as the equation is linear in that unknown: $dy/dx = -(b/a)(x/y)$ - -With more variables, the same technique may be used. Say we have variables $x$, $y$, and $z$ in a relation like $F(x,y,z) = 0$. If we assume $z=z(x,y)$ for some differentiable function (we mention later what conditions will ensure this assumption is valid for some open set), then we can proceed as before, using the chain rule as necessary. - - -For example, consider the ellipsoid: $x^2/a + y^2/b + z^2/c = 1$. What is $\partial z/\partial x$ and $\partial{z}/\partial{y}$, as needed to describe the tangent plane as above? 


To find $\partial/\partial{x}$ we have:

```math
\frac{\partial}{\partial{x}}(x^2/a + y^2/b + z^2/c) =
\frac{\partial}{\partial{x}}1,
```

or

```math
\frac{2x}{a} + \frac{0}{b} + \frac{2z\frac{\partial{z}}{\partial{x}}}{c} = 0.
```

Again the desired unknown is within a linear equation so can readily be solved:

```math
\frac{\partial{z}}{\partial{x}} = -\frac{c}{a} \frac{x}{z}.
```

A similar approach can be used for $\partial{z}/\partial{y}$.

##### Example

Let $f(x,y,z) = x^4 -x^3 + y^2 + z^2 = 0$ be a surface with point $(2,2,2)$. Find $\partial{z}/\partial{x}$ and $\partial{z}/\partial{y}$.


To find $\partial{z}/\partial{x}$ and $\partial{z}/\partial{y}$ we have:

```julia; hold=true
@syms x, y, Z()
∂x = solve(diff(x^4 -x^3 + y^2 + Z(x,y)^2, x), diff(Z(x,y),x))
∂y = solve(diff(x^4 -x^3 + y^2 + Z(x,y)^2, y), diff(Z(x,y),y))
∂x, ∂y
```


## Optimization


For a continuous univariate function $f:R \rightarrow R$ over an interval $I$ the question of finding a maximum or minimum value is aided by two theorems:

* The Extreme Value Theorem, which states that if $I$ is closed (e.g., $I=[a,b]$) then $f$ has a maximum (minimum) value $M$ and there is at least one value $c$ with $a \leq c \leq b$ with $M = f(c)$.

* [Fermat](https://tinyurl.com/nfgz8fz)'s theorem on critical points, which states that if $f:(a,b) \rightarrow R$ has a *local* extremum at $x_0$ with $a < x_0 < b$, and $f$ is differentiable at $x_0$, then $f'(x_0) = 0$. That is, local extrema of $f$ happen at points where the derivative does not exist or is $0$ (critical points).

These two theorems provide an algorithm to find the extreme values of a continuous function over a closed interval: find the critical points, check these and the end points for the maximum and minimum value.

These checks can be reduced by two theorems that can classify critical points as local extrema, the first and second derivative tests.
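The closed-interval algorithm just described can be sketched directly. This is a minimal illustration, not code from the text: the critical point is supplied by hand here, though in practice zeros of $f'$ would be located numerically (say, with `find_zeros` from the `Roots` package loaded above).

```julia
# Evaluate f at the endpoints and the (hand-computed) critical points,
# then take the smallest and largest of those values.
function closed_interval_extrema(f, critical_pts, a, b)
    candidates = vcat(a, critical_pts, b)
    extrema(f.(candidates))              # returns (minimum, maximum)
end

# Example: f(x) = x^3 - 3x on [0, 2]; f'(x) = 3x^2 - 3 vanishes at x = 1
f(x) = x^3 - 3x
m, M = closed_interval_extrema(f, [1.0], 0, 2)
(m, M)   # (-2.0, 2.0): minimum at the critical point x = 1, maximum at the endpoint x = 2
```

The hypothetical helper `closed_interval_extrema` only compares finitely many candidate values; by the two theorems above, for a continuous function these candidates are guaranteed to include the absolute extrema.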
These theorems have generalizations to scalar functions, allowing a
similar study of extrema.

First, we define a *local* maximum for $f:R^n \rightarrow R$ over a
region $U$: a point $\vec{a}$ in $U$ is a *local* maximum if
$f(\vec{a}) \geq f(\vec{u})$ for all $\vec{u}$ in some ball about
$\vec{a}$. A *local* minimum would have $\leq$ instead.

An *absolute* maximum over $U$, should it exist, would be $f(\vec{a})$
if there exists a value $\vec{a}$ in $U$ with the property $f(\vec{a})
\geq f(\vec{u})$ for all $\vec{u}$ in $U$.

The difference is the same as the one-dimensional case: local is a
statement about nearby points only, absolute a statement about all the
points in the specified set.

> The [Extreme Value Theorem](https://tinyurl.com/yyhgxu8y) Let $f:R^n \rightarrow R$ be continuous and defined on a *closed* set $V$. Then $f$ has a minimum value $m$ and maximum value $M$ over $V$ and there exist at least two points $\vec{a}$ and $\vec{b}$ with $m = f(\vec{a})$ and $M = f(\vec{b})$.


> [Fermat](https://tinyurl.com/nfgz8fz)'s theorem on critical points. Let $f:R^n \rightarrow R$ be a continuous function defined on an *open* set $U$. If $x \in U$ is a point where $f$ has a local extremum *and* $f$ is differentiable, then the gradient of $f$ at $x$ is $\vec{0}$.


Call a point in the domain of $f$ where the function is differentiable and the gradient is zero a *stationary point* and a point in the domain where the function is either not differentiable or is a stationary point a *critical point*. The local extrema can only happen at critical points by Fermat.

Consider the function $f(x,y) = e^{-(x^2 + y^2)/5} \cos(x^2 + y^2)$.

```julia; hold=true
f(x,y) = exp(-(x^2 + y^2)/5) * cos(x^2 + y^2)
xs = ys = range(-4, 4, length=100)
surface(xs, ys, f, legend=false)
```

This function is differentiable and the gradient is given by:

```math
\nabla{f} = -\frac{2}{5} e^{-(x^2 + y^2)/5} (5\sin(x^2 + y^2) + \cos(x^2 + y^2)) \langle x, y \rangle.
```

This is zero at the origin, or when $5\sin(x^2 + y^2) = -\cos(x^2 + y^2)$. The latter happens on circles of radius $r$ where $5\sin(r^2) = -\cos(r^2)$, that is where $r^2 = \tan^{-1}(-1/5) + k\pi$ for $k = 1, 2, \dots$. This matches the graph, where by symmetry the extrema lie on circles. Imagine now picking a value where the function takes a maximum and adding the tangent plane. As the gradient is $\vec{0}$, this plane will be flat. At the origin the surface falls away from the tangent plane in every direction, whereas at the other critical points there is a circle along which the tangent plane rests on the surface, with the surface falling away otherwise. Characterizing this "falling away" will help to distinguish the different kinds of local maxima.

----

Now consider the differentiable function $f(x,y) = xy$, graphed below with the projections of the $x$ and $y$ axes:

```julia; hold=true
f(x,y) = x*y
xs = ys = range(-3, 3, length=100)
surface(xs, ys, f, legend=false)

plot_parametric!(-4..4, t -> [t, 0, f(t, 0)], linewidth=5)
plot_parametric!(-4..4, t -> [0, t, f(0, t)], linewidth=5)
```

The extrema happen at the edges of the region. The gradient is $\nabla{f} = \langle y, x \rangle$, which is $\vec{0}$ only at the origin. At the origin, were we to imagine a tangent plane, the surface falls below the plane in one direction but rises *above* it in the other direction. Such a point is referred to as a *saddle point*. A saddle point for a continuous $f:R^n \rightarrow R$ is a critical point, $\vec{a}$, where for any ball with non-zero radius about $\vec{a}$ there are values where the function is greater than $f(\vec{a})$ and values where the function is less.

To identify such points through formulas, and not graphically, we could try to use the first derivative test along all paths through $\vec{a}$, but this approach is better suited to showing something *isn't* the case, much like using two paths to show non-continuity.
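The two-path idea does show directly that the origin is not a local extremum of $f(x,y) = xy$: along the path $y = x$ the function is $f(t,t) = t^2$, while along $y = -x$ it is $f(t,-t) = -t^2$. That is, for $t \neq 0$,

```math
f(t, t) = t^2 > 0 = f(0,0) > -t^2 = f(t, -t),
```

so every ball about the origin contains both larger and smaller values than $f(0,0)$, matching the definition of a saddle point.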
The generalization of the *second* derivative test is more concrete though. Recall, the second derivative test is about the concavity of the function at the critical point. When the concavity is non-zero, the test is conclusive; when the concavity is zero, the test is not conclusive. Similarly here:

> The [second](https://en.wikipedia.org/wiki/Second_partial_derivative_test) Partial Derivative Test for $f:R^2 \rightarrow R$.
>
> Assume the first and second partial derivatives of $f$ are defined and continuous; let $\vec{a}$ be a critical point of $f$; let $H$ be the Hessian matrix, $[f_{xx}\quad f_{xy};f_{xy}\quad f_{yy}]$, and $d = \det(H) = f_{xx} f_{yy} - f_{xy}^2$ the determinant of the Hessian matrix. Then:
>
> * The function $f$ has a local minimum at $\vec{a}$ if $f_{xx} > 0$ *and* $d>0$,
>
> * The function $f$ has a local maximum at $\vec{a}$ if $f_{xx} < 0$ *and* $d>0$,
>
> * The function $f$ has a saddle point at $\vec{a}$ if $d < 0$,
>
> * Nothing can be said if $d=0$.

----

The intuition behind a proof follows. The case $f_{xx} > 0$ and $d > 0$ uses a consequence of these assumptions: for any non-zero vector $\vec{x}$ it *must* be that $\vec{x}\cdot(H\vec{x}) > 0$ ([positive definite](https://en.wikipedia.org/wiki/Definiteness_of_a_matrix)). Combining this with the quadratic approximation $f(\vec{a}+d\vec{x}) \approx f(\vec{a}) + \nabla{f}(\vec{a}) \cdot d\vec{x} + \frac{1}{2} d\vec{x} \cdot (Hd\vec{x}) = f(\vec{a}) + \frac{1}{2} d\vec{x} \cdot (Hd\vec{x})$, we see that for any $d\vec{x}$ small enough, $f(\vec{a}+d\vec{x}) \geq f(\vec{a})$. That is, $f(\vec{a})$ is a local minimum. Similarly, a proof for the local maximum follows by considering $-f$. Finally, if $d < 0$, then there are vectors, $d\vec{x}$, for which $d\vec{x} \cdot (Hd\vec{x})$ will have different signs, and along these vectors the function will be concave up/concave down.
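For a concrete instance of the positive-definiteness used above (a made-up Hessian, not one from the text): if $H = [2\quad 1; 1\quad 3]$ then $f_{xx} = 2 > 0$ and $d = 2\cdot 3 - 1^2 = 5 > 0$, and completing the square shows directly that for non-zero $\vec{x}$:

```math
\vec{x} \cdot (H\vec{x}) = 2x_1^2 + 2x_1x_2 + 3x_2^2 = (x_1 + x_2)^2 + x_1^2 + 2x_2^2 > 0.
```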
Applying this to $f(x,y) = xy$ at $\vec{a} = \vec{0}$, we have $f_{xx} = f_{yy} = 0$ and $f_{xy} = 1$, so the determinant of the Hessian is $-1$. By the second partial derivative test, this critical point is a saddle point, as seen in the previous graph.

Applying it to $f(x,y) = e^{-(x^2 + y^2)/5} \cos(x^2 + y^2)$, we will use `SymPy` to compute the derivatives, as they get a bit involved:

```julia;
fₖ(x,y) = exp(-(x^2 + y^2)/5) * cos(x^2 + y^2)
Hₖ = sympy.hessian(fₖ(x,y), (x,y))
```

This is messy, but we only consider it at critical points. The point $(0,0)$ is graphically a local maximum. We can see from the Hessian that the second partial derivative test gives the same characterization:

```julia;
H₀₀ = subs.(Hₖ, x=>0, y=>0)
```

Which satisfies:

```julia;
H₀₀[1,1] < 0 && det(H₀₀) > 0
```

Now consider $\vec{a} = \langle \sqrt{2\pi + \tan^{-1}(-1/5)}, 0 \rangle$, a point on one of the visible rings of the graph. The gradient vanishes here:

```julia; hold=true
gradfₖ = diff.(fₖ(x,y), [x,y])
a = [sqrt(2PI + atan(-Sym(1)//5)), 0]
subs.(gradfₖ, x => a[1], y => a[2])
```

But the test is *inconclusive*, as the determinant of the Hessian is $0$:

```julia; hold=true
a = [sqrt(2PI + atan(-Sym(1)//5)), 0]
H_a = subs.(Hₖ, x => a[1], y => a[2])
det(H_a)
```

(The test is inconclusive, as it needs the function to "fall away" from the tangent plane in all directions; in this case, along a circular curve the function touches the tangent plane, so it doesn't fall away.)

##### Example

Characterize the critical points of $f(x,y) = 4xy - x^4 - y^4$.

The critical points may be found by solving for when the gradient is $\vec{0}$:

```julia;
fⱼ(x,y) = 4x*y - x^4 - y^4
gradfⱼ = diff.(fⱼ(x,y), [x,y])
```

```julia;
all_ptsⱼ = solve(gradfⱼ, [x,y])
ptsⱼ = filter(u -> all(isreal.(u)), all_ptsⱼ)
```

There are $3$ real critical points. To classify them we need the sign of $f_{xx}$ and the determinant of the Hessian. 
We make a simple function to compute these, then apply it to each point using a comprehension:

```julia;
Hⱼ = sympy.hessian(fⱼ(x,y), (x,y))
function classify(H, pt)
    Ha = subs.(H, x .=> pt[1], y .=> pt[2])
    (det=det(Ha), f_xx=Ha[1,1])
end
[classify(Hⱼ, pt) for pt in ptsⱼ]
```

We see that the first and third points have a positive determinant and negative $f_{xx}$, so are relative maxima, while the second point has a negative determinant, so is a saddle point. We confirm this graphically:

```julia; hold=true
xs = ys = range(-3/2, 3/2, length=100)
p = surface(xs, ys, fⱼ, legend=false)
for pt ∈ ptsⱼ
    scatter!(p, unzip([N.([pt..., fⱼ(pt...)])])...,
             markercolor=:black, markersize=5) # add each pt on surface
end
p
```

##### Example

Consider the function $f(x,y) = x^2 + 2y^2 - x$ over the region $x^2 + y^2 \leq 1$. This is a continuous function over a closed set, so it will have both an absolute maximum and minimum. Find these from an investigation of the critical points and the boundary points.

The gradient is easily found: $\nabla{f} = \langle 2x - 1, 4y \rangle$, and is $\vec{0}$ only at $\vec{a} = \langle 1/2, 0 \rangle$. The Hessian is:

```math
H = \left[
\begin{array}{}
2 & 0\\
0 & 4
\end{array}
\right].
```

At $\vec{a}$ this has positive determinant and $f_{xx} > 0$, so $\vec{a}$ corresponds to a *local* minimum with value $f(\vec{a}) = (1/2)^2 + 2(0)^2 - 1/2 = -1/4$. The absolute maximum and minimum may occur here (well, not the maximum) or on the boundary, so the boundary must be considered. In this case we can easily parameterize the boundary and turn this into the univariate case:

```julia;
fₗ(x,y) = x^2 + 2y^2 - x
fₗ(v) = fₗ(v...)
gammaₗ(t) = [cos(t), sin(t)] # traces out x^2 + y^2 = 1 over [0, 2pi]
gₗ = fₗ ∘ gammaₗ

cpsₗ = find_zeros(gₗ', 0, 2pi) # critical points of g
append!(cpsₗ, [0, 2pi])
unique!(cpsₗ)
gₗ.(cpsₗ)
```

We see that the maximum value is `2.25` and that the minimum occurs at the interior point $\vec{a}$. To see exactly where the maximum occurs, we look at the corresponding critical points:

```julia;
inds = [2, 4]
cpsₗ[inds]
```

These are multiples of $\pi$:

```julia;
cpsₗ[inds]/pi
```

So the maximum occurs at the angles $2\pi/3$ and $4\pi/3$. Here we visualize, using the hacky trick of assigning `NaN` values to the function to avoid plotting outside the circle:

```julia;
hₗ(x,y) = fₗ(x,y) * (x^2 + y^2 <= 1 ? 1 : NaN)
```

```julia; hold=true
xs = ys = range(-1, 1, length=100)
surface(xs, ys, hₗ)

ts = cpsₗ # 2pi/3 and 4pi/3 by above
xs, ys = cos.(ts), sin.(ts)
scatter!(xs, ys, fₗ.(xs, ys))
```

A contour plot also shows that an extremum - and only one - occurs in the interior:

```julia; hold=true
xs = ys = range(-1, 1, length=100)
contour(xs, ys, hₗ)
```

The extrema are identified by the enclosing regions, in this case the one around the point $(1/2, 0)$.

##### Example: Steiner's problem

This is from [Strang](https://ocw.mit.edu/resources/res-18-001-calculus-online-textbook-spring-2005/textbook/MITRES_18_001_strang_13.pdf) p. 506.

We have three points in the plane, $(x_1, y_1)$, $(x_2, y_2)$, and $(x_3,y_3)$. A point $p=(p_x, p_y)$ will have $3$ distances $d_1$, $d_2$, and $d_3$ to them. Broadly speaking, we want to find the point $p$ "nearest" the three fixed points, in the sense of minimizing a total distance. Locating a facility so that it can service ``3`` separate cities might be one application. The answer depends on what measure of distance is used.

If the measure is the Euclidean distance, then $d_i^2 = (p_x - x_i)^2 + (p_y - y_i)^2$. 
If we sought to minimize $d_1^2 + d_2^2 + d_3^2$, then we would proceed as follows:

```julia;
@syms x1 y1 x2 y2 x3 y3
d2(p, x) = (p[1] - x[1])^2 + (p[2] - x[2])^2
d2_1, d2_2, d2_3 = d2((x,y), (x1, y1)), d2((x,y), (x2, y2)), d2((x,y), (x3, y3))
exₛ = d2_1 + d2_2 + d2_3
```

We then find the gradient, and solve for when it is $\vec{0}$:

```julia;
gradfₛ = diff.(exₛ, [x,y])
xstarₛ = solve(gradfₛ, [x,y])
```

There is only one critical point, so it must be the minimum.

We confirm this by looking at the Hessian and noting $H_{11} > 0$:

```julia;
Hₛ = subs.(sympy.hessian(exₛ, (x,y)), x => xstarₛ[x], y => xstarₛ[y])
```

As it occurs at $(\bar{x}, \bar{y})$, where $\bar{x} = (x_1 + x_2 + x_3)/3$ and $\bar{y} = (y_1+y_2+y_3)/3$ are the averages of the three values, the critical point is an interior point of the triangle.

As mentioned by Strang, the real problem is to minimize $d_1 + d_2 + d_3$. A direct approach with `SymPy` - just replacing `d2` above with the square root - fails. Consider instead the gradient of $d_1$, say. To avoid square roots, this is taken implicitly from $d_1^2$:

```math
\frac{\partial}{\partial{x}}(d_1^2) = 2 d_1 \frac{\partial{d_1}}{\partial{x}}.
```

But computing directly from the expression yields $\partial(d_1^2)/\partial{x} = 2(x - x_1)$. Equating the two and solving yields:

```math
\frac{\partial{d_1}}{\partial{x}} = \frac{(x-x_1)}{d_1}, \quad
\frac{\partial{d_1}}{\partial{y}} = \frac{(y-y_1)}{d_1}.
```

The gradient is then $(\vec{p} - \vec{x}_1)/\|\vec{p} - \vec{x}_1\|$, a *unit* vector; call it $\hat{u}_1$. Similarly for $\hat{u}_2$ and $\hat{u}_3$.

Let $f = d_1 + d_2 + d_3$. Then $\nabla{f} = \hat{u}_1 + \hat{u}_2 + \hat{u}_3$. At an interior minimum the gradient is $\vec{0}$, so the three unit vectors must cancel. This can only happen if the three make a "peace" sign with angles $120^\circ$ between them. To find the minimum, then, both this point and the boundary must be considered, as this point may fall outside the triangle. 
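The cancellation condition can be checked directly (a quick check in Python rather than Julia): three unit vectors separated by $120^\circ$ sum to the zero vector.

```python
import math

# Three unit vectors with 120-degree separations cancel, as required
# at an interior Steiner point where the gradient vanishes.
angles = [0, 2 * math.pi / 3, 4 * math.pi / 3]
sx = sum(math.cos(a) for a in angles)
sy = sum(math.sin(a) for a in angles)
print(sx, sy)  # both 0 up to floating point
```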
Here is a triangle where the minimum would be within the triangle:

```julia;
usₛ = [[cos(t), sin(t)] for t in (0, 2pi/3, 4pi/3)]
polygon(ps) = unzip(vcat(ps, ps[1:1])) # easier way to plot a polygon

pₛ = scatter([0],[0], markersize=2, legend=false, aspect_ratio=:equal)

asₛ = (1, 2, 3)
plot!(polygon([a*u for (a,u) in zip(asₛ, usₛ)])...)
[arrow!([0,0], a*u, alpha=0.5) for (a,u) in zip(asₛ, usₛ)]
pₛ
```

For this next triangle, the candidate Steiner point falls outside of the triangle:

```julia;
asₛ₁ = (1, -1, 3)
scatter([0],[0], markersize=2, legend=false)
psₛₗ = [a*u for (a,u) in zip(asₛ₁, usₛ)]
plot!(polygon(psₛₗ)...)
```

Let's see where the minimum distance point is by constructing a plot. The minimum must be on the boundary, as the only point where the gradient vanishes - the origin - is not in the triangle. The plot overlays a contour plot of the distance function on the triangle, so we see clearly that the minimum happens at the point `[0.5, -0.866025]`. On this plot we also draw the gradient at some points along the boundary. The gradient points in the direction of greatest increase - away from the minimum. That the gradient vectors have a non-zero projection onto the edges of the triangle, pointing away from that point, indicates that the distance function would increase if moved along the boundary in that direction, as indeed it does. 
```julia
euclid_dist(x; ps=psₛₗ) = sum(norm(x - p) for p in ps)
euclid_dist(x, y; ps=psₛₗ) = euclid_dist([x, y]; ps=ps)
```

```julia; hold=true
xs = range(-1.5, 1.5, length=100)
ys = range(-3, 1.0, length=100)

p = plot(polygon(psₛₗ)..., linewidth=3, legend=false)
scatter!(p, unzip(psₛₗ)..., markersize=3)
contour!(p, xs, ys, euclid_dist)

# add some gradients along boundary
li(t, p1, p2) = p1 + t*(p2-p1) # t in [0,1]
for t in range(1/100, 1/2, length=3)
    pt = li(t, psₛₗ[2], psₛₗ[3])
    arrow!(pt, ForwardDiff.gradient(euclid_dist, pt))
    pt = li(t, psₛₗ[2], psₛₗ[1])
    arrow!(pt, ForwardDiff.gradient(euclid_dist, pt))
end

p
```

The following graph shows the distance along each edge:

```julia; hold=true
li(t, p1, p2) = p1 + t*(p2-p1)
p = plot(legend=false)
for i in 1:2, j in (i+1):3
    plot!(p, t -> euclid_dist(li(t, psₛₗ[i], psₛₗ[j]); ps=psₛₗ), 0, 1)
end
p
```

The smallest value occurs at $t=0$ or $t=1$, that is at one of the vertices, with `li` as defined above.

##### Example: least squares

We know that two points determine a line. What happens when there are more than two points? This is common in statistics, where a bivariate data set (pairs of points $(x,y)$) is summarized through a linear model $\mu_{y|x} = \alpha + \beta x$. That is, the average value for $y$ given a particular $x$ value is described by the equation of a line. The data are used to identify what the slope and intercept are for this line. We consider the simple case of $3$ points, the case of $n > 3$ being similar.

We have a line $l(x) = \alpha + \beta x$ and three points $(x_1, y_1)$, $(x_2, y_2)$, and $(x_3, y_3)$. Unless these three points *happen* to be collinear, they can't possibly all lie on the same line. So to *approximate* a relationship by a line requires some inexactness. One measure of inexactness is the total *vertical* distance to the line:

```math
d1(\alpha, \beta) = |y_1 - l(x_1)| + |y_2 - l(x_2)| + |y_3 - l(x_3)|.
```

Another might be the vertical *squared* distance to the line:

```math
\begin{align*}
d2(\alpha, \beta) &= (y_1 - l(x_1))^2 + (y_2 - l(x_2))^2 + (y_3 - l(x_3))^2 \\
&= (y_1 - (\alpha + \beta x_1))^2 + (y_2 - (\alpha + \beta x_2))^2 + (y_3 - (\alpha + \beta x_3))^2
\end{align*}
```

Another might be the *shortest* distance to the line:

```math
d3(\alpha, \beta) = \frac{|\beta x_1 - y_1 + \alpha|}{\sqrt{1 + \beta^2}} + \frac{|\beta x_2 - y_2 + \alpha|}{\sqrt{1 + \beta^2}} + \frac{|\beta x_3 - y_3 + \alpha|}{\sqrt{1 + \beta^2}}.
```

The method of least squares minimizes the second of these. That is, it chooses the $\alpha$ and $\beta$ that make the expression a minimum.

```julia;
@syms xₗₛ[1:3] yₗₛ[1:3] α β
li(x, alpha, beta) = alpha + beta * x
d₂(alpha, beta) = sum((y - li(x, alpha, beta))^2 for (y,x) in zip(yₗₛ, xₗₛ))
d₂(α, β)
```

To identify $\alpha$ and $\beta$ we find the gradient:

```julia;
grad_d₂ = diff.(d₂(α, β), [α, β])
```

```julia;
outₗₛ = solve(grad_d₂, [α, β])
```

As found, the formulas aren't pretty. They simplify if $x_1 + x_2 + x_3 = 0$. For example:

```julia;
subs(outₗₛ[β], sum(xₗₛ) => 0)
```

With $\vec{x} = \langle x_1, x_2, x_3 \rangle$ and $\vec{y} = \langle y_1, y_2, y_3 \rangle$, this is simply $(\vec{x} \cdot \vec{y})/(\vec{x}\cdot \vec{x})$, a formula that will generalize to $n > 3$. The assumption is not a restriction - it comes about by subtracting the mean, $\bar{x} = (x_1 + x_2 + x_3)/3$, from each $x$ term (and similarly subtracting $\bar{y}$ from each $y$ term), a process called "centering."

With this observation, the formulas can be re-expressed through:

```math
\beta = \frac{\sum(x_i - \bar{x})(y_i - \bar{y})}{\sum(x_i-\bar{x})^2},
\quad
\alpha = \bar{y} - \beta \bar{x}.
```

Relative to the centered values, this may be viewed as a line through $(\bar{x}, \bar{y})$ with slope given by $(\vec{x}-\bar{x})\cdot(\vec{y}-\bar{y}) / \|\vec{x}-\bar{x}\|^2$. 
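These centered formulas can be checked numerically (a sketch in Python rather than Julia, on hypothetical data, by confirming that no nearby $(\alpha, \beta)$ does better):

```python
# Check the centered least-squares formulas on made-up data points
# (0,1), (1,2), (2,2): the computed (alpha, beta) should minimize d2.
xs = [0.0, 1.0, 2.0]
ys = [1.0, 2.0, 2.0]
n = len(xs)
xbar, ybar = sum(xs) / n, sum(ys) / n
beta = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / \
       sum((x - xbar) ** 2 for x in xs)
alpha = ybar - beta * xbar

def d2(a, b):  # sum of squared vertical distances
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

base = d2(alpha, beta)
h = 1e-3
worse = all(d2(alpha + da, beta + db) >= base
            for da in (-h, 0, h) for db in (-h, 0, h))
print(alpha, beta, worse)
```

For this data $\beta = 1/2$ and $\alpha = 7/6$, and perturbing either parameter only increases the squared error, consistent with the critical point being the minimum.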
As an example, if the points are $(1,1)$, $(2,3)$, and $(5,8)$ we get:

```julia;
[k => subs(v, xₗₛ[1]=>1, yₗₛ[1]=>1, xₗₛ[2]=>2, yₗₛ[2]=>3,
           xₗₛ[3]=>5, yₗₛ[3]=>8) for (k,v) in outₗₛ]
```

### Gradient descent

As seen in the examples above, extrema may be identified analytically by solving for when the gradient is $\vec{0}$. Here we discuss some numeric algorithms for finding extrema.

An algorithm to identify where a surface is at its minimum is [gradient descent](https://en.wikipedia.org/wiki/Gradient_descent). The gradient points in the direction of the steepest ascent of the surface, and the negative gradient in the direction of steepest descent. To move towards a minimum, then, it makes intuitive sense to move in the direction of the negative gradient. How far? That is a different question, and one with different answers. Let's formulate the movement first, then discuss how far.

Let $\vec{x}_0$, $\vec{x}_1$, $\dots$, $\vec{x}_n$ be the positions of the algorithm for $n$ steps starting from an initial point $\vec{x}_0$. The points are related by:

```math
\vec{x}_{n+1} = \vec{x}_n - \gamma \nabla{f}(\vec{x}_n),
```

where $\gamma$ is some scaling factor for the gradient. The above quantifies the idea: to go from $\vec{x}_n$ to $\vec{x}_{n+1}$, move along $-\nabla{f}$ by a certain amount.

Let $\Delta_x = \vec{x}_{n} - \vec{x}_{n-1}$ and $\Delta_y = \nabla{f}(\vec{x}_{n}) - \nabla{f}(\vec{x}_{n-1})$. A variant of the Barzilai-Borwein method is to take $\gamma_n = |\Delta_x \cdot \Delta_y| / (\Delta_y \cdot \Delta_y)$.

To illustrate, take $f(x,y) = -e^{-((x-1)^2 + 2(y-1/2)^2)}$ and a starting point $\langle 0, 0 \rangle$. Starting with $\gamma_0 = 1$, $5$ steps are taken:

```julia;
f₂(x,y) = -exp(-((x-1)^2 + 2(y-1/2)^2))
f₂(x) = f₂(x...)
xs₂ = [[0.0, 0.0]] # we store the iterates in a vector
gammas₂ = [1.0]

for n in 1:5
    xn = xs₂[end]
    gamma₀ = gammas₂[end]
    xn1 = xn - gamma₀ * gradient(f₂)(xn)
    dx, dy = xn1 - xn, gradient(f₂)(xn1) - gradient(f₂)(xn)
    gamman1 = abs( (dx ⋅ dy) / (dy ⋅ dy) )

    push!(xs₂, xn1)
    push!(gammas₂, gamman1)
end

[(x, f₂(x)) for x in xs₂]
```

We now visualize, using the `Contour` package to draw the contour lines in the $x$-$y$ plane:

```julia; hold=true
function surface_contour(xs, ys, f; offset=0)
    p = surface(xs, ys, f, legend=false, fillalpha=0.5)

    ## we add to the graphic p, then plot
    zs = [f(x,y) for x in xs, y in ys] # reverse order for use with Contour package
    for cl in levels(contours(xs, ys, zs))
        lvl = level(cl) # the z-value of this contour level
        for line in lines(cl)
            _xs, _ys = coordinates(line) # coordinates of this line segment
            _zs = fill(offset, length(_xs)) # draw curve at height `offset`
            plot!(p, _xs, _ys, _zs, alpha=0.5) # add curve on x-y plane
        end
    end
    p
end

offset = 0
us = vs = range(-1, 2, length=100)
surface_contour(us, vs, f₂, offset=offset)
pts = [[pt..., offset] for pt in xs₂]
scatter!(unzip(pts)...)
plot!(unzip(pts)..., linewidth=3)
```

### Newton's method for minimization

A variant of Newton's method can be used to minimize a function $f:R^2 \rightarrow R$. We look for points where both partial derivatives of $f$ vanish. Let $g(x,y) = \partial f/\partial x(x,y)$ and $h(x,y) = \partial f/\partial y(x,y)$. Then, applying Newton's method as above to solve simultaneously for when $g=0$ and $h=0$, we considered the matrix:

```math
M = [\nabla{g}'; \nabla{h}'],
```

and had a step expressible in terms of the inverse of $M$ as $M^{-1} [g; h]$. In terms of the function $f$, this step is $H^{-1}\nabla{f}$, where $H$ is the Hessian matrix. [Newton](https://en.wikipedia.org/wiki/Newton%27s_method_in_optimization#Higher_dimensions)'s method then becomes:

```math
\vec{x}_{n+1} = \vec{x}_n - [H_f(\vec{x}_n)]^{-1} \nabla{f}(\vec{x}_n).
```

The Wikipedia page states that, where applicable, Newton's method converges much faster towards a local maximum or minimum than gradient descent.

We apply it to the task of characterizing the following function, which has a few different peaks over the region $[-3,3] \times [-2,2]$:

```julia;
function peaks(x, y)
    z = 3 * (1 - x)^2 * exp(-x^2 - (y + 1)^2)
    z += -10 * (x / 5 - x^3 - y^5) * exp(-x^2 - y^2)
    z += -1/3 * exp(-(x+1)^2 - y^2)
    return z
end
peaks(v) = peaks(v...)
```

```julia; hold=true
xs = range(-3, stop=3, length=100)
ys = range(-2, stop=2, length=100)
Ps = surface(xs, ys, peaks, legend=false)
Pc = contour(xs, ys, peaks, legend=false)
plot(Ps, Pc, layout=2) # combine plots
```

As we will solve for the critical points numerically, we consider the contour plot as well, as it shows better where the critical points are.

Over this region we clearly see $5$ peaks or valleys: near $(0, 1.5)$, near $(1.2, 0)$, near $(0.2, -1.8)$, near $(-0.5, -0.8)$, and near $(-1.2, 0.2)$. To classify the $5$ critical points we need to first identify them, then compute the Hessian, and then possibly compute $f_{xx}$ at the point. Here we do so for one of them using a numeric approach.

For concreteness, consider the peak or valley near $(0,1.5)$. We use Newton's method to numerically compute the critical point. The Newton step, specialized here, is:

```julia;
function newton_stepₚ(f, x)
    M = ForwardDiff.hessian(f, x)
    b = ForwardDiff.gradient(f, x)
    x - M \ b
end
```

We perform ``3`` steps of Newton's method, and see that it has found a critical point. 
```julia;
xₚ = [0, 1.5]
xₚ = newton_stepₚ(peaks, xₚ)
xₚ = newton_stepₚ(peaks, xₚ)
xₚ = newton_stepₚ(peaks, xₚ)
xₚ, ForwardDiff.gradient(peaks, xₚ)
```

The Hessian at this point is given by:

```julia;
Hₚ = ForwardDiff.hessian(peaks, xₚ)
```

From which we see:

```julia; hold=true
fxx = Hₚ[1,1]
d = det(Hₚ)
fxx, d
```

Consequently, we have a local maximum at this critical point.

!!! note
    The `Optim.jl` package provides efficient implementations of these two numeric methods, and others.

## Constrained optimization, Lagrange multipliers

We considered the problem of maximizing a function over a closed region. This maximum is achieved at a critical point *or* a boundary point. Investigating the critical points isn't so difficult, and the second partial derivative test can help characterize the points along the way, but characterizing the boundary points usually involves parameterizing the boundary, which is not always so easy. However, if we put this problem into a more general setting, a different technique becomes available.

The different setting is: maximize $f(x,y)$ subject to the constraint $g(x,y) = k$. The constraint can be used to describe the boundary used previously.

Why does this help? The key is something we have seen before: if $g$ is differentiable, then $\nabla{g}$ points in directions *orthogonal* to the level curve $g(x,y) = 0$. (Parameterize the curve; then $(g\circ\vec{r})(t) = 0$, and so the chain rule gives $\nabla{g}(\vec{r}(t)) \cdot \vec{r}'(t) = 0$.) For example, consider the function $g(x,y) = x^2 + 2y^2 - 1$. The level curve $g(x,y) = 0$ is an ellipse. Here we plot the level curve, along with a few gradient vectors at points satisfying $g(x,y) = 0$:

```julia; hold=true
g(x,y) = x^2 + 2y^2 - 1
g(v) = g(v...)
xs = range(-3, 3, length=100)
ys = range(-1, 4, length=100)

p = plot(aspect_ratio=:equal, legend=false)
contour!(xs, ys, g, levels=[0])

gi(x) = sqrt(1/2*(1-x^2)) # solve for y in terms of x
pts = [[x, gi(x)] for x in (-3/4, -1/4, 1/4, 3/4)]

for pt in pts
    arrow!(pt, ForwardDiff.gradient(g, pt))
end

p
```

From the plot we see the key property that $\nabla{g}$ is orthogonal to the level curve.

Now consider $f(x,y)$, a function we wish to maximize. The gradient points in the direction of *greatest* increase, provided $f$ is smooth. We are interested in the value of this gradient along the level curve of $g$. Consider this figure representing a portion of the level curve, its tangent, its normal, the gradient of $f$, and the contours of $f$:

```julia; hold=true; echo=false
r(t) = [cos(t), sin(t)/2]
plot_parametric(pi/12..pi/3, r, legend=false, aspect_ratio=true, linewidth=3)
T(t) = -r'(t) / norm(r'(t))
No(t) = T'(t) / norm(T'(t))
t = pi/4
lambda = 1/10
scatter!(unzip([r(t)])...)
arrow!(r(t), T(t)*lambda)
arrow!(r(t), No(t)*lambda)

f(x,y) = x^2 + y^2
f(v) = f(v...)
arrow!(r(t), lambda*ForwardDiff.gradient(f, r(t)))

xs = range(0.5, 1, length=100)
ys = range(0.1, 0.5, length=100)
contour!(xs, ys, f)
```

We can identify the tangent, the normal, and subsequently the gradient of $f$. Is the point drawn a maximum of $f$ subject to the constraint $g$?

The answer is no, but why? By adding the contours of $f$, we see that moving along the curve from this point will increase or decrease $f$, depending on which direction we move in. As the *gradient* is the direction of greatest increase, we can see that the *projection* of the gradient on the tangent will point in a direction of *increase*.

It isn't just that the point was chosen to make a pretty picture rather than be a maximum. Rather, it is that $\nabla{f}$ has a non-trivial projection onto the tangent vector. 
What does it say if we move the point in the direction of this projection? The gradient points in the direction of greatest increase, so moving in the direction of one component of the gradient will still increase the function, just not as fast. Said differently, the directional derivative in the direction of the tangent is non-zero. In the picture, were we to move the point to the right along the curve, $f(x,y)$ would increase.

Now consider this figure, drawn at a different point of the curve:

```julia; hold=true; echo=false
r(t) = [cos(t), sin(t)/2]
plot_parametric(-pi/6..pi/6, r, legend=false, aspect_ratio=true, linewidth=3)
T(t) = -r'(t) / norm(r'(t))
No(t) = T'(t) / norm(T'(t))
t = 0
lambda = 1/10
scatter!(unzip([r(t)])...)
arrow!(r(t), T(t)*lambda)
arrow!(r(t), No(t)*lambda)

f(x,y) = x^2 + y^2
f(v) = f(v...)
arrow!(r(t), lambda*ForwardDiff.gradient(f, r(t)))

xs = range(0.5, 1.5, length=100)
ys = range(-0.5, 0.5, length=100)
contour!(xs, ys, f, levels = [.7, .85, 1, 1.15, 1.3])
```

We can still identify the tangent and normal directions. What is different about this point is that local movement along the constraint curve is also local movement along a contour line of $f$, so $f$ doesn't increase or decrease here to first order - exactly the behavior expected at an extremum along the constraint. The key to seeing this is that the contour line of $f$ is *tangent* to the constraint. The respective gradients are *orthogonal* to this common tangent line, and in dimension $2$, this implies they are parallel to each other.

> *The method of Lagrange multipliers*: To optimize $f(x,y)$ subject to a constraint $g(x,y) = k$ we solve for all *simultaneous* solutions to
>
> ```math
> \begin{align}
> \nabla{f}(x,y) &= \lambda \nabla{g}(x,y), \text{and}\\
> g(x,y) &= k.
> \end{align}
> ```
>
> These *possible* points are then evaluated to see if they are maxima or minima.

The method will not work if $\nabla{g} = \vec{0}$ or if $f$ and $g$ are not differentiable. 
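The parallel-gradient condition can be checked numerically against the boundary maximum found earlier for $f(x,y) = x^2 + 2y^2 - x$ on the unit circle $g(x,y) = x^2 + y^2 = 1$, at the angle $2\pi/3$. (A quick check in Python rather than Julia; the multiplier value $\lambda = 2$ is worked out here by hand, not taken from the text.)

```python
import math

# At the boundary maximum (cos(2pi/3), sin(2pi/3)), grad f should be a
# scalar multiple of grad g, with multiplier lambda = 2.
t = 2 * math.pi / 3
x, y = math.cos(t), math.sin(t)
grad_f = (2 * x - 1, 4 * y)   # gradient of f(x,y) = x^2 + 2y^2 - x
grad_g = (2 * x, 2 * y)       # gradient of g(x,y) = x^2 + y^2
lam = grad_f[0] / grad_g[0]   # candidate multiplier from the x components
print(lam, grad_f[1] - lam * grad_g[1])  # lambda = 2; y components match
```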
----

##### Example

We consider [again](../derivatives/optimization.html) the problem of maximizing the area over all rectangles subject to the perimeter being $20$. We have seen this results in a square. This time we use the Lagrange multiplier technique. We have two equations:

```math
A(x,y) = xy, \quad P(x,y) = 2x + 2y = 20.
```

We solve $\nabla{A} = \lambda \nabla{P}$, or $\langle y, x \rangle = \lambda \langle 2, 2\rangle$. The solution has $x = y$, and from the constraint $x = y = 5$.

This is clearly the maximum for this problem, though the Lagrange technique does not imply that; it only identifies possible extrema.

##### Example

We can reverse the question: what are the possible values for the perimeter when the area is a fixed value of $25$? We have:

```math
P(x,y) = 2x + 2y, \quad A(x,y) = xy = 25.
```

Now we look for $\nabla{P} = \lambda \nabla{A}$ and get, as in the last example, that $\langle 2, 2 \rangle = \lambda \langle y, x\rangle$. So $x = y$, and from the constraint $x = y = 5$.

However, this is *not* the maximum perimeter, but rather the minimal perimeter. The maximum is $\infty$, which comes about in the limit by considering long skinny rectangles.

##### Example: A rephrasing

A slightly different formulation of the Lagrange method is to combine the function and the constraint into one equation:

```math
L(x,y,\lambda) = f(x,y) - \lambda (g(x,y) - k).
```

Then we have

```math
\begin{align}
\frac{\partial L}{\partial{x}} &= \frac{\partial{f}}{\partial{x}} - \lambda \frac{\partial{g}}{\partial{x}}\\
\frac{\partial L}{\partial{y}} &= \frac{\partial{f}}{\partial{y}} - \lambda \frac{\partial{g}}{\partial{y}}\\
\frac{\partial L}{\partial{\lambda}} &= -(g(x,y) - k).
\end{align}
```

If the Lagrange condition holds, each term is $0$, so Lagrange's method can be seen as solving $\nabla{L} = \vec{0}$. 
The optimization problem in two variables with a constraint becomes a problem of finding and classifying the zeros of a function of *three* variables.

Apply this to the optimization problem:

Find the extrema of $f(x,y) = x^2 - y^2$ subject to the constraint $g(x,y) = x^2 + y^2 = 1$.

We have:

```math
L(x, y, \lambda) = f(x,y) - \lambda(g(x,y) - 1).
```

We could solve for $\nabla{L} = \vec{0}$ by hand, but instead we do so symbolically:

```julia;
@syms lambda
fₗₐ(x, y) = x^2 - y^2
gₗₐ(x, y) = x^2 + y^2
Lₗₐ(x, y, lambda) = fₗₐ(x,y) - lambda * (gₗₐ(x,y) - 1)
dsₗₐ = solve(diff.(Lₗₐ(x, y, lambda), [x, y, lambda]))
```

This has $4$ easy solutions; here are the values of $f$ at each point:

```julia;
[fₗₐ(d[x], d[y]) for d in dsₗₐ]
```

So $1$ is the maximum value and $-1$ the minimum value.

##### Example: Dido's problem

Consider a slightly different problem: what shape should a rope (curve) of fixed length make to *maximize* the area between the rope and the $x$ axis?

Let $L$ be the length of the rope and suppose $y(x)$ describes the curve. Then we wish to

```math
\text{Maximize } \int y(x) dx, \quad\text{subject to }
\int \sqrt{1 + y'(x)^2} dx = L,
```

the latter being the formula for arc length. This is very much like an optimization problem that Lagrange's method could help solve, but with one big difference: the answer is *not* a point but a *function*.

This is a variant of [Dido](http://www.ams.org/publications/journals/notices/201709/rnoti-p980.pdf)'s problem, described by Bandle as

> *Dido’s problem*: The Roman poet Publius Vergilius Maro (70–19 B.C.)
> tells in his epic Aeneid the story of queen Dido, the daughter of the
> Phoenician king of the 9th century B.C. After the assassination of her
> husband by her brother she fled to a haven near Tunis. There she asked
> the local leader, Yarb, for as much land as could be enclosed by the
> hide of a bull. Since the deal seemed very modest, he agreed. 
Dido cut
> the hide into narrow strips, tied them together and encircled a large
> tract of land which became the city of Carthage. Dido faced the
> following mathematical problem, which is also known as the
> isoperimetric problem: Find among all curves of given length the one
> which encloses maximal area. Dido found intuitively the right answer.

The problem as stated above and its method of solution follow notes by [Wang](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.368.1522&rep=rep1&type=pdf), though Bandle attributes the ideas back to a $19$-year-old Lagrange in a letter to Euler.

The method of solution will be to *assume* we have the function and then characterize it in such a way that it can be identified.

Following Lagrange, we generalize the problem to the following: maximize $\int_{x_0}^{x_1} f(x, y(x), y'(x)) dx$ subject to a constraint $\int_{x_0}^{x_1} g(x, y(x), y'(x)) dx = K$. Suppose $y(x)$ is a solution.

The starting point is a *perturbation*: $\hat{y}(x) = y(x) + \epsilon_1 \eta_1(x) + \epsilon_2 \eta_2(x)$. There are two perturbation terms: were only one term added, the perturbation might make $\hat{y}$ fail to satisfy the constraint; the second term is used to ensure the constraint is not violated. If $\hat{y}$ is to be a possible solution to our problem, we would want $\hat{y}(x_0) = \hat{y}(x_1) = 0$, as holds for $y(x)$, so we *assume* $\eta_1$ and $\eta_2$ vanish at $x_0$ and $x_1$.

With this notation, and fixing $y$, we can re-express the integrals in terms of $\epsilon_1$ and $\epsilon_2$:

```math
\begin{align}
F(\epsilon_1, \epsilon_2) &= \int f(x, \hat{y}, \hat{y}') dx =
\int f(x, y + \epsilon_1 \eta_1 + \epsilon_2 \eta_2, y' + \epsilon_1 \eta_1' + \epsilon_2 \eta_2') dx,\\
G(\epsilon_1, \epsilon_2) &= \int g(x, \hat{y}, \hat{y}') dx =
\int g(x, y + \epsilon_1 \eta_1 + \epsilon_2 \eta_2, y' + \epsilon_1 \eta_1' + \epsilon_2 \eta_2') dx. 
-\end{align}
-```
-
-Then our problem is restated as:
-
-```math
-\text{Maximize } F(\epsilon_1, \epsilon_2) \text{ subject to }
-G(\epsilon_1, \epsilon_2) = L.
-```
-
-Now, Lagrange's method can be employed. This will prove fruitful, even though we already know the answer: it is $\epsilon_1 = \epsilon_2 = 0$!
-
-Forging ahead, we compute $\nabla{F}$ and $\lambda \nabla{G}$, equate them, and set $\epsilon_1 = \epsilon_2 = 0$. This will lead to a description of $y$ in terms of $y'$.
-
-Lagrange's method has:
-
-```math
-\frac{\partial{F}}{\partial{\epsilon_1}}(0,0) - \lambda \frac{\partial{G}}{\partial{\epsilon_1}}(0,0) = 0, \text{ and }
-\frac{\partial{F}}{\partial{\epsilon_2}}(0,0) - \lambda \frac{\partial{G}}{\partial{\epsilon_2}}(0,0) = 0.
-```
-
-Computing just the first one, using the chain rule and assuming the derivative and integral may be interchanged, we have:
-
-```math
-\begin{align}
-\frac{\partial{F}}{\partial{\epsilon_1}}
-&= \int \frac{\partial}{\partial{\epsilon_1}}(
-f(x, y + \epsilon_1 \eta_1 + \epsilon_2 \eta_2, y' + \epsilon_1 \eta_1' + \epsilon_2 \eta_2')) dx\\
-&= \int \left(\frac{\partial{f}}{\partial{y}} \eta_1 + \frac{\partial{f}}{\partial{y'}} \eta_1'\right) dx\quad\quad(\text{from }\nabla{f} \cdot \langle 0, \eta_1, \eta_1'\rangle)\\
-&=\int \eta_1 \left(\frac{\partial{f}}{\partial{y}} - \frac{d}{dx}\frac{\partial{f}}{\partial{y'}}\right) dx.
-\end{align}
-```
-
-The last line follows from integration by parts: $\int u'(x) v(x) dx = (u \cdot v)(x)\mid_{x_0}^{x_1} - \int u(x) \frac{d}{dx} v(x) dx = - \int u(x) \frac{d}{dx} v(x) dx$, the boundary term vanishing since $\eta_1 = 0$ at $x_0$ and $x_1$ by assumption. We get:
-
-```math
-0 = \int \eta_1\left(\frac{\partial{f}}{\partial{y}} - \frac{d}{dx}\frac{\partial{f}}{\partial{y'}}\right).
-```
-
-Were $G$ considered similarly, we would find an analogous statement.
Setting $L(x, y, y') = f(x, y, y') - \lambda g(x, y, y')$, the combination of terms gives:
-
-```math
-0 = \int \eta_1\left(\frac{\partial{L}}{\partial{y}} - \frac{d}{dx}\frac{\partial{L}}{\partial{y'}}\right) dx.
-```
-
-Since $\eta_1$ is arbitrary save for its boundary conditions, under smoothness conditions on $L$ this will imply the rest of the integrand *must* be $0$.
-
-That is, if $y(x)$ is a maximizer of $\int_{x_0}^{x_1} f(x, y, y')dx$, is sufficiently smooth over $[x_0, x_1]$, and satisfies the constraint $\int_{x_0}^{x_1} g(x, y, y')dx = K$, then there exists a constant $\lambda$ such that $L = f -\lambda g$ will satisfy:
-
-```math
-\frac{d}{dx}\frac{\partial{L}}{\partial{y'}} - \frac{\partial{L}}{\partial{y}} = 0.
-```
-
-If $\partial{L}/\partial{x} = 0$, this simplifies to the [Beltrami](https://en.wikipedia.org/wiki/Beltrami_identity) identity:
-
-```math
-L - y' \frac{\partial{L}}{\partial{y'}} = C.\quad(\text{Beltrami identity})
-```
-
-
-----
-
-For Dido's problem, $f(x,y,y') = y$ and $g(x, y, y') = \sqrt{1 + y'^2}$, so $L = y - \lambda\sqrt{1 + y'^2}$ will have $0$ partial derivative with respect to $x$. Using the Beltrami identity, and writing the constant of integration as $-C$, we have:
-
-```math
-(y - \lambda\sqrt{1 + y'^2}) + \lambda y' \frac{2y'}{2\sqrt{1 + y'^2}} = -C.
-```
-
-Multiplying through by the denominator and squaring to remove the square root leads to a quadratic equation in $y'^2$, which can be solved to give:
-
-```math
-y' = \frac{dy}{dx} = \sqrt{\frac{\lambda^2 -(y + C)^2}{(y+C)^2}}.
-```
-
-Here is a snippet of `SymPy` code to verify the above:
-
-```julia; hold=true
-@vars y y′ λ C
-ex = Eq(-λ*y′^2/sqrt(1 + y′^2) + λ*sqrt(1 + y′^2), C + y)
-Δ = sqrt(1 + y′^2) / (C+y)
-ex1 = Eq(simplify(ex.lhs()*Δ), simplify(ex.rhs() * Δ))
-ex2 = Eq(ex1.lhs()^2 - 1, simplify(ex1.rhs()^2) - 1)
-```
-
-
-
-Now ``y'`` can be integrated using the substitution $y + C = \lambda \cos\theta$ to give: $-\lambda\int\cos\theta d\theta = x + D$, $D$ some constant.
That is:
-
-```math
-\begin{align}
-x + D &= - \lambda \sin\theta\\
-y + C &= \lambda\cos\theta.
-\end{align}
-```
-
-Squaring gives the equation of a circle: $(x +D)^2 + (y+C)^2 = \lambda^2$.
-
-We center and *rescale* the problem so that $x_0 = -1, x_1 = 1$. Then $L > 2$, as otherwise the rope is too short. From here, we describe the radius and center of the circle.
-
-We have $y=0$ at $x=1$ and $-1$ giving:
-
-```math
-\begin{align}
-(-1 + D)^2 + (0 + C)^2 &= \lambda^2\\
-(+1 + D)^2 + (0 + C)^2 &= \lambda^2.
-\end{align}
-```
-
-Squaring out and solving gives $D=0$, $1 + C^2 = \lambda^2$. That is, the curve is an arc of a circle with radius $\sqrt{1 + C^2}$ centered at $(0, -C)$:
-
-```math
-x^2 + (y + C)^2 = 1 + C^2.
-```
-
-Now to identify $C$ in terms of $L$: $L$ is the length of an arc of a circle of radius $r =\sqrt{1 + C^2}$ and angle $2\theta$, so $L = 2r\theta$. But using the boundary conditions in the equations for $x$ and $y$ gives $\tan\theta = 1/C$, so $L = 2\sqrt{1 + C^2}\tan^{-1}(1/C)$, which can be solved for $C$ provided $L \geq 2$.
-
-
-
-
-
-##### Example: more constraints
-
-Consider now the case of maximizing $f(x,y,z)$ subject to $g(x,y,z)=c$ and $h(x,y,z) = d$. Can something similar be said to characterize points where this occurs? Trying to describe where $g(x,y,z) = c$ and $h(x,y,z)=d$ in general will prove difficult. The easy case would be if the two equations were linear, in which case they would describe planes. Two non-parallel planes intersect in a line. In the general case, imagine the surfaces locally replaced by their tangent planes; their intersection would be a line, and this line would point along the curve given by the intersection of the surfaces formed by the constraints. This line is similar to the tangent line in the ``2``-variable case.
Now if $\nabla{f}$, which points in the direction of greatest increase of $f$, had a non-zero projection onto this line, then moving the point in that direction along the line would increase $f$ while still satisfying the constraints. That is, if there is a non-zero directional derivative, the point is not a maximum.
-
-
-The tangent planes are *orthogonal* to the vectors $\nabla{g}$ and $\nabla{h}$, so this line is parallel to $\nabla{g} \times \nabla{h}$. The condition that $\nabla{f}$ be *orthogonal* to this vector means that $\nabla{f}$ *must* sit in the plane described by $\nabla{g}$ and $\nabla{h}$, the plane of vectors orthogonal to $\nabla{g} \times \nabla{h}$. That is, this condition is needed:
-
-```math
-\nabla{f}(x,y,z) = \lambda_1 \nabla{g}(x,y,z) + \lambda_2 \nabla{h}(x,y,z).
-```
-
-At a point satisfying the above, the tangent "plane" of $f$ is contained in the intersection of the tangent "plane"s to $g$ and $h$.
-
-
-
-----
-
-Consider a curve given through the intersection of two expressions: $g_1(x,y,z) = x^2 + y^2 - z^2 = 0$ and $g_2(x,y,z) = x - 2z = 3$. What is the minimum distance to the origin along this curve?
-
-We have $f(x,y,z) = \text{distance}(\vec{x},\vec{0}) = \sqrt{x^2 + y^2 + z^2}$, subject to the two constraints. As the square root is increasing, we can just as well consider $f(x,y,z) = x^2 + y^2 + z^2$, ignoring the square root. The Lagrange multiplier technique instructs us to look for solutions to:
-
-```math
-\langle 2x, 2y, 2z \rangle = \lambda_1\langle 2x, 2y, -2z\rangle + \lambda_2 \langle 1, 0, -2 \rangle.
-```
-
-Here we use `SymPy`:
-
-```julia;
-@syms z lambda1 lambda2
-g1(x, y, z) = x^2 + y^2 - z^2
-g2(x, y, z) = x - 2z - 3
-fₘ(x,y,z)= x^2 + y^2 + z^2
-Lₘ(x,y,z,lambda1, lambda2) = fₘ(x,y,z) - lambda1*(g1(x,y,z) - 0) - lambda2*(g2(x,y,z) - 0)
-
-∇Lₘ = diff.(Lₘ(x,y,z,lambda1, lambda2), [x, y, z,lambda1, lambda2])
-```
-
-Before trying to solve for $\nabla{L} = \vec{0}$, we see from the second equation that *either* $\lambda_1 = 1$ or $y = 0$. First we solve with $\lambda_1 = 1$:
-
-```julia;
-solve(subs.(∇Lₘ, lambda1 .=> 1))
-```
-
-There are no real solutions. Next, when $y = 0$ we get:
-
-```julia;
-outₘ = solve(subs.(∇Lₘ, y .=> 0))
-```
-
-The two solutions have values yielding the extrema:
-
-```julia;
-[fₘ(d[x], 0, d[z]) for d in outₘ]
-```
-
-## Taylor's theorem
-
-Taylor's theorem for a univariate function states that if $f$ has $k+1$ derivatives in an open interval around $a$ and $f^{(k)}$ is continuous on the closed interval from $a$ to $x$, then:
-
-```math
-f(x) = \sum_{j=0}^k \frac{f^{(j)}(a)}{j!} (x-a)^j + R_k(x),
-```
-
-where $R_k(x) = \frac{f^{(k+1)}(\xi)}{(k+1)!}(x-a)^{k+1}$ for some $\xi$ between $a$ and $x$.
-
-This theorem can be generalized to scalar functions of several variables, but the notation can be cumbersome.
-Following [Folland](https://sites.math.washington.edu/~folland/Math425/taylor2.pdf) we use *multi-index* notation. Suppose $f:R^n \rightarrow R$, and let $\alpha=(\alpha_1, \alpha_2, \dots, \alpha_n)$. Then define the following notation:
-
-```math
-\begin{align*}
-|\alpha| &= \alpha_1 + \cdots + \alpha_n, \\
-\alpha! &= \alpha_1!\alpha_2!\cdot\cdots\cdot\alpha_n!, \\
-\vec{x}^\alpha &= x_1^{\alpha_1}x_2^{\alpha_2}\cdots x_n^{\alpha_n}, \\
-\partial^\alpha f &= \partial_1^{\alpha_1}\partial_2^{\alpha_2}\cdots \partial_n^{\alpha_n} f \\
-& = \frac{\partial^{|\alpha|}f}{\partial x_1^{\alpha_1} \partial x_2^{\alpha_2} \cdots \partial x_n^{\alpha_n}}.
-\end{align*}
-```
-
-This notation makes many formulas from one dimension carry over to higher dimensions.
For example, the binomial theorem says:
-
-```math
-(a+b)^n = \sum_{k=0}^n \frac{n!}{k!(n-k)!}a^kb^{n-k},
-```
-
-and this becomes:
-
-```math
-(x_1 + x_2 + \cdots + x_n)^k = \sum_{|\alpha|=k} \frac{k!}{\alpha!} \vec{x}^\alpha.
-```
-
-Taylor's theorem then becomes:
-
-If $f: R^n \rightarrow R$ is sufficiently smooth ($C^{k+1}$) on an open convex set $S$ about $\vec{a}$ then if $\vec{a}$ and $\vec{a}+\vec{h}$ are in $S$,
-```math
-f(\vec{a} + \vec{h}) = \sum_{|\alpha| \leq k}\frac{\partial^\alpha f(\vec{a})}{\alpha!}\vec{h}^\alpha + R_{\vec{a},k}(\vec{h}),
-```
-where $R_{\vec{a},k}(\vec{h}) = \sum_{|\alpha|=k+1}\frac{\partial^\alpha f(\vec{a} + c\vec{h})}{\alpha!} \vec{h}^\alpha$ for some $c$ in $(0,1)$.
-
-##### Example
-
-The elegant notation masks what can be complicated expressions. Consider the simple case $f:R^2 \rightarrow R$ and $k=2$. Then this says:
-
-```math
-\begin{align*}
-f(x + dx, y+dy) &= f(x, y) + \frac{\partial f}{\partial x} dx + \frac{\partial f}{\partial y} dy \\
-&+ \frac{\partial^2 f}{\partial x^2} \frac{dx^2}{2} + 2\frac{\partial^2 f}{\partial x\partial y} \frac{dx dy}{2}\\
-&+ \frac{\partial^2 f}{\partial y^2} \frac{dy^2}{2} + R_{\langle x, y \rangle, k}(\langle dx, dy \rangle).
-\end{align*}
-```
-
-Using $\nabla$ for the gradient and $H$ for the Hessian, with $\vec{x} = \langle x, y \rangle$ and $d\vec{x} = \langle dx, dy \rangle$, this can be expressed as:
-
-```math
-f(\vec{x} + d\vec{x}) = f(\vec{x}) + \nabla{f} \cdot d\vec{x} + \frac{1}{2} d\vec{x} \cdot (H d\vec{x}) + R_{\vec{x}, k}(d\vec{x}).
-```
-
-As for $R$, the full term involves terms for $\alpha = (3,0), (2,1), (1,2)$, and $(0,3)$.
Using $\vec{a} = \langle x, y\rangle$ and $\vec{h}=\langle dx, dy\rangle$: - -```math -\frac{\partial^3 f(\vec{a}+c\vec{h})}{\partial x^3} \frac{dx^3}{3!}+ -\frac{\partial^3 f(\vec{a}+c\vec{h})}{\partial x^2\partial y} \frac{dx^2 dy}{2!1!} + -\frac{\partial^3 f(\vec{a}+c\vec{h})}{\partial x\partial y^2} \frac{dxdy^2}{1!2!} + -\frac{\partial^3 f(\vec{a}+c\vec{h})}{\partial y^3} \frac{dy^3}{3!}. -``` - -The exact answer is usually not as useful as the bound: $|R| \leq M/(k+1)! \|\vec{h}\|^{k+1}$, for some finite constant $M$. - - -##### Example - -We can encode multiindices using `SymPy`. The basic definitions are fairly straightforward using `zip` to pair variables with components of $\alpha$. We define a new type so that we can overload the familiar notation: - -```julia; -struct MultiIndex - alpha::Vector{Int} - end -Base.show(io::IO, α::MultiIndex) = println(io, "α = ($(join(α.alpha, ", ")))") - -## |α| = α_1 + ... + α_m -Base.length(α::MultiIndex) = sum(α.alpha) - -## factorial(α) computes α! -Base.factorial(α::MultiIndex) = prod(factorial(Sym(a)) for a in α.alpha) - -## x^α = x_1^α_1 * x_2^α^2 * ... * x_n^α_n -import Base: ^ -^(x, α::MultiIndex) = prod(u^a for (u,a) in zip(x, α.alpha)) - -## ∂^α(ex) = ∂_1^α_1 ∘ ∂_2^α_2 ∘ ... ∘ ∂_n^α_n (ex) -partial(ex::SymPy.SymbolicObject, α::MultiIndex, vars=free_symbols(ex)) = diff(ex, zip(vars, α.alpha)...) -``` - - -```julia; -@syms w -alpha = MultiIndex([1,2,1,3]) -length(alpha) # 1 + 2 + 1 + 3=7 -[1,2,3,4]^alpha -exₜ = x^3 * cos(w*y*z) -partial(exₜ, alpha, [w,x,y,z]) -``` - - -The remainder term needs to know information about sets like $|\alpha| =k$. This is a combinatoric problem, even to identify the length. Here we define an iterator to iterate over all possible MultiIndexes. This is low level, and likely could be done in a much better style, so shouldn't be parsed unless there is curiosity. It manually chains together iterators. 
- -```julia; -struct MultiIndices - n::Int - k::Int -end - -function Base.length(as::MultiIndices) - n,k = as.n, as.k - n == 1 && return 1 - sum(length(MultiIndices(n-1, j)) for j in 0:k) # recursively identify length -end - -function Base.iterate(alphas::MultiIndices) - k, n = alphas.k, alphas.n - n == 1 && return ([k],(0, MultiIndices(0,0), nothing)) - - m = zeros(Int, n) - m[1] = k - betas = MultiIndices(n-1, 0) - stb = iterate(betas) - st = (k, MultiIndices(n-1, 0), stb) - return (m, st) -end - -function Base.iterate(alphas::MultiIndices, st) - - st == nothing && return nothing - k,n = alphas.k, alphas.n - k == 0 && return nothing - n == 1 && return nothing - - # can we iterate the next on - bk, bs, stb = st - - if stb==nothing - bk = bk-1 - bk < 0 && return nothing - bs = MultiIndices(bs.n, bs.k+1) - val, stb = iterate(bs) - return (vcat(bk,val), (bk, bs, stb)) - end - - resp = iterate(bs, stb) - if resp == nothing - bk = bk-1 - bk < 0 && return nothing - bs = MultiIndices(bs.n, bs.k+1) - val, stb = iterate(bs) - return (vcat(bk, val), (bk, bs, stb)) - end - - val, stb = resp - return (vcat(bk, val), (bk, bs, stb)) - -end -``` - -This returns a vector, not a `MultiIndex`. Here we get all multiindices in two variables of size $3$ - -```julia; -collect(MultiIndices(2, 3)) -``` - -To get all of size $3$ or less, we could do something like this: - -```julia; -union((collect(MultiIndices(2, i)) for i in 0:3)...) -``` - - -To see the computational complexity. 
Suppose we had $3$ variables and were interested in the error for order $4$:
-
-```julia;
-k = 4
-length(MultiIndices(3, k+1))
-```
-
-Finally, to see how compact the notation is: for $f:R^3 \rightarrow R$, the third-order Taylor polynomial expands to ``20`` terms, as follows:
-
-```julia; hold=true
-@syms F() a[1:3] dx[1:3]
-
-sum(partial(F(a...), α, a) / factorial(α) * dx^α for k in 0:3 for α in MultiIndex.(MultiIndices(3, k))) # 3rd order
-```
-
-
-## Questions
-
-###### Question
-
-Let $f(x,y) = \sqrt{x + y}$. Use the tangent plane approximation to estimate $f(2.1, 2.2)$.
-
-```julia; hold=true; echo=false
-f(x,y) = sqrt(x + y)
-f(v) = f(v...)
-pt = [2,2]
-dxdy = [.1, .2]
-val = f(pt) + dot(ForwardDiff.gradient(f, pt), dxdy)
-numericq(val)
-```
-
-###### Question
-
-Let $f(x,y,z) = xy + yz + zx$. Using a *linear approximation*, estimate $f(1.1, 1.0, 0.9)$.
-
-```julia; hold=true; echo=false
-f(x,y,z) = x*y + y*z + z*x
-f(v) = f(v...)
-pt = [1,1,1]
-dx = [0.1, 0.0, -0.1]
-val = f(pt) + ∇(f)(pt) ⋅ dx
-numericq(val)
-```
-
-
-###### Question
-
-Let $f(x,y,z) = xy + yz + zx - 3$. What equation describes the tangent plane at $(1,1,1)$?
-
-```julia; hold=true; echo=false
-f(x,y,z) = x*y + y*z + z*x - 3
-f(v) = f(v...)
-pt = [1,1,1]
-n = ∇(f)(pt)
-d = dot(n, pt)
-choices = [
-    raw"`` x + y + z = 3``",
-    raw"`` 2x + y - 2z = 1``",
-    raw"`` x + 2y + 3z = 6``"
-]
-answ = 1
-radioq(choices, answ)
-```
-
-
-###### Question
-
-([Knill](http://www.math.harvard.edu/~knill/teaching/summer2018/handouts/week4.pdf)) Let $f(x,y) = xy + x^2y + xy^2$.
-
-Find the gradient of $f$:
-
-```julia; hold=true; echo=false
-choices = [
-    raw"`` \langle 2xy + y^2 + y, 2xy + x^2 + x\rangle``",
-    raw"`` y^2 + y, x^2 + x``",
-    raw"`` \langle 2y + y^2, 2x + x^2``"
-]
-answ = 1
-radioq(choices, answ)
-```
-
-Is this the Hessian of $f$?
-
-```math
-\left[\begin{matrix}2 y & 2 x + 2 y + 1\\2 x + 2 y + 1 & 2 x\end{matrix}\right]
-```
-
-```julia; hold=true; echo=false
-yesnoq(true)
-```
-
-The point $(-1/3, -1/3)$ is a solution to $\nabla{f} = \vec{0}$. What is the *determinant*, $d$, of the Hessian at this point?
-
-```julia; hold=true; echo=false
-f(x,y) = x*y + x*y^2 + x^2 * y
-f(v) = f(v...)
-val = det(ForwardDiff.hessian(f, [-1/3, -1/3]))
-numericq(val)
-```
-
-Which is true of $f$ at $(-1/3, -1/3)$:
-
-```julia; hold=true; echo=false
-choices = [
-    L"The function $f$ has a local minimum, as $f_{xx} > 0$ and $d >0$",
-    L"The function $f$ has a local maximum, as $f_{xx} < 0$ and $d >0$",
-    L"The function $f$ has a saddle point, as $d < 0$",
-    L"Nothing can be said, as $d=0$"
-]
-answ = 2
-radioq(choices, answ, keep_order=true)
-```
-
-
-###### Question
-
-([Knill](http://www.math.harvard.edu/~knill/teaching/summer2018/handouts/week4.pdf)) Let the Tutte polynomial be $f(x,y) = x + 2x^2 + x^3 + y + 2xy + y^2$.
-
-Does this accurately find the gradient of $f$?
-
-```julia; hold=true; results="hidden"
-f(x,y) = x + 2x^2 + x^3 + y + 2x*y + y^2
-@syms x::real y::real
-gradf = gradient(f(x,y), [x,y])
-```
-
-```julia; hold=true; echo=false
-yesnoq(true)
-```
-
-How many solutions does this find to $\nabla{f} = \vec{0}$?
-
-```julia; hold=true; results="hidden"
-f(x,y) = x + 2x^2 + x^3 + y + 2x*y + y^2
-@syms x::real y::real
-gradf = gradient(f(x,y), [x,y])
-
-solve(gradf, [x,y])
-```
-
-```julia; hold=true; echo=false
-numericq(2)
-```
-
-The Hessian is found by:
-
-```julia; hold=true;
-f(x,y) = x + 2x^2 + x^3 + y + 2x*y + y^2
-@syms x::real y::real
-gradf = gradient(f(x,y), [x,y])
-
-sympy.hessian(f(x,y), [x,y])
-```
-
-Which is true of $f$ at the critical point $(-2/3, 1/6)$:
-
-```julia; hold=true; echo=false
-choices = [
-    L"The function $f$ has a local minimum, as $f_{xx} > 0$ and $d >0$",
-    L"The function $f$ has a local maximum, as $f_{xx} < 0$ and $d >0$",
-    L"The function $f$ has a saddle point, as $d < 0$",
-    L"Nothing can be said, as $d=0$",
-    L"The test does not apply, as $\nabla{f}$ is not $0$ at this point."
-]
-answ = 3
-radioq(choices, answ, keep_order=true)
-```
-
-Which is true of $f$ at $(0, -1/2)$:
-
-```julia; hold=true; echo=false
-choices = [
-    L"The function $f$ has a local minimum, as $f_{xx} > 0$ and $d >0$",
-    L"The function $f$ has a local maximum, as $f_{xx} < 0$ and $d >0$",
-    L"The function $f$ has a saddle point, as $d < 0$",
-    L"Nothing can be said, as $d=0$",
-    L"The test does not apply, as $\nabla{f}$ is not $0$ at this point."
-]
-answ = 1
-radioq(choices, answ, keep_order=true)
-```
-
-
-Which is true of $f$ at $(1/2, 0)$:
-
-```julia; hold=true; echo=false
-choices = [
-    L"The function $f$ has a local minimum, as $f_{xx} > 0$ and $d >0$",
-    L"The function $f$ has a local maximum, as $f_{xx} < 0$ and $d >0$",
-    L"The function $f$ has a saddle point, as $d < 0$",
-    L"Nothing can be said, as $d=0$",
-    L"The test does not apply, as $\nabla{f}$ is not $0$ at this point."
-]
-answ = 5
-radioq(choices, answ, keep_order=true)
-```
-
-
-
-
-###### Question
-
-(Strang p509) Consider the quadratic function $f(x,y) = ax^2 + 2bxy + cy^2$.
Since the second partial derivative test is essentially done by replacing the function at a critical point by a quadratic function, understanding this $f$ is of some interest. - -Is this the Hessian of $f$? - -```math -\left[ -\begin{array}{} -2a & 2b\\ -2b & 2c -\end{array} -\right] -``` - -```julia; hold=true; echo=false -yesnoq(true) -``` - -Or is this the Hessian of $f$? - -```math -\left[ -\begin{array}{} -2ax & by\\ -bx & 2cy -\end{array} -\right] -``` - -```julia; hold=true; echo=false -yesnoq(false) -``` - -Explain why $ac - b^2$ is of any interest here: - -```julia; hold=true; echo=false -choices =[ - "It is the determinant of the Hessian", - L"It isn't, $b^2-4ac$ is from the quadratic formula" -] -answ = 1 -radioq(choices, answ) -``` - -Which condition on $a$, $b$, and $c$ will ensure a *local maximum*: - -```julia; hold=true; echo=false -choices = [ - L"That $a>0$ and $ac-b^2 > 0$", - L"That $a<0$ and $ac-b^2 > 0$", - L"That $ac-b^2 < 0$" -] -answ = 2 -radioq(choices, answ, keep_order=true) -``` - -Which condition on $a$, $b$, and $c$ will ensure a saddle point? - - -```julia; hold=true; echo=false -choices = [ - L"That $a>0$ and $ac-b^2 > 0$", - L"That $a<0$ and $ac-b^2 > 0$", - L"That $ac-b^2 < 0$" -] -answ = 3 -radioq(choices, answ, keep_order=true) -``` - - -###### Question - -Let $f(x,y) = e^{-x^2 - y^2} (2x^2 + y^2)$. Use Lagrange's method to find the absolute maximum and absolute minimum over $x^2 + y^2 = 3$. - - -Is $\nabla{f}$ given by the following? - -```math -\nabla{f} =2 e^{-x^2 - y^2} \langle x(2 - 2x^2 - y^2), y(1 - 2x^2 - y^2)\rangle. -``` - -```julia; hold=true; echo=false -yesnoq(true) -``` - - -Which vector is orthogonal to the contour line $x^2 + y^2 = 3$? 
- -```julia; hold=true; echo=false -choices = [ - raw"`` \langle 2x, 2y\rangle``", - raw"`` \langle 2x, y^2\rangle``", - raw"`` \langle x^2, 2y \rangle``" -] -answ = 1 -radioq(choices, answ) -``` - -Due to the form of the gradient of the constraint, finding when $\nabla{f} = \lambda \nabla{g}$ is the same as identifying when this ratio $|f_x/f_y|$ is $1$. The following solves for this by checking each point on the constraint: - -```julia; hold=true; -f(x,y) = exp(-x^2-y^2) * (2x^2 + y^2) -f(v) = f(v...) -r(t) = 3*[cos(t), sin(t)] -rat(x) = abs(x[1]/x[2]) - 1 -fn = rat ∘ ∇(f) ∘ r -ts = fzeros(fn, 0, 2pi) -``` - -Using these points, what is the largest value on the boundary? - -```julia; hold=true; echo=false -f(x,y) = exp(-x^2-y^2) * (2x^2 + y^2) -f(v) = f(v...) -r(t) = 3*[cos(t), sin(t)] -rat(x) = abs(x[1]/x[2]) - 1 -fn = rat ∘ ∇(f) ∘ r -ts = fzeros(fn, 0, 2pi) - -val = maximum((f∘r).(ts)) -numericq(val) -``` diff --git a/CwJ/differentiable_vector_calculus/vector_fields.jmd b/CwJ/differentiable_vector_calculus/vector_fields.jmd deleted file mode 100644 index 92fa448..0000000 --- a/CwJ/differentiable_vector_calculus/vector_fields.jmd +++ /dev/null @@ -1,1012 +0,0 @@ -# Functions $R^n \rightarrow R^m$ - -This section uses these add-on packages: - -```julia -using CalculusWithJulia -using Plots -using SymPy -using ForwardDiff -using LinearAlgebra -``` - - -```julia; echo=false; results="hidden" -using CalculusWithJulia.WeaveSupport - -const frontmatter = ( - title = "Functions ``R^n \\rightarrow R^m``", - description = "Calculus with Julia: Functions ``R^n \\rightarrow R^m``", - tags = ["CalculusWithJulia", "differentiable_vector_calculus", "functions ``R^n \\rightarrow R^m``"], -); - -nothing -``` - - -For a scalar function $f: R^n \rightarrow R$, the gradient of $f$, $\nabla{f}$, is a function from $R^n \rightarrow R^n$. Specializing to $n=2$, a function that for each point, $(x,y)$, assigns a vector $\vec{v}$. This is an example of vector field. 
More generally, we could have a [function](https://en.wikipedia.org/wiki/Multivariable_calculus) $f: R^n \rightarrow R^m$, of which we have discussed many already:
-
-| Mapping | Name | Visualize with | Notation |
-|:-----------------------:|:---------------:|:---------------------------:|:--------------------:|
-|$f: R\rightarrow R$ | univariate | familiar graph of function | $f$ |
-|$f: R\rightarrow R^m$ | vector-valued | space curve when n=2 or 3 | $\vec{r}$, $\vec{N}$ |
-|$f: R^n\rightarrow R$ | scalar | a surface when n=2 | $f$ |
-|$F: R^n\rightarrow R^n$ | vector field | a vector field when n=2 | $F$ |
-|$F: R^n\rightarrow R^m$ | multivariable | n=2,m=3 describes a surface | $F$, $\Phi$ |
-
-
-After an example where the use of a multivariable function is a necessity, we discuss differentiation for multivariable functions in general.
-
-## Vector fields
-
-We have seen that the gradient of a scalar function, $f:R^2 \rightarrow R$, takes a point in $R^2$ and associates a vector in $R^2$. As such, $\nabla{f}:R^2 \rightarrow R^2$ is a vector field. A vector field can be visualized by sampling a region and representing the field at those points. The details, as previously mentioned, are in the `vectorfieldplot` function of `CalculusWithJulia`.
-
-```julia;
-F(u,v) = [-v, u]
-vectorfieldplot(F, xlim=(-5,5), ylim=(-5,5), nx=10, ny=10)
-```
-
-The optional arguments `nx=10` and `ny=10` determine the number of grid points at which a vector is plotted. These vectors are scaled so as not to overlap.
-
-
-
-Vector field plots are useful for visualizing velocity fields, where a
-velocity vector is associated to each point; or streamlines, curves
-whose tangents follow the velocity vector of a flow. Vector
-fields are used in physics to model the electric field and the
-magnetic field. These are used to describe forces on objects within
-the field.
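-
-As a small numeric check (a sketch, not part of the original exposition; the sample point is arbitrary), the field $F(u,v) = \langle -v, u\rangle$ plotted above is everywhere orthogonal to the position vector, which is why its streamlines are circles about the origin:
-
-```julia;
-using LinearAlgebra
-F(u, v) = [-v, u]
-pt = [1.0, 2.0]       # an arbitrary sample point
-dot(F(pt...), pt)     # 0.0: the field is tangent to the circle through pt
-```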
-
-
-
-A grid of sampled vectors is one way to illustrate a vector field, but there is an alternative using field lines. Like Euler's method, imagine starting at some point, $\vec{r}$, in $R^3$. The field at that point is a vector indicating a direction of motion. Follow that vector for some infinitesimal amount, $d\vec{r}$, and repeat from the new point. The resulting field curve satisfies $\vec{r}'(t) = F(\vec{r}(t))$. Field curves show only direction; to indicate magnitude at a point, the convention is to draw denser lines where the field is stronger.
-
-```julia; hold=true; echo=false
-#out download("https://upload.wikimedia.org/wikipedia/commons/thumb/f/ff/VFPt_Earths_Magnetic_Field_Confusion.svg/320px-VFPt_Earths_Magnetic_Field_Confusion.svg.png")
-#cp(out, "figures/magnetic-field.png")
-
-imgfile = "figures/magnetic-field.png"
-caption = """
-Illustration of the magnetic field of the earth using field lines to indicate the field. From
-[Wikipedia](https://en.wikipedia.org/wiki/Magnetic_field).
-"""
-ImageFile(:differentiable_vector_calculus, imgfile, caption)
-```
-
----
-
-Vector fields are also useful for other purposes, such as
-transformations, examples of which are a rotation or the conversion
-from polar to rectangular coordinates.
-
-For transformations, a useful visualization is to plot curves where
-one variable is fixed. Consider the transformation from polar
-coordinates to Cartesian coordinates $F(r, \theta) = r
-\langle\cos(\theta),\sin(\theta)\rangle$. The following plot will show
-in blue fixed values of $r$ (circles) and in red fixed values of
-$\theta$ (rays).
- -rs = range(0, 2, length=5) -thetas = range(0, pi/2, length=9) - -plot(legend=false, aspect_ratio=:equal) -plot!(unzip(F.(rs, thetas'))..., color=:red) -plot!(unzip(F.(rs', thetas))..., color=:blue) - -pt = [1, pi/4] -J = ForwardDiff.jacobian(F, pt) -arrow!(F(pt...), J[:,1], linewidth=5, color=:red) -arrow!(F(pt...), J[:,2], linewidth=5, color=:blue) -``` - -To the plot, we added the partial derivatives with respect to $r$ (in red) and with respect to $\theta$ (in blue). These are found with the soon-to-be discussed Jacobian. From the graph, you can see that these vectors are tangent vectors to the drawn curves. - - - -## Parametrically defined surfaces - -For a one-dimensional curve we have several descriptions. For example, as the graph of a function $y=f(x)$; as a parametrically defined curve $\vec{r}(t) = \langle x(t), y(t)\rangle$; or as a level curve of a scalar function $f(x,y) = c$. - -For two-dimensional surfaces in three dimensions, we have discussed describing these in terms of a function $z = f(x,y)$ and as level curves of scalar functions: $c = f(x,y,z)$. They can also be described parametrically. - -We pick a familiar case, to make this concrete: the unit sphere in $R^3$. We have - -* It is described by two functions through $f(x,y) = \pm \sqrt{1 - (x^2 + y^2)}$. - -* It is described by $f(x,y,z) = 1$, where $f(x,y,z) = x^2 + y^2 + z^2$. - -* It can be described in terms of [spherical coordinates](https://en.wikipedia.org/wiki/Spherical_coordinate_system): - -```math -\Phi(\theta, \phi) = \langle \sin(\phi)\cos(\theta), \sin(\phi)\sin(\theta), \cos(\phi) \rangle, -``` - -with $\theta$ the *azimuthal* angle and $\phi$ the polar angle (measured down from the $z$ axis). - -The function $\Phi$ takes $R^2$ into $R^3$, so is a multivariable function. - -When a surface is described by a function, $z=f(x,y)$, then the gradient points (in the $x-y$ plane) in the direction of greatest increase of $f$. The vector $\langle -f_x, -f_y, 1\rangle$ is a normal. 
-
-When a surface is described as a level surface, $f(x,y,z) = c$, then the gradient is *normal* to the surface.
-
-When a surface is described parametrically, there is no "gradient." The *partial* derivatives are of interest, e.g., $\partial{\Phi}/\partial{\theta}$ and $\partial{\Phi}/\partial{\phi}$, vectors defined componentwise. These will lie in the tangent plane of the surface, as they can be viewed as tangent vectors for parametrically defined curves on the surface. Their cross product will be *normal* to the surface. The magnitude of the cross product, which reflects the angle between the two partial derivatives, will be informative as to the surface area.
-
-
-### Plotting parametrized surfaces in `Julia`
-
-
-
-Consider the parametrically described surface above. How would it be
-plotted? Using the `Plots` package, the process is quite similar to how
-a surface described by a function is plotted, but the $z$ values must
-be computed prior to plotting.
-
-
-
-
-Here we define the parameterization using functions to represent each component:
-
-```julia
-X(theta,phi) = sin(phi) * cos(theta)
-Y(theta,phi) = sin(phi) * sin(theta)
-Z(theta,phi) = cos(phi)
-```
-
-Then:
-
-```julia;
-thetas = range(0, stop=pi/2, length=50)
-phis = range(0, stop=pi, length=50)
-
-xs = [X(theta, phi) for theta in thetas, phi in phis]
-ys = [Y(theta, phi) for theta in thetas, phi in phis]
-zs = [Z(theta, phi) for theta in thetas, phi in phis]
-
-surface(xs, ys, zs) ## see note
-```
-
-!!! note
-    Only *some* backends for `Plots` will produce this type of plot. Both `plotly()` and `pyplot()` will, but not `gr()`.
-
-
-!!! note
-    PyPlot can be used directly to make these surface plots: `import PyPlot; PyPlot.plot_surface(xs, ys, zs)`.
- -Instead of the comprehension, broadcasting can be used - -```julia; -surface(X.(thetas, phis'), Y.(thetas, phis'), Z.(thetas, phis')) -``` - - - -If the parameterization is presented as a function, broadcasting can be used to succintly plot - -```julia -Phi(theta, phi) = [X(theta, phi), Y(theta, phi), Z(theta, phi)] - -surface(unzip(Phi.(thetas, phis'))...) -``` - - -The partial derivatives of each component, $\partial{\Phi}/\partial{\theta}$ and $\partial{\Phi}/\partial{\phi}$, can be computed directly: - -```math -\begin{align*} -\partial{\Phi}/\partial{\theta} &= \langle -\sin(\phi)\sin(\theta), \sin(\phi)\cos(\theta),0 \rangle,\\ -\partial{\Phi}/\partial{\phi} &= \langle \cos(\phi)\cos(\theta), \cos(\phi)\sin(\theta), -\sin(\phi) \rangle. -\end{align*} -``` - -Using `SymPy`, we can compute through: - - -```julia; -@syms theta phi -out = [diff.(Phi(theta, phi), theta) diff.(Phi(theta, phi), phi)] -``` - - -At the point $(\theta, \phi) = (\pi/12, \pi/6)$ this evaluates to the following. - -```julia; -subs.(out, theta.=> PI/12, phi.=>PI/6) .|> N -``` - -We found numeric values, so that we can -compare to the numerically identical values computed by the `jacobian` function from `ForwardDiff`: - -```julia; -pt = [pi/12, pi/6] -out₁ = ForwardDiff.jacobian(v -> Phi(v...), pt) -``` - -What this function computes exactly will be described next, but here we visualize the partial derivatives and see they lie in the tangent plane at the point: - -```julia; hold=true -us, vs = range(0, pi/2, length=25), range(0, pi, length=25) -xs, ys, zs = unzip(Phi.(us, vs')) -surface(xs, ys, zs, legend=false) -arrow!(Phi(pt...), out₁[:,1], linewidth=3) -arrow!(Phi(pt...), out₁[:,2], linewidth=3) -``` - - - -## The total derivative - -Informally, the [total derivative](https://en.wikipedia.org/wiki/Total_derivative) at $a$ is the best linear approximation of the value of a function, $F$, near $a$ with respect to its arguments. If it exists, denote it $dF_a$. 

For a function $F: R^n \rightarrow R^m$, the total derivative at $\vec{a}$ (a point or vector in $R^n$) is a matrix $J$ (a linear transformation) taking vectors in $R^n$ and returning, under multiplication, vectors in $R^m$ (this matrix will be $m \times n$), such that for some neighborhood of $\vec{a}$, we have:

```math
\lim_{\vec{x} \rightarrow \vec{a}} \frac{\|F(\vec{x}) - F(\vec{a}) - J\cdot(\vec{x}-\vec{a})\|}{\|\vec{x} - \vec{a}\|} = 0.
```

(That is, $\|F(\vec{x}) - F(\vec{a}) - J\cdot(\vec{x}-\vec{a})\|=o(\|\vec{x}-\vec{a}\|)$.)

If for some $J$ the above holds, the function $F$ is said to be totally differentiable, and the matrix $J =J_F=dF_a$ is the total derivative.

For a multivariable function $F:R^n \rightarrow R^m$, we may express the function in vector-valued form $F(\vec{x}) = \langle f_1(\vec{x}), f_2(\vec{x}),\dots,f_m(\vec{x})\rangle$, each component a scalar function. Then, if the total derivative exists, it can be expressed by the [Jacobian](https://en.wikipedia.org/wiki/Jacobian_matrix_and_determinant):

```math
J = \left[
\begin{align*}
\frac{\partial f_1}{\partial x_1} &\quad \frac{\partial f_1}{\partial x_2} &\dots&\quad\frac{\partial f_1}{\partial x_n}\\
\frac{\partial f_2}{\partial x_1} &\quad \frac{\partial f_2}{\partial x_2} &\dots&\quad\frac{\partial f_2}{\partial x_n}\\
&&\vdots&\\
\frac{\partial f_m}{\partial x_1} &\quad \frac{\partial f_m}{\partial x_2} &\dots&\quad\frac{\partial f_m}{\partial x_n}
\end{align*}
\right].
```

This may also be viewed as:

```math
J = \left[
\begin{align*}
&\nabla{f_1}'\\
&\nabla{f_2}'\\
&\quad\vdots\\
&\nabla{f_m}'
\end{align*}
\right] =
\left[
\frac{\partial{F}}{\partial{x_1}}\quad
\frac{\partial{F}}{\partial{x_2}} \cdots
\frac{\partial{F}}{\partial{x_n}}
\right].
```

These represent, respectively, a matrix of $m$ row vectors, each with $n$ components, and a matrix of $n$ column vectors, each with $m$ components.
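
The row-vector description can be checked numerically. The following is a sketch, with an $F:R^2 \rightarrow R^3$ of our own choosing for illustration, using the `ForwardDiff` package (employed later in this section); a row of `ForwardDiff.jacobian` is compared with the gradient of the corresponding component:

```julia
using ForwardDiff

F₀(v) = [v[1]^2 * v[2], sin(v[1]) + v[2], v[1] - v[2]^3]   # illustrative F: R² → R³

v₀ = [1.0, 2.0]
J = ForwardDiff.jacobian(F₀, v₀)   # a 3 × 2 matrix

# the first row of J agrees with the gradient of the first component, f₁(x,y) = x²y
J[1, :] ≈ ForwardDiff.gradient(v -> F₀(v)[1], v₀)   # true; both are [4.0, 1.0]
```

The same check works row by row, reflecting that the Jacobian stacks the (transposed) gradients of the component functions.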
- ----- - -After specializing the total derivative to the cases already discussed, we have: - -* Univariate functions. Here $f'(t)$ is also univariate. Identifying $J$ with the $1 \times 1$ matrix with component $f'(t)$, then the total derivative is just a restatement of the derivative existing. - -* Vector-valued functions $\vec{f}(t) = \langle f_1(t), f_2(t), \dots, f_m(t) \rangle$, each component univariate. Then the derivative, $\vec{f}'(t) = \langle \frac{df_1}{dt}, \frac{df_2}{dt}, \dots, \frac{df_m}{dt} \rangle$. The total derivative in this case, is a a $m \times 1$ vector of partial derivatives, and since there is only $1$ variable, would be written without partials. So the two agree. - -* Scalar functions $f(\vec{x}) = a$ of type $R^n \rightarrow R$. The -definition of differentiability for $f$ involved existence of the -partial derivatives and moreover, the fact that a limit like the above -held with $\nabla{f}(C) \cdot \vec{h}$ in place of -$J\cdot(\vec{x}-\vec{a})$. Here $\vec{h}$ and $\vec{x}-\vec{a}$ are -vectors in $R^n$. Were the dot product in $\nabla{f}(C) \cdot \vec{h}$ -expressed in matrix multiplication we would have for this case a $1 -\times n$ matrix of the correct form: - -```math -J = [\nabla{f}']. -``` - - -* For $f:R^2 \rightarrow R$, the Hessian matrix, was the matrix of ``2``nd partial derivatives. 
This may be viewed as the total derivative of the gradient function, $\nabla{f}$:

```math
\text{Hessian} =
\left[
\begin{align*}
\frac{\partial^2 f}{\partial x^2} &\quad \frac{\partial^2 f}{\partial x \partial y}\\
\frac{\partial^2 f}{\partial y \partial x} &\quad \frac{\partial^2 f}{\partial y^2}
\end{align*}
\right]
```

This is equivalent to:
```math
\left[
\begin{align*}
\frac{\partial \frac{\partial f}{\partial x}}{\partial x} &\quad \frac{\partial \frac{\partial f}{\partial x}}{\partial y}\\
\frac{\partial \frac{\partial f}{\partial y}}{\partial x} &\quad \frac{\partial \frac{\partial f}{\partial y}}{\partial y}\\
\end{align*}
\right].
```


As such, the total derivative is a generalization of what we have previously discussed.


## The chain rule

If $G:R^k \rightarrow R^n$ and $F:R^n \rightarrow R^m$, then the composition $F\circ G$ takes $R^k \rightarrow R^m$. If all three functions are totally differentiable, then a chain rule will hold (total derivative of $F\circ G$ at point $a$):

```math
d(F\circ G)_a = dF_{G(a)} \cdot dG_a.
```

If correct, this has the same formulation as the chain rule for the univariate case: derivative of outer at the inner *times* the derivative of the inner.

First we check that the dimensions are correct: $dF_{G(a)}$ (the total derivative of $F$ at the point $G(a)$) is an $m \times n$ matrix and $dG_a$ (the total derivative of $G$ at the point $a$) is an $n \times k$ matrix. The product of an $m \times n$ matrix with an $n \times k$ matrix is defined, and is an $m \times k$ matrix, as is $d(F \circ G)_a$.

The proof that the formula is correct uses the definition of totally differentiable written as

```math
F(b + \vec{h}) - F(b) - dF_b\cdot \vec{h} = \epsilon(\vec{h}) \vec{h},
```

where $\epsilon(\vec{h}) \rightarrow 0$ as $\vec{h} \rightarrow \vec{0}$.
- -We have, using this for *both* $F$ and $G$: - -```math -\begin{align*} -F(G(a + \vec{h})) - F(G(a)) &= -F(G(a) + (dG_a \cdot \vec{h} + \epsilon_G \vec{h})) - F(G(a))\\ -&= F(G(a)) + dF_{G(a)} \cdot (dG_a \cdot \vec{h} + \epsilon_G \vec{h}) \\ -&+ \quad\epsilon_F (dG_a \cdot \vec{h} + \epsilon_G \vec{h}) - F(G(a))\\ -&= dF_{G(a)} \cdot (dG_a \cdot \vec{h}) + dF_{G(a)} \cdot (\epsilon_G \vec{h}) + \epsilon_F (dG_a \cdot \vec{h}) + (\epsilon_F \cdot \epsilon_G\vec{h}) -\end{align*} -``` - -The last line uses the linearity of $dF$ to isolate $dF_{G(a)} \cdot (dG_a \cdot \vec{h})$. Factoring out $\vec{h}$ and taking norms gives: - - -```math -\begin{align*} -\frac{\| F(G(a+\vec{h})) - F(G(a)) - dF_{G(a)}dG_a \cdot \vec{h} \|}{\| \vec{h} \|} &= -\frac{\| dF_{G(a)}\cdot(\epsilon_G\vec{h}) + \epsilon_F (dG_a\cdot \vec{h}) + (\epsilon_F\cdot\epsilon_G\vec{h}) \|}{\| \vec{h} \|} \\ -&\leq \| dF_{G(a)}\cdot\epsilon_G + \epsilon_F (dG_a) + \epsilon_F\cdot\epsilon_G \|\frac{\|\vec{h}\|}{\| \vec{h} \|}\\ -&\rightarrow 0. -\end{align*} -``` - - - -### Examples - -Our main use of the total derivative will be the change of variables in integration. - - -##### Example: polar coordinates - -A point $(a,b)$ in the plane can be described in polar coordinates by a radius $r$ and polar angle $\theta$. We can express this formally by $F:(a,b) \rightarrow (r, \theta)$ with - -```math -r(a,b) = \sqrt{a^2 + b^2}, \quad -\theta(a,b) = \tan^{-1}(b/a), -``` - -the latter assuming the point is in quadrant I or IV (though `atan(y,x)` will properly handle the other quadrants). The Jacobian of this transformation may be found with - -```julia; -@syms a::real b::real - -rⱼ = sqrt(a^2 + b^2) -θⱼ = atan(b/a) - -Jac = Sym[diff.(rⱼ, [a,b])'; # [∇f_1'; ∇f_2'] - diff.(θⱼ, [a,b])'] - -simplify.(Jac) -``` - -`SymPy` array objects have a `jacobian` method to make this easier to do. 
The calling style is Python-like, using `object.method(...)`:

```julia;
[rⱼ, θⱼ].jacobian([a, b])
```


The determinant, of geometric interest, will be

```julia;
det(Jac) |> simplify
```


The determinant is of interest, as the linear mapping represented by the Jacobian changes the area of the associated coordinate vectors. The determinant describes how this area changes, as a multiplying factor.


##### Example: Spherical Coordinates

In ``3`` dimensions a point can be described by (among other ways):

* Cartesian coordinates: three coordinates relative to the $x$, $y$, and $z$ axes as $(a,b,c)$.

* Spherical coordinates: a radius, $r$, an azimuthal angle $\theta$, and a polar angle
$\phi$ measured down from the $z$ axis. (We use the mathematics naming convention; the physics one has $\phi$ and $\theta$ reversed.)

* Cylindrical coordinates: a radius, $r$, a polar angle $\theta$, and height $z$.


Some mappings are:

| Cartesian ($x$, $y$, $z$) | Spherical ($r$, $\theta$, $\phi$) | Cylindrical ($r$, $\theta$, $z$) |
|:-------------------:|:---------------------------------:|:------------------------------------:|
| (1, 1, 0)           | $(\sqrt{2}, \pi/4, \pi/2)$        | $(\sqrt{2},\pi/4, 0)$                |
| (0, 1, 1)           | $(\sqrt{2}, \pi/2, \pi/4)$        | $(1, \pi/2, 1)$                      |

----

Formulas can be found to convert between the different systems; here are a few written as multivariable functions:

```julia;
function spherical_from_cartesian(x,y,z)
    r = sqrt(x^2 + y^2 + z^2)
    theta = atan(y/x)   # assumes x > 0; `atan(y, x)` covers all quadrants
    phi = acos(z/r)
    [r, theta, phi]
end

function cartesian_from_spherical(r, theta, phi)
    x = r*sin(phi)*cos(theta)
    y = r*sin(phi)*sin(theta)
    z = r*cos(phi)
    [x, y, z]
end

function cylindrical_from_cartesian(x, y, z)
    r = sqrt(x^2 + y^2)
    theta = atan(y/x)   # assumes x > 0, as above
    z = z
    [r, theta, z]
end

function cartesian_from_cylindrical(r, theta, z)
    x = r*cos(theta)
    y = r*sin(theta)
    z = z
    [x, y, z]
end

spherical_from_cartesian(v) = spherical_from_cartesian(v...)
cartesian_from_spherical(v) = cartesian_from_spherical(v...)
cylindrical_from_cartesian(v) = cylindrical_from_cartesian(v...)
cartesian_from_cylindrical(v) = cartesian_from_cylindrical(v...)
```

The Jacobian of a transformation can be found from these conversions. For example, the conversion from spherical to cartesian would have Jacobian computed by:

```julia;
@syms r::real

ex1 = cartesian_from_spherical(r, theta, phi)
J1 = ex1.jacobian([r, theta, phi])
```

This has determinant:

```julia;
det(J1) |> simplify
```

There is no function to convert from spherical to cylindrical above, but clearly one can be made by *composition*:

```julia;
cylindrical_from_spherical(r, theta, phi) =
    cylindrical_from_cartesian(cartesian_from_spherical(r, theta, phi)...)
cylindrical_from_spherical(v) = cylindrical_from_spherical(v...)
```

From this composition, we could compute the Jacobian directly, as with:


```julia;
ex2 = cylindrical_from_spherical(r, theta, phi)
J2 = ex2.jacobian([r, theta, phi])
```

Now let us see that this last expression could have been found by the *chain rule*. To do this we need to find the Jacobian of each function; evaluate them at the proper places; and, finally, multiply the matrices. The `J1` object, found above, provides one of the needed Jacobians.
We now need to find that of `cylindrical_from_cartesian`:

```julia;
@syms x::real y::real z::real
ex3 = cylindrical_from_cartesian(x, y, z)
J3 = ex3.jacobian([x,y,z])
```

The chain rule is not simply `J3 * J1` in the notation above, as the `J3` matrix must be evaluated at "`G(a)`", which is `ex1` from above:

```julia;
J3_Ga = subs.(J3, x => ex1[1], y => ex1[2], z => ex1[3]) .|> simplify # the dots are important
```

The chain rule now says this product should be equivalent to `J2` above:

```julia;
J3_Ga * J1
```

The two are equivalent after simplification, as seen here:

```julia;
J3_Ga * J1 - J2 .|> simplify
```


##### Example

The above examples were done symbolically. Performing the calculation numerically is quite similar. The `ForwardDiff` package has a gradient function to find the gradient at a point. The `CalculusWithJulia` package extends this to take a gradient of a function and return a function, also called `gradient`. This is defined along the lines of:

```julia; eval=false
gradient(f::Function) = x -> ForwardDiff.gradient(f, x)
```

(though more flexibly, as either a vector or separate arguments can be used).

With this, defining a Jacobian function *could* be done like:

```julia; eval=false
function Jacobian(F, x)
    n = length(F(x...))
    grads = [gradient(x -> F(x...)[i])(x) for i in 1:n]
    vcat(grads'...)
end
```

But, like `SymPy`, `ForwardDiff` provides a `jacobian` function directly, so we will use that; it requires a function taking a vector argument and is called as `ForwardDiff.jacobian`. (The `ForwardDiff` package does not export its methods; they are qualified using the module name.)

Using the above functions, we can verify the last example at a point:


```julia;
rtp = [1, pi/3, pi/4]
ForwardDiff.jacobian(cylindrical_from_spherical, rtp)
```

The chain rule gives the same answer up to roundoff error:

```julia;
ForwardDiff.jacobian(cylindrical_from_cartesian, cartesian_from_spherical(rtp)) * ForwardDiff.jacobian(cartesian_from_spherical, rtp)
```


##### Example: The Inverse Function Theorem

For a change of variable problem, $F:R^n \rightarrow R^n$, the determinant of the Jacobian quantifies how volumes get modified under the transformation. When this determinant is *non*zero, then more can be said. The [Inverse Function Theorem](https://en.wikipedia.org/wiki/Inverse_function_theorem) states

> if $F$ is a continuously differentiable function from an open set of $R^n$ into $R^n$ and the total derivative is invertible at a point $p$ (i.e., the Jacobian determinant of $F$ at $p$ is non-zero), then $F$ is invertible near $p$. That is, an inverse function to $F$ is defined on some neighborhood of $q$, where $q=F(p)$. Further, $F^{-1}$ will be continuously differentiable at $q$ with $J_{F^{-1}}(q) = [J_F(p)]^{-1}$, the latter being the matrix inverse. Taking determinants, $\det(J_{F^{-1}}(q)) = 1/\det(J_F(p))$.


Assuming $F^{-1}$ exists, we can verify the last part from the chain rule, in an identical manner to the univariate case. Starting with $F^{-1} \circ F$ being the identity, we would have:

```math
J_{F^{-1}\circ F}(p) = I,
```

where $I$ is the *identity* matrix with entry $a_{ij} = 1$ when $i=j$ and $0$ otherwise.

But the chain rule then says $J_{F^{-1}}(F(p)) J_F(p) = I$. This implies the two matrices are inverses to each other, and using the multiplicative mapping property of the determinant will also imply the determinant relationship.

The theorem is an existential theorem, in that it implies $F^{-1}$ exists, but doesn't indicate how to find it.

When we have an inverse though, we can verify the properties implied.

The transformation examples above come with inverses indicated. Using one of these pairs, we can verify things at a point, as done in the following:

```julia;
p = [1, pi/3, pi/4]
q = cartesian_from_spherical(p)

A1 = ForwardDiff.jacobian(spherical_from_cartesian, q) # J_F⁻¹(q)
A2 = ForwardDiff.jacobian(cartesian_from_spherical, p) # J_F(p)

A1 * A2
```


Up to roundoff error, this is the identity matrix.
As for the determinants, the expected reciprocal relationship holds, again up to roundoff error:

```julia;
det(A1), 1/det(A2)
```


##### Example: Implicit Differentiation, the Implicit Function Theorem

The technique of *implicit differentiation* is a useful one, as it allows derivatives of more complicated expressions to be found. The main idea, expressed here with three variables, is that if an equation may be viewed as $F(x,y,z) = c$, $c$ a constant, then $z=\phi(x,y)$ may be viewed as a function of $x$ and $y$. Hence, we can use the chain rule to find $\partial z / \partial x$ and $\partial z /\partial y$. Let $G(x,y) = \langle x, y, \phi(x,y) \rangle$ and then differentiate $(F \circ G)(x,y) = c$:

```math
\begin{align*}
0 &= dF_{G(x,y)} \cdot dG_{\langle x, y\rangle}\\
&= [\frac{\partial F}{\partial x}\quad \frac{\partial F}{\partial y}\quad \frac{\partial F}{\partial z}](G(x,y)) \cdot
\left[\begin{array}{}
1 & 0\\
0 & 1\\
\frac{\partial \phi}{\partial x} & \frac{\partial \phi}{\partial y}
\end{array}\right].
\end{align*}
```

Solving yields

```math
\frac{\partial \phi}{\partial x} = -\frac{\partial F/\partial x}{\partial F/\partial z},\quad
\frac{\partial \phi}{\partial y} = -\frac{\partial F/\partial y}{\partial F/\partial z},
```

where the right-hand side of each is evaluated at $G(x,y)$.

When can it be reasonably assumed that such a function $z= \phi(x,y)$ exists?

The [Implicit Function Theorem](https://en.wikipedia.org/wiki/Implicit_function_theorem) provides a statement (slightly abridged here):

> Let $F:R^{n+m} \rightarrow R^m$ be a continuously differentiable function and let $R^{n+m}$ have (compactly written) coordinates $\langle \vec{x}, \vec{y} \rangle$. Fix a point $\langle \vec{a}, \vec{b} \rangle$ with $F(\vec{a}, \vec{b}) = \vec{0}$. Let $J_{F, \vec{y}}(\vec{a}, \vec{b})$ be the Jacobian restricted to *just* the $y$ variables. ($J$ is $m \times m$.) If this matrix has non-zero determinant (it is invertible), then there exists an open set $U$ containing $\vec{a}$ and a *unique* continuously differentiable function $G: U \subset R^n \rightarrow R^m$ such that $G(\vec{a}) = \vec{b}$, $F(\vec{x}, G(\vec{x})) = 0$ for $\vec x$ in $U$. Moreover, the partial derivatives of $G$ are given by the matrix product:
>
>
> ``\frac{\partial G}{\partial x_j}(\vec{x}) = - [J_{F, \vec{y}}(\vec{x}, G(\vec{x}))]^{-1} \left[\frac{\partial F}{\partial x_j}(\vec{x}, G(\vec{x}))\right].``


----

Specializing to our case above, we have $f:R^{2+1}\rightarrow R^1$ and $\vec{x} = \langle a, b\rangle$ and $\phi:R^2 \rightarrow R$. Then

```math
[J_{f, \vec{y}}(\vec{x}, \phi(\vec{x}))] = [\frac{\partial f}{\partial z}(a, b, \phi(a,b))],
```

a $1\times 1$ matrix, identified as a scalar, so inversion is just the reciprocal. So the formula becomes, for $x_1 = x$:

```math
\frac{\partial \phi}{\partial x}(a, b) = - \frac{\frac{\partial{f}}{\partial{x}}(a, b,\phi(a,b))}{\frac{\partial{f}}{\partial{z}}(a, b, \phi(a,b))},
```

as expressed above. Here invertibility is simply a non-zero value, and is needed for the division. In general, we see the inverse (the $J^{-1}$) is necessary to express the answer.


Using this, we can answer questions like the following (as we did before) on more solid ground:

Let $x^2/a^2 + y^2/b^2 + z^2/c^2 = 1$ be an equation describing an ellipsoid. Describe the tangent plane at a point on the ellipsoid.

We would like to express the tangent plane in terms of $\partial{z}/\partial{x}$ and $\partial{z}/\partial{y}$, which we can do through:

```math
\frac{2x}{a^2} + \frac{2z}{c^2} \frac{\partial{z}}{\partial{x}} = 0, \quad
\frac{2y}{b^2} + \frac{2z}{c^2} \frac{\partial{z}}{\partial{y}} = 0.
```

Solving, we get

```math
\frac{\partial{z}}{\partial{x}} = -\frac{2x}{a^2}\frac{c^2}{2z},
\quad
\frac{\partial{z}}{\partial{y}} = -\frac{2y}{b^2}\frac{c^2}{2z},
```

*provided* $z \neq 0$. At $z=0$ the tangent plane exists, but we can't describe it in this manner, as it is vertical. However, the choice of variables to use is not fixed in the theorem, so if $x \neq 0$ we can express $x = x(y,z)$ and express the tangent plane in terms of $\partial{x}/\partial{y}$ and $\partial{x}/\partial{z}$. The answer is similar to the above, and we won't repeat it. Similarly, should $x = z = 0$, then $y \neq 0$ and we can use an implicit definition $y = y(x,z)$ and express the tangent plane through $\partial{y}/\partial{x}$ and $\partial{y}/\partial{z}$.

##### Example: Lagrange multipliers in more dimensions

Consider now the problem of maximizing $f:R^n \rightarrow R$ subject to
$k < n$ constraints $g_1(\vec{x}) = c_1, g_2(\vec{x}) = c_2, \dots, g_{k}(\vec{x}) = c_{k}$. For $k=1$ or $2$ constraints, we saw that if all derivatives exist, then a *necessary* condition to be at a maximum is that $\nabla{f}$ can be written as $\lambda_1 \nabla{g_1}$ ($k=1$) or $\lambda_1 \nabla{g_1} + \lambda_2 \nabla{g_2}$ ($k=2$). The key observation is that the gradient of $f$ must have no projection on the intersection of the tangent planes found by linearizing $g_i$.

The same thing holds in dimension $n > 2$: Let $\vec{x}_0$ be a point where $f(\vec{x})$ is maximum subject to the $k$ constraints. We want to show that $\vec{x}_0$ must satisfy:

```math
\nabla{f}(\vec{x}_0) = \sum \lambda_i \nabla{g_i}(\vec{x}_0).
```

By considering $-f$, the same holds for a minimum.
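
Before seeing why this holds, the necessary condition can be checked numerically in a specific case. The following sketch uses an objective and constraint of our own choosing for illustration (with `ForwardDiff`, as elsewhere in this section): maximize $f(x,y,z) = xyz$ subject to $x+y+z=1$, whose maximum over positive values occurs at $x=y=z=1/3$:

```julia
using ForwardDiff

f₀(v) = v[1] * v[2] * v[3]    # illustrative objective
g₀(v) = v[1] + v[2] + v[3]    # constraint g₀(v) = 1

v₀ = [1/3, 1/3, 1/3]          # the maximizer over positive values

∇f = ForwardDiff.gradient(f₀, v₀)
∇g = ForwardDiff.gradient(g₀, v₀)

∇f ./ ∇g    # a constant ratio, so ∇f = λ∇g with λ = 1/9
```

The componentwise ratio being constant shows $\nabla{f}(\vec{x}_0)$ is a multiple of $\nabla{g}(\vec{x}_0)$, as the condition requires for a single constraint.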

We follow the sketch of [Sawyer](https://www.math.wustl.edu/~sawyer/handouts/LagrangeMult.pdf).

Using Taylor's theorem, we have $f(\vec{x} + h \vec{y}) = f(\vec{x}) + h \vec{y}\cdot\nabla{f} + h^2 c$, for some constant $c$. If $h$ is small enough, this last term can be ignored.

The tangent "plane" for each constraint, $g_i(\vec{x}) = c_i$, is orthogonal to the gradient vector $\nabla{g_i}(\vec{x})$.
That is, $\nabla{g_i}(\vec{x})$ is orthogonal to the level surface formed by the constraint $g_i(\vec{x}) = c_i$. Let $A$ be the set of all possible *linear* combinations of the $\nabla{g_i}$: $\lambda_1 \nabla{g_1}(\vec{x}) + \lambda_2 \nabla{g_2}(\vec{x}) + \cdots + \lambda_k \nabla{g_k}(\vec{x})$, as in the statement. Through projection, we can write $\nabla{f}(\vec{x}_0) = \vec{a} + \vec{b}$, where $\vec{a}$ is in $A$ and $\vec{b}$ is *orthogonal* to $A$.

Let $\vec{r}(t)$ be a parameterization of a path through the intersection of the $k$ tangent planes that goes through $\vec{x}_0$ at $t_0$ *and* has $\vec{b}$ parallel to $\vec{r}'(t_0)$. (The implicit function theorem would guarantee this path.)

If we consider $f(\vec{x}_0 + h \vec{b})$ for small $h$, then unless $\vec{b} \cdot \nabla{f} = 0$, the function would increase in the direction of $\vec{b}$ due to the $h \vec{b}\cdot\nabla{f}$ term in the approximating Taylor series.
That is, $\vec{x}_0$ would not be a maximum on the constraint. So at $\vec{x}_0$ this directional derivative is $0$.

Writing the gradient as $\vec{a} + \vec{b}$, the vanishing of the directional derivative in the direction of $\vec{b}$ gives

```math
0 = \vec{b} \cdot \nabla{f}(\vec{x}_0) = \vec{b} \cdot (\vec{a} + \vec{b}) = \vec{b}\cdot \vec{a} + \vec{b}\cdot\vec{b} = \vec{b}\cdot\vec{b},
```

or $\| \vec{b} \| = 0$, and so $\nabla{f}(\vec{x}_0)$ must lie in the plane $A$.

----

How does the implicit function theorem guarantee a parameterization of
a curve along the constraint in the direction of $\vec{b}$?

A formal proof
requires a bit of linear algebra, but here we go. Let $G(\vec{x}) =
\langle g_1(\vec{x}), g_2(\vec{x}), \dots, g_k(\vec{x}) \rangle$. Then $G(\vec{x}) =
\vec{c}$ encodes the constraint. The tangent planes are orthogonal to
each $\nabla{g_i}$, so using matrix notation, the intersection of the
tangent planes is any vector $\vec{h}$ satisfying $J_G(\vec{x}_0)
\vec{h} = 0$. Let $j = n - 1 - k$. If $j > 0$, there will be $j$
vectors *orthogonal* to each of the $\nabla{g_i}$ and $\vec{b}$. Call
these $\vec{v}_1, \dots, \vec{v}_j$. Then define additional constraints $h_l(\vec{x}) = \vec{v}_l
\cdot \vec{x} = 0$. Let $H(x_1, x_2, \dots, x_n) = \langle g_1, g_2, \dots,
g_k, h_1, \dots, h_{n-1-k}\rangle$, so $H:R^{1 + (n-1)} \rightarrow
R^{n-1}$. Write $H(x_1, \dots, x_n) = H(x, \vec{y})$. The $H$ *restricted*
to the $\vec{y}$ variables is a function from $R^{n-1}\rightarrow
R^{n-1}$. *If* this restricted function has a Jacobian with non-zero determinant, then there
exists a $\vec\phi(x): R \rightarrow R^{n-1}$ with $H(x, \vec\phi(x)) =
\vec{c}$. Let $\vec{r}(t) = \langle t, \phi_1(t), \dots,
\phi_{n-1}(t)\rangle$. Then $(H\circ\vec{r})(t) = \vec{c}$, so by the chain
rule $dH(\vec{r})\, d\vec{r} = 0$. But $dH =
[\nabla{g_1}'; \nabla{g_2}'; \dots; \nabla{g_k}'; v_1';\dots;v_{n-1-k}']$
(a matrix of row vectors). The condition $dH(\vec{r})\, d\vec{r} =
\vec{0}$ is equivalent to saying $d\vec{r}$ is *orthogonal* to the row
vectors in $dH$. A *basis* for $R^n$ is given by these row vectors together with $\vec{b}$,
so $d\vec{r}$ (that is, $\vec{r}'$) and $\vec{b}$ must be parallel.

##### Example

We apply this to two problems, also from Sawyer. First, let $n > 1$ and $f(x_1, \dots, x_n) = \sum x_i^2$. Minimize this subject to the constraint $\sum x_i = 1$. This one constraint means an answer must satisfy $\nabla{L} = \vec{0}$ where

```math
L(x_1, \dots, x_n, \lambda) = \sum x_i^2 + \lambda \left(\sum x_i - 1\right).
```

Taking $\partial/\partial{x_i}$ we have $2x_i + \lambda = 0$, so $x_i = -\lambda/2$, a constant.
From the constraint, we see $x_i = 1/n$. This does not correspond to a maximum, but a minimum. A maximum would be at a point on the constraint such as $\langle 1, 0, \dots, 0\rangle$, which gives a value of $1$ for $f$, not $n \times 1/n^2 = 1/n$.


##### Example

In statistics, there are different ways to define the best estimate for a population parameter based on the data.
That is, suppose $X_1, X_2, \dots, X_n$ are random variables. The population parameters of interest here are the mean $E(X_i) = \mu$ and the variance $Var(X_i) = \sigma_i^2$. (The mean is assumed to be the same for all, but the variance need not be.) What should someone use to *estimate* $\mu$ using just the sample values $X_1, X_2, \dots, X_n$? The average, $(X_1 + \cdots + X_n)/n$, is a well-known estimate, but is it the "best" in some sense for this setup? Here some variables are more variable; should they count the same, more, or less in the weighting for the estimate?

In Sawyer, we see an example of applying the Lagrange multiplier method to the best linear unbiased estimator (BLUE). The BLUE is a choice of coefficients $a_i$ such that $Var(\sum a_i X_i)$ is smallest subject to the constraint $E(\sum a_i X_i) = \mu$.

The BLUE *minimizes* the *variance* of the estimator. (This is the *B*est part of BLUE). The estimator, $\sum a_i X_i$, is *L*inear. The constraint is that the estimator has theoretical mean given by $\mu$. (This is the *Un*biased part of BLUE.)

Going from statistics to mathematics, we use formulas for *independent* random variables to restate this problem mathematically as:

```math
\text{Minimize } \sum a_i^2 \sigma_i^2 \text{ subject to } \sum a_i = 1.
```

This problem is similar now to the last one, save that the sum to minimize includes the sigmas. Set $L = \sum a_i^2 \sigma_i^2 + \lambda\left(\sum a_i - 1\right)$.

Taking $\partial/\partial{a_i}$ gives equations $2a_i\sigma_i^2 + \lambda = 0$, $a_i = -\lambda/(2\sigma_i^2) = c/\sigma_i^2$.
The constraint implies $c = 1/\sum(1/\sigma_i)^2$. So variables with *more* variance, get smaller weights. - -For the special case of a common variance, $\sigma_i=\sigma$, the above simplifies to $a_i = 1/n$ and the estimator is $\sum X_i/n$, the familiar sample mean, $\bar{X}$. - - - - - - -## Questions - - -###### Question - -The following plots a surface defined by a (hidden) function $F: R^2 \rightarrow R^3$: - -```julia; echo=false; -𝑭(u, v) = [u*cos(v), u*sin(v), 2v] -``` - -```julia; hold=true; -us, vs = range(0, 1, length=25), range(0, 2pi, length=25) -xs, ys, zs = unzip(𝑭.(us, vs')) -surface(xs, ys, zs) -``` - -Is this the surface generated by $F(u,v) = \langle u\cos(v), u\sin(v), 2v\rangle$? This function's surface is termed a helicoid. - -```julia; hold=true; echo=false -yesnoq(true) -``` - -###### Question - -The following plots a surface defined by a (hidden) function $F: R^2 \rightarrow R^3$ of the form $F(u,v) = \langle r(u)\cos(v), r(u)\sin(v), u\rangle$ - -```julia; echo=false -𝓇ad(u) = 1 + u^2 -ℱ(u, v) = [𝓇ad(u)*cos(v), 𝓇ad(u)*sin(v), u] -``` - -```julia; hold=true; -us, vs = range(-1, 1, length=25), range(0, 2pi, length=25) -xs, ys, zs = unzip(ℱ.(us, vs')) -surface(xs, ys, zs) -``` - -Is this the surface generated by $r(u) = 1+u^2$? This form of a function is for a surface of revolution about the $z$ axis. - -```julia; hold=true; echo=false -yesnoq(true) -``` - -###### Question - -The transformation $F(x, y) = \langle 2x + 3y + 1, 4x + y + 2\rangle$ is an example of an affine transformation. Is this the *Jacobian* of $F$ - -```math -J = \left[ -\begin{array}{} -2 & 4\\ -3 & 1 -\end{array} -\right]. -``` - -```julia; hold=true; echo=false -choices = [ -"Yes", -"No, it is the transpose" -] -answ=2 -radioq(choices, answ, keep_order=true) -``` - -###### Question - -Does the transformation $F(u,v) = \langle u^2 - v^2, u^2 + v^2 \rangle$ have Jacobian - -```math -J = \left[ -\begin{array}{} -2u & -2v\\ -2u & 2v -\end{array} -\right]? 
-``` - - -```julia; hold=true; echo=false -choices = [ -"Yes", -"No, it is the transpose" -] -answ=1 -radioq(choices, answ, keep_order=true) -``` - -###### Question - -Fix constants $\lambda_0$ and $\phi_0$ and define a transformation -```math -F(\lambda, \phi) = \langle \cos(\phi)\sin(\lambda - \lambda_0), -\cos(\phi_0)\sin(\phi) - \sin(\phi_0)\cos(\phi)\cos(\lambda - \lambda_0) \rangle -``` - -What does the following `SymPy` code compute? - -```julia; hold=true; -@syms lambda lambda_0 phi phi_0 -F(lambda,phi) = [cos(phi)*sin(lambda-lambda_0), cos(phi_0)*sin(phi) - sin(phi_0)*cos(phi)*cos(lambda-lambda_0)] - -out = [diff.(F(lambda, phi), lambda) diff.(F(lambda, phi), phi)] -det(out) |> simplify -``` - -```julia; hold=true; echo=false -choices = [ -"The determinant of the Jacobian.", -"The determinant of the Hessian.", -"The determinant of the gradient." -] -answ = 1 -radioq(choices, answ, keep_order=true) -``` - - -What would be a more direct method: - -```julia; hold=true; echo=false -choices = [ -"`det(F(lambda, phi).jacobian([lambda, phi]))`", -"`det(hessian(F(lambda, phi), [lambda, phi]))`", -"`det(gradient(F(lambda, phi), [lambda, phi]))`" -] -answ=1 -radioq(choices, answ, keep_order=true) -``` - -###### Question - -Let $z\sin(z) = x^3y^2 + z$. Compute $\partial{z}/\partial{x}$ implicitly. - -```julia; hold=true; echo=false -choices = [ - raw"`` 3x^2y^2/(z\cos(z) + \sin(z) + 1)``", - raw"`` 2x^3y/ (z\cos(z) + \sin(z) + 1)``", - raw"`` 3x^2y^2``" -] -answ = 1 -radioq(choices, answ) -``` - - -###### Question - -Let $x^4 + y^4 + z^4 + x^2y^2z^2 = 1$. Compute $\partial{z}/\partial{y}$ implicitly. 
- -```julia; hold=true; echo=false -choices = [ - raw"`` \frac{y \left(- x^{2} z^{2}{\left (x,y \right )} + 2 y^{2}\right)}{\left(x^{2} y^{2} - 2 z^{2}{\left (x,y \right )}\right) z{\left (x,y \right )}}``", - raw"`` \frac{x \left(2 x^{2} - y^{2} z^{2}{\left (x,y \right )}\right)}{\left(x^{2} y^{2} - 2 z^{2}{\left (x,y \right )}\right) z{\left (x,y \right )}}``", - raw"`` \frac{x \left(2 x^{2} - z^{2}{\left (x,y \right )}\right)}{\left(x^{2} - 2 z^{2}{\left (x,y \right )}\right) z{\left (x,y \right )}}``" -] -answ = 1 -radioq(choices, answ) -``` - -###### Question - -Consider the vector field $R:R^2 \rightarrow R^2$ defined by $R(x,y) = \langle x, y\rangle$ and the vector field $S:R^2\rightarrow R^2$ defined by $S(x,y) = \langle -y, x\rangle$. Let $r = \|R\| = \sqrt{x^2 + y^2}$. $R$ is a radial field, $S$ a spin field. - -What is $\nabla{r}$? - -```julia; hold=true; echo=false -choices = [ - raw"`` R/r``", - raw"`` S/r``", - raw"`` R``" -] -answ = 1 -radioq(choices, answ) -``` - -Let $\phi = r^k$. What is $\nabla{\phi}$? - -```julia; hold=true; echo=false -choices = [ - raw"`` k r^{k-2} R``", - raw"`` kr^k R``", - raw"`` k r^{k-2} S``" -] -answ = 1 -radioq(choices, answ) -``` - -Based on your last answer, are all radial fields $R/r^n$, $n\geq 0$ gradients of scalar functions? - -```julia; hold=true; echo=false -yesnoq(true) -``` - -Let $\phi = \tan^{-1}(y/x)$. What is $\nabla{\phi}$? - -```julia; hold=true; echo=false -choices = [ - raw"`` S/r^2``", - raw"`` S/r``", - raw"`` S``" -] -answ = 1 -radioq(choices, answ) -``` - -Express $S/r^n = \langle F_x, F_y\rangle$. For which $n$ is $\partial{F_y}/\partial{x} - \partial{F_x}/\partial{y} = 0$? - - -```julia; hold=true; echo=false -choices = [ -L"As the left-hand side becomes $(-n+2)r^{-n}$, only $n=2$.", -L"All $n \geq 0$", -L"No values of $n$" -] -answ = 1 -radioq(choices, answ) -``` - -(The latter is of interest, as only when the expression is $0$ will the vector field be the gradient of a scalar function.) 
diff --git a/CwJ/differentiable_vector_calculus/vector_valued_functions.jmd b/CwJ/differentiable_vector_calculus/vector_valued_functions.jmd deleted file mode 100644 index 5df6427..0000000 --- a/CwJ/differentiable_vector_calculus/vector_valued_functions.jmd +++ /dev/null @@ -1,2506 +0,0 @@ -# Vector-valued functions, $f:R \rightarrow R^n$ - -This section uses these add-on packages: - -```julia -using CalculusWithJulia -using Plots -using SymPy -using Roots -using LinearAlgebra -using QuadGK -``` - -and - -```julia -import DifferentialEquations -import DifferentialEquations: ODEProblem, Tsit5 -``` - - -```julia; echo=false; results="hidden" -using CalculusWithJulia.WeaveSupport -frontmatter = ( - title = "Vector-valued functions, ``f:R \\rightarrow R^n``", - description = "Calculus with Julia: Vector-valued functions, ``f:R \\rightarrow R^n``", - tags = ["CalculusWithJulia", "differentiable_vector_calculus", "vector-valued functions, ``f:R \\rightarrow r^n``"], -); - -nothing -``` - ----- - -We discuss functions of a single variable that return a vector in $R^n$. There are many parallels to univariate functions (when $n=1$) and differences. - - -## Definition - -A function $\vec{f}: R \rightarrow R^n$, $n > 1$ is called a vector-valued function. Some examples: - -```math -\vec{f}(t) = \langle \sin(t), 2\cos(t) \rangle, \quad -\vec{g}(t) = \langle \sin(t), \cos(t), t \rangle, \quad -\vec{h}(t) = \langle 2, 3 \rangle + t \cdot \langle 1, 2 \rangle. -``` - -The components themselves are also functions of $t$, in this case -univariate functions. Depending on the context, it can be useful to -view vector-valued functions as a function that returns a vector, or a -vector of the component functions. - -The above example functions have $n$ equal $2$, $3$, and $2$ respectively. We -will see that many concepts of calculus for univariate functions -($n=1$) have direct counterparts. 
- -(We use ``\vec{f}`` above to emphasize the return value is a vector, but will quickly drop that notation and let context determine if ``f`` refers to a scalar- or vector-valued function.) - -## Representation in Julia - -In `Julia`, the representation of a vector-valued function is straightforward: we define a function of a single variable that returns a vector. For example, the three functions above would be represented by: - -```julia; -f(t) = [sin(t), 2*cos(t)] -g(t) = [sin(t), cos(t), t] -h(t) = [2, 3] + t * [1, 2] -``` - -For a given `t`, these evaluate to a vector. For example: - -```julia; -h(2) -``` - -We can create a vector of functions, e.g., `F = [cos, sin, identity]`, but calling this object, as in `F(t)`, would require some work, such as `t = 1; [f(t) for f in F]` or `1 .|> F`. - -```julia -F = [cos, sin, identity] -[f(1) for f in F] -``` - -or - -```julia -1 .|> F -``` - - - - -## Space curves - -A vector-valued function is typically visualized as a -curve. That is, for some range, $a \leq t \leq b$ the set of points -$\{\vec{f}(t): a \leq t \leq b\}$ are plotted. If, say in $n=2$, we -have $x(t)$ and $y(t)$ as the component functions, then the graph -would also be the parametric plot of $x$ and $y$. The term *planar* curve is common for the $n=2$ case and *space* curve for the $n \geq 3$ case. - - -This plot represents the vectors with their tails at the origin. - -There is a convention for plotting the component functions to yield a parametric plot within the `Plots` package (e.g., `plot(x, y, a, b)`). This can be used to make polar plots, where `x` is `t -> r(t)*cos(t)` and `y` is `t -> r(t)*sin(t)`. - -However, we will use a different approach, as the component functions are not naturally produced from the vector-valued function. - -In `Plots`, the command `plot(xs, ys)`, where, say, `xs=[x1, x2, ..., xn]` and `ys=[y1, y2, ..., yn]`, will make a connect-the-dot plot between corresponding pairs of points. 
As previously discussed, this can be used as an alternative to plotting a function through `plot(f, a, b)`: first make a set of $x$ values, say `xs=range(a, b, length=100)`; then the corresponding $y$ values, say `ys = f.(xs)`; and then plotting through `plot(xs, ys)`. - -Similarly, were a third vector, ` zs`, for $z$ components used, `plot(xs, ys, zs)` will make a ``3``-dimensional connect the dot plot - -However, our representation of vector-valued functions naturally generates a vector of points: `[[x1,y1], [x2, y2], ..., [xn, yn]]`, as this comes from broadcasting `f` over some time values. -That is, for a collection of time values, `ts` the command `f.(ts)` will produce a vector of points. (Technically a vector of vectors, but points if you identify the ``2``-``d`` vectors as points.) - -To get the `xs` and `ys` from this is conceptually easy: just iterate over all the points and extract the corresponding component. For example, to get `xs` we would have a command like `[p[1] for p in f.(ts)]`. Similarly, the `ys` would use `p[2]` in place of `p[1]`. The `unzip` function from the `CalculusWithJulia` package does this for us. The name comes from how the `zip` function in base `Julia` takes two vectors and returns a vector of the values paired off. This is the reverse. As previously mentioned, `unzip` uses the `invert` function of the `SplitApplyCombine` package to invert the indexing (the ``j``th component of the ``i``th point can be referenced by `vs[i][j]` or `invert(vs)[j][i]`). - -Visually, we have `unzip` performing this reassociation: - -```verbatim -[[x1, y1, z1], (⌈x1⌉, ⌈y1⌉, ⌈z1⌉, - [x2, y2, z2], |x2|, |y2|, |z2|, - [x3, y3, z3], --> |x3|, |y3|, |z3|, - ⋮ ⋮ - [xn, yn, zn]] ⌊xn⌋, ⌊yn⌋, ⌊zn⌋ ) -``` - - -To turn a collection of vectors into separate arguments for a function, splatting (the `...`) is used. - ----- - -Finally, with these definitions, we can visualize the three functions we have defined. 
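The reassociation pictured above can be sketched in a couple of lines. (This is a simplified stand-in for the `unzip` provided by `CalculusWithJulia`, shown only to make the reshaping concrete; the name `unzip_sketch` is ours.)

```julia
# Collect the i-th component of every point into its own vector,
# returning a tuple of vectors -- the reshaping `unzip` performs.
unzip_sketch(vs) = Tuple([v[i] for v in vs] for i in eachindex(first(vs)))

pts = [[1, 4], [2, 5], [3, 6]]   # three points in the plane
xs, ys = unzip_sketch(pts)       # xs = [1, 2, 3], ys = [4, 5, 6]
```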
Here we show the plot of `f` over the values between $0$ and $2\pi$ and also add a vector anchored at the origin defined by `f(1)`.

```julia; hold=true
ts = range(0, 2pi, length=200)
xs, ys = unzip(f.(ts))
plot(xs, ys)
arrow!([0, 0], f(1))
```

The trace of the plot is an ellipse. If we describe the components as $\vec{f}(t) = \langle x(t), y(t) \rangle$, then we have $x(t)^2 + y(t)^2/4 = 1$. That is, for any value of $t$, the resulting point satisfies the equation $x^2 + y^2/4 =1$ for an ellipse.


The plot of $g$ needs ``3`` dimensions to render. For most plotting backends, the following should work with no differences, save the additional vector is anchored in ``3`` dimensions now:

```julia; hold=true
ts = range(0, 6pi, length=200)
plot(unzip(g.(ts))...) # use splatting to avoid xs,ys,zs = unzip(g.(ts))
arrow!([0, 0, 0], g(2pi))
```

Here the graph is a helix; three turns are plotted. If we write $g(t) = \langle x(t), y(t), z(t) \rangle$, as the $x$ and $y$ values trace out a circle, the $z$ value increases. When the graph is viewed from above, as below, we see only $x$ and $y$ components, and the view is circular.

```julia; hold=true
ts = range(0, 6pi, length=200)
plot(unzip(g.(ts))..., camera=(0, 90))
```

The graph of $h$ shows that this function parameterizes a line in space. The line segment for $-2 \leq t \leq 2$ is shown below:

```julia; hold=true
ts = range(-2, 2, length=200)
plot(unzip(h.(ts))...)
```

### The `plot_parametric` function

While the `unzip` function is easy to understand as a function that reshapes data from one format into one that `plot` can use, its usage is a bit cumbersome.
The `CalculusWithJulia` package provides a function `plot_parametric` which hides the use of `unzip` and the splatting within a function definition.

The function borrows a calling style from `Makie`.
The interval to plot over is specified first using `a..b` notation (which specifies a closed interval in the `IntervalSets` package), then the function is specified. Additional keyword arguments are passed along to `plot`.

```julia;
plot_parametric(-2..2, h)
```

!!! note
    Defining plotting functions in `Julia` for `Plots` is facilitated by the `RecipesBase` package. There are two common choices: creating a new function for plotting, as is done with `plot_parametric` and `plot_polar`; or creating a new type so that `plot` can dispatch to an appropriate plotting method. The latter would also be a reasonable choice, but wasn't taken here. In any case, each can be avoided by creating the appropriate values for `xs` and `ys` (and possibly `zs`).

##### Example

Familiarity with equations for lines, circles, and ellipses is important, as these fundamental geometric shapes are often building blocks in the description of other more complicated things.

The point-slope equation of a line, $y = y_0 + m \cdot (x - x_0)$, finds an analog. The slope, $m$, is replaced with a vector $\vec{v}$, and the point, $(x_0, y_0)$, is replaced with a vector $\vec{p}$ identified with a point in the plane. A parameterization would then be $\vec{f}(t) = \vec{p} + (t - t_0) \vec{v}$. From this, we have $\vec{f}(t_0) = \vec{p}$.

The unit circle is instrumental in introducing the trigonometric functions through the identification of an angle $t$ with a point on the unit circle $(x,y)$ via $y = \sin(t)$ and $x=\cos(t)$. With this identification certain properties of the trigonometric functions are immediately seen, such as the period of $\sin$ and $\cos$ being $2\pi$, or the angles for which $\sin$ and $\cos$ are positive or even increasing. Further, this gives a natural parameterization for a vector-valued function whose plot yields the unit circle, namely $\vec{f}(t) = \langle \cos(t), \sin(t) \rangle$.
This parameterization starts (at $t=0$) at the point $(1, 0)$. More generally, we might have additional parameters $\vec{f}(t) = \vec{p} + R \cdot \langle \cos(\omega(t-t_0)), \sin(\omega(t-t_0)) \rangle$ to change the origin, $\vec{p}$; the radius, $R$; the starting angle, $t_0$; and the rotational frequency, $\omega$.


An ellipse has a slightly more general equation than a circle and in simplest forms may satisfy the equation $x^2/a^2 + y^2/b^2 = 1$, where, *when* $a=b$, a circle is being described. A vector-valued function of the form $\vec{f}(t) = \langle a\cdot\cos(t), b\cdot\sin(t) \rangle$ will trace out an ellipse.


The above description of an ellipse is useful, but it can also be useful to re-express the ellipse so that one of the foci is at the origin. With this, the ellipse can be given in *polar* coordinates through a description of the radius:

```math
r(\theta) = \frac{a (1 - e^2)}{1 + e \cos(\theta)}.
```

Here, $a$ is the semi-major axis ($a > b$); $e$ is the *eccentricity* given by $b = a \sqrt{1 - e^2}$; and $\theta$ is the polar angle.

Using the conversion to Cartesian equations, we have $\vec{f}(\theta) = \langle r(\theta) \cos(\theta), r(\theta) \sin(\theta)\rangle$.

For example:

```julia; hold=true
a, ecc = 20, 3/4
f(t) = a*(1-ecc^2)/(1 + ecc*cos(t)) * [cos(t), sin(t)]
plot_parametric(0..2pi, f, legend=false)
scatter!([0],[0], markersize=4)
```



##### Example

The [Spirograph](https://en.wikipedia.org/wiki/Spirograph) is "... a geometric drawing toy that produces mathematical roulette curves of the variety technically known as hypotrochoids and epitrochoids. It was developed by British engineer Denys Fisher and first sold in ``1965``." These can be used to make interesting geometrical curves.

Following Wikipedia: Consider a fixed outer circle $C_o$ of radius $R$ centered at the origin. A smaller inner circle $C_i$ of radius $r < R$ rolls inside $C_o$ and is continuously tangent to it.
$C_i$ will be assumed never to slip on $C_o$ (in a real Spirograph, teeth on both circles prevent such slippage). Now assume that a point $A$ lying somewhere inside $C_{i}$ is located a distance $\rho < r$ from $C_i$'s center.

The center of the inner circle will move in a circular manner with radius $R-r$. The fixed point on the inner circle will rotate about this center. The accumulated angle may be described by the angle made by the point of contact of the inner circle with the outer circle. Call this angle $t$.

Suppose the outer circle is centered at the origin and the inner circle starts ($t=0$) with center $(R-r, 0)$ and rotates around counterclockwise. Then if the point of contact makes angle $t$, the arc length along the outer circle is $Rt$. As there is no slipping, the same arc length is rolled out along the inner circle, so, measured relative to the line joining the two centers, the inner circle turns through an angle $t'$ with $Rt = -r t'$, or $t' = -(R/r)t$. Adding back the rotation of the center line itself, the marked point makes the angle $t + t' = -\frac{R-r}{r}t$ in the fixed frame.

If the initial position of the fixed point is at $(\rho, 0)$ relative to the center of the inner circle, then the following function will describe the motion:

```math
\vec{s}(t) = (R-r) \cdot \langle \cos(t), \sin(t) \rangle +
\rho \cdot \langle \cos(-\frac{R-r}{r}t), \sin(-\frac{R-r}{r}t) \rangle.
```



To visualize this we first define a helper function to draw a circle at point $P$ with radius $R$:

```julia;
circle!(P, R; kwargs...) = plot_parametric!(0..2pi, t -> P + R * [cos(t), sin(t)]; kwargs...)
```

Then we have this function to visualize the spirograph for different $t$ values:

```julia;
function spiro(t; r=2, R=5, rho=0.8*r)

    cent(t) = (R-r) * [cos(t), sin(t)]

    p = plot(legend=false, aspect_ratio=:equal)
    circle!([0,0], R, color=:blue)
    circle!(cent(t), r, color=:black)

    tp(t) = -(R-r)/r * t

    s(t) = cent(t) + rho * [cos(tp(t)), sin(tp(t))]
    plot_parametric!(0..t, s, color=:red)

    p
end
```

And we can see the trace for $t=\pi$:


```julia;
spiro(pi)
```

The point of contact is at $(-R, 0)$, as expected.
Carrying this forward to a full circle's worth is done through:

```julia;
spiro(2pi)
```

The curve does not match up at the start. For that, a second time around the outer circle is needed:

```julia;
spiro(4pi)
```

Whether the curve will have a period or not is decided by the ratio of $R/r$ being rational or irrational.

##### Example

In 1935 [Marcel Duchamp](https://arthur.io/art/marcel-duchamp/rotorelief-no-10-cage-modele-depose-verso) showed a collection of "Rotorelief" discs at a French fair for inventors. Disk number 10 is composed of several nested, off-center circles on a disk that would be rotated to give a sense of movement. To mimic the effect:

* for each circle, ``3`` points were selected using a mouse from an
  image and their pixels recorded;
* as ``3`` points determine a circle, the center and radius of each circle can be solved for;
* the exterior of the disc is drawn (the last specified circle below);
* each nested circle is drawn after its center is rotated by ``\theta`` radians;
* an animation captures the movement for display.
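The second step, recovering a circle from ``3`` points, can also be done numerically rather than symbolically: writing the circle as $x^2 + y^2 = 2hx + 2ky + d$ makes the unknowns $(h, k, d)$ *linear*. (A sketch with made-up points, not the recorded pixel data below; the name `circle_through` is ours.)

```julia
using LinearAlgebra

# Center (h, k) and radius of the circle through three points, found by
# writing x² + y² = 2hx + 2ky + d, a linear system in (h, k, d).
function circle_through(p₁, p₂, p₃)
    A = vcat(([2p[1] 2p[2] 1.0] for p in (p₁, p₂, p₃))...)
    b = [p[1]^2 + p[2]^2 for p in (p₁, p₂, p₃)]
    h, k, d = A \ b
    (x=h, y=k, r=sqrt(d + h^2 + k^2))   # d = r² - h² - k²
end

c = circle_through([4, 2], [1, 5], [-2, 2])   # a circle with center (1, 2), radius 3
```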
- -```julia -let -# https://exploratorium.tumblr.com/post/33140874462/marcel-duchamp-rotoreliefs-duchamp-recognized - -# coordinates and colors selected by gimp from -# https://arthur.io/art/marcel-duchamp/rotorelief-no-10-cage-modele-depose-verso - circs = [466 548 513 505 556 554 # x₁,y₁,x₂,y₂,x₂,y₃ - 414 549 511 455 595 549 - 365 545 507 408 635 548 - 319 541 506 361 673 546 - 277 543 509 317 711 546 - 236 539 507 272 747 551 - 201 541 504 230 781 550 - 166 541 503 189 816 544 - 140 542 499 153 848 538 - 116 537 496 119 879 538 - 96 539 501 90 905 534 - 81 530 500 67 930 530 - 72 525 498 51 949 529 - 66 520 500 36 966 527 - 60 515 499 25 982 526 - 35 509 499 11 1004 525 # outer edge, c₀ - ] - - greenblue= RGB(8/100, 58/100, 53/100) - grey = RGB(76/100, 74/100, 72/100) - white = RGB(88/100, 85/100, 81/100) - - # solve for center of circle, radius for each - @syms h::positive k::positive r::positive - function solve_i(i) - eqs = [(p[1] - h)^2 + (p[2]-k)^2 ~ r^2 for - p ∈ (circs[i,1:2], circs[i,3:4], circs[i,5:6])] - d = solve(eqs)[1] - (x=float(d[h]), y=float(d[k]), r=float(d[r])) - end - c₀, cs... = solve_i.(16:-1:1) # c₀ is centered - - function duchamp_rotorelief_10(θ) - p = plot(legend=false, - axis=nothing, xaxis=false, yaxis=false, - aspect_ratio=:equal) - - O = [c₀.x, c₀.y] - θ̂ = [cos(θ), sin(θ)] - - circle!(O, c₀.r, # outer ring is c₀ - linewidth=2, - color=grey, fill=white, - seriestype=:shape) - - for (i,c) ∈ enumerate(cs) # add nested rings - rᵢ = sqrt((c₀.x - c.x)^2+(c₀.y - c.y)^2) - P = O + rᵢ * θ̂ # rotate about origin by θ - circle!(P, c.r, - linewidth = i == 1 ? 1 : i <= 3 ? 
2 : 3,
                    color=greenblue)
        end

        p

    end

    # animate using Plots.@animate macro
    anim = @animate for θ ∈ range(0, -2π, length=60)
        duchamp_rotorelief_10(θ)
    end

    fname = tempname() * ".gif"
    gif(anim, fname, fps = 40)
end
```

```julia; echo=false
#import PlutoUI
#PlutoUI.LocalResource(fname) # to display w/in Pluto
nothing
```

##### Example

[Ivars Peterson](http://www.phschool.com/science/science_news/articles/tilt_a_whirl.html) described the carnival ride "tilt-a-whirl" as a chaotic system, whose equations of motion are presented in [American Journal of Physics](https://doi.org/10.1119/1.17742) by Kautz and Huggard. The tilt-a-whirl has a platform that moves in a circle that also moves up and down. The motion of a point on the platform, assuming the platform has radius $R$ and period $T$ and that its height rises and falls over that period, could be described by the function:

```math
\vec{u}(t) = \langle R \sin(2\pi t/T), R \cos(2\pi t/T), h + h \cdot \sin(2\pi t/ T) \rangle.
```

A passenger sits on a circular platform with radius $r$ attached at some point on the larger platform. The dynamics of the person on the tilt-a-whirl depend on physics, but for simplicity, let's assume the platform moves at a constant rate with period $S$ and has no relative $z$ component. The motion of the platform in relation to the point at which it is attached would be modeled by:

```math
\vec{v}(t) = \langle r \sin(2\pi t/S), r \cos(2\pi t/S), 0 \rangle.
```

And the motion relative to the origin would be the vector sum, or superposition:

```math
\vec{f}(t) = \vec{u}(t) + \vec{v}(t).
```

To visualize for some parameters, we have:

```julia; hold=true
M, m = 25, 5
height = 5
S, T = 8, 2
outer(t) = [M * sin(2pi*t/T), M * cos(2pi*t/T), height*(1 + sin(2pi * (t-pi/2)/T))]
inner(t) = [m * sin(2pi*t/S), m * cos(2pi*t/S), 0]
f(t) = outer(t) + inner(t)
plot_parametric(0..8, f)
```




## Limits and continuity

The definition of a limit for a univariate function is: For every $\epsilon > 0$ there exists a $\delta > 0$ such that *if* $0 < |x-c| < \delta$ *then* $|f(x) - L | < \epsilon$.

If the notion of "$\vec{f}$ is close to $L$" is replaced by close in the sense of a norm, or vector distance, then the same limit definition can be used, with the new wording "... $\| \vec{f}(x) - L \| < \epsilon$".


The notion of continuity is identical: $\vec{f}(t)$ is continuous at $t_0$ if $\lim_{t \rightarrow t_0}\vec{f}(t) = \vec{f}(t_0)$. More informally, ``\| \vec{f}(t) - \vec{f}(t_0)\| \rightarrow 0``.

A consequence of the triangle inequality is that a vector-valued function is continuous or has a limit if and only if its component functions do.

### Derivatives

If $\vec{f}(t)$ is vector valued, and $\Delta t > 0$, then we can consider the vector:

```math
\vec{f}(t + \Delta t) - \vec{f}(t).
```

For example, if $\vec{f}(t) = \langle 3\cos(t), 2\sin(t) \rangle$ and $t=\pi/4$ and $\Delta t = \pi/16$ we have this picture:

```julia; hold=true
f(t) = [3cos(t), 2sin(t)]
t, Δt = pi/4, pi/16
df = f(t + Δt) - f(t)

plot(legend=false)
arrow!([0,0], f(t))
arrow!([0,0], f(t + Δt))
arrow!(f(t), df)
```

The length of the difference appears to be related to the length of $\Delta t$, in a similar manner as the univariate derivative. The following limit defines the *derivative* of a vector-valued function:

```math
\vec{f}'(t) = \lim_{\Delta t \rightarrow 0} \frac{\vec{f}(t + \Delta t) - \vec{f}(t)}{\Delta t}.
```

The limit exists if the component limits do. The component limits are just the derivatives of the component functions.
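As a quick numeric check of this limit, the difference quotient for the ellipse parameterization above approaches the vector of hand-computed component derivatives (the names `fp` and `dq` are ours, for illustration):

```julia
using LinearAlgebra: norm

f(t) = [3cos(t), 2sin(t)]
fp(t) = [-3sin(t), 2cos(t)]            # component-wise derivative, by hand

dq(t, Δt) = (f(t + Δt) - f(t)) / Δt    # the vector difference quotient

err = norm(dq(pi/4, 1e-6) - fp(pi/4))  # small, and shrinks as Δt → 0
```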
So, if $\vec{f}(t) = \langle x(t), y(t) \rangle$, then $\vec{f}'(t) = \langle x'(t), y'(t) \rangle$.

If the derivative is never $\vec{0}$, the curve is called *regular*. For a regular curve the derivative is a tangent vector to the parameterized curve, akin to the case for a univariate function. We can use `ForwardDiff` to compute the derivative in the exact same manner as was done for univariate functions:

```julia; eval=false
using ForwardDiff
D(f,n=1) = n > 1 ? D(D(f),n-1) : x -> ForwardDiff.derivative(f, float(x))
Base.adjoint(f::Function) = D(f) # allow f' to compute derivative
```

(This is already done by the `CalculusWithJulia` package.)

We can visualize the tangential property through a graph:

```julia; hold=true
f(t) = [3cos(t), 2sin(t)]
p = plot_parametric(0..2pi, f, legend=false, aspect_ratio=:equal)
for t in [1,2,3]
    arrow!(f(t), f'(t)) # add arrow with tail on curve, in direction of derivative
end
p
```


### Symbolic representation

Were symbolic expressions used in place of functions, the vector-valued function would naturally be represented as a vector of expressions:

```julia;
@syms 𝒕
𝒗vf = [cos(𝒕), sin(𝒕), 𝒕]
```

We will see that working with these expressions is not identical to working with a vector-valued function.

To plot, we can avail ourselves of the parametric plot syntax. The following expands to `plot(cos(t), sin(t), t, 0, 2pi)`:

```julia;
plot(𝒗vf..., 0, 2pi)
```

The `unzip` approach used above would also work, but it would be more trouble in this case.
To evaluate the function at a given value, say $t=2$, we can use `subs` with broadcasting to substitute into each component:

```julia;
subs.(𝒗vf, 𝒕=>2)
```

Limits are performed component by component, and can also be defined by broadcasting, again with the need to adjust the values:

```julia;
@syms Δ
limit.((subs.(𝒗vf, 𝒕 => 𝒕 + Δ) - 𝒗vf) / Δ, Δ => 0)
```


Derivatives, as was just done through a limit, are a bit more straightforward than evaluation or limit taking, as we won't bump into the shape mismatch when broadcasting:

```julia;
diff.(𝒗vf, 𝒕)
```

The second derivative can be found through:

```julia;
diff.(𝒗vf, 𝒕, 𝒕)
```


### Applications of the derivative

Here are some sample applications of the derivative.

##### Example: equation of the tangent line

The derivative of a vector-valued function is similar to that of a univariate function, in that it indicates a direction tangent to a curve. The point-slope form offers a straightforward parameterization. We have a point given through the vector-valued function and a direction given by its derivative. (After identifying a vector with its tail at the origin with the point that is the head of the vector.)

With this, the equation is simply $\vec{tl}(t) = \vec{f}(t_0) + \vec{f}'(t_0) \cdot (t - t_0)$, where the dot indicates scalar multiplication.



##### Example: parabolic motion

In physics, we learn that the equation $F=ma$ can be used to derive a formula for position, when acceleration, $a$, is a constant. The resulting equation of motion is $x = x_0 + v_0t + (1/2) at^2$. Similarly, if $x(t)$ is a vector-valued position vector, and the *second* derivative, $x''(t) =\vec{a}$, is a constant, then we have: $x(t) = \vec{x_0} + \vec{v_0}t + (1/2) \vec{a} t^2$.

In two dimensions, the force due to gravity acts downward, only in the $y$ direction. The acceleration is then $\vec{a} = \langle 0, -g \rangle$.
If we start at the origin, with initial velocity $\vec{v_0} = \langle 2, 3\rangle$, then we can plot the trajectory until the object returns to ground ($y=0$) as follows: - -```julia; hold=true -gravity = 9.8 -x0, v0, a = [0,0], [2, 3], [0, -gravity] -xpos(t) = x0 + v0*t + (1/2)*a*t^2 - -t_0 = find_zero(t -> xpos(t)[2], (1/10, 100)) # find when y=0 - -plot_parametric(0..t_0, xpos) -``` - - - -```julia; echo=false; -# https://en.wikipedia.org/wiki/Tractrix -# https://sinews.siam.org/Details-Page/a-bike-and-a-catenary -# https://www.math.psu.edu/tabachni/talks/BicycleDouble.pdf -# https://www.tandfonline.com/doi/abs/10.4169/amer.math.monthly.120.03.199 -# https://projecteuclid.org/download/pdf_1/euclid.em/1259158427 -nothing -``` - -##### Example: a tractrix - -A [tractrix](https://en.wikipedia.org/wiki/Tractrix), studied by Perrault, Newton, Huygens, and many others, is the curve along which an object moves when pulled in a horizontal plane by a line segment attached to a pulling point (Wikipedia). If the object is placed at $(a,0)$ and the puller at the origin, and the puller moves along the positive $x$ axis, then the line will always be tangent to the curve and of fixed length, so determinable from the motion of the puller. In this example $dy/dx = -\sqrt{a^2-x^2}/x$. - -This is the key property: "Due to the geometrical way it was defined, the tractrix has the property that the segment of its tangent, between the asymptote and the point of tangency, has constant length $a$." - - -The tracks made by the front and rear bicycle wheels also have this -same property and similarly afford a mathematical description. We follow -[Dunbar, Bosman, and Nooij](https://doi.org/10.2307/2691097) from *The -Track of a Bicycle Back Tire* below, though -[Levi and Tabachnikov](https://projecteuclid.org/download/pdf_1/euclid.em/1259158427) -and -[Foote, Levi, and Tabachnikov](https://www.tandfonline.com/doi/abs/10.4169/amer.math.monthly.120.03.199) -were also consulted. 
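As an aside, the tractrix's defining property can be verified from the slope formula alone: taking the asymptote to be the $y$-axis for $dy/dx = -\sqrt{a^2-x^2}/x$, the tangent segment from $(x, y)$ back to the $y$-axis has length $|x|\sqrt{1 + (dy/dx)^2}$, which simplifies to $a$ for every $0 < x \leq a$. (A quick numeric sketch; `tangent_length` is our name, not from the references.)

```julia
# The tangent segment from (x, y) to the y-axis has length |x|·√(1 + (dy/dx)²).
# With dy/dx = -√(a² - x²)/x this is |x|·√(a²/x²) = a, independent of x.
a = 2.0
slope(x) = -sqrt(a^2 - x^2) / x
tangent_length(x) = abs(x) * sqrt(1 + slope(x)^2)

tangent_length.([0.3, 1.0, 1.9])   # each ≈ a
```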
Let $a$ be the distance between the front and back wheels, whose positions are parameterized by $\vec{F}(t)$ and $\vec{B}(t)$, respectively. The key property is that the distance between the two is always $a$ and, as the back wheel always moves in the direction of the front wheel, $\vec{B}'(t)$ points in the direction of $\vec{F}(t) - \vec{B}(t)$; that is, the vector $(\vec{F}(t)-\vec{B}(t))/a$ is a unit vector in the direction of the derivative of $\vec{B}$. How long is the derivative vector? That is answered by the speed of the back wheel, which is related to the velocity of the front wheel, but only through the component of that velocity in the direction of $\vec{F}(t)-\vec{B}(t)$. So the speed of the back wheel is the length of the projection of $\vec{F}'(t)$ onto the unit vector $(\vec{F}(t)-\vec{B}(t))/a$, which is identified through the dot product.

Combined, this gives the following equations relating $\vec{F}(t)$ to $\vec{B}(t)$:

```math
s_B(t) = \vec{F}'(t) \cdot \frac{\vec{F}(t)-\vec{B}(t)}{a}, \quad
\vec{B}'(t) = s_B(t) \frac{\vec{F}(t)-\vec{B}(t)}{a}.
```

This is a *differential* equation describing the motion of the back wheel in terms of the front wheel.

If the back wheel trajectory is known, the relationship is much easier, as the two differ by a vector of length $a$ in the direction of $\vec{B}'(t)$, or:

```math
\vec{F}(t) = \vec{B}(t) + a \frac{\vec{B}'(t)}{\|\vec{B}'(t)\|}.
```


We don't discuss when a differential equation has a solution, or if it is unique when it does, but note that the differential equation above may be solved numerically, in a manner somewhat similar to what was discussed in [ODEs](../ODEs/odes.html). Here we will use the `DifferentialEquations` package for finding numeric solutions.
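For intuition, the equation can also be stepped by hand with Euler's method before reaching for the package. (A rough sketch under our own names `euler_back_wheel` and `Fp`; the circular front-wheel path anticipates the first example below. Note that the exact dynamics keep $\|\vec{F}-\vec{B}\| = a$, so the wheelbase staying near $a$ is a check on the integration.)

```julia
using LinearAlgebra: dot, norm

# Euler's method sketch for the back-wheel equation:
# B' = (F'(t) ⋅ u) u, with u = (F(t) - B)/a the back-to-front unit vector.
function euler_back_wheel(F, Fp, B0, a, t0, t1; h=1e-3)
    B, t = float.(B0), t0
    while t < t1
        u = (F(t) - B) / a
        B += h * dot(Fp(t), u) * u   # step in the direction of the front wheel
        t += h
    end
    B
end

F(t) = 3 * [cos(t), sin(t)]          # front wheel on a circle of radius 3
Fp(t) = 3 * [-sin(t), cos(t)]        # its derivative, computed by hand

B = euler_back_wheel(F, Fp, F(0) - [0, 1], 1, 0, 4pi)
norm(F(4pi) - B)                     # the wheelbase stays ≈ 1
```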
We can define our equation as follows, using `p` to pass in the two parameters: the wheel-base length $a$, and $F(t)$, the parameterization of the front wheel in time:


```julia;

function bicycle(dB, B, p, t)

    a, F = p # unpack parameters

    speed = F'(t) ⋅ (F(t) - B) / a
    dB[1], dB[2] = speed * (F(t) - B) / a

end
```

Let's consider a few simple cases first. We suppose $a=1$ and the front wheel moves in a circle of radius $3$. Here is how we can plot two loops:

```julia;
t₀, t₁ = 0.0, 4pi

tspan₁ = (t₀, t₁) # time span to consider

a₁ = 1
F₁(t) = 3 * [cos(t), sin(t)]
p₁ = (a₁, F₁) # combine parameters

B₁0 = F₁(0) - [0, a₁] # some initial position for the back
prob₁ = ODEProblem(bicycle, B₁0, tspan₁, p₁)

out₁ = solve(prob₁, reltol=1e-6, Tsit5())
```

The object `out₁` holds the answer. This object is callable, in that `out₁(t)` will return the numerically computed value for the answer to our equation at time point `t`.

To plot the two trajectories, we could use that `out₁.u` holds the $x$ and $y$ components of the computed trajectory, but more simply, we can just call `out₁` like a function.

```julia;
plt₁ = plot_parametric(t₀..t₁, F₁, legend=false)
plot_parametric!(t₀..t₁, out₁, linewidth=3)

## add the bicycle as a line segment at a few times along the path
for t in range(t₀, t₁, length=11)
    plot!(unzip([out₁(t), F₁(t)])..., linewidth=3, color=:black)
end
plt₁
```


That the rear wheel track appears shorter, despite the rear wheel starting outside the circle, is typical of bicycle tracks and also a reason to rotate the tires on a car, as the front ones move a bit more than the rear and so presumably wear faster.

Let's look at what happens if the front wheel wobbles back and forth following a sine curve.
Repeating the above, only with $F$ redefined, we have: - -```julia; -a₂ = 1 -F₂(t) = [t, 2sin(t)] -p₂ = (a₂, F₂) - -B₂0 = F₂(0) - [0, a₂] # some initial position for the back -prob₂ = ODEProblem(bicycle, B₂0, tspan₁, p₂) - -out₂ = solve(prob₂, reltol=1e-6, Tsit5()) - -plot_parametric(t₀..t₁, F₂, legend=false) -plot_parametric!(t₀..t₁, t -> out₂(t), linewidth=3) -``` - -Again, the back wheel moves less than the front. - - -The motion of the back wheel need not be smooth, even if the motion of the front wheel is, as this curve illustrates: - -```julia; -a₃ = 1 -F₃(t) = [cos(t), sin(t)] + [cos(2t), sin(2t)] -p₃ = (a₃, F₃) - -B₃0 = F₃(0) - [0,a₃] -prob₃ = ODEProblem(bicycle, B₃0, tspan₁, p₃) - -out₃ = solve(prob₃, reltol=1e-6, Tsit5()) -plot_parametric(t₀..t₁, F₃, legend=false) -plot_parametric!(t₀..t₁, t -> out₃(t), linewidth=3) -``` - -The back wheel is moving backwards for part of the above trajectory. - - -This effect can happen even for a front wheel motion as simple as a circle when the front wheel radius is less than the wheelbase: - -```julia; -a₄ = 1 -F₄(t) = a₄/3 * [cos(t), sin(t)] -p₄ = (a₄, F₄) - -t₀₄, t₁₄ = 0.0, 25pi -tspan₄ = (t₀₄, t₁₄) - -B₄0 = F₄(0) - [0, a₄] -prob₄ = ODEProblem(bicycle, B₄0, tspan₄, p₄) - -out₄ = solve(prob₄, reltol=1e-6, Tsit5()) -plot_parametric(t₀₄..t₁₄, F₄, legend=false, aspect_ratio=:equal) -plot_parametric!(t₀₄..t₁₄, t -> out₄(t), linewidth=3) -``` - - -Later we will characterize when there are cusps in the rear-wheel trajectory. - - -## Derivative rules - -From the definition, as it is for univariate functions, for vector-valued functions $\vec{f}, \vec{g}: R \rightarrow R^n$: - -```math -[\vec{f} + \vec{g}]'(t) = \vec{f}'(t) + \vec{g}'(t), \quad\text{and } -[a\vec{f}]'(t) = a \vec{f}'(t). -``` - -If $a(t)$ is a univariate (scalar) function of $t$, then a product rule holds: - -```math -[a(t) \vec{f}(t)]' = a'(t)\vec{f}(t) + a(t)\vec{f}'(t). 
```

If $s$ is a univariate function, then the composition $\vec{f}(s(t))$ can be differentiated. Each component would satisfy the chain rule, and consequently:

```math
\frac{d}{dt}\left(\vec{f}(s(t))\right) = \vec{f}'(s(t)) \cdot s'(t),
```

the dot being scalar multiplication by the derivative of the univariate function $s$.

Vector-valued functions do not have multiplication or division defined for them, so there are no ready analogues of the product and quotient rule. However, the dot product and the cross product produce new functions that may have derivative rules available.

For the dot product, the combination $\vec{f}(t) \cdot \vec{g}(t)$ is a univariate function of $t$, so we know a derivative is well defined. Can it be represented in terms of the vector-valued functions? In terms of the component functions, we have this calculation specific to $n=2$, but one which can be generalized:

```math
\begin{align*}
\frac{d}{dt}(\vec{f}(t) \cdot \vec{g}(t)) &=
\frac{d}{dt}(f_1(t) g_1(t) + f_2(t) g_2(t))\\
&= f_1'(t) g_1(t) + f_1(t) g_1'(t) + f_2'(t) g_2(t) + f_2(t) g_2'(t)\\
&= f_1'(t) g_1(t) + f_2'(t) g_2(t) + f_1(t) g_1'(t) + f_2(t) g_2'(t)\\
&= \vec{f}'(t)\cdot \vec{g}(t) + \vec{f}(t) \cdot \vec{g}'(t).
\end{align*}
```

This suggests that a product-rule-like formula applies for dot products.


For the cross product, we let `SymPy` derive a formula for us.

```julia;
@syms tₛ us()[1:3] vs()[1:3]
uₛ = tₛ .|> us # evaluate each of us at t
vₛ = tₛ .|> vs
```

Then the cross product has a derivative:

```julia;
diff.(uₛ × vₛ, tₛ)
```

Admittedly, that isn't very clear.
With a peek at the answer, we show that the derivative is the same as the product rule would suggest ($\vec{u}' \times \vec{v} + \vec{u} \times \vec{v}'$): - -```julia; -diff.(uₛ × vₛ, tₛ) - (diff.(uₛ, tₛ) × vₛ + uₛ × diff.(vₛ, tₛ)) -``` - - -In summary, these two derivative formulas hold for vector-valued functions $R \rightarrow R^n$: - -```math -\begin{align} -(\vec{u} \cdot \vec{v})' &= \vec{u}' \cdot \vec{v} + \vec{u} \cdot \vec{v}',\\ -(\vec{u} \times \vec{v})' &= \vec{u}' \times \vec{v} + \vec{u} \times \vec{v}'. -\end{align} -``` - -##### Application. Circular motion and the tangent vector. - -The parameterization $\vec{r}(t) = \langle \cos(t), \sin(t) \rangle$ describes a circle. Characteristic of this motion is a constant radius, or in terms of a norm: $\| \vec{r}(t) \| = c$. The norm squared, can be expressed in terms of the dot product: - -```math -\| \vec{r}(t) \|^2 = \vec{r}(t) \cdot \vec{r}(t). -``` - -Differentiating this for the case of a constant radius yields the -equation $0 = [\vec{r}\cdot\vec{r}]'(t)$, which simplifies through the -product rule and commutativity of the dot product to $0 = 2 \vec{r}(t) -\cdot \vec{r}'(t)$. That is, the two vectors are orthogonal to each -other. This observation proves to be very useful, as will be seen. - - -##### Example: Kepler's laws - -[Kepler](https://tinyurl.com/y38wragh)'s laws of planetary motion are summarized by: - -* The orbit of a planet is an ellipse with the Sun at one of the two foci. - -* A line segment joining a planet and the Sun sweeps out equal areas during equal intervals of time. - -* The square of the orbital period of a planet is directly proportional to the cube of the semi-major axis of its orbit. - -Kepler was a careful astronomer, and derived these laws empirically. -We show next how to derive these laws using vector calculus assuming some facts on Newtonian motion, as postulated by Newton. This approach is borrowed from [Joyce](https://mathcs.clarku.edu/~djoyce/ma131/kepler.pdf). 
-
-We adopt a sun-centered view of the universe, placing the sun at the origin and letting $\vec{x}(t)$ be the position of a planet relative to this origin. We can express this in terms of a magnitude and direction through $r(t) \hat{x}(t)$.
-
-Newton's law of gravitational force between the sun and this planet is then expressed by:
-
-
-```math
-\vec{F} = -\frac{G M m}{r^2} \hat{x}(t).
-```
-
-Newton's famous law relating force and acceleration is
-
-```math
-\vec{F} = m \vec{a} = m \ddot{\vec{x}}.
-```
-
-Combining, Newton states $\vec{a} = -(GM/r^2) \hat{x}$.
-
-Now to show the first law. Consider $\vec{x} \times \vec{v}$. It is constant, as:
-
-```math
-\begin{align}
-(\vec{x} \times \vec{v})' &= \vec{x}' \times \vec{v} + \vec{x} \times \vec{v}'\\
-&= \vec{v} \times \vec{v} + \vec{x} \times \vec{a}.
-\end{align}
-```
-
-Both terms are $\vec{0}$, as $\vec{a}$ is parallel to $\vec{x}$ by the above, and clearly $\vec{v}$ is parallel to itself.
-
-This says $\vec{x} \times \vec{v} = \vec{c}$ is a constant vector, meaning the motion of $\vec{x}$ must lie in a plane, as $\vec{x}$ is always orthogonal to the fixed vector $\vec{c}$.
-
-Now, by differentiating $\vec{x} = r \hat{x}$ we have:
-
-```math
-\begin{align}
-\vec{v} &= \vec{x}'\\
-&= (r\hat{x})'\\
-&= r' \hat{x} + r \hat{x}',
-\end{align}
-```
-
-and so
-
-```math
-\begin{align}
-\vec{c} &= \vec{x} \times \vec{v}\\
-&= (r\hat{x}) \times (r'\hat{x} + r \hat{x}')\\
-&= r^2 (\hat{x} \times \hat{x}').
-\end{align}
-```
-
-From this, we can compute $\vec{a} \times \vec{c}$:
-
-```math
-\begin{align}
-\vec{a} \times \vec{c} &= (-\frac{GM}{r^2})\hat{x} \times r^2(\hat{x} \times \hat{x}')\\
-&= -GM \hat{x} \times (\hat{x} \times \hat{x}') \\
-&= GM (\hat{x} \times \hat{x}')\times \hat{x}.
-\end{align}
-```
-
-The last line follows from the anti-commutativity of the cross product.
-
-But, the triple cross product can be simplified through the identity
-$(\vec{u}\times\vec{v})\times\vec{w} = (\vec{u}\cdot\vec{w})\vec{v} - (\vec{v}\cdot\vec{w})\vec{u}$.
So, the above becomes:
-
-```math
-\begin{align}
-\vec{a} \times \vec{c} &= GM ((\hat{x}\cdot\hat{x})\hat{x}' - (\hat{x} \cdot \hat{x}')\hat{x})\\
-&= GM (1 \hat{x}' - 0 \hat{x}).
-\end{align}
-```
-
-Now, since $\vec{c}$ is constant, we have:
-
-```math
-\begin{align}
-(\vec{v} \times \vec{c})' &= (\vec{a} \times \vec{c})\\
-&= GM \hat{x}'\\
-&= (GM\hat{x})'.
-\end{align}
-```
-
-The two sides have the same derivative, hence differ by a constant:
-
-```math
-\vec{v} \times \vec{c} = GM \hat{x} + \vec{d}.
-```
-
-As $GM\hat{x}$ and $\vec{v}\times\vec{c}$ lie in the same plane - orthogonal to $\vec{c}$ - so does $\vec{d}$. With a suitable re-orientation, so that $\vec{d}$ is along the $x$-axis and $\vec{c}$ along the $z$-axis, we have $\vec{c} = \langle 0,0,c\rangle$ and $\vec{d} = \langle d ,0,0 \rangle$, and $\vec{x} = \langle x, y, 0 \rangle$. Let $\theta$ be the angle $\hat{x}$ makes with the $x$-axis; then $\hat{x} = \langle \cos(\theta), \sin(\theta), 0\rangle$.
-
-Now
-
-```math
-\begin{align}
-c^2 &= \|\vec{c}\|^2 \\
-&= \vec{c} \cdot \vec{c}\\
-&= (\vec{x} \times \vec{v}) \cdot \vec{c}\\
-&= \vec{x} \cdot (\vec{v} \times \vec{c})\\
-&= r\hat{x} \cdot (GM\hat{x} + \vec{d})\\
-&= GMr + r \hat{x} \cdot \vec{d}\\
-&= GMr + rd \cos(\theta).
-\end{align}
-```
-
-Solving for $r$ gives the first law; the radial distance has the polar form of an ellipse:
-
-```math
-r = \frac{c^2}{GM + d\cos(\theta)} =
-\frac{c^2/(GM)}{1 + (d/GM) \cos(\theta)}.
-```
-
-----
-
-Kepler's second law can also be derived from vector calculus. This derivation follows that given at [MIT OpenCourseWare](https://ocw.mit.edu/courses/mathematics/18-02sc-multivariable-calculus-fall-2010/1.-vectors-and-matrices/part-c-parametric-equations-for-curves/session-21-keplers-second-law/MIT18_02SC_MNotes_k.pdf), from the [course site](https://ocw.mit.edu/courses/mathematics/18-02sc-multivariable-calculus-fall-2010/index.htm).
-
-The second law states that the area swept out over a duration of time depends only on the duration, not on when it occurs. Let $\Delta t$ be this duration. Then if $\vec{x}(t)$ is the position vector, as above, the area swept out between $t$ and $t + \Delta t$ can be visualized along the lines of:
-
-
-```julia; hold=true
-x1(t) = [cos(t), 2 * sin(t)]
-t0, t1, Delta = 1.0, 2.0, 1/10
-plot_parametric(0..pi/2, x1)
-
-arrow!([0,0], x1(t0)); arrow!([0,0], x1(t0 + Delta))
-arrow!(x1(t0), x1(t0+Delta)- x1(t0), linewidth=5)
-```
-
-The area swept out is basically half the area of the parallelogram formed by $\vec{x}(t)$ and $\Delta \vec{x}(t) = \vec{x}(t + \Delta t) - \vec{x}(t)$. This area is $(1/2) \|\vec{x}(t) \times \Delta\vec{x}(t)\|$.
-
-If we divide through by $\Delta t$, and take a limit we have:
-
-```math
-\frac{dA}{dt} = \| \frac{1}{2}\lim_{\Delta t \rightarrow 0} (\vec{x} \times \frac{\vec{x}(t + \Delta t) - \vec{x}(t)}{\Delta t})\| =
-\frac{1}{2}\|\vec{x} \times \vec{v}\|.
-```
-
-But we saw above, for the motion of a planet, that $\vec{x} \times \vec{v} = \vec{c}$, a constant. This says $dA/dt$ is constant, independent of $t$, and consequently the area swept out over a duration of time will not depend on the particular times involved, just the duration.
-
-----
-
-The third law relates the period to a parameter of the ellipse. We have from the above a strong suggestion that the area of the ellipse can be found by integrating $dA/dt$ over the period, say $T$. Assuming that is the case and letting $a$ be the semi-major axis length, and $b$ the semi-minor axis length, then
-
-```math
-\pi a b = \int_0^T \frac{dA}{dt} dt = \int_0^T (1/2) \|\vec{x} \times \vec{v}\| dt = \| \vec{x} \times \vec{v}\| \frac{T}{2}.
-```
-
-As $c = \|\vec{x} \times \vec{v}\|$ is a constant, this allows us to express $c$ by: $2\pi a b/T$.
-
-But, we have
-
-```math
-r(\theta) = \frac{c^2}{GM + d\cos(\theta)} = \frac{c^2/(GM)}{1 + d/(GM) \cos(\theta)}.
-```
-
-Comparing with the polar form of an ellipse, $r = a(1-e^2)/(1 + e\cos(\theta))$, we see $e = d/(GM)$ and $a (1 - e^2) = c^2/(GM)$. Using $b = a \sqrt{1-e^2}$ we have:
-
-```math
-a(1-e^2) = c^2/(GM) = (\frac{2\pi a b}{T})^2 \frac{1}{GM} =
-\frac{(2\pi)^2}{GM} \frac{a^2 (a^2(1-e^2))}{T^2},
-```
-
-or after cancelling $(1-e^2)$ from each side:
-
-```math
-T^2 = \frac{(2\pi)^2}{GM} \frac{a^4}{a} = \frac{(2\pi)^2}{GM} a^3.
-```
-
-
-----
-
-The above shows how Newton might have derived Kepler's observational facts. Next we show that, assuming Kepler's laws, Newton's equation for gravitational force can be anticipated. This follows [Wikipedia](https://en.wikipedia.org/wiki/Kepler%27s_laws_of_planetary_motion#Planetary_acceleration).
-
-Now let $\vec{r}(t)$ be the position of the planet relative to the Sun at the origin, in two dimensions (we used $\vec{x}(t)$ above). Assume $\vec{r}(0)$ points in the $x$ direction.
-Write $\vec{r} = r \hat{r}$, let $\theta(t)$ be the angle $\hat{r}(t)$ makes with the $x$-axis, and let $\hat{\theta}(t)$ be the unit vector orthogonal to $\hat{r}(t)$ pointing in the direction of increasing $\theta$.
-
-Then we express the velocity ($\dot{\vec{r}}$) and acceleration ($\ddot{\vec{r}}$) in terms of the orthogonal vectors $\hat{r}$ and $\hat{\theta}$, as follows:
-
-```math
-\frac{d}{dt}(r \hat{r}) = \dot{r} \hat{r} + r \dot{\hat{r}} = \dot{r} \hat{r} + r \dot{\theta}\hat{\theta}.
-```
-
-The last equality comes from expressing $\hat{r}(t) = \hat{r}(\theta(t))$ and using the chain rule, noting $d(\hat{r}(\theta))/d\theta = \hat{\theta}$.
-
-Continuing,
-
-```math
-\frac{d^2}{dt^2}(r \hat{r}) =
-(\ddot{r} \hat{r} + \dot{r} \dot{\hat{r}}) +
-(\dot{r} \dot{\theta}\hat{\theta} + r \ddot{\theta}\hat{\theta} + r \dot{\theta}\dot{\hat{\theta}}).
-```
-
-Noting, similar to above, that $\dot{\hat{\theta}} = d\hat{\theta}/dt = d\hat{\theta}/d\theta \cdot d\theta/dt = -\dot{\theta} \hat{r}$, we can express the above in terms of $\hat{r}$ and $\hat{\theta}$ as:
-
-```math
-\vec{a} = \frac{d^2}{dt^2}(r \hat{r}) = (\ddot{r} - r (\dot{\theta})^2) \hat{r} + (r\ddot{\theta} + 2\dot{r}\dot{\theta}) \hat{\theta}.
-```
-
-That is, in general, the acceleration has a radial component and a transversal component.
-
-Kepler's second law says that the area increment over time is constant ($dA/dt$), but this area increment is approximated by the following wedge in polar coordinates: $dA = (1/2) r \cdot rd\theta$. We have then $dA/dt = (1/2) r^2 \dot{\theta}$ is constant, and hence so is $r^2\dot{\theta}$.
-
-Differentiating, we have:
-
-```math
-0 = \frac{d(r^2 \dot{\theta})}{dt} = 2r\dot{r}\dot{\theta} + r^2 \ddot{\theta},
-```
-
-which is the transversal component of the acceleration times $r$, as decomposed above. This means that the acceleration of the planet is completely towards the Sun at the origin.
-
-Kepler's first law relates $r$ and $\theta$ through the polar equation of an ellipse:
-
-```math
-r = \frac{p}{1 + \epsilon \cos(\theta)}.
-```
-
-Expressing in terms of $p/r$ and differentiating in $t$ gives:
-
-```math
--\frac{p \dot{r}}{r^2} = -\epsilon\sin(\theta) \dot{\theta},
-```
-
-or
-
-```math
-p\dot{r} = \epsilon\sin(\theta) r^2 \dot{\theta} = \epsilon \sin(\theta) C,
-```
-
-for a constant $C = r^2\dot{\theta}$, as the second law implies $r^2 \dot{\theta}$ is constant. (This constant can be expressed in terms of parameters describing the ellipse.)
-
-Differentiating again in $t$, gives:
-
-```math
-p \ddot{r} = C\epsilon \cos(\theta) \dot{\theta} = C\epsilon \cos(\theta)\frac{C}{r^2}.
-```
-
-So $\ddot{r} = (C^2 \epsilon / p) \cos{\theta} (1/r^2)$.
-
-The radial acceleration from above is:
-
-```math
-\ddot{r} - r (\dot{\theta})^2 =
-(C^2 \epsilon/p) \cos{\theta} \frac{1}{r^2} - r\frac{C^2}{r^4} = \frac{C^2}{pr^2}(\epsilon \cos(\theta) - \frac{p}{r}).
-```
-
-Using $p/r = 1 + \epsilon\cos(\theta)$, we have the radial acceleration is $-C^2/p \cdot (1/r^2)$. That is, the acceleration is directed toward the origin and proportional to the inverse square of the distance; using the relation between force and acceleration, we see the force on the planet follows the inverse-square law of Newton.
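The constancy of $\vec{x} \times \vec{v}$, which drove the whole derivation, is easy to check numerically. The following is only an illustrative sketch: unit $GM$, a hand-rolled RK4 step, and made-up initial data (none of these names come from a package):

```julia
using LinearAlgebra

# inverse-square acceleration with GM = 1: a = -x/|x|^3
accel(x) = -x / norm(x)^3

# one classical RK4 step for the system x' = v, v' = accel(x)
function rk4_step(x, v, h)
    k1x, k1v = v, accel(x)
    k2x, k2v = v + h/2*k1v, accel(x + h/2*k1x)
    k3x, k3v = v + h/2*k2v, accel(x + h/2*k2x)
    k4x, k4v = v + h*k3v, accel(x + h*k3x)
    x + h/6*(k1x + 2k2x + 2k3x + k4x), v + h/6*(k1v + 2k2v + 2k3v + k4v)
end

# how far x × v drifts from its initial value over n steps
function drift(x0, v0; h=1e-3, n=10_000)
    c0 = x0 × v0        # initial x × v (specific angular momentum)
    x, v = x0, v0
    for _ in 1:n
        x, v = rk4_step(x, v, h)
    end
    norm(x × v - c0)
end

drift([1.0, 0.0, 0.0], [0.0, 1.2, 0.0])
```

The drift is vanishingly small. A solver from `OrdinaryDiffEq`, used elsewhere in this section, would be the idiomatic choice; the point here is only that the computed $\vec{x} \times \vec{v}$ stays (numerically) constant along the orbit.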
-
-
-
-## Moving frames of reference
-
-In the last example, it proved useful to represent vectors in terms of
-other unit vectors, in that case $\hat{r}$ and $\hat{\theta}$. Here we
-discuss a coordinate system defined intrinsically by the motion along
-the trajectory of a curve.
-
-Let $\vec{r}(t)$ be a smooth vector-valued function in $R^3$. It gives rise to a space curve, through its graph. This curve has tangent vector $\vec{r}'(t)$, indicating the direction of travel along $\vec{r}$ as $t$ increases. The length of $\vec{r}'(t)$ depends on the parameterization of $\vec{r}$: for any increasing, differentiable function $s(t)$, the composition $\vec{r}(s(t))$ will have derivative $\vec{r}'(s(t)) s'(t)$, which has the same direction as $\vec{r}'(t)$ (at suitably calibrated points) but not the same magnitude, due to the factor of $s'(t)$.
-
-To discuss properties intrinsic to the curve, the unit vector is considered:
-
-```math
-\hat{T}(t) = \frac{\vec{r}'(t)}{\|\vec{r}'(t)\|}.
-```
-
-The function $\hat{T}(t)$ is the unit tangent vector. An assumption of regularity ensures the denominator is never ``0``.
-
-Now define the unit *normal*, $\hat{N}(t)$, by:
-
-```math
-\hat{N}(t) = \frac{\hat{T}'(t)}{\| \hat{T}'(t) \|}.
-```
-
-Since $\|\hat{T}(t)\| = 1$, a constant, it must be that $\hat{T}'(t) \cdot \hat{T}(t) = 0$; that is, $\hat{N}$ and $\hat{T}$ are orthogonal.
-
-Finally, define the *binormal*, $\hat{B}(t) = \hat{T}(t) \times \hat{N}(t)$. At each time $t$, the three unit vectors are orthogonal to each other. They form a moving coordinate system for the motion along the curve that does not depend on the parameterization.
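As a concrete check, the frame can be computed numerically for the helix $\vec{r}(t) = \langle \cos(t), \sin(t), t \rangle$. This sketch uses a central-difference helper `D` of our own in place of the automatic differentiation (`'`) used elsewhere in this section:

```julia
using LinearAlgebra

r(t) = [cos(t), sin(t), t]                      # a helix
D(f, t; h=1e-6) = (f(t + h) - f(t - h)) / 2h    # central-difference derivative

T(t) = (v = D(r, t); v / norm(v))               # unit tangent
N(t) = (w = D(T, t); w / norm(w))               # unit normal
B(t) = T(t) × N(t)                              # binormal

t = 1.0
T(t) ⋅ N(t), T(t) ⋅ B(t), N(t) ⋅ B(t)           # all ≈ 0: mutually orthogonal
```

For this helix the exact frame is known, $\hat{T} = \langle -\sin(t), \cos(t), 1 \rangle/\sqrt{2}$ and $\hat{N} = \langle -\cos(t), -\sin(t), 0 \rangle$, so the numeric values can be compared directly against the closed form.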
-
-We can visualize this, for example along a [Viviani](https://tinyurl.com/y4lo29mv) curve, as is done in a [Wikipedia](https://en.wikipedia.org/wiki/Frenet%E2%80%93Serret_formulas) animation:
-
-```julia; hold=true
-function viviani(t, a=1)
-    [a*(1-cos(t)), a*sin(t), 2a*sin(t/2)]
-end
-
-
-Tangent(t) = viviani'(t)/norm(viviani'(t))
-Normal(t) = Tangent'(t)/norm(Tangent'(t))
-Binormal(t) = Tangent(t) × Normal(t)
-
-p = plot(legend=false)
-plot_parametric!(-2pi..2pi, viviani)
-
-t0, t1 = -pi/3, pi/2 + 2pi/5
-r0, r1 = viviani(t0), viviani(t1)
-arrow!(r0, Tangent(t0)); arrow!(r0, Binormal(t0)); arrow!(r0, Normal(t0))
-arrow!(r1, Tangent(t1)); arrow!(r1, Binormal(t1)); arrow!(r1, Normal(t1))
-p
-```
-
-----
-
-The *curvature* of a ``3``-dimensional space curve is defined by:
-
-> *The curvature*: For a ``3-D`` curve the curvature is defined by:
->
-> ``\kappa = \frac{\| r'(t) \times r''(t) \|}{\| r'(t) \|^3}.``
-
-
-
-For $2$-dimensional space curves, the same formula applies after embedding a $0$ third component. It can also be expressed directly as
-
-```math
-\kappa = (x'y''-x''y')/\|r'\|^3. \quad (r(t) =\langle x(t), y(t) \rangle)
-```
-
-
-
-Curvature can also be defined as the magnitude of the derivative of the tangent vector, $\hat{T}$, *when* the curve is parameterized by arc length, a topic still to be taken up. The vector $\vec{r}'(t)$ is the direction of motion, whereas
-$\vec{r}''(t)$ indicates how fast and in what direction this is
-changing. For curves with little curve in them, the two will be nearly
-parallel and the cross product small (reflecting the presence of
-$\sin(\theta)$ in the magnitude of the cross product). For "curvy" curves, $\vec{r}''$ will
-point well away from the direction of $\vec{r}'$, so the $\sin(\theta)$ term in the
-cross product will be closer to $1$.
-
-Let $\vec{r}(t) = k \cdot \langle \cos(t), \sin(t), 0 \rangle$.
This will have curvature:
-
-```julia; hold=true
-@syms k::positive t::real
-r1 = k * [cos(t), sin(t), 0]
-norm(diff.(r1,t) × diff.(r1,t,t)) / norm(diff.(r1,t))^3 |> simplify
-```
-
-For larger circles (bigger $k$) there is less curvature; the limiting case is a line, with curvature $0$.
-
-If a curve is imagined to have a tangent "circle" (a second-order Taylor series approximation), then the curvature of that circle matches the curvature of the curve.
-
-
-The [torsion](https://en.wikipedia.org/wiki/Torsion_of_a_curve), $\tau$, of a space curve ($n=3$), is a measure of how sharply the curve is twisting out of the plane of curvature.
-
-The torsion is defined for smooth curves by
-
-> *The torsion*:
->
-> ``\tau = \frac{(\vec{r}' \times \vec{r}'') \cdot \vec{r}'''}{\|\vec{r}' \times \vec{r}''\|^2}.``
-
-
-For the torsion to be defined, the cross product $\vec{r}' \times \vec{r}''$ must be non-zero; that is, the two must not be parallel and neither can be zero.
-
-##### Example: Tubular surface
-
-This last example comes from a collection of several [examples](https://github.com/empet/3D-Viz-with-PlotlyJS.jl/blob/main/5-Tubular-surface.ipynb) provided by Discourse user `@empet` to illustrate `plotlyjs`. We adapt it to `Plots` with some minor changes below.
-
-The task is to illustrate a space curve, ``c(t)``, using a tubular surface. At each time point ``t``, assume the curve has tangent, ``e_1``; normal, ``e_2``; and binormal, ``e_3``. (This assumes the defining derivatives exist and are non-zero and the cross product in the torsion formula is non-zero.) The tubular surface is a circle of radius ``\epsilon`` in the plane determined by the normal and binormal. This curve would be parameterized by
-``r(t,u) = c(t) + \epsilon (e_2(t) \cdot \cos(u) + e_3(t) \cdot \sin(u))`` for varying ``u``.
-
-The Frenet-Serret equations set up a system of differential equations driven by the curvature and torsion.
We use the `DifferentialEquations` package to solve this equation for two specific functions and a given initial condition. The equations when expanded into coordinates become ``12`` different equations:
-
-```julia
-# e₁, e₂, e₃, (x,y,z)
-function Frenet_eq!(du, u, p, s) #system of ODEs
-    κ, τ = p
-    du[1] = κ(s) * u[4]  # e₁′ =  κ ⋅ e₂
-    du[2] = κ(s) * u[5]
-    du[3] = κ(s) * u[6]
-    du[4] = -κ(s) * u[1] + τ(s) * u[7]  # e₂′ = - κ ⋅ e₁ + τ ⋅ e₃
-    du[5] = -κ(s) * u[2] + τ(s) * u[8]
-    du[6] = -κ(s) * u[3] + τ(s) * u[9]
-    du[7] = -τ(s) * u[4]  # e₃′ = - τ ⋅ e₂
-    du[8] = -τ(s) * u[5]
-    du[9] = -τ(s) * u[6]
-    du[10] = u[1]  # c′ = e₁
-    du[11] = u[2]
-    du[12] = u[3]
-end
-```
-
-The last set of equations describes the motion of the spine; it specifies that the tangent to the curve is ``e_1``, as desired. The curve is parameterized by arc length, as ``\| c'(t) \| = 1``.
-
-Following the example of `@empet`, we define a curvature function and torsion function, the latter a constant:
-
-```julia
-κ(s) = 3 * sin(s/10) * sin(s/10)
-τ(s) = 0.35
-```
-
-The initial condition and time span are set with:
-
-```julia
-e₁₀, e₂₀, e₃₀ = [1,0,0], [0,1,0], [0,0,1]
-u₀ = [0, 0, 0]
-u0 = vcat(e₁₀, e₂₀, e₃₀, u₀) # initial condition for the system of ODE
-t_span = (0.0, 150.0) # time interval for solution
-```
-
-With this set up, the problem can be solved:
-
-```julia
-prob = ODEProblem(Frenet_eq!, u0, t_span, (κ, τ))
-sol = solve(prob, Tsit5());
-```
-
-The "spine" is the center axis of the tube and is the ``10``th, ``11``th, and ``12``th coordinates:
-
-```julia
-spine(t) = sol(t)[10:12]
-```
-
-The tangent, normal, and binormal can be similarly defined using the other ``9`` indices:
-
-```julia
-e₁(t) = sol(t)[1:3]
-e₂(t) = sol(t)[4:6]
-e₃(t) = sol(t)[7:9]
-```
-
-We fix a small time range and show the trace of the spine and the frame at a single point in time:
-
-
-```julia
-a_0, b_0 = 50, 60
-ts_0 = range(a_0, b_0, length=251)
-
-t_0 = (a_0 + b_0) / 2
-ϵ = 1/5
-
-
-plot_parametric(a_0..b_0, spine)
-
-arrow!(spine(t_0), e₁(t_0))
-arrow!(spine(t_0), e₂(t_0))
-arrow!(spine(t_0), e₃(t_0))
-
-r_0(t, θ) = spine(t) + ϵ * (e₂(t)*cos(θ) + e₃(t)*sin(θ))
-plot_parametric!(0..2pi, θ -> r_0(t_0, θ))
-```
-
-The `ϵ` value determines the radius of the tube; we see it above as the radius of the drawn circle.
-The function `r_0` for a fixed `t` traces out such a circle centered at a point on the spine. For a fixed `θ`, the function `r_0` describes a line on the surface of the tube paralleling the spine.
-
-The tubular surface is now ready to be rendered along the entire time span using a pattern for parametrically defined surfaces:
-
-```julia; hold=true
-ts = range(t_span..., length=1001)
-θs = range(0, 2pi, length=100)
-surface(unzip(r_0.(ts, θs'))...)
-```
-
-
-## Arc length
-
-In [Arc length](../integrals/arc_length.html) there is a discussion of how to find the arc length of a parameterized curve in ``2`` dimensions. The general case is discussed by [Destafano](https://randomproofs.files.wordpress.com/2010/11/arc_length.pdf) who shows:
-
-> *Arc-length*: if a curve $C$ is parameterized by a smooth
-> function $\vec{r}(t)$ over an interval $I$, then the arc length
-> of $C$ is:
->
-> $\int_I \| \vec{r}'(t) \| dt.$
-
-If we associate $\vec{r}'(t)$ with the velocity, then this is the integral of the speed (the magnitude of the velocity).
-
-Let $I=[a,b]$ and $s(t): [v,w] \rightarrow [a,b]$ such that $s$ is increasing and differentiable. Then $\vec{\phi} = \vec{r} \circ s$ will have
-
-```math
-\text{arc length} =
-\int_v^w \| \vec{\phi}'(t)\| dt =
-\int_v^w \| \vec{r}'(s(t))\| s'(t) dt =
-\int_a^b \| \vec{r}'(u) \| du,
-```
-
-by a change of variable $u=s(t)$. As such the arc length is a property of the curve and not the parameterization of the curve.
-
-For some parameterization, we can define
-
-```math
-s(t) = \int_0^t \| \vec{r}'(u) \| du
-```
-
-Then by the fundamental theorem of calculus, $s(t)$ is non-decreasing.
If $\vec{r}'$ is assumed to be non-zero and continuous (regular), then $s(t)$ has a derivative and an inverse which is monotonic. Using the inverse function $s^{-1}$ to change variables ($\vec{\phi} = \vec{r} \circ s^{-1}$) gives
-
-```math
-\int_0^c \| \phi'(t) \| dt =
-\int_{s^{-1}(0)}^{s^{-1}(c)} \| \vec{r}'(u) \| du =
-s(s^{-1}(c)) - s(s^{-1}(0)) =
-c
-```
-
-That is, the arc length over $[0,c]$ for $\phi$ is just $c$; the curve $C$ is parameterized by arc length.
-
-
-##### Example
-
-Viviani's curve is the intersection of a sphere of radius $a$ with a cylinder of radius $a$. A parameterization was given previously by:
-
-```julia;
-function viviani(t, a=1)
-    [a*(1-cos(t)), a*sin(t), 2a*sin(t/2)]
-end
-```
-
-The curve is traced out over the interval $[0, 4\pi]$. We try to find the arc-length:
-
-```julia; hold=true
-@syms t::positive a::positive
-speed = simplify(norm(diff.(viviani(t, a), t)))
-integrate(speed, (t, 0, 4*PI))
-```
-
-We see that the answer depends linearly on $a$, but otherwise is a constant expressed as an integral. We use `QuadGK` to provide a numeric answer for the case $a=1$:
-
-```julia;
-quadgk(t -> norm(viviani'(t)), 0, 4pi)
-```
-
-
-##### Example
-
-Very few parameterized curves admit a closed-form expression for parameterization by arc-length. Let's consider the helix expressed by $\langle a\cos(t), a\sin(t), bt\rangle$, as this does allow such a parameterization.
-
-```julia;
-@syms aₕ::positive bₕ::positive tₕ::positive alₕ::positive
-helix = [aₕ * cos(tₕ), aₕ * sin(tₕ), bₕ * tₕ]
-speed = simplify( norm(diff.(helix, tₕ)) )
-s = integrate(speed, (tₕ, 0, alₕ))
-```
-
-So `s` is a linear function.
We can re-parameterize by:
-
-```julia;
-eqnₕ = subs.(helix, tₕ => alₕ/sqrt(aₕ^2 + bₕ^2))
-```
-
-To see that the speed, $\| \vec{\phi}' \|$, is constantly $1$:
-
-```julia;
-simplify(norm(diff.(eqnₕ, alₕ)))
-```
-
-From this, we have the arc length is:
-
-```math
-\int_0^t \| \vec{\phi}'(u) \| du = \int_0^t 1 du = t
-```
-
-----
-
-Parameterizing by arc-length is only explicitly possible for a few examples; however, knowing it can be done in theory is important. Some formulas are simplified, such as the tangent, normal, and binormal. Let $\vec{r}(s)$ be parameterized by arc length; then:
-
-```math
-\hat{T}(s)= \vec{r}'(s) / \| \vec{r}'(s) \| = \vec{r}'(s),\quad
-\hat{N}(s) = \hat{T}'(s) / \| \hat{T}'(s)\| = \hat{T}'(s)/\kappa,\quad
-\hat{B} = \hat{T} \times \hat{N},
-```
-
-as before. Further, if $\kappa$ is the curvature and $\tau$ the torsion, these relationships express the derivatives with respect to $s$ in terms of the components in the frame:
-
-```math
-\begin{align*}
-\hat{T}'(s) &= &\kappa \hat{N}(s) &\\
-\hat{N}'(s) &= -\kappa \hat{T}(s) & &+ \tau \hat{B}(s)\\
-\hat{B}'(s) &= &-\tau \hat{N}(s) &
-\end{align*}
-```
-
-These are the [Frenet-Serret](https://en.wikipedia.org/wiki/Frenet%E2%80%93Serret_formulas) formulas.
-
-##### Example
-
-Continuing with our parameterization of a helix by arc length, we can compute the curvature and torsion by differentiation:
-
-```julia;
-gammaₕ = subs.(helix, tₕ => alₕ/sqrt(aₕ^2 + bₕ^2)) # gamma parameterized by arc length
-@syms uₕ::positive
-gammaₕ₁ = subs.(gammaₕ, alₕ .=> uₕ) # u is arc-length parameterization
-```
-
-```julia;
-Tₕ = diff.(gammaₕ₁, uₕ)
-norm(Tₕ) |> simplify
-```
-
-The length is one, as the speed of a curve parameterized by arc-length is 1.
-
-```julia;
-outₕ = diff.(Tₕ, uₕ)
-```
-
-This should be $\kappa \hat{N}$, so we do:
-
-```julia;
-κₕ = norm(outₕ) |> simplify
-Normₕ = outₕ / κₕ
-κₕ, Normₕ
-```
-
-Interpreting, $a$ is the radius of the circle and $b$ measures how tight the coils are.
If $a$ gets much larger than $b$, then the curvature is like $1/a$, just as with a circle. If $b$ gets very big, then the trajectory looks more stretched out and the curvature gets smaller.
-
-
-To find the torsion, we find $\hat{B}$, then differentiate:
-
-```julia;
-Bₕ = Tₕ × Normₕ
-outₕ₁ = diff.(Bₕ, uₕ)
-τₕ = norm(outₕ₁)
-```
-
-This looks complicated, as does `Normₕ`:
-
-```julia;
-Normₕ
-```
-
-
-However, the torsion, up to a sign, simplifies nicely:
-
-```julia;
-τₕ |> simplify
-```
-
-Here, when $b$ gets large, the curve looks more and more "straight" and the torsion decreases. Similarly, if $a$ gets big, the torsion decreases.
-
-
-##### Example
-
-[Levi and Tabachnikov](https://projecteuclid.org/download/pdf_1/euclid.em/1259158427) consider the trajectories of the front and rear bicycle wheels. Recall the notation previously used: $\vec{F}(t)$ for the front wheel, and $\vec{B}(t)$ for the rear wheel trajectories. Consider now their parameterization by arc length, using $u$ for the arc-length parameter for $\vec{F}$ and $v$ for $\vec{B}$. We define $\alpha(u)$ to be the steering angle of the bicycle. This can be found as the angle between the tangent vector of the path of $\vec{F}$ and the vector $\vec{B} - \vec{F}$. Let $\kappa$ be the curvature of the front wheel and $k$ the curvature of the back wheel.
-
-```julia; echo=false
-using LaTeXStrings
-let
-a = 1
-F(t) = [cos(pi/2 - t), 2sin(pi/2-t)]
-p = (a, F)
-
-t0, t1 = -pi/6, pi/2.75
-tspan = (t0, t1)
-
-t = 7pi/6
-B0 = F(t0) + a*[cos(t), sin(t)]
-prob = ODEProblem(bicycle, B0, tspan, p)
-
-out = solve(prob, Tsit5(), reltol=1e-6)
-plt = plot_parametric(t0..t1, F, linewidth=3,
-    xticks=nothing, yticks=nothing, border=:none,
-    legend=false, aspect_ratio=:equal)
-plot_parametric!(t0..t1, t -> out(t), linewidth=3)
-
-t = pi/4
-arrow!(out(t), 2*(F(t) - out(t)))
-plot!(unzip([out(t), F(t)])..., linewidth=2)
-arrow!(F(t), F'(t)/norm(F'(t)))
-Fphat(t) = F'(t)/norm(F'(t))
-arrow!( F(t), -Fphat'(t)/norm(Fphat'(t)))
-annotate!([(-.5,1.5,L"k"),
-(.775,1.55,L"\kappa"),
-(.85, 1.3, L"\alpha")])
-
-plt
-end
-```
-
-Levi and Tabachnikov prove in their Proposition 2.4:
-
-```math
-\begin{align*}
-\kappa(u) &= \frac{d\alpha(u)}{du} + \frac{\sin(\alpha(u))}{a},\\
-|\frac{dv}{du}| &= |\cos(\alpha)|, \quad \text{and}\\
-k &= \frac{\tan(\alpha)}{a}.
-\end{align*}
-```
-
-The first equation relates the steering angle with the curvature. If the steering angle is not changed ($d\alpha/du=0$) then the curvature is constant and the motion is circular. It will be greater for larger angles (up to $\pi/2$). As the curvature is the reciprocal of the radius, this means the radius of the circular trajectory will be smaller. For the same constant steering angle, the curvature will be smaller for longer wheelbases, meaning the circular trajectory will have a larger radius. For cars, which have similar dynamics, this means longer wheelbase cars will take more room to make a U-turn.
-
-The second equation may be interpreted as a ratio of arc lengths. The infinitesimal arc length of the rear wheel is proportional to that of the front wheel, only scaled down by $\cos(\alpha)$. When $\alpha=0$ - the bike moving in a straight line - the two are the same.
At the other extreme - when $\alpha=\pi/2$ - the bike must be pivoting on its rear wheel and the rear wheel has no arc length. This cosine is related to the speed of the back wheel relative to the speed of the front wheel, which was used in the initial differential equation.
-
-The last equation relates the curvature of the back wheel track to the steering angle of the front wheel. When $\alpha=\pm\pi/2$, the rear-wheel curvature, $k$, is infinite, resulting in a cusp (no circle with non-zero radius will approximate the trajectory). This occurs when the front wheel is steered orthogonal to the direction of motion. As was seen in previous graphs of the trajectories, a cusp can happen for quite regular front wheel trajectories.
-
-
-To derive the first one, we have previously noted that
-when a curve is parameterized by arc length, the curvature is more directly computed: it is the magnitude of the derivative of the tangent vector.
-The tangent vector is of unit length when the curve is parametrized by arc length. This implies its derivative will be orthogonal to it. If $\vec{r}(t)$ is a parameterization by arc length, then the curvature formula simplifies as:
-
-```math
-\begin{align*}
-\kappa(s) &= \frac{\| \vec{r}'(s) \times \vec{r}''(s) \|}{\| \vec{r}'(s) \|^3} \\
-&= \frac{\| \vec{r}'(s) \times \vec{r}''(s) \|}{1} \\
-&= \| \vec{r}'(s) \| \| \vec{r}''(s) \| \sin(\theta) \\
-&= 1 \cdot \| \vec{r}''(s) \| \cdot 1 = \| \vec{r}''(s) \|.
-\end{align*}
-```
-
-The last line uses $\|\vec{r}'(s)\| = 1$ and $\theta = \pi/2$, as $\vec{r}'$ and $\vec{r}''$ are orthogonal.
-
-So in the above, the curvature is $\kappa = \| \vec{F}''(u) \|$ and $k = \|\vec{B}''(v)\|$.
-
-On the figure, the tangent vector $\vec{F}'(u)$ is drawn, along with this unit vector rotated by $\pi/2$. We call these, for convenience, $\vec{U}$ and $\vec{V}$. We have $\vec{U} = \vec{F}'(u)$ and $\vec{V} = (1/\kappa) \vec{F}''(u)$.
-
-
-The key decomposition is to express a unit vector in the direction of the line segment as the vector $\vec{U}$ rotated by $\alpha$ degrees.
Mathematically, this is usually expressed in matrix notation, but more explicitly by
-
-```math
-\langle \cos(\alpha) \vec{U}_1 + \sin(\alpha) \vec{U}_2, -\sin(\alpha) \vec{U}_1 + \cos(\alpha) \vec{U}_2 \rangle =
-\vec{U} \cos(\alpha) - \vec{V} \sin(\alpha).
-```
-
-With this, the mathematical relationship between $F$ and $B$ is just a multiple of this unit vector:
-
-```math
-\vec{B}(u) = \vec{F}(u) - a \vec{U} \cos(\alpha) + a \vec{V} \sin(\alpha).
-```
-
-It must be that the tangent line of $\vec{B}$ is parallel to $-\vec{U} \cos(\alpha) + \vec{V} \sin(\alpha)$. To utilize this, we differentiate $\vec{B}$ using the facts that $\vec{U}' = \kappa \vec{V}$ and $\vec{V}' = -\kappa \vec{U}$. These come from $\vec{U} = \vec{F}'$: its derivative in $u$ has magnitude $\kappa$, the curvature, and direction orthogonal to $\vec{U}$; similarly for $\vec{V}$.
-
-```math
-\begin{align}
-\vec{B}'(u) &= \vec{F}'(u)
--a \vec{U}' \cos(\alpha) -a \vec{U} (-\sin(\alpha)) \alpha'
-+a \vec{V}' \sin(\alpha) + a \vec{V} \cos(\alpha) \alpha'\\
-& = \vec{U}
--a (\kappa) \vec{V} \cos(\alpha) + a \vec{U} \sin(\alpha) \alpha' +
-a (-\kappa) \vec{U} \sin(\alpha) + a \vec{V} \cos(\alpha) \alpha' \\
-&= \vec{U}
-+ a(\alpha' - \kappa) \sin(\alpha) \vec{U}
-+ a(\alpha' - \kappa) \cos(\alpha)\vec{V}.
-\end{align}
-```
-
-Extend the ``2``-dimensional vectors to ``3`` dimensions, by adding a zero $z$ component; then, as parallel vectors have zero cross product:
-
-```math
-\begin{align}
-\vec{0} &= (\vec{U}
-+ a(\alpha' - \kappa) \sin(\alpha) \vec{U}
-+ a(\alpha' - \kappa) \cos(\alpha)\vec{V}) \times
-(-\vec{U} \cos(\alpha) + \vec{V} \sin(\alpha)) \\
-&= (\vec{U} \times \vec{V}) \sin(\alpha) +
-a(\alpha' - \kappa) \sin(\alpha) (\vec{U} \times \vec{V}) \sin(\alpha) -
-a(\alpha' - \kappa) \cos(\alpha) (\vec{V} \times \vec{U}) \cos(\alpha) \\
-&= (\sin(\alpha) + a(\alpha'-\kappa) \sin^2(\alpha) +
-a(\alpha'-\kappa) \cos^2(\alpha)) \vec{U} \times \vec{V} \\
-&= (\sin(\alpha) + a (\alpha' - \kappa)) \vec{U} \times \vec{V}.
-\end{align}
-```
-
-The terms $\vec{U} \times\vec{U}$ and $\vec{V}\times\vec{V}$ are $\vec{0}$, by properties of the cross product. As $\vec{U} \times \vec{V} \neq \vec{0}$, the scalar part must be $0$, or
-
-```math
-\frac{\sin(\alpha)}{a} + \alpha' = \kappa.
-```
-
-As for the second equation,
-from the expression for $\vec{B}'(u)$, after setting $a(\alpha'-\kappa) = -\sin(\alpha)$:
-
-```math
-\begin{align}
-\|\vec{B}'(u)\|^2
-&= \| (1 -\sin(\alpha)\sin(\alpha)) \vec{U} -\sin(\alpha)\cos(\alpha) \vec{V} \|^2\\
-&= \| \cos^2(\alpha) \vec{U} -\sin(\alpha)\cos(\alpha) \vec{V} \|^2\\
-&= (\cos^2(\alpha))^2 + (\sin(\alpha)\cos(\alpha))^2\quad\text{using } \vec{U}\cdot\vec{V}=0\\
-&= \cos^2(\alpha)(\cos^2(\alpha) + \sin^2(\alpha))\\
-&= \cos^2(\alpha).
-\end{align}
-```
-
-From this, $\|\vec{B}'(u)\| = |\cos(\alpha)|$. But $1 = \|d\vec{B}/dv\| = \|d\vec{B}/du \| \cdot |du/dv|$, and $|dv/du|=|\cos(\alpha)|$ follows.
-
-## Evolutes and involutes
-
-Following [Fuchs](https://doi.org/10.4169/amer.math.monthly.120.03.217) we discuss a geometric phenomenon known and explored by Huygens, and likely earlier. We stick to the two-dimensional case; Fuchs extends this to three dimensions.
The following figure
-
-```julia; echo=false
-Xₑ(t)= 2 * cos(t)
-Yₑ(t) = sin(t)
-rₑ(t) = [Xₑ(t), Yₑ(t)]
-unit_vec(x) = x / norm(x)
-plot(legend=false, aspect_ratio=:equal)
-ts = range(0, 2pi, length=50)
-for t in ts
-    Pₑ, Vₑ = rₑ(t), unit_vec([-Yₑ'(t), Xₑ'(t)])
-    plot_parametric!(-4..4, x -> Pₑ + x*Vₑ)
-end
-plot!(Xₑ, Yₑ, 0, 2pi, linewidth=5)
-```
-
-is that of an ellipse with many *normal* lines drawn to it. The normal lines appear to intersect in a somewhat diamond-shaped curve. This curve is the evolute of the ellipse. We can characterize this using the language of planar curves.
-
-Consider a parameterization of a curve by arc-length, $\vec\gamma(s) = \langle u(s), v(s) \rangle$. The unit *tangent* to this curve is $\vec\gamma'(s) = \hat{T}(s) = \langle u'(s), v'(s) \rangle$ and by simple geometry the unit *normal* will be $\hat{N}(s) = \langle -v'(s), u'(s) \rangle$. At a time $t$, a line through the curve parameterized by $\vec\gamma$ is given by $l_t(a) = \vec\gamma(t) + a \hat{N}(t)$.
-
-Consider two nearby points $t$ and $t+\epsilon$ and the intersection of $l_t$ and $l_{t+\epsilon}$. That is, we need points $a$ and $b$ with: $l_t(a) = l_{t+\epsilon}(b)$. Setting the components equal, this is:
-
-```math
-\begin{align}
-u(t) - av'(t) &= u(t+\epsilon) - bv'(t+\epsilon) \\
-v(t) + au'(t) &= v(t+\epsilon) + bu'(t+\epsilon).
-\end{align}
-```
-
-This is a pair of linear equations in two unknowns ($a$ and $b$) which can be solved. Here is the value for `a`:
-
-```julia;
-@syms u() v() t epsilon w
-@syms a b
-γ(t) = [u(t),v(t)]
-n(t) = subs.(diff.([-v(w), u(w)], w), w.=>t)
-l(a, t) = γ(t) + a * n(t)
-out = solve(l(a, t) - l(b, t+epsilon), [a,b])
-out[a]
-```
-
-Letting $\epsilon \rightarrow 0$ we get an expression for $a$ that will describe the evolute at time $t$ in terms of the function $\gamma$. Looking at the expression above, we can see that dividing the *numerator* by $\epsilon$ and taking a limit will yield $u'(t)^2 + v'(t)^2$.
If the *denominator* has a limit after dividing by $\epsilon$, then we can find the description sought. Pursuing this leads to:
-
-```math
-\begin{align*}
-\frac{u'(t) v'(t+\epsilon) - v'(t) u'(t+\epsilon)}{\epsilon}
-&= \frac{u'(t) v'(t+\epsilon) -u'(t)v'(t) + u'(t)v'(t)- v'(t) u'(t+\epsilon)}{\epsilon} \\
-&= \frac{u'(t)(v'(t+\epsilon) -v'(t))}{\epsilon} + \frac{(u'(t)- u'(t+\epsilon))v'(t)}{\epsilon},
-\end{align*}
-```
-
-which in the limit will give $u'(t)v''(t) - u''(t) v'(t)$. All told, in the limit as $\epsilon \rightarrow 0$ we get
-
-```math
-\begin{align*}
-a &= \frac{u'(t)^2 + v'(t)^2}{u'(t)v''(t) - v'(t) u''(t)} \\
-&= 1/(\|\vec\gamma'\|\kappa) \\
-&= 1/(\|\hat{T}\|\kappa) \\
-&= 1/\kappa,
-\end{align*}
-```
-
-with $\kappa$ being the curvature of the planar curve. That is, the evolute of $\vec\gamma$ is described by:
-
-```math
-\vec\beta(s) = \vec\gamma(s) + \frac{1}{\kappa(s)}\hat{N}(s).
-```
-
-Revisualizing:
-
-```julia;
-rₑ₃(t) = [2cos(t), sin(t), 0]
-Tangent(r, t) = unit_vec(r'(t))
-Normal(r, t) = unit_vec((𝒕 -> Tangent(r, 𝒕))'(t))
-curvature(r, t) = norm(r'(t) × r''(t) ) / norm(r'(t))^3
-
-plot_parametric(0..2pi, t -> rₑ₃(t)[1:2], legend=false, aspect_ratio=:equal)
-plot_parametric!(0..2pi, t -> (rₑ₃(t) + Normal(rₑ₃, t)/curvature(rₑ₃, t))[1:2])
-```
-
-We computed the above illustration using $3$ dimensions (hence the use of `[1:2]` to drop the third component) as the curvature formula is easier to express. Recall, the curvature also appears in the [Frenet-Serret](https://en.wikipedia.org/wiki/Frenet%E2%80%93Serret_formulas) formulas: $d\hat{T}/ds = \kappa \hat{N}$ and $d\hat{N}/ds = -\kappa \hat{T}+ \tau \hat{B}$. For a planar curve, as under consideration, the torsion, $\tau$, is $0$, so $d\hat{N}/ds = -\kappa \hat{T}$.
This allows the computation of $\vec\beta'(s)$:
-
-```math
-\begin{align}
-\vec{\beta}' &= \frac{d(\vec\gamma + (1/\kappa) \hat{N})}{ds}\\
-&= \hat{T} + (-\frac{\kappa'}{\kappa^2}\hat{N} + \frac{1}{\kappa} \hat{N}')\\
-&= \hat{T} - \frac{\kappa'}{\kappa^2}\hat{N} + \frac{1}{\kappa} (-\kappa \hat{T})\\
-&= - \frac{\kappa'}{\kappa^2}\hat{N}.
-\end{align}
-```
-
-We see $\vec\beta'$ is zero (the curve is non-regular) when $\kappa'(s) = 0$. The curvature changes from increasing to decreasing, or vice versa, at each of the ``4`` crossings of the major and minor axes - there are ``4`` non-regular points, and we see ``4`` cusps in the evolute.
-
-
-The curve parameterized by $\vec{r}(t) = 2(1 - \cos(t)) \langle \cos(t), \sin(t)\rangle$ over $[0,2\pi]$ is a cardioid. It is formed by rolling a circle of radius $r$ around another similarly sized circle. The following graphically shows the evolute is a smaller cardioid (one-third the size). For fun, the evolute of the evolute is drawn:
-
-```julia
-function evolute(r)
-    t -> r(t) + 1/curvature(r, t) * Normal(r, t)
-end
-```
-
-```julia; hold=true
-r(t) = 2*(1 - cos(t)) * [cos(t), sin(t), 0]
-
-plot(legend=false, aspect_ratio=:equal)
-plot_parametric!(0..2pi, t -> r(t)[1:2])
-plot_parametric!(0..2pi, t -> evolute(r)(t)[1:2])
-plot_parametric!(0..2pi, t -> ((evolute∘evolute)(r)(t))[1:2])
-```
-
-----
-
-If $\vec\beta$ is *the* **evolute** of $\vec\gamma$, then $\vec\gamma$ is *an* **involute** of $\vec\beta$. For a given curve, there is a parameterized family of involutes. While this definition has a pleasing self-referentialness, it doesn't have an immediately clear geometric interpretation. For that, consider the image of a string of fixed length $a$ attached to the curve $\vec\gamma$ at some point $t_0$. As this string wraps around the curve traced by $\vec\gamma$ it is held taut so that it makes a tangent at the point of contact. The end of the string will trace out a curve and this is the trace of an *involute*.
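As a concrete check of this description (a hypothetical sketch using only base `Julia`, not part of the original text), take the unit circle, which is already parameterized by arc length. The taut-string construction with the string attached at $t_0 = 0$ should reproduce the classical circle involute $\langle \cos(t) + t\sin(t), \sin(t) - t\cos(t)\rangle$:

```julia
# Hypothetical check: the taut-string (involute) construction for the
# unit circle, which is parameterized by arc length.
γ(t)  = [cos(t), sin(t)]      # unit circle
γ′(t) = [-sin(t), cos(t)]     # unit tangent, as |γ'(t)| = 1

# string attached at t₀ = 0; the unwound length at time t is just t
involute(t) = γ(t) - γ′(t) * t

# classical closed form for the involute of a circle
classical(t) = [cos(t) + t*sin(t), sin(t) - t*cos(t)]

for t in range(0, 2pi, length=25)
    @assert maximum(abs.(involute(t) - classical(t))) < 1e-12
end
```

The agreement is exact up to floating point, which is the content of the arc-length simplification derived below.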
-
-```julia; hold=true
-r(t) = [t, cosh(t)]
-t0, t1 = -2, 0
-a = t1
-
-beta(r, t) = r(t) - Tangent(r, t) * quadgk(t -> norm(r'(t)), a, t)[1]
-
-p = plot_parametric(-2..2, r, legend=false)
-plot_parametric!(t0..t1, t -> beta(r, t))
-for t in range(t0, -0.2, length=4)
-    arrow!(r(t), -Tangent(r, t) * quadgk(t -> norm(r'(t)), a, t)[1])
-    scatter!(unzip([r(t)])...)
-end
-p
-```
-
-
-
-This lends itself to the following mathematical description: if $\vec\gamma(t)$ parameterizes the planar curve, then an involute for $\vec\gamma(t)$ is described by:
-
-```math
-\vec\beta(t) = \vec\gamma(t) + \left(a - \int_{t_0}^t \| \vec\gamma'(u)\| du\right) \hat{T}(t),
-```
-
-where $\hat{T}(t) = \vec\gamma'(t)/\|\vec\gamma'(t)\|$ is the unit tangent vector. The above uses two parameters ($a$ and $t_0$), but only one is needed, as there is an obvious redundancy (a point can *also* be expressed by $t$ and the shortened length of string). [Wikipedia](https://en.wikipedia.org/wiki/Involute) uses this definition for $a$ and $t$ values in an interval $[t_0, t_1]$:
-
-```math
-\vec\beta_a(t) = \vec\gamma(t) - \frac{\vec\gamma'(t)}{\|\vec\gamma'(t)\|}\int_a^t \|\vec\gamma'(u)\| du.
-```
-
-If $\vec\gamma(s)$ is parameterized by arc length, then this simplifies quite a bit, as the unit tangent is just $\vec\gamma'(s)$ and the remaining arc length just $(s-a)$:
-
-```math
-\begin{align*}
-\vec\beta_a(s) &= \vec\gamma(s) - \vec\gamma'(s) (s-a) \\
-&=\vec\gamma(s) - \hat{T}_{\vec\gamma}(s)(s-a).\quad (s \text{ is the arc-length parameter})
-\end{align*}
-```
-
-With this characterization, we see several properties:
-
-* From $\vec\beta_a'(s) = \hat{T}(s) - (\kappa(s) \hat{N}(s) (s-a) + \hat{T}(s)) = -\kappa_{\vec\gamma}(s) \cdot (s-a) \cdot \hat{N}_{\vec\gamma}(s)$, the involute is *not* regular at $s=a$, as its derivative is zero.
-
-* As $\vec\beta_a(s) = \vec\beta_0(s) + a\hat{T}(s)$, the family of curves is parallel.
-
-* The evolute of $\vec\beta_a(s)$, $s$ the arc-length parameter of $\vec\gamma$, can be shown to be $\vec\gamma$. This requires more work:
-
-The evolute for $\vec\beta_a(s)$ is:
-
-```math
-\vec\beta_a(s) + \frac{1}{\kappa_{\vec\beta_a}(s)}\hat{N}_{\vec\beta_a}(s).
-```
-
-In the following we show that (taking $s < a$ and $\kappa_{\vec\gamma} > 0$, the case pictured):
-
-```math
-\begin{align}
-\kappa_{\vec\beta_a}(s) &= 1/(a-s),\\
-\hat{N}_{\vec\beta_a}(s) &= \hat{T}_{\vec\beta_a}'(s)/\|\hat{T}_{\vec\beta_a}'(s)\| = -\hat{T}_{\vec\gamma}(s).
-\end{align}
-```
-
-The first shows in a different way that when $s=a$ the curve is not regular, as the curvature fails to exist. In the above figure, when the involute touches $\vec\gamma$, there will be a cusp.
-
-With these two identifications and using $\vec\gamma'(s) = \hat{T}_{\vec\gamma}(s)$, the evolute simplifies to
-
-```math
-\begin{align*}
-\vec\beta_a(s) + \frac{1}{\kappa_{\vec\beta_a}(s)}\hat{N}_{\vec\beta_a}(s)
-&=
-\vec\gamma(s) - \vec\gamma'(s)(s-a) + \frac{1}{\kappa_{\vec\beta_a}(s)}\hat{N}_{\vec\beta_a}(s) \\
-&=
-\vec\gamma(s) + \hat{T}_{\vec\gamma}(s)(a-s) + \frac{1}{1/(a-s)} (-\hat{T}_{\vec\gamma}(s)) \\
-&= \vec\gamma(s).
-\end{align*}
-```
-
-That is, the evolute of an involute of $\vec\gamma(s)$ is $\vec\gamma(s)$.
-
-
-We have:
-
-```math
-\begin{align}
-\vec\beta_a(s) &= \vec\gamma(s) - \vec\gamma'(s)(s-a)\\
-\vec\beta_a'(s) &= -\kappa_{\vec\gamma}(s)(s-a)\hat{N}_{\vec\gamma}(s)\\
-\vec\beta_a''(s) &= (-\kappa_{\vec\gamma}(s)(s-a))' \hat{N}_{\vec\gamma}(s) + (-\kappa_{\vec\gamma}(s)(s-a))(-\kappa_{\vec\gamma}(s)\hat{T}_{\vec\gamma}(s)),
-\end{align}
-```
-
-the last line by the Frenet-Serret formulas for *planar* curves, which show $\hat{T}'(s) = \kappa(s) \hat{N}(s)$ and $\hat{N}'(s) = -\kappa(s)\hat{T}(s)$.
-
-To compute the curvature of $\vec\beta_a$, we need to compute both:
-
-```math
-\begin{align}
-\| \vec\beta' \|^3 &= |\kappa^3 (s-a)^3|\\
-\| \vec\beta' \times \vec\beta'' \| &= |\kappa(s)^3 (s-a)^2|,
-\end{align}
-```
-
-the last line using both $\hat{N}\times\hat{N} = \vec{0}$ and $\|\hat{N}\times\hat{T}\| = 1$. The curvature then is $\kappa_{\vec\beta_a}(s) = 1/|s-a|$, which is $1/(a-s)$ when $s < a$.
-
-Using the formula for $\vec\beta'$ above, we get $\hat{T}_\beta(s)=\hat{N}_{\vec\gamma}(s)$ (again taking $s < a$ and $\kappa_{\vec\gamma} > 0$), so $\hat{T}_\beta'(s) = -\kappa_{\vec\gamma}(s) \hat{T}_{\vec\gamma}(s)$ with unit vector just $\hat{N}_{\vec\beta_a} = -\hat{T}_{\vec\gamma}(s)$.
-
-
-----
-
-Show that *an* involute of the cycloid $\vec{r}(t) = \langle t - \sin(t), 1 - \cos(t) \rangle$ is also a cycloid. We do so graphically:
-
-```julia; hold=true
-r(t) = [t - sin(t), 1 - cos(t)]
-## find *involute*: r - r'/|r'| * int(|r'|, a, t)
-t0, t1, a = 2PI, PI, PI
-@syms t::real
-rp = diff.(r(t), t)
-speed = 2sin(t/2)
-
-ex = r(t) - rp/speed * integrate(speed, a, t)
-
-plot_parametric(0..4pi, r, legend=false)
-plot_parametric!(0..4pi, u -> SymPy.N.(subs.(ex, t .=> u)))
-```
-
-
-The expression `ex` is secretly `[t + sin(t), 3 + cos(t)]`, another cycloid.
-
-##### Example: goats
-
-An old problem of calculus is called the [goat problem](https://en.wikipedia.org/wiki/Goat_problem). This formulation -- with horses -- is from ``1748``:
-
-> Observing a horse tied to feed in a gentlemen’s park, with one end of a rope to his fore foot, and the other end to one of the circular iron rails, inclosing a pond, the circumference of which rails being ``160`` yards, equal to the length of the rope, what quantity of ground at most, could the horse feed?
-
-Let ``r`` be the radius of a circle and for concreteness we position its center at ``(-r, 0)``. Let ``R`` be the length of the rope, and suppose ``R \ge \pi r``. (In the problem, ``R`` equals the full circumference, ``2\pi r``.)
Then the question can be rephrased as: what is *twice* the area suggested by this graphic, which is drawn in pieces:
-
-* Between angles ``0`` and ``\pi/2`` the horse has unconstrained access, so it can graze a wedge of radius ``R``.
-* From angle ``\pi/2`` until the angle where the horse's ``y`` position is ``0`` with the tether taut, the boundary of what can be eaten is described by the involute.
-* The horse can't eat from within the circle of radius ``r``.
-
-```julia; echo=false
-let
-    r,R = 1, 10
-    R = max(R, pi*r) # R ≥ 1/2 circumference
-
-    γ(θ) = -2r*cos(θ) * [cos(θ), sin(θ)] # parameterize the circle of radius r
-    involute(t) = γ(t) + γ'(t)/norm(γ'(t))* (R - quadgk(u -> norm(γ'(u)), pi/2, t)[1])
-    t₀ = find_zero(t -> round(involute(t)[2], digits=4), (3pi/4, pi))
-
-    p = plot(; legend=false)
-    plot_polar!(0..(pi/2), t -> R) # unobstructed -> quarter circle
-    plot_parametric!((pi/2)..t₀, involute)
-    plot_parametric!((pi/2)..pi, γ)
-    plot!([0,R],[0,0])
-end
-```
-
-To solve for the area we parameterize the circle of radius ``r`` between ``\pi/2`` and when the involute would cross the ``x`` axis. We use `find_zero` to identify the value.
-
-```julia
-let
-r,R = 160/(2π), 160
-R = max(R, pi*r) # R ≥ 1/2 circumference
-γ(θ) = -2r*cos(θ) * [cos(θ), sin(θ)]
-## find *involute*: r - r'/|r'| * int(|r'|, a, t)
-involute(t) = γ(t) + γ'(t)/norm(γ'(t))* (R - quadgk(u -> norm(γ'(u)), pi/2, t)[1])
-
-t₀ = find_zero(t -> round(involute(t)[2], digits=4), (3pi/4, pi))
-
-A₁ = π * R^2 / 4
-y(t) = involute(t)[2]
-x′(t) = (h=1e-4; (involute(t+h)[1]-involute(t-h)[1])/(2h))
-A₂ = quadgk(t -> -y(t)*x′(t), pi/2, t₀)[1] # A₂ = -∫ y dx, as counterclockwise parameterization
-A₃ = (1/2) * π * r^2
-2 * (A₁ + A₂ - A₃)
-end
-```
-
-The calculations for ``A_1`` and ``A_3`` come from the familiar formula for the area of a circle.
However, ``A_2`` requires the formula for the area above the ``x`` axis of a parametrically described curve: ``A = -\int_a^b y(t) x'(t) dt``, the minus sign owing to the counterclockwise parameterization. As written, the automatic derivative of the numeric integral gives an error, so a central-difference approximation is used for ``x'(t)``.
-
-## Questions
-
-###### Question
-
-A cycloid is formed by pushing a wheel on a surface without slipping. The position of a fixed point on the outer rim of the wheel traces out the cycloid. Suppose the wheel has radius $R$ and the initial position of the point is at the bottom, $(0,0)$. Let $t$ measure the angle of rotation, in radians. Then the point of contact of the wheel will be at $Rt$, as that is the distance the wheel will have rotated. That is, the hub of the wheel will move according to $\langle Rt,~ R\rangle$. Relative to the hub, the point on the rim will have coordinates $\langle -R\sin(t), -R\cos(t) \rangle$, so the superposition gives:
-
-```math
-\vec{r}(t) = \langle Rt - R\sin(t), R - R\cos(t) \rangle.
-```
-
-What is the position at $t=\pi/4$?
-
-```julia; hold=true; echo=false
-choices = [
-q"[0.0782914, 0.292893 ]",
-q"[0.181172, 0.5]",
-q"[0.570796, 1.0]"]
-answ = 1
-radioq(choices, answ)
-```
-
-And the position at $\pi/2$?
-
-
-```julia; hold=true; echo=false
-choices = [
-q"[0.0782914, 0.292893 ]",
-q"[0.181172, 0.5]",
-q"[0.570796, 1.0]"]
-answ = 3
-radioq(choices, answ)
-```
-
-###### Question
-
-Suppose instead of keeping track of a point on the outer rim of the wheel, a point a distance $r < R$ from the hub is chosen in the above description of a cycloid (a [Curtate](http://mathworld.wolfram.com/CurtateCycloid.html) cycloid). If we start at $\langle 0,~ R-r \rangle$, what will be the position at $t$?
- -```julia; hold=true; echo=false -choices = [ -" ``\\langle Rt - r\\sin(t),~ R - r\\cos(t) \\rangle``", -" ``\\langle Rt - R\\sin(t),~ R - R\\cos(t) \\rangle``", -" ``\\langle -r\\sin(t),~ -r\\cos(t) \\rangle``" -] -answ = 1 -radioq(choices, answ) -``` - -###### Question - -For the cycloid $\vec{r}(t) = \langle t - \sin(t),~ 1 - \cos(t) \rangle$, find a simplified expression for $\| \vec{r}'(t)\|$. - -```julia; hold=true; echo=false -choices = [ - " ``\\sqrt{2 - 2\\cos(t)}``", - " ``1``", - " ``1 - \\cos(t)``", - " ``1 + \\cos(t) + \\cos(2t)``" -] -answ = 1 -radioq(choices, answ) -``` - -###### Question - -The cycloid $\vec{r}(t) = \langle t - \sin(t),~ 1 - \cos(t) \rangle$ has a formula for the arc length from $0$ to $t$ given by: $l(t) = 4 - 4\cos(t/2)$. - -Plot the following two equations over $[0,8]$ which are a reparameterization of the cycloid by $l^{-1}(t)$. - -```julia; hold=true; -γ(s) = 2 * acos(1-s/4) -x1(s) = γ(s) - sin(γ(s)) -y1(s) = 1 - cos(γ(s)) -``` - -How many arches of the cycloid are traced out? - -```julia; hold=true; echo=false -radioq(1:3, 1, keep_order=true) -``` - -###### Question - -Consider the cycloid $\vec{r}(t) = \langle t - \sin(t),~ 1 - \cos(t) \rangle$ - -What is the derivative at $t=\pi/2$? - -```julia; hold=true; echo=false -choices = [ -q"[1,1]", -q"[2,0]", -q"[0,0]" -] -answ = 1 -radioq(choices, answ) -``` - -What is the derivative at $t=\pi$? - - -```julia; hold=true; echo=false -choices = [ -q"[1,1]", -q"[2,0]", -q"[0,0]" -] -answ = 2 -radioq(choices, answ) -``` - -###### Question - -Consider the circle $\vec{r}(t) = R \langle \cos(t),~ \sin(t) \rangle$, $R > 0$. Find the norm of $\vec{r}'(t)$: - -```julia; hold=true; echo=false -choices = [ - " ``1``", - " ``1/R``", - " ``R``", - " ``R^2``" -] -answ = 3 -radioq(choices, answ, keep_order=true) -``` - - -###### Question - -The curve described by $\vec{r}(t) = \langle 10t,~ 10t - 16t^2\rangle$ models the flight of an arrow. 
Compute the length traveled from when it is launched to when it returns to the ground.
-
-```julia; hold=true; echo=false
-x(t) = 10t
-y(t) = 10t - 16t^2
-a,b = sort(find_zeros(y, -10, 10))
-f(x,y) = 1
-val, _ = quadgk(t -> f(x(t), y(t)) * sqrt(D(x)(t)^2 + D(y)(t)^2), a, b)
-numericq(val)
-```
-
-
-###### Question
-
-Let $\vec{r}(t) = \langle t, t^2 \rangle$ describe a parabola. What is the arc length between $0 \leq t \leq 1$? First, what is a formula for the speed ($\| \vec{r}'(t)\|$)?
-
-```julia; hold=true; echo=false
-choices = [
-" ``\\sqrt{1 + 4t^2}``",
-" ``1 + 4t^2``",
-" ``1``",
-" ``t + t^2``"
-]
-answ = 1
-radioq(choices, answ)
-```
-
-Numerically find the arc length.
-
-```julia; hold=true; echo=false
-val,err = quadgk(t -> (1 + 4t^2)^(1/2), 0, 1)
-numericq(val)
-```
-
-
-
-
-###### Question
-
-Let $\vec{r}(t) = \langle t, t^2 \rangle$ describe a parabola. What is the curvature of $\vec{r}(t)$ at $t=0$?
-
-```julia; hold=true; echo=false
-@syms t::positive
-rt = [t, t^2, 0]
-rp = diff.(rt, t)
-rpp = diff.(rt, t, t)
-kappa = norm(rp × rpp) / norm(rp)^3
-#val = N(kappa(t=>0)) #2
-val = 2
-numericq(val)
-```
-
-The curvature at $1$ will be
-
-```julia; hold=true; echo=false
-choices = [
-"greater than the curvature at ``t=0``",
-"less than the curvature at ``t=0``",
-"the same as the curvature at ``t=0``"]
-answ = 2
-radioq(choices, answ)
-```
-
-The curvature as $t\rightarrow \infty$ will be
-
-```julia; hold=true; echo=false
-choices = [
-" ``0``",
-" ``\\infty``",
-" ``1``"
-]
-answ = 1
-radioq(choices, answ)
-```
-
-----
-
-Now, if we have a more general parabola by introducing a parameter $a>0$: $\vec{r}(t) = \langle t, a\cdot t^2 \rangle$, what is the curvature of $\vec{r}(t)$ at $t=0$?
-
-```julia; hold=true; echo=false
-choices = [
-" ``2a``",
-" ``2/a``",
-" ``2``",
-" ``1``"
-]
-answ = 1
-radioq(choices, answ)
-```
-
-###### Question
-
-Projectile motion with constant acceleration is expressed parametrically by $\vec{x}(t) = \vec{x}_0 + \vec{v}_0 t + (1/2) \vec{a} t^2$, where $\vec{x}_0$ and $\vec{v}_0$ are the initial position and velocity, respectively. In [Strang](https://ocw.mit.edu/resources/res-18-001-calculus-online-textbook-spring-2005/textbook/MITRES_18_001_strang_12.pdf) p451, we find an example utilizing this formula to study the curve of a baseball. Place the pitcher at the origin and the batter along the $x$ axis; then a baseball thrown with spin around its $z$ axis will have acceleration in the $y$ direction in addition to the acceleration due to gravity in the $z$ direction. Suppose the ball starts ``5`` feet above the ground when pitched ($\vec{x}_0 = \langle 0,0, 5\rangle$), and has initial velocity $\vec{v}_0 = \langle 120, -2, 2 \rangle$. (``120`` feet per second is about ``80`` miles per hour.) Suppose the pitcher can produce an acceleration in the $y$ direction of $16ft/sec^2$; then $\vec{a} = \langle 0, 16, -32\rangle$ in these units. (Gravity is $9.8m/s^2$ or $32ft/s^2$.)
-
-
-The plate is ``60`` feet away. How long will it take for the ball to reach the batter? (When is the first component $60$?)
-
-```julia; hold=true; echo=false
-x0 = [0,0,5]
-v0 = [120, -2, 2]
-a = [0, 16, -32]
-r(t) = x0 + v0*t + 1/2*a*t^2
-answ = 60/v0[1]
-numericq(answ)
-```
-
-At $t=1/4$ the ball is half-way to home. If the batter reads the ball at this point, where in the $y$ direction is the ball?
-
-```julia; hold=true; echo=false
-x0 = [0,0,5]
-v0 = [120, -2, 2]
-a = [0, 16, -32]
-r(t) = x0 + v0*t + 1/2*a*t^2
-t = 1/4
-answ = r(t)[2]
-numericq(answ)
-```
-
-At $t=1/2$ has the ball moved more than ``1/2`` foot in the $y$ direction?
-
-```julia; hold=true; echo=false
-x0 = [0,0,5]
-v0 = [120, -2, 2]
-a = [0, 16, -32]
-r(t) = x0 + v0*t + 1/2*a*t^2
-t = 1/2
-answ = abs(r(t)[2]) > 1/2
-yesnoq(answ)
-```
-
-
-###### Question
-
-In [Strang](https://ocw.mit.edu/resources/res-18-001-calculus-online-textbook-spring-2005/textbook/MITRES_18_001_strang_12.pdf) we see this picture describing a curve:
-
-```julia; hold=true; echo=false
-a = 1
-
-plot(t -> 0, -2, 2, aspect_ratio=:equal, legend=false)
-plot!(t -> 2a)
-r(t) = [0, a] + a*[cos(t), sin(t)]
-plot!(unzip(r, 0, 2pi)...)
-theta = pi/3
-plot!([0, 2a/tan(theta)], [0, 2a], linestyle=:dash)
-A = [2a*cot(theta), 2a]
-B = 2a*sin(theta)^2 *[ 1/tan(theta),1]
-scatter!(unzip([A,B])...)
-plot!([B[1],A[1],A[1]], [B[2],B[2],A[2]], linestyle=:dash)
-delta = 0.2
-annotate!([(B[1],B[2]-delta,"B"),(A[1]+delta,A[2]-delta,"A")])
-r(theta) = [2a*cot(theta), 2a*sin(theta)^2 ]
-theta0 = pi/4
-plot!(unzip(r, theta0, pi-theta0)..., linewidth=3)
-P = r(theta)
-annotate!([(P[1],P[2]-delta, "P")])
-```
-
-Strang notes that the curve is called the "witch of Agnesi" after Maria Agnesi, the author of the first three-semester calculus book. (L'Hopital's book did not contain integration.)
-
-
-We wish to identify the parameterization. Using $\theta$, an angle in standard position, we can see that the component functions $x(\theta)$ and $y(\theta)$ may be found using trigonometric analysis.
-
-What is the $x$ coordinate of point $A$? (Also the $x$ coordinate of $P$.)
-
-```julia; hold=true; echo=false
-choices = [
-" ``2\\cot(\\theta)``",
-" ``\\cot(\\theta)``",
-" ``2\\tan(\\theta)``",
-" ``\\tan(\\theta)``"
-]
-answ = 1
-radioq(choices, answ)
-```
-
-Using the polar form of a circle, the length between the origin and $B$ is given by $2\cos(\theta-\pi/2) = 2\sin(\theta)$. Using this, what is the $y$ coordinate of $B$?
-
-```julia; hold=true; echo=false
-choices = [
-" ``2\\sin^2(\\theta)``",
-" ``2\\sin(\\theta)``",
-" ``2``",
-" ``\\sin(\\theta)``"
-]
-answ=1
-radioq(choices, answ)
-```
-
-
-###### Question
-
-Let $n > 0$, $\vec{r}(t) = \langle t^{n+1},t^n\rangle$. Find the speed, $\|\vec{r}'(t)\|$.
-
-```julia; hold=true; echo=false
-choices = [
-" ``\\frac{\\sqrt{n^{2} t^{2 n} + t^{2 n + 2} \\left(n + 1\\right)^{2}}}{t}``",
-" ``t^n + t^{n+1}``",
-" ``\\sqrt{n^2 + t^2}``"
-]
-answ=1
-radioq(choices, answ)
-```
-
-For $n=2$, the arc length of $\vec{r}$ can be found exactly. What is the arc length between $0 \leq t \leq a$?
-
-```julia; hold=true; echo=false
-choices = [
-" ``\\frac{a^{2} \\sqrt{9 a^{2} + 4}}{3} + \\frac{4 \\sqrt{9 a^{2} + 4}}{27} - \\frac{8}{27}``",
-" ``\\frac{2 a^{\\frac{5}{2}}}{5}``",
-" ``\\sqrt{a^2 + 4}``"
-]
-answ = 1
-radioq(choices, answ)
-```
-
-
-###### Question
-
-The [astroid](http://www-history.mcs.st-and.ac.uk/Curves/Astroid.html) is one of the few curves with an exactly computable arc length. The curve is parametrized by $\vec{r}(t) = \langle a\cos^3(t), a\sin^3(t)\rangle$. For $a=1$ find the arc length between $0 \leq t \leq \pi/2$.
-
-```julia; hold=true; echo=false
-choices = [
-" ``\\sqrt{2}``",
-" ``3/2``",
-" ``\\pi/2``",
-" ``2``"
-]
-answ = 2
-radioq(choices, answ, keep_order=true)
-```
-
-
-###### Question
-
-
-```julia; echo=false
-let
-    t0, t1 = pi/12, pi/3
-    tspan = (t0, t1)  # time span to consider
-
-    a = 1
-    r(theta) = -cos(theta) + 4*2cos(theta)*sin(theta)^2
-    F(t) = r(t) * [cos(t), sin(t)]
-    p = (a, F)  # combine parameters
-
-    B0 = F(0) - [0, a]  # some initial position for the back
-    prob = ODEProblem(bicycle, B0, tspan, p)
-
-    out = solve(prob, reltol=1e-6, Tsit5())
-
-    plt = plot(unzip(F, t0, t1)..., legend=false, color=:red)
-    plot!(plt, unzip(t->out(t), t0, t1)..., color=:blue)
-end
-```
-
-
-Let $F$ and $B$ be pictured above. Which is the red curve?
-
-```julia; hold=true; echo=false
-choices = [
-"The front wheel",
-"The back wheel"
-]
-answ=1
-radioq(choices, answ)
-```
-
-
-
-###### Question
-
-
-```julia; echo=false
-let
-    t0, t1 = 0.0, pi/3
-    tspan = (t0, t1)  # time span to consider
-
-    a = 1
-    r(t) = 3a * cos(2t)cos(t)
-    F(t) = r(t) * [cos(t), sin(t)]
-    p = (a, F)  # combine parameters
-
-    B0 = F(0) - [0, a]  # some initial position for the back
-    prob = ODEProblem(bicycle, B0, tspan, p)
-
-    out = solve(prob, reltol=1e-6, Tsit5())
-
-    plt = plot(unzip(F, t0, t1)..., legend=false, color=:blue)
-    plot!(plt, unzip(t->out(t), t0, t1)..., color=:red)
-end
-```
-
-
-Let $F$ and $B$ be pictured above. Which is the red curve?
-
-```julia; hold=true; echo=false
-choices = [
-"The front wheel",
-"The back wheel"
-]
-answ=2
-radioq(choices, answ)
-```
-
-
-###### Question
-
-Let $\vec{\gamma}(s)$ be a parameterization of a curve by arc length and $s(t)$ some continuous increasing function of $t$. Then $\vec{\gamma} \circ s$ also parameterizes the curve. We have
-
-```math
-\text{velocity} = \frac{d (\vec{\gamma} \circ s)}{dt} = \frac{d\vec{\gamma}}{ds} \frac{ds}{dt} = \hat{T} \frac{ds}{dt}.
-```
-
-Continuing with a second derivative
-
-```math
-\text{acceleration} = \frac{d^2(\vec{\gamma}\circ s)}{dt^2} =
-\frac{d\hat{T}}{ds} \frac{ds}{dt} \frac{ds}{dt} + \hat{T} \frac{d^2s}{dt^2} = \frac{d^2s}{dt^2}\hat{T} + \kappa (\frac{ds}{dt})^2 \hat{N},
-```
-
-using $d\hat{T}/ds = \kappa\hat{N}$ when parameterized by arc length.
-
-This expresses the acceleration in terms of the tangential part and the normal part. [Strang](https://ocw.mit.edu/resources/res-18-001-calculus-online-textbook-spring-2005/textbook/MITRES_18_001_strang_12.pdf) views this in terms of driving, where the car motion is determined by the gas pedal and the brake pedal (together giving acceleration only in the $\hat{T}$ direction) and the steering wheel (giving acceleration in the $\hat{N}$ direction).
-
-
-If a car is on a straight road, then $\kappa=0$.
Is the acceleration along the $\hat{T}$ direction or the $\hat{N}$ direction?
-
-```julia; hold=true; echo=false
-choices = [
-"The ``\\hat{T}`` direction",
-"The ``\\hat{N}`` direction"]
-answ = 1
-radioq(choices, answ)
-```
-
-Suppose no gas or brake is applied for a duration of time. The tangential acceleration will be $0$. During this time, which of these must be $0$?
-
-```julia; hold=true; echo=false
-choices = [
-" ``\\vec{\\gamma} \\circ s``",
-" ``ds/dt``",
-" ``d^2s/dt^2``"
-]
-answ = 3
-radioq(choices, answ, keep_order=true)
-```
-
-In going around a corner (with non-zero curvature), which is true?
-
-```julia; hold=true; echo=false
-choices = [
-"The acceleration in the normal direction depends on both the curvature and the speed (``ds/dt``)",
-"The acceleration in the normal direction depends only on the curvature and not the speed (``ds/dt``)",
-"The acceleration in the normal direction depends only on the speed (``ds/dt``) and not the curvature"
-]
-answ = 1
-radioq(choices, answ)
-```
-
-###### Question
-
-The evolute comes from the formula $\vec\gamma(t) + (1/\kappa(t)) \hat{N}(t)$. For hand computation, this formula can be explicitly given by two components $\langle X(t), Y(t) \rangle$ through:
-
-```math
-\begin{align}
-r(t) &= x'(t)^2 + y'(t)^2\\
-k(t) &= x'(t)y''(t) - x''(t) y'(t)\\
-X(t) &= x(t) - y'(t) r(t)/k(t)\\
-Y(t) &= y(t) + x'(t) r(t)/k(t)
-\end{align}
-```
-
-Let $\vec\gamma(t) = \langle t, t^2 \rangle = \langle x(t), y(t)\rangle$ be a parameterization of a parabola.
- -* Compute $r(t)$ - -```julia; hold=true; echo=false -choices = [ - " ``1 + 4t^2``", - " ``1 - 4t^2``", - " ``1 + 2t``", - " ``1 - 2t``" -] -answ = 1 -radioq(choices, answ) -``` - -* Compute $k(t)$ - -```julia; hold=true; echo=false -choices = [ - " ``2``", - " ``-2``", - " ``8t``", - " ``-8t``" -] -answ = 1 -radioq(choices, answ) -``` - -* Compute $X(t)$ - -```julia; hold=true; echo=false -choices = [ - " ``t - 2t(1 + 4t^2)/2``", - " ``t - 4t(1+2t)/2``", - " ``t - 2(8t)/(1-2t)``", - " ``t - 1(1+4t^2)/2``" -] -answ = 1 -radioq(choices, answ) -``` - -* Compute $Y(t)$ - -```julia; hold=true; echo=false -choices = [ - " ``t^2 + 1(1 + 4t^2)/2``", - " ``t^2 + 2t(1+4t^2)/2``", - " ``t^2 - 1(1+4t^2)/2``", - " ``t^2 - 2t(1+4t^2)/2``" -] -answ = 1 -radioq(choices, answ) -``` - -###### Question - -The following will compute the evolute of an ellipse: - -```julia; eval=false -@syms t a b -x = a * cos(t) -y = b * sin(t) -xp, xpp, yp, ypp = diff(x, t), diff(x,t,t), diff(y,t), diff(y,t,t) -r2 = xp^2 + yp^2 -k = xp * ypp - xpp * yp -X = x - yp * r2 / k |> simplify -Y = y + xp * r2 / k |> simplify -[X, Y] -``` - -What is the resulting curve? 
-
-```julia; hold=true; echo=false
-choices = [
-"An astroid of the form ``c \\langle \\cos^3(t), \\sin^3(t) \\rangle``",
-"A cubic parabola of the form ``\\langle ct^3, dt^2\\rangle``",
-"An ellipse of the form ``\\langle a\\cos(t), b\\sin(t)\\rangle``",
-"A cycloid of the form ``c\\langle t + \\sin(t), 1 - \\cos(t)\\rangle``"
-]
-answ = 1
-radioq(choices, answ)
-```
diff --git a/CwJ/differentiable_vector_calculus/vectors.jmd b/CwJ/differentiable_vector_calculus/vectors.jmd
deleted file mode 100644
index 8afefe3..0000000
--- a/CwJ/differentiable_vector_calculus/vectors.jmd
+++ /dev/null
@@ -1,1336 +0,0 @@
-# Vectors and matrices
-
-
-This section uses these add-on packages:
-
-```julia
-using CalculusWithJulia
-using Plots
-using LinearAlgebra
-using SymPy
-```
-
-```julia; echo=false; results="hidden"
-using CalculusWithJulia.WeaveSupport
-
-const frontmatter = (
-    title = "Vectors and matrices",
-    description = "Calculus with Julia: Vectors and matrices",
-    tags = ["CalculusWithJulia", "differentiable_vector_calculus", "vectors and matrices"],
-);
-
-nothing
-```
-
-----
-
-In [vectors](../precalc/vectors.html) we introduced the concept of a
-vector. For `Julia`, vectors are a useful storage container and are
-used to hold, for example, zeros of functions or the coefficients of a
-polynomial. This section is about their mathematical properties. A
-[vector](https://en.wikipedia.org/wiki/Euclidean_vector)
-mathematically is a geometric object with two attributes: a magnitude
-and a direction. (The direction is undefined in the case the magnitude
-is $0$.) Vectors are typically visualized with an arrow, where the
-anchoring of the arrow is context dependent and is not particular to a
-given vector.
-
-Vectors and points are related, but distinct. They are identified when the tail of the vector is taken to be the origin. Let's focus on ``3`` dimensions.
Mathematically, the notation for a point is $p=(x,y,z)$ while the notation for a vector is $\vec{v} = \langle x, y, z \rangle$. The $i$th component in a vector is referenced by a subscript: $v_i$. With this, we may write a typical vector as $\vec{v} = \langle v_1, v_2, \dots, v_n \rangle$ and a vector in $n=3$ dimensions as $\vec{v} =\langle v_1, v_2, v_3 \rangle$.
-The different grouping notation distinguishes the two objects. As another example, the notation $\{x, y, z\}$ indicates a set. Vectors and points may be *identified* by anchoring the vector at the origin. Sets are quite different from both, as the entries of a set are unordered.
-
-
-
-
-In `Julia`, the notation to define a point and a vector would be identical, using square brackets to group like-type values: `[x, y, z]`. The notation `(x,y,z)` would form a [tuple](https://en.wikipedia.org/wiki/Tuple), which, though similar in many respects, is different, as tuples do not have the operations associated with a point or a vector defined for them.
-
-The square bracket constructor has some subtleties:
-
-* `[x,y,z]` calls `vect` and creates a 1-dimensional array
-* `[x; y; z]` calls `vcat` to **v**ertically con**cat**enate values together. With simple (scalar) values `[x,y,z]` and `[x; y; z]` are identical, but not in other cases. (For example, if `A` is a matrix then `[A, A]` is a vector of matrices, while `[A; A]` is a matrix combined from the two pieces.)
-* `[x y z]` calls `hcat` to **h**orizontally con**cat**enate values together. If `x`, `y` are numbers then `[x y]` is *not* a vector, but rather a ``2``D array with a single row and two columns.
-* finally `[w x; y z]` calls `hvcat` to horizontally and vertically concatenate values together to create a container in two dimensions, like a matrix.
In `Julia`, a vector can hold a collection of objects of arbitrary type, though each will be promoted to a common type.)
-
-
-## Vector addition, scalar multiplication
-
-As seen earlier, vectors have some arithmetic operations defined for them. As a typical use of vectors, mathematically, is to collect the $x$, $y$, and $z$ (in ``3``D) components together, operations like addition and subtraction operate component-wise. With this, addition can be visualized geometrically: put the tail of $\vec{v}$ at the tip of $\vec{u}$ and draw a vector from the tail of $\vec{u}$ to the tip of $\vec{v}$ and you have $\vec{u}+\vec{v}$. This is identical to $\vec{v} + \vec{u}$, as vector addition is commutative. Unless $\vec{u}$ and $\vec{v}$ are parallel or one has $0$ length, the addition will create a vector with a different direction from the two.
-
-Another operation for vectors is *scalar* multiplication. Geometrically this changes the magnitude, but not the direction of a vector, when the *scalar* is positive. Scalar multiplication is defined component-wise, like addition, so the $i$th component of $c \vec{v}$ is $c$ times the $i$th component of $\vec{v}$. When the scalar is negative, the direction is "reversed."
-
-To illustrate we define two ``3``-dimensional vectors:
-
-```julia;
-u, v = [1, 2, 3], [4, 3, 2]
-```
-
-The sum is component-wise summation (`1+4, 2+3, 3+2`):
-
-```julia;
-u + v
-```
-
-For addition, as the components must pair off, the two vectors being added must be of the same dimension.
-
-Scalar multiplication by `2`, say, multiplies each entry by `2`:
-
-```julia;
-2 * u
-```
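These component-wise rules, along with the bracket-constructor distinctions noted above, can be verified directly. This is a quick sketch (plain `Julia`, no add-on packages needed), not part of the original text:

```julia
# Component-wise arithmetic gives the usual vector-space rules.
u, v = [1, 2, 3], [4, 3, 2]

@assert u + v == v + u            # vector addition is commutative
@assert u + v == [5, 5, 5]        # 1+4, 2+3, 3+2
@assert 2 * (u + v) == 2u + 2v    # scalar multiplication distributes
@assert (-1) * u == -u            # a negative scalar reverses direction

# The bracket constructors differ mainly in the shape they produce:
@assert [1, 2, 3] == [1; 2; 3]     # vect and vcat agree for scalars
@assert size([1 2 3]) == (1, 3)    # hcat: a single-row matrix
@assert size([1 2; 3 4]) == (2, 2) # hvcat: a 2×2 matrix
```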
First, if $c$ is a scalar, $\| c\vec{v} \| = |c| \| \vec{v} \|$, which says scalar multiplication by $c$ changes the length by $|c|$. (Sometimes, scalar multiplication is described as "scaling by....")
The other property is an analog of the triangle inequality: for any two vectors $\| \vec{v} + \vec{w} \| \leq \| \vec{v} \| + \| \vec{w} \|$, with equality only when the two vectors are parallel.

A vector with length $1$ is called a *unit* vector. Dividing a non-zero vector by its norm will yield a unit vector, a consequence of the first property above. Unit vectors are often written with a "hat:" $\hat{v}$.

The direction indicated by $\vec{v}$ can be visualized as an angle in ``2``- or ``3``-dimensions, but in higher dimensions, visualization is harder. In ``2`` dimensions, we might associate with a vector its unit vector. This in turn may be identified with a point on the unit circle, which from basic trigonometry can be associated with an angle. Something similar can be done in ``3`` dimensions, using two angles. However, the "direction" of a vector is best thought of in terms of its associated unit vector. With this, we have a decomposition of a non-zero vector $\vec{v}$ into a magnitude and a direction when we write $\vec{v} = \|\vec{v}\| \cdot (\vec{v} / \|\vec{v}\|)=\|\vec{v}\| \hat{v}$.

## Visualization of vectors

Vectors may be visualized in ``2`` or ``3`` dimensions using `Plots`. In ``2`` dimensions, the `quiver` function may be used. To graph a vector, it must have its tail placed at a point, so two values are needed.

To plot `u=[1,2]` from `p=[0,0]` we have the following usage:

```julia;
quiver([0],[0], quiver=([1],[2]))
```

The cumbersome syntax is typical here. We naturally describe vectors and points using `[a,b,c]` to combine them, but the plotting functions want to plot many such at a time and expect vectors containing just the `x` values, just the `y` values, etc.
The above usage looks a bit odd, as these vectors of `x` and `y` values have only one entry. Converting from the one representation to the other requires reshaping the data. We will use the `unzip` function from `CalculusWithJulia`, which in turn just uses the `invert` function of the `SplitApplyCombine` package ("return a new nested container by reversing the order of the nested container") for the bulk of its work.

This function takes a vector of vectors, and returns a vector containing the `x` values, the `y` values, etc. So if `u=[1,2,3]` and `v=[4,5,6]`, then `unzip([u,v])` becomes `([1,4],[2,5],[3,6])`. (The `zip` function in base does essentially the reverse operation, hence the name.) Notationally, `A = [u,v]` can have the third element of the first vector (`u`) accessed by `A[1][3]`, whereas `unzip(A)[3][1]` will do the same. We use `unzip([u])` in the following, which for this `u` returns `([1],[2],[3])`. (Note the `[u]` to make a vector of a vector.)

With `unzip` defined, we can plot a ``2``-dimensional vector `v` anchored at point `p` through `quiver(unzip([p])..., quiver=unzip([v]))`.

To illustrate, the following defines ``3`` vectors (the third through addition), then graphs all three, though with different starting points to emphasize the geometric interpretation of vector addition.

```julia; hold=true
u = [1, 2]
v = [4, 2]
w = u + v
p = [0,0]
quiver(unzip([p])..., quiver=unzip([u]))
quiver!(unzip([u])..., quiver=unzip([v]))
quiver!(unzip([p])..., quiver=unzip([w]))
```

Plotting a ``3``-d vector is not supported in all toolkits with `quiver`. A line segment may be substituted and can be produced with `plot(unzip([p,p+v])...)`. To avoid all these details, the `CalculusWithJulia` package provides the `arrow!` function to *add* a vector to an existing plot.
The function requires a point, `p`, and the vector, `v`: - - -With this, the above simplifies to: - -```julia; hold=true -u = [1, 2] -v = [4, 2] -w = u + v -p = [0,0] -plot(legend=false) -arrow!(p, u) -arrow!(u, v) -arrow!(p, w) -``` - - -The distinction between a point and a vector within `Julia` is only mental. We use the same storage type. Mathematically, we can **identify** a point and a vector, by considering the vector with its tail placed at the origin. In this case, the tip of the arrow is located at the point. But this is only an identification, though a useful one. It allows us to "add" a point and a vector (e.g., writing $P + \vec{v}$) by imagining the point as a vector anchored at the origin. - - - -To see that a unit vector has the same "direction" as the vector, we might draw them with different widths: - -```julia; hold=true -v = [2, 3] -u = v / norm(v) -p = [0, 0] -plot(legend=false) -arrow!(p, v) -arrow!(p, u, linewidth=5) -``` - - -The `norm` function is in the standard library, `LinearAlgebra`, which must be loaded first through the command `using LinearAlgebra`. (Though here it is redundant, as that package is loaded and reexported when the `CalculusWithJulia` package is loaded.) - -## Aside: review of `Julia`'s use of dots to work with containers - -`Julia` makes use of the dot, "`.`", in a few ways to simplify usage when containers, such as vectors, are involved: - - -* **Splatting**. The use of three dots, "`...`", to "splat" the values from a container like a vector (or tuple) into *arguments* of a function can be very convenient. It was used above in the definition for the `arrow!` function: essentially `quiver!(unzip([p])..., quiver=unzip([v]))`. The `quiver` function expects ``2`` (or ``3``) arguments describing the `xs` and `ys` (and sometimes `zs`). The `unzip` function returns these in a container, so splatting is used to turn the values in the container into distinct arguments of the function. 
The `quiver` keyword argument, on the other hand, expects a tuple of vectors, so no splatting is used for that part of the definition. Another use of splatting we will see is with functions of vectors. These can be defined in terms of the vector's components or the vector as a whole, as below:

```julia;
f(x, y, z) = x^2 + y^2 + z^2
f(v) = v[1]^2 + v[2]^2 + v[3]^2
```

The first uses the components and is, arguably, much easier to read. The second uses indexing in the function body to access the components. It has an advantage, as it can more easily handle vectors of different lengths (e.g. using `sum(v.^2)`). Both uses have their merits, though the latter is more idiomatic throughout `Julia`.

If a function is easier to write in terms of its components, but an interface expects a vector of components as its argument, then splatting can be useful, to go from one style to another, similar to this:

```julia;
g(x, y, z) = x^2 + y^2 + z^2
g(v) = g(v...)
```

The splatting will mean `g(v)` eventually calls `g(x, y, z)` through `Julia`'s multiple dispatch machinery when `v = [x, y, z]`.

(The three dots can also appear in the definition of the arguments to a function, but there the usage is not splatting but rather a specification of a variable number of arguments.)

* **Broadcasting**. For a univariate function, `f`, and vector, `xs`, the call `f.(xs)` *broadcasts* `f` over each value of `xs` and returns a container holding all the values. This is a compact alternative to a comprehension when a function is defined. When `f` depends on more than one value, broadcasting can still be used: `f.(xs, ys)` will broadcast `f` over values formed from *both* `xs` and `ys`. Broadcasting has the extra feature (over `map`) of attempting to match up the shapes of `xs` and `ys` when they are not identical. (See the help page for `broadcast` for more details.)
- -For example, if `xs` is a vector and `ys` a scalar, then the value in `ys` is repeated many times to match up with the values of `xs`. Or if `xs` and `ys` have different dimensions, the values of one will be repeated. Consider this: - -```julia -𝐟(x,y) = x + y -``` - -```julia; hold=true -xs = ys = [0, 1] -𝐟.(xs, ys) -``` - -This matches `xs` and `ys` to pass `(0,0)` and then `(1,1)` to `f`, returning `0` and `2`. Now consider - -```julia; hold=true -xs = [0, 1]; ys = [0 1] # xs is a column vector, ys a row vector -𝐟.(xs, ys) -``` - -The two dimensions are different so for each value of `xs` the vector of `ys` is broadcast. This returns a matrix now. This will be important for some plotting usages where a grid (matrix) of values is needed. - -At times using the "apply" notation: `x |> f`, in place of using `f(x)` is useful, as it can move the wrapping function to the right of the expression. To broadcast, `.|>` is available. - - - -## The dot product - -There is no concept of multiplying two vectors, or for that matter dividing two vectors. However, there are two operations between vectors that are somewhat similar to multiplication, these being the dot product and the cross product. Each has an algebraic definition, but their geometric properties are what motivate their usage. We begin by discussing the dot product. - - -The dot product between two vectors can be viewed algebraically in terms of the following product. If $\vec{v} = \langle v_1, v_2, \dots, v_n\rangle$ and $\vec{w} = \langle w_1, w_2, \dots, w_n\rangle$, then the *dot product* of $\vec{v}$ and $\vec{w}$ is defined by: - -```math -\vec{v} \cdot \vec{w} = v_1 w_1 + v_2 w_2 + \cdots + v_n w_n. -``` - -From this, we can see the relationship between the norm, or Euclidean length of a vector: $\vec{v} \cdot \vec{v} = \| \vec{v} \|^2$. We can also see that the dot product is commutative, that is $\vec{v} \cdot \vec{w} = \vec{w} \cdot \vec{v}$. 
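Before turning to the geometry, the algebraic definition can be checked directly: summing the componentwise products, as computed with broadcasting, agrees with the norm relation $\vec{v}\cdot\vec{v} = \|\vec{v}\|^2$. A small sketch (the `≈` comparison allows for floating point round off):

```julia; hold=true
using LinearAlgebra

v, w = [1, 2, 3], [4, 5, 6]
sum(v .* w)               # componentwise products, then sum: 4 + 10 + 18
sum(v .* v) ≈ norm(v)^2   # the relation between the dot product and the norm
```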
The dot product has an important geometrical interpretation. Two (non-parallel) vectors will lie in the same "plane", even in higher dimensions. Within this plane, there will be an angle between them within $[0, \pi]$. Call this angle $\theta$. (This means the angle between the two vectors is the same regardless of their order of consideration.) Then

```math
\vec{v} \cdot \vec{w} = \|\vec{v}\| \|\vec{w}\| \cos(\theta).
```

If we denote $\hat{v} = \vec{v} / \| \vec{v} \|$, the unit vector in the direction of $\vec{v}$, then by dividing, we see that $\cos(\theta) = \hat{v} \cdot \hat{w}$. That is, the angle does not depend on the magnitude of the vectors involved.

The dot product is computed in `Julia` by the `dot` function, which is in the `LinearAlgebra` package of the standard library. This must be loaded (as above) before its use either directly or through the `CalculusWithJulia` package:

```julia;
𝒖 = [1, 2]
𝒗 = [2, 1]
dot(𝒖, 𝒗)
```

!!! note

    In `Julia`, the unicode operator entered by `\cdot[tab]` can also be used to mirror the math notation:

```julia;
𝒖 ⋅ 𝒗 # u \cdot[tab] v
```

Continuing, to find the angle between $\vec{u}$ and $\vec{v}$, we might do this:

```julia;
𝒄theta = dot(𝒖/norm(𝒖), 𝒗/norm(𝒗))
acos(𝒄theta)
```

The cosine of $\pi/2$ is $0$, so two vectors which are at right angles to each other will have a dot product of $0$:

```julia; hold=true
u = [1, 2]
v = [2, -1]
u ⋅ v
```

In two dimensions, we learn that a perpendicular line to a line with slope $m$ will have slope $-1/m$. From a ``2``-dimensional vector, say $\vec{u} = \langle u_1, u_2 \rangle$, the slope is $u_2/u_1$ so a perpendicular vector to $\vec{u}$ will be $\langle u_2, -u_1 \rangle$, as above. For higher dimensions, where the angle is harder to visualize, the dot product defines perpendicularity, or *orthogonality*.
For example, these two vectors are orthogonal, as their dot product is $0$, even though we can't readily visualize them:

```julia; hold=true
u = [1, 2, 3, 4, 5]
v = [-30, 4, 3, 2, 1]
u ⋅ v
```

#### Projection

From right triangle trigonometry, we learn that $\cos(\theta) = \text{adjacent}/\text{hypotenuse}$. If we use a vector, $\vec{h}$, for the hypotenuse, and $\vec{a} = \langle 1, 0 \rangle$, we have this picture:

```julia; hold=true
h = [2, 3]
a = [1, 0] # unit vector
h_hat = h / norm(h)
theta = acos(h_hat ⋅ a)

plot(legend=false)
arrow!([0,0], h)
arrow!([0,0], norm(h) * cos(theta) * a)
arrow!([0,0], a, linewidth=3)
```

We used vectors to find the angle made by `h`, and from there, as the length of the hypotenuse is `norm(h)`, we can identify the length of the adjacent side, it being the length of the hypotenuse times the cosine of $\theta$. Geometrically, we call the vector `norm(h) * cos(theta) * a` the *projection* of $\vec{h}$ onto $\vec{a}$, the word coming from the shadow $\vec{h}$ would cast on the direction of $\vec{a}$ were there light coming perpendicular to $\vec{a}$.

The projection can be made for any pair of vectors, and in any dimension $n > 1$. The projection of $\vec{u}$ on $\vec{v}$ is a vector of length $\|\vec{u}\|$ (the hypotenuse) times the cosine of the angle, pointing in the direction of $\vec{v}$. In dot-product notation:

```math
proj_{\vec{v}}(\vec{u}) = \| \vec{u} \| \frac{\vec{u}\cdot\vec{v}}{\|\vec{u}\|\|\vec{v}\|} \frac{\vec{v}}{\|\vec{v}\|}.
```

This can simplify. After cancelling, and expressing norms in terms of dot products, we have:

```math
proj_{\vec{v}}(\vec{u}) = \frac{\vec{u} \cdot \vec{v}}{\vec{v} \cdot \vec{v}} \vec{v} = (\vec{u} \cdot \hat{v}) \hat{v},
```

where $\hat{v}$ is the unit vector in the direction of $\vec{v}$.

##### Example

A pendulum, a bob on a string, swings back and forth due to the force of gravity.
When the bob is displaced from rest by an angle $\theta$, the tension force of the string on the bob is directed along the string and has magnitude given by the *projection* of the force due to gravity.

A [force diagram](https://en.wikipedia.org/wiki/Free_body_diagram) is a useful visualization device in physics to illustrate the applied forces involved in a scenario. In this case the bob has two forces acting on it: a force due to tension in the string of unknown magnitude, but in the direction of the string; and a force due to gravity. The latter is in the downward direction and has magnitude $mg$, $g=9.8 m/sec^2$ being the acceleration due to gravity.

```julia;
𝗍heta = pi/12
𝗆ass, 𝗀ravity = 1/9.8, 9.8

𝗅 = [-sin(𝗍heta), cos(𝗍heta)]
𝗉 = -𝗅
𝖥g = [0, -𝗆ass * 𝗀ravity]
plot(legend=false)
arrow!(𝗉, 𝗅)
arrow!(𝗉, 𝖥g)
scatter!(𝗉[1:1], 𝗉[2:2], markersize=5)
```

The magnitude of the tension force is exactly that of the force of gravity projected onto $\vec{l}$, as the bob is not accelerating in that direction. The component of the gravity force in the perpendicular direction is the part of the gravitational force that causes acceleration in the pendulum. Here we find the projection onto $\vec{l}$ and visualize the two components of the gravitational force.

```julia;
plot(legend=false, aspect_ratio=:equal)
arrow!(𝗉, 𝗅)
arrow!(𝗉, 𝖥g)
scatter!(𝗉[1:1], 𝗉[2:2], markersize=5)

𝗉roj = (𝖥g ⋅ 𝗅) / (𝗅 ⋅ 𝗅) * 𝗅 # force of gravity in direction of tension
𝗉orth = 𝖥g - 𝗉roj # force of gravity perpendicular to tension

arrow!(𝗉, 𝗉roj)
arrow!(𝗉, 𝗉orth, linewidth=3)
```

##### Example

Starting with three vectors, we can create three orthogonal vectors using projection and subtraction. The creation of `porth` above is the pattern we will exploit.
Let's begin with three vectors in $R^3$:

```julia;
u = [1, 2, 3]
v = [1, 1, 2]
w = [1, 2, 4]
```

We can find a vector from `v` orthogonal to `u`, and then one from `w` orthogonal to both, using:

```julia;
unit_vec(u) = u / norm(u)
projection(u, v) = (u ⋅ unit_vec(v)) * unit_vec(v)

vₚ = v - projection(v, u)
wₚ = w - projection(w, u) - projection(w, vₚ)
```

We can verify the orthogonality through:

```julia;
u ⋅ vₚ, u ⋅ wₚ, vₚ ⋅ wₚ
```

This only works when the three vectors do not all lie in the same plane. In general, this is the beginning of the [Gram-Schmidt](https://en.wikipedia.org/wiki/Gram-Schmidt_process) process for creating *orthogonal* vectors from a collection of vectors.

#### Algebraic properties

The dot product is similar to multiplication, but different, as it is an operation defined between vectors of the same dimension. However, many algebraic properties carry over:

* commutative: $\vec{u} \cdot \vec{v} = \vec{v} \cdot \vec{u}$

* scalar multiplication: $(c\vec{u})\cdot\vec{v} = c(\vec{u}\cdot\vec{v})$.

* distributive: $\vec{u} \cdot (\vec{v} + \vec{w}) = \vec{u} \cdot \vec{v} + \vec{u} \cdot \vec{w}$

The last two can be combined: $\vec{u}\cdot(s \vec{v} + t \vec{w}) = s(\vec{u}\cdot\vec{v}) + t (\vec{u}\cdot\vec{w})$.

But the associative property does not make sense, as $(\vec{u} \cdot \vec{v}) \cdot \vec{w}$ does not make sense as two dot products: the result of the first is not a vector, but a scalar.

## Matrices

Algebraically, the dot product of two vectors - pair off by components, multiply these, then add - is a common operation. Take for example, the general equation of a line, or a plane:

```math
ax + by = c, \quad ax + by + cz = d.
```

The left hand sides are in the form of a dot product, in this case $\langle a,b \rangle \cdot \langle x, y\rangle$ and $\langle a,b,c \rangle \cdot \langle x, y, z\rangle$ respectively.
When there is a system of equations, something like:

```math
\begin{array}{}
3x &+& 4y &- &5z &= 10\\
5x &-& 5y &+ &7z &= 11\\
-3x &+& 6y &+ &9z &= 12,
\end{array}
```

then we might think of ``3`` vectors $\langle 3,4,-5\rangle$, $\langle 5,-5,7\rangle$, and $\langle -3,6,9\rangle$ being dotted with $\langle x,y,z\rangle$. Mathematically, matrices and their associated algebra are used to represent this. In this example, the system of equations above would be represented by a matrix and two vectors:

```math
M = \left[
\begin{array}{}
3 & 4 & -5\\
5 &-5 & 7\\
-3& 6 & 9
\end{array}
\right],\quad
\vec{x} = \langle x, y , z\rangle,\quad
\vec{b} = \langle 10, 11, 12\rangle,
```

and the expression $M\vec{x} = \vec{b}$. The matrix $M$ is a rectangular collection of numbers or expressions arranged in rows and columns with certain algebraic definitions. There are $m$ rows and $n$ columns in an $m\times n$ matrix. In this example $m=n=3$, and in such a case the matrix is called square. A vector, like $\vec{x}$, is usually identified with the $n \times 1$ matrix (a column vector). Were that done, the system of equations would be written $Mx=b$.

If we refer to a matrix $M$ by its components, a convention is to use $(M)_{ij}$ or $m_{ij}$ to denote the entry in the $i$th *row* and $j$th *column*. Following `Julia`'s syntax, we would use $m_{i:}$ to refer to *all* entries in the $i$th row, and $m_{:j}$ to denote *all* entries in the $j$th column.

In addition to square matrices, there are some other common types of matrices worth naming: square matrices with $0$ entries below the diagonal are called upper triangular; square matrices with $0$ entries above the diagonal are called lower triangular matrices; square matrices which are $0$ except possibly along the diagonal are diagonal matrices; and a diagonal matrix whose diagonal entries are all $1$ is called an *identity matrix*.
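The `LinearAlgebra` standard library has specialized constructors for several of these special matrix types, which can be more convenient than entering each zero. A brief sketch:

```julia;
using LinearAlgebra

A = [1 2 3; 4 5 6; 7 8 9]
UpperTriangular(A)    # keeps the diagonal and entries above it; zeros below
LowerTriangular(A)    # keeps the diagonal and entries below it; zeros above
Diagonal([1, 2, 3])   # zero off the diagonal
Matrix(1.0I, 3, 3)    # a 3×3 identity matrix built from the identity object `I`
```

These specialized types also allow `Julia` to use faster algorithms when the structure is known in advance.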
Matrices, like vectors, have scalar multiplication defined for them; scalar multiplication of a matrix $M$ by $c$ just multiplies each entry by $c$, so the new matrix would have components defined by $cm_{ij}$.

Matrices of the same size, like vectors, have addition defined for them. As with scalar multiplication, addition is defined component wise. So $A+B$ is the matrix with $ij$ entry $A_{ij} + B_{ij}$.

### Matrix multiplication

Matrix multiplication may be viewed as a collection of dot product operations. First, matrix multiplication is only defined between $A$ and $B$, as $AB$, if the size of $A$ is $m\times n$ and the size of $B$ is $n \times k$. That is, the number of columns of $A$ must match the number of rows of $B$ for the left multiplication of $AB$ to be defined. If this is so, then the $ij$ entry of $AB$ is:

```math
(AB)_{ij} = A_{i:} \cdot B_{:j}.
```

That is, if we view the $i$th row of $A$ and the $j$th column of $B$ as *vectors*, then the $ij$ entry is the dot product.

This is why $M$ in the example above has the coefficients for each equation in a row and not a column, and why $\vec{x}$ is thought of as an $n\times 1$ matrix (a column vector) and not as a row vector.

Matrix multiplication between $A$ and $B$ is not, in general, commutative. Not only may the sizes not permit $BA$ to be found when $AB$ may be, there is just no guarantee when the sizes match that the components will be the same.

----

Matrices have other operations defined on them. We mention three here:

* The *transpose* of a matrix flips the difference between row and column, so the $ij$ entry of the transpose is the $ji$ entry of the matrix. This means the transpose will have size $n \times m$ when $M$ has size $m \times n$. Mathematically, the transpose is denoted $M^t$.

* The *determinant* of a *square* matrix is a number that can be used to characterize the matrix.
The determinant may be computed different ways, but its [definition](https://en.wikipedia.org/wiki/Leibniz_formula_for_determinants) by the Leibniz formula is common. Two special cases are all we need. The $2\times 2$ case and the $3 \times 3$ case:

```math
\left|
\begin{array}{}
a&b\\
c&d
\end{array}
\right| =
ad - bc, \quad
\left|
\begin{array}{}
a&b&c\\
d&e&f\\
g&h&i
\end{array}
\right| =
a \left|
\begin{array}{}
e&f\\
h&i
\end{array}
\right|
- b \left|
\begin{array}{}
d&f\\
g&i
\end{array}
\right|
+ c \left|
\begin{array}{}
d&e\\
g&h
\end{array}
\right|.
```

The $3\times 3$ case shows how determinants may be [computed recursively](https://en.wikipedia.org/wiki/Determinant#Definition), using "cofactor" expansion.

* The *inverse* of a square matrix. If $M$ is a square matrix and its determinant is non-zero, then there is an *inverse* matrix, denoted $M^{-1}$, with the properties that $MM^{-1} = M^{-1}M = I$, where $I$ is the diagonal matrix of all $1$s called the identity matrix.

### Matrices in Julia

As mentioned previously, a matrix in `Julia` is defined component by component with `[]`. We separate row entries with spaces and rows with semicolons:

```julia;
ℳ = [3 4 -5; 5 -5 7; -3 6 9]
```

Space is the separator, which means computing a component during definition (i.e., writing `2 + 3` in place of `5`) can be problematic, as no space can be used in the computation, lest it be parsed as a separator.

Vectors are defined similarly. As they are identified with *column* vectors, we use a semicolon (or a comma with simple numbers) to separate:

```julia;
𝒷 = [10, 11, 12] # not 𝒷 = [10 11 12], which would be a row vector.
```

In `Julia`, entries in a matrix (or a vector) are stored in a container with a type wide enough to accommodate each entry. In this example, the type is SymPy's `Sym` type:

```julia;
@syms x1 x2 x3
𝓍 = [x1, x2, x3]
```

Matrices may also be defined from blocks.
This example shows how to make two column vectors into a matrix: - -```julia; -𝓊 = [10, 11, 12] -𝓋 = [13, 14, 15] -[𝓊 𝓋] # horizontally combine -``` - -Vertically combining the two will stack them: - -```julia; -[𝓊; 𝓋] -``` - - - -Scalar multiplication will just work as expected: - -```julia; -2 * ℳ -``` - -Matrix addition is also straightforward: - -```julia; -ℳ + ℳ -``` - -Matrix addition expects matrices of the same size. An error will otherwise be thrown. However, if addition is *broadcasted* then the sizes need only be commensurate. For example, this will add `1` to each entry of `M`: - -```julia; -ℳ .+ 1 -``` - -Matrix multiplication is defined by `*`: - -```julia; -ℳ * ℳ -``` - -We can then see how the system of equations is represented with matrices: - -```julia; -ℳ * 𝓍 - 𝒷 -``` - -Here we use `SymPy` to verify the above: - -```julia; -𝒜 = [symbols("A$i$j", real=true) for i in 1:3, j in 1:2] -ℬ = [symbols("B$i$j", real=true) for i in 1:2, j in 1:2] -``` - -The matrix product has the expected size: the number of rows of `A` (``3``) by the number of columns of `B` (``2``): - -```julia; -𝒜 * ℬ -``` - -This confirms how each entry (`(A*B)[i,j]`) is from a dot product (`A[i,:] ⋅ B[:,j]`): - -```julia; -[ (𝒜 * ℬ)[i,j] == 𝒜[i,:] ⋅ ℬ[:,j] for i in 1:3, j in 1:2] -``` - - -When the multiplication is broadcasted though, with `.*`, the operation will be component wise: - -```julia; -ℳ .* ℳ # component wise (Hadamard product) -``` - ----- - - -The determinant is found by `det` provided by the `LinearAlgebra` package: - -```julia; -det(ℳ) -``` - ----- - -The transpose of a matrix is found through `transpose` which doesn't create a new object, but rather an object which knows to switch indices when referenced: - -```julia; -transpose(ℳ) -``` - -For matrices with *real* numbers, the transpose can be performed with the postfix operation `'`: - -```julia; -ℳ' -``` - -(However, this is not true for matrices with complex numbers as `'` is the "adjoint," that is, the transpose 
of the matrix *after* taking complex conjugates.)

With `u` and `v`, vectors from above, we have:

```julia;
[𝓊' 𝓋'] # [𝓊 𝓋] was a 3 × 2 matrix, above
```

and

```julia;
[𝓊'; 𝓋']
```

```julia; echo=false
note("""
The adjoint is defined *recursively* in `Julia`. In the `CalculusWithJulia` package, we overload the `'` notation for *functions* to yield a univariate derivative found with automatic differentiation. This can lead to problems: if we have a matrix of functions, `M`, and took the transpose with `M'`, then the entries of `M'` would be the derivatives of the functions in `M` - not the original functions. This is very much likely to not be what is desired. The `CalculusWithJulia` package commits **type piracy** here *and* abuses the generic idea for `'` in Julia. In general type piracy is very much frowned upon, as it can change expected behaviour. It is defined in `CalculusWithJulia`, as that package is intended only to act as a means to ease users into the wider package ecosystem of `Julia`.
""")
```

----

The dot product and matrix multiplication are related, and mathematically identified through the relation: $\vec{u} \cdot \vec{v} = u^t v$, where the right hand side identifies $\vec{u}$ and $\vec{v}$ with a $n\times 1$ column matrix, and $u^t$ is the transpose, or a $1\times n$ row matrix. However, mathematically the left side is a scalar, but the right side a $1\times 1$ matrix. While distinct, the two are identified as the same. This is similar to the useful identification of a point and a vector. Within `Julia`, these identifications are context dependent. `Julia` stores vectors as ``1``-dimensional arrays, transposes as matrix-like objects with a single row, and matrices as ``2``-dimensional arrays.
The product of a transpose and a vector is a scalar:

```julia; hold=true
u, v = [1,1,2], [3,5,8]
u' * v # a scalar
```

But if we make `u` a matrix (here by "`reshape`ing" into a matrix with $1$ row and $3$ columns), we will get a matrix (actually a vector) in return:

```julia; hold=true
u, v = [1,1,2], [3,5,8]
reshape(u,(1,3)) * v
```

## Cross product

In three dimensions, there is another operation between vectors that is similar to multiplication, though, as we will see, with many differences.

Let $\vec{u}$ and $\vec{v}$ be two ``3``-dimensional vectors, then the *cross* product, $\vec{u} \times \vec{v}$, is defined as a vector with length:

```math
\| \vec{u} \times \vec{v} \| = \| \vec{u} \| \| \vec{v} \| \sin(\theta),
```

with $\theta$ being the angle in $[0, \pi]$ between $\vec{u}$ and $\vec{v}$. Consequently, $\sin(\theta) \geq 0$.

The direction of the cross product is such that it is *orthogonal* to *both* $\vec{u}$ and $\vec{v}$. There are two such directions; to identify which is correct, the [right-hand rule](https://en.wikipedia.org/wiki/Cross_product#Definition) is used. This rule points the right hand fingers in the direction of $\vec{u}$ and curls them towards $\vec{v}$ (so that the angle between the two vectors is in $[0, \pi]$). The thumb then points in the desired direction. Call this direction $\hat{n}$, a normal unit vector. Then the cross product can be defined by:

```math
\vec{u} \times \vec{v} = \| \vec{u} \| \| \vec{v} \| \sin(\theta) \hat{n}.
```

```julia; echo=false
note("""
The right-hand rule is also useful to understand how standard household screws will behave when twisted with a screwdriver. If the right hand fingers curl in the direction of the twisting screwdriver, then the screw will go in or out following the direction pointed to by the thumb.
""")
```

The right-hand rule depends on the order of consideration of the vectors. If they are reversed, the opposite direction is determined.
A consequence is that the cross product is **anti**-commutative, unlike multiplication:

```math
\vec{u} \times \vec{v} = - \vec{v} \times \vec{u}.
```

Mathematically, the definition in terms of its components is a bit involved:

```math
\vec{u} \times \vec{v} = \langle u_2 v_3 - u_3 v_2, u_3 v_1 - u_1 v_3, u_1 v_2 - u_2 v_1 \rangle.
```

There is a matrix notation that can simplify this computation. If we *formally* define $\hat{i}$, $\hat{j}$, and $\hat{k}$ to represent unit vectors in the $x$, $y$, and $z$ direction, then a vector $\langle u_1, u_2, u_3 \rangle$ could be written $u_1\hat{i} + u_2\hat{j} + u_3\hat{k}$. With this, the cross product of $\vec{u}$ and $\vec{v}$ is the vector associated with the *determinant* of the matrix

```math
\left[
\begin{array}{}
\hat{i} & \hat{j} & \hat{k}\\
u_1 & u_2 & u_3\\
v_1 & v_2 & v_3
\end{array}
\right]
```

From the $\sin(\theta)$ term in the definition, we see that $\vec{u}\times\vec{u}=\vec{0}$. In fact, the cross product is $\vec{0}$ only if the two vectors involved are parallel or one is the zero vector.

In `Julia`, the `cross` function from the `LinearAlgebra` package implements the cross product. For example:

```julia;
𝓪 = [1, 2, 3]
𝓫 = [4, 2, 1]
cross(𝓪, 𝓫)
```

There is also the *infix* unicode operator `\times[tab]` that can be used for similarity to traditional mathematical syntax.

```julia;
𝓪 × 𝓫
```

We can see the cross product is anti-commutative by comparing the last answer with:

```julia;
𝓫 × 𝓪
```

Using vectors of a size different than $n=3$ produces a dimension mismatch error:

```julia;
[1, 2] × [3, 4]
```

(It can prove useful to pad ``2``-dimensional vectors into ``3``-dimensional vectors by adding a $0$ third component. We will see this in the discussion on curvature in the plane.)
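A quick sketch of this padding idea: appending a `0` third component to each ``2``-dimensional vector makes `cross` applicable, and only the third component of the answer carries non-trivial information. (The helper `pad` below is just for illustration.)

```julia; hold=true
using LinearAlgebra

pad(v) = [v..., 0]    # append a 0 third component (illustrative helper)
u, v = [1, 2], [3, 4]
pad(u) × pad(v)       # third component is 1*4 - 2*3 = -2; first two are 0
```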
Let's see that the matrix definition will be identical (after identifications) to `cross`:

```julia;
@syms î ĵ k̂
𝓜 = [î ĵ k̂; 3 4 5; 3 6 7]
det(𝓜) |> simplify
```

Compare with

```julia;
𝓜[2,:] × 𝓜[3,:]
```

----

Consider this extended picture involving two vectors $\vec{u}$ and $\vec{v}$ drawn in two dimensions:

```julia;
u₁ = [1, 2]
v₁ = [2, 1]
p₁ = [0,0]

plot(aspect_ratio=:equal)
arrow!(p₁, u₁)
arrow!(p₁, v₁)
arrow!(u₁, v₁)
arrow!(v₁, u₁)

puv₁ = (u₁ ⋅ v₁) / (v₁ ⋅ v₁) * v₁
porth₁ = u₁ - puv₁
arrow!(puv₁, porth₁)
```

The enclosed shape is a parallelogram. To this we added the projection of $\vec{u}$ onto $\vec{v}$ (`puv`) and then the *orthogonal* part (`porth`).

The *area* of a parallelogram is the length of one side times the perpendicular height. The perpendicular height could be found from `norm(porth)`, so the area is:

```julia;
norm(v₁) * norm(porth₁)
```

However, from trigonometry the height is also the norm of $\vec{u}$ times $\sin(\theta)$, a value that is given by the length of the cross product of $\vec{u}$ and $\hat{v}$, the unit vector, were these vectors viewed as ``3``-dimensional by adding a $0$ third component. In formulas, this is also the case:

```math
\text{area of the parallelogram} = \| \vec{u} \times \hat{v} \| \| \vec{v} \| = \| \vec{u} \times \vec{v} \|.
```

We have, for our figure, after extending `u` and `v` to be three dimensional, the area of the parallelogram:

```julia;
u₂ = [1, 2, 0]
v₂ = [2, 1, 0]
norm(u₂ × v₂)
```

----

This analysis can be extended to the case of ``3`` vectors, which - when not co-planar - will form a *parallelepiped*.
-
-```julia;
-u₃, v₃, w₃ = [1,2,3], [2,1,0], [1,1,2]
-p₃ = [0,0,0]
-
-plot(legend=false)
-arrow!(p₃, u₃); arrow!(p₃, v₃); arrow!(p₃, w₃)
-arrow!(u₃, v₃); arrow!(u₃, w₃)
-arrow!(v₃, u₃); arrow!(v₃, w₃)
-arrow!(w₃, u₃); arrow!(w₃, v₃)
-arrow!(u₃ + v₃, w₃); arrow!(u₃ + w₃, v₃); arrow!(v₃ + w₃, u₃)
-```
-
-The volume of a parallelepiped is the area of a base parallelogram times the perpendicular height. If $\vec{u}$ and $\vec{v}$ form the base parallelogram, then the perpendicular will have height $\|\vec{w}\| \cos(\theta)$, where the angle is the one made by $\vec{w}$ with the unit normal, $\hat{n}$. Since $\vec{u} \times \vec{v} = \| \vec{u} \times \vec{v}\| \hat{n}$, that is, $\hat{n}$ times the area of the base parallelogram, dotting this answer with $\vec{w}$ gives:
-
-```math
-(\vec{u} \times \vec{v}) \cdot \vec{w} =
-\|\vec{u} \times \vec{v}\| (\hat{n} \cdot \vec{w}) =
-\|\vec{u} \times \vec{v}\| \| \vec{w}\| \cos(\theta),
-```
-
-that is, the *volume* of the parallelepiped. Wait, what about $(\vec{v}\times\vec{u})\cdot\vec{w}$? That will have an opposite sign. Yes, in the above, there is an assumption that $\hat{n}$ and $\vec{w}$ make an angle within $[0, \pi/2]$; otherwise an absolute value must be used, as volume is non-negative.
-
-
-!!! note "Orientation"
-    The triple-scalar product, $\vec{u}\cdot(\vec{v}\times\vec{w})$, gives the volume of the parallelepiped up to sign. If the sign of this is positive, the ``3`` vectors are said to have a *positive* orientation; if the triple-scalar product is negative, the vectors have a *negative* orientation.
-
-
-#### Algebraic properties
-
-The cross product has many properties, some different from regular multiplication:
-
-* scalar multiplication: $(c\vec{u})\times\vec{v} = c(\vec{u}\times\vec{v})$
-
-* distributive over addition: $\vec{u} \times (\vec{v} + \vec{w}) = \vec{u}\times\vec{v} + \vec{u}\times\vec{w}$.
- -* *anti*-commutative: $\vec{u} \times \vec{v} = - \vec{v} \times \vec{u}$ - -* *not* associative: that is there is no guarantee that $(\vec{u}\times\vec{v})\times\vec{w}$ will be equivalent to $\vec{u}\times(\vec{v}\times\vec{w})$. - -* The triple cross product $(\vec{u}\times\vec{v}) \times \vec{w}$ must be orthogonal to $\vec{u}\times\vec{v}$ so lies in a plane with this as a normal vector. But, $\vec{u}$ and $\vec{v}$ will generate this plane, so it should be possible to express this triple product in terms of a sum involving $\vec{u}$ and $\vec{v}$ and indeed: - -```math -(\vec{u}\times\vec{v})\times\vec{w} = (\vec{u}\cdot\vec{w})\vec{v} - (\vec{v}\cdot\vec{w})\vec{u}. -``` - ----- - -The following shows the algebraic properties stated above hold for -symbolic vectors. First the linearity of the dot product: - -```julia; -@syms s₄ t₄ u₄[1:3]::real v₄[1:3]::real w₄[1:3]::real - -u₄ ⋅ (s₄ * v₄ + t₄ * w₄) - (s₄ * (u₄ ⋅ v₄) + t₄ * (u₄ ⋅ w₄)) |> simplify -``` - -This shows the dot product is commutative: - -```julia; -(u₄ ⋅ v₄) - (v₄ ⋅ u₄) |> simplify -``` - -This shows the linearity of the cross product over scalar multiplication and vector addition: - -```julia; -u₄ × (s₄* v₄ + t₄ * w₄) - (s₄ * (u₄ × v₄) + t₄ * (u₄ × w₄)) .|> simplify -``` - -(We use `.|>` to broadcast `simplify` over each component.) 
-
-The cross product is anti-commutative:
-
-```julia;
-u₄ × v₄ + v₄ × u₄ .|> simplify
-```
-
-but not associative:
-
-```julia;
-u₄ × (v₄ × w₄) - (u₄ × v₄) × w₄ .|> simplify
-```
-
-Finally we verify the decomposition of the triple cross product:
-
-```julia;
-(u₄ × v₄) × w₄ - ( (u₄ ⋅ w₄) * v₄ - (v₄ ⋅ w₄) * u₄) .|> simplify
-```
-
-
-----
-
-This table shows common usages of the symbols for various multiplication types: `*`, $\cdot$, and $\times$:
-
-
-| Symbol | inputs | output | type |
-|:--------:|:-------------- |:----------- |:------ |
-| `*` | scalar, scalar | scalar | regular multiplication |
-| `*` | scalar, vector | vector | scalar multiplication |
-| `*` | vector, vector | *undefined* | |
-| $\cdot$ | scalar, scalar | scalar | regular multiplication |
-| $\cdot$ | scalar, vector | vector | scalar multiplication |
-| $\cdot$ | vector, vector | scalar | dot product |
-| $\times$ | scalar, scalar | scalar | regular multiplication |
-| $\times$ | scalar, vector | *undefined* | |
-| $\times$ | vector, vector | vector | cross product (``3``D)|
-
-
-##### Example: lines and planes
-
-A line in two dimensions satisfies the equation $ax + by = c$. Suppose $a$ and $b$ are non-zero. This can be represented in vector form, as the collection of all points associated with the vectors $p + t \vec{v}$, where $p$ is a point on the line, say $(0,c/b)$, and $\vec{v}$ is the direction vector $\langle -b, a \rangle$. We can verify this for values of `t` as follows:
-
-```julia; hold=true
-@syms a b c x y t
-
-eq = c - (a*x + b*y)
-
-p = [0, c/b]
-v = [-b, a]
-li = p + t * v
-
-eq(x=>li[1], y=>li[2]) |> simplify
-```
-
-
-Let $\vec{n} = \langle a, b \rangle$, taken from the coefficients in the equation. We can see directly that $\vec{n}$ is orthogonal to $\vec{v}$. The line may then be seen as the collection of all points reached from $p$ by vectors orthogonal to $\vec{n}$.
-
-In three dimensions, the equation of a plane is $ax + by + cz = d$.
Suppose $a$, $b$, and $c$ are non-zero, for simplicity. Setting $\vec{n} = \langle a,b,c\rangle$ by comparison, it can be seen that the plane is identified with the set of all vectors orthogonal to $\vec{n}$ that are anchored at a point $p$ on the plane.
-
-First, let $p = (0, 0, d/c)$ be a point on the plane. We take two non-parallel vectors, $\vec{u} = \langle -b, a, 0 \rangle$ and $\vec{v} = \langle 0, c, -b \rangle$, each orthogonal to $\vec{n}$. Then any point on the plane may be identified with the vector $p + t\vec{u} + s\vec{v}$. We can verify this algebraically through:
-
-```julia; hold=true
-@syms a b c d x y z s t
-
-eq = d - (a*x + b*y + c * z)
-
-p = [0, 0, d/c]
-u, v = [-b, a, 0], [0, c, -b]
-pl = p + t * u + s * v
-
-subs(eq, x=>pl[1], y=>pl[2], z=>pl[3]) |> simplify
-```
-
-The above viewpoint can be reversed:
-
-> a plane is determined by two (non-parallel) vectors and a point.
-
-The parameterized version of the plane would be $p + t \vec{u} + s \vec{v}$, as used above.
-
-The equation of the plane can be given from $\vec{u}$ and $\vec{v}$. Let $\vec{n} = \vec{u} \times \vec{v}$. Then $\vec{n} \cdot \vec{u} = \vec{n} \cdot \vec{v} = 0$, from the properties of the cross product. As such, $\vec{n} \cdot (s\vec{u} + t \vec{v}) = 0$. That is, the cross product is orthogonal to any *linear* combination of the two vectors.
This figure shows one such linear combination:
-
-
-```julia; hold=true
-u = [1,2,3]
-v = [2,3,1]
-n = u × v
-p = [0,0,1]
-
-plot(legend=false)
-
-arrow!(p, u)
-arrow!(p, v)
-arrow!(p + u, v)
-arrow!(p + v, u)
-arrow!(p, n)
-
-s, t = 1/2, 1/4
-arrow!(p, s*u + t*v)
-```
-
-
-So if $\vec{n} \cdot p = d$ (identifying the point $p$ with a vector so the dot product is defined), we will have for any point on the plane, with position vector $\langle x, y, z \rangle = p + s \vec{u} + t \vec{v}$, that
-
-```math
-\vec{n} \cdot (p + s\vec{u} + t \vec{v}) = \vec{n} \cdot p + \vec{n} \cdot (s \vec{u} + t \vec{v}) = d + 0 = d.
-```
-
-If $\vec{n} = \langle a, b, c \rangle$, then this says $d = ax + by + cz$, so from $\vec{n}$ and $p$ the equation of the plane is given.
-
-In summary:
-
-| Object | Equation | vector equation |
-|:------- |:------------------:|:-------------------------------- |
-|Line | $ax + by = c$ | line: $p + t\vec{u}$ |
-|Plane | $ax + by + cz = d$ | plane: $p + s\vec{u} + t\vec{v}$ |
-
-----
-
-
-##### Example
-
-You are given that the vectors $\vec{u} =\langle 6, 3, 1 \rangle$ and $\vec{v} = \langle 3, 2, 1 \rangle$ describe a plane through the point $p=[1,1,2]$. Find the equation of the plane.
-
-The key is to find the normal vector to the plane, $\vec{n} = \vec{u} \times \vec{v}$:
-
-```julia; hold=true
-u, v, p = [6,3,1], [3,2,1], [1,1,2]
-n = u × v
-a, b, c = n
-d = n ⋅ p
-"equation of plane: $a x + $b y + $c z = $d"
-```
-
-
-
-## Questions
-
-###### Question
-
-Let `u=[1,2,3]`, `v=[4,3,2]`, and `w=[5,2,1]`.
-
-Find `u ⋅ v`:
-
-```julia; hold=true; echo=false
-u,v,w = [1,2,3], [4,3,2], [5,2,1]
-numericq(dot(u,v))
-```
-
-Are `v` and `w` orthogonal?
-
-```julia; hold=true; echo=false
-u,v,w = [1,2,3], [4,3,2], [5,2,1]
-answ = dot(v,w) == 0
-yesnoq(answ)
-```
-
-Find the angle between `u` and `w`:
-
-```julia; hold=true; echo=false
-u,v,w = [1,2,3], [4,3,2], [5,2,1]
-ctheta = (u ⋅ w)/norm(u)/norm(w)
-val = acos(ctheta)
-numericq(val)
-```
-
-
-Find `u × v`:
-
-```julia; hold=true; echo=false
-choices = [
-"`[-5, 10, -5]`",
-"`[-1, 6, -7]`",
-"`[-4, 14, -8]`"
-]
-answ = 1
-radioq(choices, answ)
-```
-
-Find the area of the parallelogram formed by `v` and `w`:
-
-```julia; hold=true; echo=false
-u,v,w = [1,2,3], [4,3,2], [5,2,1]
-val = norm(v × w)
-numericq(val)
-```
-
-
-
-Find the volume of the parallelepiped formed by `u`, `v`, and `w`:
-
-```julia; hold=true; echo=false
-u,v,w = [1,2,3], [4,3,2], [5,2,1]
-val = abs((u × v) ⋅ w)
-numericq(val)
-```
-
-
-###### Question
-
-The dot product of two vectors may be described in words: pair off the corresponding values, multiply them, then add. In `Julia` the `zip` command will pair off two iterable objects, like vectors, so it seems like the command `sum(prod.(zip(u,v)))` will find a dot product. Investigate whether it does or doesn't by testing the following command and comparing it to the dot product:
-
-```julia; hold=true; eval=false
-u,v = [1,2,3], [5,4,2]
-sum(prod.(zip(u,v)))
-```
-
-Does this return the same answer?
-
-```julia; hold=true; echo=false
-yesnoq(true)
-```
-
-
-What does the command `zip(u,v)` return?
-
-```julia; hold=true; echo=false
-choices = [
-"An object of type `Base.Iterators.Zip` that is only realized when used",
-"A vector of values `[(1, 5), (2, 4), (3, 2)]`"
-]
-answ = 1
-radioq(choices, answ)
-```
-
-What does `prod.(zip(u,v))` return?
-
-```julia; hold=true; echo=false
-choices = [
-"A vector of values `[5, 8, 6]`",
-"An object of type `Base.Iterators.Zip` that when realized will produce a vector of values"
-]
-answ = 1
-radioq(choices, answ)
-```
-
-###### Question
-
-Let $\vec{u}$ and $\vec{v}$ be 3-dimensional **unit** vectors.
What is the value of - -```math -(\vec{u} \times \vec{v}) \cdot (\vec{u} \times \vec{v}) + (\vec{u} \cdot \vec{v})^2? -``` - -```julia; hold=true; echo=false -choices = [ -"``1``", -"``0``", -"Can't say in general" -] -answ = 1 -radioq(choices, answ) -``` - -###### Question - -Consider the projection of $\langle 1, 2, 3\rangle$ on $\langle 3, 2, 1\rangle$. What is its length? - -```julia; hold=true; echo=false -u,v = [1,2,3], [3,2,1] -val = (u ⋅ v)/norm(v) -numericq(val) -``` - -###### Question - -Let $\vec{u} = \langle 1, 2, 3 \rangle$ and $\vec{v} = \langle 3, 2, 1 \rangle$. Describe the plane created by these two non-parallel vectors going through the origin. - -```julia; hold=true; echo=false -choices = [ -"``-4x + 8y - 4z = 0``", -"``x + 2y + z = 0``", -"``x + 2y + 3z = 6``" -] -answ = 1 -radioq(choices, answ) -``` - - -###### Question - -A plane $P_1$ is *orthogonal* to $\vec{n}_1$, a plane $P_2$ is *orthogonal* to $\vec{n}_2$. Explain why vector $\vec{v} = \vec{n}_1 \times \vec{n}_2$ is parallel to the *intersection* of $P_1$ and $P_2$. - -```julia; hold=true; echo=false -choices = [ -" ``\\vec{v}`` is in plane ``P_1``, as it is orthogonal to ``\\vec{n}_1`` and ``P_2`` as it is orthogonal to ``\\vec{n}_2``, hence it is parallel to both planes.", -" ``\\vec{n}_1`` and ``\\vec{n_2}`` are unit vectors, so the cross product gives the projection, which must be orthogonal to each vector, hence in the intersection" -] -answ = 1 -radioq(choices, answ) -``` - -###### Question - -(From Strang). For an (analog) clock draw vectors from the center out to each of the 12 hours marked on the clock. What is the vector sum of these 12 vectors? - -```julia; hold=true; echo=false -choices = [ -"``\\vec{0}``", -"``\\langle 12, 12 \\rangle``", -"``12 \\langle 1, 0 \\rangle``" -] -answ = 1 -radioq(choices, answ) -``` - -If the vector to 3 o'clock is removed, (call this $\langle 1, 0 \rangle$) what expresses the sum of *all* the remaining vectors? 
- -```julia; hold=true; echo=false -choices = [ -"``\\langle -1, 0 \\rangle``", -"``\\langle 1, 0 \\rangle``", -"``\\langle 11, 11 \\rangle``" -] -answ = 1 -radioq(choices, answ) -``` - -###### Question - -Let $\vec{u}$ and $\vec{v}$ be unit vectors. Let $\vec{w} = \vec{u} + \vec{v}$. Then $\vec{u} \cdot \vec{w} = \vec{v} \cdot \vec{w}$. What is the value? - -```julia; hold=true; echo=false -choices = [ -"``1 + \\vec{u}\\cdot\\vec{v}``", -"``\\vec{u} + \\vec{v}``", -"``\\vec{u}\\cdot\\vec{v} + \\vec{v}\\cdot \\vec{v}``" -] -answ = 1 -radioq(choices, answ) -``` - -As the two are equal, which interpretation is true? - -```julia; hold=true; echo=false -choices = [ -"The angle they make with ``\\vec{w}`` is the same", -"The vector ``\\vec{w}`` must also be a unit vector", -"the two are orthogonal" -] -answ=1 -radioq(choices, answ) -``` - - - -###### Question - -Suppose $\| \vec{u} + \vec{v} \|^2 = \|\vec{u}\|^2 + \|\vec{v}\|^2$. What is $\vec{u}\cdot\vec{v}$? - -We have $(\vec{u} + \vec{v})\cdot(\vec{u} + \vec{v}) = \vec{u}\cdot \vec{u} + 2 \vec{u}\cdot\vec{v} + \vec{v}\cdot\vec{v}$. 
From this, we can infer that: - -```julia; hold=true; echo=false -choices = [ -"``\\vec{u}\\cdot\\vec{v} = 0``", -"``\\vec{u}\\cdot\\vec{v} = 2``", -"``\\vec{u}\\cdot\\vec{v} = -(\\vec{u}\\cdot\\vec{u} \\vec{v}\\cdot\\vec{v})``" -] -answ = 1 -radioq(choices, answ) -``` - - -###### Question - - -Give a geometric reason for this identity: - -```math -\vec{u} \cdot (\vec{v} \times \vec{w}) = -\vec{v} \cdot (\vec{w} \times \vec{u}) = -\vec{w} \cdot (\vec{u} \times \vec{v}) -``` - - -```julia; hold=true; echo=false -choices = [ -"The triple product describes a volume up to sign, this combination preserves the sign", -"The vectors are *orthogonal*, so these are all zero", -"The vectors are all unit lengths, so these are all 1" -] -answ = 1 -radioq(choices, answ) -``` - -###### Question - -Snell's law in planar form is $n_1\sin(\theta_1) = n_2\sin(\theta_2)$ where $n_i$ is a constant depending on the medium. - -```julia; hold=true; echo=false -f(t) = sin(t) -p = plot(ylim=(.2,1.5), xticks=nothing, yticks=nothing, border=:none, legend=false, aspect_ratio=:equal) -plot!(f, pi/6, pi/2, linewidth=4, color=:blue) -t0 = pi/3 -p0 = [t0, f(t0)] -Normal = [f'(t0), -t0] -arrow!(p0, .5 * Normal, linewidth=4, color=:red ) -incident = (Normal + [.5, 0]) * .5 -arrow!(p0 - incident, incident, linewidth=4, color=:black) -out = -incident + [.1,0] -arrow!(p0, -out, linewidth=4, color=:black) -annotate!([(0.8, 1.0, L"\hat{v}_1"), - (.6, .75, L"n_1"), - (1.075, 0.7, L"\hat{N}"), - (1.25, 0.7, L"\hat{v}_2"), - (1.5, .75, L"n_2") - ]) - -p -``` - -In vector form, we can express it using *unit* vectors through: - -```julia; hold=true; echo=false -choices = [ -"``n_1 (\\hat{v_1}\\times\\hat{N}) = n_2 (\\hat{v_2}\\times\\hat{N})``", -"``n_1 (\\hat{v_1}\\times\\hat{N}) = -n_2 (\\hat{v_2}\\times\\hat{N})``" -] -answ = 1 -radioq(choices, answ) -``` - -###### Question - -The Jacobi relationship show that for *any* $3$ randomly chosen vectors: - -```math -\vec{a}\times(\vec{b}\times\vec{c})+ 
-\vec{b}\times(\vec{c}\times\vec{a})+ -\vec{c}\times(\vec{a}\times\vec{b}) -``` - -simplifies. To what? (Use `SymPy` or randomly generated vectors to see.) - - -```julia; hold=true; echo=false -choices = [ -"``\\vec{0}``", -"``\\vec{a}``", -"``\\vec{a} + \\vec{b} + \\vec{c}``" -] -answ = 1 -radioq(choices, answ) -``` diff --git a/CwJ/integral_vector_calculus/Project.toml b/CwJ/integral_vector_calculus/Project.toml deleted file mode 100644 index 98e581a..0000000 --- a/CwJ/integral_vector_calculus/Project.toml +++ /dev/null @@ -1,8 +0,0 @@ -[deps] -ForwardDiff = "f6369f11-7733-5829-9624-2563aa707210" -HCubature = "19dc6840-f33b-545b-b366-655c7e3ffd49" -LaTeXStrings = "b964fa9f-0449-5b57-a5c2-d3ea65f4040f" -Plots = "91a5bcdd-55d7-5caf-9e0b-520d859cae80" -QuadGK = "1fd47b50-473d-5c70-9696-f719f8f3bcdc" -Roots = "f2b01f46-fcfa-551c-844a-d8ac1e96c665" -SymPy = "24249f21-da20-56a4-8eb1-6a02cf4ae2e6" diff --git a/CwJ/integral_vector_calculus/cache/div_grad_curl.cache b/CwJ/integral_vector_calculus/cache/div_grad_curl.cache deleted file mode 100644 index eac6660..0000000 Binary files a/CwJ/integral_vector_calculus/cache/div_grad_curl.cache and /dev/null differ diff --git a/CwJ/integral_vector_calculus/cache/double_triple_integrals.cache b/CwJ/integral_vector_calculus/cache/double_triple_integrals.cache deleted file mode 100644 index cdaf9ac..0000000 Binary files a/CwJ/integral_vector_calculus/cache/double_triple_integrals.cache and /dev/null differ diff --git a/CwJ/integral_vector_calculus/cache/line_integrals.cache b/CwJ/integral_vector_calculus/cache/line_integrals.cache deleted file mode 100644 index 040ee6d..0000000 Binary files a/CwJ/integral_vector_calculus/cache/line_integrals.cache and /dev/null differ diff --git a/CwJ/integral_vector_calculus/cache/stokes_theorem.cache b/CwJ/integral_vector_calculus/cache/stokes_theorem.cache deleted file mode 100644 index 60f7865..0000000 Binary files a/CwJ/integral_vector_calculus/cache/stokes_theorem.cache and /dev/null differ 
diff --git a/CwJ/integral_vector_calculus/div_grad_curl.jmd b/CwJ/integral_vector_calculus/div_grad_curl.jmd deleted file mode 100644 index 587b38d..0000000 --- a/CwJ/integral_vector_calculus/div_grad_curl.jmd +++ /dev/null @@ -1,1470 +0,0 @@ -# The Gradient, Divergence, and Curl - -This section uses these add-on packages: - - -```julia -using CalculusWithJulia -using Plots -using SymPy -``` - -```julia; echo=false; results="hidden" -using CalculusWithJulia.WeaveSupport -const frontmatter = ( - title = "The Gradient, Divergence, and Curl", - description = "Calculus with Julia: The Gradient, Divergence, and Curl", - tags = ["CalculusWithJulia", "integral_vector_calculus", "the gradient, divergence, and curl"], -); -nothing - -## used in other blocks -_bar(x) = sum(x)/length(x) -_shrink(x, xbar, offset) = xbar .+ (1-offset/100)*(x .- xbar) -function _poly!(plt::Plots.Plot, ps; offset=5, kwargs...) - push!(ps, first(ps)) - xs, ys = unzip(ps) - xbar, ybar = _bar.((xs, ys)) - xs, ys = _shrink.(xs, xbar, offset), _shrink.(ys, ybar, offset) - - plot!(plt, xs, ys; kwargs...) -# xn = [xs[end],ys[end]] -# x0 = [xs[1], ys[1]] -# dxn = 0.95*(x0 - xn) - -# plot!(plt, xn, dxn; kwargs...) - - plt -end -_poly!(ps;offset=5,kwargs...) = _poly!(Plots.current(), ps; offset=offset, kwargs...) - -function apoly!(plt::Plots.Plot, ps; offset=5, kwargs...) - xs, ys = unzip(ps) - xbar, ybar = _bar.((xs, ys)) - xs, ys = _shrink.(xs, xbar, offset), _shrink.(ys, ybar, offset) - - plot!(plt, xs, ys; kwargs...) - xn = [xs[end],ys[end]] - x0 = [xs[1], ys[1]] - dxn = 0.95*(x0 - xn) - - arrow!(plt, xn, dxn; kwargs...) - - plt -end -apoly!(ps;offset=5,kwargs...) = apoly!(Plots.current(), ps; offset=offset, kwargs...) - -nothing -``` - - ----- - -The gradient of a scalar function $f:R^n \rightarrow R$ is a vector field of partial derivatives. In $R^2$, we have: - -```math -\nabla{f} = \langle \frac{\partial{f}}{\partial{x}}, -\frac{\partial{f}}{\partial{y}} \rangle. 
-``` - -It has the interpretation of pointing out the direction of greatest ascent for the surface $z=f(x,y)$. - -We move now to two other operations, the divergence and the curl, which combine to give a language to describe vector fields in $R^3$. - - -## The divergence - -Let $F:R^3 \rightarrow R^3 = \langle F_x, F_y, F_z\rangle$ be a vector field. Consider now a small box-like region, $R$, with surface, $S$, on the cartesian grid, with sides of length $\Delta x$, $\Delta y$, and $\Delta z$ with $(x,y,z)$ being one corner. The outward pointing unit normals are $\pm \hat{i}, \pm\hat{j},$ and $\pm\hat{k}$. - - - -```julia; hold=true; echo=false -dx = .5 -dy = .250 -offset=5 -p = plot(;ylim=(0-.2, 1+dy+.4), legend=false, aspect_ratio=:equal,xticks=nothing,yticks=nothing, border=:none) -plot!(p, [dx,dx],[dy,1+dy-offset/100], linestyle=:dash) -plot!(p, [0+offset/100,dx],[0+offset/100,dy], linestyle=:dash) -plot!(p, [dx,1+dx-2offset/100],[dy,dy], linestyle=:dash) - -ps = [[1,1], [0,1],[0,0],[1,0]] -_poly!(ps, linewidth=3, color=:blue) - -ps = [[1,1], [1+dx, 1+dy], [dx, 1+dy],[0,1]] -_poly!(ps, linewidth=3, color=:red) - -ps = [[1,0],[1+dx, dy],[1+dx, 1+dy],[1,1]] -_poly!(ps, linewidth=3, color=:green) -arrow!([.55,.6],.3*[-1,-1/2], color=:blue) -arrow!([1+.6dx, .6], .3*[1,0], color=:blue) -arrow!([.75, 1+.5*dy], .3*[0,1], color=:blue) -annotate!([ - (.5, -.1, "Δy"), - (1+.75dx, .1, "Δx"), - (1+dx+.1, .75, "Δz"), - (.5,.15,L"(x,y,z)"), - (.45,.6, "î"), - (1+.8dx, .7, "ĵ"), - (.8, 1+dy+.1, "k̂") - ]) -p -``` - - -Consider the sides with outward normal $\hat{i}$. The contribution to the surface integral, $\oint_S (F\cdot\hat{N})dS$, could be *approximated* by - -```math -\left(F(x + \Delta x, y, z) \cdot \hat{i}\right) \Delta y \Delta z, -``` - -whereas, the contribution for the face with outward normal $-\hat{i}$ could be approximated by: - -```math -\left(F(x, y, z) \cdot (-\hat{i}) \right) \Delta y \Delta z. 
-``` - -The functions are being evaluated at a point on the face of the -surface. For Riemann integrable functions, any point in a partition -may be chosen, so our choice will not restrict the generality. - -The total contribution of the two would be: - -```math -\left(F(x + \Delta x, y, z) \cdot \hat{i}\right) \Delta y \Delta z + -\left(F(x, y, z) \cdot (-\hat{i})\right) \Delta y \Delta z = -\left(F_x(x + \Delta x, y, z) - F_x(x, y, z)\right) \Delta y \Delta z, -``` - -as $F \cdot \hat{i} = F_x$. - -*Were* we to divide by $\Delta V = \Delta x \Delta y \Delta z$ *and* take a limit as the volume shrinks, the limit would be $\partial{F}/\partial{x}$. - -If this is repeated for the other two pair of matching faces, we get a definition for the *divergence*: - -> The *divergence* of a vector field $F:R^3 \rightarrow R^3$ is given by -> ```math -> \text{divergence}(F) = -> \lim \frac{1}{\Delta V} \oint_S F\cdot\hat{N} dS = -> \frac{\partial{F_x}}{\partial{x}} +\frac{\partial{F_y}}{\partial{y}} +\frac{\partial{F_z}}{\partial{z}}. -> ``` - -The limit expression for the divergence will hold for any smooth closed surface, $S$, converging on $(x,y,z)$, not just box-like ones. - - -### General $n$ - -The derivation of the divergence is done for $n=3$, but could also have easily been done for two dimensions ($n=2$) or higher dimensions $n>3$. The formula in general would be: for $F(x_1, x_2, \dots, x_n): R^n \rightarrow R^n$: - -```math -\text{divergence}(F) = \sum_{i=1}^n \frac{\partial{F_i}}{\partial{x_i}}. -``` - - ----- - -In `Julia`, the divergence can be implemented different ways depending on how the problem is presented. 
Here are two functions from the `CalculusWithJulia` package for when the problem is symbolic or numeric:
-
-```julia; eval=false
-divergence(F::Vector{Sym}, vars) = sum(diff.(F, vars))
-divergence(F::Function, pt) = sum(diag(ForwardDiff.jacobian(F, pt)))
-```
-
-The latter is a bit inefficient, as all $n^2$ partial derivatives are found, but only the $n$ diagonal ones are used.
-
-
-## The curl
-
-Before considering the curl for $n=3$, we derive a related quantity in $n=2$. The "curl" will be a measure of the microscopic circulation of a vector field. To that end we consider a microscopic box-region in $R^2$:
-
-```julia; hold=true; echo=false
-p = plot(legend=false, xticks=nothing, yticks=nothing, border=:none, xlim=(-1/4, 1+1/4),ylim=(-1/4, 1+1/4))
-apoly!([[0,0],[1,0], [1,1], [0, 1]], linewidth=3, color=:blue)
-
-dx = .025
-arrow!([1/2, dx], .01 *[1,0], linewidth=3, color=:blue)
-arrow!([1/2, 1-dx], .01 *[-1,0], linewidth=3, color=:blue)
-arrow!([1-dx, 1/2], .01 *[0, 1], linewidth=3, color=:blue)
-
-annotate!([
-    (0,-1/16,L"(x,y)"),
-    (1, -1/16, L"(x+\Delta{x},y)"),
-    (0, 1+1/16, L"(x,y+\Delta{y})"),
-    (1/2, 4dx, L"\hat{i}"),
-    (1/2, 1-4dx, L"-\hat{i}"),
-    (3dx, 1/2, L"-\hat{j}"),
-    (1-3dx, 1/2, L"\hat{j}")
-    ])
-```
-
-Let $F=\langle F_x, F_y\rangle$. For small enough values of $\Delta{x}$ and $\Delta{y}$, the line integral $\oint_C F\cdot d\vec{r}$ can be *approximated* by $4$ terms:
-
-```math
-\begin{align}
-\left(F(x,y) \cdot \hat{i}\right)\Delta{x} &+
-\left(F(x+\Delta{x},y) \cdot \hat{j}\right)\Delta{y} +
-\left(F(x,y+\Delta{y}) \cdot (-\hat{i})\right)\Delta{x} +
-\left(F(x,y) \cdot (-\hat{j})\right)\Delta{y}\\
-&=
-F_x(x,y) \Delta{x} + F_y(x+\Delta{x},y)\Delta{y} +
-F_x(x, y+\Delta{y}) (-\Delta{x}) + F_y(x,y) (-\Delta{y})\\
-&=
-(F_y(x + \Delta{x}, y) - F_y(x, y))\Delta{y} -
-(F_x(x, y+\Delta{y})-F_x(x,y))\Delta{x}.
-\end{align} -``` - -The Riemann approximation allows a choice of evaluation point for Riemann integrable functions, and the choice here lends itself to further analysis. -Were the above divided by $\Delta{x}\Delta{y}$, the area of the box, and a limit taken, partial derivatives appear to suggest this formula: - -```math -\lim \frac{1}{\Delta{x}\Delta{y}} \oint_C F\cdot d\vec{r} = -\frac{\partial{F_y}}{\partial{x}} - \frac{\partial{F_x}}{\partial{y}}. -``` - - -The scalar function on the right hand side is called the (two-dimensional) curl of $F$ and the left-hand side lends itself as a measure of the microscopic circulation of the vector field, $F:R^2 \rightarrow R^2$. - ----- - -Consider now a similar scenario for the $n=3$ case. Let $F=\langle F_x, F_y,F_z\rangle$ be a vector field and $S$ a box-like region with side lengths $\Delta x$, $\Delta y$, and $\Delta z$, anchored at $(x,y,z)$. - - -```julia; hold=true; echo=false -dx = .5 -dy = .250 -offset=5 -p = plot(;ylim=(0-.2, 1+dy+.4), legend=false, aspect_ratio=:equal,xticks=nothing,yticks=nothing, border=:none) -plot!(p, [dx,dx],[dy,1+dy-offset/100], linestyle=:dash) -plot!(p, [0+offset/100,dx],[0+offset/100,dy], linestyle=:dash) -plot!(p, [dx,1+dx-2offset/100],[dy,dy], linestyle=:dash) - -ps = [[1,1], [0,1],[0,0],[1,0]] -apoly!(ps, linewidth=3, color=:blue) - -ps = [[1,1], [1+dx, 1+dy], [dx, 1+dy],[0,1]] -apoly!(ps, linewidth=3, color=:red) - -ps = [[1,0],[1+dx, dy],[1+dx, 1+dy],[1,1]] -apoly!(ps, linewidth=3, color=:green) -arrow!([.55,.6],.3*[-1,-1/2], color=:blue) -arrow!([1+.6dx, .6], .3*[1,0], color=:blue) -arrow!([.75, 1+.5*dy], .3*[0,1], color=:blue) -annotate!([ - (.5, -.1, "Δy"), - (1+.75dx, .1, "Δx"), - (1+dx+.1, .75, "Δz"), - (.5,.15,L"(x,y,z)"), - (.45,.6, "î"), - (1+.8dx, .667, "ĵ"), - (.8, 1+dy+.067, "k̂"), - (.9, 1.1, "S₁") - ]) -p -``` - - -The box-like volume in space with the top area, with normal $\hat{k}$, designated as $S_1$. 
The curve $C_1$ traces around $S_1$ in a counter clockwise manner, consistent with the right-hand rule pointing in the outward normal direction.
-The face $S_1$ with unit normal $\hat{k}$ looks like:
-
-```julia; hold=true; echo=false
-p = plot(xlim=(-.1,1.25), ylim=(-.2, 1.25),legend=false, xticks=nothing, yticks=nothing, border=:none)
-ps = [[1,0],[1,1],[0,1],[0,0]]
-#push!(ps, first(ps))
-apoly!(p, ps, linewidth=3, color=:red)
-#plot!(p, unzip(ps), linewidth=3, color=:red)
-dx = .025
-arrow!([1/2,dx], .01*[1,0], color=:red, linewidth=3)
-arrow!([1/2,1-dx], .01*[-1,0], color=:red, linewidth=3)
-arrow!([1-dx,1/2], .01*[0,1], color=:red, linewidth=3)
-arrow!([dx,1/2], .01*[0,-1], color=:red, linewidth=3)
-dx = .05
-annotate!([
-    (0, 1/2, "A"),
-    (1/2,2dx, "B"),
-    (1-(3/2)dx,1/2, "C"),
-    (1/2,1-2dx, "D"),
-
-    (.9, 1+dx, "C₁"),
-
-    (2*dx, 1/2, L"\hat{T}=\hat{i}"),
-    (1+2*dx,1/2, L"\hat{T}=-\hat{i}"),
-    (1/2,-3/2*dx, L"\hat{T}=\hat{j}"),
-    (1/2, 1+(3/2)*dx, L"\hat{T}=-\hat{j}"),
-
-    (3dx,1-2dx, "(x,y,z+Δz)"),
-    (4dx,2dx, "(x+Δx,y,z+Δz)"),
-    (1-4dx, 1-2dx, "(x,y+Δy,z+Δz)"),
-    (1-2dx, 2dx, "S₁")
-    ])
-
-p
-```
-
-
-Now we compute the *line integral*. Consider the top face, $S_1$, connecting $(x,y,z+\Delta z)$, $(x + \Delta x, y, z + \Delta z)$, $(x + \Delta x, y + \Delta y, z + \Delta z)$, and $(x, y + \Delta y, z + \Delta z)$. Parameterize the boundary curve, $C_1$, in a counter clockwise direction, so that the right hand rule yields the outward pointing normal ($\hat{k}$). Then the integral $\oint_{C_1} F\cdot \hat{T} ds$ is *approximated* by the following Riemann sum of $4$ terms:
-
-
-```math
-\begin{align*}
-F(x,y, z+\Delta{z}) \cdot \hat{i}\Delta{x} &+ F(x+\Delta x, y, z+\Delta{z}) \cdot \hat{j} \Delta y \\
-&+ F(x, y+\Delta y, z+\Delta{z}) \cdot (-\hat{i}) \Delta{x} \\
-&+ F(x, y, z+\Delta{z}) \cdot (-\hat{j}) \Delta{y}.
-\end{align*}
-```
-
-(The evaluation points are chosen from the endpoints of the line segments; for Riemann integrable functions any choice within the segments would do.)
-
-Rearranging terms, this becomes:
-
-```math
-\begin{align*}
-\oint_{C_1} F\cdot \hat{T} ds
-&\approx (F_y(x+\Delta x, y, z+\Delta{z}) \\
-&- F_y(x, y, z+\Delta{z})) \Delta{y} \\
-&- (F_x(x,y + \Delta{y}, z+\Delta{z}) \\
-&- F_x(x, y, z+\Delta{z})) \Delta{x}.
-\end{align*}
-```
-
-As before, were this divided by the *area* of the surface, we have after rearranging and cancellation:
-
-```math
-\begin{align*}
-\frac{1}{\Delta{S_1}} \oint_{C_1} F \cdot \hat{T} ds &\approx
-\frac{F_y(x+\Delta x, y, z+\Delta{z}) - F_y(x, y, z+\Delta{z})}{\Delta{x}}\\
-&- \frac{F_x(x, y+\Delta y, z+\Delta{z}) - F_x(x, y, z+\Delta{z})}{\Delta{y}}.
-\end{align*}
-```
-
-In the limit, as $\Delta{S} \rightarrow 0$, this will converge to $\partial{F_y}/\partial{x}-\partial{F_x}/\partial{y}$.
-
-Had the bottom of the box been used, a similar result would be found, up to a minus sign.
-
-Unlike the two dimensional case, there are other directions to consider and here
-the other sides will yield different answers. Consider now the face connecting $(x,y,z), (x+\Delta{x}, y, z), (x+\Delta{x}, y, z + \Delta{z})$, and $(x,y,z+\Delta{z})$ with outward pointing normal $-\hat{j}$. Let $S_2$ denote this face and $C_2$ describe its boundary. Orient this curve so that the right hand rule points in the $-\hat{j}$ direction (the outward pointing normal). Then, as before, we can approximate:
-
-```math
-\begin{align*}
-\oint_{C_2} F \cdot \hat{T} ds
-&\approx
-F(x,y,z) \cdot \hat{i} \Delta{x} \\
-&+ F(x+\Delta{x},y,z) \cdot \hat{k} \Delta{z} \\
-&+ F(x,y,z+\Delta{z}) \cdot (-\hat{i}) \Delta{x} \\
-&+ F(x, y, z) \cdot (-\hat{k}) \Delta{z}\\
-&= (F_z(x+\Delta{x},y,z) - F_z(x, y, z))\Delta{z} -
-(F_x(x,y,z+\Delta{z}) - F_x(x,y,z)) \Delta{x}.
-\end{align*}
-```
-
-Dividing by $\Delta{S}=\Delta{x}\Delta{z}$ and taking a limit will give:
-
-```math
-\lim \frac{1}{\Delta{S}} \oint_{C_2} F \cdot \hat{T} ds =
-\frac{\partial{F_z}}{\partial{x}} - \frac{\partial{F_x}}{\partial{z}}.
-```
-
-Had the opposite face with outward normal $\hat{j}$ been chosen, the answer would differ by a factor of $-1$.
-
-Similarly, let $S_3$ be the face with outward normal $\hat{i}$ and curve $C_3$ bounding it with parameterization chosen so that the right hand rule points in the direction of $\hat{i}$. This will give
-
-
-```math
-\lim \frac{1}{\Delta{S}} \oint_{C_3} F \cdot \hat{T} ds =
-\frac{\partial{F_z}}{\partial{y}} - \frac{\partial{F_y}}{\partial{z}}.
-```
-
-In short, depending on the face chosen, a different answer is given, but all have the same form.
-
-> Define the *curl* of a $3$-dimensional vector field $F=\langle F_x,F_y,F_z\rangle$ by:
-> ```math
-> \text{curl}(F) =
-> \langle \frac{\partial{F_z}}{\partial{y}} - \frac{\partial{F_y}}{\partial{z}},
-> \frac{\partial{F_x}}{\partial{z}} - \frac{\partial{F_z}}{\partial{x}},
-> \frac{\partial{F_y}}{\partial{x}} - \frac{\partial{F_x}}{\partial{y}} \rangle.
-> ```
-
-If $S$ is some surface with closed boundary $C$ oriented so that the unit normal, $\hat{N}$, of $S$ is given by the right hand rule about $C$, then
-
-```math
-\hat{N} \cdot \text{curl}(F) = \lim \frac{1}{\Delta{S}} \oint_C F \cdot \hat{T} ds.
-```
-
-
-The curl has a formal representation in terms of a $3\times 3$ determinant, similar to that used to compute the cross product, that is useful for computation:
-
-```math
-\text{curl}(F) = \det\left[
-\begin{array}{}
-\hat{i} & \hat{j} & \hat{k}\\
-\frac{\partial}{\partial{x}} & \frac{\partial}{\partial{y}} & \frac{\partial}{\partial{z}}\\
-F_x & F_y & F_z
-\end{array}
-\right]
-```
-
-----
-
-
-In `Julia`, the curl can be implemented in different ways depending on how the problem is presented.
-We will use the Jacobian matrix to compute the required partials.
If the Jacobian is known, this function from the `CalculusWithJulia` package will combine the off-diagonal terms appropriately: - -```julia; eval=false -function curl(J::Matrix) - Mx, Nx, Px, My, Ny, Py, Mz, Nz, Pz = J - [Py-Nz, Mz-Px, Nx-My] # ∇×VF -end -``` - -The computation of the Jacobian differs whether the problem is treated numerically or symbolically. Here are two functions: - -```julia; eval=false -curl(F::Vector{Sym}, vars=free_symbols(F)) = curl(F.jacobian(vars)) -curl(F::Function, pt) = curl(ForwardDiff.jacobian(F, pt)) -``` - -### The $\nabla$ (del) operator - -The divergence, gradient, and curl all involve partial derivatives. There is a notation employed that can express the operations more succinctly. Let the [Del operator](https://en.wikipedia.org/wiki/Del) be defined in Cartesian coordinates by the formal expression: - -> ```math -> \nabla = \langle -> \frac{\partial}{\partial{x}}, -> \frac{\partial}{\partial{y}}, -> \frac{\partial}{\partial{z}} -> \rangle. -> ``` - -This is a *vector differential operator* that acts on functions and vector fields through the typical notation to yield the three operations: - -```math -\begin{align*} -\nabla{f} &= \langle -\frac{\partial{f}}{\partial{x}}, -\frac{\partial{f}}{\partial{y}}, -\frac{\partial{f}}{\partial{z}} -\rangle, \quad\text{the gradient;}\\ -\nabla\cdot{F} &= \langle -\frac{\partial}{\partial{x}}, -\frac{\partial}{\partial{y}}, -\frac{\partial}{\partial{z}} -\rangle \cdot F \\ -&= -\langle -\frac{\partial}{\partial{x}}, -\frac{\partial}{\partial{y}}, -\frac{\partial}{\partial{z}} -\rangle \cdot -\langle F_x, F_y, F_z \rangle \\ -&= -\frac{\partial{F_x}}{\partial{x}} + -\frac{\partial{F_y}}{\partial{y}} + -\frac{\partial{F_z}}{\partial{z}},\quad\text{the divergence;}\\ -\nabla\times F &= \langle -\frac{\partial}{\partial{x}}, -\frac{\partial}{\partial{y}}, -\frac{\partial}{\partial{z}} -\rangle \times F = -\det\left[ -\begin{array}{} -\hat{i} & \hat{j} & \hat{k} \\ -\frac{\partial}{\partial{x}}& 
-\frac{\partial}{\partial{y}}& -\frac{\partial}{\partial{z}}\\ -F_x & F_y & F_z -\end{array} -\right],\quad\text{the curl}. -\end{align*} -``` - -!!! note - Mathematically operators have not been seen previously, but the concept of an operation on a function that returns another function is a common one when using `Julia`. We have seen many examples (`plot`, `D`, `quadgk`, etc.). In computer science such functions are called *higher order* functions, as they accept arguments which are also functions. - ----- - -In the `CalculusWithJulia` package, the constant `\nabla[\tab]`, producing $\nabla$ implements this operator for functions and symbolic expressions. - -```julia -@syms x::real y::real z::real -``` - -```julia; -f(x,y,z) = x*y*z -f(v) = f(v...) -F(x,y,z) = [x, y, z] -F(v) = F(v...) - -∇(f(x,y,z)) # symbolic operation on the symbolic expression f(x,y,z) -``` - -This usage of `∇ ` takes partial derivatives according to the order given by: - -```julia; -free_symbols(f(x,y,z)) -``` - -which may **not** be as desired. In this case, -the variables can be specified using a tuple to pair up the expression with the variables to differentiate against: - -```julia; -∇( (f(x,y,z), [x,y,z]) ) -``` - -For numeric expressions, we have: - -```julia; -∇(f)(1,2,3) # a numeric computation. Also can call with a point [1,2,3] -``` - -(The extra parentheses are unfortunate. Here `∇` is called like a function.) - -The divergence can be found symbolically: - -```julia; -∇ ⋅ F(x,y,z) -``` - -Or numerically: - -```julia; -(∇ ⋅ F)(1,2,3) # a numeric computation. Also can call (∇ ⋅ F)([1,2,3]) -``` - - -Similarly, the curl. Symbolically: - -```julia; -∇ × F(x,y,z) -``` - - -and numerically: -```julia; -(∇ × F)(1,2,3) # numeric. Also can call (∇ × F)([1,2,3]) -``` - - -There is a subtle difference in usage. Symbolically the evaluation of -`F(x,y,z)` first is desired, numerically the evaluation of `∇ ⋅ F` or -`∇ × F ` first is desired. 
As `⋅` and `×` have lower precedence than
-function evaluation, parentheses must be used in the numeric case.
-
-
-!!! note
-    As mentioned, the symbolic evaluations require a specification of the three variables (here `x`, `y`, and `z`). This use relies on `free_symbols` identifying exactly three free symbols, which may not always be the case. (It wouldn't be for, say, `F(x,y,z) = [a*x, b*y, 0]` with `a` and `b` constants.) In those cases, the notation accepts a tuple pairing the expression with the variables to differentiate against: `∇( (f(x,y,z), [x,y,z]) )`, as illustrated; `∇ × (F(x,y,z), [x,y,z])`; or `∇ ⋅ (F(x,y,z), [x,y,z])`. Here function calls produce the symbolic expression in the first position of the tuple, though a direct expression could also be used. In these cases, the named versions `gradient`, `curl`, and `divergence` may be preferred.
-
-
-
-
-## Interpretation
-
-The divergence and curl measure complementary aspects of a vector field. The divergence is defined in terms of the flow out of an infinitesimal box; the curl measures the rotational flow around an infinitesimal area patch.
-
-
-
-Let $F(x,y,z) = [x, 0, 0]$, a vector field pointing in just the $\hat{i}$ direction. The divergence is simply $1$. If $V$ is a box, as in the derivation, then the divergence measures the flow in through the side with outward normal $-\hat{i}$ and out through the side with outward normal $\hat{i}$. Since the field strength increases with $x$, the net flux out of the box, $((x+\Delta{x}) - x)\Delta{y}\Delta{z} = \Delta{x}\Delta{y}\Delta{z}$, is positive; dividing by the volume gives $1$.
-
-
-
-The radial vector field $F(x,y,z) = \langle x, y, z \rangle$ is also an example of a divergent field. The divergence is:
-
-```julia; hold=true
-F(x,y,z) = [x,y,z]
-∇ ⋅ F(x,y,z)
-```
-
-There is a constant outward flow, emanating from the origin. Here we picture the field when $z=0$:
-
-```julia; hold=true; echo=false
-gr()
-F12(x,y) = [x,y]
-F12(v) = F12(v...)
-
-p = plot(legend=false)
-vectorfieldplot!(p, F12, xlim=(-5,5), ylim=(-5,5), nx=10, ny=10)
-t0, dt = -pi/6, 2pi/6
-r0, dr = 3, 1
-plot!(p, unzip(r -> r * [cos(t0), sin(t0)], r0, r0 + dr)..., linewidth=3)
-plot!(p, unzip(r -> r * [cos(t0+dt), sin(t0+dt)], r0, r0 + dr)..., linewidth=3)
-plot!(p, unzip(t -> r0 * [cos(t), sin(t)], t0, t0 + dt)..., linewidth=3)
-plot!(p, unzip(t -> (r0+dr) * [cos(t), sin(t)], t0, t0 + dt)..., linewidth=3)
-
-p
-```
-
-Consider the limit definition of the divergence:
-
-```math
-\nabla\cdot{F} = \lim \frac{1}{\Delta{V}} \oint_S F\cdot\hat{N} dA.
-```
-
-In the vector field above, the field has constant magnitude along each curved edge of the highlighted shape. The left curved edge is shorter than the right one, and the field along it is weaker. The flux across the left edge will thus be less than the flux across the right edge, leaving a net flux. That is, there is divergence.
-
-
-Now, were the field on the right edge weaker, it might be that the two balance out and there is no divergence. This occurs with the inverse square laws, such as those for gravity and the electric field:
-
-```julia; hold=true
-R = [x,y,z]
-Rhat = R/norm(R)
-VF = (1/norm(R)^2) * Rhat
-∇ ⋅ VF |> simplify
-```
-
-
-----
-
-The vector field $F(x,y,z) = \langle -y, x, 0 \rangle$ is an example of a rotational field. Its curl can be computed symbolically through:
-
-```julia;
-curl([-y,x,0], [x,y,z])
-```
-
-This vector field rotates, as seen in this figure showing slices for different values of $z$:
-
-```julia; hold=true; echo=false
-V(x,y,z) = [-y, x, 0]
-V(v) = V(v...)
-
-p = plot([NaN],[NaN],[NaN], legend=false)
-ys = xs = range(-2,2, length=10 )
-zs = range(0, 4, length=3)
-CalculusWithJulia.vectorfieldplot3d!(p, V, xs, ys, zs, nz=3)
-plot!(p, [0,0], [0,0],[-1,5], linewidth=3)
-p
-```
-
-The field has a clear rotation about the $z$ axis (illustrated with a line). The curl is a vector that points in the direction given by the *right hand* rule, the fingers curling along the flow, with magnitude given by the amount of rotation.
-
-This is a bit misleading though: the curl is defined by a limit, not in terms of a large box. The key point for this field is that it is stronger farther from the axis, so for a properly oriented small box, the line integral along the closer edge will be less than that along the outer edge.
-
-Consider a related field whose strength gets smaller as the point gets farther away, but which otherwise has the same circular rotation pattern:
-
-```julia; hold=true
-R = [-y, x, 0]
-VF = R / norm(R)^2
-curl(VF, [x,y,z]) .|> simplify
-```
-
-Here the diminishing strength exactly balances the rotation, and the curl is $\vec{0}$. Further, the curl of `R/norm(R)^3` points in the *opposite* direction of the curl of `R`.
-This example isn't typical, as dividing by `norm(R)` with a power greater than $1$ makes the vector field discontinuous at the origin.
-
-
-The curl of the vector field $F(x,y,z) = \langle 0, 1+y^2, 0\rangle$ is $0$, as there is clearly no rotation, as seen in this slice where $z=0$:
-
-```julia; hold=true; echo=false
-vectorfieldplot((x,y) -> [0, 1+y^2], xlim=(-1,1), ylim=(-1,1), nx=10, ny=8)
-```
-
-Algebraically, this is so:
-
-```julia;
-curl(Sym[0,1+y^2,0], [x,y,z])
-```
-
-Now consider a similar field $F(x,y,z) = \langle 0, 1+x^2, 0\rangle$.
A slice is somewhat similar, in that the flow lines are all in the $\hat{j}$ direction:
-
-```julia; hold=true; echo=false
-vectorfieldplot((x,y) -> [0, 1+x^2], xlim=(-1,1), ylim=(-1,1), nx=10, ny=8)
-```
-
-However, this vector field has a curl:
-
-```julia;
-curl([0, 1+x^2,0], [x,y,z])
-```
-
-The curl points in the $\hat{k}$ direction (out of the figure). A useful visualization is to mentally place a small paddlewheel at a point and imagine whether it will turn.
-In the first case, the flow is the same on both sides of the wheel's axis, as the field does not vary with $x$, so any forces on the wheel blades will balance out. In the latter example, if $x > 0$, the force on the right side will be greater than the force on the left, so the paddlewheel would rotate counterclockwise. The right hand rule for this rotation points in the upward, or $\hat{k}$, direction, as seen algebraically in the curl.
-
-Following Strang, in general the curl can point in any direction, so the amount the paddlewheel will spin will be related to how the paddlewheel is oriented. The angular velocity of the wheel will be $(1/2)(\nabla\times{F})\cdot\hat{N}$, $\hat{N}$ being the normal for the paddlewheel.
-
-If $\vec{a}$ is some vector and $\vec{r} = \langle x, y, z\rangle$ is the radial vector, then $\vec{a} \times \vec{r}$ has a curl, which is given by:
-
-```julia; hold=true
-@syms a1 a2 a3
-a = [a1, a2, a3]
-r = [x, y, z]
-curl(a × r, [x,y, z])
-```
-
-The angular velocity then is $\vec{a} \cdot \hat{N}$. The curl is constant. As the dot product involves the cosine of the angle between the two vectors, we see the turning speed is largest when $\hat{N}$ is parallel to $\vec{a}$. This gives a statement for the curl similar to the one the gradient gives for the direction of steepest growth: the maximum rotation rate of $F$ is $(1/2)\|\nabla\times{F}\|$, attained in the direction of $\nabla\times{F}$.
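The last claim can be illustrated numerically. For $F = \vec{a}\times\vec{r}$ the curl is the constant vector $2\vec{a}$, as the symbolic computation shows, so a paddlewheel with axis $\hat{N}$ spins at rate $(1/2)(2\vec{a})\cdot\hat{N} = \vec{a}\cdot\hat{N}$. A small sketch, with an arbitrary choice for `a`:

```julia
using LinearAlgebra

a = [1.0, -2.0, 3.0]     # an arbitrary fixed vector
curlF = 2a               # curl(a × r) = 2a, the same at every point

# spin rate of a paddlewheel whose (not necessarily unit) axis is N
rate(N) = (1/2) * dot(curlF, normalize(N))

rate(a)                  # the maximum rate, ‖a‖ = √14
rate([3.0, 0.0, -1.0])   # an axis perpendicular to a: the wheel doesn't spin
```

The rate is largest when the axis is parallel to `a` and drops to zero when the axis is perpendicular, in line with the cosine in the dot product.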
-
-
-
-The curl of the radial vector field $F(x,y,z) = \langle x, y, z\rangle$ will be $\vec{0}$:
-
-```julia;
-curl([x,y,z], [x,y,z])
-```
-
-We will see that this can be anticipated, as $F = (1/2) \nabla(x^2+y^2+z^2)$ is a gradient field.
-
-
-In fact, the curl of any radial field will be $\vec{0}$. Here we represent a radial field as a scalar function of the distance $R = \|\vec{r}\|$ times $\hat{r}$:
-
-```julia; hold=true
-@syms H()
-R = sqrt(x^2 + y^2 + z^2)
-Rhat = [x, y, z]/R
-curl(H(R) * Rhat, [x, y, z])
-```
-
-Were one to represent the curl in [spherical](https://en.wikipedia.org/wiki/Del_in_cylindrical_and_spherical_coordinates) coordinates (below), this would follow algebraically from the formula easily enough.
-To anticipate this, note that by symmetry the curl would need to be the same along any ray emanating from the origin and, again by symmetry, could only possibly point along the ray. Mentally place a paddlewheel on the $x$ axis oriented along $\hat{i}$. There will be no rotational forces that could make the wheel spin around the $x$-axis, hence the curl must be $\vec{0}$.
-
-
-## The Maxwell equations
-
-The divergence and curl appear in [Maxwell](https://en.wikipedia.org/wiki/Maxwell%27s_equations)'s equations describing the relationships of electromagnetism. In the formulas below, $E$ is the electric field; $B$ is the magnetic field; $\rho$ is the charge *density* (charge per unit volume); $J$ is the electric current density (current per unit area); and $\epsilon_0$, $\mu_0$, and $c$ are universal constants.
-
-The equations in differential form are:
-
-> Gauss's law: $\nabla\cdot{E} = \rho/\epsilon_0$.
-
-That is, the divergence of the electric field is proportional to the charge density. We have already mentioned this in *integral* form.
-
-> Gauss's law of magnetism: $\nabla\cdot{B} = 0$
-
-The magnetic field has no divergence. This says that, according to Maxwell's laws, there are no magnetic charges (magnetic monopoles), unlike electric charges.
-
-> Faraday's law of induction: $\nabla\times{E} = - \partial{B}/\partial{t}$.
-
-The curl of the electric field is the *negative* of the time derivative of the magnetic field. For example, if a magnet is in motion along the $z$ axis, then the electric field has a rotation in the $x$-$y$ plane *induced* by the motion of the magnet.
-
-> Ampere's circuital law: $\nabla\times{B} = \mu_0J + \mu_0\epsilon_0 \partial{E}/\partial{t}$
-
-The curl of the magnetic field is related to the sum of the electric current density and the change in time of the electric field.
-
-----
-
-In a region with no charges ($\rho=0$) and no currents ($J=\vec{0}$), such as a vacuum, these equations reduce to two divergences being $0$: $\nabla\cdot{E} = 0$ and $\nabla\cdot{B}=0$; and two curl relationships with time derivatives: $\nabla\times{E}= -\partial{B}/\partial{t}$ and $\nabla\times{B} = \mu_0\epsilon_0 \partial{E}/\partial{t}$.
-
-We will see later how these differential forms are consequences of related integral forms.
-
-
-
-
-## Algebra of vector calculus
-
-The divergence, gradient, and curl satisfy several algebraic [properties](https://en.wikipedia.org/wiki/Vector_calculus_identities).
-
-Let $f$ and $g$ denote scalar functions, $R^3 \rightarrow R$, and let $F$ and $G$ be vector fields, $R^3 \rightarrow R^3$.
-
-### Linearity
-
-As with the sum rule of univariate derivatives, these operations satisfy:
-
-```math
-\begin{align}
-\nabla(f + g) &= \nabla{f} + \nabla{g}\\
-\nabla\cdot(F+G) &= \nabla\cdot{F} + \nabla\cdot{G}\\
-\nabla\times(F+G) &= \nabla\times{F} + \nabla\times{G}.
-\end{align}
-```
-
-### Product rule
-
-The product rule $(uv)' = u'v + uv'$ has related formulas:
-
-```math
-\begin{align}
-\nabla{(fg)} &= (\nabla{f}) g + f\nabla{g} = g\nabla{f} + f\nabla{g}\\
-\nabla\cdot(fF) &= (\nabla{f})\cdot{F} + f(\nabla\cdot{F})\\
-\nabla\times(fF) &= (\nabla{f})\times{F} + f(\nabla\times{F}).
-\end{align}
-```
-
-### Rules over cross products
-
-The cross product of two vector fields is a vector field for which the divergence and curl may be taken. There are formulas relating these to the individual fields:
-
-```math
-\begin{align}
-\nabla\cdot(F \times G) &= (\nabla\times{F})\cdot G - F \cdot (\nabla\times{G})\\
-\nabla\times(F \times G) &= F(\nabla\cdot{G}) - G(\nabla\cdot{F}) + (G\cdot\nabla)F-(F\cdot\nabla)G\\
-&= \nabla\cdot(GF^t - FG^t).
-\end{align}
-```
-The curl formula is more involved.
-
-### Vanishing properties
-
-Surprisingly, the curl and divergence satisfy two vanishing properties. First
-
-> The curl of a gradient field is $\vec{0}$
-> ```math
-> \nabla \times \nabla{f} = \vec{0},
-> ```
-
-if the scalar function $f$ has continuous second derivatives (so the mixed partials do not depend on the order of differentiation).
-
-Vector fields where $F = \nabla{f}$ are conservative. Conservative fields have path independence, so any line integral, $\oint F\cdot \hat{T} ds$, around a closed loop will be $0$. But the curl is defined as a limit of such integrals, so it too will be $\vec{0}$. In short, conservative fields have no rotation.
-
-What about the converse? If a vector field has zero curl, then integrals around infinitesimally small loops are $0$. Does this *also* mean that integrals around larger closed loops will be $0$, and hence the field is conservative? The answer will be yes, *under assumptions*. But the discussion will wait for later.
-
-The combination $\nabla\cdot\nabla{f}$ is defined and is called the Laplacian. This is denoted $\Delta{f}$. The equation $\Delta{f} = 0$ is called Laplace's equation. It does *not* hold for every scalar function $f$, but the $f$ for which it holds are important.
-
-Second,
-
-> The divergence of a curl field is $0$:
-> ```math
-> \nabla \cdot(\nabla\times{F}) = 0.
-> ```
-
-This is not as clear, but can be seen algebraically, as terms cancel.
First: - -```math -\begin{align*} -\nabla\cdot(\nabla\times{F}) &= -\langle -\frac{\partial}{\partial{x}}, -\frac{\partial}{\partial{y}}, -\frac{\partial}{\partial{z}}\rangle \cdot -\langle -\frac{\partial{F_z}}{\partial{y}} - \frac{\partial{F_y}}{\partial{z}}, -\frac{\partial{F_x}}{\partial{z}} - \frac{\partial{F_z}}{\partial{x}}, -\frac{\partial{F_y}}{\partial{x}} - \frac{\partial{F_x}}{\partial{y}} -\rangle \\ -&= -\left(\frac{\partial^2{F_z}}{\partial{y}\partial{x}} - \frac{\partial^2{F_y}}{\partial{z}\partial{x}}\right) + -\left(\frac{\partial^2{F_x}}{\partial{z}\partial{y}} - \frac{\partial^2{F_z}}{\partial{x}\partial{y}}\right) + -\left(\frac{\partial^2{F_y}}{\partial{x}\partial{z}} - \frac{\partial^2{F_x}}{\partial{y}\partial{z}}\right) -\end{align*} -``` - -Focusing on one component function, $F_z$ say, we see this contribution: - -```math -\frac{\partial^2{F_z}}{\partial{y}\partial{x}} - -\frac{\partial^2{F_z}}{\partial{x}\partial{y}}. -``` - -This is zero under the assumption that the second partial derivatives are continuous. - -From the microscopic picture of a box this can also be seen. Again we focus on just the appearance of the $F_z$ component function. Let the faces with normals $\hat{i}, \hat{j},-\hat{i}, -\hat{j}$ be labeled $A, B, C$, and $D$. 
This figure shows $A$ (enclosed in blue) and $B$ (enclosed in green): - -```julia; hold=true; echo=false - -dx = .5 -dy = .250 -offset=5 -p = plot(;ylim=(0-.1, 1+dy+.1), legend=false, aspect_ratio=:equal,xticks=nothing,yticks=nothing, border=:none) -plot!(p, [dx,dx],[dy,1+dy-offset/100], linestyle=:dash) -plot!(p, [0+offset/100,dx],[0+offset/100,dy], linestyle=:dash) -plot!(p, [dx,1+dx-2offset/100],[dy,dy], linestyle=:dash) - -ps = [[1,1], [0,1],[0,0],[1,0]] -apoly!(ps, linewidth=3, color=:blue) - -ps = [[1,1], [1+dx, 1+dy], [dx, 1+dy],[0,1]] -apoly!(ps, linewidth=3, color=:red) - -ps = [[1,0],[1+dx, dy],[1+dx, 1+dy],[1,1]] -apoly!(ps, linewidth=3, color=:green) -annotate!(dx+.02, dy-0.05, L"P_1") -annotate!(0+0.05, 0 - 0.02, L"P_2") -annotate!(1+0.05, 0 - 0.02, L"P_3") -annotate!(1+dx+.02, dy-0.05, L"P_4") -p -``` - - -We will get from the *approximate* surface integral of the *approximate* curl the following terms: - -```julia; hold=true -@syms x y z Δx Δy Δz -p1, p2, p3, p4=(x, y, z), (x + Δx, y, z), (x + Δx, y + Δy, z), (x, y + Δy, z) -@syms F_z() -global exₐ = (-F_z(p2...) + F_z(p3...))*Δz + # face A -(-F_z(p3...) + F_z(p4...))*Δz + # face B -(F_z(p1...) - F_z(p4...))*Δz + # face C -(F_z(p2...) - F_z(p1...))*Δz # face D -``` - -The term for face $A$, say, should be divided by $\Delta{y}\Delta{z}$ for the curl approximation, but this will be multiplied by the same amount for the divergence calculation, so it isn't written. - -The expression above simplifies to: - -```julia; -simplify(exₐ) -``` - -This is because of how the line integrals are oriented so that the right-hand rule gives outward pointing normals. For each up stroke for one face, there is a downstroke for a different face, and so the corresponding terms cancel each other out. So providing the limit of these two approximations holds, the vanishing identity can be anticipated from the microscopic picture. 
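Both vanishing properties can also be spot-checked numerically with central differences: an illustration at one point, not a proof (the helper names, step sizes, and sample fields here are arbitrary choices):

```julia
e3(j) = [i == j ? 1.0 : 0.0 for i in 1:3]                       # basis vector
D(g, p, j; h=1e-5) = (g(p .+ h .* e3(j)) .- g(p .- h .* e3(j))) ./ (2h)

grad3(f, p) = [D(f, p, j) for j in 1:3]
function curl3(F, p)
    J = hcat([D(F, p, j) for j in 1:3]...)    # J[i, j] = ∂Fᵢ/∂xⱼ
    [J[3,2]-J[2,3], J[1,3]-J[3,1], J[2,1]-J[1,2]]
end
div3(F, p) = sum(D(F, p, j)[j] for j in 1:3)

f(p) = p[1] * p[2] * p[3]                                  # a scalar function
F(p) = [sin(p[1]*p[2]), exp(p[2]*p[3]), p[1]^2 * p[3]]     # a vector field

p0 = [0.3, 0.4, 0.5]
curl3(q -> grad3(f, q), p0)   # ≈ [0, 0, 0]: the curl of a gradient
div3(q -> curl3(F, q), p0)    # ≈ 0: the divergence of a curl
```

The results are zero only up to discretization and roundoff error, as the derivatives are nested finite-difference approximations.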
-
-
-##### Example
-
-The [invariance of charge](https://en.wikipedia.org/wiki/Maxwell%27s_equations#Charge_conservation) can be derived as a corollary of Maxwell's equations. The divergence of the curl of the magnetic field is $0$, leading to:
-
-```math
-\begin{align*}
-0 &= \nabla\cdot(\nabla\times{B}) \\
-&=
-\mu_0(\nabla\cdot{J} + \epsilon_0 \nabla\cdot{\frac{\partial{E}}{\partial{t}}}) \\
-&=
-\mu_0(\nabla\cdot{J} + \epsilon_0 \frac{\partial}{\partial{t}}(\nabla\cdot{E})) \\
-&=
-\mu_0(\nabla\cdot{J} + \frac{\partial{\rho}}{\partial{t}}).
-\end{align*}
-```
-
-That is, $\nabla\cdot{J} = -\partial{\rho}/\partial{t}$.
-This says any change in time of the charge density ($\partial{\rho}/\partial{t}$) is balanced off by a divergence in the electric current density ($\nabla\cdot{J}$). That is, charge can't be created or destroyed in an isolated system.
-
-
-## Fundamental theorem of vector calculus
-
-The divergence and curl are complementary ideas. Are there other distinct ideas to sort a vector field by?
-The Helmholtz decomposition says not really. It states that vector fields that decay rapidly enough can be expressed in terms of two pieces: one with no curl and one with no divergence.
-
-From [Wikipedia](https://en.wikipedia.org/wiki/Helmholtz_decomposition) we have this formulation:
-
-Let $F$ be a vector field on a **bounded** domain $V$ which is twice continuously differentiable. Let $S$ be the surface enclosing $V$. Then $F$ can be decomposed into a curl-free component and a divergence-free component:
-
-```math
-F = -\nabla(\phi) + \nabla\times A.
-```
-
-Without explaining why, these values can be computed using volume and
-surface integrals:
-
-```math
-\begin{align}
-\phi(\vec{r}') &=
-\frac{1}{4\pi} \int_V \frac{\nabla \cdot F(\vec{r})}{\|\vec{r}'-\vec{r} \|} dV -
-\frac{1}{4\pi} \oint_S \frac{F(\vec{r})}{\|\vec{r}'-\vec{r} \|} \cdot \hat{N} dS\\
-A(\vec{r}') &= \frac{1}{4\pi} \int_V \frac{\nabla \times F(\vec{r})}{\|\vec{r}'-\vec{r} \|} dV +
-\frac{1}{4\pi} \oint_S \frac{F(\vec{r})}{\|\vec{r}'-\vec{r} \|} \times \hat{N} dS.
-\end{align}
-```
-
-If $V = R^3$, an unbounded domain, *but* $F$ *vanishes* faster than $1/r$, then the theorem still holds with just the volume integrals:
-
-```math
-\begin{align}
-\phi(\vec{r}') &=\frac{1}{4\pi} \int_V \frac{\nabla \cdot F(\vec{r})}{\|\vec{r}'-\vec{r} \|} dV\\
-A(\vec{r}') &= \frac{1}{4\pi} \int_V \frac{\nabla \times F(\vec{r})}{\|\vec{r}'-\vec{r}\|} dV.
-\end{align}
-```
-
-
-## Change of variable
-
-The divergence and curl are defined in a manner independent of the coordinate system, though the method to compute them given above depends on the Cartesian coordinate system. If that is inconvenient, then it is possible to develop the ideas in different coordinate systems.
-
-Some details are [here](https://en.wikipedia.org/wiki/Curvilinear_coordinates); the following is based on [some lecture notes](https://www.jfoadi.me.uk/documents/lecture_mathphys2_05.pdf).
-
-We restrict to $n=3$ and use $(x,y,z)$ for Cartesian coordinates and $(u,v,w)$ for an *orthogonal* curvilinear coordinate system, such as spherical or cylindrical. If $\vec{r} = \langle x,y,z\rangle$, then
-
-```math
-\begin{align}
-d\vec{r} &= \langle dx,dy,dz \rangle = J \langle du,dv,dw\rangle\\
-&=
-\left[ \frac{\partial{\vec{r}}}{\partial{u}} \vdots
-\frac{\partial{\vec{r}}}{\partial{v}} \vdots
-\frac{\partial{\vec{r}}}{\partial{w}} \right] \langle du,dv,dw\rangle\\
-&= \frac{\partial{\vec{r}}}{\partial{u}} du +
-\frac{\partial{\vec{r}}}{\partial{v}} dv +
-\frac{\partial{\vec{r}}}{\partial{w}} dw.
-\end{align}
-```
-
-The term ${\partial{\vec{r}}}/{\partial{u}}$ is tangent to the curve formed by *assuming* $v$ and $w$ are constant and letting $u$ vary. Similarly for the other partial derivatives. Orthogonality assumes that at every point, these tangent vectors are orthogonal.
-
-As ${\partial{\vec{r}}}/{\partial{u}}$ is a vector, it has a magnitude and a direction. Define the scale factors as the magnitudes:
-
-```math
-h_u = \| \frac{\partial{\vec{r}}}{\partial{u}} \|,\quad
-h_v = \| \frac{\partial{\vec{r}}}{\partial{v}} \|,\quad
-h_w = \| \frac{\partial{\vec{r}}}{\partial{w}} \|,
-```
-
-and let $\hat{e}_u$, $\hat{e}_v$, and $\hat{e}_w$ be the unit direction vectors.
-
-This gives the following notation:
-
-```math
-d\vec{r} = h_u du \hat{e}_u + h_v dv \hat{e}_v + h_w dw \hat{e}_w.
-```
-
-
-From here, we can express different formulas.
-
-For line integrals, we have the line element:
-
-```math
-dl = \sqrt{d\vec{r}\cdot d\vec{r}} = \sqrt{(h_udu)^2 + (h_vdv)^2 + (h_wdw)^2}.
-```
-
-
-Consider the surface for constant $u$. The vectors $\hat{e}_v$ and $\hat{e}_w$ lie in the surface's tangent plane, and the surface element will be:
-
-```math
-dS_u = \| h_v dv \hat{e}_v \times h_w dw \hat{e}_w \| = h_v h_w dv dw \| \hat{e}_v \times \hat{e}_w \| = h_v h_w dv dw.
-```
-
-This uses orthogonality, which ensures $\hat{e}_v \times \hat{e}_w$ is parallel to $\hat{e}_u$ and has unit length. Similarly, $dS_v = h_u h_w du dw$ and $dS_w = h_u h_v du dv$.
-
-The volume element is found by *projecting* $d\vec{r}$ onto the $\hat{e}_u$, $\hat{e}_v$, $\hat{e}_w$ coordinate system through $(d\vec{r} \cdot\hat{e}_u) \hat{e}_u$, $(d\vec{r} \cdot\hat{e}_v) \hat{e}_v$, and $(d\vec{r} \cdot\hat{e}_w) \hat{e}_w$.
Then forming the triple scalar product computes the volume of the parallelepiped:
-
-```math
-\begin{align*}
-\left[(d\vec{r} \cdot\hat{e}_u) \hat{e}_u\right] \cdot
-\left(
-\left[(d\vec{r} \cdot\hat{e}_v) \hat{e}_v\right] \times
-\left[(d\vec{r} \cdot\hat{e}_w) \hat{e}_w\right]
-\right) &=
-(h_u h_v h_w) ( du dv dw ) (\hat{e}_u \cdot (\hat{e}_v \times \hat{e}_w)) \\
-&=
-h_u h_v h_w du dv dw,
-\end{align*}
-```
-
-as the unit vectors are orthonormal, so their triple scalar product is $1$ and $d\vec{r}\cdot\hat{e}_u = h_u du$, etc.
-
-
-### Example
-
-We consider spherical coordinates with
-
-```math
-F(r, \theta, \phi) = \langle
-r \sin(\phi) \cos(\theta),
-r \sin(\phi) \sin(\theta),
-r \cos(\phi)
-\rangle.
-```
-
-The following figure draws curves starting at $(r_0, \theta_0, \phi_0)$ formed by holding $2$ of the $3$ variables constant. The tangent vectors are added in blue. The surface $S_r$ formed by a constant value of $r$ is illustrated.
-
-
-```julia; hold=true; echo=false
-Fx(r, theta, phi) = r * cos(theta) * sin(phi)
-Fy(r, theta, phi) = r * sin(theta) * sin(phi)
-Fz(r, theta, phi) = r * cos(phi)
-F(r, theta, phi) = [Fx(r,theta,phi), Fy(r, theta, phi), Fz(r, theta, phi)]
-Ftp(theta, phi; r = r0) = F(r, theta, phi)
-
-r0, t0, p0 = 1, pi/4, pi/4
-dr, dt, dp = 0.15, pi/8, pi/24
-nr = nt = np = 5
-rs = range(r0, r0+dr, length=nr)
-ts = range(t0, t0+dt, length=nt)
-ps = range(p0, p0 + dp, length=np)
-
-
-# plot lines for fixed r, theta, phi
-p = plot(unzip(r -> F(r,t0,p0), r0, r0+dr)..., legend=false, linewidth=2, color=:black, camera=(50, 60))
-plot!(p, unzip(t -> F(r0,t,p0), t0, t0+dt)..., linewidth=2, color=:black)
-plot!(p, unzip(p -> F(r0,t0,p), p0, p0+dp)..., linewidth=2, color=:black)
-
-for theta in ts[2:end]
-    plot!(p, unzip(phi -> Ftp(theta, phi), p0, p0+dp)...)
-end
-for phi in ps[2:end]
-    plot!(p, unzip(theta -> Ftp(theta, phi), t0, t0+dt)...)
-end
-
-∂Fr(r, theta, phi) = [cos(theta) * sin(phi), sin(theta) * sin(phi), cos(phi)]
-∂Fθ(r, theta, phi) = [-r*sin(theta)*sin(phi), r*cos(theta)*sin(phi), 0]
-∂Fϕ(r, theta, phi) = [r*cos(theta)*cos(phi), r*sin(theta)*cos(phi), -r*sin(phi)]
-
-pt = (r0, t0, p0)
-arrow!(p, F(pt...), (1/15) * ∂Fr(pt...), color=:blue, linewidth=4)
-arrow!(p, F(pt...), (1/4) * ∂Fθ(pt...), color=:blue, linewidth=4)
-arrow!(p, F(pt...), (1/10) * ∂Fϕ(pt...), color=:blue, linewidth=4)
-p
-```
-
-The tangent vectors are found from the partial derivatives of $\vec{r}$:
-
-```math
-\begin{align}
-\frac{\partial{\vec{r}}}{\partial{r}} &=
-\langle \cos(\theta) \cdot \sin(\phi), \sin(\theta) \cdot \sin(\phi), \cos(\phi)\rangle,\\
-\frac{\partial{\vec{r}}}{\partial{\theta}} &=
-\langle -r\cdot\sin(\theta)\cdot\sin(\phi), r\cdot\cos(\theta)\cdot\sin(\phi), 0\rangle,\\
-\frac{\partial{\vec{r}}}{\partial{\phi}} &=
-\langle r\cdot\cos(\theta)\cdot\cos(\phi), r\cdot\sin(\theta)\cdot\cos(\phi), -r\cdot\sin(\phi) \rangle.
-\end{align}
-```
-
-With this, we have $h_r=1$, $h_\theta=r\sin(\phi)$, and $h_\phi = r$, so that
-
-```math
-\begin{align*}
-dl &= \sqrt{dr^2 + (r\sin(\phi)d\theta)^2 + (rd\phi)^2},\\
-dS_r &= r^2\sin(\phi)d\theta d\phi,\\
-dS_\theta &= rdr d\phi,\\
-dS_\phi &= r\sin(\phi)dr d\theta, \quad\text{and}\\
-dV &= r^2\sin(\phi) drd\theta d\phi.
-\end{align*}
-```
-
-The following visualizes the volume and the surface elements.
- -```julia; hold=true; echo=false -Fx(r, theta, phi) = r * cos(theta) * sin(phi) -Fy(r, theta, phi) = r * sin(theta) * sin(phi) -Fz(r, theta, phi) = r * cos(phi) -F(r, theta, phi) = [Fx(r,theta,phi), Fy(r, theta, phi), Fz(r, theta, phi)] -Ftp(theta, phi;r = r0) = F(r, theta, phi) - -r0, t0, p0 = 1, pi/4, pi/4 -dr, dt, dp = 0.15, pi/8, pi/24 -nr = nt = np = 5 -rs = range(r0, r0+dr, length=nr) -ts = range(t0, t0+dt, length=nt) -ps = range(p0, p0 + dp, length=np) - - -# plot lines for fixed r, theta, phi -p = plot(unzip(r -> F(r,t0,p0), r0, r0+dr)..., legend=false, linewidth=2, color=:black, camera=(50, 60)) -plot!(p, unzip(r -> F(r,t0+dt,p0), r0, r0+dr)..., linewidth=2, color=:black) -plot!(p, unzip(r -> F(r,t0,p0+dp), r0, r0+dr)..., linewidth=2, color=:black) -plot!(p, unzip(r -> F(r,t0+dt,p0+dp), r0, r0+dr)..., linewidth=2, color=:black) - -plot!(p, unzip(t -> F(r0,t,p0), t0, t0+dt)..., linewidth=2, color=:black) -plot!(p, unzip(t -> F(r0+dr,t,p0), t0, t0+dt)..., linewidth=2, color=:black) -plot!(p, unzip(t -> F(r0,t,p0+dp), t0, t0+dt)..., linewidth=2, color=:black) -plot!(p, unzip(t -> F(r0+dr,t,p0+dp), t0, t0+dt)..., linewidth=2, color=:black) - -plot!(p, unzip(p -> F(r0,t0,p), p0, p0+dp)..., linewidth=2, color=:black) -plot!(p, unzip(p -> F(r0+dr,t0,p), p0, p0+dp)..., linewidth=2, color=:black) -plot!(p, unzip(p -> F(r0,t0+dt,p), p0, p0+dp)..., linewidth=2, color=:black) -plot!(p, unzip(p -> F(r0+dr,t0+dt,p), p0, p0+dp)..., linewidth=2, color=:black) - - -∂Fr(r, theta, phi) = [cos(theta) * sin(phi), sin(theta) * sin(phi), cos(phi)] -∂Fθ(r, theta, phi) = [-r*sin(theta)*sin(phi), r*cos(theta)*sin(phi), 0] -∂Fϕ(r, theta, phi) = [r*cos(theta)*cos(phi), r*sin(theta)*cos(phi), -r*sin(phi)] - -pt = (r0, t0, p0) -arrow!(p, F(pt...), (1/15) * ∂Fr(pt...), color=:blue, linewidth=4) -arrow!(p, F(pt...), (1/4) * ∂Fθ(pt...), color=:blue, linewidth=4) -arrow!(p, F(pt...), (1/10) * ∂Fϕ(pt...), color=:blue, linewidth=4) -p -``` - - -### The gradient in a new coordinate system - 
-
-If $f$ is a scalar function, then $df = \nabla{f} \cdot d\vec{r}$ by the chain rule. Using the curvilinear coordinates:
-
-```math
-\begin{align*}
-df &=
-\frac{\partial{f}}{\partial{u}} du +
-\frac{\partial{f}}{\partial{v}} dv +
-\frac{\partial{f}}{\partial{w}} dw \\
-&=
-\frac{1}{h_u}\frac{\partial{f}}{\partial{u}} h_udu +
-\frac{1}{h_v}\frac{\partial{f}}{\partial{v}} h_vdv +
-\frac{1}{h_w}\frac{\partial{f}}{\partial{w}} h_wdw.
-\end{align*}
-```
-
-But, as was used above, $d\vec{r} \cdot \hat{e}_u = h_u du$, etc., so $df$ can be re-expressed as:
-
-```math
-df = (\frac{1}{h_u}\frac{\partial{f}}{\partial{u}}\hat{e}_u +
-\frac{1}{h_v}\frac{\partial{f}}{\partial{v}}\hat{e}_v +
-\frac{1}{h_w}\frac{\partial{f}}{\partial{w}}\hat{e}_w) \cdot d\vec{r} =
-\nabla{f} \cdot d\vec{r}.
-```
-
-The gradient is the part within the parentheses.
-
-----
-
-As an example, in cylindrical coordinates we have $h_r =1$, $h_\theta=r$, and $h_z=1$, giving:
-
-```math
-\nabla{f} = \frac{\partial{f}}{\partial{r}}\hat{e}_r +
-\frac{1}{r}\frac{\partial{f}}{\partial{\theta}}\hat{e}_\theta +
-\frac{\partial{f}}{\partial{z}}\hat{e}_z.
-```
-
-
-### The divergence in a new coordinate system
-
-The divergence is a result of the limit of a surface integral,
-
-```math
-\nabla \cdot F = \lim \frac{1}{\Delta{V}}\oint_S F \cdot \hat{N} dS.
-```
-
-Taking $V$ as a box in the curvilinear coordinates, with side lengths $h_udu$, $h_vdv$, and $h_wdw$,
-the surface integral is computed by projecting $F$ onto each normal area element and multiplying by the area. The task is similar to how the divergence was derived above, only now the terms are like $\partial{(F_uh_vh_w)}/\partial{u}$ due to the scale factors ($F_u$ is the $u$ component of $F$). The result is:
-
-```math
-\nabla\cdot F = \frac{1}{h_u h_v h_w}\left[
-\frac{\partial{(F_uh_vh_w)}}{\partial{u}} +
-\frac{\partial{(h_uF_vh_w)}}{\partial{v}} +
-\frac{\partial{(h_uh_vF_w)}}{\partial{w}} \right].
-```
-
-----
-
-For example, in cylindrical coordinates, we have
-
-```math
-\nabla \cdot F = \frac{1}{r}
-\left[
-\frac{\partial{(rF_r)}}{\partial{r}} +
-\frac{\partial{F_\theta}}{\partial{\theta}} +
-\frac{\partial{(rF_z)}}{\partial{z}}
-\right].
-```
-
-### The curl in a new coordinate system
-
-The curl, like the divergence, can be expressed as the limit of an integral:
-
-```math
-(\nabla \times F) \cdot \hat{N} = \lim \frac{1}{\Delta{S}} \oint_C F \cdot d\vec{r},
-```
-
-where $S$ is a surface perpendicular to $\hat{N}$ with boundary $C$. For a small rectangular surface, the derivation is similar to above, only the scale factors are included. This gives, say, for the $\hat{e}_u$ normal, $\frac{1}{h_vh_w}\left[\frac{\partial{(h_wF_w)}}{\partial{v}} - \frac{\partial{(h_vF_v)}}{\partial{w}}\right]$. The following determinant form combines the terms compactly:
-
-```math
-\nabla\times{F} = \frac{1}{h_uh_vh_w}\det \left[
-\begin{array}{}
-h_u\hat{e}_u & h_v\hat{e}_v & h_w\hat{e}_w \\
-\frac{\partial}{\partial{u}} & \frac{\partial}{\partial{v}} & \frac{\partial}{\partial{w}} \\
-h_uF_u & h_v F_v & h_w F_w
-\end{array}
-\right].
-```
-
-----
-
-For example, in cylindrical coordinates, where $h_rh_\theta h_z = r$, the curl is:
-
-```math
-\frac{1}{r}\det\left[
-\begin{array}{}
-\hat{r} & r\hat{\theta} & \hat{k} \\
-\frac{\partial}{\partial{r}} & \frac{\partial}{\partial{\theta}} & \frac{\partial}{\partial{z}} \\
-F_r & rF_\theta & F_z
-\end{array}
-\right]
-```
-
-Applying this to the function $F(r,\theta, z) = \hat{\theta}$ we get:
-
-```math
-\text{curl}(F) = \frac{1}{r}\det\left[
-\begin{array}{}
-\hat{r} & r\hat{\theta} & \hat{k} \\
-\frac{\partial}{\partial{r}} & \frac{\partial}{\partial{\theta}} & \frac{\partial}{\partial{z}} \\
-0 & r & 0
-\end{array}
-\right] =
-\frac{1}{r}\hat{k} \det\left[
-\begin{array}{}
-\frac{\partial}{\partial{r}} & \frac{\partial}{\partial{\theta}}\\
-0 & r
-\end{array}
-\right] =
-\frac{1}{r}\hat{k}.
-```
-
-As $F$ circulates about the $z$ axis, its curl should point in the $\hat{k}$ direction, as we found.
-
-## Questions
-
-
-###### Question
-
-Numerically find the divergence of $F(x,y,z) = \langle xy, yz, zx\rangle$ at the point $\langle 1,2,3\rangle$.
-
-```julia; hold=true; echo=false
-F(x,y,z) = [x*y, y*z, z*x]
-pt = [1,2,3]
-Jac = ForwardDiff.jacobian(pt -> F(pt...), pt)
-val = sum(diag(Jac))
-numericq(val)
-```
-
-###### Question
-
-Numerically find the curl of $F(x,y,z) = \langle xy, yz, zx\rangle$ at the point $\langle 1,2,3\rangle$. What is the $x$ component?
-
-```julia; hold=true; echo=false
-F(x,y,z) = [x*y, y*z, z*x]
-F(v) = F(v...)
-pt = [1,2,3]
-vals = (∇×F)(pt)
-val = vals[1]
-numericq(val)
-```
-
-
-###### Question
-
-Let $F(x,y,z) = \langle \sin(x), e^{xy}, xyz\rangle$. Find the divergence of $F$ symbolically.
-
-```julia; hold=true; echo=false
-choices = [
-raw" ``x y + x e^{x y} + \cos{\left (x \right )}``",
-raw" ``x y + x e^{x y}``",
-raw" ``x e^{x y} + \cos{\left (x \right )}``"
-]
-answ=1
-radioq(choices, answ)
-```
-
-
-
-###### Question
-
-Let $F(x,y,z) = \langle \sin(x), e^{xy}, xyz\rangle$. Find the curl of $F$ symbolically. What is the $x$ component?
-
-```julia; hold=true; echo=false
-choices = [
-raw" ``xz``",
-raw" ``-yz``",
-raw" ``ye^{xy}``"
-]
-answ = 1
-radioq(choices, answ)
-```
-
-###### Question
-
-Let $\phi(x,y,z) = x + 2y + 3z$. We know that $\nabla\times\nabla{\phi}$ is zero by the vanishing property. Compute $\nabla\cdot\nabla{\phi}$.
-
-```julia; hold=true; echo=false
-choices=[
-raw" ``0``",
-raw" ``\vec{0}``",
-raw" ``6``"
-]
-answ=1
-radioq(choices, answ)
-```
-
-
-###### Question
-
-In two dimensions, the curl of a gradient field simplifies to:
-
-```math
-\nabla\times\nabla{f} = \nabla\times
-\langle\frac{\partial{f}}{\partial{x}},
-\frac{\partial{f}}{\partial{y}}\rangle =
-\frac{\partial{\frac{\partial{f}}{\partial{y}}}}{\partial{x}} -
-\frac{\partial{\frac{\partial{f}}{\partial{x}}}}{\partial{y}}.
-```
-
-```julia; hold=true; echo=false
-choices = [
-L"This is $0$ if the partial derivatives are continuous by Schwarz's (Clairault's) theorem",
-L"This is $0$ for any $f$, as $\nabla\times\nabla$ is $0$ since the cross product of vector with itself is the $0$ vector."
-]
-answ = 1
-radioq(choices, answ)
-```
-
-
-###### Question
-
-Based on this vector-field plot
-
-```julia; hold=true; echo=false
-gr()
-F(x,y) = [-y,x]/sqrt(0.0000001 + x^2+y^2)
-vectorfieldplot(F, xlim=(-5,5),ylim=(-5,5), nx=15, ny=15)
-```
-
-which seems most likely?
-
-```julia; hold=true; echo=false
-choices=[
-"The field is incompressible (divergence free)",
-"The field is irrotational (curl free)",
-"The field has a non-trivial curl and divergence"
-]
-answ=1
-radioq(choices, answ, keep_order=true)
-```
-
-
-###### Question
-
-Based on this vector-field plot
-
-```julia; hold=true; echo=false
-gr()
-F(x,y) = [x,y]/sqrt(0.0000001 + x^2+y^2)
-vectorfieldplot(F, xlim=(-5,5),ylim=(-5,5), nx=15, ny=15)
-```
-
-which seems most likely?
-
-```julia; hold=true; echo=false
-choices=[
-"The field is incompressible (divergence free)",
-"The field is irrotational (curl free)",
-"The field has a non-trivial curl and divergence"
-]
-answ=2
-radioq(choices, answ, keep_order=true)
-```
-
-###### Question
-
-The electric field $E$ (by Maxwell's equations) satisfies:
-
-
-```julia; hold=true; echo=false
-choices=[
-"The field is incompressible (divergence free)",
-"The field is irrotational (curl free)",
-"The field has a non-trivial curl and divergence"
-]
-answ=3
-radioq(choices, answ, keep_order=true)
-```
-
-
-###### Question
-
-The magnetic field $B$ (by Maxwell's equations) satisfies:
-
-
-```julia; hold=true; echo=false
-choices=[
-"The field is incompressible (divergence free)",
-"The field is irrotational (curl free)",
-"The field has a non-trivial curl and divergence"
-]
-answ=1
-radioq(choices, answ, keep_order=true)
-```
-
-
-
-###### Question
-
-For spherical coordinates, $\Phi(r, \theta, \phi)=r \langle \sin\phi\cos\theta,\sin\phi\sin\theta,\cos\phi\rangle$, the scale factors are $h_r = 1$, $h_\theta=r\sin\phi$, and $h_\phi=r$.
-
-The curl will then be
-
-```math
-\nabla\times{F} = \frac{1}{r^2\sin\phi}\det \left[
-\begin{array}{}
-\hat{e}_r & r\sin\phi\hat{e}_\theta & r\hat{e}_\phi \\
-\frac{\partial}{\partial{r}} & \frac{\partial}{\partial{\theta}} & \frac{\partial}{\partial{\phi}} \\
-F_r & r\sin\phi F_\theta & r F_\phi
-\end{array}
-\right].
-```
-
-For a *radial* function $F = h(r)\hat{e}_r$ (that is, $F_r = h(r)$, $F_\theta=0$, and $F_\phi=0$), what is the curl of $F$?
- -```julia; hold=true; echo=false -choices = [ -raw" ``\vec{0}``", -raw" ``re_\phi``", -raw" ``rh'(r)e_\phi``" -] -answ=1 -radioq(choices, answ) -``` diff --git a/CwJ/integral_vector_calculus/double_triple_integrals.jmd b/CwJ/integral_vector_calculus/double_triple_integrals.jmd deleted file mode 100644 index df0fc70..0000000 --- a/CwJ/integral_vector_calculus/double_triple_integrals.jmd +++ /dev/null @@ -1,1711 +0,0 @@ -# Multi-dimensional integrals - -This section uses these add-on packages: - - -```julia -using CalculusWithJulia -using Plots -using QuadGK -using SymPy -using HCubature -``` - -```julia; echo=false; results="hidden" -using CalculusWithJulia.WeaveSupport - -#import PyPlot -#pyplot() - - -const frontmatter = ( - title = "Multi-dimensional integrals", - description = "Calculus with Julia: Multi-dimensional integrals", - tags = ["CalculusWithJulia", "integral_vector_calculus", "multi-dimensional integrals"], -); -nothing -``` - ----- - -The definition of the definite integral, $\int_a^b f(x)dx$, is based on -Riemann sums. - - -We review, using a more general form than -[previously](../integrals/area.html). Consider a bounded function $f$ -over $[a,b]$. A partition, $P$, is based on $a = x_0 < x_1 < \cdots < -x_n = b$. For each subinterval $[x_{i-1}, x_{i}]$ take $m_i(f) = -\inf_{u \text{ in } [x_{i-1},x_i]} f(u)$ and $M_i(f) = \sup_{u \text{ -in } [x_{i-1},x_i]} f(u)$. (When $f$ is continuous, $m_i$ and $M_i$ -are realized at points of $[x_{i-1},x_i]$, though that isn't assumed -here. The use of "$\sup$" and "$\inf$" is a mathematically formal -means to replace this in general.) Let $\Delta x_i = x_i - x_{i-1}$. -Form the sums $m(f, P) = \sum_i m_i(f) \Delta x_i$ and $M(f, P) = -\sum_i M_i(f) \Delta x_i$. These are the *lower* and *upper* Riemann -sums for a partition. A general Riemann sum would be formed by -selecting $c_i$ from $[x_{i-1}, x_i]$ and forming $S(f,P) = \sum -f(c_i) \Delta x_i$. 
It will be the case that $m(f,P) \leq S(f,P) \leq -M(f,P)$, as this is true for *each* sub-interval of the partition. - -If, as the largest diameter ($\Delta x_i$) of the partition $P$ goes to $0$, the upper and lower sums converge to the same limit, then $f$ is called Riemann integrable over $[a,b]$. If $f$ is Riemann integrable, any Riemann sum will converge to the definite integral as the partitioning shrinks. - -Continuous functions are known to be Riemann integrable, as are functions with only finitely many discontinuities, though this isn't the most general case of integrable functions, which will be stated below. - -In practice, we don't typically compute integrals using a limit of a partition, though the approach may provide direction to numeric answers, as the Fundamental Theorem of Calculus relates the definite integral with an antiderivative of the integrand. - -The multidimensional case will prove to be similar where a Riemann sum is used to define the value being discussed, but a theorem of Fubini will allow the computation of integrals using the Fundamental Theorem of Calculus. - ----- - - -## Integration theory - -```julia; hold=true; echo=false -imgfile = "figures/chrysler-building-in-new-york.jpg" -caption = """How to estimate the volume contained within the Chrysler Building? 
One way might be to break the building up into tall vertical blocks based on its skyline; compute the volume of each block using the formula of volume as area of the base times the height; and, finally, add up the computed volumes. This is the basic idea of finding volumes under surfaces using Riemann integration."""
-ImageFile(:integral_vector_calculus, imgfile, caption)
-```
-
-```julia; hold=true; echo=false
-imgfile ="figures/chrysler-nano-block.png"
-caption = """
-Computing the volume of a nano-block construction of the Chrysler building is easier than trying to find the exact volume of the actual building, as we can easily compute the volume of columns of equal-sized blocks. Riemann sums are similar.
-"""
-
-ImageFile(:integral_vector_calculus, imgfile, caption)
-```
-
-The definition of the multi-dimensional integral is more involved than the one-dimensional case due to the possibly increased complexity of the region. This will require additional [steps](https://math.okstate.edu/people/lebl/osu4153-s16/chapter10-ver1.pdf). The basic approach is as follows.
-
-First, let $R = [a_1, b_1] \times [a_2, b_2] \times \cdots \times [a_n, b_n]$ be a closed rectangular region. If $n=2$, this is a rectangle, and if $n=3$, a box. We begin by defining integration over closed rectangular regions. For each side, a partition $P_i$ is chosen based on $a_i = x_{i0} < x_{i1} < \cdots < x_{ik} = b_i$. Then a sub-rectangular region would be of the form $R' = P_{1j_1} \times P_{2j_2} \times \cdots \times P_{nj_n}$, where $P_{ij_i}$ is one of the partitioning sub-intervals of $[a_i, b_i]$. Set $\Delta R' = \Delta P_{1j_1} \cdot \Delta P_{2j_2} \cdot\cdots\cdot\Delta P_{nj_n}$ to be the $n$-dimensional volume of the sub-rectangular region.
-
-For each sub-rectangular region, we can define $m(f,R')$ to be $\inf_{u \text{ in } R'} f(u)$ and $M(f, R') = \sup_{u \text{ in } R'} f(u)$. 
If we enumerate all the sub-rectangular regions, we can define $m(f, P) = \sum_i m(f, R_i) \Delta R_i$ and $M(f,P) = \sum_i M(f, R_i)\Delta R_i$, as in the one-dimensional case. These are upper and lower sums, and, as before, would bound the Riemann sum formed by choosing any $c_i$ in $R_i$ and computing $S(f,P) = \sum_i f(c_i) \Delta R_i$. - - - -As with the one-dimensional case, $f$ is Riemann integrable over $R$ if the limits of $m(f,P)$ and $M(f,P)$ exist and are identical as the diameter of the partition (defined as the largest diameter of each side) goes to $0$. If the limits are equal, then so is the limit of any Riemann sum. - - -When $f$ is Riemann integrable over a rectangular region $R$, we denote the limit by any of: - -```math -\iint_R f(x) dV, \quad \iint_R fdV, \quad \iint_R f(x_1, \dots, x_n) dx_1 \cdot\cdots\cdot dx_n, \quad\iint_R f(\vec{x}) d\vec{x}. -``` - -A key fact, requiring proof, is: - -> Any continuous function, $f$, is Riemann integrable over a closed, bounded rectangular region. - ----- - -As with one-dimensional integrals, from the Riemann sum definition, several familiar properties for integrals follow. Let $V(R)$ be the volume of $R$ found by multiplying the side-lengths together. - - -**Constants:** - -* A constant is Riemann integrable and: $\iint_R c dV = c V(R)$. - - -**Linearity:** - - -* For integrable $f$ and $g$ and constants $a$ and $b$: -```math -\iint_R (af(x) + bg(x))dV = a\iint_R f(x)dV + b\iint_R g(x) dV. -``` - -**Disjoint:** - -* If $R$ and $R'$ are *disjoint* rectangular regions (possibly sharing a boundary), then the integral over the union is defined by linearity: - -```math -\iint_{R \cup R'} f(x) dV = \iint_R f(x)dV + \iint_{R'} f(x) dV. -``` - -**Monotonicity:** - -* As $f$ is bounded, let $m \leq f(x) \leq M$ for all $x$ in $R$. Then - -```math -m V(R) \leq \iint_R f(x) dV \leq MV(R). 
-```
-
-* If $f$ and $g$ are integrable *and* $f(x) \leq g(x)$, then the integrals have the same property, namely $\iint_R f dV \leq \iint_R gdV$.
-
-* If $S \subset R$, both closed rectangles, then if $f$ is integrable over $R$ it will also be integrable over $S$ and, when $f\geq 0$, $\iint_S f dV \leq \iint_R fdV$.
-
-**Triangle inequality:**
-
-* If $f$ is bounded and integrable, then $|\iint_R fdV| \leq \iint_R |f| dV$.
-
-### HCubature
-
-Numerically computing multidimensional integrals over rectangular regions in `Julia` is efficiently done with the `HCubature` package. The `hcubature` function is defined for $n$-dimensional integrals, so the integrand is specified through a function which takes a vector as an input. The region to integrate over must be of rectangular form. It is specified by a tuple of left endpoints and a tuple of right endpoints, whose order matches the order of the variables in the vector.
-
-To elaborate, if we think of $f(\vec{x}) = f(x_1, x_2, \dots, x_n)$ and we are integrating over $[a_1, b_1] \times \cdots \times [a_n, b_n]$, then the region would be specified through two tuples: `(a1, a2, ..., an)` and `(b1, b2, ..., bn)`.
-
-
-To illustrate, to integrate the function $f(x,y) = x^2 + 5y^2$ over the region $[0,1] \times [0,2]$ using `HCubature`'s `hcubature` function, we would proceed as follows:
-
-
-```julia; hold=true
-f(x,y) = x^2 + 5y^2
-f(v) = f(v...) # f accepts a vector
-a0, b0 = 0, 1
-a1, b1 = 0, 2
-hcubature(f, (a0, a1), (b0, b1))
-```
-
-The computed value and a worst-case estimate for the error is returned, in a manner similar to the `quadgk` function (from the `QuadGK` package) used previously for one-dimensional numeric integrals.
-
-The order above is `x` then `y`, which is clear from the first definition of `f` and from the tuples passed to `hcubature`. A more convenient use is to just put the constants into the function call, as in `hcubature(f, (0,0), (1,2))`. 
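-The value returned by `hcubature` can be cross-checked against the Riemann-sum definition directly. A minimal midpoint-rule sketch in base `Julia` (the helper name `riemann2d` is ours for illustration, not part of any package):
-
-```julia
-# Midpoint Riemann sum over [a1,b1] × [a2,b2] using an n×n grid.
-# `riemann2d` is an illustrative helper, not part of HCubature.
-function riemann2d(f, a1, b1, a2, b2; n=100)
-    Δx, Δy = (b1 - a1)/n, (b2 - a2)/n
-    xs = a1 .+ ((1:n) .- 0.5) .* Δx   # cell midpoints in x
-    ys = a2 .+ ((1:n) .- 0.5) .* Δy   # cell midpoints in y
-    sum(f(x, y) * Δx * Δy for x in xs, y in ys)
-end
-
-riemann2d((x,y) -> x^2 + 5y^2, 0, 1, 0, 2)  # ≈ 14, agreeing with hcubature
-```
-
-As the partition is refined (larger `n`), this converges to the limit `hcubature` estimates; here the exact answer is $\int_0^1\int_0^2 (x^2 + 5y^2) dy dx = 14$.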
-
-
-##### Example
-
-Let's verify the numeric approach works for figures where an answer is known from the geometry of the problem.
-
-* A constant function $c=f(x,y)$. In this case, the volume is simply a box, so the volume will come from multiplying the three dimensions. Here is an example:
-
-```julia; hold=true
-f(x,y) = 3
-f(v) = f(v...)
-a0, b0 = 0, 4
-a1, b1 = 0, 5 # R is area 20, so V = 60 = 3 ⋅ 20
-hcubature(f, (a0, a1), (b0, b1))
-```
-
-* A wedge. Let $f(x,y) = x$ and $R= [0,1] \times [0,1]$. Then the volume is a wedge, and should be half the value of the unit cube, or simply $1/2$:
-
-```julia; hold=true
-f(x,y) = x
-f(v) = f(v...)
-a0, b0 = 0, 1
-a1, b1 = 0, 1
-hcubature(f, (a0, a1), (b0, b1))
-```
-
-
-* The volume of a right square pyramid is $V=(1/3)a^2 h$, or a third of an enclosing box. We computed this volume previously using the method of [slices](../integrals/volumes_slice.html). Here we do it thinking of the pyramid as the volume formed by the surface over the region $[-a/2,a/2] \times [-a/2,a/2]$ generated by $f(x,y) = h \cdot (l(x,y) - d(x,y))/l(x,y)$ where $d(x,y)$ is the distance to the origin, or $\sqrt{x^2 + y^2}$, and $l(x,y)$ is the length of the line segment from the origin to the boundary of $R$ that goes through $(x,y)$.
-
-Identifying a formula for this is a bit tricky. Here we use a brute force approach; later we will simplify this. Using polar coordinates, we know $r\cos(\theta) = a/2$ describes the line $x=a/2$ and $r\sin(\theta)=a/2$ describes the line $y=a/2$. Using the square, we have to alternate between these depending on where $\theta$ is (e.g., between $-\pi/4$ and $\pi/4$ the boundary is $r\cos(\theta)=a/2$, so $l(x,y) = (a/2)/\cos(\theta)$). 
We write a function for this:
-
-```julia;
-𝒅(x, y) = sqrt(x^2 + y^2)
-function 𝒍(x, y, a)
-    theta = atan(y,x)
-    atheta = abs(theta)
-    if (pi/4 <= atheta < 3pi/4) # this is the y=a/2 or y=-a/2 case
-        (a/2)/sin(atheta)
-    else
-        (a/2)/abs(cos(atheta))
-    end
-end
-```
-
-And then
-
-```julia;
-𝒇(x,y,a,h) = h * (𝒍(x,y,a) - 𝒅(x,y))/𝒍(x,y,a)
-𝒂, 𝒉 = 2, 3
-𝒇(x,y) = 𝒇(x, y, 𝒂, 𝒉) # fix a and h
-𝒇(v) = 𝒇(v...)
-```
-
-We can visualize the volume to be computed, as follows:
-
-```julia; hold=true
-xs = ys = range(-1, 1, length=20)
-surface(xs, ys, 𝒇)
-```
-
-Trying this, we have:
-
-```julia;
-hcubature(𝒇, (-𝒂/2, -𝒂/2), (𝒂/2, 𝒂/2))
-```
-
-The answer agrees with that known from the formula, $4 = (1/3)a^2 h$, but takes a long time to be produced. The `hcubature` function is slow with functions defined in terms of conditions. For this problem, volumes by [slicing](../integrals/volumes_slice.html) is more direct. But symmetry can also be used: the volume above the triangular region formed by the $x$-axis, the line $x=a/2$, and the line $y=x$ is $1/8$th the total volume. (On that region $l(x,y) = (a/2)/\cos(\tan^{-1}(y,x))$.)
-
-
-* The volume of a sphere is $4/3 \pi r^3$. We could verify this by integrating $z = f(x,y) = \sqrt{r^2 - (x^2 + y^2)}$ over $R = \{(x,y): x^2 + y^2 \leq r^2\}$. *However*, this is not a *rectangular* region, so we couldn't directly proceed.
-
-We might try integrating a function with a condition:
-
-```julia; hold=true
-function f(x, y, r)
-    if x^2 + y^2 < r^2
-        sqrt(r^2 - (x^2 + y^2))
-    else
-        0.0
-    end
-end
-```
-
-**But** `hcubature` is **very** slow to integrate such functions. We will see our instincts are good -- this is the approach taken to discuss integrals over general regions -- but this is not practical here. There are two alternative approaches to be discussed: approach the integral *iteratively* or *transform* the circular region into a rectangular region and integrate. 
Before doing so, we discuss how the integral is developed for more general regions.
-
-
-!!! note
-    The approach above takes a nice smooth function and makes it non-smooth at the boundary. In general this is not a good idea for numeric solutions, as many algorithms work better with assumptions of smoothness.
-
-!!! note
-    The `Quadrature` package provides a uniform interface for `QuadGK`, `HCubature`, and other numeric integration routines available in `Julia`.
-
-## Integrals over more general regions
-
-To proceed further, it is necessary to discuss certain types of sets that will be used to describe the boundaries of regions that can be integrated over, though we don't dig into the details.
-
-Let the *measure* of a rectangular region be its volume and for any subset $S \subset R^n$, define the *outer* measure of $S$ by $m^*(S) = \inf\sum_{j=1}^\infty V(R_j)$, where the infimum is taken over all countable collections of closed rectangles with $S \subset \cup_{j=1}^\infty R_j$.
-
-In two dimensions, if $S$ is viewed on a grid, then this would be the *area* of the smallest collection of cells that contain any part of $S$. This is the smallest this value takes as the grid becomes infinite.
-
-For the following graph, there are $100$ cells, each of area $8/100$. There are 58 cells covering the curve and its interior. So the outer measure is less than $58\cdot 8/100$, as this is just one possible covering.
-
-```julia; hold=true; echo=false
-function cassini(theta)
-    a, b = .75, 1
-    A = 1; B = -2a^2*cos(2theta)
-    C = a^4 - b^4
-    (-B - sqrt(B^2 - 4*A*C))/(2A)
-end
-
-polar_plot(r, a, b) = plot(t -> r(t)*cos(t), t->r(t)*sin(t), a, b, legend=false, linewidth=3)
-p = polar_plot(cassini, 0, 8pi)
-n=10
-a1,b1 = -1, 1
-a2, b2 = -2, 2
-for a in range(a1, b1, length=n+1)
-    for b in range(a2, b2, length=n+1)
-        plot!(p, [a,a],[a2, b2], alpha=0.75)
-        plot!(p, [a1,b1],[b,b], alpha=0.75)
-    end
-end
-p
-```
-
-
-A set has measure $0$ if the outer measure is $0$. 
An alternate definition, among other characterizations, is that a set has measure $0$ if for any $\epsilon > 0$ there exist rectangular regions $R_1, R_2, \dots, R_n$ (for some $n$) covering the set with $\sum V(R_i) < \epsilon$. Measure-zero sets have many properties not discussed here.
-
-
-For now, let's see that the graph of $y=f(x)$ over $[a,b]$, as a two-dimensional set, has measure zero when $f(x)$ has a bounded derivative ($|f'|$ bounded by $M$). Fix some $\epsilon>0$. Take $n$ with $2M(b-a)^2/n < \epsilon$, then divide $[a,b]$ into $n$ equal-length intervals (of length $\delta = (b-a)/n$). For each interval, we consider the box $[a_i, b_i] \times [f(a_i)-\delta M, f(a_i) + \delta M]$. By the mean value theorem, we have $|f(x) - f(a_i)| \leq |b_i-a_i|M$, so $f(a_i) - \delta M \leq f(x) \leq f(a_i) + \delta M$, and the curve will stay in the boxes. These boxes have total area $n \cdot \delta \cdot 2\delta M = 2M(b-a)^2/n$, an area less than $\epsilon$.
-
-The above can be extended to any graph of a continuous function over $[a,b]$.
-
-For a function $f$ the set of discontinuities in $R$ is all points where $f$ is not continuous. A formal definition is often given in terms of oscillation. Let $o(f, \vec{x}, \delta) = \sup_{\{\vec{y} : \| \vec{y}-\vec{x}\| < \delta\}}f(\vec{y}) - \inf_{\{\vec{y}: \|\vec{y}-\vec{x}\|<\delta\}}f(\vec{y})$. A function is discontinuous at $\vec{x}$ if the limit as $\delta \rightarrow 0+$ (which must exist) is not $0$.
-
-With this, we can state the Riemann-Lebesgue theorem on integrable functions:
-
-> Let $R$ be a closed, rectangular region, and $f:R^n \rightarrow R$ a bounded function. Then $f$ is Riemann integrable over $R$ if and only if the set of discontinuities is a set of measure $0$.
-
-It was said at the outset we would generalize the regions we can integrate over, but this theorem generalizes the functions. We can tie the two together as follows. Define the integral over any *bounded* set $S$ with boundary of measure $0$. 
Bounded means $S$ is contained in some bounded rectangle $R$. Let $f$ be defined on $S$ and extend it to be $0$ on points in $R$ that are not in $S$. If this extended function is integrable over $R$, then we can define the integral over $S$ in terms of that. This is why the *boundary* of $S$ must have measure zero, as in general it is among the set of discontinuities of the extended function $f$. Such regions are also called Jordan regions.
-
-
-## Fubini's theorem
-
-Consider again this figure
-
-```julia; hold=true; echo=false
-function cassini(theta)
-    a, b = .75, 1
-    A = 1; B = -2a^2*cos(2theta)
-    C = a^4 - b^4
-    (-B - sqrt(B^2 - 4*A*C))/(2A)
-end
-
-polar_plot(r, a, b) = plot(t -> r(t)*cos(t), t->r(t)*sin(t), a, b, legend=false, linewidth=3)
-p = polar_plot(cassini, 0, 8pi)
-n=10
-a1,b1 = -1, 1
-a2, b2 = -2, 2
-for a in range(a1, b1, length=n+1)
-    for b in range(a2, b2, length=n+1)
-        plot!(p, [a,a],[a2, b2], alpha=0.75)
-        plot!(p, [a1,b1],[b,b], alpha=0.75)
-    end
-end
-p
-```
-
-
-Let $C_i$ enumerate all the cells shown, assume $f$ is extended to be $0$ outside the region, and let $c_i$ be a point in the cell. Then the Riemann sum $\sum_i f(c_i) V(C_i)$ can be visualized in three equivalent ways:
-
-* as a linear sum over the indices $i$, as written, leading to $\iint_R f(x) dV$.
-* by indexing the cells by row ($i$) and column ($j$) and summing as $\sum_i (\sum_j f(x_{ij}, y_{ij}) \Delta y_j) \Delta x_i$.
-* by indexing the cells by row ($i$) and column ($j$) and summing as $\sum_j (\sum_i f(x_{ij}, y_{ij}) \Delta x_i) \Delta y_j$.
-
-The last two suggest that their limit will be *iterated* integrals of the form $\int_{-1}^1 (\int_{-2}^2 f(x,y) dy) dx$ and $\int_{-2}^2 (\int_{-1}^1 f(x,y) dx) dy$.
-
-By "iterated" we mean performing two different definite integrals. For example, to compute $\int_{-1}^1 (\int_{-2}^2 f(x,y) dy) dx$ the first task would be to compute $I(x) = \int_{-2}^2 f(x,y) dy$. 
Like partial derivatives, this integrates in $y$ while treating $x$ as a constant. Once the interior integral is computed, then the integral $\int_{-1}^1 I(x) dx$ would be computed to find the answer.
-
-
-The question then is: under what conditions will the three integrals be equal?
-
-> [Fubini](https://math.okstate.edu/people/lebl/osu4153-s16/chapter10-ver1.pdf). Let $R \times S$ be a closed rectangular region in $R^n \times R^m$. Suppose $f$ is bounded. Define $f_x(y) = f(x,y)$ and $f^y(x) = f(x,y)$ where $x$ is in $R^n$ and $y$ in $R^m$. *If* each $f_x$ and each $f^y$ is integrable, then
-> ```math
-> \iint_{R\times S}fdV = \iint_R \left(\iint_S f_x(y) dy\right) dx
-> = \iint_S \left(\iint_R f^y(x) dx\right) dy.
-> ```
-
-An immediate corollary is that the above holds for continuous functions when $R$ and $S$ are bounded, the case described here.
-
-The case of continuous functions was known to [Euler](https://en.wikipedia.org/wiki/Fubini%27s_theorem#History); Lebesgue (1904) discussed bounded functions, as in our statement; and Fubini and Tonelli (1907 and 1909) generalized the statement to more general functions than continuous functions, thereby earning naming rights.
-
-In [Ferzola](https://doi.org/10.2307/2687130) we can read a summary of Euler's thinking of 1769 when trying to understand the integral of a function $f(x,y)$ over a bounded domain $R$ enclosed by arcs in the $x$-$y$ plane. (That is, the region above $g(x)$ and below $h(x)$ over the interval $[a,b]$.) Euler wrote the answer as $\int_a^b dx (\int_{g(x)}^{h(x)} f(x,y)dy)$. Ferzola writes that Euler saw this integral yielding a *volume* as the integral $\int_{g(x)}^{h(x)} f(x,y)dy$ gives the area of a slice (parallel to the $y$ axis) and integrating in $x$ adds these slices to give a volume. This is the typical usage of Fubini's theorem today.
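-Fubini's conclusion is easy to check numerically over a rectangle: an iterated Riemann sum gives the same answer whichever variable is summed first. A small base-`Julia` illustration (the setup is ours, not from a package):
-
-```julia
-# Iterated midpoint sums for f(x,y) = x*y + sin(x) over [0,1] × [0,2].
-# Both orders add up the same cell values, so they agree (Fubini), and
-# both approximate the double integral, which here is exactly 3 - 2cos(1).
-f(x, y) = x*y + sin(x)
-n = 200
-Δx, Δy = 1/n, 2/n
-xs = ((1:n) .- 0.5) .* Δx            # midpoints of [0,1]
-ys = ((1:n) .- 0.5) .* Δy            # midpoints of [0,2]
-
-y_then_x = sum(sum(f(x,y)*Δy for y in ys)*Δx for x in xs)
-x_then_y = sum(sum(f(x,y)*Δx for x in xs)*Δy for y in ys)
-y_then_x - x_then_y                  # essentially 0
-```
-
-The two orders differ only by floating-point roundoff, while a genuinely different pairing of cells would require the hypotheses of the theorem.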
-
-```julia; hold=true; echo=false
-imgfile ="figures/strang-slicing.png"
-caption = L"""Figure 14.2 of Strang illustrating the slice when either $x$ is fixed or $y$ is fixed. The inner integral computes the shared area, the outer integral adds the areas up to compute volume."""
-
-ImageFile(:integral_vector_calculus, imgfile, caption)
-```
-
-
-In [Volumes](../integrals/volumes_slice.html) the formula for a volume with a known cross-sectional area is given by $V = \int_a^b CA(x) dx$. The inner integral, $\int_{R_x} f(x,y) dy$, is a function depending on $x$ that yields the area of the slice (where $R_x$ is the region sliced by the line of constant $x$ value). This is consistent with Euler's view of the iterated integral.
-
-
-
-A domain, as described above, is known as a [normal](https://en.wikipedia.org/wiki/Multiple_integral#Normal_domains_on_R2) domain. Using Fubini's theorem to integrate iteratively, employing the fundamental theorem of calculus at each step, is the standard approach.
-
-
-For example, we return to the problem of a square pyramid, only now, using symmetry, we integrate only over the triangular region between $0 \leq x \leq a/2$ and $0 \leq y \leq x$. The answer is then (the $8$ by symmetry)
-
-```math
-V = 8 \int_0^{a/2} \int_0^x h(l(x,y) - d(x,y))/l(x,y) dy dx.
-```
-
-But, using similar triangles, we have $d/x = l/(a/2)$, so $(l-d)/l = 1 - 2x/a$. Continuing, our answer becomes
-
-```math
-V = 8 \int_0^{a/2} \left(\int_0^x h(1-\frac{2x}{a}) dy\right) dx =
-8 \int_0^{a/2} h(1-2x/a) \cdot x \, dx =
-8h\left(\frac{x^2}{2}\big\lvert_{0}^{a/2} - \frac{2}{a}\frac{x^3}{3}\big\lvert_0^{a/2}\right) =
-8h\left(\frac{a^2}{8} - \frac{a^2}{12}\right) = \frac{a^2h}{3}.
-```
-
-
-
-### `SymPy`'s `integrate`
-
-
-The `integrate` function of `SymPy` uses various algorithms to symbolically integrate definite (and indefinite) integrals. In the section on [integrals](../integrals/ftc.html) its use for one-dimensional integrals was shown. 
For multi-dimensional integrals the usage is similar, with the syntax roughly following the iterated-integral notation.
-
-For example, to perform the integral
-
-```math
-\int_a^b \int_{h(x)}^{g(x)} f(x,y) dy dx
-```
-
-the call would look like:
-
-```julia; eval=false
-integrate(f(x,y), (y, h(x), g(x)), (x, a, b))
-```
-
-That is, the variable to integrate and the endpoints are passed as tuples. (Unlike `hcubature`, which always uses two tuples to specify the bounds, `integrate` uses $n$ tuples to specify an $n$-dimensional integral.) The iteration happens from left to right, so in the above the `y` integral is done (and, as seen, may depend on the variable `x`) and then the `x` integral is performed. The above uses `f(x,y)`, `h(x)` and `g(x)`, but these may be simple symbolic expressions and not function calls using symbolic variables.
-
-We define `x` and `y` below for use throughout:
-
-```julia
-@syms x::real y::real z::real
-```
-
-
-##### Example
-
-For example, the last integral, to compute the volume of a square pyramid, could be computed through
-
-```julia; hold=true
-@syms a height
-8 * integrate(height * (1 - 2x/a), (y, 0, x), (x, 0, a/2))
-```
-
-
-##### Example
-
-Find the integral $\int_0^1\int_{y^2}^1 y \sin(x^2) dx dy$.
-
-Without concerning ourselves with what or why, we just translate:
-
-```julia; hold=true
-integrate( y * sin(x^2), (x, y^2, 1), (y, 0, 1))
-```
-
-##### Example
-
-Find the volume enclosed by $y = x^2$, $y = 5$, $z = x^2$, and $z = 0$.
-
-The limits on $z$ say this is the volume under the surface $f(x,y) = x^2$, over the region defined by $y=5$ and $y = x^2$. The region is bounded by the parabola, with $y$ running from $x^2$ to $5$, while $x$ ranges from $-\sqrt{5}$ to $\sqrt{5}$.
-
-```julia; hold=true
-f(x, y) = x^2
-h(x) = x^2
-g(x) = 5
-integrate(f(x,y), (y, h(x), g(x)), (x, -sqrt(Sym(5)), sqrt(Sym(5))))
-```
-
-##### Example
-
-Find the volume above the $x$-$y$ plane when a cylinder, $x^2 + y^2 = 2^2$, is intersected by a plane $3x + 4y + 5z = 6$.
-
-We solve for $z = (1/5)\cdot(6 - 3x - 4y)$ and take $R$ as the disk at the origin of radius $2$:
-
-```julia; hold=true
-f(x,y) = 6 - 3x - 4y
-g(x) = sqrt(2^2 - x^2)
-h(x) = -sqrt(2^2 - x^2)
-(1//5) * integrate(f(x,y), (y, h(x), g(x)), (x, -2, 2))
-```
-
-
-
-##### Example
-
-Find the volume:
-
-* in the first octant
-* bounded by $x+y+z = 10$, $2x + 3y = 20$, and $x + 3y = 10$
-
-The first plane can be expressed as $z = f(x,y) = 10 - x - y$ and the volume is that below the surface of $f$ over the region $R$ formed by the two lines and the $x$ and $y$ axes. Plotting that we have:
-
-```julia; hold=true
-g1(x) = (20 - 2x)/3
-g2(x) = (10 - x)/3
-plot(g1, 0, 20)
-plot!(g2, 0, 20)
-```
-
-We see the intersection is when $x=10$, so this becomes
-
-```julia; hold=true
-f(x,y) = 10 - x - y
-h(x) = (10 - x)/3
-g(x) = (20 - 2x)/3
-integrate(f(x,y), (y, h(x), g(x)), (x, 0, 10))
-```
-
-
-
-
-
-##### Example
-
-Let $r=1$ and define three cylinders along the $x$, $y$, and $z$ axes by: $y^2+z^2 = r^2$, $x^2 + z^2 = r^2$, and $x^2 + y^2 = r^2$. What is the enclosed [volume](http://mathworld.wolfram.com/SteinmetzSolid.html)?
-
-Using the cylinder along the $z$ axis, we have the volume sits above and below the disk $R: x^2 + y^2 \leq r^2$. By symmetry, we can double the volume that sits above the disk to answer the question.
-
-
-
-Using symmetry, we can tell that the wedge between $y=0$, $y=x$, and $x^2 + y^2 \leq 1$ (corresponding to a polar angle in $[0,\pi/4]$) in $R$ contains $1/8$ the volume of the top, so $1/16$ of the total.
-
-
-```julia; hold=true; echo=false
-rad(theta) = 1
-plot(t -> rad(t)*cos(t), t -> rad(t)*sin(t), 0, pi/4, legend=false, linewidth=3)
-plot!([0,cos(pi/4)], [0, sin(pi/4)], linewidth=3)
-plot!([0, 1], [0, 0], linewidth=3)
-plot!([cos(pi/4), cos(pi/4)], [0, sin(pi/4)], linewidth=3)
-```
-
-
-Over this wedge the height is given by the cylinder along the $y$ axis, $x^2 + z^2 = r^2$. We *could* break this wedge into a triangle and a circular segment to integrate piece by piece. However, from the figure we can integrate in the $y$ direction on the outside and use only one integral:
-
-
-```julia; hold=true
-r = 1 # if using r as a symbolic variable specify `positive=true`
-f(x,y) = sqrt(r^2 - x^2)
-16 * integrate(f(x,y), (x, y, sqrt(r^2-y^2)), (y, 0, r*cos(PI/4)))
-```
-
-##### Example
-
-Find the volume under $f(x,y) = xy$ over the sector swept out by $r(\theta) = 1$ as $\theta$ goes from $0$ to $\pi/4$.
-
-The region $R$ is the same as the last one. As seen, it requires two pieces when described as a function of $x$, but needs only one as a function of $y$, so we use that below:
-
-```julia; hold=true
-f(x,y) = x*y
-g(y) = sqrt(1 - y^2)
-h(y) = y
-integrate(f(x,y), (x, h(y), g(y)), (y, 0, sin(PI/4)))
-```
-
-##### Example: Average value
-
-The average value of a function, $f(x,y)$, over a region $R$ is the integral of $f$ over $R$ divided by the area of $R$. It can be computed through two integrals, as below.
-
-Let $R$ be the region in the first quadrant bounded by $y = x$, the $x$ axis, and $x = 1$, and let $f(x,y) = x^2 + y^2$. Find the average value of $f$ over $R$.
-
-```julia; hold=true
-f(x,y) = x^2 + y^2
-g(x) = x # the boundary y = x
-h(x) = 0
-A = integrate(f(x,y), (y, h(x), g(x)), (x, 0, 1))
-B = integrate(Sym(1), (y, h(x), g(x)), (x, 0, 1))
-A/B
-```
-
-(We integrate `Sym(1)` and not just `1`, as we either need to have a symbolic value for the first argument or use the `sympy.integrate` method directly.)
-
-##### Example: Density
-
-The area of a region $R$ can be computed by $\iint_R 1 dA$.
If the region is physical, say a disc, then its mass can be of interest. If the mass is uniform with density $\rho$, then the mass would be $\iint_R \rho dA$. If the mass is non-uniform, say it is a function $\rho(x,y)$, then the integral to find the mass becomes $\iint_R \rho(x,y) dA$. (In a Riemann sum, the term $\rho(c_{ij}) \Delta x_i\Delta y_j$ would approximate the mass of a small constant-density piece; the integral just adds these up to find the total mass.)
-
-Find the mass of the region bounded by the two parabolas $y=2 - x^2$ and $y = -3 + 2x^2$ with density function given by $\rho(x,y) = x^2y^2$.
-
-First we need the intersection points of the two parabolas. Solving $2-x^2 = -3 + 2x^2$ for $x$ yields $3x^2 = 5$, or $x = \pm\sqrt{5/3}$.
-
-So we get a mass of:
-
-```julia; hold=true
-rho(x,y) = x^2*y^2
-g(x) = 2 - x^2
-h(x) = -3 + 2x^2
-a = sqrt(Sym(5)/3)
-integrate(rho(x,y), (y, h(x), g(x)), (x, -a, a))
-```
-
-
-
-
-##### Example (Strang)
-
-Integrate $\int_0^1 \int_y^1 \cos(x^2) dx dy$ avoiding the *impossible* integral of $\cos(x^2)$. As the integrand is continuous, Fubini's Theorem allows the interchange of the order of integration. The region, $R$, is a triangle in the first quadrant below the line $y=x$ and left of the line $x=1$. So we have:
-
-```math
-\int_0^1 \int_0^x \cos(x^2) dy dx
-```
-
-We can integrate this, as the interior integral leaves $x \cos(x^2)$ to integrate:
-
-```julia;
-integrate(cos(x^2), (y, 0, x), (x, 0, 1))
-```
-
-
-
-
-
-
-### A "Fubini" function
-
-The computationally efficient way to perform multiple integrals numerically is to use `hcubature`. However, that function is defined only for *rectangular* regions. For a non-rectangular region, the performant approach is to find a suitable transformation (below).
-
-For simple problems, though, where ease of expressing a region is preferred to computational efficiency, something can be implemented using repeated uses of `quadgk`.
Again, this approach isn't recommended, save for its connection to how iterated integration is approached algebraically.
-
-
-In the `CalculusWithJulia` package, the `fubini` function is provided. For these notes, we define three operations using Unicode operators entered with `\int[tab]`, `\iint[tab]`, `\iiint[tab]`. (Defining these directly better shows the mechanics involved.)
-
-```julia;
-# adjust endpoints when expressed as functions of outer variables
-callf(f::Number, x) = f
-callf(f, x) = f(x...)
-endpoints(ys, x) = callf.(ys, Ref(x))
-
-# integrate f(x) dx
-∫(@nospecialize(f), xs) = quadgk(f, xs...)[1] # @nospecialize is not necessary, but offers a speed boost
-
-# integrate int_a^b int_h(x)^g(x) f(x,y) dy dx
-∬(f, ys, xs) = ∫(x -> ∫(y -> f(x,y), endpoints(ys, x)), xs)
-
-# integrate f(x,y,z) dz dy dx
-∭(f, zs, ys, xs) = ∫(
-    x -> ∫(
-        y -> ∫(
-            z -> f(x,y,z),
-            endpoints(zs, (x,y))),
-        endpoints(ys,x)),
-    xs)
-```
-
-##### Example
-
-Compare the integral of $f(x,y) = \exp(-x^2 -2y^2)$ over the region $R=[0,3]\times[0,3]$ using `hcubature` and the above.
-
-```julia; hold=true
-f(x,y) = exp(-x^2 - 2y^2)
-f(v) = f(v...)
-hcubature(f, (0,0), (3,3)) # (a0, a1), (b0, b1)
-```
-
-```julia; hold=true
-f(x,y) = exp(-x^2 - 2y^2)
-∬(f, (0,3), (0,3)) # (a1, b1), (a0, b0)
-```
-
-
-
-
-##### Example
-
-Show the area of the unit circle is $\pi$ using the "Fubini" function.
-
-```julia; hold=true
-f(x,y) = 1
-a = ∬(f, (x-> -sqrt(1-x^2), x-> sqrt(1-x^2)), (-1, 1))
-a, a - pi # answer and error
-```
-
-(The error is similar to that returned by `quadgk(x -> sqrt(1-x^2), -1, 1)`.)
-
-
-##### Example
-
-
-Show the volume of a sphere of radius $1$ is $4\pi/3 = (4/3)\pi\cdot 1^3$ by doubling the integral of $f(x,y) = \sqrt{1-x^2-y^2}$ over $R$, the unit disk.
-
-```julia; hold=true
-f(x,y) = sqrt(1 - x^2 - y^2)
-a = 2 * ∬(f, (x-> -sqrt(1-x^2), x-> sqrt(1-x^2)), (-1, 1))
-a, a - 4/3*pi
-```
-
-##### Example
-
-Numeric integrals don't need to worry about integrands without antiderivatives.
Their concerns are highly oscillatory integrands. Here we compute -$\int_0^1 \int_y^1 \cos(x^2) dx dy$ directly. The limits are in a different order than the "Fubini" function expects, so we switch the variables: - -```julia; -∬((y,x) -> cos(x^2), (y -> y, 1), (0, 1)) -``` - -Compare to - -```julia; -sin(1)/2 -``` - -## Triple integrals - -Triple integrals are identical in theory to double integrals, though the computations can be more involved and the regions more complicated to describe. The main regions (emphasized by Strang) to understand are: box, prism, cylinder, cone, tetrahedron, and sphere. - -```julia; echo=false -let - ts = range(0, pi/2, length=50) - O = [0,0,0] - bx, by,bz = [1, 0, 0], [0,2,0], [0,0,3] - - p = plot(unzip([O])..., legend=false, title="box", - axis=nothing) - arrow!(p, O, bx), arrow!(p, O, by), arrow!(p, O, bz) - arrow!(p, bx, bz), arrow!(p, bx, by) - arrow!(p, by, bx), arrow!(p, by, bz) - arrow!(p, bz, bx), arrow!(p, bz, by) - arrow!(p, bx+by, bz), - arrow!(p, bx+by+bz, -bx), arrow!(p, bx+by+bz, -by) - ps = [p] - - p = plot(unzip([O])..., legend=false, title="prism", - axis=nothing) - arrow!(p, O, bx), arrow!(p, O, by), - arrow!(p, bx, by), arrow!(p, by, bx), - arrow!(p, O, bz), arrow!(p, by, bz) - arrow!(p, bz, by) - arrow!(p, bx, bz-bx), arrow!(p, bx + by, by+bz - (bx + by)) - push!(ps, p) - - p = plot(unzip([O])..., legend=false, title="tetrahedron", - axis=nothing) - arrow!(p, O, bx), arrow!(p, O, by), arrow!(p, O, bz) - arrow!(p, bx, by-bx) - arrow!(p, bx, bz-bx), arrow!(p, by, bz-by) - push!(ps, p) - - p = plot(unzip([O])..., legend=false, title="cone", - camera=(70,20), axis=nothing) - arrow!(p, O, bx), arrow!(p, O, by/2), arrow!(p, O, bz) - arrow!(p, bx, bz-bx), arrow!(p, by/2, bz-by/2) - - for h in range(0.1, 0.9, length=5) - z = 3*h - r = 1 - h - plot!(p, r*cos.(ts), r*sin.(ts), z .+ 0*ts) - end - push!(ps, p) - - p = plot(unzip([O])..., legend=false, title="sphere", - camera=(70,20), axis=nothing) - Os = 0 * ts - plot!(p, 
cos.(ts), sin.(ts), Os)
-    plot!(p, cos.(ts), Os, sin.(ts))
-    plot!(p, Os, cos.(ts), sin.(ts))
-    for h in .2:.2:.8
-        r = sqrt(1 - h^2)
-        plot!(p, r*cos.(ts), r*sin.(ts), h .+ Os)
-    end
-    push!(ps, p)
-
-    l = @layout [a b; c d; e]
-    plot(ps..., layout=l)
-end
-```
-
-
-Here we compute the volumes of these using a triple integral of the form $\iiint_R 1 dV$.
-
-
-* Box. Consider the box-like, or "rectangular," region $[0,a]\times [0,b] \times [0,c]$. This has volume $abc$, which we see here using Fubini's theorem:
-
-```julia; hold=true
-@syms a b c
-f(x,y,z) = Sym(1) # need to integrate a symbolic object in integrand or call `sympy.integrate`
-integrate(f(x,y,z), (x, 0, a), (y, 0, b), (z, 0, c))
-```
-
-* Prism. Consider a prism, or wedge, formed by $ay + bz = 1$ with $a,b > 0$, over $0 \leq x \leq c$ in the first octant. Find its volume.
-
-The height to integrate is $z = (1 - ay)/b$ over the rectangle $[0,c] \times [0,1/a]$:
-
-```julia; hold=true
-@syms a b c
-f(x,y,z) = Sym(1)
-integrate(f(x,y,z), (z, 0, (1 - a*y)/b), (y, 0, 1/a), (x, 0, c))
-```
-
-This, as expected, is half the volume of the box $[0,c] \times [0, 1/a] \times [0, 1/b]$.
-
-* Tetrahedron. Consider the volume formed by $x,y,z \geq 0$ and bounded by $ax+by+cz = 1$ where $a,b,c \geq 0$. The volume is a tetrahedron. The base in the $x$-$y$ plane is a triangle with vertices $(0,0,0)$, $(1/a, 0, 0)$, and $(0, 1/b, 0)$. (The remaining easy-to-find vertex is $(0, 0, 1/c)$.) The line connecting the last two base points in the $x$-$y$ plane is $ax + by = 1$. With this, the integral to compute the volume is
-
-```julia; hold=true
-@syms a b c
-f(x,y,z) = Sym(1)
-integrate(f(x,y,z), (z, 0, (1 - a*x - b*y)/c), (y, 0, (1 - a*x)/b), (x, 0, 1/a))
-```
-
-This is $1/6$th the volume of the box.
-
-
-* Cone. Consider a cone formed by the function $z = f(x,y) = a - b(x^2+y^2)^{1/2}$ ($a,b > 0$) and the $x$-$y$ plane. This will have radius $r = a/b$ and height $a$.
The volume is given by this integral:
-
-```math
-\int_{x=-r}^r \int_{y=-\sqrt{r^2 - x^2}}^{\sqrt{r^2-x^2}} \int_0^{a - b\sqrt{x^2 + y^2}} 1 dz dy dx.
-```
-
-This integral is doable, but `SymPy` has trouble with it. We will return to this when cylindrical coordinates are defined.
-
-
-
-* Sphere. The sphere $x^2 + y^2 + z^2 \leq 1$ has a known volume. Can we compute it using integration? In Cartesian coordinates, we can describe the region $x^2 + y^2 \leq 1$ and then the $z$-limits will follow:
-
-```math
-\int_{x=-1}^1 \int_{y=-\sqrt{1-x^2}}^{\sqrt{1-x^2}} \int_{z=-\sqrt{1 - x^2 - y^2}}^{\sqrt{1-x^2 - y^2}} 1 dz dy dx.
-```
-
-This integral is doable, but `SymPy` has trouble with it. We will return to this when spherical coordinates are defined.
-
-
-
-
-## Change of variables
-
-
-The change of variables, or substitution, formula from first-semester calculus is expressed, under assumptions, by:
-
-```math
-\int_{g(R)} f(x) dx = \int_R (f\circ g)(u)g'(u) du.
-```
-
-The derivation comes from reversing the chain rule. When using it, we start on the right-hand side and typically write $x = g(u)$ and from here derive an expression involving differentials: $dx = g'(u) du$ and the rest follows. In practice, this is used to simplify the integrand in the search for an antiderivative, as $(f\circ g)$ is generally more complicated than $f$ alone.
-
-In higher dimensions, we will see that a change of variables can not only simplify the integrand, but is also of great use in simplifying the region to integrate over. We mentioned, for example, that to use `hcubature` efficiently over a non-rectangular region, a transformation---or change of variables---is needed. The key to the multi-dimensional formula is understanding what should replace $dx = g'(u) du$. We take a bit of a circuitous route to get there.
-
-
-In [Katz](http://www.jstor.org/stable/2689856) a review of the history of "change of variables" from Euler to Cartan is given.
We follow Lagrange's formal analysis to derive the change of variable formula in two dimensions.
-
-We view $R$ in two coordinate systems $(x,y)$ and $(u,v)$. We have that
-
-```math
-\begin{align}
-dx &= A du + B dv\\
-dy &= C du + D dv,
-\end{align}
-```
-
-where $A = \partial{x}/\partial{u}$, $B = \partial{x}/\partial{v}$, $C= \partial{y}/\partial{u}$, and $D = \partial{y}/\partial{v}$. Lagrange, following Euler, first sets $x$ to be constant (as is done in iterated integration). Hence, $dx = 0$ and so $du = -(B/A) dv$ and, after substitution, $dy = (D-C(B/A))dv$. Then Lagrange set $y$ to be a constant, so $dy = 0$ and hence $dv=0$ so $dx = Adu$. The area "element" $dx dy = A du \cdot (D - C(B/A)) dv = (AD - BC) du dv$. Since areas and volumes are non-negative, the absolute value is used. With this, we have "$dxdy = |AD-BC|du dv$" as the analog of $dx = g'(u) du$.
-
-The expression $AD - BC$ was also derived by Euler, by related means. Lagrange extended the analysis to 3 dimensions. Before turning to that, it is helpful to understand the problem from a geometric perspective. Euler was attempting to understand the effects of the following change of variable:
-
-```math
-\begin{align}
-x &= a + mt + \sqrt{1-m^2} v\\
-y & = b + \sqrt{1-m^2}t -mv
-\end{align}
-```
-
-Euler knew this to be a clockwise *rotation* by an angle $\theta$ with $\cos(\theta) = m$, a *reflection* through the $x$ axis, and a translation by $\langle a, b\rangle$. All these *should* preserve the area represented by $dx dy$, so he was *expecting* $dx dy = dt dv$.
-
-```julia; hold=true; echo=false
-imgfile ="figures/euler-rotation.png"
-caption = "Figure from Katz showing rotation of Euler."
-ImageFile(:integral_vector_calculus, imgfile, caption)
-```
-
-The figure, taken from Katz, shows the translation and rotation, which should preserve area on a differential scale.
-
-However, Euler knew $dx = (\partial{g_1}/\partial{t}) dt + (\partial{g_1}/\partial{v}) dv$ and $dy = (\partial{g_2}/\partial{t}) dt + (\partial{g_2}/\partial{v}) dv$. Just multiplying gives $dx dy = m\sqrt{1-m^2} dt dt + (1-m^2) dv dt -m^2 dt dv -m\sqrt{1-m^2} dv dv$, a result that didn't make sense physically, as $dt dt$ and $dv dv$ have no meaning in integration and $1 - m^2 - m^2$ is not $1$ as expected. Euler, like Lagrange, used a formal trick to proceed, but the geometric insight remains correct: the incremental areas for a change of variable should be related, and for this change of variable should be identical.
-
-
-
-
-The following illustrates the polar-coordinate transformation $\langle x,y\rangle = G(r, \theta) = r \langle \cos\theta, \sin\theta\rangle$.
-
-```julia; hold=true
-G(u, v) = u * [cos(v), sin(v)]
-
-G(v) = G(v...)
-J(v) = ForwardDiff.jacobian(G, v) # [∇g1', ∇g2']
-
-n = 6
-us = range(0, 1, length=3n) # radius
-vs = range(0, 2pi, length=3n) # angle
-
-plot(unzip(G.(us', vs))..., legend = false, aspect_ratio=:equal) # plots constant u lines
-plot!(unzip(G.(us, vs'))...) # plots constant v lines
-
-pt = [us[n],vs[n]]
-
-
-arrow!(G(pt), J(pt)*[1,0], color=:blue)
-arrow!(G(pt), J(pt)*[0,1], color=:blue)
-```
-
-This graphic shows the image of the box $[0,1] \times [0, 2\pi]$ under the transformation. The `plot` commands draw lines for values of constant `u` or constant `v`. If $G(u,v) = \langle g_1(u,v), g_2(u,v)\rangle$, then the Taylor expansion for $g_i$ is $g_i(u+du, v+dv) \approx g_i(u,v) + (\nabla{g_i})^T \cdot \langle du, dv \rangle$ and, combining, $G(u+du, v+dv) \approx G(u,v) + J_G(u,v) \langle du, dv \rangle$. The vectors added above represent the images when $u$ is constant (so $du=0$) and when $v$ is constant (so $dv=0$). The two arrows define a parallelogram whose area gives the change of area undergone by the unit square under the transformation. The area is $|\det(J_G)|$, the absolute value of the determinant of the Jacobian.
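This lends itself to a quick numeric check. The sketch below (with the Jacobian of the polar map written out by hand, and `h` an illustrative step size) compares $|\det(J_G)|$ at a point with the ratio of the image area of a small $h\times h$ square to $h^2$:

```julia
using LinearAlgebra

G(r, θ) = [r*cos(θ), r*sin(θ)]                     # polar transformation
J(r, θ) = [cos(θ) (-r*sin(θ)); sin(θ) r*cos(θ)]    # its Jacobian, by hand

r, θ, h = 1/2, pi/4, 1e-4
a = G(r + h, θ) - G(r, θ)                  # images of two sides of an h×h square
b = G(r, θ + h) - G(r, θ)
scale = abs(a[1]*b[2] - a[2]*b[1]) / h^2   # image (parallelogram) area over h²

abs(det(J(r, θ))), scale                   # both approximately r = 1/2
```

The agreement improves as `h` shrinks, as the parallelogram spanned by the two image vectors approximates the image of the square to first order.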
-
-
-
-```julia; echo=false
-function showG(G, a=1, b=1; a0=0, b0=0, an=3, bn=3, n=5, lambda=1/2, k1=1, k2=1)
-
-    J(v) = ForwardDiff.jacobian(v -> G(v...), v) # [∇g1', ∇g2']
-
-    us = range(0, a, length=an*n) # radius
-    vs = range(0, b, length=bn*n) # angle
-
-    p = plot(unzip(G.(us', vs))..., legend = false, aspect_ratio=:equal) # plots constant u lines
-    plot!(p, unzip(G.(us, vs'))...) # plots constant v lines
-
-    pt = [us[k1 * n],vs[k2*n]]
-    P, U, V = G(pt...), lambda * J(pt)*[1,0], lambda * J(pt)*[0,1]
-    arrow!(P, U, color=:blue, linewidth=2)
-    arrow!(P+V, U, color=:red, linewidth=1)
-    arrow!(P, V, color=:blue, linewidth=2)
-    arrow!(P+U, V, color=:red, linewidth=1)
-    p
-end
-```
-
-The transformation to elliptical coordinates, $G(u,v) = \langle \cosh(u)\cos(v), \sinh(u)\sin(v)\rangle$, may be viewed similarly:
-
-```julia; hold=true; echo=false
-G(u,v) = [cosh(u)*cos(v), sinh(u)*sin(v)]
-showG(G, 1, 2pi)
-```
-
-The transformation $G(u,v) = v \langle e^u, e^{-u} \rangle$ uses hyperbolic coordinates:
-
-```julia; hold=true; echo=false
-G(u,v) = v * [exp(u), exp(-u)]
-showG(G, 1, 2pi, bn = 6, k2=4)
-```
-
-The transformation $G(u,v) = \langle u^2-v^2, u\cdot v \rangle$ yields a partition of the plane:
-
-```julia; hold=true; echo=false
-G(u,v) = [u^2 - v^2, u*v]
-showG(G, 1, 1)
-```
-
-
-The arrows are the images of the standard unit vectors. We see some transformations leave these *orthogonal* and some change the respective lengths. The area of the associated parallelogram can be found using the determinant of an accompanying matrix. For two dimensions, using the cross product formulation on the embedded vectors, the area is
-
-```math
-\| \det\left(\left[
-\begin{array}{}
-\hat{i} & \hat{j} & \hat{k}\\
-u_1 & u_2 & 0\\
-v_1 & v_2 & 0
-\end{array}
-\right]
-\right) \|
-=
-\| \hat{k} \det\left(\left[
-\begin{array}{}
-u_1 & u_2\\
-v_1 & v_2
-\end{array}
-\right]
-\right) \|
-= | \det\left(\left[
-\begin{array}{}
-u_1 & u_2\\
-v_1 & v_2
-\end{array}
-\right]
-\right)|.
-``` - - -Using the fact that the two vectors involved are columns in the Jacobian of the transformation, this is just $|\det(J_G)|$. For $3$ dimensions, the determinant gives the volume of the 3-dimensional parallelepiped in the same manner. This holds for higher dimensions. - -The absolute value of the determinant of the Jacobian -is the multiplying factor that is seen in the change of variable formula for all dimensions: - -> [Change of variable](https://en.wikipedia.org/wiki/Integration_by_substitution#Substitution_for_multiple_variables) Let $U$ be an open set in $R^n$, $G:U \rightarrow R^n$ be an *injective* differentiable function with *continuous* partial derivatives. If $f$ is continuous and compactly supported, then -> ```math -> \iint_{G(S)} f(\vec{x}) dV = \iint_S (f \circ G)(\vec{u}) |\det(J_G)(\vec{u})| dU. -> ``` - - -For the one-dimensional case, there is no absolute value, but there the interval is reversed, producing "negative" area. This is not the case here, where $S$ is parameterized to give positive volume. - -!!! note - The term "functional determinant" is found for the value $\det(J_G)$, as is the notation $\partial(x_1, x_2, \dots x_n)/\partial(u_1, u_2, \dots, u_n)$. - - -### Two dimensional change of variables - -Now we see several examples of two-dimensional transformations. - -#### Polar integrals - - -We have [seen](../differentiable_vector_calculus/polar_coordinates.html) how to compute area in polar coordinates through the formula $A = \int (1/2) r^2(\theta) d\theta$. This formula can be derived as follows. Consider a region $R$ parameterized in polar coordinates by $r(\theta)$ for $a \leq \theta \leq b$. The area of this region would be $\iint_R fdA$. Let $G(r, \theta) = r \langle \cos\theta, \sin\theta\rangle$. Then - -```math -J_G = \left[ -\begin{array}{} -\cos(\theta) & - r\sin(\theta)\\ -\sin(\theta) & r\cos(\theta) -\end{array} -\right], -``` - -with determinant $r$. 
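As a quick check of this determinant, the same SymPy tooling used elsewhere in this section can compute it directly (a small sketch; the `using` lines make it self-contained):

```julia
using SymPy, LinearAlgebra

@syms r::positive theta::real
G = [r*cos(theta), r*sin(theta)]          # polar transformation
det(G.jacobian([r, theta])) |> simplify   # r*cos²θ + r*sin²θ, simplifying to r
```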
That is, for *polar coordinates*, $dx dy = r dr d\theta$ ($r \geq 0$).
-
-
-So by the change of variable formula, we have:
-
-```math
-A = \iint_R 1 dx dy = \int_a^b \int_0^{r(\theta)} 1 r dr d\theta = \int_a^b \frac{r^2(\theta)}{2} d\theta.
-```
-
-The key is noting that the region, $S$, described by $\theta$ running from $a$ to $b$ and $r$ running from $0$ to $r(\theta)$, maps onto $R$ through the change of variables. As polar coordinates are just a renaming, this is clear to see.
-
-----
-
-Now consider finding the volume of a sphere using polar coordinates. We have, with $\rho$ being the radius:
-
-```math
-V = 2 \iint_R \sqrt{\rho^2 - x^2 - y^2} dy dx,
-```
-
-where $R$ is the disc of radius $\rho$. Using polar coordinates, we have $x^2 + y^2 = r^2$ and the expression becomes:
-
-```math
-V = 2 \int_0^{2\pi} \int_0^\rho \sqrt{\rho^2 - r^2} r dr d\theta = 2 \int_0^{2\pi} -\frac{(\rho^2 - r^2)^{3/2}}{3} \Big|_0^\rho d\theta = 2\int_0^{2\pi} \frac{\rho^3}{3}d\theta = \frac{4\pi\rho^3}{3}.
-```
-
-##### Linear transformations
-
-Some [transformations](https://en.wikipedia.org/wiki/Transformation_matrix#Examples_in_2D_computer_graphics) from ``2``D computer graphics are represented in matrix notation:
-
-```math
-\left[
-\begin{array}{}
-x\\
-y
-\end{array}
-\right] =
-\left[
-\begin{array}{}
-a & b\\
-c & d
-\end{array}
-\right]
-\left[
-\begin{array}{}
-u\\
-v
-\end{array}
-\right],
-```
-
-or $G(u,v) = \langle au+bv, cu+dv\rangle$. The Jacobian of this *linear* transformation is the matrix itself.
-
-Some common transformations are:
-
-* **Stretching** or $G(u,v) = \langle ku, v \rangle$ or $G(u,v) = \langle u, kv\rangle$ for some $k >0$. The former stretches the $x$ axis, the latter the $y$. These have Jacobian determinant $k$.
-
-```julia; hold=true; echo=false
-k = 2
-G(u,v) = [k*u, v]
-showG(G, 1, 1)
-```
-
-* **Rotation**.
Let $\theta$ be a clockwise rotation angle; then $G(u,v) = \langle\cos\theta u + \sin\theta v, -\sin\theta u + \cos\theta v\rangle$ is the transform. The Jacobian determinant is $1$. This figure rotates by $\pi/6$:
-
-```julia; hold=true; echo=false
-theta = pi/6
-G(u,v) = [cos(theta)*u + sin(theta)*v, -sin(theta)*u + cos(theta)*v]
-showG(G, 1, 1)
-```
-
-* **Shearing**. Let $k > 0$ and $G(u,v) = \langle u + kv, v \rangle$. This transformation is a shear parallel to the $x$ axis. (Use $G(u,v) = \langle u, ku+v\rangle$ for the $y$ axis.) A shear has Jacobian determinant $1$.
-
-```julia; hold=true
-k = 2
-G(u, v) = [u + k*v, v]
-showG(G)
-```
-
-* **Reflection**. Let $\vec{l} = \langle l_x, l_y \rangle$ be a nonzero vector with norm $\|\vec{l}\|$. The reflection through the line in the direction of $\vec{l}$ through the origin is defined, using a matrix, by:
-
-```math
-\frac{1}{\| \vec{l} \|^2}
-\left[
-\begin{array}{}
-l_x^2 - l_y^2 & 2 l_x l_y\\
-2l_x l_y & l_y^2 - l_x^2
-\end{array}
-\right]
-```
-
-For some simple cases: for $\langle l_x, l_y \rangle = \langle 1, 1\rangle$, the diagonal, this is $G(u,v) = (1/2) \langle 2v, 2u \rangle$; for $\langle l_x, l_y \rangle = \langle 0, 1\rangle$ (the $y$-axis), this is $G(u,v) = \langle -u, v\rangle$.
-
-* A translation by $\langle a ,b \rangle$ would be given by $G(u,v) = \langle u+a, v+b \rangle$ and would have Jacobian determinant $1$.
-
-As an example, consider the transformation of reflecting through the line $x = 1/2$. Let $\vec{ab} = \langle 1/2, 0\rangle$. This would be found by translating by $-\vec{ab}$, then reflecting through the $y$ axis, then translating by $\vec{ab}$:
-
-```julia; hold=true
-T(u, v, a, b) = [u+a, v+b]
-G(u, v) = [-u, v]
-@syms u v
-a,b = 1//2, 0
-x1, y1 = T(u,v, -a, -b)
-x2, y2 = G(x1, y1)
-x, y = T(x2, y2, a, b)
-```
-
-
-
-##### Triangle
-
-Consider the problem of integrating $f(x,y)$ over the triangular region bounded by $y=x$, $y=0$, and $x=1$.
Such an integral may be computed through Fubini's theorem as $\int_0^1 \int_0^x f(x,y) dy dx$ or $\int_0^1 \int_y^1 f(x,y) dx dy$, but *if* these can not be computed and a numeric answer is desired, a transformation making the integral one over a rectangle is useful.
-
-For this, the transformation $x = u$, $y=uv$ for $(u,v)$ in $[0,1] \times [0,1]$ is possible:
-
-```julia; hold=true; echo=false
-G(u,v) = [u,u*v]
-showG(G, lambda=1/3)
-```
-
-The determinant of the Jacobian is
-
-```math
-\det(J_G) = \det\left(
-\left[
-\begin{array}{}
-1 & 0\\
-v & u
-\end{array}
-\right]
-\right) = u.
-```
-
-So, $\iint_R f(x,y) dA = \int_0^1\int_0^1 f(u, uv) u du dv$. Here we illustrate with a generic monomial:
-
-```julia;
-@syms x y n::positive m::positive
-monomial(x,y) = x^n*y^m
-integrate(monomial(x,y), (y, 0, x), (x, 0, 1))
-```
-
-And compare with:
-
-```julia; hold=true
-@syms u v
-integrate(monomial(u, u*v)*u, (u,0,1), (v,0,1))
-```
-
-
-##### Composition of transformations
-
-What about other triangles, say the triangle bounded by $x=0$, $y=0$, and $x+y=1$?
-
-This can be seen as a reflection through the line $x=1/2$ of the triangle above. If $G_1$ represents the mapping from $U = [0,1]\times[0,1]$ into the triangle of the last problem, and $G_2$ represents the reflection through the line $x=1/2$, then the transformation $G_2 \circ G_1$ will map the box $U$ into the desired region. By the chain rule, we have:
-
-```math
-\begin{align*}
-\int_{(G_2\circ G_1)(U)} f dx &= \int_U (f\circ G_2 \circ G_1) |\det(J_{G_2 \circ G_1})| du \\
-&=
-\int_U (f\circ G_2 \circ G_1) |\det(J_{G_2}(G_1(u)))||\det(J_{G_1}(u))| du.
-\end{align*}
-```
-
-(In [Katz](http://www.jstor.org/stable/2689856) it is mentioned that Jacobi showed this in 1841.)
-
-The flip through the $x=1/2$ line was done above and is $\langle u, v\rangle \rightarrow \langle 1-u, v\rangle$, which has Jacobian determinant $-1$.
-
-We now compare using `hcubature` and our "Fubini" function:
-
-```julia; hold=true
-G1(u,v) = [u, u*v]
-G1(v) = G1(v...)
-G2(u,v) = [1-u, v]
-G2(v) = G2(v...)
-f(x,y) = x^2*y^3
-f(v) = f(v...)
-A = ∬((y,x) -> f(x,y), (0, x -> 1 - x), (0, 1))
-B = hcubature(v -> (f∘G2∘G1)(v) * v[1] * 1, (0,0), (1, 1))
-A, B[1], A - B[1]
-```
-
-
-##### Hyperbolic transformation
-
-Consider the region, $R$, bounded by $y=0$, $x=e^{-n}$, $x=e^n$, and $y=1/x$. An integral over this region may be computed with the help of the transform $G(u,v) = v \langle e^u, e^{-u}\rangle$, which takes the box $[-n, n] \times [0,1]$ onto $R$.
-
-With this, we compute $\iint_R x^2 y^3 dA$ using `SymPy` to compute the Jacobian:
-
-```julia; hold=true
-@syms u v n
-G(u,v) = v * [exp(u), exp(-u)]
-Jac = G(u,v).jacobian([u,v])
-f(x,y) = x^2 * y^3
-f(v) = f(v...)
-integrate(f(G(u,v)) * abs(det(Jac)), (u, -n, n), (v, 0, 1))
-```
-
-----
-
-This collection shows a summary of the above ``2``D transformations:
-
-```julia; hold=true; echo=false
-transform(u,v) = [u+2, v+3]
-ps = [showG(transform)]
-xlabel!(ps[end], "transformation")
-
-rotation(u,v, θ=pi/3) = [cos(θ) sin(θ); -sin(θ) cos(θ)]*[u,v]
-push!(ps, showG(rotation))
-xlabel!(ps[end], "rotation")
-
-shear(u,v) = [u+v, v]
-push!(ps, showG(shear))
-xlabel!(ps[end], "shear")
-
-triangle(u,v) = [u, u*v]
-push!(ps, showG(triangle))
-xlabel!(ps[end], "triangle")
-
-shear(v) = shear(v...)
-push!(ps, showG(shear ∘ triangle))
-xlabel!(ps[end], "shear ∘ triangle")
-
-circle(u, v) = v*[sin(2pi*u), cos(2pi*u)]
-push!(ps, showG(circle))
-xlabel!(ps[end], "polar")
-
-
-ellipse(u, v) = [cosh(u)*cos(v), sinh(u)*sin(v)]
-push!(ps, showG(ellipse))
-xlabel!(ps[end], "ellipse")
-
-
-hyperbolic(u, v) = v * [exp(u), exp(-u)]
-push!(ps, showG(hyperbolic))
-xlabel!(ps[end], "hyperbolic")
-
-partition(u,v) = [ u^2-v^2, u*v ]
-push!(ps, showG(partition))
-xlabel!(ps[end], "partition")
-
-l = @layout [a b c;
-             d e f;
-             g h i]
-
-plot(ps..., layout=l)
-```
-
-
-### Examples
-
-##### Centroid
-
-The center of mass is a balancing point of a region with density $\rho(x,y)$. In two dimensions it is a point $\langle \bar{x}, \bar{y}\rangle$. These are found by the following formulas:
-
-```math
-A = \iint_R \rho(x,y) dA, \quad \bar{x} = \frac{1}{A} \iint_R x \rho(x,y) dA, \quad
-\bar{y} = \frac{1}{A} \iint_R y \rho(x,y) dA.
-```
-
-The $\bar{x}$ value can be seen in terms of Fubini by integrating in $y$ first:
-
-```math
-\iint_R x \rho(x,y) dA = \int_{x=a}^b x \left(\int_{y=h(x)}^{g(x)} \rho(x,y) dy\right) dx.
-```
-
-The inner integral is the mass of a slice at a value along the $x$ axis; this mass is then weighted by the distance from the origin. The center of mass is a "balance" point, in the sense that $\iint_R (x - \bar{x})\rho(x,y) dA = 0$ and $\iint_R (y-\bar{y})\rho(x,y) dA = 0$.
-
-For example, the center of mass of the upper half *unit* disc will have a centroid with $\bar{x} = 0$, by symmetry. We can see this by integrating in *Cartesian* coordinates, as follows:
-
-```math
-\iint_R x dA = \int_{y=0}^1 \int_{x=-\sqrt{1-y^2}}^{\sqrt{1 - y^2}} x dx dy.
-```
-
-The inner integral is $0$, as it is an integral of an *odd* function over an interval symmetric about $0$.
-
-
-The value of $\bar{y}$ is found using a polar coordinate transformation from:
-
-```math
-\iint_R y dA = \int_{r=0}^1 \int_{\theta=0}^{\pi} (r\sin(\theta))r d\theta dr =
-\int_{r=0}^1 r^2 dr \int_{\theta=0}^{\pi}\sin(\theta) d\theta = \frac{1}{3} \cdot 2.
-```
-
-The third equals sign uses separability. The answer for $\bar{y}$ is this value divided by the area of the half disc, $\pi/2$, or $4/(3\pi)$.
-
-##### Example: Moment of inertia
-
-The moment of [inertia](https://en.wikipedia.org/wiki/Moment_of_inertia) of a point mass about an axis is $I = mr^2$, where $m$ is the mass and $r$ the distance to the axis. The moment of inertia of a body is the sum of the moments of inertia of its pieces. If $R$ is a region in the $x$-$y$ plane with density $\rho(x,y)$ and the axis is the $y$ axis, then an approximate moment of inertia would be $\sum (x_i)^2\rho(x_i, y_i)\Delta x_i \Delta y_i$, which leads to $I = \iint_R x^2\rho(x,y) dA$.
-
-Let $R$ be the half disc contained by $x^2 + y^2 = 1$ and $x \geq 0$. Let $\rho(x,y) = xy^2$. Find the moment of inertia.
-
-$R$ is best described in polar coordinates, so we try to compute
-
-```math
-\int_0^1 \int_{-\pi/2}^{\pi/2} (r\cos(\theta))^2 (r\cos(\theta))(r\sin(\theta))^2 r d\theta dr.
-```
-
-That requires integrating $\sin^2(\theta)\cos^3(\theta)$, a doable task, but best left to SymPy:
-
-```julia; hold=true
-@syms r theta
-x = r*cos(theta)
-y = r*sin(theta)
-rho(x,y) = x*y^2
-integrate(x^2 * rho(x, y), (theta, -PI/2, PI/2), (r, 0, 1))
-```
-
-##### Example
-
-(Strang) Find the moment of inertia about the $y$ axis of the unit square tilted *counter*-clockwise an angle $0 \leq \alpha \leq \pi/2$.
-
-The counterclockwise rotation of the unit square is $G(u,v) = \langle \cos(\alpha)u-\sin(\alpha)v, \sin(\alpha)u + \cos(\alpha) v\rangle$. This comes from the above formula for clockwise rotation using $-\alpha$. This transformation has Jacobian determinant $1$, as the area is not deformed.
With this, we have
-
-```math
-\iint_R x^2 dA = \iint_{G(U)} (f\circ G)(u) |\det(J_G(u))| dU,
-```
-
-which is computed with:
-
-```julia; hold=true
-@syms u v alpha
-f(x,y) = x^2
-G(u,v) = [cos(alpha)*u - sin(alpha)*v, sin(alpha)*u + cos(alpha)*v]
-Jac = det(G(u,v).jacobian([u,v])) |> simplify
-integrate(f(G(u,v)...) * Jac , (u, 0, 1), (v, 0, 1))
-```
-
-##### Example
-
-Let $R$ be a ring with inner radius $4$ and outer radius $5$. Find its moment of inertia about the $y$ axis.
-
-The integral to compute is:
-
-```math
-\iint_R x^2 dA,
-```
-
-with a domain that is easy to describe in polar coordinates:
-
-```julia; hold=true
-@syms r theta
-x = r*cos(theta)
-integrate(x^2 * r, (r, 4, 5), (theta, 0, 2PI))
-```
-
-
-### Three dimensional change of variables
-
-The change of variables formula is no different between dimensions $2$ and $3$ (or higher), but the question of a suitable transformation is more involved as the dimension increases. We stick here to a few widely used ones.
-
-#### Cylindrical coordinates
-
-
-Polar coordinates describe the $x$-$y$ plane in terms of a radius $r$ and angle $\theta$. *Cylindrical* coordinates describe three-dimensional space in terms of $r$, $\theta$, and $z$. A transformation is:
-
-```math
-G(r,\theta, z) = \langle r\cos(\theta), r\sin(\theta), z\rangle.
-```
-
-This has Jacobian determinant $r$, similar to polar coordinates.
-
-
-
-##### Example
-
-Returning to the volume of the cone above the $x$-$y$ plane under $z = a - b(x^2 + y^2)^{1/2}$, we previously had this integral in Cartesian coordinates:
-
-```math
-\int_{x=-r}^r \int_{y=-\sqrt{r^2 - x^2}}^{\sqrt{r^2-x^2}} \int_0^{a - b\sqrt{x^2 + y^2}} 1 dz dy dx,
-```
-
-where $r=a/b$. This is *much* simpler in cylindrical coordinates, as the region is described by the rectangle in $(r, \theta)$: $[0, a/b] \times [0, 2\pi]$, and the $z$ range is from $0$ to $a - br$.
-
-The volume then is:
-
-```math
-\int_{\theta=0}^{2\pi} \int_{r=0}^{a/b} \int_{z=0}^{a - br} 1 r dz dr d\theta =
-2\pi \int_{r=0}^{a/b} (a-br)r dr = \frac{\pi a^3}{3b^2}.
-```
-
-
-This is in agreement with $\pi r^2 h/3$.
-
-----
-
-Find the centroid of the cone. First in the $x$ direction, $\iiint_R x dV$ is found by:
-
-```julia; hold=true
-@syms r theta z a b
-f(x,y,z) = x
-x = r*cos(theta)
-y = r*sin(theta)
-Jac = r
-integrate(f(x,y,z) * Jac, (z, 0, a - b*r), (r, 0, a/b), (theta, 0, 2PI))
-```
-
-That this is $0$ is no surprise. The same will be true for the $y$ direction, as the figure is symmetric about the planes $x=0$ and $y=0$. However, the $z$ direction is different:
-
-```julia; hold=true
-@syms r theta z a b
-f(x,y,z) = z
-x = r*cos(theta)
-y = r*sin(theta)
-Jac = r
-A = integrate(f(x,y,z) * Jac, (z, 0, a - b*r), (r, 0, a/b), (theta, 0, 2PI))
-B = integrate(1 * Jac, (z, 0, a - b*r), (r, 0, a/b), (theta, 0, 2PI))
-A, B, A/B
-```
-
-The answer depends on the height through $a$, but *not* the size of the base, parameterized by $b$. To finish, the centroid is $\langle 0, 0, a/4\rangle$.
-
-##### Example
-
-A sphere of radius $2$ is intersected by a cylinder of radius $1$ along the $z$ axis. Find the volume of the intersection.
-
-We have $x^2 + y^2 + z^2 = 4$, or $z^2 = 4 - r^2$ in cylindrical coordinates. The integral then is:
-
-```julia; hold=true
-@syms r::real theta::real z::real
-integrate(1 * r, (z, -sqrt(4-r^2), sqrt(4-r^2)), (r, 0, 1), (theta, 0, 2PI))
-```
-
-If instead of a fixed radius of $1$ we use $0 \leq a \leq 2$ we have:
-
-```julia; hold=true
-@syms a::real r::real theta::real z::real
-integrate(1 * r, (z, -sqrt(4-r^2), sqrt(4-r^2)), (r, 0, a), (theta, 0, 2PI))
-```
-
-#### Spherical integrals
-
-Spherical coordinates describe a point in space by a radius from the origin, $r$ or $\rho$; an *azimuthal* angle $\theta$ in $[0, 2\pi]$; and an *inclination* angle $\phi$ (also called the polar angle) in $[0, \pi]$.
The $z$ axis is the direction of the zenith and gives a reference line to define the inclination angle. The $x$-$y$ plane is the reference plane, with the $x$ axis giving a reference direction for the azimuth measurement.
-
-
-The exact formula to relate $(\rho, \theta, \phi)$ to $(x,y,z)$ is given by
-
-```math
-G(\rho, \theta, \phi) = \rho \langle
-\sin(\phi)\cos(\theta),
-\sin(\phi)\sin(\theta),
-\cos(\phi)
-\rangle.
-```
-
-
-```julia; hold=true; echo=false
-imgfile = "figures/spherical-coordinates.png"
-caption = "Figure showing the parameterization by spherical coordinates. (Wikipedia)"
-
-ImageFile(:integral_vector_calculus, imgfile, caption)
-```
-
-The Jacobian determinant can be computed to be $\rho^2\sin(\phi)$.
-
-```julia; hold=true
-@syms ρ theta phi
-G(ρ, theta, phi) = ρ * [sin(phi)*cos(theta), sin(phi)*sin(theta), cos(phi)]
-det(G(ρ, theta, phi).jacobian([ρ, theta, phi])) |> simplify |> abs
-```
-
-##### Example
-
-Computing the volume of a sphere is a challenge (for SymPy) in Cartesian coordinates, but a breeze in spherical coordinates. Using $r^2\sin(\phi)$ as the multiplying factor, the volume is simply:
-
-```math
-\int_{\theta=0}^{2\pi} \int_{\phi=0}^{\pi} \int_{r=0}^R 1 \cdot r^2 \sin(\phi) dr d\phi d\theta =
-\int_{\theta=0}^{2\pi} d\theta \int_{\phi=0}^{\pi} \sin(\phi)d\phi \int_{r=0}^R r^2 dr = (2\pi)(2)\frac{R^3}{3} = \frac{4\pi R^3}{3}.
-```
-
-
-
-##### Example
-
-Compute the volume of the ellipsoid, $R$, described by $(x/a)^2 + (y/b)^2 + (z/c)^2 \leq 1$.
-
-We first change variables via $G(u,v,w) = \langle ua, vb, wc \rangle$. This maps the unit sphere, $S$, given by $u^2 + v^2 + w^2 \leq 1$, into the ellipsoid.
Then
-
-```math
-\iiint_R 1 dV = \iiint_S 1 |\det(J_G)| dU.
-```
-
-But the Jacobian determinant is a constant:
-
-```julia; hold=true
-@syms u v w a b c
-G(u,v,w) = [u*a, v*b, w*c]
-det(G(u,v,w).jacobian([u,v,w]))
-```
-
-So the answer is $abc V(S) = 4\pi abc/3$.
-
-
-## Questions
-
-###### Question
-
-Suppose $f(x,y) = f_1(x)f_2(y)$ and $R = [a_1, b_1] \times [a_2,b_2]$ is a rectangular region. Is this true?
-
-```math
-\iint_R f dA = (\int_{a_1}^{b_1} f_1(x) dx) \cdot (\int_{a_2}^{b_2} f_2(y) dy).
-```
-
-```julia; hold=true; echo=false
-choices = [
-L"Yes. As an inner integral $\int_{a_2}^{b_2} f(x,y) dy = f_1(x) \int_{a_2}^{b_2} f_2(y) dy$.",
-"No."
-]
-answ = 1
-radioq(choices, answ)
-```
-
-
-###### Question
-
-Which integrals of the following are $0$ by symmetry? Let $R$ be the unit disc.
-
-```math
-a = \iint_R x dA, \quad b = \iint_R (x^2 + y^2) dA, \quad c = \iint_R xy dA
-```
-
-```julia; hold=true; echo=false
-choices = [
-L"Both $a$ and $b$",
-L"Both $a$ and $c$",
-L"Both $b$ and $c$"
-]
-answ = 2
-radioq(choices, answ)
-```
-
-
-###### Question
-
-Let $R$ be the unit disc. Which integrals can be found from common geometric formulas (e.g., known formulas for the sphere, cone, pyramid, ellipse, ...)?
-
-```math
-a = \iint_R (1 - (x^2+y^2)) dA, \quad
-b = \iint_R (1 - \sqrt{x^2 + y^2}) dA, \quad
-c = \iint_R (1 - (x^2 + y^2)^2) dA
-```
-
-
-```julia; hold=true; echo=false
-choices = [
-L"Both $a$ and $b$",
-L"Both $a$ and $c$",
-L"Both $b$ and $c$"
-]
-answ = 1
-radioq(choices, answ)
-```
-
-
-
-
-
-###### Question
-
-Let the region $R$ be the portion of the first quadrant bounded by $x^3 + y^3 = 1$. What integral below will **not** find the area of $R$?
-
-```julia; hold=true; echo=false
-choices = [
-raw" ``\int_0^1 \int_0^{(1-x^3)^{1/3}} 1\cdot dy dx``",
-raw" ``\int_0^1 \int_0^{(1-y^3)^{1/3}} 1\cdot dx dy``",
-raw" ``\int_0^1 \int_0^{(1-y^3)^{1/3}} 1\cdot dy dx``"
-]
-answ = 3
-radioq(choices, answ, keep_order=true)
-```
-
-
-###### Question
-
-Let $R$ be a triangular region with vertices $(0,0), (2,0), (1, b)$ where $b \geq 0$. What integral below computes the area of $R$?
-
-
-```julia; hold=true; echo=false
-choices = [
-raw" ``\int_0^b\int_{y/b}^{2-y/b} dx dy``",
-raw" ``\int_0^2\int_0^{bx} dy dx``",
-raw" ``\int_0^2 \int_0^{2b - bx} dy dx``"
-]
-answ = 1
-radioq(choices, answ)
-```
-
-
-###### Question
-
-Let $f(x) \geq 0$ be an integrable function. The area under $f(x)$ over $[a,b]$, $\int_a^b f(x) dx$, is equivalent to?
-
-```julia; hold=true; echo=false
-choices = [
-raw" ``\int_a^b \int_0^{f(x)} dy dx``",
-raw" ``\int_a^b \int_0^{f(x)} dx dy``",
-raw" ``\int_0^{f(x)} \int_a^b dx dy``"
-]
-answ = 1
-radioq(choices, answ)
-```
-
-
-###### Question
-
-The region $R$ contained within $|x| + |y| = 1$ is square, but not rectangular (in the sense of integration). What transformation of $S = [-1/2,1/2] \times [-1/2,1/2]$ will have $G(S) = R$?
-
-```julia; hold=true; echo=false
-choices = [
-raw" ``G(u,v) = \langle u-v, u+v \rangle``",
-raw" ``G(u,v) = \langle u^2-v^2, u^2+v^2 \rangle``",
-raw" ``G(u,v) = \langle u-v, u \rangle``"
-]
-answ = 1
-radioq(choices, answ)
-```
-
-
-
-###### Question
-
-Let $G(u,v) = \langle \cosh(u)\cos(v), \sinh(u)\sin(v) \rangle$. Using `ForwardDiff` find the determinant of the Jacobian at $[1,2]$.
-
-```julia; hold=true; echo=false
-G(u,v) = [cosh(u)*cos(v), sinh(u)*sin(v)]
-pt = [1,2]
-val = det(ForwardDiff.jacobian(v -> G(v...), pt))
-numericq(val)
-```
-
-###### Question
-
-Let $G(u, v) = \langle \cosh(u)\cos(v), \sinh(u)\sin(v) \rangle$.
Compute the determinant of the Jacobian symbolically:
-
-```julia; hold=true; echo=false
-choices = [
-raw" ``\sin^{2}{\left (v \right )} \cosh^{2}{\left (u \right )} + \cos^{2}{\left (v \right )} \sinh^{2}{\left (u \right )}``",
-raw" ``1``",
-raw" ``\sinh(u)\cosh(v)``"
-]
-answ = 1
-radioq(choices, answ)
-```
-
-
-###### Question
-
-Compute the determinant of the Jacobian of the composition of a clockwise rotation by $\theta$, a reflection through the $x$ axis, and then a translation by $\langle a,b\rangle$, using the fact that the Jacobian determinant of a *composition* can be written as the product of the determinants of the individual Jacobians.
-
-```julia; hold=true; echo=false
-choices = [
-L"It is $1$, as each is area preserving",
-L"It is $r$, as the rotation uses polar coordinates",
-L"It is $r^2 \sin(\phi)$, as the rotations use spherical coordinates"
-]
-answ = 1
-radioq(choices, answ)
-```
-
-###### Question
-
-A wedge, $R$, is specified by $0 \leq r \leq a$, $0 \leq \theta \leq b$.
-
-```julia; hold=true; echo=false
-@syms r theta a b
-x = r*cos(theta)
-y = r*sin(theta)
-A = integrate(r, (r, 0, a), (theta, 0, b))
-B = integrate(x * r, (r, 0, a), (theta, 0, b))
-C = integrate(y * r, (r, 0, a), (theta, 0, b))
-```
-
-What does `A` compute?
-
-```julia; hold=true; echo=false
-choices = [
-L"The area of $R$",
-L"The value $\bar{x}$ of the centroid",
-L"The value $\bar{y}$ of the centroid",
-L"The moment of inertia of $R$ about the $x$ axis"
-]
-answ = 1
-radioq(choices, answ, keep_order=true)
-```
-
-What does $B/A$ compute?
-
-
-```julia; hold=true; echo=false
-choices = [
-L"The area of $R$",
-L"The value $\bar{x}$ of the centroid",
-L"The value $\bar{y}$ of the centroid",
-L"The moment of inertia of $R$ about the $x$ axis"
-]
-answ = 2
-radioq(choices, answ, keep_order=true)
-```
-
-
-###### Question
-
-According to [Katz](http://www.jstor.org/stable/2689856), in 1899 Cartan formalized the subject of differential forms (elements such as $dx$ or $du$).
Using the rules $dt dt = 0 = dv dv$ and $dv dt = - dt dv$, what is the product of $dx=mdt + \sqrt{1-m^2}dv$ and $dy=\sqrt{1-m^2}dt-mdv$?
-
-```julia; hold=true; echo=false
-choices = [
-raw" ``-dtdv``",
-raw" ``(1-2m^2)dt dv``",
-raw" ``m\sqrt{1-m^2}dt^2+(1-2m^2)dtdv -m\sqrt{1-m^2}dv^2``"
-]
-answ = 1
-radioq(choices, answ)
-```
diff --git a/CwJ/integral_vector_calculus/figures/Jacobian_determinant_and_distortion.png b/CwJ/integral_vector_calculus/figures/Jacobian_determinant_and_distortion.png
deleted file mode 100644
index 1cdd1e1..0000000
Binary files a/CwJ/integral_vector_calculus/figures/Jacobian_determinant_and_distortion.png and /dev/null differ
diff --git a/CwJ/integral_vector_calculus/figures/chrysler-building-in-new-york.jpg b/CwJ/integral_vector_calculus/figures/chrysler-building-in-new-york.jpg
deleted file mode 100644
index 22db33a..0000000
Binary files a/CwJ/integral_vector_calculus/figures/chrysler-building-in-new-york.jpg and /dev/null differ
diff --git a/CwJ/integral_vector_calculus/figures/chrysler-construction.jpg b/CwJ/integral_vector_calculus/figures/chrysler-construction.jpg
deleted file mode 100644
index e0a8e78..0000000
Binary files a/CwJ/integral_vector_calculus/figures/chrysler-construction.jpg and /dev/null differ
diff --git a/CwJ/integral_vector_calculus/figures/chrysler-nano-block.png b/CwJ/integral_vector_calculus/figures/chrysler-nano-block.png
deleted file mode 100644
index c231b86..0000000
Binary files a/CwJ/integral_vector_calculus/figures/chrysler-nano-block.png and /dev/null differ
diff --git a/CwJ/integral_vector_calculus/figures/curl-derivation.png b/CwJ/integral_vector_calculus/figures/curl-derivation.png
deleted file mode 100644
index 3c172db..0000000
Binary files a/CwJ/integral_vector_calculus/figures/curl-derivation.png and /dev/null differ
diff --git a/CwJ/integral_vector_calculus/figures/divergence-derivation.png b/CwJ/integral_vector_calculus/figures/divergence-derivation.png
deleted file mode 100644
index cb04ae3..0000000
Binary files a/CwJ/integral_vector_calculus/figures/divergence-derivation.png and /dev/null differ diff --git a/CwJ/integral_vector_calculus/figures/euler-rotation.png b/CwJ/integral_vector_calculus/figures/euler-rotation.png deleted file mode 100644 index a6b9de0..0000000 Binary files a/CwJ/integral_vector_calculus/figures/euler-rotation.png and /dev/null differ diff --git a/CwJ/integral_vector_calculus/figures/jiffy-pop.jpg b/CwJ/integral_vector_calculus/figures/jiffy-pop.jpg deleted file mode 100644 index 5a5def6..0000000 Binary files a/CwJ/integral_vector_calculus/figures/jiffy-pop.jpg and /dev/null differ diff --git a/CwJ/integral_vector_calculus/figures/jiffy-pop.png b/CwJ/integral_vector_calculus/figures/jiffy-pop.png deleted file mode 100644 index 7a82c95..0000000 Binary files a/CwJ/integral_vector_calculus/figures/jiffy-pop.png and /dev/null differ diff --git a/CwJ/integral_vector_calculus/figures/kapoor-cloud-gate.jpg b/CwJ/integral_vector_calculus/figures/kapoor-cloud-gate.jpg deleted file mode 100644 index 373f48e..0000000 Binary files a/CwJ/integral_vector_calculus/figures/kapoor-cloud-gate.jpg and /dev/null differ diff --git a/CwJ/integral_vector_calculus/figures/spherical-coordinates.png b/CwJ/integral_vector_calculus/figures/spherical-coordinates.png deleted file mode 100644 index 94d47a4..0000000 Binary files a/CwJ/integral_vector_calculus/figures/spherical-coordinates.png and /dev/null differ diff --git a/CwJ/integral_vector_calculus/figures/strang-slicing.png b/CwJ/integral_vector_calculus/figures/strang-slicing.png deleted file mode 100644 index 4065e75..0000000 Binary files a/CwJ/integral_vector_calculus/figures/strang-slicing.png and /dev/null differ diff --git a/CwJ/integral_vector_calculus/figures/surface-integral-cell.png b/CwJ/integral_vector_calculus/figures/surface-integral-cell.png deleted file mode 100644 index 6cb4a1f..0000000 Binary files a/CwJ/integral_vector_calculus/figures/surface-integral-cell.png and /dev/null differ diff --git 
a/CwJ/integral_vector_calculus/line_integrals.jmd b/CwJ/integral_vector_calculus/line_integrals.jmd
deleted file mode 100644
index 3984468..0000000
--- a/CwJ/integral_vector_calculus/line_integrals.jmd
+++ /dev/null
@@ -1,1315 +0,0 @@
-# Line and Surface Integrals
-
-This section uses these add-on packages:
-
-
-```julia
-using CalculusWithJulia
-using Plots
-using QuadGK
-using SymPy
-using HCubature
-```
-
-```julia; echo=false; results="hidden"
-using CalculusWithJulia.WeaveSupport
-
-const frontmatter = (
-    title = "Line and Surface Integrals",
-    description = "Calculus with Julia: Line and Surface Integrals",
-    tags = ["CalculusWithJulia", "integral_vector_calculus", "line and surface integrals"],
-);
-nothing
-```
-
-----
-
-This section discusses generalizations of the one- and two-dimensional definite integral. These two integrals integrate a function over a one- or two-dimensional region (e.g., $[a,b]$ or $[a,b]\times[c,d]$). The generalization is to change this region to a one-dimensional path in $R^n$ or a two-dimensional surface in $R^3$.
-
-To fix notation, consider $\int_a^b f(x)dx$ and $\int_a^b\int_c^d g(x,y) dy dx$. In defining both, a Riemann sum is involved; these sums involve a partition of $[a,b]$ or $[a,b]\times[c,d]$ and terms like $f(c_i) \Delta{x_i}$ and $g(c_i, d_j) \Delta{x_i}\Delta{y_j}$. The $\Delta$s refer to the diameters of the intervals $I_i$ or $J_j$. Consider now two parameterizations: $\vec{r}(t)$ for $t$ in $[a,b]$ and $\Phi(u,v)$ for $(u,v)$ in $[a,b]\times[c,d]$. One is a parameterization of a space curve, $\vec{r}:R\rightarrow R^n$; the other a parameterization of a surface, $\Phi:R^2 \rightarrow R^3$. The *image* of $I_i$ or $I_i\times{J_j}$ under $\vec{r}$ and $\Phi$, respectively, will look *almost* linear if the intervals are small enough, at least on the microscopic level.
A Riemann term can be based around this fact, provided it is understood how much the two parameterizations change the interval $I_i$ or region $I_i\times{J_j}$.
-
-This chapter will quantify this change, describing it in terms of vectors associated with $\vec{r}$ and $\Phi$, yielding formulas for an integral of a *scalar* function along a path or over a surface. Furthermore, these integrals will be generalized to give meaning to physically useful interactions between the path or surface and a vector field.
-
-
-
-
-
-## Line integrals
-
-In [arc length](../integrals/arc-length.html) a formula for the
-arc-length of the graph of a univariate function or parameterized
-curve in $2$ dimensions is given in terms of an integral. The
-intuitive approximation involved segments of the curve. To review, let
-$\vec{r}(t)$, $a \leq t \leq b$, describe a curve, $C$, in $R^n$, $n
-\geq 2$. Partition $[a,b]$ into $a=t_0 < t_1 < \cdots < t_{n-1} < t_n =
-b$.
-
-Consider the path segment connecting $\vec{r}(t_{i-1})$ to $\vec{r}(t_i)$. If the partition of $[a,b]$ is microscopically small, this path will be *approximated* by $\vec{r}(t_i) - \vec{r}(t_{i-1})$. This difference in turn is approximately $\vec{r}'(t_i) (t_i - t_{i-1}) = \vec{r}'(t_i) \Delta{t}_i$, provided $\vec{r}$ is differentiable.
-
-
-Let $f:R^n \rightarrow R$ be a scalar function. Taking right-hand endpoints, we can consider the Riemann sum $\sum (f\circ\vec{r})(t_i) \|\vec{r}'(t_i)\| \Delta{t}_i$. For integrable functions, this sum converges to the *line integral* defined as a one-dimensional integral for a given parameterization:
-
-```math
-\int_a^b f(\vec{r}(t)) \| \vec{r}'(t) \| dt.
-```
-
-The weight $\| \vec{r}'(t) \|$ can be interpreted by how much the parameterization stretches (or contracts) an interval $[t_{i-1},t_i]$ when mapped to its corresponding path segment.
-
-----
-
-
-The curve $C$ can be parameterized many different ways by introducing a function $s(t)$ to change the time.
If we use the arc-length parameterization with $\gamma(0) = a$ and $\gamma(l) = b$, where $l$ is the arc-length of $C$, then we have by the change of variables $t = \gamma(s)$ that
-
-```math
-\int_a^b f(\vec{r}(t)) \| \vec{r}'(t) \| dt =
-\int_0^l (f \circ \vec{r} \circ \gamma)(s) \| \frac{d\vec{r}}{dt}\mid_{t = \gamma(s)}\| \gamma'(s) ds.
-```
-
-But, by the chain rule:
-
-```math
-\frac{d(\vec{r} \circ\gamma)}{ds}(s) = \frac{d\vec{r}}{dt}\mid_{t=\gamma(s)} \frac{d\gamma}{ds}.
-```
-
-Since $\gamma$ is increasing, $\gamma' \geq 0$, so we get:
-
-```math
-\int_a^b f(\vec{r}(t)) \| \vec{r}'(t) \| dt =
-\int_0^l (f \circ \vec{r} \circ \gamma)(s) \|\frac{d(\vec{r}\circ\gamma)}{ds}\| ds =
-\int_0^l (f \circ \vec{r} \circ \gamma)(s) ds.
-```
-
-The last equality holds as the derivative of an arc-length parameterization is the unit tangent vector, $\hat{T}$, with norm $1$.
-
-This shows that the line integral is *not* dependent on the parameterization. The notation $\int_C f ds$ is used to represent the line integral of a scalar function, the $ds$ emphasizing an implicit parameterization of $C$ by arc-length. When $C$ is a closed curve, the notation $\oint_C f ds$ is used to indicate that.
-
-##### Example
-
-
-When $f$ is identically $1$, the line integral returns the arc length. When $f$ varies, then the line integral can be interpreted in a few ways. First, if $f \geq 0$ and we consider a sheet hung from the curve $f\circ \vec{r}$ and cut to just touch the ground, the line integral gives the area of this sheet, in the same way an integral gives the area under a positive curve.
-
-If the composition $f \circ \vec{r}$ is viewed as a density of the arc (as though it were constructed out of some non-uniform material), then the line integral can be seen to return the mass of the arc.
-
-Suppose $\rho(x,y,z) = 5 - z$ gives the density of an arc where the arc is parameterized by $\vec{r}(t) = \langle \cos(t), 0, \sin(t) \rangle$, $0 \leq t \leq \pi$. (A half-circular arc.) Find the mass of the arc.
-
-```julia;
-rho(x,y,z) = 5 - z
-rho(v) = rho(v...)
-r(t) = [cos(t), 0, sin(t)]
-
-@syms t
-rp = diff.(r(t), t) # r'
-mass = integrate((rho ∘ r)(t) * norm(rp), (t, 0, PI))
-```
-
-Continuing, we could find the center of mass in the $z$ direction by integrating $\int_C z (\rho\circ \vec{r}) \|\vec{r}'\| dt$:
-
-```julia;
-Mz = integrate(r(t)[3] * (rho ∘ r)(t) * norm(rp), (t, 0, PI))
-Mz
-```
-
-Finally, we get the center of mass by
-
-```julia;
-Mz / mass
-```
-
-##### Example
-
-Let $f(x,y,z) = x\sin(y)\cos(z)$ and $C$ the path described by $\vec{r}(t) = \langle t, t^2, t^3\rangle$ for $0 \leq t \leq \pi$. Find the line integral $\int_C fds$.
-
-We find the numeric value with:
-
-```julia; hold=true
-f(x,y,z) = x*sin(y)*cos(z)
-f(v) = f(v...)
-r(t) = [t, t^2, t^3]
-integrand(t) = (f ∘ r)(t) * norm(r'(t))
-quadgk(integrand, 0, pi)
-```
-
-
-##### Example
-
-Imagine the $z$ axis is a wire and in the $x$-$y$ plane the unit circle is a path. If there is a magnetic field, $B$, then the field will induce a current to flow along the wire. [Ampere's](https://tinyurl.com/y4gl9pgu) circuital law states $\oint_C B\cdot\hat{T} ds = \mu_0 I$, where $\mu_0$ is a constant and $I$ the current. If the magnetic field is given by $B=(x^2+y^2)^{-1/2}\langle -y,x,0\rangle$, compute $I$ in terms of $\mu_0$.
-
-We have the path parameterized by $\vec{r}(t) = \langle \cos(t), \sin(t), 0\rangle$, and so $\hat{T} = \langle -\sin(t), \cos(t), 0\rangle$ and the integrand, $B\cdot\hat{T}$, is
-
-```math
-(x^2 + y^2)^{-1/2}\langle -\sin(t), \cos(t), 0\rangle\cdot
-\langle -\sin(t), \cos(t), 0\rangle = (x^2 + y^2)^{-1/2},
-```
-
-which is $1$ on the path $C$. So $\int_C B\cdot\hat{T} ds = \int_C ds = 2\pi$. So the current satisfies $2\pi = \mu_0 I$, so $I = (2\pi)/\mu_0$.
-
-(Ampere's law is more typically used to find $B$ from a known current $I$ in special circumstances with symmetry. The Biot-Savart law does this more generally.)
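-
-This value can also be checked numerically with `quadgk` (a quick sketch; `Bfield` is just an illustrative name for the field above):
-
-```julia; hold=true
-Bfield(x, y, z) = (x^2 + y^2)^(-1/2) * [-y, x, 0]
-Bfield(v) = Bfield(v...)
-r(t) = [cos(t), sin(t), 0]
-rp(t) = [-sin(t), cos(t), 0]  # r'(t); on the unit circle this is also the unit tangent
-quadgk(t -> (Bfield ∘ r)(t) ⋅ rp(t), 0, 2pi)  # the value 2π, with an error estimate
-```
-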
-
-### Line integrals and vector fields; work and flow
-
-As defined above, the line integral is defined for a scalar function, but this can be generalized. If $F:R^n \rightarrow R^n$ is a vector field, then each component is a scalar function, so the integral $\int (F\circ\vec{r}) \|\vec{r}'\| dt$ can be defined component by component to yield a vector.
-
-However, it proves more interesting to define an integral
-incorporating how properties of the path interact with the vector
-field. The key is that $\vec{r}'(t) dt = \hat{T} \| \vec{r}'(t)\|dt$ describes both the magnitude of how the parameterization stretches an interval and a direction the path is taking. This direction allows interaction with the vector field.
-
-The canonical example is [work](https://en.wikipedia.org/wiki/Work_(physics)), which is a measure of a
-force times a distance. For an object following a path, the work done is still a force times a distance, but only the force component in the direction of the motion is considered. (The *constraint force* keeping the object on the path does no work.) Mathematically, $\hat{T}$ describes the direction of motion along a path, so the work done in moving an object over a small segment of the path is $(F\cdot\hat{T}) \Delta{s}$. Adding up incremental amounts of work leads to a Riemann sum for a line integral involving a vector field.
-
-> The *work* done in moving an object along a path $C$ by a force field, $F$, is given by the integral
-> ```math
-> \int_C (F \cdot \hat{T}) ds = \int_C F\cdot d\vec{r} = \int_a^b ((F\circ\vec{r}) \cdot \frac{d\vec{r}}{dt})(t) dt.
-> ```
-
-
-----
-
-
-In the $n=2$ case, there is another useful interpretation of the line integral.
-In this dimension the normal vector, $\hat{N}$, is well defined in terms of the tangent vector, $\hat{T}$, through a rotation:
-$\langle a,b\rangle^t = \langle b,-a\rangle$.
(The negative, $\langle -b,a\rangle$, is also a candidate; the difference in this choice would lead to a sign difference in the answer.)
-This allows the definition of a different line integral, called a flow integral, as detailed later:
-
-> The *flow* across a curve $C$ is given by
-> ```math
-> \int_C (F\cdot\hat{N}) ds = \int_a^b (F \circ \vec{r})(t) \cdot (\vec{r}'(t))^t dt.
-> ```
-
-
-### Examples
-
-##### Example
-
-Let $F(x,y,z) = \langle x - y, x^2 - y^2, x^2 - z^2 \rangle$ and
-$\vec{r}(t) = \langle t, t^2, t^3 \rangle$. Find the work required to move an object along the curve described by $\vec{r}$ between $0$ and $1$.
-
-```julia; hold=true
-F(x,y,z) = [x-y, x^2 - y^2, x^2 - z^2]
-F(v) = F(v...)
-r(t) = [t, t^2, t^3]
-
-@syms t::real
-integrate((F ∘ r)(t) ⋅ diff.(r(t), t), (t, 0, 1))
-```
-
-
-##### Example
-
-Let $C$ be a closed curve. For a closed curve, the work integral is also termed the *circulation*. For the vector field $F(x,y) = \langle -y, x\rangle$ compute the circulation around the triangle with vertices $(-1,0)$, $(1,0)$, and $(0,1)$.
-
-We have three integrals using $\vec{r}_1(t) = \langle -1+2t, 0\rangle$,
-$\vec{r}_2(t) = \langle 1-t, t\rangle$, and
-$\vec{r}_3(t) = \langle -t, 1-t \rangle$, all from $0$ to $1$. (Check that the parameterization is counterclockwise.)
-
-The circulation then is:
-
-```julia; hold=true
-r1(t) = [-1 + 2t, 0]
-r2(t) = [1-t, t]
-r3(t) = [-t, 1-t]
-F(x,y) = [-y, x]
-F(v) = F(v...)
-integrand(r) = t -> (F ∘ r)(t) ⋅ r'(t)
-C1 = quadgk(integrand(r1), 0, 1)[1]
-C2 = quadgk(integrand(r2), 0, 1)[1]
-C3 = quadgk(integrand(r3), 0, 1)[1]
-C1 + C2 + C3
-```
-
-That this is non-zero reflects a feature of the vector field. In this case, the vector field spirals around the origin, and the circulation is non-zero.
-
-##### Example
-
-Let $F$ be the force of gravity exerted by a mass $M$ on a mass $m$ at position $\vec{r}$, that is $F(\vec{r}) = -(GMm/\|\vec{r}\|^2)\hat{r}$.
-
-Let $\vec{r}(t) = \langle 1-t, 0, t\rangle$, $0 \leq t \leq 1$. For concreteness, we take $G M m$ to be $10$. Then the work to move the mass is given by:
-
-```julia;
-uvec(v) = v/norm(v) # unit vector
-GMm = 10
-Fₘ(r) = - GMm / norm(r)^2 * uvec(r)
-rₘ(t) = [1-t, 0, t]
-quadgk(t -> (Fₘ ∘ rₘ)(t) ⋅ rₘ'(t), 0, 1)
-```
-
-Hmm, a value of $0$. That's a bit surprising at first glance. Maybe it had something to do with the specific path chosen. To investigate, we connect the start and end points with a circular arc, instead of a straight line:
-
-
-```julia;
-rₒ(t) = [cos(t), 0, sin(t)]
-quadgk(t -> (Fₘ ∘ rₒ)(t) ⋅ rₒ'(t), 0, pi/2)
-```
-
-Still $0$. We will see next that this is not surprising if something about $F$ is known.
-
-!!! note
-    The [Washington Post](https://www.washingtonpost.com/outlook/everything-you-thought-you-knew-about-gravity-is-wrong/2019/08/01/627f3696-a723-11e9-a3a6-ab670962db05_story.html) had an article by Richard Panek with the quote "Well, yes — depending on what we mean by 'attraction.' Two bodies of mass don’t actually exert some mysterious tugging on each other. Newton himself tried to avoid the word 'attraction' for this very reason. All (!) he was trying to do was find the math to describe the motions both down here on Earth and up there among the planets (of which Earth, thanks to Copernicus and Kepler and Galileo, was one)." The point being the formula above is a mathematical description of the force, but not an explanation of how the force actually is transferred.
-
-
-#### Work in a *conservative* vector field
-
-Let $f: R^n \rightarrow R$ be a scalar function. Its gradient, $\nabla f$, is a *vector field*. For a *scalar* function, we have by the chain rule:
-
-```math
-\frac{d(f \circ \vec{r})}{dt} = \nabla{f}(\vec{r}(t)) \cdot \frac{d\vec{r}}{dt}.
-```
-
-If we integrate, we see:
-
-```math
-W = \int_a^b \nabla{f}(\vec{r}(t)) \cdot \frac{d\vec{r}}{dt} dt =
-\int_a^b \frac{d(f \circ \vec{r})}{dt} dt =
-(f\circ\vec{r})\mid_{t = a}^b =
-(f\circ\vec{r})(b) - (f\circ\vec{r})(a),
-```
-using the Fundamental Theorem of Calculus.
-
-The main point above is that *if* the vector field is the gradient of a scalar field, then the work done depends *only* on the endpoints of the path and not the path itself.
-
-> **Conservative vector field**:
-> If $F$ is a vector field defined in an *open* region $R$; $A$ and $B$ are points in $R$ and *if* for *any* curve $C$ in $R$ connecting $A$ to $B$, the line integral of $F \cdot \vec{T}$ over $C$ depends *only* on the endpoints $A$ and $B$ and not the path, then the line integral is called *path independent* and the field is called a *conservative field*.
-
-The force of gravity is the gradient of a scalar field. As such, the two integrals above which yield $0$ could have been computed more directly. The particular scalar field is $f = GMm/\|\vec{r}\|$, the negative of which goes by the name the gravitational *potential* function. As seen, $f$ depends only on magnitude, and as the endpoints of the path in the example have the same distance to the origin, the work integral, $(f\circ\vec{r})(b) - (f\circ\vec{r})(a)$, will be $0$.
-
-
-##### Example
-
-Coulomb's law states that the electrostatic force between two charged particles is proportional to the product of their charges and *inversely* proportional to the square of the distance between the two particles. That is,
-
-```math
-F = k\frac{ q q_0}{\|\vec{r}\|^2}\frac{\vec{r}}{\|\vec{r}\|}.
-```
-
-This is similar to gravitational force and is a *conservative force*. We saw that a line integral for work in a conservative force depends only on the endpoints. Verify that for a closed loop the work integral will yield $0$.
-
-Take as a closed loop the unit circle, parameterized by arc-length by $\vec{r}(t) = \langle \cos(t), \sin(t)\rangle$.
The unit tangent will be $\hat{T} = \vec{r}'(t) = \langle -\sin(t), \cos(t) \rangle$. The work to move a particle of charge $q_0$ about a particle of charge $q$ at the origin around the unit circle would be computed through:
-
-```julia; hold=true
-@syms k q q0 t
-F(r) = k*q*q0 * r / norm(r)^3
-r(t) = [cos(t), sin(t)]
-T(r) = [-r[2], r[1]]
-W = integrate(F(r(t)) ⋅ T(r(t)), (t, 0, 2PI))
-```
-
-### Closed curves and regions
-
-There are technical assumptions about curves and regions that are necessary for some statements to be made:
-
-* Let $C$ be a [Jordan](https://en.wikipedia.org/wiki/Jordan_curve_theorem) curve - a non-self-intersecting continuous loop in the plane. Such a curve divides the plane into two regions, one bounded and one unbounded. The normal to a Jordan curve is assumed to be in the direction of the unbounded part.
-
-* Further, we will assume that our curves are *piecewise smooth*. That is, they are comprised of finitely many smooth pieces, continuously connected.
-
-* The region enclosed by a closed curve has an *interior*, $D$, which we assume is an *open* set (one for which every point in $D$ has some "ball" about it entirely within $D$ as well).
-
-* The region $D$ is *connected*, meaning between any two points there is a continuous path in $D$ between the two points.
-
-* The region $D$ is *simply connected*. This means it has no "holes." Technically, any path in $D$ can be contracted to a point. Connected means one piece; simply connected means no holes.
-
-
-### The fundamental theorem of line integrals
-
-The fact that work in a potential field is path independent is a consequence of the Fundamental Theorem of Line [Integrals](https://en.wikipedia.org/wiki/Gradient_theorem):
-
-> Let $U$ be an open subset of $R^n$, $f: U \rightarrow R$ a *differentiable* function and $\vec{r}: R \rightarrow R^n$ a differentiable function such that the path $C = \vec{r}(t)$, $a\leq t\leq b$, is contained in $U$.
Then -> ```math -> \int_C \nabla{f} \cdot d\vec{r} = -> \int_a^b \nabla{f}(\vec{r}(t)) \cdot \vec{r}'(t) dt = -> f(\vec{r}(b)) - f(\vec{r}(a)). -> ``` - - - -That is, a line integral through a gradient field can be evaluated by -evaluating the original scalar field at the endpoints of the -curve. In other words, line integrals through gradient fields are conservative. - -Are conservative fields gradient fields? The answer is yes. - -Assume $U$ is an open region in $R^n$ and $F$ is a continuous and conservative vector field in $U$. - -Let $a$ in $U$ be some fixed point. For $\vec{x}$ in $U$, define: - -```math -\phi(\vec{x}) = \int_{\vec\gamma[a,\vec{x}]} F \cdot \frac{d\vec\gamma}{dt}dt, -``` - -where $\vec\gamma$ is *any* differentiable path in $U$ connecting $a$ to -$\vec{x}$ (as a point in $U$). The function $\phi$ is uniquely -defined, as the integral only depends on the endpoints, not the choice -of path. - -It is [shown](https://en.wikipedia.org/wiki/Gradient_theorem#Converse_of_the_gradient_theorem) that the directional derivative $\nabla{\phi} \cdot \vec{v}$ is equal to $F \cdot \vec{v}$ by showing - -```math -\lim_{t \rightarrow 0}\frac{\phi(\vec{x} + t\vec{v}) - \phi(\vec{x})}{t} -= \lim_{t \rightarrow 0} \frac{1}{t} \int_{\vec\gamma[\vec{x},\vec{x}+t\vec{v}]} F \cdot \frac{d\vec\gamma}{dt}dt -= F(\vec{x}) \cdot \vec{v}. -``` - -This is so for all $\vec{v}$, so in particular for the coordinate vectors. So $\nabla\phi = F$. - - -##### Example - -Let $Radial(x,y) = \langle x, y\rangle$. This is a conservative field. Show the work integral over the half circle in the upper half plane is the same as the work integral over the $x$ axis connecting $-1$ to $1$. - -We have: - -```julia; -Radial(x,y) = [x,y] -Radial(v) = Radial(v...) 
-
-r₁(t) = [-1 + t, 0]
-quadgk(t -> Radial(r₁(t)) ⋅ r₁'(t), 0, 2)
-```
-
-Compared to
-
-```julia;
-r₂(t) = [-cos(t), sin(t)]
-quadgk(t -> Radial(r₂(t)) ⋅ r₂'(t), 0, pi)
-```
-
-##### Example
-
-
-
-
-
-----
-
-Not all vector fields are conservative. How can a vector field in $U$
-be identified as conservative? For now, this would require either
-finding a scalar potential *or* showing all line integrals are path
-independent.
-
-In dimension $2$ there is an easy-to-check method assuming $U$ is *simply connected*: if $F=\langle F_x, F_y\rangle$ is continuously differentiable in a simply connected region *and* $\partial{F_y}/\partial{x} - \partial{F_x}/\partial{y} = 0$, then $F$ is conservative. A similar statement is available in dimension $3$. The reasoning behind this will come from the upcoming Green's theorem.
-
-
-
-
-### Flow across a curve
-
-
-The flow integral in the $n=2$ case was
-
-```math
-\int_C (F\cdot\hat{N}) ds = \int_a^b (F \circ \vec{r})(t) \cdot (\vec{r}'(t))^{t} dt,
-```
-
-where $\langle a,b\rangle^t = \langle b, -a\rangle$.
-
-
-For a given section of $C$, the vector field breaks down into a
-tangential and a normal component. The tangential component moves along
-the curve and so doesn't contribute to any flow *across* the curve; only
-the normal component will contribute. Hence the
-$F\cdot\hat{N}$ integrand. The following figure indicates the flow of a vector
-field by horizontal lines, the closeness of the lines representing
-strength, though here they are all evenly spaced.
The two line segments
-have equal length, but one captures more flow than the other, as
-its normal vector is more nearly parallel to the flow lines:
-
-```julia; hold=true; echo=false
-p = plot(legend=false, aspect_ratio=:equal)
-for y in range(0, 1, length=15)
-    arrow!( [0,y], [3,0])
-end
-plot!(p, [2,2],[.6, .9], linewidth=3)
-arrow!( [2,.75],1/2*[1,0], linewidth=3)
-theta = pi/3
-l = .3/2
-plot!(p, [2-l*cos(theta), 2+l*cos(theta)], [.25-l*sin(theta), .25+l*sin(theta)], linewidth=3)
-arrow!( [2, 0.25], 1/2*[sin(theta), -cos(theta)], linewidth=3)
-
-p
-```
-
-
-The flow integral is typically computed for a closed (Jordan) curve, measuring the total flow out of a region. In this case, the integral is written $\oint_C (F\cdot\hat{N})ds$.
-
-
-!!! note
-    For a Jordan curve, the positive orientation of the curve is such that the normal direction (proportional to $\hat{T}'$) points away from the bounded interior. For a non-closed path, the choice of parameterization will determine the normal, and the integral for flow across a curve is dependent - up to its sign - on this choice.
-
-
-##### Example
-
-The [New York Times](https://www.nytimes.com/interactive/2019/06/20/world/asia/hong-kong-protest-size.html) showed aerial photos to estimate the number of protest marchers in Hong Kong. This is a more precise way to estimate crowd size, but requires a drone or some such to take photos. If one is on the ground, the number of marchers could be *estimated* by finding the flow of marchers across a given width. In the Times article, we see "Protestors packed the width of Hennessy Road for more than 5 hours." If this road is 40 meters wide and the rate of the marchers is 3 kilometers per hour, estimate the number of marchers.
-
-The basic idea is to compute the rate of flow *across* a part of the street and then multiply by time. For computational sake, say the marchers are on a grid of 1 meter (that is, in a 40m-wide street there is room for 40 marchers abreast).
In one minute, the marchers move 50 meters:
-
-```julia;
-3000/60
-```
-
-This means the rate of marchers per minute is `40 * 50`. If this is steady over 5 hours, this *simple* count gives:
-
-```julia;
-40 * 50 * 5 * 60
-```
-
-This is short of the estimated 2 million marchers, but useful for a rough estimate. The point is that from rates of flow, which can be calculated locally, amounts over bigger scales can be computed. The word "*across*" is used, as only the direction across the part of the street counts in the computation. Were the marchers in total unison and told to take a step to the left and then a step to the right, they would have motion, but since it wasn't across the line in the road (rather along the line) there would be no contribution to the count. The dot product with the normal vector formalizes this.
-
-##### Example
-
-Let a path $C$ be parameterized by $\vec{r}(t) = \langle \cos(t), 2\sin(t)\rangle$, $0 \leq t \leq \pi/2$ and $F(x,y) = \langle \cos(x), \sin(xy)\rangle$. Compute the flow across $C$.
-
-We have
-
-```julia; hold=true
-r(t) = [cos(t), 2sin(t)]
-F(x,y) = [cos(x), sin(x*y)]
-F(v) = F(v...)
-normal(a,b) = [b, -a]
-G(t) = (F ∘ r)(t) ⋅ normal(r'(t)...)  # rotate the *tangent* vector to get the normal
-a, b = 0, pi/2
-quadgk(G, a, b)[1]
-```
-
-##### Example
-
-Let $F(x,y) = \langle -y, x\rangle$ be a vector field. (It represents a rotational flow.) What is the flow across the unit circle?
-
-```julia; hold=true
-@syms t::real
-F(x,y) = [-y,x]
-F(v) = F(v...)
-r(t) = [cos(t),sin(t)]
-T(t) = diff.(r(t), t)
-normal(a,b) = [b,-a]
-integrate((F ∘ r)(t) ⋅ normal(T(t)...) , (t, 0, 2PI))
-```
-
-
-
-##### Example
-
-Let $F(x,y) = \langle x,y\rangle$ be a vector field. (It represents a *source*.) What is the flow across the unit circle?
-
-```julia; hold=true
-@syms t::real
-F(x,y) = [x, y]
-F(v) = F(v...)
-r(t) = [cos(t),sin(t)]
-T(t) = diff.(r(t), t)
-normal(a,b) = [b,-a]
-integrate((F ∘ r)(t) ⋅ normal(T(t)...)
, (t, 0, 2PI))
-```
-
-##### Example
-
-
-Let $F(x,y) = \langle x, y\rangle / \| \langle x, y\rangle\|^2$:
-
-```julia;
-F₁(x,y) = [x,y] / norm([x,y])^2
-F₁(v) = F₁(v...)
-```
-
-Consider $C$ to be the square with vertices at $(-1,-1)$, $(1,-1)$, $(1,1)$, and $(-1, 1)$. What is the flow across $C$ for this vector field? The region has simple outward pointing *unit* normals, these being $\pm\hat{i}$ and $\pm\hat{j}$, the unit vectors in the $x$ and $y$ direction. The integral can be computed in 4 parts. The first (along the bottom):
-
-```julia; hold=true
-@syms s::real
-
-r(s) = [-1 + s, -1]
-n = [0,-1]
-A1 = integrate(F₁(r(s)) ⋅ n, (s, 0, 2))
-
-# The other three sides are similar, as each parameterization and normal is related:
-
-r(s) = [1, -1 + s]
-n = [1, 0]
-A2 = integrate(F₁(r(s)) ⋅ n, (s, 0, 2))
-
-
-r(s) = [1 - s, 1]
-n = [0, 1]
-A3 = integrate(F₁(r(s)) ⋅ n, (s, 0, 2))
-
-
-r(s) = [-1, 1-s]
-n = [-1, 0]
-A4 = integrate(F₁(r(s)) ⋅ n, (s, 0, 2))
-
-A1 + A2 + A3 + A4
-```
-
-As could have been anticipated by symmetry, the answer is simply `4A1` or $2\pi$. What likely is not anticipated is that this integral will be the same as that found by integrating over the unit circle (an easier integral):
-
-```julia; hold=true
-@syms t::real
-r(t) = [cos(t), sin(t)]
-N(t) = r(t)
-integrate(F₁(r(t)) ⋅ N(t), (t, 0, 2PI))
-```
-
-This equivalence is a consequence of the upcoming Green's theorem, as the vector field satisfies a particular equation.
-
-
-
-
-## Surface integrals
-
-```julia; hold=true; echo=false
-#out = download("https://upload.wikimedia.org/wikipedia/en/c/c1/Cloud_Gate_%28The_Bean%29_from_east%27.jpg")
-#cp(out, "figures/kapoor-cloud-gate.jpg")
-imgfile = "figures/kapoor-cloud-gate.jpg"
-caption = """
-The Anish Kapoor sculpture Cloud Gate maps the Cartesian grid formed by its concrete resting pad onto a curved surface, showing the local distortions.
Knowing the areas of the reflected grid after distortion would allow the computation of the surface area of the sculpture through addition. (Wikipedia)
-"""
-ImageFile(:integral_vector_calculus, imgfile, caption)
-```
-
-
-We next turn our attention to a generalization of line integrals to surface integrals. Surfaces were described in one of three ways: directly through a function as $z=f(x,y)$, as a level curve through $f(x,y,z) = c$, and parameterized through a function $\Phi: R^2 \rightarrow R^3$. The level curve description is locally a function description, and the function description leads to a parameterization ($\Phi(u,v) = \langle u,v,f(u,v)\rangle$), so we restrict to the parameterized case.
-
-
-
-
-Consider the figure of the surface described by $\Phi(u,v) = \langle u,v,f(u,v)\rangle$:
-
-
-```julia; hold=true; echo=false
-f(x,y) = 2 - (x+1/2)^2 - y^2
-xs = ys = range(0, 1/2, length=10)
-p = surface(xs, ys, f, legend=false, camera=(45,45))
-for x in xs
-    plot!(p, unzip(y -> [x, y, f(x,y)], 0, 1/2)..., linewidth=3)
-    plot!(p, unzip(y -> [x, y, 0], 0, 1/2)..., linewidth=3)
-end
-for y in ys
-    plot!(p, unzip(x -> [x, y, f(x,y)], 0, 1/2)..., linewidth=3)
-    plot!(p, unzip(x -> [x, y, 0], 0, 1/2)..., linewidth=3)
-end
-p
-```
-
-The partitioning of the $u-v$ plane into a grid lends itself to a partitioning of the surface. To compute the total *surface area*, it would be natural to begin by *approximating* the area of each cell of this partition and adding. As with other sums, we would expect that as the cells got smaller in diameter, the sum would approach an integral, in this case an integral yielding the surface area.
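-
-Symbolically, if the cell with corner $(u_i, v_j)$ maps to a patch of surface area $\Delta{S}_{ij}$, the anticipated limit is a Riemann sum converging to an integral (using the surface-element approximation developed next):
-
-```math
-\text{Surface area} = \lim \sum_{i,j} \Delta{S}_{ij}
-\approx \lim \sum_{i,j} \| \frac{\partial{\Phi}}{\partial{u}} \times \frac{\partial{\Phi}}{\partial{v}} \| \Delta{u}\Delta{v}.
-```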
- -Consider a single cell: - -```julia; hold=true; echo=false; -# from https://commons.wikimedia.org/wiki/File:Surface_integral1.svg -#cp(download("https://upload.wikimedia.org/wikipedia/commons/thumb/8/87/Surface_integral1.svg/500px-Surface_integral1.svg.png"), "figures/surface-integral-cell.png", force=true) -#imgfile = "figures/surface-integral-cell.png" -#caption = "The rectangular region maps to a piece of the surface approximated by #a parallelogram whose area can be computed. (Wikipedia)" -#ImageFile(:integral_vector_calculus, imgfile, caption) -nothing -``` - -```julia; hold=true; echo=false -f(x,y)= .5 - ((x-2)/4)^2 - ((y-1)/3)^2 -Phi(uv) = [uv[1],uv[2],f(uv...)] - -xs = range(0, 3.5, length=50) -ys = range(0, 2.5, length=50) -surface(xs,ys, f, legend=false) -Δx = 0.5; Δy = 0.5 -x0 = 2.5; y0 = 0.25 - -ps = [[x0,y0,0], [x0+Δx,y0,0],[x0+Δx,y0+Δy,0],[x0, y0+Δy, 0],[x0,y0,0]] -plot!(unzip(ps)..., seriestype=:shape, color =:blue) - -fx = t -> [x0+t, y0, f(x0+t, y0)] -fy = t -> [x0, y0+t, f(x0, y0+t)] -plot!(unzip(fx.(xs.-x0))..., color=:green) -plot!(unzip(fy.(ys.-y0))..., color=:green) -fx = t -> [x0+t, y0+Δy, f(x0+t, y0+Δy)] -fy = t -> [x0+Δx, y0+t, f(x0+Δx, y0+t)] -ts = range(0, 1, length=20) -plot!(unzip(fx.(ts*Δx))..., color=:green) -plot!(unzip(fy.(ts*Δy))..., color=:green) - -Pt = [x0,y0,f(x0,y0)] -Jac = ForwardDiff.jacobian(Phi, Pt[1:2]) -v1 = Jac[:,1]; v2 = Jac[:,2] -arrow!(Pt, v1/2, linewidth=5, color=:red) -arrow!(Pt, v2/2, linewidth=5, color=:red) -arrow!(Pt + v1/2, v2/2, linewidth=1, linetype=:dashed, color=:red) -arrow!(Pt + v2/2, v1/2, linewidth=1, linetype=:dashed, color=:red) -arrow!(Pt, (1/4)*(v1 × v2), linewidth=3, color=:blue) -``` - -The figure shows that a cell on the grid in the $u-v$ plane of area $\Delta{u}\Delta{v}$ maps to a cell of the partition with surface area $\Delta{S}$ which can be *approximated* by a part of the tangent plane described by two vectors $\vec{v}_1 = \partial{\Phi}/\partial{u}$ and $\vec{v}_2 = 
\partial{\Phi}/\partial{v}$. These two vectors have a cross product which a) points in the direction of the normal vector, and b) has magnitude yielding the approximation $\Delta{S} \approx \|\vec{v}_1 \times \vec{v}_2\|\Delta{u}\Delta{v}$.
-
-If we were to integrate the function $G(x,y, z)$ over the *surface* $S$, then an approximating Riemann sum could be produced by $G(c) \| \vec{v}_1 \times \vec{v}_2\| \Delta u \Delta v$, for some point $c$ on the surface.
-
-In the limit, a definition of an *integral* over a surface $S$ in $R^3$ is found by a two-dimensional integral over $R$ in $R^2$:
-
-```math
-\int_S G(x,y,z) dS = \int_R G(\Phi(u,v))
-\| \frac{\partial{\Phi}}{\partial{u}} \times \frac{\partial{\Phi}}{\partial{v}} \| du dv.
-```
-
-In the case that the surface is described by $z = f(x,y)$, the formulas become $\vec{v}_1 = \langle 1,0,\partial{f}/\partial{x}\rangle$ and $\vec{v}_2 = \langle 0, 1, \partial{f}/\partial{y}\rangle$ with cross product $\vec{v}_1\times\vec{v}_2 =\langle -\partial{f}/\partial{x}, -\partial{f}/\partial{y},1\rangle$.
-
-The value $\| \frac{\partial{\Phi}}{\partial{u}} \times
-\frac{\partial{\Phi}}{\partial{v}} \|$ is called the *surface
-element*. As seen, it is the scaling between a unit area in the $u-v$
-plane and the approximating area on the surface after the
-parameterization.
-
-### Examples
-
-Let us see that the formula holds for some cases where the answer is known by other means.
-
-##### A cone
-
-The surface area of a cone is a known quantity. In cylindrical coordinates, the cone may be described by $z = a - br$, so the parameterization $(r, \theta) \rightarrow \langle r\cos(\theta), r\sin(\theta), a - br \rangle$ maps $T = [0, a/b] \times [0, 2\pi]$ onto the surface (less the bottom).
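-
-Before turning to the symbolic computation, the surface element for the cone can be worked out by hand. With $\vec{v}_1 = \langle \cos(\theta), \sin(\theta), -b\rangle$ and $\vec{v}_2 = \langle -r\sin(\theta), r\cos(\theta), 0\rangle$:
-
-```math
-\vec{v}_1 \times \vec{v}_2 = \langle br\cos(\theta), br\sin(\theta), r \rangle,
-\quad
-\| \vec{v}_1 \times \vec{v}_2 \| = r\sqrt{1 + b^2}.
-```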
-
-The surface element is the norm of the cross product of $\langle \cos(\theta), \sin(\theta), -b\rangle$ and $\langle -r\sin(\theta), r\cos(\theta), 0\rangle$, which is:
-
-```julia;
-@syms 𝑹::positive θ::positive 𝒂::positive 𝒃::positive
-𝒏 = [cos(θ), sin(θ), -𝒃] × [-𝑹*sin(θ), 𝑹*cos(θ), 0]
-𝒔𝒆 = simplify(norm(𝒏))
-```
-
-(To do this computationally, one might compute:
-
-```julia; hold=true
-Phi(r, theta) = [r*cos(theta), r*sin(theta), 𝒂 - 𝒃*r]
-Phi(𝑹, θ).jacobian([𝑹, θ])
-```
-
-and from here pull out the two column vectors to take a cross product.)
-
-
-The surface area is then found by integrating $G(\vec{x}) = 1$:
-
-```julia;
-integrate(1 * 𝒔𝒆, (𝑹, 0, 𝒂/𝒃), (θ, 0, 2PI))
-```
-
-A formula from a *quick* Google search is $A = \pi r(r + \sqrt{h^2 + r^2})$. Does this match up?
-
-```julia; hold=true
-𝑹 = 𝒂/𝒃; 𝒉 = 𝒂
-pi * 𝑹 * (𝑹 + sqrt(𝑹^2 + 𝒉^2)) |> simplify
-```
-
-Nope, off by a summand of $\pi(a/b)^2 = \pi r^2$, which may be recognized as the area of the base, which we did not compute, but which the Google search did. So yes, the formulas do agree.
-
-##### Example
-
-The sphere has known surface area $4\pi r^2$. Let's see if we can compute this. With the parameterization from spherical coordinates $(\theta, \phi) \rightarrow \langle r\sin\phi\cos\theta, r\sin\phi\sin\theta,r\cos\phi\rangle$, approaching this *numerically* we have:
-
-```julia; hold=true
-Rad = 1
-Phi(theta, phi) = Rad * [sin(phi)*cos(theta), sin(phi)*sin(theta), cos(phi)]
-Phi(v) = Phi(v...)
-
-function surface_element(pt)
-    Jac = ForwardDiff.jacobian(Phi, pt)
-    v1, v2 = Jac[:,1], Jac[:,2]
-    norm(v1 × v2)
-end
-out = hcubature(surface_element, (0, 0), (2pi, 1pi))
-out[1] - 4pi*Rad^2 # *basically* zero
-```
-
-
-##### Example
-
-In [Surface area](../integrals/surface_area.mmd) the following formula for the surface area of a surface of *revolution* about the $x$ axis described by $r=f(x)$ is given:
-
-```math
-\int_a^b 2\pi f(x) \cdot \sqrt{1 + f'(x)^2} dx.
-```
-
-Consider the transformation $(x, \theta) \rightarrow \langle x, f(x)\cos(\theta), f(x)\sin(\theta)\rangle$. This maps the region $[a,b] \times [0, 2\pi]$ *onto* the surface of revolution. As such, the surface element would be:
-
-```julia
-@syms 𝒇()::positive x::real theta::real
-
-Phi(x, theta) = [x, 𝒇(x)*cos(theta), 𝒇(x)*sin(theta)]
-Jac = Phi(x, theta).jacobian([x, theta])
-v1, v2 = Jac[:,1], Jac[:,2]
-se = norm(v1 × v2)
-se .|> simplify
-```
-
-This is in agreement with the previous formula.
-
-
-##### Example
-
-Consider the *upper* half sphere, $S$. Compute $\int_S z dS$.
-
-Were the half sphere made of a thin uniform material, this would be computed to find the $z$ coordinate of the centroid.
-
-We use spherical coordinates to parameterize:
-
-```math
-\Phi(\theta, \phi) = \langle \cos(\phi)\cos(\theta), \cos(\phi)\sin(\theta), \sin(\phi) \rangle
-```
-
-The Jacobian and surface element are computed and then the integral is performed:
-
-```julia; hold=true
-@syms theta::real phi::real
-Phi(theta, phi) = [cos(phi)*cos(theta), cos(phi)*sin(theta), sin(phi)]
-Jac = Phi(theta,phi).jacobian([theta, phi])
-
-v1, v2 = Jac[:,1], Jac[:,2]
-SurfElement = norm(v1 × v2) |> simplify
-
-z = sin(phi)
-integrate(z * SurfElement, (theta, 0, 2PI), (phi, 0, PI/2))
-```
-
-### Orientation
-
-A smooth surface $S$ is *orientable* if it is possible to define a unit normal vector, $\vec{N}$, that varies continuously with position. For example, a sphere has a normal vector that does this. On the other hand, a Möbius strip does not, as a normal moved around the surface may necessarily be reversed when it returns to its starting point. For a closed, orientable smooth surface there are two possible choices for a normal, and convention chooses the one that points away from the contained region, such as the outward pointing normal for the sphere or torus.
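-
-As an aside, the Möbius strip claim can be made concrete with a standard parameterization (this particular formula is an assumption, not from the text):
-
-```math
-\Phi(\theta, s) = \langle (1 + s\cos(\theta/2))\cos(\theta), (1 + s\cos(\theta/2))\sin(\theta), s\sin(\theta/2) \rangle,
-\quad -\frac{1}{2} \leq s \leq \frac{1}{2}, 0 \leq \theta \leq 2\pi.
-```
-
-Along the center line $s=0$, the vector $\partial{\Phi}/\partial{s}$ at $\theta = 2\pi$ is the *negative* of its value at $\theta = 0$, even though both correspond to the same point on the strip, so the cross-product normal flips and no continuous choice of unit normal exists.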
- - -### Surface integrals in vector fields - -Beyond finding surface area, surface integrals can also compute interesting physical phenomena. These are often associated to a vector field (in this case a function $\vec{F}: R^3 \rightarrow R^3$), and the typical case is the *flux* through a surface defined locally by $\vec{F} \cdot \hat{N}$, that is the *magnitude* of the *projection* of the field onto the *unit* normal vector. - - -Consider the flow of water through an opening in a time period $\Delta t$. The amount of water mass to flow through would be the area of the opening times the velocity of the flow perpendicular to the surface times the density times the time period; symbolically: $dS \cdot ((\rho \vec{v}) \cdot \vec{N}) \cdot \Delta t$. Dividing by $\Delta t$ gives a rate of flow as $((\rho \vec{v}) \cdot \vec{N}) dS$. With $F = \rho \vec{v}$, the flux integral can be seen as the rate of flow through a surface. - -To find the normal for a surface element arising from a parameterization $\Phi$, we have the two *partial* derivatives $\vec{v}_1=\partial{\Phi}/\partial{u}$ and $\vec{v}_2 = \partial{\Phi}/\partial{v}$, the two column vectors of the Jacobian matrix of $\Phi(u,v)$. These describe the tangent plane, and even more their cross product will be a) *normal* to the tangent plane and b) have magnitude yielding the surface element of the transformation. - - -From this, for a given parameterization, $\Phi(u,v):T \rightarrow S$, the following formula is suggested for orientable surfaces: - -```math -\int_S \vec{F} \cdot \hat{N} dS = -\int_T \vec{F}(\Phi(u,v)) \cdot -(\frac{\partial{\Phi}}{\partial{u}} \times \frac{\partial{\Phi}}{\partial{v}}) -du dv. 
-```
-
-
-When the surface is described by a function, $z=f(x,y)$, the parameterization is $(u,v) \rightarrow \langle u, v, f(u,v)\rangle$, the two vectors are $\vec{v}_1 = \langle 1, 0, \partial{f}/\partial{u}\rangle$ and $\vec{v}_2 = \langle 0, 1, \partial{f}/\partial{v}\rangle$, and their cross product is $\vec{v}_1\times\vec{v}_2=\langle -\partial{f}/\partial{u}, -\partial{f}/\partial{v}, 1\rangle$.
-
-
-##### Example
-
-Suppose a vector field $F(x,y,z) = \langle 0, y, -z \rangle$ is given. Let $S$ be the surface of the paraboloid $y = x^2 + z^2$ between $y=0$ and $y=4$. Compute the surface integral $\int_S F\cdot \hat{N} dS$.
-
-This is a surface of revolution about the $y$ axis, so a parameterization is
-$\Phi(y,\theta) = \langle \sqrt{y} \cos(\theta), y, \sqrt{y}\sin(\theta) \rangle$. The surface normal is given by:
-
-```julia; hold=true
-@syms y::positive theta::positive
-Phi(y,theta) = [sqrt(y)*cos(theta), y, sqrt(y)*sin(theta)]
-Jac = Phi(y, theta).jacobian([y, theta])
-v1, v2 = Jac[:,1], Jac[:,2]
-Normal = v1 × v2
-
-# With this, the surface integral becomes:
-
-F(x,y,z) = [0, y, -z]
-F(v) = F(v...)
-integrate(F(Phi(y,theta)) ⋅ Normal, (theta, 0, 2PI), (y, 0, 4))
-```
-
-##### Example
-
-Let $S$ be the closed surface bounded by the cylinder $x^2 + y^2 = 1$, the plane $z=0$, and the plane $z = 1+x$. Let $F(x,y,z) = \langle 1, y, z \rangle$. Compute $\oint_S F\cdot\vec{N} dS$.
-
-
-```julia;
-𝐅(x,y,z) = [1, y, z]
-𝐅(v) = 𝐅(v...)
-```
-
-The surface has three faces, with different outward pointing normals for each. Let $S_1$ be the unit disk in the $x-y$ plane with normal $-\hat{k}$; $S_2$ be the top part, with normal proportional to $\langle -1, 0, 1\rangle$ (as the plane is $-1x + 0y + 1z = 1$); and $S_3$ be the cylindrical part with outward pointing normal $\langle x, y, 0\rangle$.
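-
-Since the surface integral is additive over the faces, the closed integral splits into three pieces, each taken with its *outward* normal:
-
-```math
-\oint_S F\cdot\hat{N} dS =
-\int_{S_1} F\cdot\hat{N} dS +
-\int_{S_2} F\cdot\hat{N} dS +
-\int_{S_3} F\cdot\hat{N} dS.
-```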
-
-
-Integrating over $S_1$, we have the parameterization $\Phi(r,\theta) = \langle r\cos(\theta), r\sin(\theta), 0\rangle$:
-
-```julia;
-@syms 𝐑::positive 𝐭heta::positive
-𝐏hi₁(r,theta) = [r*cos(theta), r*sin(theta), 0]
-𝐉ac₁ = 𝐏hi₁(𝐑, 𝐭heta).jacobian([𝐑, 𝐭heta])
-𝐯₁, 𝐰₁ = 𝐉ac₁[:,1], 𝐉ac₁[:,2]
-𝐍ormal₁ = 𝐯₁ × 𝐰₁ .|> simplify
-```
-
-```julia;
-A₁ = integrate(𝐅(𝐏hi₁(𝐑, 𝐭heta)) ⋅ (-𝐍ormal₁), (𝐭heta, 0, 2PI), (𝐑, 0, 1)) # use -Normal₁ for the outward pointing normal
-```
-
-Integrating over $S_2$ we use the parameterization $\Phi(r, \theta) = \langle r\cos(\theta), r\sin(\theta), 1 + r\cos(\theta)\rangle$.
-
-```julia;
-𝐏hi₂(r, theta) = [r*cos(theta), r*sin(theta), 1 + r*cos(theta)]
-𝐉ac₂ = 𝐏hi₂(𝐑, 𝐭heta).jacobian([𝐑, 𝐭heta])
-𝐯₂, 𝐰₂ = 𝐉ac₂[:,1], 𝐉ac₂[:,2]
-𝐍ormal₂ = 𝐯₂ × 𝐰₂ .|> simplify # has the correct orientation
-```
-
-With this, the contribution for $S_2$ is:
-
-```julia;
-A₂ = integrate(𝐅(𝐏hi₂(𝐑, 𝐭heta)) ⋅ (𝐍ormal₂), (𝐭heta, 0, 2PI), (𝐑, 0, 1))
-```
-
-Finally for $S_3$, the parameterization used is $\Phi(z, \theta) = \langle \cos(\theta), \sin(\theta), z\rangle$, but this is over a non-rectangular region, as $z$ is between $0$ and $1 + x$.
-
-This parameterization gives a normal computed through:
-
-```julia;
-@syms 𝐳::positive
-𝐏hi₃(z, theta) = [cos(theta), sin(theta), z]
-𝐉ac₃ = 𝐏hi₃(𝐳, 𝐭heta).jacobian([𝐳, 𝐭heta])
-𝐯₃, 𝐰₃ = 𝐉ac₃[:,1], 𝐉ac₃[:,2]
-𝐍ormal₃ = 𝐯₃ × 𝐰₃ .|> simplify # wrong orientation, so we change sign below
-```
-
-The contribution is
-
-```julia;
-A₃ = integrate(𝐅(𝐏hi₃(𝐳, 𝐭heta)) ⋅ (-𝐍ormal₃), (𝐳, 0, 1 + cos(𝐭heta)), (𝐭heta, 0, 2PI))
-```
-
-In total, the surface integral is
-
-```julia;
-A₁ + A₂ + A₃
-```
-
-##### Example
-
-Two point charges with charges $q$ and $q_0$ will exert an electrostatic force of attraction or repulsion on each other according to [Coulomb](https://en.wikipedia.org/wiki/Coulomb%27s_law)'s law. The Coulomb force is $kqq_0\vec{r}/\|\vec{r}\|^3$.
-
-This force is proportional to the product of the charges, $qq_0$, and inversely proportional to the square of the distance between them.
-
-The electric field is the vector field generated by the force on a unit test charge; it is given by $E = kq\vec{r}/\|\vec{r}\|^3$.
-
-
-Let $S$ be the unit sphere $\|\vec{r}\|^2 = 1$. Compute the surface integral of the electric field over the closed surface, $S$.
-
-We have (using $\oint$ for a surface integral over a closed surface):
-
-```math
-\oint_S E \cdot \vec{N} dS =
-\oint_S \frac{kq}{\|\vec{r}\|^2} \hat{r} \cdot \hat{r} dS =
-\oint_S \frac{kq}{\|\vec{r}\|^2} dS =
-kq \cdot SA(S) =
-4\pi k q
-```
-
-
-Now consider the electric field generated by a point charge within the unit sphere, but not at the origin. The integral now will not fall in place by symmetry considerations, so we will approach the problem numerically.
-
-
-```julia;
-E(r) = (1/norm(r)^2) * uvec(r) # kq = 1
-
-Phiₑ(theta, phi) = 1*[sin(phi)*cos(theta), sin(phi) * sin(theta), cos(phi)]
-Phiₑ(r) = Phiₑ(r...)
-
-normal(r) = Phiₑ(r)/norm(Phiₑ(r))
-
-function SE(r)
-    Jac = ForwardDiff.jacobian(Phiₑ, r)
-    v1, v2 = Jac[:,1], Jac[:,2]
-    v1 × v2
-end
-
-a = rand() * Phiₑ(2pi*rand(), pi*rand())
-A1 = hcubature(r -> E(Phiₑ(r)-a) ⋅ normal(r) * norm(SE(r)), (0.0,0.0), (2pi, 1pi))
-A1[1]
-```
-
-The answer is $4\pi$, regardless of the choice of `a`, as long as it is *inside* the surface. (We see above some fussiness in the limits of integration. `HCubature` does some conversion of the limits, but does not *currently* do well with mixed types, so in the above only floating point values are used.)
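-
-A sketch of why the location of `a` does not matter, anticipating the upcoming divergence theorem: away from the charge the field is divergence free, so the flux through $S$ equals the flux through a small sphere $S_\epsilon$ of radius $\epsilon$ centered at `a`, where symmetry applies (here with $kq=1$):
-
-```math
-\oint_S E\cdot\hat{N} dS = \oint_{S_\epsilon} E\cdot\hat{N} dS
-= \frac{1}{\epsilon^2} \cdot 4\pi\epsilon^2 = 4\pi.
-```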
-
-When `a` is *outside* the surface, the answer is *always* a constant:
-
-```julia; hold=true
-a = 2 * Phiₑ(2pi*rand(), pi*rand()) # a random point with radius 2
-A1 = hcubature(r -> E(Phiₑ(r)-a) ⋅ normal(r) * norm(SE(r)), (0.0,0.0), (2pi, pi/2))
-A2 = hcubature(r -> E(Phiₑ(r)-a) ⋅ normal(r) * norm(SE(r)), (0.0,pi/2), (2pi, 1pi))
-A1[1] + A2[1]
-```
-
-That constant being $0$.
-
-This is a consequence of [Gauss's law](https://en.wikipedia.org/wiki/Gauss%27s_law), which states that for an electric field $E$, the electric flux through a closed surface is proportional to the total charge contained. (Gauss's law is related to the upcoming divergence theorem.) When `a` is inside the surface, the total charge is the same regardless of exactly where, so the integral's value is always the same. When `a` is outside the surface, the total charge inside the sphere is $0$, so the flux integral is as well.
-
-Gauss's law is typically used to identify the electric field by choosing a judicious surface where the surface integral can be computed. For example, suppose a ball of radius $R_0$ has a *uniform* charge. What is the electric field generated? *Assuming* it is dependent only on the distance from the center of the charged ball, we can, first, take a sphere of radius $R > R_0$ and note that $E(\vec{r})\cdot\hat{N}(r) = \|E(R)\|$, the magnitude a distance $R$ away. So the surface integral is simply $\|E(R)\| 4\pi R^2$ and, by Gauss's law, a constant depending on the total charge. So $\|E(R)\| \sim 1/R^2$. When $R < R_0$, the same applies, but the total charge within the surface will be like $(R/R_0)^3$, so the result will be *linear* in $R$, as:
-
-```math
-4 \pi \|E(R)\| R^2 = k 4\pi \left(\frac{R}{R_0}\right)^3.
-```
-
-
-
-
-## Questions
-
-###### Question
-
-Let $\vec{r}(t) = \langle e^t\cos(t), e^{-t}\sin(t) \rangle$.
-
-What is $\|\vec{r}'(1/2)\|$?
- -```julia; hold=true; echo=false -r(t) = [exp(t)*cos(t), exp(-t)*sin(t)] -val = norm(r'(1/2)) -numericq(val) -``` - -What is the $x$ (first) component of $\hat{N}(t) = \hat{T}'(t)/\|\hat{T}'(t)\|$ at $t=1/2$? - -```julia; hold=true; echo=false -r(t) = [exp(t)*cos(t), exp(-t)*sin(t)] -T(t) = r'(t)/norm(r'(t)) -N(t) = T'(t)/norm(T'(t)) -val = N(1/2)[1] -numericq(val) -``` - - -###### Question - -Let $\Phi(u,v) = \langle u,v,u^2+v^2\rangle$ parameterize a surface. Find the magnitude of -$\| \partial{\Phi}/\partial{u} \times \partial{\Phi}/\partial{v} \|$ at $u=1$ and $v=2$. - -```julia; hold=true; echo=false -Phi(u,v) = [u, v, u^2 + v^2] -Jac = ForwardDiff.jacobian(uv -> Phi(uv...), [1,2]) -val = norm(Jac[:,1] × Jac[:,2]) -numericq(val) -``` - - -###### Question - -For a plane $ax+by+cz=d$ find the unit normal. - -```julia; hold=true; echo=false -choices = [ -raw" ``\langle a, b, c\rangle / \| \langle a, b, c\rangle\|``", -raw" ``\langle a, b, c\rangle``", -raw" ``\langle d-a, d-b, d-c\rangle / \| \langle d-a, d-b, d-c\rangle\|``", -] -answ = 1 -radioq(choices, answ) -``` - -Does it depend on $d$? - -```julia; hold=true; echo=false -choices = [ -L"No. Moving $d$ just shifts the plane up or down the $z$ axis, but won't change the normal vector", -L"Yes. Of course. Different values for $d$ mean different values for $x$, $y$, and $z$ are needed.", -L"Yes. The gradient of $F(x,y,z) = ax + by + cz$ will be normal to the level curve $F(x,y,z)=d$, and so this will depend on $d$." -] -answ = 1 -radioq(choices, answ) -``` - - - - -###### Question - -Let $\vec{r}(t) = \langle \cos(t), \sin(t), t\rangle$ and let $F(x,y,z) = \langle -y, x, z\rangle$ - -Numerically compute $\int_0^{2\pi} F(\vec{r}(t)) \cdot \vec{r}'(t) dt$. - -```julia; hold=true; echo=false -F(x,y,z) = [-y, x, z] -r(t) = [cos(t), sin(t), t] -val = quadgk(t -> F(r(t)...) 
⋅ r'(t), 0, 2pi)[1]
-numericq(val)
-```
-
-
-Compute the value symbolically:
-
-```julia; hold=true; echo=false
-choices = [
-raw" ``2\pi + 2\pi^2``",
-raw" ``2\pi^2``",
-raw" ``4\pi``"
-]
-answ = 1
-radioq(choices, answ)
-```
-
-###### Question
-
-
-Let $F(x,y) = \langle 2x^3y^2, xy^4 + 1\rangle$. What is the work done in integrating $F$ along the parabola $y=x^2$ between $(-1,1)$ and $(1,1)$? Give a numeric answer:
-
-```julia; hold=true; echo=false
-F(x,y) = [2x^3*y^2, x*y^4 + 1]
-r(t) = [t, t^2]
-val = quadgk(t -> F(r(t)...) ⋅ r'(t), -1, 1)[1]
-numericq(val)
-```
-
-
-
-###### Question
-
-Let $F = \nabla{f}$ where $f:R^2 \rightarrow R$. The level curves of $f$ are curves in the $x-y$ plane where $f(x,y)=c$, for some constant $c$. Suppose $\vec{r}(t)$ describes a path on a level curve of $f$. What is the value of $\int_C F \cdot d\vec{r}$?
-
-```julia; hold=true; echo=false
-choices = [
-L"It will be $0$, as $\nabla{f}$ is orthogonal to the level curve and $\vec{r}'$ is tangent to the level curve",
-L"It will be $f(b)-f(a)$ for any $b$ or $a$"
-]
-answ = 1
-radioq(choices, answ)
-```
-
-
-###### Question
-
-Let $F(x,y) = (x^2+y^2)^{-k/2} \langle x, y \rangle$ be a radial field. The work integral around the unit circle simplifies:
-
-```math
-\int_C F\cdot \frac{d\vec{r}}{dt} dt = \int_0^{2\pi} (1)^{-k/2} \langle \cos(t), \sin(t) \rangle \cdot \langle-\sin(t), \cos(t)\rangle dt.
-```
-
-For any $k$, this integral will be:
-
-```julia; hold=true; echo=false
-numericq(0)
-```
-
-###### Question
-
-Let $f(x,y) = \tan^{-1}(y/x)$. We will integrate $\nabla{f}$ over the unit circle. The integrand will be:
-
-```julia; hold=true
-@syms t::real x::real y::real
-f(x,y) = atan(y/x)
-r(t) = [cos(t), sin(t)]
-∇f = subs.(∇(f(x,y)), x .=> r(t)[1], y .=> r(t)[2]) .|> simplify
-drdt = diff.(r(t), t)
-∇f ⋅ drdt |> simplify
-```
-
-So $\int_C \nabla{f}\cdot d\vec{r} = \int_0^{2\pi} \nabla{f}\cdot d\vec{r}/dt dt = 2\pi$.
-
-Why is this surprising?
-
-```julia; hold=true; echo=false
-choices = [
-L"The field is a potential field, yet the integral around a closed loop is not $0$.",
-L"The value of $d/dt(f\circ\vec{r})=0$, so the integral should be $0$."
-]
-answ = 1
-radioq(choices, answ)
-```
-
-The function $F = \nabla{f}$ is
-
-```julia; hold=true; echo=false
-choices = [
-"Not continuous everywhere",
-"Continuous everywhere"
-]
-answ = 1
-radioq(choices, answ)
-```
-
-###### Question
-
-Let $F(x,y) = \langle F_x, F_y\rangle = \langle 2x^3y^2, xy^4 + 1\rangle$. Compute
-
-```math
-\frac{\partial{F_y}}{\partial{x}} - \frac{\partial{F_x}}{\partial{y}}.
-```
-
-Is this $0$?
-
-```julia; hold=true; echo=false
-@syms x y
-F(x,y) = [2x^3*y^2, x*y^4 + 1]
-val = iszero(diff(F(x,y)[2],x) - diff(F(x,y)[1],y))
-yesnoq(val)
-```
-
-
-###### Question
-
-Let $F(x,y) = \langle F_x, F_y\rangle = \langle 2x^3, y^4 + 1\rangle$. Compute
-
-```math
-\frac{\partial{F_y}}{\partial{x}} - \frac{\partial{F_x}}{\partial{y}}.
-```
-
-Is this $0$?
-
-```julia; hold=true; echo=false
-@syms x y
-F(x,y) = [2x^3, y^4 + 1]
-val = iszero(diff(F(x,y)[2],x) - diff(F(x,y)[1],y))
-yesnoq(val)
-```
-
-
-
-###### Question
-
-It is not unusual to see a line integral, $\int F\cdot d\vec{r}$, where $F=\langle M, N \rangle$, expressed as $\int Mdx + Ndy$. This uses the notation for a differential form, so is familiar in some theoretical usages, but does not readily lend itself to computation. It does yield pleasing formulas, such as $\oint_C x dy$ to give the area of a two-dimensional region, $D$, in terms of a line integral around its perimeter. To see that this is so, let $\vec{r}(t) = \langle a\cos(t), b\sin(t)\rangle$, $0 \leq t \leq 2\pi$. This parameterizes an ellipse. Let $F(x,y) = \langle 0,x\rangle$. What does $\oint_C xdy$ become when translated into $\int_a^b (F\circ\vec{r})\cdot\vec{r}' dt$?
-
-```julia; hold=true; echo=false
-choices = [
-raw" ``\int_0^{2\pi} (a\cos(t)) \cdot (b\cos(t)) dt``",
-raw" ``\int_0^{2\pi} (-b\sin(t)) \cdot (b\cos(t)) dt``",
-raw" ``\int_0^{2\pi} (a\cos(t)) \cdot (a\cos(t)) dt``"
-]
-answ = 1
-radioq(choices, answ)
-```
-
-
-###### Question
-
-Let a surface be parameterized by $\Phi(u,v) = \langle u\cos(v), u\sin(v), u\rangle$.
-
-Compute $\vec{v}_1 = \partial{\Phi}/\partial{u}$
-
-```julia; hold=true; echo=false
-choices = [
-raw" ``\langle \cos(v), \sin(v), 1\rangle``",
-raw" ``\langle -u\sin(v), u\cos(v), 0\rangle``",
-raw" ``u\langle -\cos(v), -\sin(v), 1\rangle``"
-]
-answ = 1
-radioq(choices, answ, keep_order=true)
-```
-
-Compute $\vec{v}_2 = \partial{\Phi}/\partial{v}$
-
-```julia; hold=true; echo=false
-choices = [
-raw" ``\langle \cos(v), \sin(v), 1\rangle``",
-raw" ``\langle -u\sin(v), u\cos(v), 0\rangle``",
-raw" ``u\langle -\cos(v), -\sin(v), 1\rangle``"
-]
-answ = 2
-radioq(choices, answ, keep_order=true)
-```
-
-Compute $\vec{v}_1 \times \vec{v}_2$
-
-
-```julia; hold=true; echo=false
-choices = [
-raw" ``\langle \cos(v), \sin(v), 1\rangle``",
-raw" ``\langle -u\sin(v), u\cos(v), 0\rangle``",
-raw" ``u\langle -\cos(v), -\sin(v), 1\rangle``"
-]
-answ = 3
-radioq(choices, answ, keep_order=true)
-```
-
-
-###### Question
-
-For the surface parameterized by $\Phi(u,v) = \langle uv, u^2v, uv^2\rangle$ for $(u,v)$ in $[0,1]\times[0,1]$, numerically find the surface area.
-
-```julia; hold=true; echo=false
-Phi(u,v) = [u*v, u^2*v, u*v^2]
-Phi(v) = Phi(v...)
-function SurfaceElement(u,v)
-    pt = [u,v]
-    Jac = ForwardDiff.jacobian(Phi, pt)
-    v1, v2 = Jac[:,1], Jac[:,2]
-    cross(v1, v2)
-end
-a,err = hcubature(uv -> norm(SurfaceElement(uv...)), (0,0), (1,1))
-numericq(a)
-```
-
-###### Question
-
-For the surface parameterized by $\Phi(u,v) = \langle uv, u^2v, uv^2\rangle$ for $(u,v)$ in $[0,1]\times[0,1]$ and vector field $F(x,y,z) =\langle y^2, x, z\rangle$, numerically find $\iint_S (F\cdot\hat{N}) dS$.
-
-```julia; hold=true; echo=false
-Phi(u,v) = [u*v, u^2*v, u*v^2]
-Phi(v) = Phi(v...)
-function SurfaceElement(u,v)
-    pt = [u,v]
-    Jac = ForwardDiff.jacobian(Phi, pt)
-    v1, v2 = Jac[:,1], Jac[:,2]
-    cross(v1, v2)
-end
-F(x,y,z) = [y^2,x,z]
-F(v) = F(v...)
-integrand(uv) = dot(F(Phi(uv)...), SurfaceElement(uv...))
-a, err = hcubature(integrand, (0,0), (1,1))
-numericq(a)
-```
-
-###### Question
-
-Let $F=\langle 0,0,1\rangle$ and $S$ be the upper-half unit sphere, parameterized by $\Phi(\theta, \phi) = \langle \sin(\phi)\cos(\theta), \sin(\phi)\sin(\theta), \cos(\phi)\rangle$. Compute $\iint_S (F\cdot\hat{N}) dS$ numerically. Choose the normal direction so that the answer is positive.
-
-```julia; hold=true; echo=false
-F(v) = [0,0,1]
-Phi(theta, phi) = [sin(phi)*cos(theta), sin(phi)*sin(theta), cos(phi)]
-Phi(v) = Phi(v...)
-function SurfaceElement(u,v)
-    pt = [u,v]
-    Jac = ForwardDiff.jacobian(Phi, pt)
-    v1, v2 = Jac[:,1], Jac[:,2]
-    cross(v1, v2)
-end
-integrand(uv) = dot(F(Phi(uv)), SurfaceElement(uv...))
-a, err = hcubature(integrand, (0, 0), (2pi, pi/2))
-numericq(abs(a))
-```
-
-###### Question
-
-Let $\phi(x,y,z) = xy$ and $S$ be the triangle $x+y+z=1$, $x,y,z \geq 0$. The surface may be described by $z=f(x,y) = 1 - (x + y)$ over the region $0\leq y \leq 1-x$, $0 \leq x \leq 1$. With this description, the following integral will compute $\int_S \phi dS$:
-
-```math
-\int_0^1 \int_0^{1-x} xy \sqrt{1 + \left(\frac{\partial{f}}{\partial{x}}\right)^2 + \left(\frac{\partial{f}}{\partial{y}}\right)^2} dy dx.
-```
-
-Compute this.
- -```julia; hold=true; echo=false -#@syms x y real=true -#phi = 1 - (x+y) -#SE = sqrt(1 + diff(phi,x)^2, diff(phi,y)^2) -#integrate(x*y*S_, (y, 0, 1-x), (x,0,1)) # \sqrt{2}/24 -choices = [ -raw" ``\sqrt{2}/24``", -raw" ``2/\sqrt{24}``", -raw" ``1/12``" -] -answ = 1 -radioq(choices, answ) -``` - - -###### Question - -Let $\Phi(u,v) = \langle u^2, uv, v^2\rangle$, $(u,v)$ in $[0,1]\times[0,1]$ and $F(x,y,z) = \langle x,y^2,z^3\rangle$. Find $\int_S (F\cdot\hat{N})dS$ - -```julia; hold=true; echo=false -#Phi(u,v) = [u^2, u*v, v^2] -#F(x,y,z) = [x,y^2,z^3] -#Phi(v) = Phi(v...); F(v) = F(v...) -#@syms u::real v::real -#function SurfaceElement(u,v) -# pt = [u,v] -# Jac = Phi(u,v).jacobian([u,v]) -# v1, v2 = Jac[:,1], Jac[:,2] -# cross(v1, v2) -#end -#integrate(F(Phi(u,v)) ⋅ SurfaceElement(u,v), (u,0,1), (v,0,1)) # 17/252 -choices = [ -raw" ``17/252``", -raw" ``0``", -raw" ``7/36``", -raw" ``1/60``" -] -answ = 1 -radioq(choices, answ) -``` diff --git a/CwJ/integral_vector_calculus/process.jl b/CwJ/integral_vector_calculus/process.jl deleted file mode 100644 index f61af14..0000000 --- a/CwJ/integral_vector_calculus/process.jl +++ /dev/null @@ -1,35 +0,0 @@ -using WeavePynb -using Mustache - -mmd(fname) = mmd_to_html(fname, BRAND_HREF="../toc.html", BRAND_NAME="Calculus with Julia") -## uncomment to generate just .md files -#mmd(fname) = mmd_to_md(fname, BRAND_HREF="../toc.html", BRAND_NAME="Calculus with Julia") - - - - -fnames = [ - "double_triple_integrals", - "line_integrals", - "div_grad_curl", - "stokes_theorem", - "review" -] - -function process_file(nm, twice=false) - include("$nm.jl") - mmd_to_md("$nm.mmd") - markdownToHTML("$nm.md") - twice && markdownToHTML("$nm.md") -end - -process_files(twice=false) = [process_file(nm, twice) for nm in fnames] - - - - -""" -## TODO integral_vector_calculus -* line integrals needs an image (lasso?) 
-* more questions -""" diff --git a/CwJ/integral_vector_calculus/review.jmd b/CwJ/integral_vector_calculus/review.jmd deleted file mode 100644 index aa9e92e..0000000 --- a/CwJ/integral_vector_calculus/review.jmd +++ /dev/null @@ -1,429 +0,0 @@ -# Quick Review of Vector Calculus - - -```julia; echo=false; results="hidden" -using CalculusWithJulia -using CalculusWithJulia.WeaveSupport -using Plots - -const frontmatter = ( - title = "Quick Review of Vector Calculus", - description = "Calculus with Julia: Quick Review of Vector Calculus", - tags = ["CalculusWithJulia", "integral_vector_calculus", "quick review of vector calculus"], -); - -nothing -``` - - -This section considers functions from ``R^n`` into ``R^m`` where one or both of ``n`` or ``m`` is greater than ``1``: - -* functions ``f:R \rightarrow R^m`` are called univariate functions. - -* functions ``f:R^n \rightarrow R`` are called scalar-valued functions. - -* function ``f:R \rightarrow R`` are univariate, scalar-valued functions. - -* functions ``\vec{r}:R\rightarrow R^m`` are parameterized curves. The trace of a parameterized curve is a path. - -* functions ``F:R^n \rightarrow R^m``, may be called vector fields in applications. They are also used to describe transformations. - -When ``m>1`` a function is called *vector valued*. - -When ``n>1`` the argument may be given in terms of components, e.g. ``f(x,y,z)``; with a point as an argument, ``F(p)``; or with a vector as an argument, ``F(\vec{a})``. The identification of a point with a vector is done frequently. - -## Limits - -Limits when ``m > 1`` depend on the limits of each component existing. - -Limits when ``n > 1`` are more complicated. One characterization is a limit at a point ``c`` exists if and only if for *every* continuous path going to ``c`` the limit along the path for every component exists in the univariate sense. 
-
-
-## Derivatives
-
-The derivative of a univariate function, ``f``, at a point ``c`` is defined by a limit:
-
-```math
-f'(c) = \lim_{h\rightarrow 0} \frac{f(c+h)-f(c)}{h},
-```
-
-and as a function by considering the mapping of ``c`` into ``f'(c)``. A characterization is that ``f'(c)`` is the value for which
-
-```math
-|f(c+h) - f(c) - f'(c)h| = \mathcal{o}(|h|).
-```
-
-That is, after dividing the left-hand side by ``|h|`` the expression goes to ``0`` as ``|h|\rightarrow 0``. This characterization will generalize with the norm replacing the absolute value, as needed.
-
-
-### Parameterized curves
-
-The derivative of a function ``\vec{r}: R \rightarrow R^m``, ``\vec{r}'(t)``, is found by taking the derivative of each component. (Each component function is univariate.)
-
-The derivative satisfies
-
-```math
-\| \vec{r}(t+h) - \vec{r}(t) - \vec{r}'(t) h \| = \mathcal{o}(|h|).
-```
-
-The derivative is *tangent* to the curve and indicates the direction of travel.
-
-The **tangent** vector is the unit vector in the direction of ``\vec{r}'(t)``:
-
-```math
-\hat{T} = \frac{\vec{r}'(t)}{\|\vec{r}'(t)\|}.
-```
-
-The path is parameterized by *arc* length if ``\|\vec{r}'(t)\| = 1`` for all ``t``. In this case an "``s``" is used for the parameter, as a notational hint: ``\hat{T} = d\vec{r}/ds``.
-
-The **normal** vector is the unit vector in the direction of the derivative of the tangent vector:
-
-```math
-\hat{N} = \frac{\hat{T}'(t)}{\|\hat{T}'(t)\|}.
-```
-
-In dimension ``m=2``, if ``\hat{T} = \langle a, b\rangle`` then ``\hat{N} = \langle -b, a\rangle`` or ``\langle b, -a\rangle`` and ``\hat{N}'(t)`` is parallel to ``\hat{T}``.
-
-In dimension ``m=3``, the **binormal** vector, ``\hat{B}``, is the unit vector ``\hat{T}\times\hat{N}``.
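-
-The tangent and normal vectors are readily computed numerically. A minimal sketch, assuming the `ForwardDiff` and `LinearAlgebra` packages are available (the helix `r` is an illustrative choice):
-
-```julia; hold=true
-using ForwardDiff, LinearAlgebra
-
-r(t) = [cos(t), sin(t), t]              # an illustrative helix
-rp(t) = ForwardDiff.derivative(r, t)    # r'(t), computed componentwise
-T(t) = rp(t) / norm(rp(t))              # unit tangent
-Tp(t) = ForwardDiff.derivative(T, t)
-N(t) = Tp(t) / norm(Tp(t))              # unit normal
-dot(T(1.0), N(1.0))                     # ≈ 0, as the two are orthogonal
-```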
-
-The [Frenet-Serret](https://en.wikipedia.org/wiki/Frenet%E2%80%93Serret_formulas) formulas define the **curvature**, ``\kappa``, and the **torsion**, ``\tau``, by
-
-```math
-\begin{align}
-\frac{d\hat{T}}{ds} &= & \kappa \hat{N} &\\
-\frac{d\hat{N}}{ds} &= -\kappa\hat{T} & & + \tau\hat{B}\\
-\frac{d\hat{B}}{ds} &= & -\tau\hat{N}&
-\end{align}
-```
-
-These formulas apply in dimension ``m=2`` with ``\hat{B}=\vec{0}``.
-
-The curvature, ``\kappa``, can be visualized by imagining a circle of radius ``r=1/\kappa`` best approximating the path at a point. (A straight line would have a circle of infinite radius and curvature ``0``.)
-
-
-The chain rule says ``(\vec{r}(g(t)))' = \vec{r}'(g(t)) g'(t)``.
-
-
-### Scalar functions
-
-A scalar function ``f:R^n\rightarrow R``, ``n > 1``, has **partial derivatives** defined. For ``n=2``, these are:
-
-```math
-\begin{align}
-\frac{\partial{f}}{\partial{x}}(x,y) &=
-\lim_{h\rightarrow 0} \frac{f(x+h,y)-f(x,y)}{h}\\
-\frac{\partial{f}}{\partial{y}}(x,y) &=
-\lim_{h\rightarrow 0} \frac{f(x,y+h)-f(x,y)}{h}.
-\end{align}
-```
-
-The generalization to ``n>2`` is clear - the partial derivative in ``x_i`` is the derivative of ``f`` when the *other* ``x_j`` are held constant.
-
-This may be viewed as the derivative of the univariate function ``(f\circ\vec{r})(t)`` where ``\vec{r}(t) = p + t \hat{e}_i``, ``\hat{e}_i`` being the unit vector of all ``0``s except a ``1`` in the ``i``th component.
-
-
-The **gradient** of ``f``, when the limits exist, is the vector-valued function from ``R^n`` to ``R^n``:
-
-```math
-\nabla{f} = \langle
-\frac{\partial{f}}{\partial{x_1}},
-\frac{\partial{f}}{\partial{x_2}},
-\dots,
-\frac{\partial{f}}{\partial{x_n}}
-\rangle.
-```
-
-The gradient satisfies:
-
-```math
-|f(\vec{x}+\Delta{\vec{x}}) - f(\vec{x}) - \nabla{f}\cdot\Delta{\vec{x}}| = \mathcal{o}(\|\Delta{\vec{x}}\|).
-```
-
-The gradient is viewed as a column vector. If the dot product above is viewed as matrix multiplication, then it would be written ``\nabla{f}' \Delta{\vec{x}}``.
-
-
-**Linearization** is the *approximation*
-
-```math
-f(\vec{x}+\Delta{\vec{x}}) \approx f(\vec{x}) + \nabla{f}\cdot\Delta{\vec{x}}.
-```
-
-The **directional derivative** of ``f`` in the direction ``\vec{v}`` is ``\vec{v}\cdot\nabla{f}``, which can be seen as the derivative of the univariate function ``(f\circ\vec{r})(t)`` where ``\vec{r}(t) = p + t \vec{v}``.
-
-
-For the function ``z=f(x,y)``, the gradient points in the direction of steepest ascent. The ascent occurs on the ``3``-dimensional surface, while the gradient itself is a ``2``-dimensional vector.
-
-For a function ``f(\vec{x})``, a **level curve** is the set of points ``\vec{x}`` for which ``f(\vec{x})=c``, ``c`` being some constant. Plotted, this may give a curve or surface (in ``n=2`` or ``n=3``). The gradient at a point ``\vec{x}`` with ``f(\vec{x})=c`` will be *orthogonal* to the level curve ``f=c``.
-
-
-Partial derivatives are scalar functions, so will themselves have partial derivatives when the limits are defined. The notation ``f_{xy}`` stands for the partial derivative in ``y`` of the partial derivative of ``f`` in ``x``. [Schwarz](https://en.wikipedia.org/wiki/Symmetry_of_second_derivatives)'s theorem says the order of partial derivatives will not matter (e.g., ``f_{xy} = f_{yx}``) provided the higher-order derivatives are continuous.
-
-
-The chain rule applied to ``(f\circ\vec{r})(t)`` says:
-
-```math
-\frac{d(f\circ\vec{r})}{dt} = \nabla{f}(\vec{r}(t)) \cdot \vec{r}'(t).
-```
-
-
-### Vector-valued functions
-
-For a function ``F:R^n \rightarrow R^m``, the **total derivative** of ``F`` is the linear operator ``d_F`` satisfying:
-
-```math
-\|F(\vec{x} + \vec{h})-F(\vec{x}) - d_F \vec{h}\| = \mathcal{o}(\|\vec{h}\|)
-```
-
-
-For ``F=\langle f_1, f_2, \dots, f_m\rangle`` the total derivative is the **Jacobian**, an ``m \times n`` matrix of partial derivatives:
-
-```math
-J_f = \left[
-\begin{array}{cccc}
-\frac{\partial f_1}{\partial x_1} &\quad \frac{\partial f_1}{\partial x_2} &\dots&\quad\frac{\partial f_1}{\partial x_n}\\
-\frac{\partial f_2}{\partial x_1} &\quad \frac{\partial f_2}{\partial x_2} &\dots&\quad\frac{\partial f_2}{\partial x_n}\\
-&&\vdots&\\
-\frac{\partial f_m}{\partial x_1} &\quad \frac{\partial f_m}{\partial x_2} &\dots&\quad\frac{\partial f_m}{\partial x_n}
-\end{array}
-\right].
-```
-
-This can be viewed as being composed of row vectors, each being an individual gradient; or of column vectors, each being the vector of partial derivatives for a given variable.
-
-
-The **chain rule** for ``F:R^n \rightarrow R^m`` composed with ``G:R^k \rightarrow R^n`` is:
-
-```math
-d_{F\circ G}(a) = d_F(G(a)) d_G(a),
-```
-
-That is, the total derivative of ``F`` at the point ``G(a)`` times (matrix multiplication) the total derivative of ``G`` at ``a``. The dimensions work out as ``d_F`` is ``m\times n`` and ``d_G`` is ``n\times k``, so ``d_{F\circ G}`` will be ``m\times k`` and ``F\circ{G}: R^k\rightarrow R^m``.
-
-A scalar function ``f:R^n \rightarrow R`` and a parameterized curve ``\vec{r}:R\rightarrow R^n`` compose to yield a univariate function. The total derivative of ``f\circ\vec{r}`` satisfies:
-
-```math
-d_f(\vec{r}) d_{\vec{r}} = \nabla{f}(\vec{r}(t))' \vec{r}'(t) =
-\nabla{f}(\vec{r}(t)) \cdot \vec{r}'(t),
-```
-as above. (There is an identification of a ``1\times 1`` matrix with a scalar in re-expressing as a dot product.)
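-
-The matrix form of the chain rule can be checked with automatic differentiation. A hedged sketch, assuming `ForwardDiff` and illustrative choices of ``F`` and ``G``:
-
-```julia; hold=true
-using ForwardDiff
-
-G(v) = [v[1]^2, v[1]*v[2], v[2]^2]     # G: R^2 -> R^3
-F(v) = [v[1] + v[3], v[2]^2]           # F: R^3 -> R^2
-a = [1.0, 2.0]
-
-# total derivative of the composition versus the product of the Jacobians
-ForwardDiff.jacobian(F ∘ G, a) ≈ ForwardDiff.jacobian(F, G(a)) * ForwardDiff.jacobian(G, a)
-```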
-
-### The divergence, curl, and their vanishing properties
-
-Define the **divergence** of a vector-valued function ``F:R^n \rightarrow R^n`` by:
-
-```math
-\text{divergence}(F) =
-\frac{\partial{F_{x_1}}}{\partial{x_1}} +
-\frac{\partial{F_{x_2}}}{\partial{x_2}} + \cdots +
-\frac{\partial{F_{x_n}}}{\partial{x_n}}.
-```
-
-The divergence is a scalar function. For a vector field ``F``, it measures the microscopic flow out of a region.
-
-A vector field whose divergence is identically ``0`` is called **incompressible**.
-
-Define the **curl** of a *two*-dimensional vector field, ``F:R^2 \rightarrow R^2``, by:
-
-```math
-\text{curl}(F) = \frac{\partial{F_y}}{\partial{x}} -
-\frac{\partial{F_x}}{\partial{y}}.
-```
-
-The curl for ``n=2`` is a scalar function.
-
-For ``n=3`` define the **curl** of ``F:R^3 \rightarrow R^3`` to be the *vector field*:
-
-```math
-\text{curl}(F) =
-\langle
-\frac{\partial{F_z}}{\partial{y}} - \frac{\partial{F_y}}{\partial{z}},
-\frac{\partial{F_x}}{\partial{z}} - \frac{\partial{F_z}}{\partial{x}},
-\frac{\partial{F_y}}{\partial{x}} - \frac{\partial{F_x}}{\partial{y}}
-\rangle.
-```
-
-The curl measures the circulation in a vector field. In dimension ``n=3`` it *points* in the direction of the normal of the plane of maximum circulation, with direction given by the right-hand rule.
-
-
-A vector field whose curl is identically ``0`` is called **irrotational**.
-
-The ``\nabla`` operator is the *formal* vector
-```math
-\nabla = \langle
-\frac{\partial}{\partial{x}},
-\frac{\partial}{\partial{y}},
-\frac{\partial}{\partial{z}}
-\rangle.
-```
-
-The gradient is then scalar "multiplication" on the left: ``\nabla{f}``.
-
-The divergence is the dot product on the left: ``\nabla\cdot{F}``.
-
-The curl is the cross product on the left: ``\nabla\times{F}``.
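-
-Both operations are built from entries of the Jacobian, so they can be computed numerically. A sketch, assuming `ForwardDiff` (the field `F` is an illustrative choice):
-
-```julia; hold=true
-using ForwardDiff, LinearAlgebra
-
-F(v) = [v[1]*v[2], v[2]*v[3], v[3]*v[1]]          # F = ⟨xy, yz, zx⟩
-
-divergence(F, p) = tr(ForwardDiff.jacobian(F, p)) # the sum of the diagonal entries
-function curl(F, p)
-    J = ForwardDiff.jacobian(F, p)                # J[i,j] is the partial of F_i in x_j
-    [J[3,2]-J[2,3], J[1,3]-J[3,1], J[2,1]-J[1,2]]
-end
-
-divergence(F, [1.0, 2.0, 3.0])   # y + z + x = 6
-curl(F, [1.0, 2.0, 3.0])         # ⟨-y, -z, -x⟩ = [-2, -3, -1]
-```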
-
-
-These operations satisfy two vanishing properties:
-
-* The curl of a gradient is the zero vector: ``\nabla\times\nabla{f}=\vec{0}``
-
-* The divergence of a curl is ``0``: ``\nabla\cdot(\nabla\times F)=0``
-
-
-The [Helmholtz](https://en.wikipedia.org/wiki/Helmholtz_decomposition) decomposition theorem says a vector field (``n=3``) which vanishes rapidly enough can be expressed as ``F = -\nabla\phi + \nabla\times{A}``. The left term will be irrotational (no curl) and the right term will be incompressible (no divergence).
-
-
-
-
-## Integrals
-
-
-The definite integral, ``\int_a^b f(x) dx``, for a bounded univariate function is defined in terms of Riemann sums, ``\lim \sum f(c_i)\Delta{x_i}``, as the maximum *partition* size goes to ``0``. Similarly, the integral of a bounded scalar function ``f:R^n \rightarrow R`` over a box-like region ``[a_1,b_1]\times[a_2,b_2]\times\cdots\times[a_n,b_n]`` can be defined in terms of a limit of Riemann sums. A Riemann integrable function is one for which the upper and lower Riemann sums agree in the limit. A characterization of a Riemann integrable function is that the set of discontinuities has measure ``0``.
-
-
-If ``f`` and the partial functions (``x \rightarrow f(x,y)`` and ``y \rightarrow f(x,y)``) are Riemann integrable, then Fubini's theorem allows the definite integral to be performed iteratively:
-
-```math
-\iint_{R\times S}fdV = \int_R \left(\int_S f(x,y) dy\right) dx
-= \int_S \left(\int_R f(x,y) dx\right) dy.
-```
-
-The integral satisfies linearity and monotonicity properties that follow from the definitions:
-
-
-* For integrable ``f`` and ``g`` and constants ``a`` and ``b``:
-```math
-\iint_R (af(x) + bg(x))dV = a\iint_R f(x)dV + b\iint_R g(x) dV.
-```
-
-* If ``R`` and ``R'`` are *disjoint* rectangular regions (possibly sharing a boundary), then the integral over the union is defined by linearity:
-
-```math
-\iint_{R \cup R'} f(x) dV = \iint_R f(x)dV + \iint_{R'} f(x) dV.
-```
-
-
-* As ``f`` is bounded, let ``m \leq f(x) \leq M`` for all ``x`` in ``R``.
Then
-
-```math
-m V(R) \leq \iint_R f(x) dV \leq MV(R).
-```
-
-* If ``f`` and ``g`` are integrable *and* ``f(x) \leq g(x)``, then the integrals have the same property, namely ``\iint_R f dV \leq \iint_R gdV``.
-
-* If ``S \subset R``, both closed rectangles, then if ``f`` is integrable over ``R`` it will also be integrable over ``S`` and, when ``f\geq 0``, ``\iint_S f dV \leq \iint_R fdV``.
-
-* If ``f`` is bounded and integrable, then ``|\iint_R fdV| \leq \iint_R |f| dV``.
-
-
-
-
-In two dimensions, we have the following interpretations:
-
-```math
-\begin{align}
-\iint_R dA &= \text{area of } R\\
-\iint_R \rho dA &= \text{mass with constant density }\rho\\
-\iint_R \rho(x,y) dA &= \text{mass of region with density }\rho\\
-\frac{1}{\text{mass}}\iint_R x \rho(x,y)dA &= \text{center of mass of region in } x \text{ direction}\\
-\frac{1}{\text{mass}}\iint_R y \rho(x,y)dA &= \text{center of mass of region in } y \text{ direction}
-\end{align}
-```
-
-In three dimensions, we have the following interpretations:
-
-
-```math
-\begin{align}
-\iiint_V dV &= \text{volume of } V\\
-\iiint_V \rho dV &= \text{mass with constant density }\rho\\
-\iiint_V \rho(x,y,z) dV &= \text{mass of volume with density }\rho\\
-\frac{1}{\text{mass}}\iiint_V x \rho(x,y,z)dV &= \text{center of mass of volume in } x \text{ direction}\\
-\frac{1}{\text{mass}}\iiint_V y \rho(x,y,z)dV &= \text{center of mass of volume in } y \text{ direction}\\
-\frac{1}{\text{mass}}\iiint_V z \rho(x,y,z)dV &= \text{center of mass of volume in } z \text{ direction}
-\end{align}
-```
-
-
-To compute integrals over non-box-like regions, Fubini's theorem may be utilized. Alternatively, a **transformation** of variables may be used to map a box-like region onto the region of interest.
-
-
-
-### Line integrals
-
-For a parameterized curve, ``\vec{r}(t)``, the **line integral** of a scalar function between ``a \leq t \leq b`` is defined by: ``\int_a^b f(\vec{r}(t)) \| \vec{r}'(t)\| dt``.
For a path parameterized by arc-length, the integral is expressed by ``\int_C f(\vec{r}(s)) ds`` or simply ``\int_C f ds``, as the norm is ``1`` and ``C`` expresses the path.
-
-A Jordan curve in two dimensions is a non-intersecting continuous loop in the plane. The Jordan curve theorem states that such a curve divides the plane into a bounded and an unbounded region. The curve is *positively* parameterized if the bounded region is kept on the left. A line integral over a Jordan curve is denoted ``\oint_C f ds``.
-
-Some interpretations: ``\int_a^b \| \vec{r}'(t)\| dt`` computes the *arc-length*. If the path represents a wire with density ``\rho(\vec{x})`` then ``\int_a^b \rho(\vec{r}(t)) \|\vec{r}'(t)\| dt`` computes the mass of the wire.
-
-The line integral is also defined for a vector field ``F:R^n \rightarrow R^n`` through ``\int_a^b F(\vec{r}(t)) \cdot \vec{r}'(t) dt``. When parameterized by arc length, this becomes ``\int_C F(\vec{r}(s)) \cdot \hat{T} ds`` or more simply ``\int_C F\cdot\hat{T}ds``. In dimension ``n=2``, if ``\hat{N}`` is the unit normal, then the line integral ``\int_a^b F(\vec{r}(t)) \cdot \hat{N} \|\vec{r}'(t)\| dt`` (the flow) is also of interest; it is expressed more compactly by ``\int_C F\cdot\hat{N} ds``.
-
-
-When ``F`` is a *force field*, then the interpretation of ``\int_a^b F(\vec{r}(t)) \cdot \vec{r}'(t) dt`` is the amount of *work* to move an object from ``\vec{r}(a)`` to ``\vec{r}(b)``. (Work measures force applied times distance moved.)
-
-
-
-A **conservative force** is a force field within an open region ``R`` with the property that the total work done in moving a particle between two points is independent of the path taken. (Equivalently, work integrals over Jordan curves are zero.)
-
-The gradient theorem or **fundamental theorem of line integrals** states that if ``\phi`` is a scalar function, then the vector field ``\nabla{\phi}`` (if continuous in ``R``) is a conservative field.
That is, if ``q`` and ``p`` are points, ``C`` any curve in ``R`` from ``q`` to ``p``, and ``\vec{r}`` a parameterization of ``C`` over ``[a,b]``, then ``\phi(p) - \phi(q) = \int_a^b \nabla{\phi}(\vec{r}(t)) \cdot \vec{r}'(t) dt``.
-
-If ``\phi`` is a scalar function producing a field ``\nabla{\phi}`` then in dimensions ``2`` and ``3`` the curl of ``\nabla{\phi}`` is zero when the functions involved are continuous. Conversely, if the curl of a force field, ``F``, is zero *and* the derivatives are continuous in a *simply connected* domain, then there exists a scalar potential function, ``\phi``, with ``F = -\nabla{\phi}``.
-
-
-In dimension ``2``, if ``F`` describes a flow field, the integral ``\int_C F \cdot\hat{N}ds`` is interpreted as the flow across the curve ``C``; when ``C`` is a closed, positively parameterized curve, ``\oint_C F\cdot\hat{N}ds`` is interpreted as the flow out of the region it bounds.
-
-**Green's theorem** states if ``C`` is a positively oriented Jordan curve in the plane bounding a region ``D`` and ``F`` is a vector field ``F:R^2 \rightarrow R^2`` then ``\oint_C F\cdot\hat{T}ds = \iint_D \text{curl}(F) dA``.
-
-Green's theorem can be re-expressed in flow form: ``\oint_C F\cdot\hat{N}ds=\iint_D\text{divergence}(F)dA``.
-
-For ``F=\langle -y,x\rangle``, Green's theorem says the area of ``D`` is given by ``(1/2)\oint_C F\cdot\vec{r}' dt``. Similarly, if ``F=\langle 0,x\rangle`` or ``F=\langle -y,0\rangle`` then the area is given by ``\oint_C F\cdot\vec{r}'dt``. The above follows as ``\text{curl}(F)`` is ``2`` or ``1``. Similar formulas can be given to compute the centroids, by identifying a vector field with ``\text{curl}(F) = x`` or ``y``.
-
-
-### Surface integrals
-
-A surface in ``3`` dimensions can be described by a scalar function ``z=f(x,y)``, a parameterization ``F:R^2 \rightarrow R^3``, or as a level surface of a scalar function ``f(x,y,z)``. The second case covers the first through the parameterization ``(x,y) \rightarrow (x,y,f(x,y))``.
For a parameterization of a surface, ``\Phi(u,v) = \langle \Phi_x, \Phi_y, \Phi_z\rangle``, let ``\partial{\Phi}/\partial{u}`` be the ``3``-d vector ``\langle \partial{\Phi_x}/\partial{u}, \partial{\Phi_y}/\partial{u}, \partial{\Phi_z}/\partial{u}\rangle``; similarly define ``\partial{\Phi}/\partial{v}``. As vectors, these lie in the tangent plane to the surface and this plane has normal vector ``\vec{N}=\partial{\Phi}/\partial{u}\times\partial{\Phi}/\partial{v}``. For a closed surface, the parameterization is positive if ``\vec{N}`` is an outward pointing normal. Let the *surface element* be defined by ``\|\vec{N}\|``.
-
-The surface integral of a scalar function ``f:R^3 \rightarrow R`` for a parameterization ``\Phi:R \rightarrow S`` is defined by
-```math
-\iint_R f(\Phi(u,v))
-\|\frac{\partial{\Phi}}{\partial{u}} \times \frac{\partial{\Phi}}{\partial{v}}\|
-du dv
-```
-
-If ``F`` is a vector field, the surface integral may be defined as the flux across the surface through
-
-```math
-\iint_R F(\Phi(u,v)) \cdot \vec{N} du dv =
-\iint_R (F \cdot \hat{N}) \|\frac{\partial{\Phi}}{\partial{u}} \times \frac{\partial{\Phi}}{\partial{v}}\| du dv = \iint_S (F\cdot\hat{N})dS
-```
-
-
-
-
-### Stokes' theorem, divergence theorem
-
-
-**Stokes' theorem** states that in dimension ``3`` if ``S`` is a smooth surface with boundary ``C`` -- *oriented* so the right-hand rule gives the choice of normal for ``S`` -- and ``F`` is a vector field with continuous partial derivatives then:
-
-```math
-\iint_S (\nabla\times{F}) \cdot \hat{N} dS = \oint_C F \cdot \hat{T} ds.
-```
-
-Stokes' theorem has the same formulation as Green's theorem in dimension ``2``, where the surface integral is just the ``2``-dimensional integral.
-
-Stokes' theorem is used to show a vector field ``F`` with zero curl is conservative if ``F`` is continuous in a simply connected region.
-
-Stokes' theorem is used in Physics, for example, to relate the differential and integral forms of two of Maxwell's equations.
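-
-Stokes' theorem can be verified numerically in a concrete case, say ``F = \langle -y, x, 0\rangle`` over the upper-half unit sphere, where both sides work out to ``2\pi``. A hedged sketch, assuming the `ForwardDiff`, `HCubature`, and `QuadGK` packages:
-
-```julia; hold=true
-using ForwardDiff, HCubature, QuadGK, LinearAlgebra
-
-F(v) = [-v[2], v[1], 0.0]
-Phi(u) = [sin(u[2])*cos(u[1]), sin(u[2])*sin(u[1]), cos(u[2])]  # u = (θ, φ)
-function curlF(p)
-    J = ForwardDiff.jacobian(F, p)
-    [J[3,2]-J[2,3], J[1,3]-J[3,1], J[2,1]-J[1,2]]
-end
-function Nvec(u)            # ∂Φ/∂φ × ∂Φ/∂θ; this order gives the outward normal
-    J = ForwardDiff.jacobian(Phi, u)
-    cross(J[:,2], J[:,1])
-end
-lhs, _ = hcubature(u -> dot(curlF(Phi(u)), Nvec(u)), (0, 0), (2pi, pi/2))
-
-r(t) = [cos(t), sin(t), 0.0]   # the boundary circle, positively oriented
-rhs, _ = quadgk(t -> dot(F(r(t)), ForwardDiff.derivative(r, t)), 0, 2pi)
-lhs ≈ rhs                      # both ≈ 2π
-```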
- ----- - -The **divergence theorem** states if ``V`` is a compact volume in ``R^3`` with piecewise smooth boundary ``S=\partial{V}`` and ``F`` is a vector field with continuous partial derivatives then: - -```math -\iint_V (\nabla\cdot{F})dV = \oint_S (F\cdot\hat{N})dS. -``` - -The divergence theorem is available for other dimensions. In the ``n=2`` case, it is the alternate (flow) form of Green's theorem. - -The divergence theorem is used in Physics to express physical laws in either integral or differential form. diff --git a/CwJ/integral_vector_calculus/stokes_theorem.jmd b/CwJ/integral_vector_calculus/stokes_theorem.jmd deleted file mode 100644 index 4f33d8b..0000000 --- a/CwJ/integral_vector_calculus/stokes_theorem.jmd +++ /dev/null @@ -1,1349 +0,0 @@ -# Green's Theorem, Stokes' Theorem, and the Divergence Theorem - -This section uses these add-on packages: - - -```julia -using CalculusWithJulia -using Plots -using QuadGK -using SymPy -``` - -```julia; echo=false; results="hidden" -using CalculusWithJulia.WeaveSupport -const frontmatter = ( - title = "Green's Theorem, Stokes' Theorem, and the Divergence Theorem", - description = "Calculus with Julia: Green's Theorem, Stokes' Theorem, and the Divergence Theorem", - tags = ["CalculusWithJulia", "integral_vector_calculus", "green's theorem, stokes' theorem, and the divergence theorem"], -); - -# some useful helpers for drawing -_bar(x) = sum(x)/length(x) -_shrink(x, xbar, offset) = xbar + (1-offset/100)*(x-xbar) - -function drawf!(p,f, m, dx) - a,b = m-dx, m+dx - xs = range(a,b,length=100) - plot!(p, [a,a,b,b],[f(a),0,0,f(b)], color=:black, linewidth=2) - plot!(p, xs, f.(xs), color=:blue, linewidth=3) - p -end - -function apoly!(plt::Plots.Plot, ps; offset=5, kwargs...) - xs, ys = unzip(ps) - xbar, ybar = _bar.((xs, ys)) - xs, ys = _shrink.(xs, xbar, offset), _shrink.(ys, ybar, offset) - - plot!(plt, xs, ys; kwargs...) 
- xn = [xs[end],ys[end]] - x0 = [xs[1], ys[1]] - dxn = 0.95*(x0 - xn) - - arrow!(plt, xn, dxn; kwargs...) - - plt -end -apoly!(ps;offset=5,kwargs...) = apoly!(Plots.current(), ps; offset=offset, kwargs...) - -# arrow square -# start 1,2,3,4: 1 upper left, 2 lower left -function cpoly!(p, c, r, st=1, orient=:ccw; linewidth=1,linealpha=1.0, color=[:red,:red,:red,:black]) - - ps = [[-1,1], [1,1],[1,-1],[-1,-1]] - if orient == :ccw - ps = [[-1,1],[-1,-1],[1,-1],[1,1]] - end - k = 1 - for i in st:(st+2) - plot!(p, unzip([c+r*ps[mod1(i,4)], c+r*ps[mod1(i+1,4)]])..., linewidth=linewidth, linealpha=linealpha, color=color[mod1(k,length(color))]) - k = k+1 - end - i = mod1(st+3,4) - j = mod1(i+1, 4) - arrow!(p, c+r*ps[i], 0.95*r*(ps[j]-ps[i]), linewidth=linewidth, linealpha= linealpha, color=color[mod1(k,length(color))]) - p -end - - -nothing -``` - - ----- - - -The fundamental theorem of calculus is a fan favorite, as it reduces a definite integral, $\int_a^b f(x) dx$, into the evaluation of a *related* function at two points: $F(b)-F(a)$, where the relation is $F$ is an *antiderivative* of $f$. It is a favorite, as it makes life much easier than the alternative of computing a limit of a Riemann sum. - -This relationship can be generalized. The key is to realize that the interval $[a,b]$ has boundary $\{a, b\}$ (a set) and then expressing the theorem as: the integral around some region of $f$ is the integral, suitably defined, around the *boundary* of the region for a function *related* to $f$. - -In an abstract setting, Stokes' theorem says exactly this with the relationship being the *exterior* derivative. 
Here we are not as abstract; we discuss below:
-
-* Green's theorem, a $2$-dimensional theorem, where the region is a planar region, $D$, and the boundary a simple curve $C$;
-
-* Stokes' theorem in $3$ dimensions, where the region is an open surface, $S$, in $R^3$ with boundary $C$;
-
-* The Divergence theorem in $3$ dimensions, where the region is a volume in three dimensions and the boundary its $2$-dimensional closed surface.
-
-The related functions will involve the divergence and the curl, previously discussed.
-
-Many of the examples in this section come from either [Strang](https://ocw.mit.edu/resources/res-18-001-calculus-online-textbook-spring-2005/) or [Schey](https://www.amazon.com/Div-Grad-Curl-All-That/dp/0393925161/).
-
-
-To make the abstract concrete, consider the one dimensional case of finding the definite integral $\int_a^b F'(x) dx$. The Riemann sum picture at the *microscopic* level considers a figure like:
-
-```julia; hold=true; echo=false
-a, b, n = 1/10, 2, 10
-dx = (b-a)/n
-f(x) = x^x
-xs = range(a-dx/2, b+dx/2, length=251)
-ms = a:dx:b
-
-p = plot(legend=false, xticks=nothing, yticks=nothing, border=:none, ylim=(-1/2, f(b)+1/2))
-#plot!(p, xs, f.(xs), color=:blue, linewidth=3)
-for m in ms
-    drawf!(p, f, m, 0.9*dx/2)
-end
-annotate!([(ms[6]-dx/2,-0.3, L"x_{i-1}"), (ms[6]+dx/2,-0.3, L"x_{i}")])
-p
-```
-
-The total area under the blue curve from $a$ to $b$ is found by adding the area of each segment of the figure.
-
-Let's consider now what an integral over the boundary would mean. The region, or interval, $[x_{i-1}, x_i]$ has a boundary that clearly consists of the two points $x_{i-1}$ and $x_i$. If we *orient* the boundary, as we need to for higher dimensional boundaries, using the outward facing direction, then the oriented boundary at the right-hand end point, $x_i$, would point towards $+\infty$ and the left-hand end point, $x_{i-1}$, would be oriented to point to $-\infty$.
An "integral" on the boundary of $F$ would naturally be $F(b) \times 1$ plus $F(a) \times -1$, or $F(b)-F(a)$. - -With this choice of integral over the boundary, we can see much cancellation arises were we to compute this integral for each piece, as we would have with $a=x_0 < x_1 < \cdots x_{n-1} < x_n=b$: - -```math -(F(x_1) - F(x_0)) + (F(x_2)-F(x_1)) + \cdots + (F(x_n) - F(x_{n-1})) = F(x_n) - F(x_0) = F(b) - F(a). -``` - -That is, with this definition for a boundary integral, the interior pieces of the microscopic approximation cancel and the total is just the integral over the oriented macroscopic boundary $\{a, b\}$. - -But each microscopic piece can be reimagined, as - -```math -F(x_{i}) - F(x_{i-1}) = \left(\frac{F(x_{i}) - F(x_{i-1})}{\Delta{x}}\right)\Delta{x} -\approx F'(x_i)\Delta{x}. -``` - -The approximation could be exact were the mean value theorem used to identify a point in the interval, but we don't pursue that, as the key point is the right hand side is a Riemann sum approximation for a *different* integral, in this case the integral $\int_a^b F'(x) dx$. Passing from the microscopic view to an infinitesimal view, the picture gives two interpretations, leading to the Fundamental Theorem of Calculus: - -```math -\int_a^b F'(x) dx = F(b) - F(a). -``` - -The three theorems of this section, Green's theorem, Stokes' theorem, and the divergence theorem, can all be seen in this manner: the sum of microscopic boundary integrals leads to a macroscopic boundary integral of the entire region; whereas, by reinterpretation, the microscopic boundary integrals are viewed as Riemann sums, which in the limit become integrals of a *related* function over the region. - -## Green's theorem - -To continue the above analysis for a higher dimension, we consider -the following figure hinting at a decomposition of a macroscopic square into subsequent microscopic sub-squares. The boundary of each square is oriented so that the right hand rule comes out of the picture. 
-
-```julia; hold=true; echo=false
-a = 1
-ps = [[0,a],[0,0],[a,0],[a,a]]
-p = plot(; legend=false, aspect_ratio=:equal, xticks=nothing, yticks=nothing, border=:none)
-apoly!(p, ps, linewidth=3, color=:blue)
-a = 1/2
-ps = [[0,a],[0,0],[a,0],[a,a]]
-del = 2/100
-apoly!(p, ([del,del],) .+ ps, linewidth=3, color=:red, offset=20)
-apoly!(p, ([0+del,a-del],) .+ ps, linewidth=3, color=:red, offset=20)
-apoly!(p, ([a-del,0+del],) .+ ps, linewidth=3, color=:red, offset=20)
-apoly!(p, ([a-del,a-del],) .+ ps, linewidth=3, color=:red, offset=20)
-
-a = 1/4
-ps = [[del,del]] .+ [[0,a],[0,0],[a,0],[a,a]]
-del = 4/100
-apoly!(p, ([del,del],) .+ ps, linewidth=3, color=:green, offset=40)
-apoly!(p, ([0+del,a-del],) .+ ps, linewidth=3, color=:green, offset=40)
-apoly!(p, ([a-del,0+del],) .+ ps, linewidth=3, color=:green, offset=40)
-apoly!(p, ([a-del,a-del],) .+ ps, linewidth=3, color=:green, offset=40)
-
-p
-```
-
-Consider the boundary integral $\oint_C F\cdot\hat{T} ds$ around the smallest (green) squares. We have seen that the *curl* at a point in a direction is given in terms of a limit. Let the plane be the $x$-$y$ plane, and the $\hat{k}$ direction be the one coming out of the figure. In the derivation of the curl, we saw that the line integral for circulation around the square satisfies:
-
-```math
-\lim \frac{1}{\Delta{x}\Delta{y}} \oint_C F \cdot\hat{T}ds =
-\frac{\partial{F_y}}{\partial{x}} - \frac{\partial{F_x}}{\partial{y}}.
-```
-
-If the green squares are small enough, then the line integrals satisfy:
-
-```math
-\oint_C F \cdot\hat{T}ds
-\approx
-\left(
-\frac{\partial{F_y}}{\partial{x}}
--
-\frac{\partial{F_x}}{\partial{y}}
-\right) \Delta{x}\Delta{y} .
-```
-
-We interpret the right hand side as a Riemann sum approximation for the $2$-dimensional integral of the function $f(x,y) = \frac{\partial{F_y}}{\partial{x}} - \frac{\partial{F_x}}{\partial{y}}=\text{curl}(F)$, the two-dimensional curl.
Were the green squares continued to fill out the large blue square, then the sum of these terms would approximate the integral
-
-```math
-\iint_S f(x,y) dA = \iint_S
-\left(\frac{\partial{F_y}}{\partial{x}} - \frac{\partial{F_x}}{\partial{y}}\right) dA
-= \iint_S \text{curl}(F) dA.
-```
-
-
-However, the microscopic boundary integrals have cancellations that lead to a macroscopic boundary integral. The sum of $\oint_C F \cdot\hat{T}ds$ over the $4$ green squares will be equal to $\oint_{C_r} F\cdot\hat{T}ds$, where $C_r$ is the red square, as the interior line integral pieces will all cancel off. The sum of $\oint_{C_r} F \cdot\hat{T}ds$ over the $4$ red squares will equal $\oint_{C_b} F \cdot\hat{T}ds$, where $C_b$ is the oriented path around the blue square, as again the interior line pieces will cancel off. Etc.
-
-This all suggests that the flow integral around the boundary of the larger region (the blue square) is equivalent to the integral of the curl component over the region.
-This is [Green](https://en.wikipedia.org/wiki/Green%27s_theorem)'s theorem, as stated by Wikipedia:
-
-> **Green's theorem**: Let $C$ be a positively oriented, piecewise smooth, simple closed curve in the plane, and let $D$ be the region bounded by $C$. If $F=\langle F_x, F_y\rangle$ is a vector field on an open region containing $D$ with continuous partial derivatives, then:
-> ```math
-> \oint_C F\cdot\hat{T}ds =
-> \iint_D \left(
-> \frac{\partial{F_y}}{\partial{x}} - \frac{\partial{F_x}}{\partial{y}}
-> \right) dA=
-> \iint_D \text{curl}(F)dA.
-> ```
-
-
-The statement of the theorem applies only to regions whose boundaries are simple closed curves. Not all regions have such a boundary; an annulus, for example, does not. This is a restriction that will be generalized.
-
-### Examples
-
-Some examples, following Strang, are:
-
-#### Computing area
-
-Let $F(x,y) = \langle -y, x\rangle$.
Then $\frac{\partial{F_y}}{\partial{x}} - \frac{\partial{F_x}}{\partial{y}}=2$, so
-
-```math
-\frac{1}{2}\oint_C F\cdot\hat{T}ds = \frac{1}{2}\oint_C (xdy - ydx) =
-\iint_D dA = A(D).
-```
-
-This gives a means to compute the area of a region by integrating around its boundary.
-
-----
-
-To compute the area of an ellipse, we have:
-
-```julia; hold=true
-
-F(x,y) = [-y,x]
-F(v) = F(v...)
-
-r(t) = [a*cos(t),b*sin(t)]
-
-@syms a::positive b::positive t
-(1//2) * integrate( F(r(t)) ⋅ diff.(r(t),t), (t, 0, 2PI))
-```
-
-To compute the area of the triangle with vertices $(0,0)$, $(a,0)$ and $(0,b)$ we can orient the boundary counterclockwise. Let $A$ be the line segment from $(0,b)$ to $(0,0)$, $B$ be the line segment from $(0,0)$ to $(a,0)$, and $C$ be the other. Then
-
-
-```math
-\begin{align}
-\frac{1}{2} \int_A F\cdot\hat{T} ds &=\frac{1}{2} \int_A -ydx = 0\\
-\frac{1}{2} \int_B F\cdot\hat{T} ds &=\frac{1}{2} \int_B xdy = 0,
-\end{align}
-```
-
-as on $A$, $x=0$ and $dx=0$, and on $B$, $y=0$ and $dy=0$.
-
-On $C$ we have $\vec{r}(t) = (0, b) + t\cdot(1,-b/a) =\langle t, b-(bt)/a\rangle$ from $t=a$ to $t=0$:
-
-```math
-\int_C F\cdot \frac{d\vec{r}}{dt} dt =
-\int_a^0 \langle -b + (bt)/a, t\rangle\cdot\langle 1, -b/a\rangle dt
-= \int_a^0 -b dt = -bt\mid_{a}^0 = ba.
-```
-
-Multiplying by $1/2$ gives the familiar answer $A=(1/2) a b$.
-
-#### Conservative fields
-
-A vector field is conservative if path integrals for work are independent of the path. We have seen that a vector field that is the gradient of a scalar field will be conservative and vice versa. This led to the vanishing identity $\nabla\times\nabla(f) = 0$ for a scalar field $f$.
-
-Is the converse true? Namely, *if* for some vector field $F$, $\nabla\times{F}$ is identically $0$, is the field conservative?
-
-The answer is yes -- if the vector field has continuous partial derivatives and the curl is $0$ in a simply connected domain.
-
-For the two-dimensional case the curl is a scalar.
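-
-The vanishing identity just mentioned can be illustrated symbolically for the two-dimensional case. This is a sketch of ours (any smooth test function would do), using the `SymPy` package this section relies on:
-
```julia
using SymPy                          # symbolic algebra, as used throughout this section
@syms x::real y::real
f = sin(x*y) + x^2 - y               # an arbitrary smooth scalar field
Fx, Fy = diff(f, x), diff(f, y)      # the gradient field F = ∇f
simplify(diff(Fy, x) - diff(Fx, y))  # the two-dimensional curl of a gradient field: 0
```
-
-The cancellation is just the equality of the mixed partial derivatives of $f$.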
*If* $F = \langle F_x, F_y\rangle = \nabla{f}$ is conservative, then $\partial{F_y}/\partial{x} - \partial{F_x}/\partial{y} = 0$, as the mixed partial derivatives of $f$ are equal.
-
-
-Now assume $\partial{F_y}/\partial{x} - \partial{F_x}/\partial{y} = 0$. Let $P$ and $Q$ be two points in the plane. Take any path, $C_1$, from $P$ to $Q$ and any return path, $C_2$, from $Q$ to $P$ that do not cross and such that $C$, the concatenation of the two paths, satisfies Green's theorem. Then, as $F$ is continuous on an open region containing $D$, we have:
-
-```math
-\begin{align*}
-0 &= \iint_D 0 dA \\
-&=
-\iint_D \left(\partial{F_y}/\partial{x} - \partial{F_x}/\partial{y}\right)dA \\
-&=
-\oint_C F \cdot \hat{T} ds \\
-&=
-\int_{C_1} F \cdot \hat{T} ds + \int_{C_2}F \cdot \hat{T} ds.
-\end{align*}
-```
-
-Reversing $C_2$ to go from $P$ to $Q$, we see the two work integrals are identical; that is, the field is conservative.
-
-Summarizing:
-
-* If $F=\nabla{f}$ then $F$ is conservative.
-* If $F=\langle F_x, F_y\rangle$ has *continuous* partial derivatives in a simply connected open region with $\partial{F_y}/\partial{x} - \partial{F_x}/\partial{y}=0$, then in that region $F$ is conservative and can be represented as the gradient of a scalar function.
-
-
-For example, let $F(x,y) = \langle \sin(xy), \cos(xy) \rangle$. Is this a conservative vector field?
-
-We can check by taking partial derivatives. Those of interest are:
-
-```math
-\begin{align}
-\frac{\partial{F_y}}{\partial{x}} &= \frac{\partial{(\cos(xy))}}{\partial{x}} =
--\sin(xy) y,\\
-\frac{\partial{F_x}}{\partial{y}} &= \frac{\partial{(\sin(xy))}}{\partial{y}} =
-\cos(xy)x.
-\end{align}
-```
-
-It is not the case that $\partial{F_y}/\partial{x} - \partial{F_x}/\partial{y}=0$, so this vector field is *not* conservative.
-
-----
-
-The conditions of Green's theorem are important, as this next example shows.
-
-Let $D$ be the unit disc, $C$ the unit circle parameterized counterclockwise.
-
-Let $R(x,y) = \langle -y, x\rangle$ be a rotation field and $F(x,y) = R(x,y)/(R(x,y)\cdot R(x,y))$. Then:
-
-```julia
-@syms x::real y::real z::real t::real
-```
-
-```julia; hold=true;
-R(x,y) = [-y,x]
-F(x,y) = R(x,y)/(R(x,y)⋅R(x,y))
-
-Fx, Fy = F(x,y)
-diff(Fy, x) - diff(Fx, y) |> simplify
-```
-
-As the integrand is $0$, $\iint_D \left( \partial{F_y}/{\partial{x}}-\partial{F_x}/{\partial{y}}\right)dA = 0$, as well. But,
-
-```math
-F\cdot\hat{T} = \frac{R}{R\cdot{R}} \cdot \frac{R}{\|R\|} = \frac{R\cdot{R}}{(R\cdot{R})\|R\|} = \frac{1}{\|R\|},
-```
-
-so $\oint_C F\cdot\hat{T}ds = 2\pi$, $C$ being the unit circle where $\|R\|=1$.
-
-That is, for this example, Green's theorem does **not** apply, as the two integrals are not the same. What isn't satisfied in the theorem? $F$ is not continuous at the origin and our curve $C$ defining $D$ encircles the origin. So, $F$ does not have continuous partial derivatives, as is required for the theorem.
-
-#### More complicated boundary curves
-
-A simple closed curve is one that does not cross itself. Green's theorem applies to regions bounded by curves which cross themselves finitely many times, provided the orientation used is consistent throughout.
-
-
-Consider the curve $y = f(x)$, $a \leq x \leq b$, assuming $f$ is continuous, $f(a) > 0$, and $f(b) < 0$. We can use Green's theorem to compute the signed "area" under $f$ if we consider the curve in $R^2$ from $(b,0)$ to $(a,0)$ to $(a, f(a))$, to $(b, f(b))$ and back to $(b,0)$ in that orientation. This will cross at each zero of $f$.
-
-```julia; hold=true; echo=false
-a, b = pi/2, 3pi/2
-f(x) = sin(x)
-p = plot(f, a, b, legend=false, xticks=nothing, border=:none, color=:green)
-arrow!(p, [3pi/4, f(3pi/4)], 0.01*[1,cos(3pi/4)], color = :green)
-arrow!(p, [5pi/4, f(5pi/4)], 0.01*[1,cos(5pi/4)], color = :green)
-arrow!(p, [a,0], [0, f(a)], color=:red)
-arrow!(p, [b, f(b)], [0, -f(b)], color=:blue)
-arrow!(p, [b, 0], [a-b, 0], color=:black)
-del = -0.1
-annotate!(p, [(a,del, "a"), (b,-del,"b")])
-p
-```
-
-Let $A$ label the red line, $B$ the green curve, $C$ the blue line, and $D$ the black line. Then Green's theorem relates the signed area to half of the line integral of $F(x,y) = \langle -y, x\rangle$, or $\oint (xdy - ydx)$, over the oriented boundary. For the pieces we have:
-
-```math
-\begin{align}
-\int_A (xdy - ydx) &= a f(a)\\
-\int_C (xdy - ydx) &= b(-f(b))\\
-\int_D (xdy - ydx) &= 0\\
-\end{align}
-```
-
-Finally, the integral over $B$, traversed with $t$ going from $a$ to $b$ and using integration by parts:
-
-```math
-\begin{align}
-\int_B F(\vec{r}(t))\cdot \frac{d\vec{r}(t)}{dt} dt &=
-\int_a^b \langle -f(t),t\rangle\cdot\langle 1, f'(t)\rangle dt\\
-&= -\int_a^b f(t)dt + \int_a^b tf'(t)dt\\
-&= -\int_a^b f(t)dt + \left(tf(t)\mid_a^b - \int_a^b f(t) dt\right).
-\end{align}
-```
-
-Combining, after cancellation we have $\oint (xdy - ydx) = -2\int_a^b f(t) dt$. The orientation described traces the region where $f > 0$ in a *clockwise*, or negative, direction, which accounts for the sign: dividing by $-2$ yields the signed area under the curve.
-
-----
-
-The region may not be simply connected. A simple case might be the annulus: $1 \leq x^2 + y^2 \leq 4$. In this figure we introduce a cut to make a simply connected region.
-
-```julia;hold=true; echo=false
-a, b = 1, 2
-theta = pi/48
-alpha = asin(b/a*sin(theta))
-f1(t) = b*[cos(t), sin(t)]
-f2(t) = a*[cos(t), sin(t)]
-yflip(x) = [x[1],-x[2]]
-p = plot(unzip(f1, theta, 2pi-theta)..., legend=false, aspect_ratio=:equal, color=:blue)
-plot!(p, unzip(f2, alpha, 2pi-alpha)..., color=:red)
-arrow!(p, [0,2], [-.1,0], color=:blue)
-arrow!(p, [0,1], [.1,0], color=:red)
-arrow!(p, yflip(f1(theta)), yflip(f2(alpha)) - yflip(f1(theta)), color=:green)
-arrow!(p, f2(alpha), f1(theta) - f2(alpha), color=:black)
-p
-```
-
-The cut leads to a counter-clockwise orientation on the outer ring and a clockwise orientation on the inner ring. If this cut becomes so thin as to vanish, then the line integrals along the lines introducing the cut will cancel off and we have a boundary consisting of two curves with opposite orientations. (Following either orientation, the enclosed region is on the left.)
-
-
-To see that the line integral of $F(x,y) = (1/2)\langle -y, x\rangle$ produces the area for this orientation we have, using $C_1$ as the outer ring, and $C_2$ as the inner ring:
-
-```math
-\begin{align}
-\oint_{C_1} F \cdot \hat{T} ds &=
-\int_0^{2\pi} (1/2)(2)\langle -\sin(t), \cos(t)\rangle \cdot (2)\langle-\sin(t), \cos(t)\rangle dt \\
-&= (1/2) (2\pi) 4 = 4\pi\\
-\oint_{C_2} F \cdot \hat{T} ds &=
-\int_{0}^{2\pi} (1/2) \langle \sin(t), \cos(t)\rangle \cdot \langle-\sin(t), -\cos(t)\rangle dt\\
-&= -(1/2)(2\pi) = -\pi.
-\end{align}
-```
-
-(Using $\vec{r}(t) = 2\langle \cos(t), \sin(t)\rangle$ for the outer ring and $\vec{r}(t) = 1\langle \cos(t), -\sin(t)\rangle$ for the inner ring.)
-
-Adding the two gives $4\pi - \pi = \pi \cdot(b^2 - a^2)$, with $b=2$ and $a=1$.
-
-#### Flow not flux
-
-
-Green's theorem has a complement in terms of flow across $C$. As $C$ is positively oriented (so the bounded interior piece is on the left of $\hat{T}$ as the curve is traced), an outward normal comes by rotating $\hat{T}$ by $90^\circ$ clockwise.
That is if $\hat{T} = \langle a, b\rangle$, then $\hat{N} = \langle b, -a\rangle$. - -Let $F = \langle F_x, F_y \rangle$ and $G = \langle F_y, -F_x \rangle$, then $G\cdot\hat{T} = -F\cdot\hat{N}$. The curl formula applied to $G$ becomes - -```math -\frac{\partial{G_y}}{\partial{x}} - \frac{\partial{G_x}}{\partial{y}} = -\frac{\partial{-F_x}}{\partial{x}}-\frac{\partial{(F_y)}}{\partial{y}} -= --\left(\frac{\partial{F_x}}{\partial{x}} + \frac{\partial{F_y}}{\partial{y}}\right)= --\nabla\cdot{F}. -``` - -Green's theorem applied to $G$ then gives this formula for $F$: - -```math -\oint_C F\cdot\hat{N} ds = --\oint_C G\cdot\hat{T} ds = --\iint_D (-\nabla\cdot{F})dA = -\iint_D \nabla\cdot{F}dA. -``` - -The right hand side integral is the $2$-dimensional divergence, so this has the interpretation that the flux through $C$ ($\oint_C F\cdot\hat{N} ds$) is the integral of the divergence. (The divergence is defined in terms of a limit of this picture, so this theorem extends the microscopic view to a bigger view.) - - - -Rather than leave this as an algebraic consequence, we sketch out how this could be intuitively argued from a microscopic picture, the reason being similar to that for the curl, where we considered the small green boxes. 
In the generalization to dimension $3$ both arguments are needed for our discussion: - - - -```julia; hold=true; echo=false; -## This isn't used -r4(t) = cos(2t) + sqrt(1.5^4 - sin(2t)^2) -ts = range(0, pi/2, length=100) -f(t) = r4(t) * [cos(t),sin(t)] -plot(unzip(f, 0, pi/2)..., xticks=nothing, yticks=nothing, border=:none, legend=false, aspect_ratio=:equal) -t0 = pi/6 -xs = f.(t0) -ys = f'.(t0) -plot!(unzip([f(t0)+1/5*ys, f(t0)-1/5*ys])..., color=:red) -arrow!(f(t0),1/5*xs, color=:red) -arrow!(f(t0), -1/10*[-ys[2],ys[1]], color=:black) -arrow!(f(t0),-1/5*xs, color=:red, linestyle=:dash) -arrow!(f(t0), 1/10*[-ys[2],ys[1]], color=:black, linestyle=:dash) -nothing -``` - - -Consider now a $2$-dimensional region split into microscopic boxes; we focus now on two adjacent boxes, $A$ and $B$: - -```julia; hold=true; echo=false -a = 1 -ps = [[0,a],[0,0],[a,0],[a,a]] -p = plot(; legend=false, aspect_ratio=:equal, xticks=nothing, border=:none, yticks=nothing) -apoly!(p, ps, linewidth=3, color=:blue) -apoly!(p, ([1,0],) .+ ps, linewidth=3, color=:red) -pt = [1, 1/4] -scatter!(unzip([pt])..., markersize=4, color=:green) -arrow!(pt, [1/2,1/4], linewidth=3, color=:green) -arrow!(pt, [1/4,0], color=:blue ) -arrow!(pt, -[1/4, 0], color=:red) -annotate!([(7/8, 1/8, "A"), (1+7/8, 1/8, "B")]) -p -``` - -The integrand $F\cdot\hat{N}$ for $A$ will differ from that for $B$ by a minus sign, as the field is the same, but the normal carries an opposite sign. Hence the contribution to the line integral around $A$ along this part of the box partition will cancel out with that around $B$. The only part of the line integral that will not cancel out for such a partition will be the boundary pieces of the overall shape. - - -This figure shows in red the parts of the line integrals that will cancel for a more refined grid. 
-
-```julia; hold=true; echo=false
-p = plot( legend=false, xticks=nothing, yticks=nothing, border=:none)
-for i in 1:8
-
-    for j in 1:8
-        color = repeat([:red],4)
-        st = 1
-        if i == 1
-            color[1] = :black
-            st = 2
-        elseif i==8
-            color[3] = :black
-            st = 4
-        end
-        if j == 1
-            color[2] = :black
-            st = 3
-        elseif j == 8
-            color[4] = :black
-            st = 1
-        end
-        cpoly!(p, [i-1/2, j-1/2], .8*1/2,1, :ccw, linewidth=3,linealpha=0.5, color=color)
-    end
-end
-p
-```
-
-Again, the microscopic boundary integrals when added will give a macroscopic boundary integral due to cancellations.
-
-But, as seen in the derivation of the divergence, only modified for $2$ dimensions, we have
-$\nabla\cdot{F} = \lim \frac{1}{\Delta S} \oint_C F\cdot\hat{N} ds$, so for each cell
-
-```math
-\oint_{C_i} F\cdot\hat{N} ds \approx \left(\nabla\cdot{F}\right)\Delta{x}\Delta{y},
-```
-an approximating Riemann sum for $\iint_D \nabla\cdot{F} dA$. This yields:
-
-```math
-\oint_C (F \cdot\hat{N}) ds =
-\sum_i \oint_{C_i} (F \cdot\hat{N}) ds \approx
-\sum \left(\nabla\cdot{F}\right)\Delta{x}\Delta{y} \approx
-\iint_D \nabla\cdot{F}dA,
-```
-
-the approximation signs becoming equals signs in the limit.
-
-
-
-
-
-##### Example
-
-
-Let $F(x,y) = \langle ax , by\rangle$, and $D$ be the square with side length $2$ centered at the origin. Verify that this flux form of Green's theorem holds.
-
-The divergence is simply $a + b$, so $\iint_D (a+b)dA = (a+b)A(D) = 4(a+b)$.
-
-The integral of the flow across $C$ consists of $4$ parts. By symmetry, they all should be similar. We consider the line segment connecting $(1,-1)$ to $(1,1)$ (which has the proper counterclockwise orientation and outward normal $\hat{N} = \langle 1, 0\rangle$):
-
-```math
-\int_C F \cdot \hat{N} ds=
-\int_{-1}^1 \langle F_x, F_y\rangle\cdot\langle 1, 0\rangle ds =
-\int_{-1}^1 a dy = 2a.
-```
-
-Integrating across the top will give $2b$, along the bottom $2b$, and along the left side $2a$, totaling $4(a+b)$.
-
-----
-
-Next, let $F(x,y) = \langle -y, x\rangle$.
This field rotates and, as we can check, has no divergence, as $\partial{F_x}/\partial{x} = \partial{(-y)}/\partial{x} = 0$ and
-$\partial{F_y}/\partial{y} = \partial{x}/\partial{y} = 0$. As such, the area integral in Green's theorem is $0$. As well, $F$ is parallel to $\hat{T}$ so *orthogonal* to $\hat{N}$, hence $\oint F\cdot\hat{N}ds = \oint 0ds = 0$. For any region $S$ there is no net flow across the boundary and no source or sink of flow inside.
-
-
-##### Example: stream functions
-
-Strang compiles the following equivalencies (one implies the others) for when the total flux is $0$ for a vector field with continuous partial derivatives:
-
-* $\oint F\cdot\hat{N} ds = 0$
-* for all curves connecting $P$ to $Q$, $\int_C F\cdot\hat{N} ds$ has the same value
-* There is a *stream* function $g(x,y)$ for which $F_x = \partial{g}/\partial{y}$ and $F_y = -\partial{g}/\partial{x}$. (This says $\nabla{g}$ is *orthogonal* to $F$.)
-* the components have zero divergence: $\partial{F_x}/\partial{x} + \partial{F_y}/\partial{y} = 0$.
-
-Strang calls these fields *source* free as the divergence is $0$.
-
-A [stream](https://en.wikipedia.org/wiki/Stream_function) function plays the role of a scalar potential, but note the minus sign and order of partial derivatives. These are accounted for by saying $\langle F_x, F_y, 0\rangle = \nabla\times\langle 0, 0, g\rangle$, in Cartesian coordinates. Streamlines are tangent to the velocity vector of the flow and in two dimensions are perpendicular to the field lines formed by the gradient of a scalar function.
-
-
-
-[Potential](https://en.wikipedia.org/wiki/Potential_flow) flow uses a scalar potential function to describe the velocity field through $\vec{v} = \nabla{f}$. As such, potential flow is irrotational due to the curl of a conservative field being the zero vector. Restricting to two dimensions, this says the partials satisfy $\partial{v_y}/\partial{x} - \partial{v_x}/\partial{y} = 0$.
For an incompressible flow (like water) the velocity will have $0$ divergence too. That is $\nabla\cdot\nabla{f} = 0$ - $f$ satisfies Laplace's equation. - -By the equivalencies above, an incompressible potential flow means in addition to a potential function, $f$, there is a stream function $g$ satisfying $v_x = \partial{g}/\partial{y}$ and $v_y=-\partial{g}/\partial{x}$. - -The gradient of $f=\langle v_x, v_y\rangle$ is orthogonal to the contour lines of $f$. The gradient of $g=\langle -v_y, v_x\rangle$ is orthogonal to the gradient of $f$, so are tangents to the contour lines of $f$. Reversing, the gradient of $f$ is tangent to the contour lines of $g$. If the flow follows the velocity field, then the contour lines of $g$ indicate the flow of the fluid. - -As an [example](https://en.wikipedia.org/wiki/Potential_flow#Examples_of_two-dimensional_flows) consider the following in polar coordinates: - -```math -f(r, \theta) = A r^n \cos(n\theta),\quad -g(r, \theta) = A r^n \sin(n\theta). -``` - -The constant $A$ just sets the scale, the parameter $n$ has a qualitative effect on the contour lines. Consider $n=2$ visualized below: - -```julia; hold=true -gr() # pyplot doesn't like the color as specified below. -n = 2 -f(r,theta) = r^n * cos(n*theta) -g(r, theta) = r^n * sin(n*theta) - -f(v) = f(v...); g(v)= g(v...) - -Φ(x,y) = [sqrt(x^2 + y^2), atan(y,x)] -Φ(v) = Φ(v...) - -xs = ys = range(-2,2, length=50) -p = contour(xs, ys, f∘Φ, color=:red, legend=false, aspect_ratio=:equal) -contour!(p, xs, ys, g∘Φ, color=:blue, linewidth=3) -#pyplot() -p -``` - -The fluid would flow along the blue (stream) lines. The red lines have equal potential along the line. - - - - - - - -## Stokes' theorem - -```julia; hold=true; echo=false -# https://en.wikipedia.org/wiki/Jiffy_Pop#/media/File:JiffyPop.jpg -imgfile ="figures/jiffy-pop.png" -caption =""" -The Jiffy Pop popcorn design has a top surface that is designed to expand to accommodate the popped popcorn. 
Viewed as a surface, the surface area grows, but the boundary - where the surface meets the pan - stays the same. This is an example of how many different surfaces can have the same bounding curve. Stokes' theorem will relate a surface integral over the surface to a line integral about the bounding curve.
-"""
-ImageFile(:integral_vector_calculus, imgfile, caption)
-```
-
-Were the figure of Jiffy Pop popcorn animated, the foil surface would slowly expand due to the pressure of the popping popcorn until the popcorn was ready. However, the boundary would remain the same. Many different surfaces can have the same boundary. Take, for instance, the upper half of the unit sphere in $R^3$, which has the curve $x^2 + y^2 = 1$ as its boundary curve. This is also the boundary curve of the part of the surface $z = 1 - (x^2 + y^2)$ that lies above the $x-y$ plane. It would also be the boundary curve of the surface formed by a Mickey Mouse glove if the collar were scaled and positioned onto the unit circle.
-
-Imagine if instead of the retro labeling, a rectangular grid were drawn on the surface of the Jiffy Pop popcorn before popping. By Green's theorem, the integral of the curl of a vector field $F$ over this surface reduces to just an accompanying line integral over the boundary, $C$, where the orientation of $C$ is given through the $\hat{k}$ direction by the right-hand rule. The intuitive derivation is that the curl integral over the grid will have cancellations due to adjacent cells having shared paths traversed in both directions.
-
-Now imagine the popcorn expanding. Rather than worry about burning, focus instead on what happens to the integral of the curl in the direction of the normal; we have
-
-```math
-\nabla\times{F} \cdot\hat{N} = \lim \frac{1}{\Delta{S}} \oint_C F\cdot\hat{T} ds
-\approx \frac{1}{\Delta{S}} F\cdot\hat{T} \Delta{s}.
-```
-
-This gives the series of approximations:
-
-```math
-\begin{align*}
-\oint_C F\cdot\hat{T} ds &=
-\sum \oint_{C_i} F\cdot\hat{T} ds \\
-&\approx
-\sum F\cdot\hat{T} \Delta s \\
-&\approx
-\sum \nabla\times{F}\cdot\hat{N} \Delta{S} \\
-&\approx
-\iint_S \nabla\times{F}\cdot\hat{N} dS.
-\end{align*}
-```
-
-In terms of our expanding popcorn, the boundary integral - after accounting for cancellations, as in Green's theorem - can be seen as a microscopic sum of boundary integrals, each of which is approximated by a term
-$\nabla\times{F}\cdot\hat{N} \Delta{S}$, which is viewed as a Riemann sum approximation for the integral of the curl over the surface. The cancellation depends on a proper choice of orientation, but with that we have:
-
-
-> **Stokes' theorem**: Let $S$ be an orientable smooth surface in $R^3$ with boundary $C$, $C$ oriented so that the chosen normal for $S$ agrees with the right-hand rule for $C$'s orientation. Then *if* $F$ has continuous partial derivatives:
-> ```math
-> \oint_C F \cdot\hat{T} ds = \iint_S (\nabla\times{F})\cdot\hat{N} dA.
-> ```
-
-
-Green's theorem is an immediate consequence upon viewing the region in $R^2$ as a surface in $R^3$ with normal $\hat{k}$.
-
-
-
-
-
-### Examples
-
-##### Example
-
-Our first example involves just an observation. For any simply connected surface $S$ without boundary (such as a sphere) the integral $\oint_S (\nabla\times{F})\cdot\hat{N}dS=0$, as the line integral around the boundary must be $0$ - there is no boundary.
-
-##### Example
-
-Let $F(x,y,z) = \langle x^2, 0, y^2\rangle$ and $C$ be the circle $x^2 + z^2 = 1$ with $y=0$. Find $\oint_C F\cdot\hat{T}ds$.
-
-
-We can use Stokes' theorem with the surface being just the disc, so that $\hat{N} = \hat{j}$. This makes the computation easy:
-
-```julia;
-Fₛ(x,y,z) = [x^2, 0, y^2]
-CurlFₛ = curl(Fₛ(x,y,z), [x,y,z])
-```
-
-We have $\nabla\times{F}\cdot\hat{N} = 0$, so the answer is $0$.
-
-We could have directly computed this.
Let $r(t) = \langle \cos(t), 0, \sin(t)\rangle$. Then we have:
-
-```julia;
-rₛ(t) = [cos(t), 0, sin(t)]
-rpₛ = diff.(rₛ(t), t)
-integrandₛ = Fₛ(rₛ(t)...) ⋅ rpₛ
-```
-
-The integrand isn't obviously going to yield $0$ for the integral, but through symmetry:
-
-```julia;
-integrate(integrandₛ, (t, 0, 2PI))
-```
-
-##### Example: Ampere's circuital law
-
-(Schey) Suppose a current $I$ flows along a line and $C$ is a path encircling the current with orientation such that the right-hand rule points in the direction of the current flow.
-
-Ampere's circuital law relates the line integral of the magnetic field to the induced current through:
-
-```math
-\oint_C B\cdot\hat{T} ds = \mu_0 I.
-```
-
-The goal here is to re-express this integral law to produce a law at each point of the field. Let $S$ be a surface with boundary $C$ and let $J$ be the current density: $J=\rho v$, with $\rho$ the charge density (not time-varying) and $v$ the velocity. The current can be re-expressed as $I = \iint_S J\cdot\hat{N}dA$. (If the current flows through a wire and $S$ is much bigger than the wire, this is still valid as $\rho=0$ outside of the wire.)
-
-We then have:
-
-
-```math
-\mu_0 \iint_S J\cdot\hat{N}dA =
-\mu_0 I =
-\oint_C B\cdot\hat{T} ds =
-\iint_S (\nabla\times{B})\cdot\hat{N}dA.
-```
-
-As $S$ and $C$ are arbitrary, this implies the integrands of the surface integrals are equal, or:
-
-```math
-\nabla\times{B} = \mu_0 J.
-```
-
-
-##### Example: Faraday's law
-
-(Strang) Suppose $C$ is a wire and there is a time-varying magnetic field $B(t)$. Then Faraday's law says the *flux* of the magnetic field through a surface $S$ with boundary $C$, $\phi = \iint_S B\cdot\hat{N}dS$, induces an electric field $E$ that does work:
-
-```math
-\oint_C E\cdot\hat{T}ds = -\frac{\partial{\phi}}{\partial{t}}.
-```
-
-Faraday's law is an empirical statement. Stokes' theorem can be used to produce one of Maxwell's equations.
For any surface $S$, as above with its boundary being $C$, we have both:
-
-```math
--\iint_S \left(\frac{\partial{B}}{\partial{t}}\cdot\hat{N}\right)dS =
--\frac{\partial{\phi}}{\partial{t}} =
-\oint_C E\cdot\hat{T}ds =
-\iint_S (\nabla\times{E})\cdot\hat{N} dS.
-```
-
-This is true for any capping surface for $C$. Shrinking $C$ to a point means it will hold for each point in $R^3$. That is:
-
-```math
-\nabla\times{E} = -\frac{\partial{B}}{\partial{t}}.
-```
-
-
-##### Example: Conservative fields
-
-Green's theorem gave a characterization of $2$-dimensional conservative fields; Stokes' theorem provides a characterization for $3$-dimensional conservative fields (with continuous derivatives):
-
-* The work $\oint_C F\cdot\hat{T} ds = 0$ for every closed path
-* The work $\int_P^Q F\cdot\hat{T} ds$ is independent of the path between $P$ and $Q$
-* There is a scalar potential function $\phi$ with $F = \nabla{\phi}$
-* The curl satisfies: $\nabla\times{F} = \vec{0}$ (and the domain is simply connected).
-
-Stokes' theorem can be used to show the first and fourth are equivalent.
-
-First, if $0 = \oint_C F\cdot\hat{T} ds$ for every closed path, then by Stokes' theorem $0 = \iint_S (\nabla\times{F})\cdot\hat{N} dS$ for any orientable surface $S$ with boundary $C$. For a given point, letting $C$ shrink to that point can be used to see that the curl must be $\vec{0}$ at that point.
-
-Conversely, if the curl is zero in a simply connected region, then take any simple closed curve, $C$, in the region. As the region is [simply connected](http://math.mit.edu/~jorloff/suppnotes/suppnotes02/v14.pdf) there exists an orientable surface, $S$, in the region with boundary $C$ for which: $\oint_C F\cdot\hat{T} ds = \iint_S (\nabla\times{F})\cdot\hat{N}dS= \iint_S \vec{0}\cdot\hat{N}dS = 0$.
-
-
-The construction of a scalar potential function from the field can be done as illustrated in this next example.
-
-Take $F = \langle yz^2, xz^2, 2xyz \rangle$. Verify $F$ is conservative and find a scalar potential $\phi$.
- -To verify that $F$ is conservative, we find its curl to see that it is $\vec{0}$: - -```julia; hold=true -F(x,y,z) = [y*z^2, x*z^2, 2*x*y*z] -curl(F(x,y,z), [x,y,z]) -``` - -We need $\phi$ with $\partial{\phi}/\partial{x} = F_x = yz^2$. To that end, we integrate in $x$: - -```math -\phi(x,y,z) = \int yz^2 dx = xyz^2 + g(y,z), -``` -the function $g(y,z)$ is a "constant" of integration (it doesn't depend on $x$). That $\partial{\phi}/\partial{x} = F_x$ is true is easy to verify. Now, consider the partial in $y$: - -```math -\frac{\partial{\phi}}{\partial{y}} = xz^2 + \frac{\partial{g}}{\partial{y}} = F_y = xz^2. -``` - -So we have $\frac{\partial{g}}{\partial{y}}=0$ or $g(y,z) = h(z)$, some constant in $y$. Finally, we must have $\partial{\phi}/\partial{z} = F_z$, or - -```math -\frac{\partial{\phi}}{\partial{z}} = 2xyz + h'(z) = F_z = 2xyz, -``` - -So $h'(z) = 0$. This value can be any constant, even $0$ which we take, so that $g(y,z) = 0$ and $\phi(x,y,z) = xyz^2$ is a scalar potential for $F$. - - -##### Example - -Let $F(x,y,z) = \nabla(xy^2z^3) = \langle y^2z^3, 2xyz^3, 3xy^2z^2\rangle$. Show that the line integrals around the unit circle in the $x-y$ plane and the $y-z$ planes are $0$, as $F$ is conservative. - -```julia; -Fxyz = ∇(x*y^2*z^3) -``` - -```julia; hold=true -r(t) = [cos(t), sin(t), 0] -rp = diff.(r(t), t) -Ft = subs.(Fxyz, x .=> r(t)[1], y.=> r(t)[2], z .=> r(t)[3]) -integrate(Ft ⋅ rp, (t, 0, 2PI)) -``` - -(This is trivial, as `Ft` is $0$, as each term has a $z$ factor of $0$.) - -In the $y-z$ plane we have: - -```julia;hold=true -r(t) = [0, cos(t), sin(t)] -rp = diff.(r(t), t) -Ft = subs.(Fxyz, x .=> r(t)[1], y.=> r(t)[2], z .=> r(t)[3]) -integrate(Ft ⋅ rp, (t, 0, 2PI)) -``` - -This is also easy, as `Ft` has only an `x` component and `rp` has only `y` and `z` components, so the two are orthogonal. 
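-
-As a quick check on the construction above, the gradient of the proposed potential can be compared against the field componentwise - a sketch using `SymPy`, as elsewhere in this section:
-
```julia
using SymPy                     # symbolic algebra, as used throughout this section
@syms x::real y::real z::real
F(x,y,z) = [y*z^2, x*z^2, 2*x*y*z]
ϕ = x*y*z^2                     # the scalar potential found above
diff.(ϕ, [x, y, z]) - F(x,y,z)  # ∇ϕ - F: each component is 0
```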
- - - -##### Example - -In two dimensions the vector field $F(x,y) = \langle -y, x\rangle/(x^2+y^2) = S(x,y)/\|R\|^2$ is irrotational ($0$ curl) and has $0$ divergence, but is *not* conservative in $R^2$, as with $C$ being the unit disk we have -$\oint_C F\cdot\hat{T}ds = \int_0^{2\pi} \langle -\sin(\theta),\cos(\theta)\rangle \cdot \langle-\sin(\theta), \cos(\theta)\rangle/1 d\theta = 2\pi$. This is because $F$ is not continuously differentiable at the origin, so the path $C$ is not in a simply connected domain where $F$ is continuously differentiable. (Were $C$ to avoid the origin, the integral would be $0$.) - -In three dimensions, removing a single point in a domain does change simple connectedness, but removing an entire line will. So the function $F(x,y,z) =\langle -y,x,0\rangle/(x^2+y^2)\rangle$ will have $0$ curl, $0$ divergence, but won't be conservative in a domain that includes the $z$ axis. - -However, the function $F(x,y,z) = \langle x, y,z\rangle/\sqrt{x^2+y^2+z^2}$ has curl $0$, except at the origin. However, $R^3$ less the origin, as a domain, is simply connected, so $F$ will be conservative. - - - -## Divergence theorem - -The divergence theorem is a consequence of a simple observation. Consider two adjacent cubic regions that share a common face. -The boundary integral, $\oint_S F\cdot\hat{N} dA$, can be computed for each cube. The surface integral requires a choice of normal, and the convention is to use the outward pointing normal. The common face of the two cubes has *different* outward pointing normals, the difference being a minus sign. As such, the contribution of the surface integral over this face for one cube is *cancelled* out by the contribution of the surface integral over this face for the adjacent cube. As with Green's theorem, this means for a cubic partition, that only the contribution over the boundary is needed to compute the boundary integral. 
In formulas, if $V$ is a $3$ dimensional cubic region with boundary $S$ and it is partitioned into smaller cubic subregions, $V_i$ with surfaces $S_i$, we have: - -```math -\oint_S F\cdot{N} dA = \sum \oint_{S_i} F\cdot{N} dA. -``` - -If the partition provides a microscopic perspective, then the divergence approximation $\nabla\cdot{F} \approx (1/\Delta{V_i}) \oint_{S_i} F\cdot{N} dA$ can be used to say: - -```math -\oint_S F\cdot{N} dA = -\sum \oint_{S_i} F\cdot{N} dA \approx -\sum (\nabla\cdot{F})\Delta{V_i} \approx -\iiint_V \nabla\cdot{F} dV, -``` - -the last approximation through a Riemann sum approximation. This heuristic leads to: - - - -> **The divergence theorem**: Suppose $V$ is a $3$-dimensional volume which is bounded (compact) and has a boundary, $S$, that is piecewise smooth. If $F$ is a continuously differentiable vector field defined on an open set containing $V$, then: -> ```math -> \iiint_V (\nabla\cdot{F}) dV = \oint_S (F\cdot\hat{N})dS. -> ``` - -That is, the volume integral of the divergence can be computed from the flux integral over the boundary of $V$. - -### Examples of the divergence theorem - -##### Example - -Verify the divergence theorem for the vector field $F(x,y,z) = \langle xy, yz, zx\rangle$ for the cubic box centered at the origin with side lengths $2$. - -We need to compute two terms and show they are equal. We begin with the volume integral: - -```julia; -F₁(x,y,z) = [x*y, y*z, z*x] -DivF₁ = divergence(F₁(x,y,z), [x,y,z]) -integrate(DivF₁, (x, -1,1), (y,-1,1), (z, -1,1)) -``` - -The total integral is $0$ by symmetry, not due to the divergence being $0$, as it is $x+y+z$. - -As for the surface integral, we have $6$ sides to consider. 
We take the sides with $\hat{N}$ being $\pm\hat{i}$:

```julia; hold = true;
Nhat = [1,0,0]
integrate((F₁(x,y,z) ⋅ Nhat), (y, -1, 1), (z, -1,1)) # at x=1
```

In fact, all $6$ side integrals will be $0$, as in this case $F \cdot \hat{i} = xy$ and at $x=1$ the surface integral is just $\int_{-1}^1\int_{-1}^1 y dy dz = 0$, as $y$ is an odd function.

As such, the two sides of the divergence theorem are both $0$, so the theorem is verified.

##### Example

(From Strang) If the temperature inside the sun is $T = \log(1/\rho)$, find the *heat* flow $F=-\nabla{T}$; the source, $\nabla\cdot{F}$; and the flux, $\iint F\cdot\hat{N}dS$. Model the sun as a ball of radius $\rho_0$.

The heat flow is simply:

```julia;
Rₗ(x,y,z) = norm([x,y,z])
Tₗ(x,y,z) = log(1/Rₗ(x,y,z))
HeatFlow = -diff.(Tₗ(x,y,z), [x,y,z])
```

We may recognize this as $\vec{\rho}/\rho^2 = \hat{\rho}/\rho$, writing $\vec{\rho} = \langle x, y, z\rangle$ and $\rho = \|\vec{\rho}\|$.

The source is

```julia;
Divₗ = divergence(HeatFlow, [x,y,z]) |> simplify
```

This simplifies to $1/\rho^2$.

Finally, the surface integral over the surface of the sun is an integral over a sphere of radius $\rho_0$. We could use spherical coordinates to compute this, but note instead that the normal is $\hat{\rho}$, so $F \cdot \hat{N} = 1/\rho = 1/\rho_0$ over this surface. So the surface integral is simply the surface area times $1/\rho_0$: $4\pi\rho_0^2/\rho_0 = 4\pi\rho_0$.

Moreover, though $F$ is not continuous at the origin, the divergence theorem's result holds. Using spherical coordinates we have:

```julia; hold=true
@syms rho::real rho_0::real phi::real theta::real
Jac = rho^2 * sin(phi)
integrate(1/rho^2 * Jac, (rho, 0, rho_0), (theta, 0, 2PI), (phi, 0, PI))
```


##### Example: Continuity equation (Schey)

Imagine a venue with a strict cap on the number of persons at one time. Two ways to monitor this are: at given times, a count, or census, of all the people in the venue can be made.
Or, when possible, a count of people coming in can be compared to a count of people coming out, and the difference should yield the number within. Either works well when access is limited and the venue small, but the latter can also work well on a larger scale. For example, for the subway system of New York it would be impractical to count all the people at a given time using a census, but from turnstile data an accurate count can be had, as turnstiles track people coming in and going out. But turnstiles can be restrictive and cause long(ish) lines.

At some stores, new technology is allowing checkout-free shopping. Imagine if each customer had an app on their phone that can be used to track location. They can be recorded as they enter a store and as they exit, and if RFID tags are on each item in the store, their "purchases" can be tallied up and billed through the app. (Besides paying fewer cashiers, stores can also track, step by step, how a customer interacts with the store.) In any of these three scenarios, a simple principle applies: the total number of people in a confined region can be counted by counting how many crossed the boundary (and in which direction), and the change in time of the count is related to the rate at which people cross the boundary.


For a real-world example, the [New York Times](
https://www.nytimes.com/interactive/2019/07/03/world/asia/hong-kong-protest-crowd-ai.html) ran an article about estimating the size of a large protest in Hong Kong:

> Crowd estimates for Hong Kong’s large pro-democracy protests have been a point of contention for years. The organizers and the police often release vastly divergent estimates. This year’s annual pro-democracy protest on Monday, July 1, was no different. Organizers announced 550,000 people attended; the police said 190,000 people were there at the peak.
> But for the first time in the march’s history, a group of researchers combined artificial intelligence and manual counting techniques to estimate the size of the crowd, concluding that 265,000 people marched.

> On Monday, the A.I. team attached seven iPads to two major footbridges along the march route. Volunteers doing manual counts were also stationed next to the cameras, to help verify the computer count.


The article describes some issues in counting such a large group:

> The high density of the crowd and the moving nature of these protests make estimating the turnout very challenging. For more than a decade, groups have stationed teams along the route and manually counted the rate of people passing through to derive the total number of participants.

As there are no turnstiles to do an accurate count and too many points to come and go, this technique can only be approximate. The article describes how artificial intelligence was used to count the participants. The Times tried its own hand:

> Analyzing a short video clip recorded on Monday, The Times’s model tried to detect people based on color and shape, and then tracked the figures as they moved across the screen. This method helps avoid double counting because the crowd generally flowed in one direction.


The divergence theorem provides two means to compute a value; the point here is to illustrate that there are likewise (at least) two possible ways to compute crowd size. Which is better depends on the situation.

----

Following Schey, we now consider a continuous analog to the crowd counting problem through a flow with a non-uniform density that may vary in time. Let $\rho(x,y,z;t)$ be the time-varying density and $v(x,y,z;t)$ be a vector field indicating the direction of flow. Consider some three-dimensional volume, $V$, with boundary $S$ (though a two-dimensional region would also be applicable).
Then these integrals have interpretations:

```math
\begin{align*}
\iiint_V \rho dV &&\quad\text{Amount contained within }V\\
\frac{\partial}{\partial{t}} \iiint_V \rho dV &=
\iiint_V \frac{\partial{\rho}}{\partial{t}} dV &\quad\text{Change in time of amount contained within }V
\end{align*}
```

Moving the derivative inside the integral requires an assumption of continuity.
Assume the material is *conserved*, meaning that if the amount in the volume $V$ changes it must flow in and out through the boundary. The flow out through $S$, the boundary of $V$, is

```math
\oint_S (\rho v)\cdot\hat{N} dS,
```

using the customary outward pointing normal for the orientation of $S$.

So we have:

```math
\iiint_V \frac{\partial{\rho}}{\partial{t}} dV =
-\oint_S (\rho v)\cdot\hat{N} dS =
- \iiint_V \nabla\cdot\left(\rho v\right)dV.
```

The last equality is by the divergence theorem; the minus sign appears because a positive change in the amount within $V$ means flow *opposite* the outward pointing normal of $S$.

The volume $V$ was arbitrary. While it isn't the case that two integrals being equal implies the integrands are equal, it is the case that if the two integrals are equal for all volumes and the two integrands are continuous, then they are equal.

That is, under the *assumptions* that material is conserved and density is continuous, a continuity equation can be derived from the divergence theorem:

```math
\nabla\cdot(\rho v) = - \frac{\partial{\rho}}{\partial{t}}.
```


##### Example: The divergence theorem can fail to apply

The assumption of the divergence theorem that the vector field be *continuously* differentiable is important, as otherwise it may not hold. With $R(x,y,z) = \langle x,y,z\rangle$ take for example $F = (R/\|R\|) / \|R\|^2 = R/\|R\|^3$.
This has divergence

```julia; hold=true
R(x,y,z) = [x,y,z]
F(x,y,z) = R(x,y,z) / norm(R(x,y,z))^3


divergence(F(x,y,z), [x,y,z]) |> simplify
```

The simplification done by SymPy masks the presence of $(x^2+y^2+z^2)^{-5/2}$ terms arising when taking the partial derivatives, which means the field is *not* continuously differentiable at the origin.


*Were* the divergence theorem applicable, then the integral of $F$ over the unit sphere would mean:

```math
0 = \iiint_V \nabla\cdot{F} dV =
\oint_S F\cdot\hat{N}dS = \oint_S \frac{R}{\|R\|^3} \cdot R dS =
\oint_S 1 dS = 4\pi,
```

using $\hat{N} = R$ and $\|R\| = 1$ on the unit sphere. Clearly, as $0$ is not equal to $4\pi$, the divergence theorem cannot apply.


However, it *does* apply to any volume not enclosing the origin. So without any calculation, if $V$ were shifted over by $2$ units, the volume integral over $V$ would be $0$ and the surface integral over $S$ would be $0$ as well.

As already seen, the inverse square law here arises in the electrostatic force formula, and this same observation was made in the context of Gauss's law.


## Questions


###### Question

(Schey) What conditions on $F: R^2 \rightarrow R^2$ imply $\oint_C F\cdot d\vec{r} = A$? ($A$ is the area bounded by the simple, closed curve $C$)

```julia; hold=true; echo=false
choices = [
L"We must have $\text{curl}(F) = 1$",
L"We must have $\text{curl}(F) = 0$",
L"We must have $\text{curl}(F) = x$"
]
answ = 1
radioq(choices, answ)
```


###### Question

Let $C$ be a simple, closed curve parameterized by $\vec{r}(t) = \langle x(t), y(t) \rangle$, $a \leq t \leq b$. The area contained can be computed by $\int_a^b x(t) y'(t) dt$. Let $\vec{r}(t) = \sin(t) \cdot \langle \cos(t), \sin(t)\rangle$.

Find the area inside $C$.

```julia; hold=true; echo=false
val, err = quadgk(t -> (sin(t)*cos(t)* ForwardDiff.derivative(u->sin(u)^2, t)), 0, 2pi)
numericq(val)
```

###### Question

Let $\hat{N} = \langle \cos(t), \sin(t) \rangle$ and $\hat{T} = \langle -\sin(t), \cos(t)\rangle$.
Then polar coordinates can be viewed as the parametric curve $\vec{r}(t) = r(t) \hat{N}$.

Applying Green's theorem to the vector field $F = \langle -y, x\rangle$, which along the curve is $r(t) \hat{T}$, yields the area formula $(1/2) (\int xdy - \int y dx)$. What is this in polar coordinates (using $\theta = t$)? (Using $(r\hat{N})' = r'\hat{N} + r \hat{N}' = r'\hat{N} + r\hat{T}$ is useful.)

```julia; hold=true; echo=false
choices = [
raw" ``\int rd\theta``",
raw" ``(1/2) \int r d\theta``",
raw" ``\int r^2 d\theta``",
raw" ``(1/2) \int r^2d\theta``"
]
answ=4
radioq(choices, answ)
```


###### Question

Let $\vec{r}(t) = \langle \cos^3(t), \sin^3(t)\rangle$, $0\leq t \leq 2\pi$. (This describes an astroid, a hypocycloid with four cusps.) Compute the area enclosed by the curve $C$ using Green's theorem.

```julia; hold=true; echo=false
r(t) = [cos(t)^3, sin(t)^3]
F(x,y) = [-y,x]/2
F_t = subs.(F(x,y), x.=>r(t)[1], y.=>r(t)[2])
Tangent = diff.(r(t),t)
integrate(F_t ⋅ Tangent, (t, 0, 2PI))
choices = [
raw" ``3\pi/8``",
raw" ``\pi/4``",
raw" ``\pi/2``"
]
answ = 1
radioq(choices, answ)
```

###### Question

Let $F(x,y) = \langle y, x\rangle$. We verify Green's theorem holds when $S$ is the unit square, $[0,1]\times[0,1]$.

The curl of $F$ is

```julia; hold=true; echo=false
choices = [
raw" ``0``",
raw" ``1``",
raw" ``2``"
]
answ =1
radioq(choices, answ, keep_order=true)
```

As the curl is a constant, say $c$, we have $\iint_S (\nabla\times{F}) dS = c \cdot 1$. This is?

```julia; hold=true; echo=false
choices = [
raw" ``0``",
raw" ``1``",
raw" ``2``"
]
answ =1
radioq(choices, answ, keep_order=true)
```

To integrate around the boundary we have ``4`` terms: the path $A$ connecting $(0,0)$ to $(1,0)$ (on the $x$ axis), the path $B$ connecting $(1,0)$ to $(1,1)$, the path $C$ connecting $(1,1)$ to $(0,1)$, and the path $D$ connecting $(0,1)$ to $(0,0)$ (along the $y$ axis).

Which path has tangent $\hat{j}$?

```julia; hold=true; echo=false
choices = ["`` A``","`` B``"," ``C``"," ``D``"]
answ = 2
radioq(choices, answ, keep_order=true)
```

Along path $C$, $F(x,y) = [1,x]$ and $\hat{T}=-\hat{i}$ so $F\cdot\hat{T} = -1$. The path integral $\int_C (F\cdot\hat{T})ds = -1$. What is the value of the path integral over $A$?

```julia; hold=true; echo=false
choices = [
raw" ``-1``",
raw" ``0``",
raw" ``1``"
]
answ = 2
radioq(choices, answ, keep_order=true)
```

What is the integral over the oriented boundary of $S$?

```julia; hold=true; echo=false
choices = [
raw" ``0``",
raw" ``1``",
raw" ``2``"
]
answ =1
radioq(choices, answ, keep_order=true)
```


###### Question

Suppose $F: R^2 \rightarrow R^2$ is a vector field such that $\nabla\cdot{F}=0$ *except* at the origin. Let $C_1$ and $C_2$ be the unit circle and circle with radius $2$ centered at the origin, both parameterized counterclockwise. What is the relationship between $\oint_{C_2} F\cdot\hat{N}ds$ and $\oint_{C_1} F\cdot\hat{N}ds$?

```julia; hold=true; echo=false
choices = [
L"They are the same, as Green's theorem applies to the area, $S$, between $C_1$ and $C_2$ so $\iint_S \nabla\cdot{F}dA = 0$.",
L"They differ by a minus sign, as Green's theorem applies to the area, $S$, between $C_1$ and $C_2$ so $\iint_S \nabla\cdot{F}dA = 0$."
]
answ = 1
radioq(choices, answ)
```

###### Question


Let $F(x,y) = \langle x, y\rangle/(x^2+y^2)$. Though this has divergence $0$ away from the origin, the flow integral around the unit circle, $\oint_C (F\cdot\hat{N})ds$, is $2\pi$, as Green's theorem in divergence form does not apply. Consider the integral around the square centered at the origin, with side lengths $2$. What is the flow integral around this closed curve?

```julia; hold=true; echo=false
choices = [
L"Also $2\pi$, as Green's theorem applies to the region formed by the square minus the circle and so the overall flow integral around the boundary is $0$, so the two will be the same.",
L"It is $-2\pi$, as Green's theorem applies to the region formed by the square minus the circle and so the overall flow integral around the boundary is $0$, so the two will have opposite signs, but the same magnitude."
]
answ = 1
radioq(choices, answ)
```

###### Question

Using the divergence theorem, compute $\iint F\cdot\hat{N} dS$ where $F(x,y,z) = \langle x, x, y \rangle$ and $V$ is the unit ball.

```julia; hold=true; echo=false
choices = [
raw" ``4/3 \pi``",
raw" ``4\pi``",
raw" ``\pi``"
]
answ = 1
radioq(choices, answ)
```


###### Question

Using the divergence theorem, compute $\iint F\cdot\hat{N} dS$ where $F(x,y,z) = \langle y, y,x \rangle$ and $V$ is the unit cube $[0,1]\times[0,1]\times[0,1]$.

```julia; hold=true; echo=false
choices = [
raw" ``1``",
raw" ``2``",
raw" ``3``"
]
answ = 1
radioq(choices, answ, keep_order=true)
```


###### Question

Let $R(x,y,z) = \langle x, y, z\rangle$ and $\rho = \|R\|^2$. If $F = 2R/\rho$ then $F$ is the gradient of a potential. Which one?

```julia; hold=true; echo=false
choices = [
raw" ``\log(\rho)``",
raw" ``1/\rho``",
raw" ``\rho``"
]
answ = 1
radioq(choices, answ)
```

Based on this information, for $S$ a surface not including the origin with boundary $C$, a simple closed curve, what is $\oint_C F\cdot\hat{T}ds$?

```julia; hold=true; echo=false
choices = [
L"It is $0$, as, by Stokes' theorem, it is equivalent to $\iint_S (\nabla\times\nabla{\phi})dS = \iint_S 0 dS = 0$.",
L"It is $2\pi$, as this is the circumference of the unit circle"
]
answ = 1
radioq(choices, answ)
```

###### Question

Consider the circle $C$ in $R^3$ parameterized by $\langle \cos(t), \sin(t), 0\rangle$.
The upper half sphere and the unit disc in the $x-y$ plane are both surfaces with this boundary. Let $F(x,y,z) = \langle -y, x, z\rangle$. Compute $\oint_C F\cdot\hat{T}ds$ using Stokes' theorem. The value is:

```julia; hold=true; echo=false
choices = [
raw" ``2\pi``",
raw" ``2``",
raw" ``0``"
]
answ = 1
radioq(choices, answ)
```

###### Question

From [Illinois](https://faculty.math.illinois.edu/~franklan/Math241_165_ConservativeR3.pdf) comes this advice to check if a vector field $F:R^3 \rightarrow R^3$ is conservative:

* If $\nabla\times{F}$ is non-zero the field is not conservative
* If $\nabla\times{F}$ is zero *and* the domain of $F$ is simply connected (e.g., all of $R^3$), then $F$ is conservative
* If $\nabla\times{F}$ is zero *but* the domain of $F$ is *not* simply connected then ...

What should finish the last sentence?

```julia; hold=true; echo=false
choices = [
"the field could be conservative or not. One must work harder to answer the question.",
"the field is *not* conservative.",
"the field *is* conservative"
]
answ=1
radioq(choices, answ)
```

###### Question

Knill provides the following chart showing what happens under the three main operations on vector-valued functions:

```verbatim
1 -> grad -> 1
1 -> grad -> 2 -> curl -> 1
1 -> grad -> 3 -> curl -> 3 -> div -> 1
```

In the first row, the gradient is just the regular derivative and takes a function $f:R^1 \rightarrow R^1$ into another such function, $f':R \rightarrow R^1$.

In the second row, the gradient is an operation that takes a function $f:R^2 \rightarrow R$ into one $\nabla{f}:R^2 \rightarrow R^2$, whereas the curl takes $F:R^2\rightarrow R^2$ into $\nabla\times{F}:R^2 \rightarrow R^1$.
In the third row, the gradient is an operation that takes a function $f:R^3 \rightarrow R$ into one $\nabla{f}:R^3 \rightarrow R^3$, whereas the curl takes $F:R^3\rightarrow R^3$ into $\nabla\times{F}:R^3 \rightarrow R^3$, and the divergence takes $F:R^3 \rightarrow R^3$ into $\nabla\cdot{F}:R^3 \rightarrow R$.

The diagram emphasizes a few different things:

* The number of integral theorems is implied here. The ones for the gradient are the fundamental theorem of line integrals, namely $\int_C \nabla{f}\cdot d\vec{r}=\int_{\partial{C}} f$, a shorthand notation for $f$ evaluated at the end points.

The one for the curl in $n=2$ is Green's theorem: $\iint_S \nabla\times{F}dA = \oint_{\partial{S}} F\cdot d\vec{r}$.

The one for the curl in $n=3$ is Stokes' theorem: $\iint_S \nabla\times{F}dA = \oint_{\partial{S}} F\cdot d\vec{r}$. Finally, the divergence for $n=3$ is the divergence theorem $\iiint_V \nabla\cdot{F} dV = \oint_{\partial{V}} F\cdot\hat{N} dS$.

* Working left to right along a row of the diagram, applying two steps of these operations yields:

```julia; hold=true; echo=false
choices = [
"Zero, by the vanishing properties of these operations",
"The maximum number in a row",
"The row number plus 1"
]
answ = 1
radioq(choices, answ)
```

###### Question

[Katz](https://www.jstor.org/stable/2690275) provides details on the history of Green, Gauss (divergence), and Stokes. The first paragraph says that each theorem was not original to the attributed name. Part of the reason is that the origins date back to the 17th century, with usage by Lagrange and Laplace in the 18th century and formalization in the 19th century. Another reason is that the applications were different: "Gauss was interested in the theory of magnetic attraction, Ostrogradsky in the theory of heat, Green in electricity and magnetism, Poisson in elastic bodies, and Sarrus in floating bodies."
Finally, in nearly all the cases the theorems were thought of as tools toward some physical end. - -In 1846, Cauchy proved - -```math -\int\left(p\frac{dx}{ds} + q \frac{dy}{ds}\right)ds = -\pm\iint\left(\frac{\partial{p}}{\partial{y}} - \frac{\partial{q}}{\partial{x}}\right)dx dy. -``` - -This is a form of: - -```julia; hold=true; echo=false -choices = [ -"Green's theorem", -"The divergence (Gauss') theorem", -"Stokes' theorem" -] -answ = 1 -radioq(choices, answ, keep_order=true) -``` diff --git a/CwJ/integrals/Project.toml b/CwJ/integrals/Project.toml deleted file mode 100644 index a9f6bb7..0000000 --- a/CwJ/integrals/Project.toml +++ /dev/null @@ -1,8 +0,0 @@ -[deps] -ForwardDiff = "f6369f11-7733-5829-9624-2563aa707210" -Plots = "91a5bcdd-55d7-5caf-9e0b-520d859cae80" -QuadGK = "1fd47b50-473d-5c70-9696-f719f8f3bcdc" -Roots = "f2b01f46-fcfa-551c-844a-d8ac1e96c665" -SymPy = "24249f21-da20-56a4-8eb1-6a02cf4ae2e6" -Unitful = "1986cc42-f94f-5a68-af5c-568840ba703d" -UnitfulUS = "7dc9378f-8956-57ef-a780-aa31cc70ff3d" diff --git a/CwJ/integrals/arc_length.jmd b/CwJ/integrals/arc_length.jmd deleted file mode 100644 index 83a0b45..0000000 --- a/CwJ/integrals/arc_length.jmd +++ /dev/null @@ -1,967 +0,0 @@ -# Arc length - -This section uses these add-on packages: - -```julia -using CalculusWithJulia -using Plots -using SymPy -using QuadGK -using Roots -``` - -```julia; echo=false; results="hidden" -using CalculusWithJulia.WeaveSupport - -frontmatter = ( - title = "Arc length", - description = "Calculus with Julia: Arc length", - tags = ["CalculusWithJulia", "integrals", "arc length"], -); -fig_size=(800, 600) - -nothing -``` - ----- - -```julia;hold=true; echo=false -imgfile = "figures/jump-rope.png" -caption = """ - -A kids' jump rope by Lifeline is comprised of little plastic segments of uniform length around a cord. The length of the rope can be computed by adding up the lengths of each segment, regardless of how the rope is arranged. 
- -""" -ImageFile(:integrals, imgfile, caption) -``` - -The length of the jump rope in the picture can be computed by either looking at the packaging it came in, or measuring the length of each plastic segment and multiplying by the number of segments. The former is easier, the latter provides the intuition as to how we can find the length of curves in the $x-y$ plane. The idea is old, [Archimedes](http://www.maa.org/external_archive/joma/Volume7/Aktumen/Polygon.html) used fixed length segments of polygons to approximate $\pi$ using the circumference of circle producing the bounds $3~\frac{1}{7} > \pi > 3~\frac{10}{71}$. - -A more modern application is the algorithm used by GPS devices to record a path taken. However, rather than record times for a fixed distance traveled, the GPS device records position ($(x,y)$) or longitude and latitude at fixed units of time - similar to how parametric functions are used. The device can then compute distance traveled and speed using some familiar formulas. - -## Arc length formula - -Recall the distance formula gives the distance between two points: $\sqrt{(x_1 - x_0)^2 + (y_1 - y_0)^2}$. - -Consider now two functions $g(t)$ and $f(t)$ and the parameterized -graph between $a$ and $b$ given by the points $(g(t), f(t))$ for $a -\leq t \leq b$. Assume that both $g$ and $f$ are differentiable on -$(a,b)$ and continuous on $[a,b]$ and furthermore that $\sqrt{g'(t)^2 + -f'(t)^2}$ is Riemann integrable. - - -> **The arc length of a curve**. For $f$ and $g$ as described, the arc length of the parameterized curve is given by -> -> ``L = \int_a^b \sqrt{g'(t)^2 + f'(t)^2} dt.`` -> -> For the special case of the graph of a function $f(x)$ between $a$ and $b$ the formula becomes $L = \int_a^b \sqrt{ 1 + f'(x)^2} dx$ (taking $g(t) = t$). - - -!!! note - The form of the integral may seem daunting with the square root and - the derivatives. 
A more general writing would create a vector out of - the two functions: $\phi(t) = \langle g(t), f(t) \rangle$. It is - natural to then let $\phi'(t) = \langle g'(t), f'(t) \rangle$. With - this, the integrand is just the norm - or length - of the - derivative, or $L=\int \| \phi'(t) \| dt$. This is similar to the - distance traveled being the integral of the speed, or the absolute - value of the derivative of position. - - -To see why, any partition of the interval $[a,b]$ by $a = t_0 < t_1 < \cdots < t_n =b$ gives rise to $n+1$ points in the plane given by $(g(t_i), f(t_i))$. - -```julia; hold=false; echo=false; cache=true -## {{{arclength_graph}}} -function make_arclength_graph(n) - - ns = [10,15,20, 30, 50] - - g(t) = cos(t)/t - f(t) = sin(t)/t - - ts = range(1, stop=4pi, length=200) - tis = range(1, stop=4pi, length=ns[n]) - - p = plot(g, f, 1, 4pi, legend=false, size=fig_size, - title="Approximate arc length with $(ns[n]) points") - plot!(p, map(g, tis), map(f, tis), color=:orange) - - p - -end - -n = 5 -anim = @animate for i=1:n - make_arclength_graph(i) -end - -imgfile = tempname() * ".gif" -gif(anim, imgfile, fps = 1) -caption = L""" - -The arc length of the parametric curve can be approximated using straight line segments connecting points. This gives rise to an integral expression defining the length in terms of the functions $f$ and $g$. - -""" - -ImageFile(imgfile, caption) -``` - - - - -The distance between points $(g(t_i), f(t_i))$ and $(g(t_{i-1}), f(t_{i-1}))$ is just - -```math -d_i = \sqrt{(g(t_i)-g(t_{i-1}))^2 + (f(t_i)-f(t_{i-1}))^2} -``` - -The total approximate distance of the curve would be $L_n = d_1 + -d_2 + \cdots + d_n$. This is exactly how we would compute the length -of the jump rope or the distance traveled from GPS recordings. - -However, differences, such as $f(t_i)-f(t_{i-1})$, are the building blocks of approximate derivatives. 
With an eye towards this, we multiply both top and bottom by $t_i - t_{i-1}$ to get: - -```math -L_n = d_1 \cdot \frac{t_1 - t_0}{t_1 - t_0} + d_2 \cdot \frac{t_2 - t_1}{t_2 - t_1} + \cdots + d_n \cdot \frac{t_n - t_{n-1}}{t_n - t_{n-1}}. -``` - -But looking at each term, we can push the denominator into the square root as: - -```math -\begin{align*} -d_i &= d_i \cdot \frac{t_i - t_{i-1}}{t_i - t_{i-1}} -\\ -&= \sqrt{ \left(\frac{g(t_i)-g(t_{i-1})}{t_i-t_{i-1}}\right)^2 + -\left(\frac{f(t_i)-f(t_{i-1})}{t_i-t_{i-1}}\right)^2} \cdot (t_i - t_{i-1}) \\ -&= \sqrt{ g'(\xi_i)^2 + f'(\psi_i)^2} \cdot (t_i - t_{i-1}). -\end{align*} -``` - -The values $\xi_i$ and $\psi_i$ are guaranteed by the mean value theorem and must be in $[t_{i-1}, t_i]$. - - -With this, if $\sqrt{f'(t)^2 + g'(t)^2}$ is integrable, as assumed, then as the size of the partition goes to zero, the sum of the $d_i$, $L_n$, must converge to the integral: - -```math -L = \int_a^b \sqrt{f'(t)^2 + g'(t)^2} dt. -``` - -(This needs a technical adjustment to the Riemann theorem, as we are evaluating our function at two points in the interval. A general proof is [here](https://randomproofs.files.wordpress.com/2010/11/arc_length.pdf).) - - -!!! note - [Bressoud](http://www.math.harvard.edu/~knill/teaching/math1a_2011/exhibits/bressoud/) - notes that Gregory (1668) proved this formula for arc length of the - graph of a function by showing that the length of the curve $f(x)$ is defined - by the area under $\sqrt{1 + f'(x)^2}$. (It is commented that this was - also known a bit earlier by von Heurat.) Gregory went further though, - as part of the fundamental theorem of calculus was contained in his - work. Gregory then posed this inverse question: given a curve - $y=g(x)$ find a function $u(x)$ so that the area under $g$ is equal to - the length of the second curve. The answer given was $u(x) = - (1/c)\int_a^x \sqrt{g^2(t) - c^2}$, which if $g(t) = \sqrt{1 + f'(t)^2}$ - and $c=1$ says $u(x) = \int_a^x f(t)dt$. 

    An analogy might be a sausage maker. It takes a mass of ground-up sausage material and returns a long length of sausage. The material going in would depend on time via an equation like $\int_0^t g(u) du$ and the length coming out would be a constant (accounting for the cross section) times $u(t) = \int_0^t \sqrt{1 + g'(s)^2} ds$.

#### Examples

Let $f(x) = x^2$. The arc length of the graph of $f(x)$ over $[0,1]$ is then $L=\int_0^1 \sqrt{1 + (2x)^2} dx$. A trigonometric substitution of $2x = \tan(\theta)$ leads to the antiderivative:

```julia;
@syms x
F = integrate(sqrt(1 + (2x)^2), x)
```

```julia;
F(1) - F(0)
```

That number has some context, as can be seen from the graph, which gives simple lower and upper bounds of $\sqrt{1^2 + 1^2} = 1.414...$ and $1 + 1 = 2$.

```julia;
f(x) = x^2
plot(f, 0, 1)
```

!!! note
    The integrand $\sqrt{1 + f'(x)^2}$ may seem odd at first, but it can be interpreted as the length of the hypotenuse of a right triangle with "run" of $1$ and "rise" of $f'(x)$. This triangle is easily formed using the tangent line to the graph of $f(x)$. By multiplying by $dx$, the integral is "summing" up the lengths of infinitesimal pieces of the tangent line approximation.


##### Example

Let $f(t) = R\cos(t)$ and $g(t) = R\sin(t)$. Then the parametric curve over $[0, 2\pi]$ is a circle. As the curve does not retrace itself, the arc-length of the curve is just the circumference of the circle. To see that the arc length formula gives us familiar answers, we have:

```math
L = \int_0^{2\pi} \sqrt{(R\cos(t))^2 + (-R\sin(t))^2} dt = R\int_0^{2\pi} \sqrt{\cos(t)^2 + \sin(t)^2} dt =
R\int_0^{2\pi} dt = 2\pi R.
```

##### Example

Let $f(x) = \log(x)$. Find the length of the graph of $f$ over $[1/e, e]$.

The answer is

```math
L = \int_{1/e}^e \sqrt{1 + \left(\frac{1}{x}\right)^2} dx.
-``` - -This has a *messy* antiderivative, so we let `SymPy` compute for us: - -```julia; -ex = integrate(sqrt(1 + (1/x)^2), (x, 1/sympy.E, sympy.E)) # sympy.E is symbolic -``` - -Which isn't so satisfying. From a quick graph, we see the answer should be no more than 4, and we see in fact it is - -```julia; -N(ex) -``` - -##### Example - - -A [catenary shape](http://en.wikipedia.org/wiki/Catenary) is the shape a hanging chain will take as it is suspended between two posts. It appears elsewhere, for example, power wires will also have this shape as they are suspended between towers. A formula for a catenary can be written in terms of the hyperbolic cosine, `cosh` in `julia` or exponentials. - -```math -y = a \cosh(x/a) = a \cdot \frac{e^{x/a} + e^{-x/a}}{2}. -``` - -Suppose we have the following chain hung between $x=-1$ and $x=1$ with $a = 2$: - -```julia; -chain(x; a=2) = a * cosh(x/a) -plot(chain, -1, 1) -``` - -How long is the chain? Looking at the graph we can guess an answer is -between $2$ and $2.5$, say, but it isn't much work to get -an approximate numeric answer. Recall, the accompanying `CalculusWithJulia` package defines `f'` to find the derivative using the `ForwardDiff` package. - - -```julia; -quadgk(x -> sqrt(1 + chain'(x)^2), -1, 1)[1] -``` - -We used a numeric approach, but this can be solved by hand and the answer is surprising. - -##### Example - -This picture of Jasper John's [Near the Lagoon](http://www.artic.edu/aic/collections/artwork/184095) was taken at The Art Institute Chicago. - - -```julia; hold=true; echo=false -imgfile = "figures/johns-catenary.jpg" -caption = "One of Jasper Johns' Catenary series. Art Institute of Chicago." 
-ImageFile(:integrals, imgfile, caption) -``` - - - -The museum notes have - -> For his Catenary series (1997–2003), of which Near the Lagoon is -> the largest and last work, Johns formed catenaries—a term used to -> describe the curve assumed by a cord suspended freely from two -> points—by tacking ordinary household string to the canvas or its -> supports. - -This particular catenary has a certain length. The basic dimensions -are ``78``in wide and ``118``in drop. We shift the basic function for catenaries to have $f(78/2) = f(-78/2) = 0$ and -$f(0) = -118$ (the top curve segment is on the $x$ axis and centered). We let our shifted function be parameterized by - -```math -f(x; a, b) = a \cosh(x/a) - b. -``` - - -Evaluating at $0$ gives: - -```math --118 = a - b \text{ or } b = a + 118. -``` - -Evaluating at $78/2$ gives: $a \cdot \cosh(78/(2a)) - (a + 118) = 0$. This can be solved numerically for a: - -```julia; -cat(x; a=1, b=0) = a*cosh(x/a) - b -find_zero(a -> cat(78/2, a=a, b=118 + a), 10) -``` - -Rounding, we take $a=13$. With these parameters ($a=13$, $b = 131$), we -compute the length of Johns' catenary in inches: - -```julia; hold=true -a = 13 -b = 118 + a -f(x) = cat(x; a=13, b=118+13) -quadgk(x -> sqrt(1 + f'(x)^2), -78/2, 78/2)[1] -``` - -##### Example - - -Suspension bridges, like the Verrazzano-Narrows Bridge, have different loading -than a cable and hence a different shape. A parabola is the shape the -cable takes under uniform loading (cf. [page 19](http://calteches.library.caltech.edu/4007/1/Calculus.pdf) for a -picture). - - - - -The Verrazzano-Narrows -[Bridge](https://www.brownstoner.com/brooklyn-life/verrazano-narrows-bridge-anniversary-historic-photos/) has a span -of $1298$m. - - -Suppose the drop of the -main cables is $147$ meters over this span. Then the cable itself can -be modeled as a parabola with - -* The $x$-intercepts $a = 1298/2$ and $-a$ and -* vertex $(0,b)$ with $b=-147$. 
- -The parabola that fits these three points is - -```math -y = \frac{-b}{a^2}(x^2 - a^2) -``` - -Find the length of the cable in meters. - -```julia;hold=true -a = 1298/2; -b = -147; -f(x) = (-b/a^2)*(x^2 - a^2); -val, _ = quadgk(x -> sqrt(1 + f'(x)^2), -a, a) -val -``` - -```julia; hold=true; echo=false -imgfile="figures/verrazzano-unloaded.jpg" -caption = """ -The Verrazzano-Narrows Bridge during construction. The unloaded suspension cables form a catenary. -""" -ImageFile(:integrals, imgfile, caption) -``` - -```julia; hold=true; echo=false -imgfile="figures/verrazzano-loaded.jpg" -caption = """ -A rendering of the Verrazzano-Narrows Bridge after construction (cf. [nycgovparks.org](https://www.nycgovparks.org/highlights/verrazano-bridge)). The uniformly loaded suspension cables would form a parabola, presumably a fact the artist of this rendering knew. (The spelling in the link is not the official spelling, which carries two zs.) -""" -ImageFile(:integrals, imgfile, caption) -``` - - - -##### Example - -The -[nephroid](http://www-history.mcs.st-and.ac.uk/Curves/Nephroid.html) -is a curve that can be described parametrically by - -```math -\begin{align*} -g(t) &= a(3\cos(t) - \cos(3t)), \\ -f(t) &= a(3\sin(t) - \sin(3t)). -\end{align*} -``` - -Taking $a=1$ we have this graph: - -```julia; -a = 1 -𝒈(t) = a*(3cos(t) - cos(3t)) -𝒇(t) = a*(3sin(t) - sin(3t)) -plot(𝒈, 𝒇, 0, 2pi) -``` - -Find the length of the perimeter of the closed figure formed by the graph. - -We have $\sqrt{g'(t)^2 + f'(t)^2} = \sqrt{18 - 18\cos(2t)}$. -An antiderivative isn't forthcoming through `SymPy`, so we take a numeric approach to find the length: - - -```julia; -quadgk(t -> sqrt(𝒈'(t)^2 + 𝒇'(t)^2), 0, 2pi)[1] -``` - -The answer seems like a floating point approximation of $24$, which suggests that this integral is tractable. 
Pursuing this, the integrand simplifies:
-
-```math
-\begin{align*}
-\sqrt{g'(t)^2 + f'(t)^2}
-&= \sqrt{(-3\sin(t) + 3\sin(3t))^2 + (3\cos(t) - 3\cos(3t))^2} \\
-&= 3\sqrt{(\sin(t)^2 - 2\sin(t)\sin(3t) + \sin(3t)^2) + (\cos(t)^2 -2\cos(t)\cos(3t) + \cos(3t)^2)} \\
-&= 3\sqrt{(\sin(t)^2+\cos(t)^2) + (\sin(3t)^2 + \cos(3t)^2) - 2(\sin(t)\sin(3t) + \cos(t)\cos(3t))}\\
-&= 3\sqrt{2(1 - (\sin(t)\sin(3t) + \cos(t)\cos(3t)))}\\
-&= 3\sqrt{2}\sqrt{1 - \cos(2t)}\\
-&= 3\sqrt{2}\sqrt{2\sin(t)^2}.
-\end{align*}
-```
-
-The second-to-last line uses the angle-subtraction formula to recognize $\sin(t)\sin(3t) + \cos(t)\cos(3t) = \cos(3t - t) = \cos(2t)$; the last line uses the double-angle identity $1 - \cos(2t) = 2\sin(t)^2$.
-
-By graphing, we see that integrating over $[0,2\pi]$ gives twice the answer to integrating over $[0, \pi]$ (where $\sin(t) \geq 0$), which allows the simplification to:
-
-```math
-L = \int_0^{2\pi} \sqrt{g'(t)^2 + f'(t)^2}dt = \int_0^{2\pi} 3\sqrt{2}\sqrt{2\sin(t)^2}dt =
-3 \cdot 2 \cdot 2 \int_0^\pi \sin(t) dt = 3 \cdot 2 \cdot 2 \cdot 2 = 24.
-```
-
-##### Example
-
-A teacher of small children assigns his students the task of computing the length of a jump rope by counting the number of $1$-inch segments it is made of. He knows that if a student is accurate, no matter how fast or slow they count, the answer will be the same. (That is, unless the student starts counting in the wrong direction by mistake.) The teacher knows this, as he is certain that the length of a curve is independent of its parameterization, as it is a property intrinsic to the curve.
-
-Mathematically, suppose a curve is described parametrically by $(g(t), f(t))$ for $a \leq t \leq b$. A new parameterization is provided by $\gamma(t)$. Suppose $\gamma$ is strictly increasing, so that an inverse function exists. (This assumption is implicitly made by the teacher, as it implies the student won't start counting in the wrong direction.) Then the same curve is described by composition through $(g(\gamma(u)), f(\gamma(u)))$, $\gamma^{-1}(a) \leq u \leq \gamma^{-1}(b)$.
That the arc length is the same follows from substitution:
-
-```math
-\begin{align*}
-\int_{\gamma^{-1}(a)}^{\gamma^{-1}(b)} \sqrt{([g(\gamma(t))]')^2 + ([f(\gamma(t))]')^2} dt
-&=\int_{\gamma^{-1}(a)}^{\gamma^{-1}(b)} \sqrt{(g'(\gamma(t))\gamma'(t))^2 + (f'(\gamma(t))\gamma'(t))^2 } dt \\
-&=\int_{\gamma^{-1}(a)}^{\gamma^{-1}(b)} \sqrt{g'(\gamma(t))^2 + f'(\gamma(t))^2} \gamma'(t) dt\\
-&=\int_a^b \sqrt{g'(u)^2 + f'(u)^2} du = L
-\end{align*}
-```
-
-(Using $u=\gamma(t)$ for the substitution; pulling $\gamma'(t)$ out of the square root uses $\gamma'(t) > 0$.)
-
-In traveling there are two natural parameterizations: one by time, as in "how long have we been driving?"; and the other by distance, as in "how far have we been driving?" Parameterizing by distance, or more technically arc length, has other mathematical advantages.
-
-To parameterize by arc length, we just need to consider a special $\gamma$ defined by:
-
-```math
-\gamma(u) = \int_0^u \sqrt{g'(t)^2 + f'(t)^2} dt.
-```
-
-Supposing $\sqrt{g'(t)^2 + f'(t)^2}$ is continuous and positive, this
-transformation is increasing, as its derivative by the Fundamental
-Theorem of Calculus is $\gamma'(u) = \sqrt{g'(u)^2 + f'(u)^2}$, which by assumption
-is positive. (It is certainly non-negative.) So there exists an inverse
-function. That it exists is one thing; computing all of this is a
-different matter, of course.
-
-For a simple example, we have $g(t) = R\cos(t)$ and $f(t)=R\sin(t)$
-parameterizing the circle of radius $R$. The arc length between $0$
-and $t$ is simply $\gamma(t) = Rt$, which we can easily see from the
-formula. The inverse of this function is $\gamma^{-1}(u) = u/R$, so
-composing with the inverse gives the arc-length parameterization $(g(t/R), f(t/R))$ for $0 \leq t \leq 2\pi R$.
-
-What looks at first glance to be just a slightly more complicated equation is that of an ellipse, with $g(t) = a\cos(t)$ and $f(t) = b\sin(t)$.
Taking $a=1$ and $b^2 = 1 + c$ for some $c > 0$, the arc length as a function of $t$ is just
-
-```math
-\begin{align*}
-s(u) &= \int_0^u \sqrt{(-\sin(t))^2 + b^2\cos(t)^2} dt\\
-     &= \int_0^u \sqrt{\sin(t)^2 + \cos(t)^2 + c\cos(t)^2} dt \\
-     &= \int_0^u \sqrt{1 + c\cos(t)^2} dt.
-\end{align*}
-```
-
-But, despite it not looking too daunting, this integral is not tractable through our techniques and has an answer involving elliptic integrals. We can work numerically though. Letting $a=1$ and $b=2$, the arc length is given by:
-
-```julia;
-𝒂, 𝒃 = 1, 2
-𝒔(u) = quadgk(t -> sqrt(𝒂^2 * sin(t)^2 + 𝒃^2 * cos(t)^2), 0, u)[1]
-```
-
-This has a graph which does not look familiar, but which we can see is monotonically increasing, so it will have an inverse function:
-
-```julia;
-plot(𝒔, 0, 2pi)
-```
-
-The range is $[0, s(2\pi)]$.
-
-The inverse function can be found by solving; we use the bracketing version of `find_zero` for this:
-
-```julia;
-sinv(u) = find_zero(x -> 𝒔(x) - u, (0, 2pi))
-```
-
-Here we see visually that the new parameterization yields the same curve:
-
-```julia; hold=true
-g(t) = 𝒂 * cos(t)
-f(t) = 𝒃 * sin(t)
-
-plot(t -> g(sinv(t)), t -> f(sinv(t)), 0, 𝒔(2pi))
-```
-
-
-#### Example: An implication of concavity
-
-Following (faithfully) [Kantorwitz and Neumann](https://www.researchgate.net/publication/341676916_The_English_Galileo_and_His_Vision_of_Projectile_Motion_under_Air_Resistance), we consider a function ``f(x)`` with the property that **both** ``f`` and ``f'`` are strictly concave down on ``[a,b]`` and suppose ``f(a) = f(b)``. Further, assume ``f'`` is continuous. We will see this implies facts about arc length and other integrals related to ``f``.
-
-The following figure is clearly of a concave down function. The asymmetry about the critical point will be seen to be a result of the derivative also being concave down.
This asymmetry will be characterized in several different ways in the following, including showing that the arc length from ``(a,0)`` to ``(c,f(c))`` is longer than that from ``(c,f(c))`` to ``(b,0)``.
-
-```julia; hold=true; echo=false
-function trajectory(x; g = 9.8, v0 = 50, theta = 45*pi/180, k = 1/8)
-    a = v0 * cos(theta)
-    (g/(k*a) + tan(theta))* x + (g/k^2) * log(1 - k/a*x)
-end
-v0 = 50; theta = 45*pi/180; k = 1/5
-𝒂 = v0 * cos(theta)
-Δ = 𝒂/k
-a = find_zero(trajectory, (50, Δ-1/10))
-plot(trajectory, 0, a, legend=false, linewidth=5)
-
-u = 25
-fu = trajectory(u)
-v = find_zero(x -> trajectory(x) - fu, (50, Δ))
-plot!([u,v], [fu,fu])
-
-c = find_zero(trajectory', (50, 100))
-plot!([c,c], [0, trajectory(c)])
-
-h(y) = tangent(trajectory, u)(y) - tangent(trajectory, v)(y)
-d = find_zero(h, (u,v))
-plot!(tangent(trajectory, u), 0, 110)
-plot!(tangent(trajectory, v), 75, 150)
-
-plot!(zero)
-𝒚 = 4
-annotate!([(0, 𝒚, "a"), (152, 𝒚, "b"), (u, 𝒚, "u"), (v, 𝒚, "v"), (c, 𝒚, "c")])
-```
-
-
-Take ``a < u < c < v < b`` with ``f(u) = f(v)`` and ``c`` a critical point, as in the picture. There must be a critical point by Rolle's theorem, and it must be unique: the derivative, which exists by the assumptions, is strictly decreasing due to the concavity of ``f``, and hence there can be at most ``1`` critical point.
-
-Some facts about this picture can be proven from the definition of concavity:
-
-> The slope of the tangent line at ``u`` goes up slower than the slope of the tangent line at ``v`` declines: ``f'(u) < -f'(v)``.
-
-Since ``f'`` is *strictly* concave, its graph over ``[u,v]`` lies strictly above the chord connecting ``(u, f'(u))`` and ``(v, f'(v))``. Integrating and using ``f(u) = f(v)`` gives
-
-```math
-0 = f(v) - f(u) = \int_u^v f'(t) dt > (v - u) \cdot \frac{f'(u) + f'(v)}{2},
-```
-
-so ``f'(u) + f'(v) < 0``, that is, ``f'(u) < -f'(v)``.
-
-> The critical point is greater than the midpoint between ``u`` and ``v``: ``(u+v)/2 < c``.
-
-The function ``f`` restricted to ``[a,c]`` and ``[c,b]`` is strictly monotone, as ``f'`` only changes sign at ``c``. Hence, there are inverse functions, say ``f_1^{-1}`` and ``f_2^{-1}``, taking ``[0,m]`` to ``[a,c]`` and ``[c,b]`` respectively, where ``m = f(c)``.
The inverses are differentiable, as ``f'`` exists, and must satisfy: -``[f_1^{-1}]'(y) > 0`` (as ``f'`` is positive on ``[a,c]``) and, similarly, -``[f_2^{-1}]'(y) < 0``. By the previous result, the inverses also satisfy: - -```math -[f_1^{-1}]'(y) > -[f_2^{-1}]'(y) -``` - -(The inequality reversed due to the derivative of the inverse function being related to the reciprocal of the derivative of the function.) - -For any ``0 \leq \alpha < \beta \leq m`` we have: - -```math -\int_{\alpha}^{\beta} ([f_1^{-1}]'(y) +[f_2^{-1}]'(y)) dy > 0 -``` - -By the fundamental theorem of calculus: - -```math -(f_1^{-1}(y) + f_2^{-1}(y))\big|_\alpha^\beta > 0 -``` - -On rearranging: - -```math -f_1^{-1}(\alpha) + f_2^{-1}(\alpha) < f_1^{-1}(\beta) + f_2^{-1}(\beta) -``` - -That is ``f_1^{-1} + f_2^{-1}`` is strictly increasing. - - -Taking ``\beta=m`` gives a bound in terms of ``c`` for any ``0 \leq \alpha < m``: - -```math -f_1^{-1}(\alpha) + f_2^{-1}(\alpha) < 2c. -``` - -The result comes from setting ``\alpha=f(u)``; setting ``\alpha=0`` shows the result for ``[a,b]``. - -> The intersection point of the two tangent lines, ``d``, satisfies ``(u+v)/2 < d``. - -If ``f(u) = f(v)``, the previously established relationship between the slopes of the tangent lines suggests the answer. However, this statement is actually true more generally, with just the assumption that ``u < v`` and not necessarily that ``f(u)=f(v)``. - - -Solving for ``d`` from equations of the tangent lines gives - -```math -d = \frac{f(v)-f(u) + uf'(u) - vf'(v)}{f'(u) - f'(v)} -``` - -So ``(u+v)/2 < d`` can be re-expressed as - -```math -\frac{f'(u) + f'(v)}{2} < \frac{f(v) - f(u)}{v-u} -``` - -which holds by the strict concavity of ``f'``, as found previously. - -> Let ``h=f(u)``. The areas under ``f`` are such that there is more area in ``[a,u]`` than ``[v,b]`` and more area under ``f(x)-h`` in ``[u,c]`` than ``[c,v]``. In particular more area under ``f`` over ``[a,c]`` than ``[c,b]``. 
-
-Using the substitution ``x = f_1^{-1}(y)`` or ``x = f_2^{-1}(y)``, as appropriate, we see (with ``h = f(u)``):
-
-```math
-\begin{align*}
-\int_a^u f(x) dx &= \int_0^{h} y [f_1^{-1}]'(y) dy \\
-&> -\int_0^h y [f_2^{-1}]'(y) dy \\
-&= \int_h^0 y [f_2^{-1}]'(y) dy \\
-&= \int_v^b f(x) dx.
-\end{align*}
-```
-
-For the latter claim, integrating in the ``y`` variable and using ``f_1^{-1}(y) + f_2^{-1}(y) < 2c`` gives
-
-```math
-\begin{align*}
-\int_u^c (f(x)-h) dx &= \int_h^m (c - f_1^{-1}(y)) dy\\
-&> \int_h^m (f_2^{-1}(y) - c) dy\\
-&= \int_c^v (f(x)-h) dx
-\end{align*}
-```
-
-Now, the area under the horizontal line at height ``h`` over ``[u,c]`` is greater than that over ``[c,v]``, as ``(u+v)/2 < c`` or ``v-c < c-u``. That means the area under ``f`` over ``[u,c]`` is greater than that over ``[c,v]``.
-
-> There is more arc length for ``f`` over ``[a,u]`` than ``[v,b]``; more arc length for ``f`` over ``[u,c]`` than ``[c,v]``. In particular more arc length over ``[a,c]`` than ``[c,b]``.
-
-Let ``\phi(z) = f_2^{-1}(f_1(z))`` be the function taking ``u`` to ``v``, ``a`` to ``b``, and moreover the interval ``[a,u]`` to ``[v,b]``. Further, ``f(z) = f(\phi(z))``. The function is differentiable, as it is a composition of differentiable functions, and for any ``a \leq z \leq u`` we have
-
-```math
-f'(\phi(z)) \cdot \phi'(z) = f'(z) > 0
-```
-
-As ``f'(\phi(z)) < 0``, this forces ``\phi'(z) < 0``. Moreover, we have by the first assertion that ``f'(z) < -f'(\phi(z))``, so ``|\phi'(z)| = |f'(z)/f'(\phi(z))| < 1``.
-
-Using the substitution ``x = \phi(z)`` gives:
-
-```math
-\begin{align*}
-\int_v^b \sqrt{1 + f'(x)^2} dx &=
-\int_u^a \sqrt{1 + f'(\phi(z))^2} \phi'(z) dz\\
-&= \int_a^u \sqrt{1 + f'(\phi(z))^2} |\phi'(z)| dz\\
-&= \int_a^u \sqrt{\phi'(z)^2 + (f'(\phi(z))\phi'(z))^2} dz\\
-&= \int_a^u \sqrt{\phi'(z)^2 + f'(z)^2} dz\\
-&< \int_a^u \sqrt{1 + f'(z)^2} dz
-\end{align*}
-```
-
-Letting ``u`` and ``v = \phi(u)`` tend to ``c``, we get the *inequality*
-
-```math
-\int_c^b \sqrt{1 + f'(x)^2} dx \leq \int_a^c \sqrt{1 + f'(x)^2} dx,
-```
-
-which must also hold for any paired ``u, v=\phi(u)``.
This allows the use of the strict inequality over ``[v,b]`` and ``[a,u]`` to give:
-
-```math
-\int_c^b \sqrt{1 + f'(x)^2} dx < \int_a^c \sqrt{1 + f'(x)^2} dx,
-```
-
-which would also hold for any paired ``u, v``.
-
-
-Now, why is this of interest? Previously, we have considered the example of the trajectory of an arrow on a windy day given in function form by:
-
-```math
-f(x) = \left(\frac{g}{k v_0\cos(\theta)} + \tan(\theta) \right) x + \frac{g}{k^2}\log\left(1 - \frac{k}{v_0\cos(\theta)} x \right)
-```
-
-This comes from solving the projectile motion equations with a drag force *proportional* to the velocity. This function satisfies:
-
-```julia; hold=true
-@syms gₑ::positive, k::positive, v₀::positive, θ::positive, x::positive
-ex = (gₑ/(k*v₀*cos(θ)) + tan(θ))*x + gₑ/k^2 * log(1 - k/(v₀*cos(θ))*x)
-diff(ex, x, x), diff(ex, x, x, x)
-```
-
-Both the second and third derivatives are negative (as ``0 \leq x < (v_0\cos(\theta))/k`` due to the logarithm term), so both ``f`` and ``f'`` are strictly concave down. Hence the results above apply: the arrow will fly further as it goes up than as it comes down, and will carve out more area on its way up than on its way down.
-
-In general, the drag force need not be proportional to the velocity, but merely in the opposite direction to the velocity vector ``\langle x'(t), y'(t) \rangle``:
-
-```math
--m W(t, x(t), x'(t), y(t), y'(t)) \cdot \langle x'(t), y'(t)\rangle,
-```
-
-with the case above corresponding to the constant ``W = k``. The set of equations then satisfies:
-
-```math
-\begin{align*}
-x''(t) &= - W(t,x(t), x'(t), y(t), y'(t)) \cdot x'(t)\\
-y''(t) &= -g - W(t,x(t), x'(t), y(t), y'(t)) \cdot y'(t)\\
-\end{align*}
-```
-
-with initial conditions: ``x(0) = y(0) = 0`` and ``x'(0) = v_0 \cos(\theta), y'(0) = v_0 \sin(\theta)``.
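This system can also be explored numerically. The following is a minimal plain-Julia sketch (an addition for illustration, not from the original text): it integrates the equations with Euler's method for the constant case ``W = k`` (linear drag) and checks the asymmetry established above, namely that the peak of the trajectory occurs past the midpoint of the horizontal range. The function name, step size, and parameter values are illustrative choices.

```julia
# Euler's method for x'' = -W*x', y'' = -g - W*y' with constant W = k.
# Illustrative sketch: names and parameter values are chosen here, not given in the text.
function simulate_drag(; g = 9.8, k = 1/5, v0 = 50.0, theta = pi/4, dt = 1e-4)
    x, y = 0.0, 0.0
    vx, vy = v0*cos(theta), v0*sin(theta)
    xpeak = 0.0
    while y >= 0.0
        vy > 0 && (xpeak = x)       # record the x-coordinate while still rising
        x += vx*dt; y += vy*dt
        vx += -k*vx * dt            # x'' = -W x'
        vy += (-g - k*vy) * dt      # y'' = -g - W y'
    end
    (x, xpeak)                      # (landing x, x at the peak)
end

xland, xpeak = simulate_drag()
xpeak > xland / 2                   # true: the peak lies past the midpoint
```

Here `xpeak > xland/2` agrees with the ``(u+v)/2 < c`` result: the arrow covers more horizontal distance going up than coming down.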
-
-Only with certain drag forces can this set of equations be solved exactly, though it can be approximated numerically for admissible ``W``. But if ``W`` is strictly positive, it can be shown that ``x(t)`` is increasing with range ``[0, x_\infty)`` and so invertible, and ``f(u) = y(x^{-1}(u))`` is three times differentiable with both ``f`` and ``f'`` being strictly concave, as it can be shown that (say ``x(v) = u``, so ``dv/du = 1/x'(v) > 0``):
-
-```math
-\begin{align*}
-f''(u) &= -\frac{g}{x'(v)^2} < 0\\
-f'''(u) &= \frac{2gx''(v)}{x'(v)^3} \cdot \frac{dv}{du} \\
-&= -\frac{2gW}{x'(v)^2} \cdot \frac{dv}{du} < 0
-\end{align*}
-```
-
-The latter follows by differentiating the former and using ``x'' = -W \cdot x'``; the former is a consequence of the following formulas for derivatives of inverse functions
-
-```math
-\begin{align*}
-[x^{-1}]'(u) &= 1 / x'(v) \\
-[x^{-1}]''(u) &= -x''(v)/(x'(v))^3
-\end{align*}
-```
-
-For then
-
-```math
-\begin{align*}
-f(u) &= y(x^{-1}(u)) \\
-f'(u) &= y'(x^{-1}(u)) \cdot [x^{-1}]'(u) \\
-f''(u) &= y''(x^{-1}(u))\cdot[x^{-1}]'(u)^2 + y'(x^{-1}(u)) \cdot [x^{-1}]''(u) \\
- &= y''(v) / (x'(v))^2 - y'(v) \cdot x''(v) / x'(v)^3\\
- &= -g/(x'(v))^2 - W y'(v)/(x'(v))^2 - y'(v) \cdot (- W \cdot x'(v)) / x'(v)^3\\
- &= -g/x'(v)^2.
-\end{align*}
-```
-
-
-
-## Questions
-
-###### Question
-
-The length of the curve given by $f(x) = e^x$ between $0$ and $1$ is certainly longer than the length of the line connecting $(0, f(0))$ and $(1, f(1))$. What is that length?
-
-```julia; hold=true; echo=false
-f(x) = exp(x)
-val = sqrt( (f(1) - f(0))^2 + (1 - 0)^2)
-numericq(val)
-```
-
-The length of the curve is certainly less than the length of going from $(0,f(0))$ to $(1, f(0))$ and then up to $(1, f(1))$. What is the length of this upper bound?
-
-```julia; hold=true; echo=false
-f(x) = exp(x)
-val = (1 - 0) + (f(1) - f(0))
-numericq(val)
-```
-
-Now find the actual length of the curve numerically:
-
-```julia; hold=true; echo=false
-a,b = 0, 1
-val, _ = quadgk(x -> sqrt(1 + exp(x)^2), a, b)
-numericq(val)
-```
-
-###### Question
-
-Find the length of the graph of $f(x) = x^{3/2}$ between $0$ and $4$.
-
-```julia; hold=true; echo=false
-f(x) = x^(3/2)
-a, b = 0, 4
-val, _ = quadgk( x -> sqrt(1 + f'(x)^2), a, b)
-numericq(val)
-```
-
-
-###### Question
-
-A [pursuit](http://www-history.mcs.st-and.ac.uk/Curves/Pursuit.html) curve is a track an optimal pursuer will take when chasing prey. The function $f(x) = x^2 - \log(x)$ is an example. Find the length of the curve between $1/10$ and $2$.
-
-```julia; hold=true; echo=false
-f(x) = x^2 - log(x)
-a, b = 1/10, 2
-val, _ = quadgk( x -> sqrt(1 + f'(x)^2), a, b)
-numericq(val)
-```
-
-
-
-###### Question
-
-Find the length of the graph of $f(x) = \tan(x)$ between $-\pi/4$ and $\pi/4$.
-
-```julia; hold=true; echo=false
-f(x) = tan(x)
-a, b = -pi/4, pi/4
-val, _ = quadgk( x -> sqrt(1 + f'(x)^2), a, b)
-numericq(val)
-```
-
-Note, the straight line segment should be a close approximation and has length:
-
-```julia; hold=true;
-sqrt((tan(pi/4) - tan(-pi/4))^2 + (pi/4 - -pi/4)^2)
-```
-
-###### Question
-
-Find the length of the graph of the function $g(x) =\int_0^x \tan(u)du$ between $0$ and $\pi/4$ by hand or numerically:
-
-```julia; hold=true; echo=false
-fp(x) = tan(x)
-a, b = 0, pi/4
-val, _ = quadgk(x -> sqrt(1 + fp(x)^2), a, b)
-numericq(val)
-```
-
-
-###### Question
-
-
-A boat sits at the point $(a, 0)$ and a man holds a rope taut attached to the boat at the origin $(0,0)$. The man walks on the $y$ axis.
The position $y$ depends then on the position $x$ of the boat, and if the rope is taut, the position satisfies:
-
-
-```math
-y = a \ln\frac{a + \sqrt{a^2 - x^2}}{x} - \sqrt{a^2 - x^2}
-```
-
-This can be entered into `julia` as:
-
-```julia;
-h(x, a) = a * log((a + sqrt(a^2 - x^2))/x) - sqrt(a^2 - x^2)
-```
-
-
-Let $a=12$ and $f(x) = h(x, a)$. Compute the length the bow of the boat has traveled between $x=1$ and $x=a$ using `quadgk`.
-
-```julia; hold=true; echo=false
-a = 12
-f(x) = h(x, a);
-val = quadgk(x -> sqrt(1 + D(f)(x)^2), 1, a)[1];
-numericq(val, 1e-3)
-```
-
-(The most elementary description of this curve is in terms
-of the relationship $dy/dx = -\sqrt{a^2-x^2}/x$, which could be used in place of `D(f)` in your work.)
-
-!!! note
-    To see an example of how the tractrix can be found in an everyday observation, follow this link on a description of [bicycle](https://simonsfoundation.org/multimedia/mathematical-impressions-bicycle-tracks) tracks.
-
-###### Question
-
-`SymPy` fails with the brute force approach to finding the length of a catenary, but can succeed with a little help:
-
-```julia; hold=true
-@syms x::real a::real
-f(x,a) = a * cosh(x/a)
-inside = 1 + diff(f(x,a), x)^2
-```
-
-Just trying `integrate(sqrt(inside), x)` will fail, but if we try `integrate(sqrt(simplify(inside)), x)` an antiderivative can be found. What is it?
-
-```julia; echo=false
-choices = ["``a \\sinh{\\left(\\frac{x}{a} \\right)}``",
-           "``\\frac{a \\sinh{\\left(\\frac{x}{a} \\right)} \\cosh{\\left(\\frac{x}{a} \\right)}}{2} - \\frac{x \\sinh^{2}{\\left(\\frac{x}{a} \\right)}}{2} + \\frac{x \\cosh^{2}{\\left(\\frac{x}{a} \\right)}}{2}``"
-          ]
-radioq(choices, 1)
-```
-
-###### Question
-
-A curve is parameterized by $g(t) = t + \sin(t)$ and $f(t) = \cos(t)$. Find the arc length of the curve between $t=0$ and $\pi$.
- -```julia; echo=false -let - g(t) = t + sin(t) - f(t) = cos(t) - a, b = 0, pi - val, _ = quadgk( x -> sqrt(D(g)(x)^2 + D(f)(x)^2), a, b) - numericq(val) -end -``` - -###### Question - -The [astroid](http://www-history.mcs.st-and.ac.uk/Curves/Astroid.html) is -a curve parameterized by $g(t) = \cos(t)^3$ and $f(t) = \sin(t)^3$. Find the arc length of the curve between $t=0$ and $2\pi$. (This can be computed by hand or numerically.) - -```julia; echo=false -let - g(t) = cos(t)^3 - f(t) = sin(t)^3 - a, b = 0, 2pi - val, _ = quadgk( x -> sqrt(D(g)(x)^2 + D(f)(x)^2), a, b) - numericq(val) -end -``` - -###### Question - -A curve is parameterized by $g(t) = (2t + 3)^{2/3}/3$ and $f(t) = t + t^2/2$, for $0\leq t \leq 3$. Compute the arc-length numerically or by hand: - -```julia; echo=false -let - g(t) = (2t+3)^(2/3)/3 - f(t) = t + t^2/2 - a, b = 0, 3 - val, _ = quadgk( x -> sqrt(D(g)(x)^2 + D(f)(x)^2), a, b) - numericq(val) -end -``` - - -###### Question - -The cycloid is parameterized by $g(t) = a(t - \sin(t))$ and $f(t) = a(1 - \cos(t))$ for $a > 0$. Taking $a=3$, and $t$ in $[0, 2\pi]$, find the length of the curve traced out. (This was solved by the architect and polymath [Wren](https://www.maa.org/sites/default/files/pdf/cmj_ftp/CMJ/January%202010/3%20Articles/3%20Martin/08-170.pdf) in 1650.) - - -```julia; echo=false -let - a = 3 - g(t) = a*(t - sin(t)) - f(t) = a*(1 - cos(t)) - val, _ = quadgk( x -> sqrt(D(g)(x)^2 + D(f)(x)^2), 0, 2pi) - numericq(val) -end -``` - -A cycloid parameterized this way can be generated by a circle of radius ``a``. 
Based on this example, what do you think Wren wrote to Pascal about this length: - -```julia; hold=true; echo=false -choices = ["The length of the cycloidal arch is exactly **two** times the radius of the generating -circle.", - "The length of the cycloidal arch is exactly **four** times the radius of the generating -circle.", - "The length of the cycloidal arch is exactly **eight** times the radius of the generating -circle."] -radioq(choices, 3, keep_order=true) -``` - -!!! note - In [Martin](https://www.maa.org/sites/default/files/pdf/cmj_ftp/CMJ/January%202010/3%20Articles/3%20Martin/08-170.pdf) we read why Wren was mailing Pascal: - - After demonstrating mathematical talent at an early age, Blaise Pascal - turned his attention to theology, denouncing the study of mathematics - as a vainglorious pursuit. Then one night, unable to sleep as the - result of a toothache, he began thinking about the cycloid and to his - surprise, his tooth stopped aching. Taking this as a sign that he had - God’s approval to continue, Pascal spent the next eight days studying - the curve. During this time he discovered nearly all of the geometric - properties of the cycloid. He issued some of his results in ``1658`` in - the form of a contest, offering a prize of forty Spanish gold pieces - and a second prize of twenty pieces. 
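Wren's result can be checked with nothing more than a Riemann sum. A plain-Julia sketch (an addition for illustration, not from the original text), using the simplification ``\sqrt{g'(t)^2 + f'(t)^2} = 2a\sin(t/2)``, valid on ``[0, 2\pi]``:

```julia
# Midpoint Riemann sum of the cycloid's speed 2a*sin(t/2) over [0, 2pi].
# One arch should have length 8a (Wren's result); the function name is ours.
function cycloid_arch_length(a; n = 100_000)
    h = 2pi / n
    sum(2a * sin((i - 1/2) * h / 2) * h for i in 1:n)
end

cycloid_arch_length(3)    # ≈ 24.0, eight times the generating radius 3
```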
diff --git a/CwJ/integrals/area.jmd b/CwJ/integrals/area.jmd
deleted file mode 100644
index 6bf78f3..0000000
--- a/CwJ/integrals/area.jmd
+++ /dev/null
@@ -1,1548 +0,0 @@
-# Area under a curve
-
-
-This section uses these add-on packages:
-
-```julia
-using CalculusWithJulia
-using Plots
-using QuadGK
-using Roots
-```
-
-```julia; echo=false; results="hidden"
-
-using CalculusWithJulia.WeaveSupport
-
-fig_size = (800, 600)
-using Markdown, Mustache
-
-const frontmatter = (
-    title = "Area under a curve",
-    description = "Calculus with Julia: Area under a curve",
-    tags = ["CalculusWithJulia", "integrals", "area under a curve"],
-);
-
-nothing
-```
-
-----
-
-The question of area has long fascinated human culture. As children,
-we learn early on the formulas for the areas of some geometric
-figures: a square is $b^2$, a rectangle $b\cdot h$, a triangle $1/2
-\cdot b \cdot h$, and for a circle, $\pi r^2$. The area of a rectangle
-is often the intuitive basis for illustrating multiplication. The area of a triangle
-has been known for ages. Even a complicated expression, such as
-[Heron's](http://tinyurl.com/mqm9z) formula, which relates the area of
-a triangle to measurements from its perimeter, has been around for
-$2000$ years. The formula for the area of a circle is also quite
-old. Wikipedia dates it as far back as the
-[Rhind](http://en.wikipedia.org/wiki/Rhind_Mathematical_Papyrus)
-papyrus of 1700 BC, with the approximation of $256/81$ for $\pi$.
-
-
-The modern approach to area begins with a non-negative function $f(x)$
-over an interval $[a,b]$. The goal is to compute the area under the
-graph. That is, the area between $f(x)$ and the $x$-axis for $a \leq x \leq b$.
-
-
-For some functions, this area can be computed by geometry, for
-example, here we see the area under $f(x)$ is just $1$, as it is a triangle with base $2$ and height $1$:
-
-```julia; hold=true;
-f(x) = 1 - abs(x)
-plot(f, -1, 1)
-plot!(zero)
-```
-
-Similarly, we know this area is also $1$, it being a square:
-
-```julia; hold=true;
-f(x) = 1
-plot(f, 0, 1)
-plot!(zero)
-```
-
-This one is simply $\pi/2$, it being half a circle of radius $1$:
-
-```julia; hold=true;
-f(x) = sqrt(1 - x^2)
-plot(f, -1, 1)
-plot!(zero)
-```
-
-And this area can be broken into a sum of the area of a square and the area of a triangle, or $1 + 1/2$:
-
-```julia; hold=true;
-f(x) = x > 1 ? 2 - x : 1.0
-plot(f, 0, 2)
-plot!(zero)
-```
-
-But what of more complicated areas? Can these have their area computed?
-
-## Approximating areas
-
-In a previous section, we saw this animation:
-
-```julia; hold=true; echo=false; cache=true
-## {{{archimedes_parabola}}}
-
-
-f(x) = x^2
-colors = [:black, :blue, :orange, :red, :green, :orange, :purple]
-
-## Area of parabola
-
-function make_triangle_graph(n)
-    title = "Area of parabolic cup ..."
- n==1 && (title = "Area = 1/2") - n==2 && (title = "Area = previous + 1/8") - n==3 && (title = "Area = previous + 2*(1/8)^2") - n==4 && (title = "Area = previous + 4*(1/8)^3") - n==5 && (title = "Area = previous + 8*(1/8)^4") - n==6 && (title = "Area = previous + 16*(1/8)^5") - n==7 && (title = "Area = previous + 32*(1/8)^6") - - - - plt = plot(f, 0, 1, legend=false, size = fig_size, linewidth=2) - annotate!(plt, [(0.05, 0.9, text(title,:left))]) # if in title, it grows funny with gr - n >= 1 && plot!(plt, [1,0,0,1, 0], [1,1,0,1,1], color=colors[1], linetype=:polygon, fill=colors[1], alpha=.2) - n == 1 && plot!(plt, [1,0,0,1, 0], [1,1,0,1,1], color=colors[1], linewidth=2) - for k in 2:n - xs = range(0, stop=1, length=1+2^(k-1)) - ys = map(f, xs) - k < n && plot!(plt, xs, ys, linetype=:polygon, fill=:black, alpha=.2) - if k == n - plot!(plt, xs, ys, color=colors[k], linetype=:polygon, fill=:black, alpha=.2) - plot!(plt, xs, ys, color=:black, linewidth=2) - end - end - plt -end - - - -n = 7 -anim = @animate for i=1:n - make_triangle_graph(i) -end - -imgfile = tempname() * ".gif" -gif(anim, imgfile, fps = 1) - - -caption = L""" -The first triangle has area $1/2$, the second has area $1/8$, then $2$ have area $(1/8)^2$, $4$ have area $(1/8)^3$, ... -With some algebra, the total area then should be $1/2 \cdot (1 + (1/4) + (1/4)^2 + \cdots) = 2/3$. -""" - -ImageFile(imgfile, caption) -``` - -This illustrates a method of -[Archimedes](http://en.wikipedia.org/wiki/The_Quadrature_of_the_Parabola) -to compute the area contained in a parabola using the method of -exhaustion. Archimedes leveraged a fact he discovered relating the -areas of triangle inscribed with parabolic segments to create a sum -that could be computed. - -The pursuit of computing areas persisted. The method of computing area -by finding a square with an equivalent area was known as -*quadrature*. 
Over the years, many figures had their area computed,
-for example, the area under the graph of the
-[cycloid](http://en.wikipedia.org/wiki/Cycloid) (Galileo tried to find this empirically, using a tracing on sheet metal and a scale).
-
-However, as areas of geometric objects were replaced by the more
-general question of area related to graphs of functions, a more
-general study was called for.
-
-One such approach is illustrated in this figure due to Beeckman from 1618
-(from
-[Bressoud](http://www.math.harvard.edu/~knill/teaching/math1a_2011/exhibits/bressoud/))
-
-```julia; echo=false
-imgfile = "figures/beeckman-1618.png"
-caption = L"""
-
-Figure of Beeckman (1618) showing a means to compute the area under a
-curve, in this example the line connecting points $A$ and $B$. Using
-approximations by geometric figures with known area is the basis of
-Riemann sums.
-
-"""
-ImageFile(:integrals, imgfile, caption)
-nothing
-```
-
-Beeckman actually did more than find the area. He generalized the
-relationship of rate $\times$ time $=$ distance. The line was interpreted as a
-velocity; the "squares", then, provided an approximate distance traveled
-when the velocity is taken as a constant on the small time
-interval. The distance traveled can then be approximated by a smaller
-quantity - just add the areas of the rectangles lying entirely within the
-desired region ($6+16+6$) - and a larger quantity - by including all
-rectangles that have a portion of their area within the desired region
-($10 + 16 + 10$). Beeckman argued that the error vanishes as the
-rectangles get smaller.
-
-
-Adding up the smaller "squares" can be a bit more efficient if we were
-to add all those in a row, or column, at once. We would then add the
-areas of a smaller number of rectangles. For this curve, the two
-approaches are basically identical.
For other curves, identifying
-which squares in a row would be added is much more complicated (though
-useful), but for a curve generated by a function, identifying which
-"squares" go in a rectangle is quite easy: the rectangle's base is
-given by that of the squares, and its height is determined by the function.
-
-### Adding rectangles
-
-The idea of the Riemann sum then is to approximate the area under the
-curve by the area of well-chosen rectangles in such a way that as the
-bases of the rectangles get smaller (hence adding more rectangles) the
-error in approximation vanishes.
-
-Define a partition of $[a,b]$ to be a selection of points $a = x_0
-< x_1 < \cdots < x_{n-1} < x_n = b$. The norm of the partition is the
-largest of all the differences $\lvert x_i - x_{i-1} \rvert$. For a partition, consider an
-arbitrary selection of points $c_i$ satisfying $x_{i-1} \leq c_i \leq
-x_{i}$, $1 \leq i \leq n$. Then the following is a **Riemann sum**:
-
-```math
-S_n = f(c_1) \cdot (x_1 - x_0) + f(c_2) \cdot (x_2 - x_1) + \cdots + f(c_n) \cdot (x_n - x_{n-1}).
-```
-
-Clearly for a given partition and choice of $c_i$, the above can be
-computed. Each term $f(c_i)\cdot(x_i-x_{i-1})$ can be visualized as
-the area of a rectangle with base spanning from $x_{i-1}$ to $x_i$ and
-height given by the function value at $c_i$. The following
-visualizes left Riemann sums for different values of $n$ in a way that
-makes Beeckman's intuition plausible -- that as the number of rectangles gets larger, the
-approximate sum will get closer to the actual area.
-
-```julia; hold=true; echo=false
-rectangle(x, y, w, h) = Shape(x .+ [0,w,w,0], y .+ [0,0,h,h])
-function ₙ(j)
-    a = ("₋","","","₀","₁","₂","₃","₄","₅","₆","₇","₈","₉")
-    join([a[Int(i)-44] for i in string(j)])
-end
-
-function left_riemann(n)
-    f = x -> -(x+1/2)*(x-1)*(x-3) + 1
-    p = plot(f, 1, 3, legend=false, linewidth=5)
-    a, b = 1, 3
-    Δ = (b-a)/n
-    for i ∈ 0:n-1
-        xᵢ = a + i*Δ
-        plot!(rectangle(xᵢ, 0, Δ, f(xᵢ)), opacity=0.5, color=:red)
-    end
-    a = round(sum(f(a + i*Δ)*Δ for i ∈ 0:n-1), digits=3)
-
-    title!("L$(ₙ(n)) = $a")
-    p
-end
-
-anim = @animate for i ∈ (2,4,8,16,32,64)
-    left_riemann(i)
-end
-
-imgfile = tempname() * ".gif"
-gif(anim, imgfile, fps = 1)
-
-caption = "Illustration of left Riemann sum for increasing ``n`` values"
-
-ImageFile(imgfile, caption)
-```
-
-
-To successfully compute a good approximation for the area, we
-would need to choose $c_i$ and the partition so that a formula can be
-found to express the dependence on the size of the partition.
-
-For Archimedes' problem - finding the area under $f(x)=x^2$ between
-$0$ and $1$ - if we take as a partition $x_i = i/n$ and $c_i = x_i$,
-then the above sum becomes:
-
-```math
-\begin{align*}
-S_n &= f(c_1) \cdot (x_1 - x_0) + f(c_2) \cdot (x_2 - x_1) + \cdots + f(c_n) \cdot (x_n - x_{n-1})\\
-&= (x_1)^2 \cdot \frac{1}{n} + (x_2)^2 \cdot \frac{1}{n} + \cdots + (x_n)^2 \cdot \frac{1}{n}\\
-&= 1^2 \cdot \frac{1}{n^3} + 2^2 \cdot \frac{1}{n^3} + \cdots + n^2 \cdot \frac{1}{n^3}\\
-&= \frac{1}{n^3} \cdot (1^2 + 2^2 + \cdots + n^2) \\
-&= \frac{1}{n^3} \cdot \frac{n\cdot(n+1)\cdot(2n+1)}{6}.
-\end{align*}
-```
-
-The latter uses a well-known formula for the sum of the squares of the
-first $n$ natural numbers.
-
-With this expression, it is readily seen that as $n$ gets large this value gets close to $2/6 = 1/3$.
-
-!!! note
-    The above approach, like Archimedes', ends with a limit being
-    taken. The answer comes from using a limit to add a big number of
-    small values.
As with all limit questions, worrying about whether a
-    limit exists is fundamental. For this problem, we will see that for
-    the general statement there is a stretching of the formal concept of a limit.
-
-
-----
-
-There is a more compact notation for $x_1 + x_2 + \cdots + x_n$, using the *summation notation*, or capital sigma. We have:
-
-```math
-\Sigma_{i = 1}^n x_i = x_1 + x_2 + \cdots + x_n
-```
-
-The notation includes three pieces of information:
-
-- The $\Sigma$ is an indication of a sum.
-
-- The ${i=1}$ and $n$ sub- and superscripts indicate the range to sum over.
-
-- The term $x_i$ is a general term describing the $i$th entry, where it is understood that $i$ is just some arbitrary indexing value.
-
-With this notation, a Riemann sum can be written as $\Sigma_{i=1}^n f(c_i)(x_i-x_{i-1})$.
-
-
-### Other sums
-
-The choice of the $c_i$ will give different answers for the
-approximation, though for an integrable function these differences
-will vanish in the limit. Some common choices are:
-
-* Using the right hand endpoint of the interval $[x_{i-1}, x_i]$ giving the right-Riemann sum, ``R_n``.
-
-* The choice $c_i = x_{i-1}$ gives the left-Riemann sum, ``L_n``.
-
-* The choice $c_i = (x_i + x_{i-1})/2$ is the midpoint rule, ``M_n``.
-
-* If the function is continuous on the closed subinterval $[x_{i-1}, x_i]$, then it will take on its minimum and maximum values. By the extreme value theorem, we could take $c_i$ to correspond to either the maximum or the minimum. These choices give the "upper Riemann-sums" and "lower Riemann-sums".
-
-
-The choice of partition can also give different answers. A common
-choice is to break the interval into $n$ equal-sized pieces. With
-$\Delta = (b-a)/n$, these pieces become the arithmetic sequence $a =
-a + 0 \cdot \Delta < a + 1 \cdot \Delta < a + 2 \cdot \Delta < \cdots
-< a + n \cdot \Delta = b$ with $x_i = a + i (b-a)/n$. (The
-`range(a, b, length=n+1)` command will compute these.)
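These choices are easy to compare numerically. A minimal sketch for Archimedes' problem, ``f(x)=x^2`` on ``[0,1]``, with a regular partition (the variable names here are illustrative only):

```julia
f(x) = x^2
a, b, n = 0, 1, 10_000
xs = range(a, b, length=n+1)                      # regular partition
Δ = (b - a) / n
L = sum(f, xs[1:end-1]) * Δ                       # left endpoints
R = sum(f, xs[2:end]) * Δ                         # right endpoints
M = sum(f, (xs[1:end-1] .+ xs[2:end]) ./ 2) * Δ   # midpoints
(L, R, M)                                         # all near 1/3
```

As ``f`` is increasing here, the left sum underestimates and the right sum overestimates, with the midpoint rule noticeably closer.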
An alternate choice -made below for one problem is to use a geometric progression: - -```math -a = a(1+\alpha)^0 < a(1+\alpha)^1 < a (1+\alpha)^2 < \cdots < a (1+\alpha)^n = b. -``` - -The general statement allows for any partition such that the largest gap -goes to ``0``. - ----- - -Riemann sums weren't named after Riemann because he was the first to -approximate areas using rectangles. Indeed, others had been using even -more efficient ways to compute areas for centuries prior to -Riemann's work. Rather, Riemann put the definition of the area under -the curve on a firm theoretical footing with the following -theorem which gives a concrete notion of what functions -are integrable: - - -> **Riemann Integral**: A function $f$ is Riemann integrable over the -> interval $[a,b]$ and its integral will have value $V$ provided for every -> $\epsilon > 0$ there exists a $\delta > 0$ such that for any -> partition $a =x_0 < x_1 < \cdots < x_n=b$ with $\lvert x_i - x_{i-1} -> \rvert < \delta$ and for any choice of points $x_{i-1} \leq c_i \leq -> x_{i}$ this is satisfied: -> -> ```math -> \lvert \sum_{i=1}^n f(c_i)(x_{i} - x_{i-1}) - V \rvert < \epsilon. -> ``` -> -> When the integral exists, it is written $V = \int_a^b f(x) dx$. - - - -!!! info "History note" - - The expression $V = \int_a^b f(x) dx$ is known as the *definite integral* of $f$ over $[a,b]$. Much earlier than Riemann, Cauchy had defined the definite integral in terms of a sum of rectangular products beginning with $S=(x_1 - x_0) f(x_0) + (x_2 - x_1) f(x_1) + \cdots + (x_n - x_{n-1}) f(x_{n-1})$ (the left Riemann sum). He showed the limit was well defined for any continuous function. Riemann's formulation relaxes the choice of partition and the choice of the $c_i$ so that integrability can be better understood. - - - - -### Some immediate consequences - -The following formulas are consequences when $f(x)$ is -integrable. These mostly follow through a judicious rearranging of the -approximating sums. 
-
-
-The area is $0$ when there is no width to the interval to integrate over:
-
-> $\int_a^a f(x) dx = 0.$
-
-
-Even our definition of a partition doesn't really apply, as we assume
-$a < b$, but clearly if $a=x_0=x_n=b$ then our only "approximating" sum
-could be $f(a)(b-a) = 0$.
-
-
-The area under a constant function is found from the area of a
-rectangle, a special case being $c=0$ yielding $0$ area:
-
-> $\int_a^b c dx = c \cdot (b-a).$
-
-
-For any partition of $a < b$, we have
-$S_n = c(x_1 - x_0) + c(x_2 -x_1) + \cdots + c(x_n - x_{n-1})$.
-By factoring out the $c$, we have a
-*telescoping sum* which means the sum simplifies to $S_n = c(x_n-x_0)
-= c(b-a)$. Hence any limit must be this constant value.
-
-
-Scaling the $y$ axis by a constant can be done before or after
-computing the area:
-
-> $\int_a^b cf(x) dx = c \int_a^b f(x) dx.$
-
-
-Let $a=x_0 < x_1 < \cdots < x_n=b$ be any partition. Then we have
-$S_n= cf(c_1)(x_1-x_0) + \cdots + cf(c_n)(x_n-x_{n-1})$ $=$
-$c\cdot\left[ f(c_1)(x_1 - x_0) + \cdots + f(c_n)(x_n - x_{n-1})\right]$. The
-"limit" of the left side is $\int_a^b c f(x) dx$. The "limit" of the
-right side is $c \cdot \int_a^b f(x) dx$. We call this a "sketch" as a
-formal proof would show that for any $\epsilon$ we could choose a
-$\delta$ so that any partition with norm less than $\delta$ will yield a sum
-within $\epsilon$ of the limit. Here, then, our "any" partition would be one for
-which the $\delta$ on the left hand side applies. The computation
-shows that the same $\delta$ would apply for the right hand side when
-$\epsilon$ is the same.
-
-
-The area is invariant under shifts left or right:
-
-> $\int_a^b f(x - c) dx = \int_{a-c}^{b-c} f(x) dx.$
-
-Any partition $a =x_0 < x_1 < \cdots < x_n=b$ is related to a
-partition of $[a-c, b-c]$ through $a-c = x_0-c < x_1-c < \cdots <
-x_n - c = b-c$.
Let $d_i=c_i-c$ denote the corresponding choice of points for this partition; then we have:
-
-```math
-f(c_1 -c) \cdot (x_1 - x_0) + f(c_2 -c) \cdot (x_2 - x_1) + \cdots + f(c_n -c) \cdot (x_n - x_{n-1}) =
-f(d_1) \cdot(x_1-c - (x_0-c)) + f(d_2) \cdot(x_2-c - (x_1-c)) + \cdots + f(d_n) \cdot(x_n-c - (x_{n-1}-c)).
-```
-
-The left side will have a limit of $\int_a^b f(x-c) dx$; the right
-would have a "limit" of $\int_{a-c}^{b-c}f(x)dx$.
-
-Similarly, reflections don't affect the area under the curve; they just require a new parameterization:
-
-> ```math
-> \int_a^b f(x) dx = \int_{-b}^{-a} f(-x) dx
-> ```
-
-The scaling operation ``g(x) = f(cx)`` has the following:
-
-> $\int_a^b f(c\cdot x) dx = \frac{1}{c} \int_{ca}^{cb}f(x) dx$
-
-The scaling operation shifts ``a`` to ``ca`` and ``b`` to ``cb`` so the limits of integration make sense. However, the area stretches by ``c`` in the ``x`` direction, so must contract by ``c`` in the ``y`` direction to stay in balance. Hence the factor of ``1/c``.
-
-Combining two operations above, the operation ``g(x) = \frac{1}{h}f(\frac{x-c}{h})`` will leave the area between ``a`` and ``b`` under ``g`` the same as the area under ``f`` between ``(a-c)/h`` and ``(b-c)/h``.
-
-
-----
-
-
-The area between $a$ and $b$ can be broken up into the sum of the area
-between $a$ and $c$ and that between $c$ and $b$.
-
-> $\int_a^b f(x) dx = \int_a^c f(x) dx + \int_c^b f(x) dx.$
-
-
-For this, suppose for a given $\epsilon/2$ and $\delta$ we have partitions
-for both the integrals on the right hand side. Combining these
-into a partition of $[a,b]$ will mean $\delta$ still bounds the norm. The
-approximating sums will combine to have error no more than $\epsilon/2 +
-\epsilon/2$, so for a given $\epsilon$, this $\delta$ applies.
-
-The "reversed" area is the same, only accounted for with a minus sign.
-
-> $\int_a^b f(x) dx = -\int_b^a f(x) dx.$
-
-A consequence of the last few statements is:
-
-> If $f(x)$ is an even function, then $\int_{-a}^a f(x) dx = 2
-> \int_0^a f(x) dx$. If $f(x)$ is an odd function, then $\int_{-a}^a f(x) dx = 0$.
-
-(For an even function the areas to the left and right of $0$ are equivalent; for an odd function they cancel.)
-
-
-If $g$ bounds $f$, then the area under $g$ will bound the area under
-$f$. In particular, if $f(x)$ is non-negative, the area under
-$f$ will be non-negative for any $a < b$. (This assumes that $g$ and $f$
-are integrable.)
-
-> If $0 \leq f(x) \leq g(x)$ then $\int_a^b f(x) dx \leq \int_a^b g(x)
-> dx.$
-
-For any partition of $[a,b]$ and choice of $c_i$, we have the
-term-by-term bound $f(c_i)(x_i-x_{i-1}) \leq g(c_i)(x_i-x_{i-1})$, so
-any sequence of partitions with norm going to $0$ will have this
-inequality maintained for the sums and hence for the limits.
-
-
-
-### Some known integrals
-
-Using the definition, we can compute a few definite integrals:
-
-> $\int_a^b c dx = c \cdot (b-a).$
-
-> $\int_a^b x dx = \frac{b^2}{2} - \frac{a^2}{2}.$
-
-
-This is just the area of a trapezoid with parallel sides $a$ and $b$ and
-width $b-a$, or $1/2 \cdot (b + a) \cdot (b - a)$. The right sum
-would be:
-
-
-```math
-\begin{align*}
-S &= x_1 \cdot (x_1 - x_0) + x_2 \cdot (x_2 - x_1) + \cdots + x_n \cdot (x_n - x_{n-1}) \\
-&= (a + 1\frac{b-a}{n}) \cdot \frac{b-a}{n} + (a + 2\frac{b-a}{n}) \cdot \frac{b-a}{n} + \cdots + (a + n\frac{b-a}{n}) \cdot \frac{b-a}{n}\\
-&= n \cdot a \cdot (\frac{b-a}{n}) + (1 + 2 + \cdots + n) \cdot (\frac{b-a}{n})^2 \\
-&= n \cdot a \cdot (\frac{b-a}{n}) + \frac{n(n+1)}{2} \cdot (\frac{b-a}{n})^2 \\
-& \rightarrow a \cdot(b-a) + \frac{(b-a)^2}{2} \\
-&= \frac{b^2}{2} - \frac{a^2}{2}.
-\end{align*}
-```
-
-
-> $\int_a^b x^2 dx = \frac{b^3}{3} - \frac{a^3}{3}.$
-
-This is similar to the Archimedes case with $a=0$ and $b=1$ shown above.
-
-> $\int_a^b x^k dx = \frac{b^{k+1}}{k+1} - \frac{a^{k+1}}{k+1},\quad k \neq -1$.
-
-Cauchy showed this using a *geometric series* for the partition, not the arithmetic series $x_i = a + i (b-a)/n$. The series is defined by $1 + \alpha = (b/a)^{1/n}$, so that $x_i = a \cdot (1 + \alpha)^i$. Here the bases $x_{i+1} - x_i$ simplify to $x_i \cdot \alpha$ and $f(x_i) = (a\cdot(1+\alpha)^i)^k = a^k (1+\alpha)^{ik}$, or $f(x_i)(x_{i+1}-x_i) = a^{k+1}\alpha[(1+\alpha)^{k+1}]^i$,
-so, using $u=(1+\alpha)^{k+1}=(b/a)^{(k+1)/n}$, $f(x_i) \cdot(x_{i+1} - x_i) = a^{k+1}\alpha u^i$. This gives
-
-```math
-\begin{align*}
-S &= a^{k+1}\alpha u^0 + a^{k+1}\alpha u^1 + \cdots + a^{k+1}\alpha u^{n-1} \\
-&= a^{k+1} \cdot \alpha \cdot (u^0 + u^1 + \cdots + u^{n-1}) \\
-&= a^{k+1} \cdot \alpha \cdot \frac{u^n - 1}{u - 1}\\
-&= (b^{k+1} - a^{k+1}) \cdot \frac{\alpha}{(1+\alpha)^{k+1} - 1} \\
-&\rightarrow \frac{b^{k+1} - a^{k+1}}{k+1}.
-\end{align*}
-```
-
-
-> $\int_a^b x^{-1} dx = \log(b) - \log(a), \quad (0 < a < b).$
-
-
-Again, Cauchy showed this using a geometric series. The expression $f(x_i) \cdot(x_{i+1} - x_i)$ becomes just $\alpha$. So the approximating sum becomes:
-
-```math
-S = f(x_0)(x_1 - x_0) + f(x_1)(x_2 - x_1) + \cdots + f(x_{n-1}) (x_n - x_{n-1}) = \alpha + \alpha + \cdots + \alpha = n\alpha.
-```
-
-But, letting $x = 1/n$, the limit above is just the limit of
-
-```math
-\lim_{x \rightarrow 0+} \frac{(b/a)^x - 1}{x} = \log(b/a) = \log(b) - \log(a).
-```
-
-(This limit can be computed using L'Hospital's rule.)
-
-Certainly other integrals could be computed with various tricks, but
-we won't pursue this. There is another way to evaluate integrals using
-the forthcoming Fundamental Theorem of Calculus.
-
-### Some other consequences
-
-* The integral is defined in terms of any partition with its norm
-  bounded by $\delta$. If you know a function $f$ is Riemann
-  integrable, then it is enough to consider just a regular partition
-  $x_i = a + i \cdot (b-a)/n$ when forming the sums, as was done above.
It is just that
-  showing a limit for just this particular type of partition would not be
-  sufficient to prove Riemann integrability.
-
-* The choice of $c_i$ is arbitrary to allow for maximum
-  flexibility. The Darboux integrals use the maximum and minimum over
-  the subinterval. To prove integrability, it is sufficient to show
-  that the limit exists with just these choices.
-
-* Most importantly,
-
-> A continuous function on $[a,b]$ is Riemann integrable on $[a,b]$.
-
-The main idea behind this is that the difference between the maximum
-and minimum values over each subinterval gets small. That is, if
-$[x_{i-1}, x_i]$ is like $1/n$ in length, then the difference between
-the maximum of $f$ over this interval, $M$, and the minimum, $m$, over
-this interval will go to zero as $n$ gets big. That $m$ and $M$ exist
-is due to the extreme value theorem; that this difference goes to $0$
-is a consequence of continuity. That this difference goes to $0$ at the
-*same* rate -- no matter which subinterval is being discussed -- is a
-consequence of a notion of uniform continuity, a concept
-discussed in advanced calculus, but one which holds for continuous
-functions on closed intervals. Armed with this, the difference between
-the upper and lower Riemann sums for a general partition can be bounded
-by this small difference times $b-a$, which will go to zero. So the
-upper and lower Riemann sums will converge to the same value.
-
-* A "jump", or discontinuity of the first kind, is a value $c$ in $[a,b]$ where $\lim_{x \rightarrow c+} f(x)$ and $\lim_{x \rightarrow c-}f(x)$ both exist, but are not equal. It is true that a function that is not continuous on $I=[a,b]$, but only has discontinuities of the first kind on $I$, will be Riemann integrable on $I$.
-
-For example, the function $f(x) = 1$ for $x$ in $[0,1]$ and $0$
-otherwise will be integrable, as it is continuous at all but two
-points, $0$ and $1$, where it jumps.
-
-
-* Some functions can have infinitely many points of discontinuity and still be integrable.
The function $f(x) = 1/q$ when $x=p/q$ is rational (in lowest terms), and $0$ otherwise, is often used as an example.
-
-## Numeric integration
-
-The Riemann sum approach gives a method to approximate the value of a
-definite integral. We just compute an approximating sum for a large
-value of $n$, so large that the limiting value and the approximating
-sum are close.
-
-
-To see the mechanics, let's again return to Archimedes' problem and compute $\int_0^1 x^2 dx$.
-
-
-Let us fix some values:
-
-```julia;
-a, b = 0, 1
-f(x) = x^2
-```
-
-Then for a given $n$ we have some steps to do: create the partition,
-find the $c_i$, multiply the pieces and add up. Here is one way to do all this:
-
-```julia;
-n = 5
-xs = a:(b-a)/n:b   # also range(a, b, length=n+1)
-deltas = diff(xs)  # forms x2-x1, x3-x2, ..., xn-xn-1
-cs = xs[1:end-1]   # finds left-hand end points. xs[2:end] would be right-hand ones.
-```
-
-Now to multiply the values. We want to sum the product `f(cs[i]) * deltas[i]`; here is one way to do so:
-
-```julia;
-sum(f(cs[i]) * deltas[i] for i in 1:length(deltas))
-```
-
-Our answer is not so close to the value of $1/3$, but what did we
-expect - we only used $n=5$ intervals. Trying again with $50,000$ gives
-us:
-
-```julia; hold=true
-n = 50_000
-xs = a:(b-a)/n:b
-deltas = diff(xs)
-cs = xs[1:end-1]
-sum(f(cs[i]) * deltas[i] for i in 1:length(deltas))
-```
-
-This value is about $10^{-5}$ off from the actual answer of $1/3$.
-
-We should expect that larger values of $n$ will produce better
-approximate values, as long as numeric issues don't get involved.
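The same mechanics work for non-regular partitions. As a sketch, Cauchy's geometric partition used above can approximate ``\int_1^2 x^{-1} dx = \log(2)`` (the variable names here are illustrative only):

```julia
f(x) = 1/x
a, b, n = 1, 2, 50_000
α = (b/a)^(1/n) - 1                 # chosen so that a*(1+α)^n == b
xs = [a * (1 + α)^i for i in 0:n]   # geometric partition of [a, b]
cs = xs[1:end-1]                    # left-hand end points
S = sum(f(c) * d for (c, d) in zip(cs, diff(xs)))
S, log(2)                            # S ≈ log(b/a)
```

Each term ``f(x_i)(x_{i+1} - x_i)`` is exactly ``\alpha``, so this sum is ``n\alpha``, as in the derivation above.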
-
-
-Before continuing, we define a function to compute the Riemann sum for
-us, with an extra argument specifying one of four methods for
-forming the approximating sum:
-
-```julia; eval=false
-function riemann(f::Function, a::Real, b::Real, n::Int; method="right")
-    if method == "right"
-        meth = f -> (lr -> begin l,r = lr; f(r) * (r-l) end)
-    elseif method == "left"
-        meth = f -> (lr -> begin l,r = lr; f(l) * (r-l) end)
-    elseif method == "trapezoid"
-        meth = f -> (lr -> begin l,r = lr; (1/2) * (f(l) + f(r)) * (r-l) end)
-    elseif method == "simpsons"
-        meth = f -> (lr -> begin l,r=lr; (1/6) * (f(l) + 4*(f((l+r)/2)) + f(r)) * (r-l) end)
-    end
-
-    xs = range(a, b, n+1)
-    pairs = zip(xs[begin:end-1], xs[begin+1:end]) # (x₀,x₁), …, (xₙ₋₁,xₙ)
-    sum(meth(f), pairs)
-
-end
-```
-
-(This function is defined in `CalculusWithJulia` and need not be copied over if that package is loaded.)
-
-With this, we can easily find an approximate answer. We wrote the
-function to use the familiar template `action(function,
-arguments...)`, so we pass in a function and arguments to describe the
-problem (`a`, `b`, and `n` and, optionally, the `method`):
-
-```julia;
-𝒇(x) = exp(x)
-riemann(𝒇, 0, 5, 10) # S_10
-```
-
-Or with more intervals in the partition:
-
-```julia;
-riemann(𝒇, 0, 5, 50_000)
-```
-
-(The answer is $e^5 - e^0 = 147.4131591025766\dots$, which shows that even $50,000$ partitions is not enough to guarantee many digits of accuracy.)
-
-
-## "Negative" area
-
-
-So far, we have had the assumption that $f(x) \geq 0$, as that allows
-us to define the concept of area. We can define the signed area
-between $f(x)$ and the $x$ axis through the definite integral:
-
-```math
-A = \int_a^b f(x) dx.
-```
-
-The right hand side is defined whenever the Riemann limit exists and
-in that case we call $f(x)$ Riemann integrable. (The definition does
-not suppose $f$ is non-negative.)
-
-
-Suppose $f(a) = f(b) = 0$ for $a < b$ and for all $a < x < b$ we have $f(x) < 0$.
Then we can see easily from the geometry (or from the Riemann sum approximation) that
-
-```math
-\int_a^b f(x) dx = - \int_a^b \lvert f(x) \rvert dx.
-```
-
-If we think of the area below the $x$ axis as "signed" area carrying a minus sign, then the total area can be seen again as a sum, only this time some of the summands may be negative.
-
-##### Example
-
-Consider a function $g(x)$ defined through its piecewise linear graph:
-
-```julia; echo=false
-g(x) = abs(x) > 2 ? 1.0 : abs(x) - 1.0
-plot(g, -3, 3)
-plot!(zero)
-```
-
-* Compute $\int_{-3}^{-1} g(x) dx$. The area is comprised of a square of area $1$ and a triangle with area $1/2$, so the answer is $3/2$.
-
-* Compute $\int_{-3}^{0} g(x) dx$. In addition to the above, there is a triangle with area $1/2$, but since the function is negative there, this area is added in as $-1/2$. In total then we have $1 + 1/2 - 1/2 = 1$ for the answer.
-
-* Compute $\int_{-3}^{1} g(x) dx$:
-
-We could add the signed area over $[0,1]$ to the above, but instead see a square of area $1$, a triangle with area $1/2$ and a triangle with signed area $-1$. The total is then $1/2$.
-
-* Compute $\int_{-3}^{3} g(x) dx$:
-
-We could add the area, but let's use a symmetry trick. This is clearly
-twice our second answer, or $2$. (This is because $g(x)$ is an even
-function, as we can tell from the graph.)
-
-##### Example
-
-Suppose $f(x)$ is an odd function; then $f(x) = - f(-x)$ for any
-$x$. So the signed area over $[-a,0]$ is related to the signed area
-over $[0,a]$ but of different sign. This gives $\int_{-a}^a f(x) dx
-= 0$ for odd functions.
-
-An immediate consequence would be $\int_{-\pi}^\pi \sin(x) dx = 0$, as would $\int_{-a}^a x^k dx$ for any *odd* integer $k > 0$.
-
-##### Example
-
-Numerically estimate the definite integral $\int_0^2 x\log(x) dx$. (We
-redefine the function to be $0$ at $0$, so it is continuous.)
We have to be a bit careful with the Riemann sum, as the left Riemann
-sum will have an issue at $0=x_0$ (`0*log(0)` returns `NaN`, which will
-poison any subsequent arithmetic operations, so the value returned will
-be `NaN` and not an approximate answer). We could define our function
-with a check:
-
-```julia;
-𝒉(x) = x > 0 ? x * log(x) : 0.0
-```
-
-This is actually inefficient, as the check for the size of `x` will
-slow things down a bit. Since we will call this function 50,000 times, we
-would like to avoid this, if we can. In this case just using the right
-sum will work:
-
-```julia;
-h(x) = x * log(x)
-riemann(h, 0, 2, 50_000, method="right")
-```
-
-(The default is `"right"`, so no method specified would also work.)
-
-##### Example
-
-Let $j(x) = \sqrt{1 - x^2}$. The area under the curve between $-1$ and
-$1$ is $\pi/2$. Using a Riemann sum with 4 equal subintervals and the
-midpoint, estimate $\pi$. How close are you?
-
-
-The partition is $-1 < -1/2 < 0 < 1/2 < 1$. The midpoints are $-3/4,
--1/4, 1/4, 3/4$. We thus have that $\pi/2$ is approximately:
-
-```julia; hold=true
-xs = range(-1, 1, length=5)
-deltas = diff(xs)
-cs = [-3/4, -1/4, 1/4, 3/4]
-j(x) = sqrt(1 - x^2)
-a = sum(j(c)*delta for (c,delta) in zip(cs, deltas))
-a, pi/2 # π ≈ 2a
-```
-
-(For variety, we used an alternate way to sum over two vectors.)
-
-So $\pi$ is about `2a`.
-
-##### Example
-
-We have the well-known triangle
-[inequality](http://en.wikipedia.org/wiki/Triangle_inequality), which
-says for an individual sum:
-$\lvert a + b \rvert \leq \lvert a \rvert +\lvert b \rvert$.
-Applying this recursively to a partition with
-$a < b$ gives:
-
-
-```math
-\begin{align*}
-\lvert f(c_1)(x_1-x_0) + f(c_2)(x_2-x_1) + \cdots + f(c_n) (x_n-x_{n-1}) \rvert
-& \leq
-\lvert f(c_1)(x_1-x_0) \rvert + \lvert f(c_2)(x_2-x_1)\rvert + \cdots +\lvert f(c_n) (x_n-x_{n-1}) \rvert \\
-&= \lvert f(c_1)\rvert (x_1-x_0) + \lvert f(c_2)\rvert (x_2-x_1)+ \cdots +\lvert f(c_n) \rvert(x_n-x_{n-1}).
-\end{align*}
-```
-
-This suggests that the following inequality holds for integrals:
-
-> ``\lvert \int_a^b f(x) dx \rvert \leq \int_a^b \lvert f(x) \rvert dx``.
-
-This can be used to give bounds on the size of an integral. For example, suppose you know that $f(x)$ is continuous on $[a,b]$ and takes its maximum value of $M$ and minimum value of $m$. Letting $K$ be the larger of $\lvert M\rvert$ and $\lvert m \rvert$ gives this bound when $a < b$:
-
-```math
-\lvert\int_a^b f(x) dx \rvert \leq \int_a^b \lvert f(x) \rvert dx \leq \int_a^b K dx = K(b-a).
-```
-
-While such bounds are often disappointing when looking for specific
-values, they are very useful for establishing general truths, as
-is done in proofs.
-
-## Error estimate
-
-The Riemann sum above is actually extremely inefficient. To see how much, we
-can derive an estimate for the error in approximating the value using
-an arithmetic progression as the partition. Let's assume that our
-function $f(x)$ is increasing, so that the right sum gives an upper
-estimate and the left sum a lower estimate, so the error in the
-estimate will be between these two values:
-
-```math
-\begin{align*}
-\text{error} &\leq
-\left[
-f(x_1) \cdot (x_{1} - x_0) + f(x_2) \cdot (x_{2} - x_1) + \cdots + f(x_{n-1}) \cdot (x_{n-1} - x_{n-2}) + f(x_n) \cdot (x_n - x_{n-1})\right]\\
-&-
-\left[f(x_0) \cdot (x_{1} - x_0) + f(x_1) \cdot (x_{2} - x_1) + \cdots + f(x_{n-1}) \cdot (x_{n} - x_{n-1})\right] \\
-&= \frac{b-a}{n} \cdot \left(\left[f(x_1) + f(x_2) + \cdots + f(x_n)\right] - \left[f(x_0) + \cdots + f(x_{n-1})\right]\right) \\
-&= \frac{b-a}{n} \cdot (f(b) - f(a)).
-\end{align*}
-```
-
-We see the error goes to $0$ at a rate of $1/n$ with the constant
-depending on $b-a$ and the function $f$. In general, a similar bound
-holds when $f$ is not monotonic.
-
-There are other ways to approximate the integral that use fewer points
-in the partition.
[Simpson's](http://tinyurl.com/7b9pmu) rule is one,
-where instead of approximating the area with rectangles that go
-through some $c_i$ in $[x_{i-1}, x_i]$, the function is
-approximated by the quadratic polynomial going through $x_{i-1}$,
-$(x_i + x_{i-1})/2$, and $x_i$, and the exact area under that
-polynomial is used in the approximation. The explicit formula (for an
-even number of subintervals, $n$) is:
-
-```math
-A \approx \frac{b-a}{3n} (f(x_0) + 4 f(x_1) + 2f(x_2) + 4f(x_3) + \cdots + 2f(x_{n-2}) + 4f(x_{n-1}) + f(x_n)).
-```
-
-The error in this approximation can be shown to be
-
-```math
-\text{error} \leq \frac{(b-a)^5}{180n^4} \text{max}_{\xi \text{ in } [a,b]} \lvert f^{(4)}(\xi) \rvert.
-```
-
-That is, the error is like $1/n^4$ with constants depending on the
-length of the interval, $(b-a)^5$, and the maximum value of the fourth
-derivative over $[a,b]$. This is significant: the error in $10$ steps
-of Simpson's rule is on the scale of the error of $10,000$ steps of
-the Riemann sum for well-behaved functions.
-
-!!! note
-    The Wikipedia article mentions that Kepler used a similar formula $100$
-    years prior to Simpson, or about $200$ years before Riemann published
-    his work. Again, the value in Riemann's work is not the computation of
-    the answer, but the framework it provides in determining if a function
-    is Riemann integrable or not.
-
-## Gauss quadrature
-
-The formula for Simpson's rule was the *composite* formula. If the area
-over just a single interval $[a,b]$ is approximated by a parabola interpolating the points $x_1=a$, $x_2=(a+b)/2$, and $x_3=b$, the formula is:
-
-```math
-\frac{b-a}{6}(f(x_1) + 4f(x_2) + f(x_3)).
-```
-
-This formula will actually be exact for any 3rd degree polynomial. In
-fact an entire family of similar approximations using $n$ points can
-be made exact for any polynomial of degree $n-1$ or lower. But with
-non-evenly spaced points, even better results can be found.
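The claimed efficiency is easy to observe. A minimal, self-contained sketch (not the `riemann` function above; the helper names are illustrative only) comparing composite Simpson's rule with ``n=10`` against a right-Riemann sum with ``n=10{,}000`` for ``\int_0^5 e^x dx``:

```julia
# composite Simpson's rule over n (even) subintervals
function simpson(f, a, b, n)
    xs = range(a, b, length=n+1)
    h = (b - a) / n
    (h/3) * sum(i == 0 || i == n ? f(xs[i+1]) :
                isodd(i) ? 4f(xs[i+1]) : 2f(xs[i+1]) for i in 0:n)
end

# right-Riemann sum over n subintervals, for comparison
right_riemann(f, a, b, n) = sum(f, range(a, b, length=n+1)[2:end]) * (b - a)/n

exact = exp(5) - exp(0)
err_simpson = abs(simpson(exp, 0, 5, 10) - exact)
err_riemann = abs(right_riemann(exp, 0, 5, 10_000) - exact)
err_simpson, err_riemann   # comparable in size
```

Both errors come out on the order of a few hundredths, even though Simpson's rule used a thousand times fewer function evaluations.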
- - - -The formulas for an approximation to the integral $\int_{-1}^1 f(x) -dx$ discussed so far can be written as: - -```math -\begin{align*} -S &= f(x_1) \Delta_1 + f(x_2) \Delta_2 + \cdots + f(x_n) \Delta_n\\ - &= w_1 f(x_1) + w_2 f(x_2) + \cdots + w_n f(x_n). -\end{align*} -``` - -The $w$s are "weights" and the $x$s are nodes. A -[Gaussian](http://en.wikipedia.org/wiki/Gaussian_quadrature) -*quadrature rule* is a set of weights and nodes for $i=1, \dots n$ for -which the sum is *exact* for any $f$ which is a polynomial of degree -$2n-1$ or less. Such choices then also approximate well the integrals of -functions which are not polynomials of degree $2n-1$, provided $f$ can -be well approximated by a polynomial over $[-1,1]$. (Which is the case -for the "nice" functions we encounter.) Some examples are given in the questions. - -### The quadgk function - -In `Julia` a modification of the Gauss quadrature rule is implemented -in the `quadgk` function (from the `QuadGK` package) to give numeric approximations to -integrals. -The `quadgk` function also has the familiar interface -`action(function, arguments...)`. Unlike our `riemann` function, there -is no `n` specified, as the number of steps is *adaptively* -determined. (There is more partitioning occurring where the function -is changing rapidly.) Instead, the algorithm outputs an estimate on -the possible error along with the answer. Instead of $n$, some -trickier problems require a specification of an error threshold. - - -To use the function, we have: - -```julia; hold=true -f(x) = x * log(x) -quadgk(f, 0, 2) -``` - - -As mentioned, there are two values returned: an approximate answer, -and an error estimate. In this example we see that the value of -$0.3862943610307017$ is accurate to within $10^{-9}$. (The actual -answer is $-1 + 2\cdot \log(2)$ and the error is only $10^{-11}$. The -reported error is an upper bound, and may be conservative, as with -this problem.) 
Our previous answer using $50,000$ right-Riemann sums -was $0.38632208884775737$ and is only accurate to $10^{-5}$. By -contrast, this method uses just $256$ function evaluations in the above -problem. - - - -The method should be exact for polynomial functions: - -```julia; hold=true -f(x) = x^5 - x + 1 -quadgk(f, -2, 2) -``` - -The error term is $0$, answer is $4$ up to the last unit of precision -(1 ulp), so any error is only in floating point approximations. - - - -For the numeric computation of definite integrals, the `quadgk` -function should be used over the Riemann sums or even Simpson's rule. - -Here are some sample integrals computed with `quadgk`: - -```math -\int_0^\pi \sin(x) dx -``` - -```julia; -quadgk(sin, 0, pi) -``` - -(Again, the actual answer is off only in the last digit, the error estimate is an upper bound.) - -```math -\int_0^2 x^x dx -``` - -```julia; -quadgk(x -> x^x, 0, 2) -``` - -```math -\int_0^5 e^x dx -``` - -```julia; -quadgk(exp, 0, 5) -``` - - -When composing the answer with other functions it may be desirable to -drop the error in the answer. Two styles can be used for this. 
The -first is to just name the two returned values: - -```julia; hold=true -A, err = quadgk(cos, 0, pi/4) -A -``` - -The second is to ask for just the first component of the returned value: - -```julia; hold=true -A = quadgk(tan, 0, pi/4)[1] # or first(quadgk(tan, 0, pi/4)) -``` - ----- - -To visualize the choice of nodes by the algorithm, we have for ``f(x)=\sin(x)`` over ``[0,\pi]`` relatively few nodes used to get a high-precision estimate: - -```julia; echo=false -function FnWrapper(f) - xs=Any[] - ys=Any[] - x -> begin - fx = f(x) - push!(xs, x) - push!(ys, fx) - fx - end -end -nothing -``` - -```julia; hold=true; echo=false -let - a, b= 0, pi - f(x) = sin(x) - F = FnWrapper(f) - ans,err = quadgk(F, a, b) - plot(f, a, b, legend=false, title="Error ≈ $(round(err,sigdigits=2))") - scatter!(F.xs, F.ys) -end -``` - -For a more oscillatory function, more nodes are chosen: - -```julia; hold=true; echo=false -let - a, b= 0, pi - f(x) = exp(-x)*sinpi(x) - F = FnWrapper(f) - ans,err = quadgk(F, a, b) - plot(f, a, b, legend=false, title="Error ≈ $(round(err,sigdigits=2))") - scatter!(F.xs, F.ys) -end -``` - - -##### Example - -In probability theory, a *univariate density* is a function, $f(x)$ -such that $f(x) \geq 0$ and $\int_a^b f(x) dx = 1$, where $a$ and $b$ -are the range of the distribution. The -[Von Mises](http://en.wikipedia.org/wiki/Von_Mises_distribution) -distribution, takes the form - -```math -k(x) = C \cdot \exp(\cos(x)), \quad -\pi \leq x \leq \pi -``` - -Compute $C$ (numerically). - -The fact that $1 = \int_{-\pi}^\pi C \cdot \exp(\cos(x)) dx = C \int_{-\pi}^\pi \exp(\cos(x)) dx$ implies that $C$ is the reciprocal of - -```julia; -k(x) = exp(cos(x)) -A,err = quadgk(k, -pi, pi) -``` - -So - -```julia; -C = 1/A -k₁(x) = C * exp(cos(x)) -``` - -The *cumulative distribution function* for $k(x)$ is $K(x) = -\int_{-\pi}^x k(u) du$, $-\pi \leq x \leq \pi$. We just showed that $K(\pi) = 1$ and it is -trivial that $K(-\pi) = 0$. 
The quantiles of the distribution are the -values $q_1$, $q_2$, and $q_3$ for which $K(q_i) = i/4$. Can we find -these? - -First we define a function, that computes $K(x)$: - -```julia; -K(x) = quadgk(k₁, -pi, x)[1] -``` - -(The trailing `[1]` is so only the answer - and not the error - is returned.) - -The question asks us to solve $K(x) = 0.25$, $K(x) = 0.5$ and $K(x) -= 0.75$. The `Roots` package can be used for such work, in particular -`find_zero`. We will use a bracketing method, as clearly $K(x)$ is -increasing, as $k(u)$ is positive, so we can just bracket our answer -with $-\pi$ and $\pi$. (We solve $K(x) - p = 0$, so $K(\pi) - p > 0$ and $K(-\pi)-p < 0$.). We could do this with `[find_zero(x -> K(x) - p, (-pi, pi)) for p in [0.25, 0.5, 0.75]]`, but that is a bit less performant than using the `solve` interface for this task: - - -```julia; hold=true -Z = ZeroProblem((x,p) -> K(x) - p, (-pi, pi)) -solve.(Z, (1/4, 1/2, 3/4)) -``` - -The middle one is clearly $0$. This distribution is symmetric about -$0$, so half the area is to the right of $0$ and half to the left, so -clearly when $p=0.5$, $x$ is $0$. The other two show that the area to -the left of $-0.809767$ is equal to the area to the right of -$0.809767$ and equal to $0.25$. - -## Questions - -###### Question - -Using geometry, compute the definite integral: - -```math -\int_{-5}^5 \sqrt{5^2 - x^2} dx. -``` - -```julia; hold=true; echo=false -f(x) = sqrt(5^2 - x^2) -val, _ = quadgk(f, -5,5) -numericq(val) -``` - -###### Question - - -Using geometry, compute the definite integral: - -```math -\int_{-2}^2 (2 - \lvert x\rvert) dx -``` - -```julia; hold=true; echo=false -f(x) = 2- abs(x) -a,b = -2, 2 -val, _ = quadgk(f, a,b) -numericq(val) -``` - -###### Question - - -Using geometry, compute the definite integral: - -```math -\int_0^3 3 dx + \int_3^9 (3 + 3(x-3)) dx -``` - -```julia; hold=true; echo=false -f(x) = x <= 3 ? 
3.0 : 3 + 3*(x-3)
-a,b = 0, 9
-val, _ = quadgk(f, a, b)
-numericq(val)
-```
-
-###### Question
-
-
-Using geometry, compute the definite integral:
-
-```math
-\int_0^5 \lfloor x \rfloor dx
-```
-
-(The notation $\lfloor x \rfloor$ is the integer such that $\lfloor x \rfloor \leq x < \lfloor x \rfloor + 1$.)
-
-```julia; hold=true; echo=false
-f(x) = floor(x)
-a, b = 0, 5
-val, _ = quadgk(f, a, b)
-numericq(val)
-```
-
-
-###### Question
-
-
-Using geometry, compute the definite integral between $-3$ and $3$ of this graph comprised of lines and circular arcs:
-
-```julia; hold=true; echo=false
-function f(x)
-    if x < -1
-        abs(x+1)
-    elseif -1 <= x <= 1
-        sqrt(1 - x^2)
-    else
-        abs(x-1)
-    end
-end
-plot(f, -3, 3, aspect_ratio=:equal)
-```
-
-The value is:
-
-```julia; hold=true; echo=false
-val = (1/2 * 2 * 2) * 2 + pi*1^2/2
-numericq(val)
-```
-
-
-
-###### Question
-
-For the function $f(x) = \sin(\pi x)$, estimate the integral from $-1$
-to $1$ using a left-Riemann sum with the partition $-1 < -1/2 < 0 < 1/2
-< 1$.
-
-```julia; hold=true; echo=false
-f(x) = sin(pi*x)
-xs = -1:1/2:1
-deltas = diff(xs)
-val = sum(map(f, xs[1:end-1]) .* deltas)
-numericq(val)
-```
-
-
-###### Question
-
-Without doing any *real* work, find this integral:
-
-```math
-\int_{-\pi/4}^{\pi/4} \tan(x) dx.
-```
-
-
-```julia; hold=true; echo=false
-val = 0
-numericq(val)
-```
-
-###### Question
-
-Without doing any *real* work, find this integral:
-
-```math
-\int_3^5 (1 - \lvert x-4 \rvert) dx
-```
-
-
-```julia; hold=true; echo=false
-val = 1
-numericq(val)
-```
-
-
-###### Question
-
-Suppose you know that for an integrable function
-$f$, $\int_a^b f(u)du =1$ and $\int_a^c f(u)du = p$. If $a < c < b$ what is
-$\int_c^b f(u)du$?
-
-```julia; hold=true; echo=false
-choices = [
-"``1``",
-"``p``",
-"``1-p``",
-"``p^2``"]
-answ = 3
-radioq(choices, answ)
-```
-
-###### Question
-
-What is $\int_0^2 x^4 dx$? Use the rule for integrating $x^n$.
- -```julia; hold=true; echo=false -choices = [ -"``2^5 - 0^5``", -"``2^5/5 - 0^5/5``", -"``2^4/4 - 0^4/4``", -"``3\\cdot 2^3 - 3 \\cdot 0^3``"] -answ = 2 -radioq(choices, answ) -``` - - -###### Question - -Solve for a value of $x$ for which: - -```math -\int_1^x \frac{1}{u}du = 1. -``` - -```julia; hold=true; echo=false -val = exp(1) -numericq(val) -``` - -###### Question - -Solve for a value of $n$ for which: - -```math -\int_0^1 x^n dx = \frac{1}{12}. -``` - -```julia; hold=true; echo=false -val = 11 -numericq(val) -``` - -###### Question - -Suppose $f(x) > 0$ and $a < c < b$. Define $F(x) = \int_a^x f(u) du$. What can be said about $F(b)$ and $F(c)$? - -```julia; hold=true; echo=false -choices = [ -L"The area between $c$ and $b$ must be positive, so $F(c) < F(b)$.", -"``F(b) - F(c) = F(a).``", -L" $F(x)$ is continuous, so between $a$ and $b$ has an extreme value, which must be at $c$. So $F(c) \geq F(b)$." -] -answ = 1 -radioq(choices, answ) -``` - - -###### Question - -For the right Riemann sum approximating $\int_0^{10} e^x dx$ with $n=100$ subintervals, what would be a good estimate for the error? - -```julia; hold=true; echo=false -choices = [ -"``(10 - 0)/100 \\cdot (e^{10} - e^{0})``", -"``10/100``", -"``(10 - 0) \\cdot e^{10} / 100^4``" -] -answ = 1 -radioq(choices, answ) -``` - - -###### Question - -Use `quadgk` to find the following definite integral: - -```math -\int_1^4 x^x dx. -``` - -```julia; hold=true; echo=false -f(x) = x^x -a, b = 1, 4 -val, _ = quadgk(f, a, b) -numericq(val) -``` - - -###### Question - -Use `quadgk` to find the following definite integral: - -```math -\int_0^3 e^{-x^2} dx. -``` - -```julia; hold=true; echo=false -f(x) = exp(-x^2) -a, b = 0, 3 -val, _ = quadgk(f, a, b) -numericq(val) -``` - - -###### Question - -Use `quadgk` to find the following definite integral: - -```math -\int_0^{9/10} \tan(u \frac{\pi}{2}) du. 
-``` - -```julia; hold=true; echo=false -f(x) = tan(x*pi/2) -a, b = 0, 9/10 -val, _ = quadgk(f, a, b) -numericq(val) -``` - - -###### Question - -Use `quadgk` to find the following definite integral: - -```math -\int_{-1/2}^{1/2} \frac{1}{\sqrt{1 - x^2}} dx -``` - -```julia; hold=true; echo=false -f(x) = 1/sqrt(1 - x^2) -a, b =-1/2, 1/2 -val, _ = quadgk(f, a, b) -numericq(val) -``` - - -###### Question - - -```julia; hold=true; echo=false -caption = """ -The area under a curve approximated by a Riemann sum. -""" -#CalculusWithJulia.WeaveSupport.JSXGraph(joinpath(@__DIR__, "riemann.js"), caption) -# url = "riemann.js" -#CalculusWithJulia.WeaveSupport.JSXGraph(:integrals, url, caption) -# This is just wrong... -#CalculusWithJulia.WeaveSupport.JSXGraph(url, caption) -nothing -``` - -```=html -
-``` - -```ojs -//| echo: false -//| output: false -JXG = require("jsxgraph"); - -b = JXG.JSXGraph.initBoard('jsxgraph', { - boundingbox: [-0.5,0.3,1.5,-1/4], axis:true -}); - -g = function(x) { return x*x*x*x + 10*x*x - 60* x + 100} -f = function(x) {return 1/Math.sqrt(g(x))}; - -type = "right"; -l = 0; -r = 1; -rsum = function() { - return JXG.Math.Numerics.riemannsum(f,n.Value(), type, l, r); -}; -n = b.create('slider', [[0.1, -0.05],[0.75,-0.05], [2,1,50]],{name:'n',snapWidth:1}); - -graph = b.create('functiongraph', [f, l, r]); -os = b.create('riemannsum', - [f, - function(){ return n.Value();}, - type, l, r - ], - {fillColor:'#ffff00', fillOpacity:0.3}); - -b.create('text', [0.1,0.25, function(){ - return 'Riemann sum='+(rsum().toFixed(4)); -}]); -``` - - - - -The interactive graphic shows the area of a right-Riemann sum for different partitions. The function is - -```math -f(x) = \frac{1}{\sqrt{ x^4 + 10x^2 - 60x + 100}} -``` - -When ``n=5`` what is the area of the Riemann sum? - -```julia; hold=true; echo=false -numericq(0.1224) -``` - -When ``n=50`` what is the area of the Riemann sum? - -```julia; hold=true; echo=false -numericq(0.1187) -``` - -Using `quadgk` what is the area under the curve? - -```julia; hold=true; echo=false -g(x) = 1/sqrt(x^4 + 10x^2 - 60x + 100) -val, tmp = quadgk(g, 0, 1) -numericq(val) -``` - - -###### Question - - -Gauss nodes for approximating the integral $\int_{-1}^1 f(x) dx$ for $n=4$ are: - -```julia; -ns = [-0.861136, -0.339981, 0.339981, 0.861136] -``` - -The corresponding weights are - -```julia; -wts = [0.347855, 0.652145, 0.652145, 0.347855] -``` - -Use these to estimate the integral $\int_{-1}^1 \cos(\pi/2 \cdot x)dx$ with $w_1f(x_1) + w_2 f(x_2) + w_3 f(x_3) + w_4 f(x_4)$. - -```julia; hold=true; echo=false -f(x) = cos(pi/2*x) -val = sum([wi * f(ni) for (wi, ni) in zip(wts, ns)]) -numericq(val) -``` - -The actual answer is $4/\pi$. How far off is the approximation based on 4 points? 
- -```julia; hold=true; echo=false -choices = [ -L"around $10^{-1}$", -L"around $10^{-2}$", -L"around $10^{-4}$", -L"around $10^{-6}$", -L"around $10^{-8}$"] -answ = 4 -radioq(choices, answ, keep_order=true) -``` - -###### Question - -Using the Gauss nodes and weights from the previous question, estimate the integral of $f(x) = e^x$ over $[-1, 1]$. The value is: - -```julia; hold=true; echo=false -f(x) = exp(x) -val = sum([wi * f(ni) for (wi, ni) in zip(wts, ns)]) -numericq(val) -``` diff --git a/CwJ/integrals/area_between_curves.jmd b/CwJ/integrals/area_between_curves.jmd deleted file mode 100644 index 71f7c7f..0000000 --- a/CwJ/integrals/area_between_curves.jmd +++ /dev/null @@ -1,669 +0,0 @@ -# Area between two curves - -This section uses these add-on packages: - -```julia -using CalculusWithJulia -using Plots -using Roots -using QuadGK -using SymPy -``` - - -```julia; echo=false; results="hidden" -using CalculusWithJulia.WeaveSupport - -const frontmatter = ( - title = "Area between two curves", - description = "Calculus with Julia: Area between two curves", - tags = ["CalculusWithJulia", "integrals", "area between two curves"], -); -nothing -``` - ----- - -The definite integral gives the "signed" area between the function $f(x)$ and the $x$-axis over $[a,b]$. Conceptually, this is the area between two curves, $f(x)$ and $g(x)=0$. More generally, this integral: - -```math -\int_a^b (f(x) - g(x)) dx -``` - -can be interpreted as the "signed" area between $f(x)$ and $g(x)$ over -$[a,b]$. If on this interval $[a,b]$ it is true that $f(x) \geq g(x)$, -then this would just be the area, as seen in this figure. 
The -rectangle in the figure has area $(f(a)-g(a)) \cdot (b-a)$, which -could be a term in a left Riemann sum approximating the integral of $f(x) - g(x)$: - - -```julia; hold=true; echo=false -f1(x) = x^2 -g1(x) = sqrt(x) -a,b = 1/4, 3/4 - -xs = range(a, stop=b, length=250) -ss = vcat(xs, reverse(xs)) -ts = vcat(f1.(xs), g1.(reverse(xs))) - -plot(f1, 0, 1, legend=false) -plot!(g1) -plot!(ss, ts, fill=(0, :red)) -plot!(xs, f1.(xs), linewidth=5, color=:green) -plot!(xs, g1.(xs), linewidth=5, color=:green) - - -plot!(xs, f1.(xs), legend=false, linewidth=5, color=:blue) -plot!(xs, g1.(xs), linewidth=5, color=:blue) -u,v = .4, .5 -plot!([u,v,v,u,u], [f1(u), f1(u), g1(u), g1(u), f1(u)], color=:black, linewidth=3) -``` - -For the figure, we have $f(x) = \sqrt{x}$, $g(x)= x^2$ and $[a,b] = [1/4, 3/4]$. The shaded area is then found by: - -```math -\int_{1/4}^{3/4} (x^{1/2} - x^2) dx = (\frac{x^{3/2}}{3/2} - \frac{x^3}{3})\big|_{1/4}^{3/4} = \frac{\sqrt{3}}{4} -\frac{7}{32}. -``` - - - -#### Examples - -Find the area bounded by the line $y=2x$ and the curve $y=2 - x^2$. - -We can plot to see the area in question: - -```julia; -f(x) = 2 - x^2 -g(x) = 2x -plot(f, -3,3) -plot!(g) -``` - -For this problem we need to identify $a$ and $b$. These are found numerically through: - -```julia; -a,b = find_zeros(x -> f(x) - g(x), -3, 3) -``` - -The area can then be found numerically: - -```julia; -quadgk(x -> f(x) - g(x), a, b)[1] -``` - -##### Example - -Find the integral between $f(x) = \sin(x)$ and $g(x)=\cos(x)$ over $[0,2\pi]$ where $f(x) \geq g(x)$. - -A plot shows the areas: - -```julia; -𝒇(x) = sin(x) -𝒈(x) = cos(x) -plot(𝒇, 0, 2pi) -plot!(𝒈) -``` - -There is a single interval when $f \geq g$ and this can be found -algebraically using basic trigonometry, or numerically: - -```julia; -𝒂,𝒃 = find_zeros(x -> 𝒇(x) - 𝒈(x), 0, 2pi) # pi/4, 5pi/4 -quadgk(x -> 𝒇(x) - 𝒈(x), 𝒂, 𝒃)[1] -``` - -##### Example - -Find the area between $x^n$ and $x^{n+1}$ over $[0,1]$ for $n=1,2,\dots$. 
- -We have on this interval $x^n \geq x^{n+1}$, so the integral can be found symbolically through: - -```julia; -@syms x::positive n::positive -ex = integrate(x^n - x^(n+1), (x, 0, 1)) -together(ex) -``` - -Based on this answer, what is the value of this sum: - -```math -\frac{1}{2\cdot 3} + \frac{1}{3\cdot 4} + \frac{1}{4\cdot 5} + \cdots? -``` - -This should be no surprise, given how the areas computed carve up the area under the line $y=x^1$ over $[0,1]$, so the answer should be ``1/2``. - - -```julia -p = plot(x, 0, 1, legend=false) -[plot!(p, x^n, 0, 1) for n in 2:20] -p -``` - - - -We can check using the `summation` function of `SymPy` which is similar in usage to `integrate`: - -```julia; -summation(1/(n+1)/(n+2), (n, 1, oo)) -``` - -##### Example - -Verify [Archimedes'](http://en.wikipedia.org/wiki/The_Quadrature_of_the_Parabola) finding that the area of the parabolic segment is $4/3$rds that of the triangle joining $a$, $(a+b)/2$ and $b$. - - -```julia; hold=true; echo=false -f(x) = 2 - x^2 -a,b = -1, 1/2 -c = (a + b)/2 -xs = range(-sqrt(2), stop=sqrt(2), length=50) -rxs = range(a, stop=b, length=50) -rys = map(f, rxs) - - -plot(f, a, b, legend=false, linewidth=3) -xs = [a,c,b,a] -plot!(xs, f.(xs), linewidth=3) -``` - -For concreteness, let $f(x) = 2-x^2$ and $[a,b] = [-1, 1/2]$, as in -the figure. Then the area of the triangle can be computed through: - -```julia -𝐟(x) = 2 - x^2 -𝐚, 𝐛 = -1, 1/2 -𝐜 = (𝐚 + 𝐛)/2 - -sac, sab, scb = secant(𝐟, 𝐚, 𝐜), secant(𝐟, 𝐚, 𝐛), secant(𝐟, 𝐜, 𝐛) -f1(x) = min(sac(x), scb(x)) -f2(x) = sab(x) - -A1 = quadgk(x -> f1(x) - f2(x), 𝐚, 𝐛)[1] -``` - -As we needed three secant lines, we used the `secant` function from `CalculusWithJulia` to create functions representing each. Once that was done, we used the `min` function to facilitate -integrating over the top bounding curve; alternatively, we could break -the integral over $[a,c]$ and $[c,b]$. - -The area of the parabolic segment is more straightforward. 
- -```julia; -A2 = quadgk(x -> 𝐟(x) - f2(x), 𝐚, 𝐛)[1] -``` - -Finally, if Archimedes was right, this relationship should bring about $0$ (or something within round-off error): - -```julia; -A1 * 4/3 - A2 -``` - -##### Example - -Find the area bounded by $y=x^4$ and $y=e^x$ when $x^4 \geq e^x$ and -$x > 0$. - -A graph over $[0,10]$ clearly shows the largest zero; past it, the -exponential dominates the power. - -```julia; -h1(x) = x^4 -h2(x) = exp(x) -plot(h1, 0, 10) -plot!(h2) -``` - -There must be another zero, though it is hard to see from the graph over $[0,10]$: -as $0^4=0$ and $e^0=1$, the polynomial starts below the -exponential, so it must cross somewhere to the left of the zero seen in the graph. (Plotting over $[0,2]$ will -clearly reveal the other zero.) We now find these intersection points -numerically and then integrate: - -```julia; hold=true -a,b = find_zeros(x -> h1(x) - h2(x), 0, 10) -quadgk(x -> h1(x) - h2(x), a, b)[1] -``` - -##### Examples - -The area between $y=\sin(x)$ and $y=m\cdot x$ between $0$ and the -first positive intersection depends on $m$ (where $0 \leq m \leq 1$). The extremes are when $m=0$, where the area is $2$, and when $m=1$ (the -line is tangent at $x=0$), where the area is $0$. What is it for other -values of $m$? The picture for $m=1/2$ is: - -```julia; -m = 1/2 -plot(sin, 0, pi) -plot!(x -> m*x) -``` - -For a given $m$, the area is found after computing $b$, the -intersection point. We express this as a function of $m$ for later reuse: - -```julia; -intersection_point(m) = maximum(find_zeros(x -> sin(x) - m*x, 0, pi)) -a1 = 0 -b1 = intersection_point(m) -quadgk(x -> sin(x) - m*x, a1, b1)[1] -``` - - -In general, the area as a function of `m` is found by substituting `intersection_point(m)` for `b`: - -```julia; -area(m) = quadgk(x -> sin(x) - m*x, 0, intersection_point(m))[1] -``` - -A plot shows the relationship: - -```julia; -plot(area, 0, 1) -``` - -While here, let's also answer the question: which $m$ gives an area -of $1$, or one-half the total? 
This can be done as follows: - -```julia; -find_zero(m -> area(m) - 1, (0, 1)) -``` - -(This is a nice combination of `find_zeros`, `quadgk` and `find_zero` being used together to answer a problem.) - -##### Example - -Find the area bounded by the $x$ axis, the line $y = x-1$ and the function $\log(x+1)$. - -A plot shows us the basic area: - -```julia; -j1(x) = log(x+1) -j2(x) = x - 1 -plot(j1, 0, 3) -plot!(j2) -plot!(zero) -``` - -The value for "$b$" is found from the intersection point of $\log(x+1)$ and $x-1$, which is near $2$: - -```julia; -ja = 0 -jb = find_zero(x -> j1(x) - j2(x), 2) -``` - -We see that the lower part of the area has a condition: if $x < 1$ then use $0$, otherwise use $x - 1$. We can handle this many different ways: - -* break the integral into two pieces and add: - -```julia; -quadgk(x -> j1(x) - zero(x), ja, 1)[1] + quadgk(x -> j1(x) - j2(x), 1, jb)[1] -``` - -* make a new function for the bottom bound: - -```julia; -j3(x) = x < 1 ? 0.0 : j2(x) -quadgk(x -> j1(x) - j3(x), ja, jb)[1] -``` - -* Turn the picture on its side and integrate in the $y$ variable. To - do this, we need to solve for inverse functions: - -```julia; hold=true -a1=j1(ja) -b1=j1(jb) -f1(y)=y+1 # y=x-1, so x=y+1 -g1(y)=exp(y)-1 # y=log(x+1) so e^y = x + 1, x = e^y - 1 -quadgk(y -> f1(y) - g1(y), a1, b1)[1] -``` - -!!! note - When doing problems by hand this latter style can often reduce the complications, but when approaching the task numerically, the first two styles are generally easier, though computationally more expensive. - - -#### Integrating in different directions - -The last example suggested integrating in the $y$ variable. This idea deserves more explanation. - -It has been noted that different symmetries can aid in computing -integrals through their interpretation as areas. For example, if -$f(x)$ is odd, then $\int_{-b}^b f(x)dx=0$ and if $f(x)$ is even, -$\int_{-b}^b f(x) dx = 2\int_0^b f(x) dx$. - -Another symmetry of the $x-y$ plane is the reflection through the line -$y=x$. 
This has the effect of taking the graph of $f(x)$ to the graph -of $f^{-1}(x)$ and vice versa. Here is an example with $f(x) = x^3$ -over $[-1,1]$. - -```julia; hold=true -f(x) = x^3 -xs = range(-1, stop=1, length=50) -ys = f.(xs) -plot(ys, xs) -``` - -By switching the order of the `xs` and `ys` we "flip" the graph -through the line $x=y$. - -We can use this symmetry to our advantage. Suppose instead of being given an equation $y=f(x)$, we are given it in "inverse" style: $x = f(y)$, for example suppose we have $x = y^3$. We can plot this as above via: - -```julia; hold=true -ys = range(-1, stop=1, length=50) -xs = [y^3 for y in ys] -plot(xs, ys) -``` - -Suppose we wanted the area in the first quadrant between this graph, -the $y$ axis and the line $y=1$. What to do? With the problem -"flipped" through the $y=x$ line, this would just be $\int_0^1 x^3dx$. -Rather than mentally flipping the picture to integrate, instead we can just integrate in -the $y$ variable. That is, the area is $\int_0^1 y^3 dy$. The -mental picture for Riemann sums would have the approximating rectangles lying flat: as -a function of $y$, each has a length of $y^3$ and a height of "$dy$". 
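This can be checked numerically. Here is a quick sketch with `quadgk` (the names `a_y` and `a_x` are just for illustration):

```julia
using QuadGK

# integrating in the y variable: each flat strip has length y^3 and height "dy"
a_y, _ = quadgk(y -> y^3, 0, 1)

# the same region described in the x variable: between y = 1 and y = x^(1/3)
a_x, _ = quadgk(x -> 1 - x^(1/3), 0, 1)

(a_y, a_x)   # both descriptions give an area of 1/4
```

Either description carves up the same region, so the two computed areas agree.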
- ----- - -```julia; hold=true; echo=false -f(x) = x^(1/3) -f⁻¹(x) = x^3 -plot(f, 0, 1, label="f", linewidth=5, color=:blue, aspect_ratio=:equal) -plot!([0,1,1],[0,0,1], linewidth=1, linestyle=:dash, label="") -x₀ = 2/3 -Δ = 1/16 -col = RGBA(0,0,1,0.25) -function box(x,y,Δₓ, Δ, color=col) - plot!([x,x+Δₓ, x+Δₓ, x, x], [y,y,y+Δ,y+Δ,y], color=:black, label="") - plot!(x:Δₓ:(x+Δₓ), u->y, fillto = u->y+Δ, color=col, label="") -end -box(x₀, 0, Δ, f(x₀), col) -box(x₀+Δ, 0, Δ, f(x₀+Δ), col) -box(x₀+2Δ, 0, Δ, f(x₀+2Δ), col) -colᵣ = RGBA(1,0,0,0.25) -box(f⁻¹(x₀-0Δ), x₀-1Δ, 1 - f⁻¹(x₀-0Δ), Δ, colᵣ) -box(f⁻¹(x₀-1Δ), x₀-2Δ, 1 - f⁻¹(x₀-1Δ), Δ, colᵣ) -box(f⁻¹(x₀-2Δ), x₀-3Δ, 1 - f⁻¹(x₀-2Δ), Δ, colᵣ) -``` - -The figure above suggests that the area under ``f(x)`` over ``[a,b]`` could be represented as the area between the curves ``f^{-1}(y)`` and ``y=b`` from ``[f(a), f(b)]``. - ----- - - -For a less trivial problem, consider the area between $x = y^2$ and $x -= 2-y$ in the first quadrant. - -```julia; hold=true -ys = range(0, stop=2, length=50) -xs = [y^2 for y in ys] -plot(xs, ys) -xs = [2-y for y in ys] -plot!(xs, ys) -plot!(zero) -``` - -We see the bounded area could be described in the "$x$" variable in -terms of two integrals, but in the $y$ variable in terms of the -difference of two functions with the limits of integration running -from $y=0$ to $y=1$. So, this area may be found as follows: - -```julia; hold=true -f(y) = 2-y -g(y) = y^2 -a, b = 0, 1 -quadgk(y -> f(y) - g(y), a, b)[1] -``` - - -## Questions - -###### Question - -Find the area enclosed by the curves $y=2-x^2$ and $y=x^2 - 3$. - -```julia; hold=true; echo=false -f(x) = 2 - x^2 -g(x) = x^2 - 3 -a,b = find_zeros(x -> f(x) - g(x), -10, 10) -val, _ = quadgk(x -> f(x) - g(x), a, b) -numericq(val) -``` - -###### Question - -Find the area between $f(x) = \cos(x)$, $g(x) = x$ and the $y$ axis. 
- -```julia; hold=true; echo=false -f(x) = cos(x) -g(x) = x -a = 0 -b = find_zero(x -> f(x) - g(x), 1) -val, _ = quadgk(x -> f(x) - g(x), a, b) -numericq(val) -``` - -###### Question - -Find the area between the line $y=1/2(x+1)$ and the half circle $y=\sqrt{1 - x^2}$. - -```julia; hold=true; echo=false -f(x) = sqrt(1 - x^2) -g(x) = 1/2 * (x + 1) -a,b = find_zeros(x -> f(x) - g(x), -1, 1) -val, _ = quadgk(x -> f(x) - g(x), a, b) -numericq(val) -``` - -###### Question - -Find the area in the first quadrant between the lines $y=x$, $y=1$, and the curve $y=x^2/4$. - -```julia; hold=true; echo=false -f(x) = x -g(x) = 1.0 -h(x) = min(f(x), g(x)) -j(x) = x^2 / 4 -a,b = find_zeros(x -> h(x) - j(x), 0, 3) -val, _ = quadgk(x -> h(x) - j(x), a, b) -numericq(val) -``` - -###### Question - -Find the area between $y=x^2$ and $y=-x^4$ for $\lvert x \rvert \leq 1$. - -```julia; hold=true; echo=false -f(x) = x^2 -g(x) = -x^4 -a,b = -1, 1 -val, _ = quadgk(x -> f(x) - g(x), a, b) -numericq(val) -``` - -###### Question - -Let `f(x) = 1/(sqrt(pi)*gamma(1/2)) * (1 + x^2)^(-1)` and `g(x) = 1/sqrt(2*pi) * exp(-x^2/2)`. These graphs intersect in two points. Find the area bounded by them. - -```julia; hold=true; echo=false -import SpecialFunctions: gamma -f(x) = 1/(sqrt(pi)*gamma(1/2)) * (1 + x^2)^(-1) -g(x) = 1/sqrt(2*pi) * exp(-x^2/2) -a,b = find_zeros(x -> f(x) - g(x), -3, 3) -val, _ = quadgk(x -> f(x) - g(x), a, b) -numericq(val) -``` - -(Where `gamma(1/2)` is a call to the [gamma](http://en.wikipedia.org/wiki/Gamma_function) function.) - -###### Question - -Find the area in the first quadrant bounded by the graph of $x = (y-1)^2$, $x=3-y$ and $x=2\sqrt{y}$. (Hint: integrate in the $y$ variable.) 
- -```julia; hold=true; echo=false -f(y) = (y-1)^2 -g(y) = 3 - y -h(y) = 2sqrt(y) -a = 0 -b = find_zero(y -> f(y) - g(y), 2) -f1(y) = max(f(y), zero(y)) -g1(y) = min(g(y), h(y)) -val, _ = quadgk(y -> g1(y) - f1(y), a, b) -numericq(val) -``` - -###### Question - -Find the total area bounded by the lines $x=0$, $x=2$ and the curves $y=x^2$ and $y=x$. This would be $\int_a^b \lvert f(x) - g(x) \rvert dx$. - -```julia; hold=true; echo=false -f(x) = x^2 -g(x) = x -a, b = 0, 2 -val, _ = quadgk(x -> abs(f(x) - g(x)), a, b) -numericq(val) -``` - -###### Question - - -Look at the sculpture -[Le Tamanoir](https://www.google.com/search?q=Le+Tamanoir+by+Calder.&num=50&tbm=isch&tbo=u&source=univ&sa=X&ved=0ahUKEwiy8eO2tqzVAhVMPz4KHXmgBpgQsAQILQ&biw=1556&bih=878) -by Calder, a large-scale work. How much does it weigh? Approximately? - - -Let's try to answer that with an educated guess. The rightmost figure -looks to be about 1/5th of the total. So if we estimate that piece -and multiply by 5 we get a good guess. That part looks like an area of -metal bounded by two quadratic polynomials. If we compute that area in -square inches, then multiply by an assumed thickness of one inch, we -have the volume in cubic inches. The density of galvanized steel is 7850 -kg/$m^3$ which we convert into pounds/in$^3$ via: - -```julia; -7850 * 2.2 * (1/39.3)^3 -``` - -The two parabolas, after rotating, might look like the following (with $x$ in inches): - -```math -f(x) = x^2/70, \quad g(x) = 35 + x^2/140 -``` - - - -Put this all together to give an estimated weight in pounds. - -```julia; hold=true; echo=false -f(x) = x^2/70 -g(x) = 35 + x^2/140 -a,b = find_zeros(x -> f(x) - g(x), -100, 100) -ar, _ = quadgk(x -> abs(f(x) - g(x)), a, b) -val = 5 * ar * 7850 * 2.2 * (1/39.3)^3 -numericq(val) -``` - -Is the guess that the entire sculpture is more than two tons? 
- -```julia; hold=true; echo=false -choices=["Less than two tons", "More than two tons"] -answ = 2 -radioq(choices, answ, keep_order=true) -``` - -!!! note - We used area to estimate weight in this example, but Galileo used weight to estimate area. It is [mentioned](https://www.maa.org/sites/default/files/pdf/cmj_ftp/CMJ/January%202010/3%20Articles/3%20Martin/08-170.pdf) by Martin that in order to estimate the area enclosed by one arch of a cycloid, Galileo cut the arch from some material and compared the weight to the weight of the generating circle. He concluded the area is close to ``3`` times that of the circle, a conjecture proved by Roberval in 1634. - - -###### Question - -Formulas from the business world say that revenue is the integral of *marginal revenue*, the additional money from selling one more unit. (Marginal revenue is basically the derivative of revenue.) Cost is the integral of *marginal cost*, or the cost to produce one more unit. Suppose we have - -```math -\text{mr}(x) = 2 + \frac{e^{-x/10}}{1 + e^{-x/10}}, \quad -\text{mc}(x) = 1 + \frac{1}{2} \cdot \frac{e^{-x/5}}{1 + e^{-x/5}}. -``` - -Find the profit to produce 100 units: $P = \int_0^{100} (\text{mr}(x) - \text{mc}(x)) dx$. - -```julia; hold=true; echo=false -mr(x) = 2 + exp((-x/10)) / (1 + exp(-x/10)) -mc(x) = 1 + (1/2) * exp(-x/5) / (1 + exp(-x/5)) -a, b = 0, 100 -val, _ = quadgk(x -> mr(x) - mc(x), 0, 100) -numericq(val) -``` - -###### Question - -Can `SymPy` do what Archimedes did? 
- -Consider the following code which sets up the area of an inscribed triangle, `A1`, and the area of a parabolic segment, `A2`, for a general parabola: - - - -```julia; hold=true -@syms x::real A::real B::real C::real a::real b::real -c = (a + b) / 2 -f(x) = A*x^2 + B*x + C -Secant(f, a, b) = f(a) + (f(b)-f(a))/(b-a) * (x - a) -A1 = integrate(Secant(f, a, c) - Secant(f,a,b), (x,a,c)) + integrate(Secant(f,c,b)-Secant(f,a,b), (x, c, b)) -A2 = integrate(f(x) - Secant(f,a,b), (x, a, b)) -out = 4//3 * A1 - A2 -``` - -Does `SymPy` get the correct output, ``0``, after calling `simplify`? - -```julia; hold=true; echo=false -yesnoq(true) -``` - -###### Question - -In [Martin](https://www.maa.org/sites/default/files/pdf/cmj_ftp/CMJ/January%202010/3%20Articles/3%20Martin/08-170.pdf) a fascinating history of the cycloid can be read. - -```julia; hold=true; echo=false -imgfile="figures/cycloid-companion-curve.png" -caption = """ -Figure from Martin showing the companion curve to the cycloid. As the generating circle rolls, from ``A`` to ``C``, the original point of contact, ``D``, traces out an arch of the cycloid. The companion curve is that found by congruent line segments. In the figure, when ``D`` was at point ``P`` the line segment ``PQ`` is congruent to ``EF`` (on the original position of the generating circle). -""" -ImageFile(:integrals, imgfile, caption) -``` - -In particular, it can be read that Roberval proved that the area between the cycloid and its companion curve is half the area of the generating circle. Roberval didn't know integration, so finding the area between two curves required other tricks. One is called "Cavalieri's principle." From the figure above, which of the following would you guess this principle to be: - -```julia; hold=true; echo=false -choices = [""" -If two regions bounded by parallel lines are such that any parallel between them cuts each region in segments of equal length, then the regions have equal area. 
-""", - """ -The area of the cycloid is nearly the area of a semi-ellipse with known values, so one can approximate the area of the cycloid with formula for the area of an ellipse -"""] -radioq(choices, 1) -``` - -Suppose the generating circle has radius ``1``, so the area shown is ``\pi/2``. The companion curve is then ``1-\cos(\theta)`` (a fact not used by Roberval). The area *under* this curve is then - -```julia; hold=true -@syms theta -integrate(1 - cos(theta), (theta, 0, SymPy.PI)) -``` - -That means the area under **one-half** arch of the cycloid is - -```julia; hold=true; echo=false -choices = ["``\\pi``", - "``(3/2)\\cdot \\pi``", - "``2\\pi``" - ] -radioq(choices, 2, keep_order=true) -``` - - -Doubling the answer above gives a value that Galileo had struggled with for many years. - -```julia; hold=true; echo=false -imgfile="figures/companion-curve-bisects-rectangle.png" -caption = """ -Roberval, avoiding a trignometric integral, instead used symmetry to show that the area under the companion curve was half the area of the rectangle, which in this figure is ``2\\pi``. 
-""" -ImageFile(:integrals, imgfile, caption) -``` diff --git a/CwJ/integrals/cache/arc_length.cache b/CwJ/integrals/cache/arc_length.cache deleted file mode 100644 index 0366c93..0000000 Binary files a/CwJ/integrals/cache/arc_length.cache and /dev/null differ diff --git a/CwJ/integrals/cache/area.cache b/CwJ/integrals/cache/area.cache deleted file mode 100644 index f0c1c2c..0000000 Binary files a/CwJ/integrals/cache/area.cache and /dev/null differ diff --git a/CwJ/integrals/cache/area_between_curves.cache b/CwJ/integrals/cache/area_between_curves.cache deleted file mode 100644 index 86bc8ef..0000000 Binary files a/CwJ/integrals/cache/area_between_curves.cache and /dev/null differ diff --git a/CwJ/integrals/cache/center_of_mass.cache b/CwJ/integrals/cache/center_of_mass.cache deleted file mode 100644 index 7618679..0000000 Binary files a/CwJ/integrals/cache/center_of_mass.cache and /dev/null differ diff --git a/CwJ/integrals/cache/ftc.cache b/CwJ/integrals/cache/ftc.cache deleted file mode 100644 index 701028d..0000000 Binary files a/CwJ/integrals/cache/ftc.cache and /dev/null differ diff --git a/CwJ/integrals/cache/improper_integrals.cache b/CwJ/integrals/cache/improper_integrals.cache deleted file mode 100644 index 13ab621..0000000 Binary files a/CwJ/integrals/cache/improper_integrals.cache and /dev/null differ diff --git a/CwJ/integrals/cache/integration_by_parts.cache b/CwJ/integrals/cache/integration_by_parts.cache deleted file mode 100644 index 8372a85..0000000 Binary files a/CwJ/integrals/cache/integration_by_parts.cache and /dev/null differ diff --git a/CwJ/integrals/cache/mean_value_theorem.cache b/CwJ/integrals/cache/mean_value_theorem.cache deleted file mode 100644 index b896bf2..0000000 Binary files a/CwJ/integrals/cache/mean_value_theorem.cache and /dev/null differ diff --git a/CwJ/integrals/cache/partial_fractions.cache b/CwJ/integrals/cache/partial_fractions.cache deleted file mode 100644 index bbaceff..0000000 Binary files 
a/CwJ/integrals/cache/partial_fractions.cache and /dev/null differ diff --git a/CwJ/integrals/cache/substitution.cache b/CwJ/integrals/cache/substitution.cache deleted file mode 100644 index a1d30c2..0000000 Binary files a/CwJ/integrals/cache/substitution.cache and /dev/null differ diff --git a/CwJ/integrals/cache/surface_area.cache b/CwJ/integrals/cache/surface_area.cache deleted file mode 100644 index 0779e05..0000000 Binary files a/CwJ/integrals/cache/surface_area.cache and /dev/null differ diff --git a/CwJ/integrals/cache/volumes_slice.cache b/CwJ/integrals/cache/volumes_slice.cache deleted file mode 100644 index 774f09e..0000000 Binary files a/CwJ/integrals/cache/volumes_slice.cache and /dev/null differ diff --git a/CwJ/integrals/center_of_mass.jmd b/CwJ/integrals/center_of_mass.jmd deleted file mode 100644 index d5fa6ea..0000000 --- a/CwJ/integrals/center_of_mass.jmd +++ /dev/null @@ -1,520 +0,0 @@ -# Center of Mass - -This section uses these add-on packages: - -```julia -using CalculusWithJulia -using Plots -using Roots -using QuadGK -using SymPy - -``` - -```julia; echo=false; results="hidden" -using CalculusWithJulia.WeaveSupport - -const frontmatter = ( - title = "Center of Mass", - description = "Calculus with Julia: Center of Mass", - tags = ["CalculusWithJulia", "integrals", "center of mass"], -); -nothing -``` - ----- - -```julia; hold=true; echo=false -imgfile = "figures/seesaw.png" -caption = L""" - -A silhouette of two children on a seesaw. The seesaw can be balanced -only if the distance from the central point for each child reflects -their relative weights, or masses, through the formula $d_1m_1 = d_2 -m_2$. This means if the two children weigh the same the balance will -tip in favor of the child farther away, and if both are the same -distance, the balance will tip in favor of the heavier. 
-""" -ImageFile(:integrals, imgfile, caption) -``` - -The game of seesaw is one where children earn an early appreciation for the effects of distance and relative weight. For children with equal weights, the seesaw will balance if they sit an equal distance from the center (on opposite sides, of course). However, with unequal weights that isn't the case. If one child weighs twice as much, the other must sit twice as far. - -The key relationship is that $d_1 m_1 = d_2 m_2$. This come from physics, where the moment about a point is defined by the mass times the distance. This balance relationship says the overall moment balances out. When this is the case, then the *center of mass* is at the fulcrum point, so there is no impetus to move. - -The [center](http://en.wikipedia.org/wiki/Center_of_mass) of mass is an old concept that often allows a possibly complicated relationship involving weights to be reduced to a single point. The seesaw is an example: if the center of mass is at the fulcrum the seesaw can balance. - -In general, we use position of the mass, rather than use distance from some fixed fulcrum. With this, the center of mass for a finite set of point masses distributed on the real line, is defined by: - - -```math -\bar{\text{cm}} = \frac{m_1 x_1 + m_2 x_2 + \cdots + m_n x_n}{m_1 + m_2 + \cdots + m_n}. -``` - -Writing $w_i = m_i / (m_1 + m_2 + \cdots + m_n)$, we get the center of mass is just a weighted sum: $w_1 x_1 + \cdots + w_n x_n$, where the $w_i$ are the relative weights. - -With some rearrangment, we can see that the center of mass satisfies the equation: - -```math -w_1 \cdot (x_1 - \bar{\text{cm}}) + w_2 \cdot (x_2 - \bar{\text{cm}}) + \cdots + w_n \cdot (x_n - \bar{\text{cm}}) = 0. -``` - -The center of mass is a balance of the weighted signed distances. This property of the center of mass being a balancing point makes it of intrinsic interest and can be - in the case of sufficient symmetry - easy to find. 
- -##### Example - -A set of weights sits on a dumbbell rack. They are spaced 1 foot apart starting with the 5-, then the 10-, 15-, 25-, and 35-pound weights. Where is the center of mass? - -We begin by letting $m_1=5$, $m_2=10$, $m_3=15$, $m_4=25$ and $m_5=35$. Our positions will be labeled $x_i = i-1$, so the five-pound weight is at position $0$ and the $35$-pound one at $4$. The center of mass is then given by: - -```math -\frac{5\cdot 0 + 10\cdot 1 + 15 \cdot 2 + 25 \cdot 3 + 35\cdot 4}{5 + 10 + 15 + 25 + 35} = \frac{255}{90} = 2.833 -``` - -If the $25$-pound weight is removed, how does the center of mass shift? - -We could add the terms again, or just reduce from our sum: - -```math -\frac{255 - 25\cdot 3}{90 - 25} = \frac{180}{65} = 2.769... -``` - -The center of mass shifts slightly, but since the removed weight was already close to the center of mass, the movement wasn't much. - -## Center of mass of figures - -Consider now a more general problem, the center of mass of a solid figure. We will restrict our attention to two-dimensional figures in the $x-y$ plane that can be represented by functions. For example, consider the region in the plane bounded by the $x$ axis and the function $1 - \lvert x \rvert$. This is a triangle with vertices $(-1,0)$, $(0,1)$, and $(1,0)$. - -This graph shows that the figure is symmetric: - -```julia; hold=true -f(x) = 1 - abs(x) -a, b = -1.5, 1.5 -plot(f, a, b) -plot!(zero, a, b) -``` - -As the center of mass should be a balancing value, we would guess intuitively that the center of mass in the $x$ direction will be $x=0$. - -But what should the center of mass formula be? - -As with many formulas that will end up involving a derived integral, we start with a sum approximation. If the region is described as the area under the graph of $f(x)$ between $a$ and $b$, then we can form a Riemann sum approximation; that is, a choice of $a = x_0 < x_1 < x_2 \cdots < x_n = b$ and points $c_1$, $\dots$, $c_n$. 
If all the rectangles are made up of a material of uniform density, say $\rho$, then the mass of each rectangle will be the area times $\rho$, or $\rho f(c_i) \cdot (x_i - x_{i-1})$, for $i = 1, \dots , n$.

```julia; hold=true; echo=false
n = 21
f(x) = 1 - abs(x)
a, b = -1.5, 1.5

xs = range(-1, stop=1, length=n)
cs = (xs[1:(end-1)] + xs[2:end]) / 2

p = plot(legend=false);

for i in 1:(n-1)
    xi, xi1 = xs[i], xs[i+1]
    plot!(p, [xi, xi1, xi1, xi], [0,0,1,1]*f(cs[i]), linetype=:polygon, color=:red);
end

for i in 1:(n-1)
    ci = cs[i]
    scatter!(p, [ci], [.1], markersize=12*f(cs[i]), color=:orange);
end
plot!(p, [-1,1], [0,0])

p
```

The figure shows the approximating rectangles and circles representing their masses for $n=20$.


Treating each rectangle as a point mass located at $c_i$, the center of mass for such an approximation will be:

```math
\begin{align*}
&\frac{\rho f(c_1) (x_1 - x_0) \cdot c_1 + \rho f(c_2) (x_2 - x_1) \cdot c_2 + \cdots + \rho f(c_n) (x_n- x_{n-1}) \cdot c_n}{\rho f(c_1) (x_1 - x_0) + \rho f(c_2) (x_2 - x_1) + \cdots + \rho f(c_n) (x_n- x_{n-1})} \\
&=\\
&\quad\frac{f(c_1) (x_1 - x_0) \cdot c_1 + f(c_2) (x_2 - x_1) \cdot c_2 + \cdots + f(c_n) (x_n- x_{n-1}) \cdot c_n}{f(c_1) (x_1 - x_0) + f(c_2) (x_2 - x_1) + \cdots + f(c_n) (x_n- x_{n-1})}.
\end{align*}
```

But the top part is an approximation to the integral $\int_a^b x f(x) dx$ and the bottom part the integral $\int_a^b f(x) dx$. The ratio of these defines the center of mass.

> **Center of Mass**: The center of mass (in the $x$ direction) of a region in the $x$-$y$ plane described by
> the area under a (positive) function $f(x)$ between $a$ and $b$ is given by
>
>
> ``\text{Center of mass} = \text{cm}_x = \frac{\int_a^b xf(x) dx}{\int_a^b f(x) dx}.``
>
> For regions described by a more complicated set of equations, the center of mass is found from the same formula, where $f(x)$ is the total height of the region for a given $x$.
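Before putting the formula to use, it can be checked numerically. Here is a sketch using a midpoint Riemann sum for the triangular region above (the choice $n=10000$ is arbitrary):

```julia
f(x) = 1 - abs(x)
a, b, n = -1, 1, 10_000
xs = range(a, stop=b, length=n+1)
cs = (xs[1:end-1] + xs[2:end]) / 2       # the c_i, chosen as midpoints
dx = step(xs)
top    = sum(c * f(c) * dx for c in cs)  # approximates the integral of x*f(x)
bottom = sum(f(c) * dx for c in cs)      # approximates the integral of f(x)
top / bottom                             # essentially 0, as symmetry suggests
```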

For the triangular shape, since $f(x) = 1 - \lvert x \rvert$ is an even function, $xf(x)$ will be odd, so the integral over $[-1,1]$ will be $0$. So the center of mass formula applied to this problem agrees with our expectation.

##### Example

What about the center of mass of the triangle formed by the line $x=-1$, the $x$ axis and $y=(1-x)/2$? This too is defined between $a=-1$ and $b=1$, but the center of mass will be negative, as a graph shows more mass to the left of $0$ than the right:

```julia; hold=true
f(x) = (1-x)/2
plot(f, -1, 1)
plot!(zero, -1, 1)
```

The formulas give:

```math
\int_{-1}^1 xf(x) dx = \int_{-1}^1 x\cdot (1-x)/2 dx = (\frac{x^2}{4} - \frac{x^3}{6})\big|_{-1}^1 = -\frac{1}{3}.
```

The bottom integral is just the area (or total mass, had the $\rho$ not canceled) and by geometry is $(1/2)(2)(1) = 1$. So $\text{cm}_x = -1/3$.

##### Example

Find the center of mass of the region bounded by the parabolas $y=1 - x^2$ and $y=(x-1)^2 - 2$.

The center of mass (in the $x$ direction) can be seen to be close to $x=1/2$:

```julia
f1(x) = 1 - x^2
f2(x) = (x-1)^2 - 2
plot(f1, -3, 3)
plot!(f2, -3, 3)
```

To find it, we need to find the intersection points, then integrate. We do so numerically.

```julia; hold=true
h(x) = f1(x) - f2(x)
a, b = find_zeros(h, -3, 3)
top, err = quadgk(x -> x * h(x), a, b)
bottom, err = quadgk(h, a, b)
cm = top / bottom
```

Our guess from the diagram proves correct.

!!! note
    It proves convenient to use the `->` notation for an anonymous function above, as our function `h` is not what is being integrated each time, but some simple modification. If this isn't palatable, a new function could be defined and passed along to `quadgk`.


##### Example

Consider a region bounded by a probability density function. (These functions are non-negative, and integrate to $1$.)
The center of mass formula simplifies to $\int xf(x) dx$, as the denominator will be $1$. The answer is called the *mean*, and is often denoted by the Greek letter $\mu$.

For the probability density $f(x) = e^{-x}$ for $x\geq 0$ and $0$ otherwise, find the mean.

We need to compute $\int_{-\infty}^\infty xf(x) dx$, but in this case, since $f$ is $0$ to the left of the origin, we just have:

```math
\mu = \int_0^\infty x e^{-x} dx = -(1+x) \cdot e^{-x} \big|_0^\infty = 1.
```

For fun, we compare this to the median, which is the value $M$ so that the total area is split in half. That is, the following formula is satisfied: $\int_0^M f(x) dx = 1/2$. To compute, we have:

```math
\int_0^M e^{-x} dx = -e^{-x} \big|_0^M = 1 - e^{-M}.
```

Solving $1/2 = 1 - e^{-M}$ gives $M=\log(2) = 0.69\dots$. The median is to the left of the mean in this example.

!!! note
    In this example, we used an infinite region, so the idea of "balancing" may be a bit unrealistic; nonetheless, the intuitive interpretation is still a good one to keep in mind. The point of comparing to the median is that the balancing point is to the right of where the area splits in half. Basically, the center of mass is pulled in the direction of the long right tail, as the area is skewed in that direction.


##### Example

A figure is formed by transformations of the function $\phi(u) = e^{2(k-1)} - e^{2(k-u)}$, for some fixed $k$, as follows:

```julia;
k = 3
phi(u) = exp(2(k-1)) - exp(2(k-u))
f(u) = max(0, phi(u))
g(u) = min(f(u+1), f(k))

plot(f, 0, k, legend=false)
plot!(g, 0, k)
plot!(zero, 0, k)
```

(This is basically the graph of $\phi(u)$ and the graph of its shifted value $\phi(u+1)$, only truncated on the top and bottom.)

The center of mass of this figure is found with:

```julia;
h(x) = g(x) - f(x)
top, _ = quadgk(x -> x*h(x), 0, k)
bottom, _ = quadgk(h, 0, k)
top/bottom
```

This figure has constant slices of length $1$ for fixed values of $y$. If we were to approximate the figure with blocks of height $1$, then the center of mass of each block would be to the left of $1$ - for any $k$ - but the topmost block would have an overhang to the right of $1$, out to a value of $k$. That is, this figure should balance:

```julia; echo=false
u(i) = 1/2*(2k - log(exp(2(k-1)) - i))
p = plot(legend=false);
for i in 0:floor(phi(k))
    x = u(i)
    plot!(p, [x,x,x-1,x-1], [f(x),f(x)+1, f(x)+1, f(x)], linetype=:polygon, color=:orange);
end
plot!(p, f, 0, exp(1), linewidth=5);
plot!(p, g, 0, 3, linewidth=5)
p
```

See this [paper](https://math.dartmouth.edu/~pw/papers/maxover.pdf) and its references for some background on this example and its extensions.

### The $y$ direction

We can talk about the center of mass in the $y$ direction too. The approximating picture uses horizontal rectangles - not vertical ones - and if we describe their lengths by $f(y)$, then the corresponding formula would be

> ``\text{center of mass} = \text{cm}_y = \frac{\int_a^b y f(y) dy}{\int_a^b f(y) dy}.``

For example, consider, again, the triangle bounded by the line $x=-1$, the $x$ axis, and the line $y=(1-x)/2$. To describe this in terms of $y$, the function $f(y)=2 - 2y$ gives the total length of the horizontal slice (which comes from solving $y=(1-x)/2$ for $x$ - the general method to find an inverse function - and subtracting the left endpoint, $-1$) and the interval is $y=0$ to $y=1$. Thus our center of mass in the $y$ direction will be

```math
\text{cm}_y = \frac{\int_0^1 y (2 - 2y) dy}{\int_0^1 (2 - 2y) dy} = \frac{(2y^2/2 - 2y^3/3)\big|_0^1}{1} = \frac{1}{3}.
```

Here the center of mass is below $1/2$, as the bulk of the area is.
(The bottom area is just $1$, as known from the formula for the area of a triangle.)

As seen, the computation of the center of mass in the $y$ direction uses an identical formula, though it may be more involved if an inverse function must be computed.

##### Example


More generally, consider a right triangle with vertices $(0,0)$, $(0,a)$, and $(b,0)$. The center of mass of this can be computed with the help of the equation for the line that forms the hypotenuse: $x/b + y/a = 1$. We find the center of mass symbolically in the $y$ variable by solving for $x$ in terms of $y$, then integrating from $0$ to $a$:

```julia;
@syms a b x y
eqn = x/b + y/a - 1
fy = solve(eqn, x)[1]
integrate(y*fy, (y, 0, a)) / integrate(fy, (y, 0, a))
```

The answer involves $a$ linearly, but not $b$. If we find the center of mass in $x$, we *could* do something similar:

```julia;
fx = solve(eqn, y)[1]
integrate(x*fx, (x, 0, b)) / integrate(fx, (x, 0, b))
```

But really, we should have just noted that simply by switching the labels $a$ and $b$ in the diagram we could have discovered this formula.

!!! note
    The [centroid](http://en.wikipedia.org/wiki/Centroid) of a region in the plane is just $(\text{cm}_x, \text{cm}_y)$. This last fact says the centroid of the right triangle is just $(b/3, a/3)$. The centroid can be found by other geometric means. The link shows the plumb line method. For triangles, the centroid is also the intersection point of the medians, the lines that connect a vertex with its opposite midpoint.


##### Example

Compute the $x$ and $y$ values of the center of mass of the half circle described by the area below the function $f(x) = \sqrt{1 - x^2}$ and above the $x$-axis.

A plot shows the value of cm$_x$ will be $0$ by symmetry:

```julia; hold=true
f(x) = sqrt(1 - x^2)
plot(f, -1, 1)
```

($f(x)$ is even, so $xf(x)$ will be odd.)

However, the value for cm$_y$ will - as in the last problem - be below $1/2$, as the bulk of the area is near the bottom.
The exact value is computed using slices in the $y$ direction. Solving for $x$ in $y=\sqrt{1-x^2}$ gives $x = \pm \sqrt{1-y^2}$, so $f(y) = 2\sqrt{1 - y^2}$. The value is then:

```math
\text{cm}_y = \frac{\int_{0}^1 y \cdot 2 \sqrt{1 - y^2}dy}{\int_{0}^1 2\sqrt{1-y^2}dy} =
\frac{-\frac{2}{3}(1-y^2)^{3/2}\big|_0^1}{\pi/2} = \frac{2/3}{\pi/2} = \frac{4}{3\pi}.
```

This is about $0.42$. The top calculation is done by a $u$-substitution, the bottom by using the area formula for a half circle, $\pi r^2/2$.

##### Example

A disc of radius $2$ is centered at the origin, and a disc of radius $1$ centered at $(0,1)$ is bored out. Find the resulting center of mass.

A picture shows that this could be complicated, especially for $y > 0$, as we need to describe the length of the orange lines below for $-2 < y < 2$:

```julia; hold=true; echo=false
a,b = 0, 2pi
ts = range(a, stop=b, length=50)
p = plot(t -> 2cos(t), t->2sin(t), a, b, legend=false, aspect_ratio=:equal);
plot!(p, cos.(ts), 1 .+ sin.(ts), linetype=:polygon, color=:red);
plot!(p, [-sqrt(3), sqrt(3)], [-1,-1], color=:orange);
plot!(p, [-sqrt(3), -1], [1,1], color=:orange);
plot!(p, [sqrt(3), 1], [1,1], color=:orange);
p

```

We can see that cm$_x = 0$, by symmetry, but to compute cm$_y$ we need to find $f(y)$, which will depend on the value of $y$ between $-2$ and $2$. The outer circle is $x^2 + y^2 = 4$, the inner circle $x^2 + (y-1)^2 = 1$. When $y < 0$, $f(y)$ is the distance across the outer circle, or $2\sqrt{4 - y^2}$. When $y \geq 0$, $f(y)$ is *twice* the difference between the half-widths of the bigger circle and the smaller, or $2(\sqrt{4 - y^2} - \sqrt{1 - (y-1)^2})$.

We use this to compute:

```julia; hold=true
f(y) = y < 0 ? 2*sqrt(4 - y^2) : 2*(sqrt(4 - y^2) - sqrt(1 - (y-1)^2))
top, _ = quadgk( y -> y * f(y), -2, 2)
bottom, _ = quadgk( f, -2, 2)
top/bottom
```

The nice answer of $-1/3$ makes us think there may be a different way to visualize this.
Were we to rearrange the top integral, we could write it as $\int_{-2}^2 y \cdot 2 \sqrt{4 -y^2}dy - \int_0^2 y \cdot 2\sqrt{1 - (y-1)^2}dy$. Call this $A - B$. The term $A$ is the moment of the big circle about $0$ (its center of mass is this value divided by its mass, $M=4\pi$), and the term $B$ is the moment of the (drilled out) smaller circle (its center of mass is $B$ divided by its mass, $m=\pi$). The center of mass of the figure is then the difference of the moments divided by the difference of the masses, $(A - B)/(M-m)$. In this case $A=0$, $B=\pi \cdot 1$, and $M = 4m = 4\pi$, so the answer is $(0 - \pi)/(3\pi) = -1/3$.

## Questions

###### Question

Find the center of mass in the $x$ variable for the region bounded by the parabola $x=4 - y^2$ and the $y$ axis.

```julia; hold=true; echo=false
f(x) = sqrt(4 - x)
a, b = 0, 4
top, _ = quadgk(x -> x*f(x), a,b)
bottom, _ = quadgk(f, a,b)
val = top/bottom
numericq(val)
```


###### Question

Find the center of mass in the $x$ variable of the region in the first and fourth quadrants bounded by the ellipse $(x/2)^2 + (y/3)^2 = 1$.

```julia; hold=true; echo=false
f(x) = 3 * sqrt(1 - (x/2)^2)
a, b = 0, 2

top, _ = quadgk(x -> x*f(x), a,b)
bottom, _ = quadgk(f, a,b)
val = top/bottom
numericq(val)
```

###### Question

Find the center of mass of the region in the first quadrant bounded by the function $f(x) = x^3(1-x)^4$.

```julia; hold=true; echo=false
f(x) = x^3 * (1-x)^4
a, b = 0, 1

top, _ = quadgk(x -> x*f(x), a,b)
bottom, _ = quadgk(f, a,b)
val = top/bottom
numericq(val)
```

###### Question

Let $k$ and $\lambda$ be parameters in $(0, \infty)$. The [Weibull](http://en.wikipedia.org/wiki/Weibull_distribution) density is a probability density on $[0, \infty)$ (meaning it is $0$ when $x < 0$) satisfying:

```math
f(x) = \frac{k}{\lambda}\left(\frac{x}{\lambda}\right)^{k-1} \exp(-(\frac{x}{\lambda})^k)
```

For $k=2$ and $\lambda = 2$, compute the mean. (The center of mass, assuming the total area is $1$.)

```julia; hold=true; echo=false
k, lambda = 2, 2
f(x) = (k/lambda) * (x/lambda)^(k-1) * exp(-(x/lambda)^k)
a, b = 0, Inf

top, _ = quadgk(x -> x*f(x), a,b)
bottom, _ = quadgk(f, a,b)
val = top/bottom
numericq(val)
```

###### Question

The [logistic](http://en.wikipedia.org/wiki/Logistic_distribution) density depends on two parameters $m$ and $s$ and is given by:

```math
f(x) = \frac{1}{4s} \text{sech}\left(\frac{x-m}{2s}\right)^2, \quad -\infty < x < \infty.
```

(Where $\text{sech}$ is the hyperbolic secant, implemented in `julia` through `sech`.)

For $m=2$ and $s=4$ compute the mean, or center of mass, of this density.

```julia; hold=true; echo=false
m, s = 2, 4
f(x) = 1/(4s) * sech((x-m)/(2s))^2
a, b = -Inf, Inf

top, _ = quadgk(x -> x*f(x), a,b)
bottom, _ = quadgk(f, a,b)
val = top/bottom
numericq(val)
```

###### Question

A region is formed by the part of the disc bounded by the circle $x^2 + y^2 = 1$ that lies above the line $y=3/4$. Find the center of mass in the $y$ direction (that of the $x$ direction is $0$ by symmetry).

```julia; hold=true; echo=false
f(y) = 2*sqrt(1 - y^2)
a, b = 3/4, 1


top, _ = quadgk(y -> y*f(y), a,b)
bottom, _ = quadgk(f, a,b)
val = top/bottom
numericq(val)
```

###### Question

Find the center of mass in the $y$ direction of the area bounded by the cosine curve and the $x$ axis between $-\pi/2$ and $\pi/2$.

```julia; hold=true; echo=false
f(y) = 2*acos(y)
a, b = 0, 1

top, _ = quadgk(y -> y*f(y), a,b)
bottom, _ = quadgk(f, a,b)
val = top/bottom
numericq(val)
```

###### Question

A penny, nickel, dime, and quarter are stacked so that their rightmost edges align and are centered vertically so that the center of mass in the $y$ direction is $0$. Find the center of mass in the $x$ direction.
- -```julia; hold=true; echo=false -ds = [0.75, 0.835, 0.705, 0.955] -rs = ds/2 -xs = rs[4] .- rs -ts = range(0,stop=2pi, length=50) -p = plot(legend=false, aspect_ratio=:equal); -for i in 1:4 - plot!(p, xs[i] .+ rs[i]*cos.(ts), rs[i]*sin.(ts)); -end - -p -``` - -You will need some specifications, such as these from the [US Mint](http://www.usmint.gov/about_the_mint/?action=coin_specifications) - -```eval=false - diameter(in) weight(gms) -penny 0.750 2.500 -nickel 0.835 5.000 -dime 0.705 2.268 -quarter 0.955 5.670 - -``` - -(Hint: Though this could be done with integration, it is easier to treat each coin as a single point (its centroid) with the given mass and then apply the formula for sums.) - -```julia; hold=true; echo=false -ds = [0.75, 0.835, 0.705, 0.955] -ms = [2.5, 5, 2.268, 5.670] -rs = ds/2 -xs = rs[4] .- rs -val = sum(ms .* xs) / sum(ms) -numericq(val) -``` diff --git a/CwJ/integrals/figures/beeckman-1618.png b/CwJ/integrals/figures/beeckman-1618.png deleted file mode 100644 index ec478e1..0000000 Binary files a/CwJ/integrals/figures/beeckman-1618.png and /dev/null differ diff --git a/CwJ/integrals/figures/beer_glasses.jpg b/CwJ/integrals/figures/beer_glasses.jpg deleted file mode 100644 index fd7347f..0000000 Binary files a/CwJ/integrals/figures/beer_glasses.jpg and /dev/null differ diff --git a/CwJ/integrals/figures/big-solo-cup.jpg b/CwJ/integrals/figures/big-solo-cup.jpg deleted file mode 100644 index d39058b..0000000 Binary files a/CwJ/integrals/figures/big-solo-cup.jpg and /dev/null differ diff --git a/CwJ/integrals/figures/companion-curve-bisects-rectangle.png b/CwJ/integrals/figures/companion-curve-bisects-rectangle.png deleted file mode 100644 index 9e1c9e3..0000000 Binary files a/CwJ/integrals/figures/companion-curve-bisects-rectangle.png and /dev/null differ diff --git a/CwJ/integrals/figures/cycloid-companion-curve.png b/CwJ/integrals/figures/cycloid-companion-curve.png deleted file mode 100644 index 4d144ce..0000000 Binary files 
a/CwJ/integrals/figures/cycloid-companion-curve.png and /dev/null differ diff --git a/CwJ/integrals/figures/gehry-hendrix.jpg b/CwJ/integrals/figures/gehry-hendrix.jpg deleted file mode 100644 index 1287bd8..0000000 Binary files a/CwJ/integrals/figures/gehry-hendrix.jpg and /dev/null differ diff --git a/CwJ/integrals/figures/integration-glass.jpg b/CwJ/integrals/figures/integration-glass.jpg deleted file mode 100644 index 8a5fa4c..0000000 Binary files a/CwJ/integrals/figures/integration-glass.jpg and /dev/null differ diff --git a/CwJ/integrals/figures/johns-catenary-details.jpg b/CwJ/integrals/figures/johns-catenary-details.jpg deleted file mode 100644 index 72b092a..0000000 Binary files a/CwJ/integrals/figures/johns-catenary-details.jpg and /dev/null differ diff --git a/CwJ/integrals/figures/johns-catenary.jpg b/CwJ/integrals/figures/johns-catenary.jpg deleted file mode 100644 index 3dbffca..0000000 Binary files a/CwJ/integrals/figures/johns-catenary.jpg and /dev/null differ diff --git a/CwJ/integrals/figures/jump-rope.png b/CwJ/integrals/figures/jump-rope.png deleted file mode 100644 index 1e74bd5..0000000 Binary files a/CwJ/integrals/figures/jump-rope.png and /dev/null differ diff --git a/CwJ/integrals/figures/michelin-man.jpg b/CwJ/integrals/figures/michelin-man.jpg deleted file mode 100644 index 336d98f..0000000 Binary files a/CwJ/integrals/figures/michelin-man.jpg and /dev/null differ diff --git a/CwJ/integrals/figures/oresme-1350-mean-value.png b/CwJ/integrals/figures/oresme-1350-mean-value.png deleted file mode 100644 index 1aa1363..0000000 Binary files a/CwJ/integrals/figures/oresme-1350-mean-value.png and /dev/null differ diff --git a/CwJ/integrals/figures/pacman.png b/CwJ/integrals/figures/pacman.png deleted file mode 100644 index 33b8de4..0000000 Binary files a/CwJ/integrals/figures/pacman.png and /dev/null differ diff --git a/CwJ/integrals/figures/red-solo-cup.jpg b/CwJ/integrals/figures/red-solo-cup.jpg deleted file mode 100644 index 987094f..0000000 
Binary files a/CwJ/integrals/figures/red-solo-cup.jpg and /dev/null differ diff --git a/CwJ/integrals/figures/seesaw.png b/CwJ/integrals/figures/seesaw.png deleted file mode 100644 index c36246e..0000000 Binary files a/CwJ/integrals/figures/seesaw.png and /dev/null differ diff --git a/CwJ/integrals/figures/surface-revolution.png b/CwJ/integrals/figures/surface-revolution.png deleted file mode 100644 index 169bd51..0000000 Binary files a/CwJ/integrals/figures/surface-revolution.png and /dev/null differ diff --git a/CwJ/integrals/figures/verrazzano-loaded.jpg b/CwJ/integrals/figures/verrazzano-loaded.jpg deleted file mode 100644 index ab870e3..0000000 Binary files a/CwJ/integrals/figures/verrazzano-loaded.jpg and /dev/null differ diff --git a/CwJ/integrals/figures/verrazzano-unloaded.jpg b/CwJ/integrals/figures/verrazzano-unloaded.jpg deleted file mode 100644 index 87ee20d..0000000 Binary files a/CwJ/integrals/figures/verrazzano-unloaded.jpg and /dev/null differ diff --git a/CwJ/integrals/ftc.jmd b/CwJ/integrals/ftc.jmd deleted file mode 100644 index 232e414..0000000 --- a/CwJ/integrals/ftc.jmd +++ /dev/null @@ -1,1246 +0,0 @@

# Fundamental Theorem of Calculus

This section uses these add-on packages:

```julia
using CalculusWithJulia
using Plots
using SymPy
using Roots
using QuadGK
```

```julia; echo=false; results="hidden"
using CalculusWithJulia.WeaveSupport

const frontmatter = (
    title = "Fundamental Theorem of Calculus",
    description = "Calculus with Julia: Fundamental Theorem of Calculus",
    tags = ["CalculusWithJulia", "integrals", "fundamental theorem of calculus"],
);
fig_size = (800, 600)
nothing
```

----

We refer to the example from the section on
[transformations](../precalc/transformations.html#two_operators_D_S)
where two operators on functions were defined:

```math
D(f)(k) = f(k) - f(k-1), \quad S(f)(k) = f(1) + f(2) + \cdots + f(k).
```

It was remarked that these relationships hold: $D(S(f))(k) = f(k)$ and $S(D(f))(k) = f(k) - f(0)$, these being a consequence of the inverse relationship between addition and subtraction. These two relationships are examples of a more general pair of relationships known as the [Fundamental theorem of calculus](http://en.wikipedia.org/wiki/Fundamental_theorem_of_calculus) or FTC.


We will see that with suitable rewriting, the derivative of a function is related to a certain limit of `D(f)` and the definite integral of a function is related to a certain limit of `S(f)`. The addition and subtraction rules encapsulated in the relations $D(S(f))(k) = f(k)$ and $S(D(f))(k) = f(k) - f(0)$ then generalize to these calculus counterparts.

The FTC details the interconnectivity between the operations of integration and differentiation.

For example:

> What is the definite integral of the derivative?

That is, what is $A = \int_a^b f'(x) dx$? (Assume $f'$ is continuous.)

To investigate, we begin with the right Riemann sum using $h = (b-a)/n$:

```math
A \approx S_n = \sum_{i=1}^n f'(a + ih) \cdot h.
```

But the mean value theorem says that for small $h$ we have $f'(x) \approx (f(x) - f(x-h))/h$. Using this approximation with $x=a+ih$ gives:

```math
A \approx
\sum_{i=1}^n \left(f(a + ih) - f(a + (i-1)h)\right).
```

If we let $g(i) = f(a + ih)$, then the summand above is just $g(i) - g(i-1) = D(g)(i)$, and the above is then just the sum of the $D(g)(i)$s, or:

```math
A \approx S(D(g))(n) = g(n) - g(0).
```

But $g(n) - g(0) = f(a + nh) - f(a + 0h) = f(b) - f(a)$. That is, we expect that the $\approx$ becomes $=$ in the limit, or:

```math
\int_a^b f'(x) dx = f(b) - f(a).
```

This is indeed the case.

The other question would be

> What is the derivative of the integral?

That is, can we find the derivative of $\int_0^x f(u) du$? (The derivative is in ``x``; the variable ``u`` is a dummy variable of integration.)
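Before turning to that question, the first relationship can be illustrated numerically with a right Riemann sum. A sketch in base `Julia`, with $f(x) = \sin(x)$ and the interval chosen just for illustration:

```julia
f(x)  = sin(x)
fp(x) = cos(x)                          # the derivative f'
a, b, n = 0, pi/2, 10_000
h = (b - a) / n
S = sum(fp(a + i*h) * h for i in 1:n)   # right Riemann sum for the integral of f'
S - (f(b) - f(a))                       # the difference is O(h)
```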

Let's look first at the integral using the right-Riemann sum, again using $h=(b-a)/n$:

```math
\int_a^b f(u) du \approx f(a + 1h)h + f(a + 2h)h + \cdots + f(a + nh)h = S(g)(n),
```

where we define $g(i) = f(a + ih)h$. In the above, $n$ relates to $b$, but we could have stopped accumulating at any value. The analog of $S(g)(k)$ would be $\int_a^x f(u) du$ where $x = a + kh$. That is, we can make a function out of integration by considering the mapping $(x, \int_a^x f(u) du)$. This might be written as $F(x) = \int_a^x f(u)du$. With this definition, can we take a derivative in $x$?

Again, we fix a large $n$ and let $h=(b-a)/n$, and suppose $x = a + Mh$ for some $M$. Then writing out the approximations to both the definite integral and the derivative we have

```math
\begin{align*}
F'(x) &= \frac{d}{dx} \int_a^x f(u) du \\
& \approx \frac{F(x) - F(x-h)}{h} \\
&= \frac{\int_a^x f(u) du - \int_a^{x-h} f(u) du}{h}\\
& \approx \frac{f(a + 1h)h + f(a + 2h)h + \cdots + f(a + (M-1)h)h + f(a + Mh)h}{h}\\
&\quad - \frac{f(a + 1h)h + f(a + 2h)h + \cdots + f(a + (M-1)h)h}{h} \\
& = \left(f(a + 1h) + f(a + 2h) + \cdots + f(a + (M-1)h) + f(a + Mh)\right)\\
&\quad - \left(f(a + 1h) + f(a + 2h) + \cdots + f(a + (M-1)h) \right) \\
&= f(a + Mh).
\end{align*}
```

If now $g(i) = f(a + ih)$ (without the factor of $h$, as it has been divided out), then the above becomes

```math
\begin{align*}
F'(x) & \approx D(S(g))(M) \\
&= f(a + Mh)\\
&= f(x).
\end{align*}
```



That is, $F'(x) \approx f(x)$.

In the limit, then, we would expect that

```math
\frac{d}{dx} \int_a^x f(u) du = f(x).
```


With these heuristics, we now have:

> **The fundamental theorem of calculus**
>
> Part 1: Let $f$ be a continuous
> function on a closed interval $[a,b]$ and define $F(x) = \int_a^x f(u) du$ for $a \leq x \leq b$. Then $F$ is continuous on $[a,b]$,
> differentiable on $(a,b)$ and, moreover, $F'(x) = f(x)$.
>
> Part 2: Now
> suppose $f$ is any integrable function on a closed interval $[a,b]$
> and $F(x)$ is *any* differentiable function on $[a,b]$ with $F'(x) = f(x)$. Then $\int_a^b f(x)dx=F(b)-F(a)$.



!!! note
    In Part 1, the integral $F(x) = \int_a^x f(u) du$ is defined for any
    Riemann integrable function, $f$. If the function is not continuous,
    then it is true that $F$ will be continuous, but it need not be true
    that it is differentiable at all points in $(a,b)$. Forming $F$ from
    $f$ is a form of *smoothing*. It makes a continuous function out of an
    integrable one, a differentiable function from a continuous one, and a
    $k+1$-times differentiable function from a $k$-times differentiable
    one.

## Using the fundamental theorem of calculus to evaluate definite integrals

The major use of the FTC is the computation of $\int_a^b f(x) dx$. Rather than resorting to Riemann sums or geometric arguments, there is an alternative - *when possible*, find a function $F$ with $F'(x) = f(x)$ and compute $F(b) - F(a)$.

Some examples:

* Consider the problem of Archimedes, $\int_0^1 x^2 dx$. Clearly, we have with $f(x) = x^2$ that $F(x)=x^3/3$ will satisfy the assumptions of the FTC, so that:

```math
\int_0^1 x^2 dx = F(1) - F(0) = \frac{1^3}{3} - \frac{0^3}{3} = \frac{1}{3}.
```


* More generally, we know for $n\neq-1$ that if $f(x) = x^{n}$, then $F(x) = x^{n+1}/(n+1)$ will satisfy $F'(x)=f(x)$, so that

```math
\int_a^b x^n dx = \frac{b^{n+1} - a^{n+1}}{n+1}, \quad n\neq -1.
```

(Well, almost! When $n < 0$ we must be careful that $a \cdot b > 0$, as otherwise the interval will contain a point where $f(x)$ is not defined.)

We note that the above includes the case of a constant, that is $n=0$.



What about the case $n=-1$, or $f(x) = 1/x$, that is not covered by the above? For this special case, it is known that $F(x) = \log(x)$ (the natural log) will have $F'(x) = 1/x$.
This gives, for $0 < a < b$:

```math
\int_a^b \frac{1}{x} dx = \log(b) - \log(a).
```


* Let $f(x) = \cos(x)$. How much area is between $-\pi/2$ and $\pi/2$? We have that $F(x) = \sin(x)$ will have $F'(x) = f(x)$, so:

```math
\int_{-\pi/2}^{\pi/2} \cos(x) dx = F(\pi/2) - F(-\pi/2) = 1 - (-1) = 2.
```

### An alternate notation for $F(b) - F(a)$

The expression $F(b) - F(a)$ is often written in this more compact form:

```math
\int_a^b f(x) dx = F(b) - F(a) = F(x)\big|_{x=a}^b, \text{ or just expr}\big|_{x=a}^b.
```



The vertical bar is used for the *evaluation* step; in this case the $a$ and $b$ mirror those of the definite integral. This notation lends itself to working inline, as we illustrate with this next problem where we "know" a function "$F$", so just express it "inline":

```math
\int_0^{\pi/4} \sec^2(x) dx = \tan(x) \big|_{x=0}^{\pi/4} = 1 - 0 = 1.
```

A consequence of this notation is:

```math
F(x) \big|_{x=a}^b = -F(x) \big|_{x=b}^a.
```

This says nothing more than $F(b)-F(a) = -(F(a) - F(b))$, though more compactly.

## The indefinite integral

A function $F(x)$ with $F'(x) = f(x)$ is known as an *antiderivative* of $f$. For a given $f$, there are infinitely many antiderivatives: if $F(x)$ is one, then so is $G(x) = F(x) + C$ for any constant $C$. But - due to the mean value theorem - any two antiderivatives of $f$ differ by a constant.

The **indefinite integral** of $f(x)$ is denoted by:

```math
\int f(x) dx.
```

(There are no limits of integration.) There are two possible definitions: the notation refers either to the set of *all* antiderivatives, or to just one member of that set. The former gives rise to expressions such as

```math
\int x^2 dx = \frac{x^3}{3} + C
```

where $C$ is the *constant of integration* and isn't really a fixed constant, but any possible constant.
These notes will follow the lead of `SymPy` and not give a $C$ in the expression, but instead rely on the reader to understand that there could be many other possible expressions given, though all differ by no more than a constant. This means that $\int f(x) dx$ refers to *an* antiderivative, not *the* collection of all antiderivatives.

### The `integrate` function from `SymPy`

`SymPy` provides the `integrate` function to perform integration. There are two usages:

- `integrate(ex, var)` to find an antiderivative

- `integrate(ex, (var, a, b))` to find the definite integral. This integrates the expression in the variable `var` from `a` to `b`.

To illustrate, this call finds an antiderivative:

```julia;
@syms x
integrate(sin(x), x)
```

Whereas this call computes the "area" under $f(x)$ between `a` and `b`:

```julia;
integrate(sin(x), (x, 0, pi))
```

As does this for a different function:

```julia;
integrate(acos(1-x), (x, 0, 2))
```

Answers may depend on conditions, as here, where the case ``n=-1`` breaks a pattern:

```julia; hold=true
@syms x::real n::real
integrate(x^n, x) # indefinite integral
```

Answers may depend on specific assumptions:

```julia; hold=true
@syms u
integrate(abs(u), u)
```

Yet

```julia; hold=true
@syms u::real
integrate(abs(u), u)
```

Answers may not be available in terms of elementary functions, but there may be special functions that cover such cases.

```julia; hold=true
@syms x::real
integrate(x / sqrt(1-x^3), x)
```

The different cases explored by `integrate` are discussed after the questions.



## Rules of integration

There are some "rules" of integration that allow integrals to be re-expressed. These follow from the rules of derivatives.

* The integral of a constant times a function:

```math
\int c \cdot f(x) dx = c \cdot \int f(x) dx.
```

This follows since, if $F(x)$ is an antiderivative of $f(x)$, then $[cF(x)]' = c f(x)$ by the rules of derivatives.

* The integral of a sum of functions:

```math
\int (f(x) + g(x)) dx = \int f(x) dx + \int g(x) dx.
```

This follows immediately since, if $F(x)$ and $G(x)$ are antiderivatives of $f(x)$ and $g(x)$, then $[F(x) + G(x)]' = f(x) + g(x)$, so the right-hand side has derivative $f(x) + g(x)$.

In fact, this more general form, where $c$ and $d$ are constants, covers both cases:

```math
\int (cf(x) + dg(x)) dx = c \int f(x) dx + d \int g(x) dx.
```


This statement is nothing more than the derivative formula $[cf(x) + dg(x)]' = cf'(x) + dg'(x)$. The product rule gives rise to a technique called *integration by parts* and the chain rule gives rise to a technique of *integration by substitution*, but we defer those discussions to other sections.

##### Examples

- The antiderivative of the polynomial $p(x) = a_n x^n + \cdots + a_1 x + a_0$ follows from the linearity of the integral and the general power rule:

```math
\begin{align*}
\int (a_n x^n + \cdots + a_1 x + a_0) dx
&= \int a_nx^n dx + \cdots + \int a_1 x dx + \int a_0 dx \\
&= a_n \int x^n dx + \cdots + a_1 \int x dx + a_0 \int dx \\
&= a_n\frac{x^{n+1}}{n+1} + \cdots + a_1 \frac{x^2}{2} + a_0 \frac{x}{1}.
\end{align*}
```


- More generally, a [Laurent](https://en.wikipedia.org/wiki/Laurent_polynomial) polynomial allows for terms with negative powers. These too can be handled by the above. For example:

```math
\begin{align*}
\int (\frac{2}{x} + 2 + 2x) dx
&= \int \frac{2}{x} dx + \int 2 dx + \int 2x dx \\
&= 2\int \frac{1}{x} dx + 2 \int dx + 2 \int x dx\\
&= 2\log(x) + 2x + 2\frac{x^2}{2}.
\end{align*}
```

- Consider this integral:

```math
\int_0^\pi 100 \sin(x) dx = F(\pi) - F(0),
```

where $F(x)$ is an antiderivative of $100\sin(x)$. But:

```math
\int 100 \sin(x) dx = 100 \int \sin(x) dx = 100 (-\cos(x)).
-``` - -So the answer to the question is - - -```math -\int_0^\pi 100 \sin(x) dx = (100 (-\cos(\pi))) - (100(-\cos(0))) = (100(-(-1))) - (100(-1)) = 200. -``` - - -This seems like a lot of work, and indeed it is more than is needed. The following would be more typical once the rules are learned: - -```math -\int_0^\pi 100 \sin(x) dx = 100(-\cos(x)) \big|_0^{\pi} = 100 \cos(x) \big|_{\pi}^0 = 100(1) - 100(-1) = 200. -``` - -## The derivative of the integral - -The relationship that $[\int_a^x f(u) du]' = f(x)$ is a bit harder to appreciate, as it doesn't help answer many ready made questions. Here we give some examples of its use. - -First, the expression defining an antiderivative, or indefinite integral, is given in term of a definite integral: - -```math -F(x) = \int_a^x f(u) du. -``` - -The value of $a$ does not matter, as long as the integral is defined. - -```julia; hold=true; echo=false; cache=true -##{{{ftc_graph}}} - -function make_ftc_graph(n) - a, b = 2, 3 - ts = range(0, stop=b, length=50) - xs = range(a, stop=b, length=8) - g(x) = x - G(x) = x^2/2 - - xn,xn1 = xs[n:(n+1)] - xbar = (xn+xn1)/2 - rxs = collect(range(xn, stop=xn1, length=2)) - rys = map(g, rxs) - - p = plot(g, 0, b, legend=false, size=fig_size, xlim=(0,3.25), ylim=(0,5)) - plot!(p, [xn, rxs..., xn1], [0,rys...,0], linetype=:polygon, color=:orange) - plot!(p, [xn1, xn1], [G(xn), G(xn1)], color=:orange, alpha = 0.25) - annotate!(p, collect(zip([xn1, xn1], [G(xn1)/2, G(xn1)], ["A", "A"]))) - - p -end - -caption = L""" - -Illustration showing $F(x) = \int_a^x f(u) du$ is a function that -accumulates area. The value of $A$ is the area over $[x_{n-1}, x_n]$ -and also the difference $F(x_n) - F(x_{n-1})$. - -""" - -n = 7 - -anim = @animate for i=1:n - make_ftc_graph(i) -end - -imgfile = tempname() * ".gif" -gif(anim, imgfile, fps = 1) - -ImageFile(imgfile, caption) -``` - - -The picture for this, for non-negative $f$, is of accumulating area as -$x$ increases. 
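
The accumulation picture can also be checked numerically. Below is a small sketch, not part of the original discussion, using `quadgk` (from the `QuadGK` package) to compute the accumulation function and a central difference to approximate its derivative; the integrand `sin` and the point `1` are arbitrary choices for illustration:

```julia
using QuadGK

f(u) = sin(u)
F(x) = quadgk(f, 0, x)[1]        # F(x) = ∫_0^x f(u) du

h = 1e-6
dF = (F(1 + h) - F(1 - h)) / 2h  # central-difference approximation to F'(1)
dF, f(1)                         # the two values should agree to several digits
```

This is only a numeric check of the relationship $[\int_0^x f(u)du]' = f(x)$; the discussion below uses it directly.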
It can be used to give insight into some formulas:

For any function, we know that $F(b) - F(c) + F(c) - F(a) = F(b) - F(a)$. For this specific function, this translates into this property of the integral:

```math
\int_a^b f(x) dx = \int_a^c f(x) dx + \int_c^b f(x) dx.
```

Similarly, $\int_a^a f(x) dx = F(a) - F(a) = 0$ follows.

To see that the value of $a$ does not matter, consider $a_0 < a_1$ and

```math
F(x) = \int_{a_0}^x f(u)du, \quad G(x) = \int_{a_1}^x f(u)du.
```

Then $F(x) = G(x) + \int_{a_0}^{a_1} f(u) du$. The additional part may
look complicated, but the point is that as far as $x$ is involved, it
is a constant. Hence both $F$ and $G$ are antiderivatives if either
one is.

##### Example

From the familiar formula rate $\times$ time $=$ distance, we "know,"
for example, that a car traveling 60 miles an hour for one hour will
have traveled 60 miles. This allows us to translate statements about
the speed (or more generally velocity) into statements about position
at a given time. If the speed is not constant, we don't have such an
easy conversion.

Suppose our velocity at time $t$ is $v(t)$, and always positive. We
want to find the position at time $t$, $x(t)$. Let's assume $x(0) =
0$. Let $h$ be some small time step, say $h=(t - 0)/n$ for some large
$n>0$. Then we can *approximate* $v$ on
$[ih, (i+1)h)$ by $v(ih)$. This is constant, so the change in position over the time interval $[ih, (i+1)h)$ would simply be $v(ih) \cdot h$, and, ignoring the accumulated errors, the approximate position at time $t$ would be found by adding these pieces together: $x(t) \approx v(0h)\cdot h + v(1h)\cdot h + v(2h) \cdot h + \cdots + v(nh)h$. But we recognize this (as did [Beeckman](http://www.math.harvard.edu/~knill/teaching/math1a_2011/exhibits/bressoud/)
in 1618) as nothing more than an approximation for the Riemann sum of
$v$ over the interval $[0, t]$.
That is, we expect:

```math
x(t) = \int_0^t v(u) du.
```

Hopefully this makes sense: our position is the result of accumulating
our change in position over small units of time. The old
one-foot-in-front-of-another approach to walking out the door.

The above was simplified by the assumption that $x(0) = 0$. What if $x(0) = x_0$ for some non-zero value? Then the above is not exactly correct, as $\int_0^0 v(u) du = 0$. So instead, we might write this more concretely as:

```math
x(t) = x_0 + \int_0^t v(u) du.
```

There is a similar relationship between velocity and acceleration, but let's think about it formally. If we know that the acceleration is the rate of change of velocity, then we have $a(t) = v'(t)$. By the FTC, then

```math
\int_0^t a(u) du = \int_0^t v'(u) du = v(t) - v(0).
```

Rewriting gives a similar statement as before:

```math
v(t) = v_0 + \int_0^t a(u) du.
```


##### Example

In probability theory, for a positive, continuous random variable, the
probability that the random value is less than $a$ is given by $P(X
\leq a) = F(a) = \int_{0}^a f(x) dx$. (Positive means the integral
starts at $0$, whereas in general it could be $-\infty$, a minor complication that
we haven't yet discussed.)

For example, the exponential distribution with rate $1$ has $f(x) = e^{-x}$. Compute $F(x)$.

This is just $F(x) = \int_0^x e^{-u} du = -e^{-u}\big|_0^x = 1 - e^{-x}$.

The "uniform" distribution on $[a,b]$ has

```math
F(x) =
\begin{cases}
0 & x < a\\
\frac{x-a}{b-a} & a \leq x \leq b\\
1 & x > b
\end{cases}
```

Find $f(x)$. There are some subtleties here. If we assume that $F(x) = \int_0^x f(u) du$, then we know that if $f(x)$ is continuous then $F'(x) = f(x)$. Differentiating, we get

```math
f(x) = \begin{cases}
0 & x < a\\
\frac{1}{b-a} & a < x < b\\
0 & x > b
\end{cases}
```

However, the function $f$ is *not* continuous on $[a,b]$ and $F(x)$ is not
differentiable at $x=a$ or $x=b$.
It is true that $f$ is integrable, and
where $F$ is differentiable, $F'=f$. So $f$ is determined except
possibly at the points $x=a$ and $x=b$.

##### Example

The error function is defined by $\text{erf}(x) = 2/\sqrt{\pi}\int_0^x e^{-u^2}
du$. It is implemented in `Julia` through `erf`, found in the
`SpecialFunctions` package. Suppose we were to
ask where it takes on its maximum value. What would we find?

The answer will either be at a critical point, at $0$, or as $x$ goes to $\infty$. We can differentiate to find critical points:

```math
[\text{erf}(x)]' = \frac{2}{\sqrt{\pi}}e^{-x^2}.
```

Oh, this is never $0$, so there are no critical points. The maximum occurs at $0$ or as $x$ goes to $\infty$. Clearly at $0$, we have $\text{erf}(0)=0$, so the answer will be as $x$ goes to $\infty$.

In retrospect, this is a silly question. As $f(x) > 0$ for all $x$, we
*must* have that $F(x)$ is strictly increasing, so never gets to a
local maximum.

##### Example

The [Dawson](http://en.wikipedia.org/wiki/Dawson_function) function is

```math
F(x) = e^{-x^2} \int_0^x e^{t^2} dt
```

Characterize any local maxima or minima.

For this we need to consider the product rule. The fundamental theorem of calculus will help with the right-hand side. We have:

```math
F'(x) = (-2x)e^{-x^2} \int_0^x e^{t^2} dt + e^{-x^2} e^{x^2} = -2x F(x) + 1
```

We need to figure out when this is $0$. For that, we use some numeric math.

```julia;
F(x) = exp(-x^2) * quadgk(t -> exp(t^2), 0, x)[1]
Fp(x) = -2x*F(x) + 1
cps = find_zeros(Fp, -4, 4)
```

We could take a second derivative to characterize these. For that we use
$F''(x) = [-2xF(x) + 1]' = -2F(x) - 2x(-2xF(x) + 1) = -2F(x) + 4x^2F(x) - 2x$, so

```julia;
Fpp(x) = -2F(x) + 4x^2*F(x) - 2x
Fpp.(cps)
```

The first value being positive says there is a relative minimum at $-0.924139$; at $0.924139$ there is a relative maximum.


##### Example

Returning to probability, suppose there are ``n`` positive random numbers ``X_1``, ``X_2``, ..., ``X_n``.
A natural question might be to ask what formula describes the largest of these values, assuming each is identical in some way. A description that is helpful is to define ``F(a) = P(X \leq a)`` for some random number ``X``. That is, the probability that ``X`` is less than or equal to ``a`` is ``F(a)``. For many situations, there is a *density* function, ``f``, for which ``F(a) = \int_0^a f(x) dx``.

Under assumptions that the ``X`` are identical and independent, the largest value, ``M``, may be
characterized by ``P(M \leq a) = \left[F(a)\right]^n``. Using ``f`` and ``F`` describe the derivative of this expression.

This problem is constructed to take advantage of the FTC, and we have:

```math
\begin{align*}
\left[P(M \leq a)\right]'
&= \left[F(a)^n\right]'\\
&= n \cdot F(a)^{n-1} \left[F(a)\right]'\\
&= n F(a)^{n-1}f(a)
\end{align*}
```

##### Example

Suppose again that probabilities of a random number between ``0`` and ``1``, say, are given by a positive, continuous density ``f(x)`` on ``(0,1)`` through ``F(a) = P(X \leq a) = \int_0^a f(x) dx``. The median value of the random number is a value of ``a`` for which ``P(X \leq a) = 1/2``. Such an ``a`` makes ``X`` a coin toss -- betting if ``X`` is less than ``a`` is like betting on heads to come up. More generally, the ``q``th quantile of ``X`` is a number ``a`` with ``P(X \leq a) = q``. The definition is fine, but for a given ``f`` and ``q`` can we find ``a``?

Abstractly, we are solving ``F(a) = q`` or ``F(a)-q = 0`` for ``a``. That is, this is a zero-finding question. We have discussed different options for this problem: bisection, a range of derivative-free methods, and Newton's method. As evaluating ``F`` involves an integral, which may involve many evaluations of ``f``, a method which converges quickly is preferred. For that, Newton's method is a good idea, as it has quadratic convergence in this case: ``a`` is a simple zero given that ``F`` is increasing under the assumptions above.
- -Newton's method involves the update step `x = x - f(x)/f'(x)`. For this "``f``" is ``h(x) = \int_0^x f(u) du - q``. The derivative is easy, the FTC just applies: ``h'(x) = f(x)``; no need for automatic differentiation, which may not even apply to this setup. - -To do a concrete example, we take the [Beta](https://en.wikipedia.org/wiki/Beta_distribution)(``\alpha, \beta``) distribution (``\alpha, \beta > 0``) which has density, ``f``, over ``[0,1]`` given by - -```math -f(x) = x^{\alpha-1}\cdot (1-x)^{\beta-1} \cdot \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)} -``` - -The Wikipedia link above gives an approximate answer for the median of ``(\alpha-1/3)/(\alpha+\beta-2/3)`` when ``\alpha,\beta > 1``. Let's see how correct this is when ``\alpha=5`` and ``\beta=6``. The `gamma` function used below implements ``\Gamma``. It is in the `SpecialFunctions` package, which is loaded with the `CalculusWithJulia` package. - -```julia -alpha, beta = 5,6 -f(x) = x^(alpha-1)*(1-x)^(beta-1) * gamma(alpha + beta) / (gamma(alpha) * gamma(beta)) -q = 1/2 -h(x) = first(quadgk(f, 0, x)) - q -hp(x) = f(x) - -x0 = (alpha-1/3)/(alpha + beta - 2/3) -xstar = find_zero((h, hp), x0, Roots.Newton()) - -xstar, x0 -``` - -The asymptotic answer agrees with the answer in the first four decimal places. - -As an aside, we ask how many function evaluations were taken? We can track this with a trick - using a closure to record when ``f`` is called: - -```julia -function FnWrapper(f) - ctr = 0 - function(x) - ctr += 1 - f(x) - end -end -``` - -Then we have the above using `FnWrapper(f)` in place of `f`: - -```julia; hold=true -ff = FnWrapper(f) -F(x) = first(quadgk(ff, 0, x)) -h(x) = F(x) - q -hp(x) = ff(x) -xstar = find_zero((h, hp), x0, Roots.Newton()) -xstar, ff.ctr -``` - -So the answer is the same. Newton's method converged in 3 steps, and called `h` or `hp` 5 times. - -Assuming the number inside `Core.Box` is the value of `ctr`, we see not so many function calls, just ``48``. 

Were `f` very expensive to compute, or `h` expensive to compute (which can happen
if, say, `f` were highly oscillatory), then steps could be made to cut
this number down, such as evaluating ``F(x_n) = \int_{0}^{x_n} f(x)
dx``, using additivity, as ``\int_0^{x_0} f(x) dx +
\int_{x_0}^{x_1}f(x)dx + \int_{x_1}^{x_2}f(x)dx + \cdots +
\int_{x_{n-1}}^{x_n}f(x)dx``. Then all but the last term could be
stored from the previous steps of Newton's method. The last term would presumably be less costly, as it would typically involve a small interval.

!!! note
    The trick using a closure relies on an internal way of accessing elements in a closure. The same trick could be implemented many different ways that aren't reliant on undocumented internals; this approach was just a tad more convenient. It shouldn't be copied for work intended for distribution, as the internals may change without notice or deprecation.


##### Example

A junior engineer at `Treadmillz.com` is tasked with updating the
display of calories burned for an older-model treadmill. The old display
involved a sequence of LED "dots" that updated each minute. The last
10 minutes were displayed. Each dot corresponded to one calorie
burned, so the total number of calories burned in the past 10 minutes
was the number of dots displayed, or the sum of each column of dots.
An example might be:

```julia; eval=false

 **
 ****
 *****
 ********
**********
```


In this example display there was 1 calorie burned in the first minute, then
2, then 5, 5, 4, 3, 2, 2, 1. The total is $25$.


In her work the junior engineer found this old function for updating the display

```julia; eval=false
function cnew = update(Cnew, Cold)
  cnew = Cnew - Cold
end
```

She discovered that the function was written a while ago, and in MATLAB. The function
receives the values `Cnew` and `Cold`, which indicate the *total*
number of calories burned up until that time frame.
The value `cnew` -is the number of calories burned in the minute. (Some other engineer -has cleverly figured out how many calories have been burned during the -time on the machine.) - -The new display will have twice as many dots, so the display can be -updated every 30 seconds and still display 10 minutes worth of data. What should the -`update` function now look like? - -Her first attempt was simply to rewrite the function in `Julia`: - -```julia; -function update₁(Cnew, Cold) - cnew = Cnew - Cold -end -``` - -This has the advantage that each "dot" still represents a calorie -burned, so that a user can still count the dots to see the total burned -in the past 10 minutes. - - -```julia; eval=false - - - - * * - ****** * - ************* * -``` - - -Sadly though, users didn't like it. Instead of a set of dots being, -say, 5 high, they were now 3 high and 2 high. It "looked" like they -were doing less work! What to do? - -The users actually were not responding to the number of dots, which -hadn't changed, but rather the *area* that they represented - and -this shrank in half. (It is much easier to visualize area than count dots when tired.) -How to adjust for that? - -Well our engineer knew - double the dots and count each as half a -calorie. This makes the "area" constant. She also generalized letting -`n` be the number of updates per minute, in anticipation of even -further improvements in the display technology: - -```julia; -function update(Cnew, Cold, n) - cnew = (Cnew - Cold) * n -end -``` - -Then the "area" represented by the dots stays fixed over this time frame. - -The engineer then thought a bit more, as the form of her answer seemed -familiar. She decides to parameterize it in terms of $t$ and found with -$h=1/n$: `c(t) = (C(t) - C(t-h))/h`. Ahh - the derivative -approximation. But then what is the "area"? 
It is no longer just the -sum of the dots, but in terms of the functions she finds that each -column represents $c(t)\cdot h$, and the sum is just $c(t_1)h + -c(t_2)h + \cdots c(t_n)h$ which looks like an approximate integral. - -If the display were to reach the modern age and replace LED "dots" -with a higher-pixel display, then the function to display would be $c(t) -= C'(t)$ and the area displayed would be $\int_{t-10}^t c(u) du$. - - -Thinking a bit harder, she knows that her `update` function is getting -$C(t)$, and displaying the *rate* of calorie burn leads to the area -displayed being interpretable as the total calories burned between $t$ -and $t-10$ (or $C(t)-C(t-10)$) by the fundamental theorem of calculus. - -## Questions - -###### Question - -If $F(x) = e^{x^2}$ is an antiderivative for $f$, find $\int_0^2 f(x) dx$. - -```julia; hold=true; echo=false -F(x) = exp(x^2) -val = F(2) - F(0) -numericq(val) -``` - -###### Question - -If $\sin(x) - x\cos(x)$ is an antiderivative for $x\sin(x)$, find the following integral $\int_0^\pi x\sin(x) dx$. - -```julia; hold=true; echo=false -F(x) = sin(x) - x*cos(x) -a, b= 0, pi -val = F(b) - F(a) -numericq(val) -``` - -###### Question - -Find an antiderivative then evaluate $\int_0^1 x(1-x) dx$. - -```julia; hold=true; echo=false -f(x) = x*(1-x) -a,b = 0, 1 -F(x) = x^2/2 - x^3/3 -val = F(b) - F(a) -numericq(val) -``` - -###### Question - -Use the fact that $[e^x]' = e^x$ to evaluate $\int_0^e (e^x - 1) dx$. - -```julia; hold=true; echo=false -f(x) = exp(x) - 1 -a, b = 0, exp(1) -F(x) = exp(x) - x -val = F(b) - F(a) -numericq(val) -``` - -###### Question - -Find the value of $\int_0^1 (1-x^2/2 + x^4/24) dx$. - -```julia; hold=true; echo=false -f(x) = 1 - x^2/2 + x^4/24 -a, b = 0, 1 -val, _ = quadgk(f, a, b) -numericq(val) -``` - -###### Question - -Using `SymPy`, what is an antiderivative for $x^2 \sin(x)$? 
- -```julia; hold=true; echo=false -choices = [ -"``-x^2\\cos(x)``", -"``-x^2\\cos(x) + 2x\\sin(x)``", -"``-x^2\\cos(x) + 2x\\sin(x) + 2\\cos(x)``" -] -answ = 3 -radioq(choices, answ, keep_order=true) -``` - - -###### Question - -Using `SymPy`, what is an antiderivative for $xe^{-x}$? - -```julia; hold=true; echo=false -choices = [ -"``-e^{-x}``", -"``-xe^{-x}``", -"``-(1+x) e^{-x}``", -"``-(1 + x + x^2) e^{-x}``" -] -answ = 3 -radioq(choices, answ, keep_order=true) -``` - -###### Question - -Using `SymPy`, integrate the function $\int_0^{2\pi} e^x \cdot \sin(x) dx$. - -```julia; hold=true; echo=false -@syms x -val = N(integrate(exp(x) * sin(x), (x, 0, 2pi))) -numericq(val) -``` - - -###### Question - -A particle has velocity $v(t) = 2t^2 - t$ between $0$ and $1$. If $x(0) = 0$, find the position $x(1)$. - -```julia; hold=true; echo=false -v(t) = 2t^2 - t -f(x) = quadgk(v, 0, x)[1] - 0 -numericq(f(1)) -``` - - -###### Question - -A particle has acceleration given by $\sin(t)$ between $0$ and -$\pi$. If the initial velocity is $v(0) = 0$, find $v(\pi/2)$. - -```julia; hold=true; echo=false -f(x) = quadgk(sin, 0, x)[1] - 0 -numericq(f(pi/2)) -``` - - -###### Question - -The position of a particle is given by $x(t) = \int_0^t g(u) du$, -where $x(0)=0$ and $g(u)$ is given by this piecewise linear graph: - -```julia; hold=true; echo=false -function g1(x) - if x < 2 - -1 + x - elseif 2 < x < 3 - 1 - else - 1 + (1/2)*(x-3) - end - end -plot(g1, 0, 5) -``` - -* The velocity of the particle is positive over: - -```julia; hold=true; echo=false -choices = [ -"It is always positive", -"It is always negative", -L"Between $0$ and $1$", -L"Between $1$ and $5$" -] -answ = 4 -radioq(choices, answ, keep_order=true) -``` - -* The position of the particle is $0$ at $t=0$ and: - -```julia; hold=true; echo=false -choices = [ -"``t=1``", -"``t=2``", -"``t=3``", -"``t=4``"] -answ = 2 -radioq(choices, answ, keep_order=true) -``` - -* The position of the particle at time $t=5$ is? 
- -```julia; hold=true; echo=false -val = 4 -numericq(val) -``` - -* On the interval $[2,3]$: - -```julia; hold=true; echo=false -choices = [ -L"The position, $x(t)$, stays constant", -L"The position, $x(t)$, increases with a slope of $1$", -L"The position, $x(t)$, increases quadratically from $-1/2$ to $1$", -L"The position, $x(t)$, increases quadratically from $0$ to $1$" -] -answ = 2 -radioq(choices, answ, keep_order=true) -``` - -###### Question - -Let $F(x) = \int_{t-10}^t f(u) du$ for $f(u)$ a positive, continuous function. What is $F'(t)$? - -```julia; hold=true; echo=false -choices = [ -"``f(t)``", -"``-f(t-10)``", -"``f(t) - f(t-10)``" -] -answ = 3 -radioq(choices, answ, keep_order=true) -``` - -###### Question - -Suppose $f(x) \geq 0$ and $F(x) = \int_0^x f(u) du$. $F(x)$ is continuous and so has a maximum value on the interval $[0,1]$ taken at some $c$ in $[0,1]$. It is - -```julia; hold=true; echo=false -choices = [ -"At a critical point", -L"At the endpoint $0$", -L"At the endpoint $1$"] -answ = 3 -radioq(choices, answ, keep_order=true) -``` - -###### Question - -Let $F(x) = \int_0^x f(u) du$, where $f(x)$ is given by the graph below. Identify the $x$ values of all *relative maxima* of $F(x)$. Explain why you know these are the values. - -```julia; hold=true; echo=false -xs = [0,1,2,3,4,5,6,7,8,9,10] -ys = [-1,0,1,0,-1,0,1/2, 0, 1/2, 0, -1] -plot(xs, ys , linewidth=3, legend=false, xticks=0:10) -``` - -```julia; hold=true; echo=false -choices = [ -"The derivative of ``F`` is ``f``, so by the first derivative test, ``x=1,5``", -"The derivative of ``F`` is ``f``, so by the first derivative test, ``x=3, 9``", -"The derivative of ``F`` is ``f``, so by the second derivative test, ``x=7``", -"The graph of ``f`` has relative maxima at ``x=2,6,8``" -] -answ = 2 -radioq(choices, answ) -``` - - -###### Question - -Suppose $f(x)$ is monotonically decreasing with $f(0)=1$, $f(1/2) = 0$ and $f(1) = -1$. Let $F(x) = \int_0^x f(u) du$. 
$F(x)$ is continuous and so has a maximum value on the interval $[0,1]$ taken at some $c$ in $[0,1]$. It is

```julia; hold=true; echo=false
choices = [
L"At a critical point, either $0$ or $1$",
L"At a critical point, $1/2$",
L"At the endpoint $0$",
L"At the endpoint $1$"]
answ = 2
radioq(choices, answ, keep_order=true)
```

###### Question

Barrow presented a version of the fundamental theorem of calculus in a
1670 volume edited by Newton, Barrow's student
(cf. [Wagner](http://www.maa.org/sites/default/files/0746834234133.di020795.02p0640b.pdf)). His version can be stated as follows (cf. [Jardine](http://www.maa.org/publications/ebooks/mathematical-time-capsules)):

Consider the following figure, where $f$ is a strictly increasing
function with $f(0) = 0$ and $x > 0$. The function $A(x) = \int_0^x
f(u) du$ is also plotted. The point $Q$ is $f(x)$, and the point $P$
is $A(x)$. The point $T$ is chosen so that the length between $T$
and $x$ times the length between $Q$ and $x$ equals the length from
$P$ to $x$. ($\lvert Tx \rvert \cdot \lvert Qx \rvert = \lvert Px
\rvert$.) Barrow showed that the line segment $PT$ is tangent to the graph of
$A(x)$. This figure illustrates the labeling for some function:

```julia; hold=true; echo=false
f(x) = x^(2/3)
x = 2
A(x) = quadgk(f, 0, x)[1]
T = x - A(x)/f(x)   # T satisfies |Tx| ⋅ f(x) = A(x)
Q = f(x)
P = A(x)
secpt = u -> P/(x-T) * (u-T)
p = plot(f, 0, x + 1/4, legend=false)
plot!(p, A, 0, x + 1/4, color=:red)
scatter!(p, [T, x, x, x], [0, 0, Q, P], color=:orange)
annotate!(p, collect(zip([T, x, x+.1, x+.1], [0-.15, 0-.15, Q-.1, P], ["T", "x", "Q", "P"])))
plot!(p, [T-1/4, x+1/4], map(secpt, [T-1/4, x + 1/4]), color=:orange)
plot!(p, [T, x, x], [0, 0, P], color=:green)

p
```

The fact that $\lvert Tx \rvert \cdot \lvert Qx \rvert = \lvert Px
\rvert$ says what in terms of $f(x)$, $A(x)$ and $A'(x)$?
- -```julia; hold=true; echo=false -choices = [ -"``\\lvert Tx \\rvert \\cdot f(x) = A(x)``", -"``A(x) / \\lvert Tx \\rvert = A'(x)``", -"``A(x) \\cdot A'(x) = f(x)``" -] -answ = 1 -radioq(choices, answ, keep_order=true) -``` - -The fact that $\lvert PT \rvert$ is tangent says what in terms of $f(x)$, $A(x)$ and $A'(x)$? - - -```julia; hold=true; echo=false -choices = [ -"``\\lvert Tx \\rvert \\cdot f(x) = A(x)``", -"``A(x) / \\lvert Tx \\rvert = A'(x)``", -"``A(x) \\cdot A'(x) = f(x)``" -] -answ = 2 -radioq(choices, answ, keep_order=true) -``` - -Solving, we get: - -```julia; hold=true; echo=false -choices = [ -"``A'(x) = f(x)``", -"``A(x) = A^2(x) / f(x)``", -"``A'(x) = A(x)``", -"``A(x) = f(x)``" -] -answ = 1 -radioq(choices, answ) -``` - -###### Question - -According to [Bressoud](http://www.math.harvard.edu/~knill/teaching/math1a_2011/exhibits/bressoud/) "Newton observes that the rate of change of an accumulated quantity is the rate at which that quantity is accumulating". Which part of the FTC does this refer to: - -```julia; hold=true; echo=false -choices = [ -L"Part 1: $[\int_a^x f(u) du]' = f$", -L"Part 2: $\int_a^b f(u) du = F(b)- F(a)$."] -answ=1 -radioq(choices, answ, keep_order=true) -``` - - -## More on SymPy's `integrate` - -Finding the value of a definite integral through the fundamental theorem of calculus relies on the algebraic identification of an antiderivative. This is difficult to do by hand and by computer, and is complicated by the fact that not every [elementary ](https://en.wikipedia.org/wiki/Elementary_function)function has an elementary antiderivative. -`SymPy`'s documentation on integration indicates that several different means to integrate a function are used internally. As it is of interest here, it is copied with just minor edits below (from an older version of SymPy): - - -#### Simple heuristics (based on pattern matching and integral table): - -* most frequently used functions (e.g. 
polynomials, products of trigonometric functions)

#### Integration of rational functions:

* A complete algorithm for integrating rational functions is
  implemented (the Lazard-Rioboo-Trager algorithm). The
  algorithm also uses the partial fraction decomposition
  algorithm implemented in `apart` as a preprocessor to make
  this process faster. Note that the integral of a rational
  function is always elementary, but in general, it may include
  a `RootSum`.

#### Full Risch algorithm:

* The Risch algorithm is a complete decision procedure for integrating
  elementary functions, which means that given any elementary
  function, it will either compute an elementary
  antiderivative, or else prove that none exists. Currently,
  part of the transcendental case is implemented, meaning
  elementary integrals containing exponentials, logarithms, and
  (soon!) trigonometric functions can be computed. The
  algebraic case, e.g., functions containing roots, is much
  more difficult and is not implemented yet.

* If the routine fails (because the integrand is not elementary, or
  because a case is not implemented yet), it continues on to
  the next algorithms below. If the routine proves that the
  integral is nonelementary, it still moves on to the
  algorithms below, because we might be able to find a
  closed-form solution in terms of special functions. If
  `risch=true`, however, it will stop here.

#### The Meijer G-Function algorithm:

* This algorithm works by first rewriting the integrand in terms of
  the very general Meijer G-function (`meijerg` in `SymPy`),
  integrating it, and then rewriting the result back, if
  possible. This algorithm is particularly powerful for
  definite integrals (which is actually part of a different
  method of `Integral`), since it can compute closed-form
  solutions of definite integrals even when no closed-form
  indefinite integral exists. It is capable of
  computing many indefinite integrals as well.

* Another advantage of this method is that it can use some results
  about the Meijer G-function to give a result in terms of a
  `Piecewise` expression, which allows one to express conditionally
  convergent integrals.

* Setting `meijerg=true` will cause `integrate` to use only this
  method.

#### The "manual integration" algorithm:

* This algorithm tries to mimic how a person would find an
  antiderivative by hand, for example by looking for a
  substitution or applying integration by parts. This algorithm
  does not handle as many integrands but can return results in a
  more familiar form.

* Sometimes this algorithm can evaluate parts of an integral; in
  this case `integrate` will try to evaluate the rest of the
  integrand using the other methods here.

* Setting `manual=true` will cause `integrate` to use only this
  method.

#### The Heuristic Risch algorithm:

* This is a heuristic version of the Risch algorithm, meaning that
  it is not deterministic. It is tried as a last resort because
  it can be very slow. It is still used because not enough of the
  full Risch algorithm is implemented, so there are still some
  integrals that can only be computed using this method. The goal
  is to implement enough of the Risch and Meijer G-function methods
  so that this can be deleted.

* Setting `heurisch=true` will cause `integrate` to use only this
  method. Set `heurisch=false` to not use it.
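
From `Julia`, these method-selecting keywords can be tried directly. A small sketch, assuming, as with other `SymPy` calls, that keyword arguments are forwarded to the underlying Python function; the integrand is an arbitrary choice, and the different methods may present equivalent antiderivatives in different forms:

```julia
using SymPy

@syms x
integrate(x * exp(-x), x)               # default: the methods above are tried in turn
integrate(x * exp(-x), x, manual=true)  # use only the "manual integration" algorithm
integrate(x * exp(-x), x, risch=true)   # use only the full Risch algorithm
```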
diff --git a/CwJ/integrals/improper_integrals.jmd b/CwJ/integrals/improper_integrals.jmd deleted file mode 100644 index 778d578..0000000 --- a/CwJ/integrals/improper_integrals.jmd +++ /dev/null @@ -1,507 +0,0 @@ -# Improper Integrals - -This section uses these add-on packages: - -```julia -using CalculusWithJulia -using Plots -using SymPy -using QuadGK -``` - -```julia; echo=false; results="hidden" -using CalculusWithJulia.WeaveSupport - -const frontmatter = ( - title = "Improper Integrals", - description = "Calculus with Julia: Improper Integrals", - tags = ["CalculusWithJulia", "integrals", "improper integrals"], -); -fig_size=(800, 600) - -nothing -``` - ----- - -A function $f(x)$ is Riemann integrable over an interval $[a,b]$ if -some limit involving Riemann sums exists. This limit will fail to exist if -$f(x) = \infty$ in $[a,b]$. As well, the Riemann sum idea is undefined if either $a$ -or $b$ (or both) are infinite, so the limit won't exist in this case. - -To define integrals with either functions having singularities or infinite domains, the idea of an improper integral is -introduced with definitions to handle the two cases above. - -```julia; hold=true; echo=false; cache=true -### {{{sqrt_graph}}} - -function make_sqrt_x_graph(n) - - b = 1 - a = 1/2^n - xs = range(1/2^8, stop=b, length=250) - x1s = range(a, stop=b, length=50) - f(x) = 1/sqrt(x) - val = N(integrate(f, 1/2^n, b)) - title = "area under f over [1/$(2^n), $b] is $(rpad(round(val, digits=2), 4))" - - plt = plot(f, range(a, stop=b, length=251), xlim=(0,b), ylim=(0, 15), legend=false, size=fig_size, title=title) - plot!(plt, [b, a, x1s...], [0, 0, map(f, x1s)...], linetype=:polygon, color=:orange) - - plt - - -end -caption = L""" - -Area under $1/\sqrt{x}$ over $[a,b]$ increases as $a$ gets closer to $0$. Will it grow unbounded or have a limit? 
- -""" -n = 10 -anim = @animate for i=1:n - make_sqrt_x_graph(i) -end - -imgfile = tempname() * ".gif" -gif(anim, imgfile, fps = 1) - -ImageFile(imgfile, caption) -``` - -## Infinite domains - -Let $f(x)$ be a reasonable function, so reasonable that for any $a < -b$ the function is Riemann integrable, meaning $\int_a^b f(x)dx$ -exists. - -What needs to be the case so that we can discuss the integral over the entire real number line? - -Clearly something. The function $f(x) = 1$ is reasonable by the idea -above. Clearly the integral over and $[a,b]$ is just $b-a$, but the -limit over an unbounded domain would be $\infty$. Even though limits -of infinity can be of interest in some cases, not so here. What will -ensure that the area is finite over an infinite region? - - -Or is that even the right question. Now consider $f(x) = \sin(\pi -x)$. Over every interval of the type $[-2n, 2n]$ the area is $0$, and -over any interval, $[a,b]$ the area never gets bigger than $2$. But -still this function does not have a well defined area on an infinite -domain. - -The right question involves a limit. Fix a finite $a$. We define the definite integral over $[a,\infty)$ to be - -```math -\int_a^\infty f(x) dx = \lim_{M \rightarrow \infty} \int_a^M f(x) dx, -``` - -when the limit exists. Similarly, we define the definite integral over $(-\infty, a]$ through - -```math -\int_{-\infty}^a f(x) dx = \lim_{M \rightarrow -\infty} \int_M^a f(x) dx. -``` - -For the interval $(-\infty, \infty)$ we have need *both* these limits to exist, and then: - - -```math -\int_{-\infty}^\infty f(x) dx = \lim_{M \rightarrow -\infty} \int_M^a f(x) dx + \lim_{M \rightarrow \infty} \int_a^M f(x) dx. -``` - -!!! note - When the integral exists, it is said to *converge*. If it doesn't exist, it is said to *diverge*. 
- -##### Examples - - -* The function $f(x) = 1/x^2$ is integrable over $[1, \infty)$, as this limit exists: - -```math -\lim_{M \rightarrow \infty} \int_1^M \frac{1}{x^2}dx = \lim_{M \rightarrow \infty} -\frac{1}{x}\big|_1^M -= \lim_{M \rightarrow \infty} 1 - \frac{1}{M} = 1. -``` - -* The function $f(x) = 1/x^{1/2}$ is not integrable over $[1, \infty)$, as this limit fails to exist: - -```math -\lim_{M \rightarrow \infty} \int_1^M \frac{1}{x^{1/2}}dx = \lim_{M \rightarrow \infty} \frac{x^{1/2}}{1/2}\big|_1^M -= \lim_{M \rightarrow \infty} 2\sqrt{M} - 2 = \infty. -``` - -The limit is infinite, so does not exist except in an extended sense. - -* The function $x^n e^{-x}$ for $n = 1, 2, \dots$ is integrable over $[0,\infty)$. - -Before showing this, we recall the fundamental theorem of calculus. The limit existing is the same as saying the limit of $F(M) - F(a)$ exists for an antiderivative of $f(x)$. - -For this particular problem, it can be shown by integration by parts that for positive, integer values of $n$ an antiderivative exists of the form $F(x) = p(x)e^{-x}$, where $p(x)$ is a polynomial of degree $n$. But we've seen that for any $n>0$, $\lim_{x \rightarrow \infty} x^n e^{-x} = 0$, so the same is true for any polynomial. So, $\lim_{M \rightarrow \infty} (F(M) - F(0)) = -F(0)$. - - -* The function $e^x$ is integrable over $(-\infty, a]$ but not -$[a, \infty)$ for any finite $a$. This is because $F(x) = e^x$ is an antiderivative and this has a limit as $x$ goes to $-\infty$, but not $\infty$. - - -* Let $f(x) = x e^{-x^2}$. This function has an integral over $[0, \infty)$ and more generally $(-\infty, \infty)$. To see this, we note that as it is an odd function, the area from $0$ to $M$ is the opposite sign of that from $-M$ to $0$. So $\lim_{M \rightarrow \infty} (F(M) - F(0)) = -\lim_{M \rightarrow -\infty} (F(0) - F(M))$. We only then need to investigate the one limit.
But we can see by substitution with $u=x^2$ that an antiderivative is $F(x) = (-1/2) \cdot e^{-x^2}$. Clearly, $\lim_{M \rightarrow \infty}F(M) = 0$, so the answer is well defined, and the area from $0$ to $\infty$ is just $1/2$. From $-\infty$ to $0$ it is $-1/2$ and the total area is $0$, as the two sides "cancel" out. - -* Let $f(x) = \sin(x)$. Even though $\lim_{M \rightarrow \infty} (F(M) - F(-M) ) = 0$, this function is not integrable. The fact is we need *both* the limit $F(M)$ and $F(-M)$ to exist as $M$ goes to $\infty$. In this case, even though the area cancels if $\infty$ is approached at the same rate, this isn't sufficient to guarantee the two limits exist independently. - - - -* Will the function $f(x) = 1/(x\cdot(\log(x))^2)$ have an integral over $[e, \infty)$? - -We first find an antiderivative using the $u$-substitution $u(x) = \log(x)$: - -```math -\int_e^M \frac{1}{x \log(x)^{2}} dx -= \int_{\log(e)}^{\log(M)} \frac{1}{u^{2}} du -= \frac{-1}{u} \big|_{1}^{\log(M)} -= \frac{-1}{\log(M)} - \frac{-1}{1} -= 1 - \frac{1}{\log(M)}. -``` - -As $M$ goes to $\infty$, this will converge to $1$. - - - -* The sinc function $f(x) = \sin(\pi x)/(\pi x)$ does not have a nice antiderivative. Seeing if the limit exists is a bit of a problem. However, this function is important enough that there is a built-in function, `Si`, that computes $\int_0^x \sin(u)/u \, du$. This function can be used through `sympy.Si(...)`: - -```julia; -@syms M -limit(sympy.Si(M), M => oo) -``` - -### Numeric integration - -The `quadgk` function (available through `QuadGK`) is able to accept `Inf` and `-Inf` as endpoints of the interval. For example, this will integrate $e^{-x^2/2}$ over the real line: - -```julia; -f(x) = exp(-x^2/2) -quadgk(f, -Inf, Inf) -``` - -(It may not be obvious, but this is $\sqrt{2\pi}$.) - -## Singularities - -Suppose $\lim_{x \rightarrow c}f(x) = \infty$ or $-\infty$.
Then a Riemann sum that contains an interval including $c$ will not be finite if the point chosen in the interval is $c$. Though we could choose another point, this is not enough as the definition must hold for any choice of the $c_i$. - -However, if $c$ is isolated, we can get close to $c$ and see how the area changes. - -Suppose $a < c$. We define $\int_a^c f(x) dx = \lim_{M \rightarrow c-} \int_a^M f(x) dx$. If this limit exists, the definite integral with endpoint $c$ is well defined. Similarly, the integral from $c$ to $b$, where $b > c$, can be defined by a right limit going to $c$. The integral from $a$ to $b$ will exist if both the limits are finite. - -##### Examples - -* Consider the example of the initial illustration, $f(x) = 1/\sqrt{x}$ at $0$. Here $f(0)= \infty$, so the usual notion of a limit won't apply to $\int_0^1 f(x) dx$. However, - -```math -\lim_{M \rightarrow 0+} \int_M^1 \frac{1}{\sqrt{x}} dx -= \lim_{M \rightarrow 0+} \frac{\sqrt{x}}{1/2} \big|_M^1 -= \lim_{M \rightarrow 0+} 2(1) - 2\sqrt{M} = 2. -``` - -!!! note - The cases $f(x) = x^{-n}$ for $n > 0$ are tricky to keep straight. For $n > 1$, the functions can be integrated over $[1,\infty)$, but not $(0,1]$. For $0 < n < 1$, the functions can be integrated over $(0,1]$ but not $[1, \infty)$. - -* Now consider $f(x) = 1/x$. Is this integral $\int_0^1 1/x \cdot dx$ defined? It will be *if* this limit exists: - -```math -\lim_{M \rightarrow 0+} \int_M^1 \frac{1}{x} dx -= \lim_{M \rightarrow 0+} \log(x) \big|_M^1 -= \lim_{M \rightarrow 0+} \log(1) - \log(M) = \infty. -``` - -As the limit does not exist, the function is not integrable around $0$. - -* `SymPy` may give answers which do not coincide with our definitions, as it uses complex numbers as a default assumption.
In this case it returns the proper answer when integrated from ``0`` to ``1`` and `NaN` for an integral over ``(-1,1)``: - -```julia; -@syms x -integrate(1/x, (x, 0, 1)), integrate(1/x, (x, -1, 1)) -``` - -* Suppose you know $\int_1^\infty x^2 f(x) dx$ exists for some $f(x) \geq 0$. Does this imply $\int_0^1 f(1/x) dx$ exists? - -We need to consider the limit of $\int_M^1 f(1/x) dx$. We try the $u$-substitution $u(x) = 1/x$. This gives $du = -(1/x^2)dx = -u^2 dx$, or $dx = -du/u^2$. So, the substitution becomes: - -```math -\int_M^1 f(1/x) dx = \int_{1/M}^{1/1} f(u) \left(\frac{-1}{u^2}\right) du = \int_1^{1/M} \frac{f(u)}{u^2} du. -``` - -But the limit as $M \rightarrow 0$ of $1/M$ is the same going to $\infty$, and for $u \geq 1$ we have $0 \leq f(u)/u^2 \leq u^2 f(u)$, so the right side will converge by comparison with the assumption. Thus we get $f(1/x)$ is integrable over $(0,1]$. - -### Numeric integration - -So far our use of the `quadgk` function specified the region to -integrate via `a`, `b`, as in `quadgk(f, a, b)`. In fact, it can -specify values in between at which the function should not be -sampled. For example, were we to integrate $1/\sqrt{\lvert x\rvert}$ -over $[-1,1]$, we would want to avoid $0$ as a point to sample. Here -is how: - -```julia;hold=true -f(x) = 1 / sqrt(abs(x)) -quadgk(f, -1, 0, 1) -``` - -Just trying `quadgk(f, -1, 1)` leads to a `DomainError`, as `0` will -be one of the points sampled. The general call is like `quadgk(f, a, b, c, d,...)` -which integrates over $(a,b)$ and $(b,c)$ and $(c,d)$, -$\dots$. The algorithm is not supposed to evaluate the function at the -endpoints of the intervals. - -## Probability applications - -A probability density is a function $f(x) \geq 0$ which is integrable -on $(-\infty, \infty)$ and for which $\int_{-\infty}^\infty f(x) dx =1$. The cumulative distribution function is defined by $F(x)=\int_{-\infty}^x f(u) du$. - -Probability densities are good examples of using improper integrals. - -* Show that $f(x) = (1/\pi) (1/(1 + x^2))$ is a probability density function. - -We need to show that the integral exists and is $1$.
For this, we use the fact that $(1/\pi) \cdot \tan^{-1}(x)$ is an antiderivative. Then we have: - -$\lim_{M \rightarrow \infty} F(M) = (1/\pi) \cdot \pi/2$ and as -$\tan^{-1}(x)$ is odd, we must have $F(-\infty) = \lim_{M \rightarrow --\infty} F(M) = -(1/\pi) \cdot \pi/2$. All told, $F(\infty) - -F(-\infty) = 1/2 - (-1/2) = 1$. - -* Show that $f(x) = 1/(b-a)$ for $a \leq x \leq b$ and $0$ otherwise is a probability density. - -The integral from $-\infty$ to $a$ of $f(x)$ is just an integral of the constant $0$, so will be $0$. (This is the only constant with finite area over an infinite domain.) Similarly, the integral from $b$ to $\infty$ will be $0$. This means: - -```math -\int_{-\infty}^\infty f(x) dx = \int_a^b \frac{1}{b-a} dx = 1. -``` - -(One might also comment that ``f`` is Riemann integrable on any ``[-M, M]`` despite being discontinuous at ``a`` and ``b``.) - -* Show that if $f(x)$ is a probability density then so is $f(x-c)$ for any $c$. - -We have by the $u$-substitution - -```math -\int_{-\infty}^\infty f(x-c)dx = \int_{u(-\infty)}^{u(\infty)} f(u) du = \int_{-\infty}^\infty f(u) du = 1. -``` - -The key is that we can use the regular $u$-substitution formula -provided $\lim_{M \rightarrow \infty} u(M) = u(\infty)$ is -defined. (The *informal* notation $u(\infty)$ is defined by that -limit.) - -* If $f(x)$ is a probability density, then so is $(1/h) f((x-c)/h)$ for any $c, h > 0$. - -Again, by a $u$-substitution, now with $u(x) = (x-c)/h$, we have $du = (1/h) \cdot dx$ and the result follows just as before: - - -```math -\int_{-\infty}^\infty \frac{1}{h}f(\frac{x-c}{h})dx = \int_{u(-\infty)}^{u(\infty)} f(u) du = \int_{-\infty}^\infty f(u) du = 1. -``` - - -* If $F(x) = 1 - e^{-x}$, for $x \geq 0$, and $0$ otherwise, find $f(x)$. - -We want to just say $F'(x)= e^{-x}$ so $f(x) = e^{-x}$. But some care -is needed. First, that isn't right. The derivative for $x<0$ of $F(x)$ -is $0$, so $f(x) = 0$ if $x < 0$. What about for $x>0$?
The derivative -is $e^{-x}$, but is that the right answer? $F(x) = \int_{-\infty}^x -f(u) du$, so we have to at least discuss if the $-\infty$ affects -things. In this case, and in general the answer is *no*. For any $x$ -we can find $M < x$ so that we have $F(x) = \int_{-\infty}^M f(u) du + \int_M^x f(u) du$. The first part -is a constant, so will have derivative $0$, the second will have -derivative $f(x)$, if the derivative exists (and it will exist at $x$ -if $f$ is continuous in a neighborhood of $x$). - -Finally, at $x=0$ we have an issue, as $F'(0)$ does not exist. The -left limit of the secant line approximation is $0$, the right limit of -the secant line approximation is $1$. So, we can take $f(x) = e^{-x}$ -for $x > 0$ and $0$ otherwise, noting that redefining $f(x)$ at a -point will not affect the integral as long as the point is finite. - -## Questions - - -###### Question - -Is $f(x) = 1/x^{100}$ integrable around $0$? - -```julia; hold=true; echo=false -yesnoq("no") -``` - -###### Question - -Is $f(x) = 1/x^{1/3}$ integrable around $0$? - -```julia; hold=true; echo=false -yesnoq("yes") -``` - -###### Question - -Is $f(x) = x\cdot\log(x)$ integrable on $[1,\infty)$? - -```julia; hold=true; echo=false -yesnoq("no") -``` - -###### Question - -Is $f(x) = \log(x)/ x$ integrable on $[1,\infty)$? - -```julia; hold=true; echo=false -yesnoq("no") -``` - -###### Question - -Is $f(x) = \log(x)$ integrable on $[1,\infty)$? - -```julia; hold=true; echo=false -yesnoq("no") -``` - -###### Question - -Compute the integral $\int_0^\infty 1/(1+x^2) dx$. - -```julia; hold=true; echo=false -f(x) = 1/(1+x^2) -a, b= 0, Inf -val, _ = quadgk(f, a, b) -numericq(val) -``` - -###### Question - -Compute the integral $\int_1^\infty \log(x)/x^2 dx$. - -```julia; hold=true; echo=false -f(x) =log(x)/x^2 -a, b= 1, Inf -val, _ = quadgk(f, a, b) -numericq(val) -``` - -###### Question - -Compute the integral $\int_0^2 (x-1)^{2/3} dx$.
- -```julia; hold=true; echo=false -f(x) = cbrt((x-1)^2) -val, _ = quadgk(f , 0, 1, 2) -numericq(val) -``` - -###### Question - -From the relationship that if $0 \leq f(x) \leq g(x)$ then $\int_a^b f(x) dx \leq \int_a^b g(x) dx$ it can be deduced that - -* if $\int_a^\infty f(x) dx$ diverges, then so does $\int_a^\infty g(x) dx$. -* if $\int_a^\infty g(x) dx$ converges, then so does $\int_a^\infty f(x) dx$. - -Let $f(x) = \lvert \sin(x)/x^2 \rvert$. - -What can you say about $\int_1^\infty f(x) dx$, as $f(x) \leq 1/x^2$ on $[1, \infty)$? - - - -```julia; hold=true; echo=false -choices =[ -"It is convergent", -"It is divergent", -"Can't say"] -answ = 1 -radioq(choices, answ, keep_order=true) -``` - - ----- - -Let $f(x) = \lvert \sin(x) \rvert / x$. - -What can you say about $\int_1^\infty f(x) dx$, as $f(x) \leq 1/x$ on $[1, \infty)$? - -```julia; hold=true; echo=false -choices =[ -"It is convergent", -"It is divergent", -"Can't say"] -answ = 3 -radioq(choices, answ, keep_order=true) -``` - ----- - -Let $f(x) = 1/\sqrt{x^2 - 1}$. -What can you say about $\int_1^\infty f(x) dx$, as $f(x) \geq 1/x$ on $[1, \infty)$? - - - - -```julia; hold=true; echo=false -choices =[ -"It is convergent", -"It is divergent", -"Can't say"] -answ = 2 -radioq(choices, answ, keep_order=true) -``` - ----- - - -Let $f(x) = 1 + 4x^2$. -What can you say about $\int_1^\infty f(x) dx$, as $f(x) \geq 1$ on $[1, \infty)$? - -```julia; hold=true; echo=false -choices =[ -"It is convergent", -"It is divergent", -"Can't say"] -answ = 2 -radioq(choices, answ, keep_order=true) -``` - ----- - - -Let $f(x) = \lvert \sin(x)^{10}\rvert/e^x$. -What can you say about $\int_1^\infty f(x) dx$, as $f(x) \leq e^{-x}$ on $[1, \infty)$?
- -```julia; hold=true; echo=false -choices =[ -"It is convergent", -"It is divergent", -"Can't say"] -answ = 1 -radioq(choices, answ, keep_order=true) -``` - -###### Question - -The difference between "blowing up" at $0$ versus being integrable at -$\infty$ can be seen to be related through the $u$-substitution -$u=1/x$. With this $u$-substitution, what becomes of $\int_0^1 x^{-2/3} dx$? - -```julia; hold=true; echo=false -choices = [ -"``\\int_1^\\infty u^{2/3}/u^2 \\cdot du``", -"``\\int_0^1 u^{2/3} \\cdot du``", -"``\\int_0^\\infty 1/u \\cdot du``" -] -answ = 1 -radioq(choices, answ) -``` - -###### Question - -The antiderivative of $f(x) = 1/\pi \cdot 1/\sqrt{x(1-x)}$ is $F(x)=(2/\pi)\cdot \sin^{-1}(\sqrt{x})$. - -Find $\int_0^1 f(x) dx$. - -```julia; hold=true; echo=false -f(x) = 1/pi * 1/sqrt(x*(1-x)) -a, b= 0, 1 -val, _ = quadgk(f, a, b) -numericq(val) -``` diff --git a/CwJ/integrals/integration_by_parts.jmd b/CwJ/integrals/integration_by_parts.jmd deleted file mode 100644 index 28968d5..0000000 --- a/CwJ/integrals/integration_by_parts.jmd +++ /dev/null @@ -1,657 +0,0 @@ -# Integration By Parts - - -This section uses these add-on packages: - -```julia -using CalculusWithJulia -using Plots -using SymPy -``` - -```julia; echo=false; results="hidden" -using CalculusWithJulia.WeaveSupport -using QuadGK - -frontmatter = ( - title = "Integration By Parts", - description = "Calculus with Julia: Integration By Parts", - tags = ["CalculusWithJulia", "integrals", "integration by parts"], -); - -nothing -``` - ----- - -So far we have seen that the *derivative* rules lead to *integration rules*. In particular: - -* The sum rule $[au(x) + bv(x)]' = au'(x) + bv'(x)$ gives rise to an - integration rule: $\int (au(x) + bv(x))dx = a\int u(x)dx + b\int - v(x)dx$. (That is, the linearity of the derivative means the - integral has linearity.) - -* The chain rule $[f(g(x))]' = f'(g(x)) g'(x)$ gives $\int_a^b - f(g(x))g'(x)dx=\int_{g(a)}^{g(b)}f(x)dx$.
That is, substitution - reverses the chain rule. - -Now we turn our attention to the implications of the *product rule*: $[uv]' = u'v + uv'$. The resulting technique is called integration by parts. - - -The following illustrates the integration of $(uv)'$ over -$[a,b]$ [original](http://en.wikipedia.org/wiki/Integration_by_parts#Visualization). - -```julia; echo=false -let - ## parts picture - u(x) = sin(x*pi/2) - v(x) = x - xs = range(0, stop=1, length=50) - a,b = 1/4, 3/4 - p = plot(u, v, 0, 1, legend=false) - plot!(p, zero, 0, 1) - scatter!(p, [u(a), u(b)], [v(a), v(b)], color=:orange, markersize=5) - - plot!(p, [u(a),u(a),0, 0, u(b),u(b),u(a)], - [0, v(a), v(a), v(b), v(b), 0, 0], - linetype=:polygon, fillcolor=:orange, alpha=0.25) - annotate!(p, [(0.65, .25, "A"), (0.4, .55, "B")]) - annotate!(p, [(u(a),v(a) + .08, "(u(a),v(a))"), (u(b),v(b)+.08, "(u(b),v(b))")]) -end -``` - -The figure is a parametric plot of $(u,v)$ with the points $(u(a), -v(a))$ and $(u(b), v(b))$ marked. The difference $u(b)v(b) - u(a)v(a) -= u(x)v(x) \mid_a^b$ is shaded. This area breaks into two pieces, -$A$ and $B$, partitioned by the curve. If $u$ is increasing and the -curve is parameterized by $t \rightarrow u^{-1}(t)$, then -$A=\int_{u(a)}^{u(b)} v(u^{-1}(t))dt$. A $u$-substitution -with $t = u(x)$ changes this into the integral $\int_a^b v(x) u'(x) -dx$. Similarly, for increasing $v$, it can be seen that $B=\int_a^b -u(x) v'(x) dx$. This suggests a relationship between the integral of -$u v'$, the integral of $u' v$ and the value $u(b)v(b) - u(a)v(a)$. - - - - -In terms of formulas, by the fundamental theorem of calculus: - -```math -u(x)\cdot v(x)\big|_a^b = \int_a^b [u(x) v(x)]' dx = \int_a^b u'(x) \cdot v(x) dx + \int_a^b u(x) \cdot v'(x) dx. -``` - -This is re-expressed as - -```math -\int_a^b u(x) \cdot v'(x) dx = u(x) \cdot v(x)\big|_a^b - \int_a^b v(x) \cdot u'(x) dx, -``` - -or, more informally, as $\int udv = uv - \int v du$.
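The identity can be checked symbolically for a concrete pair. A quick sketch with `SymPy` (loaded above), taking $u=x$ and $dv=\cos(x)dx$, choices made just for illustration:

```julia; hold=true
@syms x
u = x;  v = sin(x)                          # v is an antiderivative, as dv = cos(x)dx
lhs = integrate(u * diff(v, x), x)          # ∫ u dv
rhs = u*v - integrate(v * diff(u, x), x)    # uv - ∫ v du
simplify(lhs - rhs)                         # 0, up to a constant of integration
```

Any other differentiable pair would work the same way, the two sides agreeing up to a constant.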
- - - -This can sometimes be confusingly written as: - -```math -\int f(x) g'(x) dx = f(x)g(x) - \int f'(x) g(x) dx. -``` - -(The confusion coming from the fact that the indefinite integrals are only defined up to a constant.) - -How does this help? It allows us to differentiate parts of an integral in hopes it makes the result easier to integrate. - -An illustration can clarify. - -Consider the integral $\int_0^\pi x\sin(x) dx$. If we let $u=x$ and $dv=\sin(x) dx$, then $du = 1dx$ and $v=-\cos(x)$. The above then says: - -```math -\begin{align*} -\int_0^\pi x\sin(x) dx &= \int_0^\pi u dv\\ -&= uv\big|_0^\pi - \int_0^\pi v du\\ -&= x \cdot (-\cos(x)) \big|_0^\pi - \int_0^\pi (-\cos(x)) dx\\ -&= \pi (-\cos(\pi)) - 0(-\cos(0)) + \int_0^\pi \cos(x) dx\\ -&= \pi + \sin(x)\big|_0^\pi\\ -&= \pi. -\end{align*} -``` - -The technique means one part is differentiated and one part -integrated. The art is to break the integrand up into a piece that gets -easier through differentiation and a piece that doesn't get much -harder through integration. - -#### Examples - -Consider $\int_1^2 x \log(x) dx$. We might try differentiating the $\log(x)$ term, so we set -```math -u=\log(x) \text{ and } dv=xdx -``` - -Then we get - -```math -du = \frac{1}{x} dx \text{ and } v = \frac{x^2}{2}. -``` - -Putting together gives: - -```math -\begin{align*} -\int_1^2 x \log(x) dx -&= (\log(x) \cdot \frac{x^2}{2}) \big|_1^2 - \int_1^2 \frac{x^2}{2} \frac{1}{x} dx\\ -&= (2\log(2) - 0) - (\frac{x^2}{4})\big|_1^2\\ -&= 2\log(2) - (1 - \frac{1}{4}) \\ -&= 2\log(2) - \frac{3}{4}. 
-\end{align*} -``` - -##### Example - -This related problem, ``\int \log(x) dx``, uses the same idea, though perhaps harder to see at first glance, as setting `dv=dx` is almost too simple to try: - -```math -\begin{align*} -u &= \log(x) & dv &= dx\\ -du &= \frac{1}{x}dx & v &= x -\end{align*} -``` - - -```math -\begin{align*} -\int \log(x) dx -&= \int u dv\\ -&= uv - \int v du\\ -&= (\log(x) \cdot x) - \int x \cdot \frac{1}{x} dx\\ -&= x \log(x) - \int dx\\ -&= x \log(x) - x -\end{align*} -``` - -Were this a definite integral problem, we would have written: - -```math -\int_a^b \log(x) dx = (x\log(x))\big|_a^b - \int_a^b dx = (x\log(x) - x)\big|_a^b. -``` - -##### Example - -Sometimes integration by parts is used two or more times. Here we let $u=x^2$ and $dv = e^x dx$: - - -```math -\int_a^b x^2 e^x dx = (x^2 \cdot e^x)\big|_a^b - \int_a^b 2x e^x dx. -``` - - -But we can do $\int_a^b x e^xdx$ the same way: - -```math -\int_a^b x e^x = (x\cdot e^x)\big|_a^b - \int_a^b 1 \cdot e^xdx = (xe^x - e^x)\big|_a^b. -``` - -Combining gives the answer: - -```math -\int_a^b x^2 e^x dx -= (x^2 \cdot e^x)\big|_a^b - 2( (xe^x - e^x)\big|_a^b ) = -e^x(x^2 - 2x + 2) \big|_a^b. -``` - - -In fact, it isn't hard to see that an integral of $x^m e^x$, $m$ a positive integer, can be handled in this manner. For example, when $m=10$, `SymPy` gives: - -```julia; -@syms 𝒙 -integrate(𝒙^10 * exp(𝒙), 𝒙) -``` - - -The general answer is $\int x^n e^xdx = p(x) e^x$, where $p(x)$ is a -polynomial of degree $n$. - -##### Example - -The same technique is attempted for this integral, but ends differently. First in the following we let $u=\sin(x)$ and $dv=e^x dx$: - -```math -\int e^x \sin(x)dx = \sin(x) e^x - \int \cos(x) e^x dx. -``` - -Now we let $u = \cos(x)$ and again $dv=e^x dx$: - -```math -\int e^x \sin(x)dx = \sin(x) e^x - \int \cos(x) e^x dx = \sin(x)e^x - \cos(x)e^x - \int \sin(x) e^x dx.
-``` - -But simplifying this gives: - -```math -\int e^x \sin(x)dx = - \int e^x \sin(x)dx + e^x(\sin(x) - \cos(x)). -``` - -Solving for the "unknown" $\int e^x \sin(x) dx$ gives: - -```math -\int e^x \sin(x) dx = \frac{1}{2} e^x (\sin(x) - \cos(x)). -``` - -##### Example - -Positive integer powers of trigonometric functions can be addressed by this technique. Consider -$\int \cos(x)^n dx$. We let $u=\cos(x)^{n-1}$ and $dv=\cos(x) dx$. Then $du = (n-1)\cos(x)^{n-2}(-\sin(x))dx$ and $v=\sin(x)$. So, - -```math -\begin{align*} -\int \cos(x)^n dx &= \cos(x)^{n-1} \cdot (\sin(x)) - \int (\sin(x)) ((n-1)\cos(x)^{n-2}(-\sin(x))) dx \\ -&= \sin(x) \cos(x)^{n-1} + (n-1)\int \sin^2(x) \cos(x)^{n-2} dx\\ -&= \sin(x) \cos(x)^{n-1} + (n-1)\int (1 - \cos(x)^2) \cos(x)^{n-2} dx\\ -&= \sin(x) \cos(x)^{n-1} + (n-1)\int \cos(x)^{n-2}dx - (n-1)\int \cos(x)^n dx. -\end{align*} -``` - -We can then solve for the unknown ($\int \cos(x)^{n}dx$) to get this *reduction formula*: - - -```math -\int \cos(x)^n dx = \frac{1}{n}\sin(x) \cos(x)^{n-1} + \frac{n-1}{n}\int \cos(x)^{n-2}dx. -``` - -This is called a reduction formula as it reduces the problem from an -integral with a power of $n$ to one with a power of $n - 2$, so could -be repeated until the remaining indefinite integral required knowing either -$\int \cos(x) dx$ (which is $\sin(x)$) or $\int \cos(x)^2 dx$, which, -by a double angle formula application, is $x/2 + \sin(2x)/4$. - -`SymPy` is quite able to do this repeated bookkeeping. For example with $n=10$: - -```julia; -integrate(cos(𝒙)^10, 𝒙) -``` - -##### Example - -The visual interpretation of integration by parts breaks area into two pieces; the one labeled "B" looks like it would be labeled "A" for an inverse function for $f$. Indeed, integration by parts gives a means to possibly find antiderivatives for inverse functions. - -Let $uv = x f^{-1}(x)$. Then we have $[uv]' = u'v + uv' = f^{-1}(x) + x [f^{-1}(x)]'$.
-So, up to a constant ``uv = \int [uv]'dx = \int f^{-1}(x) + \int x [f^{-1}(x)]'``. Re-expressing gives: - -```math -\begin{align*} -\int f^{-1}(x) dx -&= xf^{-1}(x) - \int x [f^{-1}(x)]' dx\\ -&= xf^{-1}(x) - \int f(u) du.\\ -\end{align*} -``` - -The last line follows from the $u$-substitution: -$u=f^{-1}(x)$ for then $du = [f^{-1}(x)]' dx$ and $x=f(u)$. - -We use this to find an antiderivative for $\sin^{-1}(x)$: - -```math -\begin{align*} -\int \sin^{-1}(x) dx &= x \sin^{-1}(x) - \int \sin(u) du \\ -&= x \sin^{-1}(x) + \cos(u) \\ -&= x \sin^{-1}(x) + \cos(\sin^{-1}(x)). -\end{align*} -``` - -Using right triangles to simplify, the last value -$\cos(\sin^{-1}(x))$ can otherwise be written as $\sqrt{1 - x^2}$. - -##### Example - -The [trapezoid](http://en.wikipedia.org/wiki/Trapezoidal_rule) rule is an approximation to the definite integral like a -Riemann sum, only instead of approximating the area above -$[x_i, x_i + h]$ by a rectangle with height $f(c_i)$ (for some $c_i$), -it uses a trapezoid formed by the left and right endpoints. That is, -this area is used in the estimation: $(1/2)\cdot (f(x_i) + f(x_i+h)) -\cdot h$. - -Even though we suggest just using `quadgk` for numeric integration, -estimating the error in this approximation is still of some -theoretical interest. - -Recall, just using *either* $x_i$ or $x_{i-1}$ -for $c_i$ gives an error that is "like" $1/n$, as $n$ gets large, though the exact rate depends on the function and the length of the interval. - -This [proof](http://www.math.ucsd.edu/~ebender/20B/77_Trap.pdf) for the -error estimate is involved, but is reproduced here, as it nicely -integrates many of the theoretical concepts of integration discussed so far. - -First, for convenience, we consider the interval $x_i$ to $x_i+h$. The -actual answer over this is just $\int_{x_i}^{x_i+h}f(x) dx$. By a -$u$-substitution with $u=x-x_i$ this becomes $\int_0^h f(t + x_i) -dt$. 
For analyzing this we integrate once by parts using $u=f(t+x_i)$ and -$dv=dt$. But instead of letting $v=t$, we choose to add (as is our prerogative) a constant of integration -$A$, so $v=t+A$: - -```math -\begin{align*} -\int_0^h f(t + x_i) dt &= uv \big|_0^h - \int_0^h v du\\ -&= f(t+x_i)(t+A)\big|_0^h - \int_0^h (t + A) f'(t + x_i) dt. -\end{align*} -``` - -We choose $A$ to be $-h/2$ (any constant is possible), for then the term $f(t+x_i)(t+A)\big|_0^h$ -becomes $(1/2)(f(x_i+h) + f(x_i)) \cdot h$, or the trapezoid -approximation. This means the error over this interval (actual minus estimate) satisfies: - -```math -\text{error}_i = \int_{x_i}^{x_i+h}f(x) dx - \frac{f(x_i+h) + f(x_i)}{2} \cdot h = - \int_0^h (t + A) f'(t + x_i) dt. -``` - -For this, we *again* integrate by parts with - -```math -\begin{align*} -u &= f'(t + x_i) & dv &= (t + A)dt\\ -du &= f''(t + x_i)dt & v &= \frac{(t + A)^2}{2} + B -\end{align*} -``` - -Again we added a constant of integration, ``B``, to $v$. -The error becomes: - -```math -\text{error}_i = -(\frac{(t+A)^2}{2} + B)f'(t+x_i)\big|_0^h + \int_0^h (\frac{(t+A)^2}{2} + B) \cdot f''(t+x_i) dt. -``` - -With $A=-h/2$, $B$ is chosen so that ``(t+A)^2/2 + B = 0`` at the endpoints $t=0$ and $t=h$, or $B=-h^2/8$. -The error becomes - -```math -\text{error}_i = \int_0^h \left(\frac{(t-h/2)^2}{2} - \frac{h^2}{8}\right) \cdot f''(t + x_i) dt. -``` - -Now, we assume $\lvert f''(t)\rvert$ is bounded by $K$ for any $a \leq t \leq b$. This -will be true, for example, if the second derivative is assumed to exist and be continuous. Using this fact about definite integrals -$\lvert \int_a^b g dx\rvert \leq \int_a^b \lvert g \rvert dx$ we have: - - -```math -\lvert \text{error}_i \rvert \leq K \int_0^h \lvert (\frac{(t-h/2)^2}{2} - \frac{h^2}{8}) \rvert dt. -``` - -But what is the function in the integrand? Clearly it is a quadratic in $t$. Expanding gives $1/2 \cdot -(t^2 - ht)$.
This is negative over $[0,h]$ (and $0$ at these -endpoints), so the integral above is just: - -```math -\frac{1}{2}\int_0^h (ht - t^2)dt = \frac{1}{2} (\frac{ht^2}{2} - \frac{t^3}{3})\big|_0^h = \frac{h^3}{12} -``` - -This gives the bound: $\lvert \text{error}_i \rvert \leq K h^3/12$. The -*total* error may be less, but is not more than the value found by -adding up the error over each of the $n$ intervals. As our bound does not -depend on the $i$, this sum satisfies: - -```math -\lvert \text{error}\rvert \leq n \cdot \frac{Kh^3}{12} = \frac{K(b-a)^3}{12}\frac{1}{n^2}. -``` - -So the error is like $1/n^2$, in contrast to the $1/n$ error of the -Riemann sums. One way to see this: for the Riemann sum it takes twice -as many terms to halve an error estimate, but for the trapezoid rule -only $\sqrt{2}$ as many, and for Simpson's rule, only $2^{1/4}$ as -many. - -## Area related to parameterized curves - -The figure introduced to motivate the integration by parts formula also suggests that regions described parametrically (by a pair of functions ``x=u(t), y=v(t)`` for ``a \le t \le b``) can have their area computed. - -When ``u(t)`` is strictly *increasing*, and hence having an inverse function, then re-parameterizing by ``\phi(t) = u^{-1}(t)`` gives ``x=u(u^{-1}(t))=t, y=v(u^{-1}(t))``, and integrating gives the area as ``A=\int_a^b v(t) u'(t) dt``. - -However, the correct answer requires understanding a minus sign.
Consider the area enclosed by ``x(t) = \cos(t), y(t) = \sin(t)``: - -```julia; echo=false -let - r(t) = [cos(t), sin(t)] - p=plot_parametric(0..2pi, r, aspect_ratio=:equal, legend=false) - for t ∈ (pi/4, 3pi/4, 5pi/4, 7pi/4) - quiver!(unzip([r(t)])..., quiver=Tuple(unzip([0.1*r'(t)]))) - end - ti, tj = pi/3, pi/3+0.1 - plot!([cos(tj), cos(ti), cos(ti), cos(tj), cos(tj)], [0,0,sin(tj), sin(tj),0]) - quiver!([0],[0], quiver=Tuple(unzip([r(ti)]))) - quiver!([0],[0], quiver=Tuple(unzip([r(tj)]))) - p -end -``` - - -We added a rectangle for a Riemann sum for ``t_i = \pi/3`` and ``t_{i+1} = \pi/3 + 0.1``. The height of this rectangle is ``y(t_i)``, the base is of length ``x(t_i) - x(t_{i+1})`` *given* the orientation of how the circular curve is parameterized (counterclockwise here). - - -Taking this Riemann sum approach, we can approximate the area under the curve parameterized by ``(u(t), v(t))`` over the time range ``[t_i, t_{i+1}]`` as a rectangle with height ``y(t_i)`` and base ``x(t_{i}) - x(t_{i+1})``. Then we get, as expected: - -```math -\begin{align*} -A &\approx \sum_i y(t_i) \cdot (x(t_{i}) - x(t_{i+1}))\\ - &= - \sum_i y(t_i) \cdot (x(t_{i+1}) - x(t_{i}))\\ - &= - \sum_i y(t_i) \cdot \frac{x(t_{i+1}) - x(t_i)}{t_{i+1}-t_i} \cdot (t_{i+1}-t_i)\\ - &\approx -\int_a^b y(t) x'(t) dt. -\end{align*} -``` - -So with a counterclockwise rotation, the actual answer for the area includes a minus sign. If the area is traced out in a *clockwise* manner, there is no minus sign. - - -This is a case of [Green's Theorem](https://en.wikipedia.org/wiki/Green%27s_theorem#Area_calculation) to be taken up in the section on Green's Theorem, Stokes' Theorem, and the Divergence Theorem. - -##### Example - -Apply the formula to a parameterized circle to ensure the signed area is properly computed.
If we use ``x(t) = r\cos(t)`` and ``y(t) = r\sin(t)`` then the motion is counterclockwise: - -```julia; hold=true -@syms 𝒓 t -𝒙 = 𝒓 * cos(t) -𝒚 = 𝒓 * sin(t) --integrate(𝒚 * diff(𝒙, t), (t, 0, 2PI)) -``` - -We see the expected answer for the area of a circle. - - -##### Example - -Apply the formula to find the area under one arch of a cycloid, parameterized by ``x(t) = t - \sin(t), y(t) = 1 - \cos(t)``. - -Working symbolically, one arch is described in a *clockwise* manner, so we use ``\int y(t) x'(t) dt``: - -```julia; hold=true -@syms t -𝒙 = t - sin(t) -𝒚 = 1 - cos(t) -integrate(𝒚 * diff(𝒙, t), (t, 0, 2PI)) -``` - -([Galileo](https://mathshistory.st-andrews.ac.uk/Curves/Cycloid/) was thwarted in finding this answer exactly and resorted to constructing one from metal to *estimate* the value.) - -##### Example - -Consider the example ``x(t) = \cos(t) + t\sin(t), y(t) = \sin(t) - t\cos(t)`` for ``0 \leq t \leq 2\pi``. - -```julia; echo=false -let - x(t) = cos(t) + t*sin(t) - y(t) = sin(t) - t*cos(t) - ts = range(0,2pi, length=100) - plot(x.(ts), y.(ts)) -end -``` - -How much area is enclosed by this curve and the ``x`` axis? The area is described in a counterclockwise manner, so we have: - -```julia; hold=true -let - x(t) = cos(t) + t*sin(t) - y(t) = sin(t) - t*cos(t) - yx′(t) = -y(t) * x'(t) # yx\prime[tab] - quadgk(yx′, 0, 2pi) -end -``` - -This particular problem could also have been done symbolically, but many curves will require a numeric approximation. - - -## Questions - - -###### Question - -In the integral $\int \log(x) dx$ we let $u=\log(x)$ and $dv=dx$. What are $du$ and $v$? - -```julia; hold=true; echo=false -choices = [ -"``du=1/x dx \\quad v = x``", -"``du=x\\log(x) dx\\quad v = 1``", -"``du=1/x dx\\quad v = x^2/2``"] -answ = 1 -radioq(choices, answ) -``` - -###### Question - -In the integral $\int \sec(x)^3 dx$ we let $u=\sec(x)$ and $dv = \sec(x)^2 dx$. What are $du$ and $v$?
- -```julia; hold=true; echo=false -choices = [ -"``du=\\sec(x)\\tan(x)dx \\quad v=\\tan(x)``", -"``du=\\csc(x) dx \\quad v=\\sec(x)^3 / 3``", -"``du=\\tan(x) dx \\quad v=\\sec(x)\\tan(x)``" -] -answ = 1 -radioq(choices, answ) -``` - - -###### Question - -In the integral $\int e^{-x} \cos(x)dx$ we let $u=e^{-x}$ and $dv=\cos(x) dx$. -What are $du$ and $v$? - -```julia; hold=true; echo=false -choices = [ -"``du=-e^{-x} dx \\quad v=\\sin(x)``", -"``du=-e^{-x} dx \\quad v=-\\sin(x)``", -"``du=\\sin(x)dx \\quad v=-e^{-x}``" -] -answ = 1 -radioq(choices, answ) -``` - -###### Question - -Find the value of $\int_1^4 x \log(x) dx$. You can integrate by parts. - -```julia; hold=true; echo=false -f(x) = x*log(x) -a,b = 1,4 -val,err = quadgk(f, a, b) -numericq(val) -``` - -###### Question - -Find the value of $\int_0^{\pi/2} x\cos(2x) dx$. You can integrate by parts. - -```julia; hold=true; echo=false -f(x) = x*cos(2x) -a,b = 0, pi/2 -val,err = quadgk(f, a, b) -numericq(val) -``` - - -###### Question - - -Find the value of $\int_1^e (\log(x))^2 dx$. You can integrate by parts. - -```julia; hold=true; echo=false -f(x) = log(x)^2 -a,b = 1,exp(1) -val,err = quadgk(f, a, b) -numericq(val) -``` - - - -###### Question - -Integration by parts can be used to provide "reduction" formulas, where an antiderivative is written in terms of another antiderivative with a lower power. Which is the proper reduction formula for $\int (\log(x))^n dx$? - -```julia; hold=true; echo=false -choices = [ -"``x(\\log(x))^n - n \\int (\\log(x))^{n-1} dx``", -"``\\int (\\log(x))^{n+1}/(n+1) dx``", -"``x(\\log(x))^n - \\int (\\log(x))^{n-1} dx``" -] -answ = 1 -radioq(choices, answ) -``` - -###### Question - -The [Wikipedia](http://en.wikipedia.org/wiki/Integration_by_parts) -page has a rule of thumb with an acronym LIATE to indicate what is a -good candidate to be "$u$": **L**og function, **I**nverse functions, -**A**lgebraic functions ($x^n$), **T**rigonometric functions, and -**E**xponential functions.
- -Consider the integral $\int x \cos(x) dx$. Which letter should be tried first? - -```julia; hold=true; echo=false -choices = ["L", "I", "A", "T", "E"] -answ = 3 -radioq(choices, answ, keep_order=true) -``` - ----- - - -Consider the integral $\int x^2\log(x) dx$. Which letter should be tried first? - -```julia; hold=true; echo=false -choices = ["L", "I", "A", "T", "E"] -answ = 1 -radioq(choices, answ, keep_order=true) -``` - ----- - - -Consider the integral $\int x^2 \sin^{-1}(x) dx$. Which letter should be tried first? - -```julia; hold=true; echo=false -choices = ["L", "I", "A", "T", "E"] -answ = 2 -radioq(choices, answ, keep_order=true) -``` - - ----- - - -Consider the integral $\int e^x \sin(x) dx$. Which letter should be tried first? - -```julia; hold=true; echo=false -choices = ["L", "I", "A", "T", "E"] -answ = 4 -radioq(choices, answ, keep_order=true) -``` - -###### Question - -Find an antiderivative for $\cos^{-1}(x)$ using the integration by parts formula. - -```julia; hold=true; echo=false -choices = [ -"``x\\cos^{-1}(x)-\\sqrt{1 - x^2}``", -"``x^2/2 \\cos^{-1}(x) - x\\sqrt{1-x^2}/4 - \\cos^{-1}(x)/4``", -"``-\\sin^{-1}(x)``"] -answ = 1 -radioq(choices, answ) -``` diff --git a/CwJ/integrals/mean_value_theorem.jmd b/CwJ/integrals/mean_value_theorem.jmd deleted file mode 100644 index 3a29fb1..0000000 --- a/CwJ/integrals/mean_value_theorem.jmd +++ /dev/null @@ -1,361 +0,0 @@ -# Mean value theorem for integrals - -This section uses these add-on packages: - -```julia -using CalculusWithJulia -using Plots -using QuadGK -``` - - -```julia; echo=false; results="hidden" -using CalculusWithJulia.WeaveSupport - -const frontmatter = ( - title = "Mean value theorem for integrals", - description = "Calculus with Julia: Mean value theorem for integrals", - tags = ["CalculusWithJulia", "integrals", "mean value theorem for integrals"], -); -nothing -``` - ----- - -## Average value of a function - -Let $f(x)$ be a continuous function over the interval $[a,b]$ with $a < 
b$. - -The average value of $f$ over $[a,b]$ is defined by: - -```math -\frac{1}{b-a} \int_a^b f(x) dx. -``` - -If $f$ is a constant, this is just the constant value, as would be -expected. If $f$ is *piecewise* constant, then this is the weighted -average of these constants. - -#### Examples - -##### Example: average velocity - -The average velocity between times $a < b$ is simply the -change in position during the time interval divided by the change in -time. In notation, this would be $(x(b) - x(a)) / (b-a)$. If $v(t) = -x'(t)$ is the velocity, then by the second part of the fundamental -theorem of calculus, we have, in agreement with the definition above, -that: - -```math -\text{average velocity} = \frac{x(b) - x(a)}{b-a} = \frac{1}{b-a} \int_a^b v(t) dt. -``` - - -The average speed is the *total* distance traveled over the elapsed time, which is given by - -```math -\text{average speed} = \frac{1}{b-a} \int_a^b \lvert v(t)\rvert dt. -``` - - - -Let $\bar{v}$ be the average velocity. Then we have $\bar{v} -\cdot(b-a) = x(b) - x(a)$, or the change in position can be written as -a constant ($\bar{v}$) times the time, as though we had a constant -velocity. This is an old intuition. -[Bressoud](http://www.math.harvard.edu/~knill/teaching/math1a_2011/exhibits/bressoud/) -comments on the special case known to scholars at Merton -College around ``1350`` that the distance traveled by an object under -uniformly increasing velocity starting at $v_0$ and ending at $v_t$ is -equal to the distance traveled by an object with constant velocity of -$(v_0 + v_t)/2$. - -##### Example - -What is the average value of $f(x)=\sin(x)$ over $[0, \pi]$? - -```math -\text{average} = \frac{1}{\pi-0} \int_0^\pi \sin(x) dx = \frac{1}{\pi} (-\cos(x)) \big|_0^\pi = \frac{2}{\pi} -``` - -Visually, we have: - -```julia; -plot(sin, 0, pi) -plot!(x -> 2/pi) -``` - -##### Example - -What is the average value of the function $f$ which is $1$ between -$[0,3]$, $2$ between $(3,5]$ and $1$ between $(5,6]$?
- -Though not continuous, $f(x)$ is integrable as it contains only jumps. The integral over $[0,6]$ can be computed with geometry: $1\cdot 3 + 2 \cdot 2 + 1 \cdot 1 = 8$. The average then is $8/(6-0) = 4/3$. - -##### Example - -What is the average value of the function $e^{-x}$ between $0$ and $\log(2)$? - -```math -\begin{align*} -\text{average} &= \frac{1}{\log(2) - 0} \int_0^{\log(2)} e^{-x} dx\\ -&= \frac{1}{\log(2)} (-e^{-x}) \big|_0^{\log(2)}\\ -&= -\frac{1}{\log(2)} (\frac{1}{2} - 1)\\ -&= \frac{1}{2\log(2)}. -\end{align*} -``` - -Visualizing, we have - -```julia; -plot(x -> exp(-x), 0, log(2)) -plot!(x -> 1/(2*log(2))) -``` - -## The mean value theorem for integrals - -If $f(x)$ is assumed integrable, the average value of $f(x)$ is -defined, as above. Re-expressing gives that there exists a $K$ with - -```math -K \cdot (b-a) = \int_a^b f(x) dx. -``` - -When we assume that $f(x)$ is continuous, we can describe $K$ as a value in the range of $f$: - -> **The mean value theorem for integrals**: Let $f(x)$ be a continuous -> function on $[a,b]$ with $a < b$. Then there exists $c$ with $a \leq c \leq b$ such that -> -> ``f(c) \cdot (b-a) = \int_a^b f(x) dx.`` - -The proof comes from the intermediate value theorem and the extreme -value theorem. Since $f$ is continuous on a closed interval, there -exist values $m$ and $M$ with $f(c_m) = m \leq f(x) \leq M=f(c_M)$, -for some $c_m$ and $c_M$ in the interval $[a,b]$. Since $m \leq f(x) \leq M$, we must have: - -```math -m \cdot (b-a) \leq K\cdot(b-a) \leq M\cdot(b-a). -``` - -So in particular $K$ is in $[m, M]$. But $m$ and $M$ correspond to -values of $f(x)$, so by the intermediate value theorem, $K=f(c)$ for -some $c$ that must lie in between $c_m$ and $c_M$, which means as well -that it must be in $[a,b]$. - -##### Proof of second part of Fundamental Theorem of Calculus - -The mean value theorem is exactly what is needed to prove formally the -second part of the Fundamental Theorem of Calculus.
Again, suppose -$f(x)$ is continuous on $[a,b]$ with $a < b$. For any $a < x < b$, we -define $F(x) = \int_a^x f(u) du$. Then the derivative of $F$ exists and is $f$. - -Let $h>0$. Then consider the forward difference $(F(x+h) - -F(x))/h$. Rewriting gives: - -```math -\frac{\int_a^{x+h} f(u) du - \int_a^x f(u) du}{h} =\frac{\int_x^{x+h} f(u) du}{h} = f(\xi(h)). -``` - -The value $\xi(h)$ is just the $c$ corresponding to a given value in $[x, x+h]$ -guaranteed by the mean value theorem. We only know that $x \leq \xi(h) \leq x+h$. But -this is plenty - it says that $\lim_{h \rightarrow 0+} \xi(h) = -x$. Using the fact that $f$ is continuous and the known properties of -limits of compositions of functions this gives $\lim_{h \rightarrow -0+} f(\xi(h)) = f(x)$. But this means that the (right) limit of the -secant line expression exists and is equal to $f(x)$, which is what we -want to prove. Repeating a similar argument when $h < 0$, finishes the proof. - - -The basic notion used is simply that for small $h$, this expression is well -approximated by the left Riemann sum taken over $[x, x+h]$: - -```math -f(\xi(h)) \cdot h = \int_x^{x+h} f(u) du. -``` - - -## Questions - -###### Question - -Between $0$ and $1$ a function is constantly $1$. Between $1$ and $2$ the function is constantly $2$. What is the average value of the function over the interval $[0,2]$? - -```julia; hold=true; echo=false -f(x) = x < 1 ? 1.0 : 2.0 -a,b = 0, 2 -val, _ = quadgk(f, a, b) -numericq(val/(b-a)) -``` - - -###### Question - - -Between $0$ and $2$ a function is constantly $1$. Between $2$ and $3$ the function is constantly $2$. What is the average value of the function over the interval $[0,3]$? - -```julia; hold=true; echo=false -f(x) = x < 2 ? 
1.0 : 2.0 -a, b = 0, 3 -val, _ = quadgk(f, a, b) -numericq(val/(b-a)) -``` - - -###### Question - -What integral will show the intuition of the Merton College scholars -that the distance traveled by an object under uniformly increasing -velocity starting at $v_0$ and ending at $v_t$ is equal to the -distance traveled by an object with constant velocity of $(v_0 + -v_t)/2$? - -```julia; hold=true; echo=false -choices = [ -"``\\int_0^t v(u) du = v^2/2 \\big|_0^t``", -"``\\int_0^t (v(0) + v(u))/2 du = v(0)/2\\cdot t + x(u)/2\\ \\big|_0^t``", -"``(v(0) + v(t))/2 \\cdot \\int_0^t du = (v(0) + v(t))/2 \\cdot t``" -] -answ = 1 -radioq(choices, answ) -``` - - -###### Question - -Find the average value of $\cos(x)$ over the interval $[-\pi/2, \pi/2]$. - -```julia; hold=true; echo=false -f(x) = cos(x) -a,b = -pi/2,pi/2 -val, _ = quadgk(f, a, b) -val = val/(b-a) -numericq(val) -``` - -###### Question - - -Find the average value of $\cos(x)$ over the interval $[0, \pi]$. - -```julia; hold=true; echo=false -f(x) = cos(x) -a,b = 0, pi -val, _ = quadgk(f, a, b) -val = val/(b-a) -numericq(val) -``` - -###### Question - -Find the average value of $f(x) = e^{-2x}$ between $0$ and $2$. - -```julia; hold=true; echo=false -f(x) = exp(-2x) -a, b = 0, 2 -val, _ = quadgk(f, a, b) -val = val/(b-a) -numericq(val) -``` - -###### Question - -Find the average value of $f(x) = \sin(x)^2$ over the interval $[0, \pi]$. - - -```julia; hold=true; echo=false -f(x) = sin(x)^2 -a, b = 0, pi -val, _ = quadgk(f, a, b) -val = val/(b-a) -numericq(val) -``` - -###### Question - -Which is bigger? The average value of $f(x) = x^{10}$ or the average -value of $g(x) = \lvert x \rvert$ over the interval $[0,1]$? - -```julia; hold=true; echo=false -choices = [ -L"That of $f(x) = x^{10}$.", -L"That of $g(x) = \lvert x \rvert$."] -answ = 2 -radioq(choices, answ) -``` - - - -###### Question - -Define a family of functions over the interval $[0,1]$ by $f(x; a,b) = -x^a \cdot (1-x)^b$.
Which has a greater average, $f(x; 2,3)$ or $f(x; -3,4)$? - -```julia; hold=true; echo=false -choices = [ -"``f(x; 2,3)``", -"``f(x; 3,4)``" -] -n1, _ = quadgk(x -> x^2 *(1-x)^3, 0, 1) -n2, _ = quadgk(x -> x^3 *(1-x)^4, 0, 1) -answ = 1 + (n1 < n2) -radioq(choices, answ) -``` - -###### Question - -Suppose the average value of $f(x)$ over $[a,b]$ is $100$. What is the average value of $100 f(x)$ over $[a,b]$? - -```julia; hold=true; echo=false -numericq(100 * 100) -``` - -###### Question - -Suppose $f(x)$ is continuous and positive on $[a,b]$. - -* Explain why for any $x > a$ it must be that: - -```math -F(x) = \int_a^x f(x) dx > 0 -``` - -```julia; hold=true; echo=false -choices = [ -L"Because the mean value theorem says this is $f(c) (x-a)$ for some $c$ and both terms are positive by the assumptions", -"Because the definite integral is only defined for positive area, so it is always positive" -] -answ = 1 -radioq(choices, answ) -``` - -* Explain why $F(x)$ is increasing. - -```julia; hold=true; echo=false -choices = [ -L"By the extreme value theorem, $F(x)$ must reach its maximum, hence it must increase.", -L"By the intermediate value theorem, as $F(x) > 0$, it must be true that $F(x)$ is increasing", -L"By the fundamental theorem of calculus, part I, $F'(x) = f(x) > 0$, hence $F(x)$ is increasing" -] -answ = 3 -radioq(choices, answ) -``` - -###### Question - -For $f(x) = x^2$, which is bigger: the average of the function $f(x)$ over $[0,1]$ or the geometric mean which is the exponential of the average of the logarithm of $f$ over the same interval? - -```julia; hold=true; echo=false -f(x) = x^2 -a,b = 0, 1 -val1 = quadgk(f, a, b)[1] / (b-a) -val2 = exp(quadgk(x -> log(f(x)), a, b)[1] / (b - a)) - -choices = [ -L"The average of $f$", -L"The exponential of the average of $\log(f)$" -] -answ = val1 > val2 ? 
1 : 2 -radioq(choices, answ) -``` diff --git a/CwJ/integrals/partial_fractions.jmd b/CwJ/integrals/partial_fractions.jmd deleted file mode 100644 index 9d79135..0000000 --- a/CwJ/integrals/partial_fractions.jmd +++ /dev/null @@ -1,493 +0,0 @@ -# Partial Fractions - -```julia -using CalculusWithJulia -using SymPy -``` - - -```julia; echo=false; results="hidden" -using CalculusWithJulia.WeaveSupport - -const frontmatter = ( - title = "Partial Fractions", - description = "Calculus with Julia: Partial Fractions", - tags = ["CalculusWithJulia", "integrals", "partial fractions"], -); -nothing -``` - ----- - -Integration is facilitated when an antiderivative for $f$ can be found, as then definite integrals can be evaluated through the fundamental theorem of calculus. - -However, despite differentiation being an algorithmic procedure, -integration is not. There are "tricks" to try, such as substitution -and integration by parts. These work in some cases. However, there are -classes of functions for which algorithms exist. For example, the -`SymPy` `integrate` function mostly implements an algorithm that decides if -an elementary function has an antiderivative. The -[elementary](http://en.wikipedia.org/wiki/Elementary_function) -functions include exponentials, their inverses (logarithms), -trigonometric functions, their inverses, and powers, including $n$th -roots. Not every elementary function will have an antiderivative -comprised of (finite) combinations of elementary functions. The -typical example is $e^{x^2}$, which has no simple antiderivative, -despite its ubiquity. - -There are classes of functions where an (elementary) antiderivative can always be -found. Polynomials provide a case. More surprisingly, so do their -ratios, *rational functions*. - -## Partial fraction decomposition - -Let $f(x) = p(x)/q(x)$, where $p$ and $q$ are polynomial -functions with real coefficients. Further, we assume without comment that $p$ and $q$ have no common -factors.
(If they did, we can divide them out, an act which has no -effect on the integrability of $f(x)$.) - - -The function $q(x)$ will factor over the real numbers. The fundamental -theorem of algebra can be applied to say that $q(x)=q_1(x)^{n_1} -\cdots q_k(x)^{n_k}$ where $q_i(x)$ is a linear or quadratic -polynomial and $n_i$ a positive integer. - - -> **Partial Fraction Decomposition**: There are unique polynomials $a_{ij}$ with degree $a_{ij} <$ -> degree $q_i$ such that -> ```math -> \frac{p(x)}{q(x)} = a(x) + \sum_{i=1}^k \sum_{j=1}^{n_i} \frac{a_{ij}(x)}{q_i(x)^j}. -> ``` - -The method is attributed to John Bernoulli, one of the prolific -Bernoulli brothers who put a stamp on several areas of math. This -Bernoulli was a mentor to Euler. - -This basically says that each factor $q_i(x)^{n_i}$ contributes a term like: - -```math -\frac{a_{i1}(x)}{q_i(x)^1} + \frac{a_{i2}(x)}{q_i(x)^2} + \cdots + \frac{a_{in_i}(x)}{q_i(x)^{n_i}}, -``` - -where each $a_{ij}(x)$ has degree less than the degree of $q_i(x)$. - -The value of this decomposition is that the terms $a_{ij}(x)/q_i(x)^j$ -each have an antiderivative, and so the sum of them will also have an -antiderivative. - -!!! note - Many calculus texts will give some examples for finding a partial - fraction decomposition. We push that work off to `SymPy`, as for all - but the easiest cases - a few are in the problems - it can be a bit tedious. - -In `SymPy`, the `apart` function will find the partial fraction -decomposition when a factorization is available. For example, here we see $n_i$ terms for each power of -$q_i$: - -```julia; -@syms a::real b::real c::real A::real B::real x::real -``` - -```julia; -apart((x-2)*(x-3) / (x*(x-1)^2*(x^2 + 2)^3)) -``` - - -### Sketch of proof - -A standard proof uses two facts of number systems: the division -algorithm and a representation of the greatest common divisor in terms -of sums, extended to polynomials. Our sketch shows how these are used.
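-A small concrete instance (added for illustration) of the identity used below: take ``q(x) = x`` and ``Q(x) = x^2 + 1``. These share no common factors, and indeed

```math
1 \cdot (x^2 + 1) + (-x) \cdot x = 1,
```

so ``a(x) = 1`` and ``b(x) = -x`` work. Dividing through by ``q(x)Q(x) = x(x^2+1)`` gives ``1/(x(x^2+1)) = 1/x - x/(x^2+1)``, which is exactly the partial fraction decomposition `apart` would produce for this expression.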
- -Take one of the factors of the denominator, and consider this -representation of the rational function $P(x)/(q(x)^k Q(x))$ where -there are no common factors to any of the three polynomials. - -Since $q(x)$ and $Q(x)$ share no factors, -[Bezout's](http://tinyurl.com/kd6prns) -identity says there exist polynomials $a(x)$ and $b(x)$ with: - -```math -a(x) Q(x) + b(x) q(x) = 1. -``` - - -Then dividing by $q^k(x)Q(x)$ gives the decomposition - -```math -\frac{1}{q(x)^k Q(x)} = \frac{a(x)}{q(x)^k} + \frac{b(x)}{q(x)^{k-1}Q(x)}. -``` - -Multiplying through by $P(x)$ then gives: - -```math -\frac{P(x)}{q(x)^k Q(x)} = \frac{A(x)}{q(x)^k} + \frac{B(x)}{q(x)^{k-1}Q(x)}. -``` - -This may look more complicated, but what it does is peel off one term -(the first) and leave something which is smaller, in this case by a -factor of $q(x)$. This process can be repeated pulling off a power of a factor at a time until nothing is left to do. - - -What remains is to establish that we can take $A(x) = a(x)\cdot P(x)$ with a degree less than that of $q(x)$. - -In Proposition 3.8 of -[Bradley](http://www.m-hikari.com/imf/imf-2012/29-32-2012/cookIMF29-32-2012.pdf) -and Cook we can see how. Recall the division algorithm, for example, says there are ``q_k`` and ``r_k`` with ``A=q\cdot q_k + r_k`` where the degree of ``r_k`` is less than that of ``q``, which is linear or quadratic. This is repeatedly applied below: - -```math -\begin{align*} -\frac{A}{q^k} &= \frac{q\cdot q_k + r_k}{q^k}\\ -&= \frac{r_k}{q^k} + \frac{q_k}{q^{k-1}}\\ -&= \frac{r_k}{q^k} + \frac{q \cdot q_{k-1} + r_{k-1}}{q^{k-1}}\\ -&= \frac{r_k}{q^k} + \frac{r_{k-1}}{q^{k-1}} + \frac{q_{k-1}}{q^{k-2}}\\ -&= \frac{r_k}{q^k} + \frac{r_{k-1}}{q^{k-1}} + \frac{q\cdot q_{k-2} + r_{k-2}}{q^{k-2}}\\ -&= \cdots\\ -&= \frac{r_k}{q^k} + \frac{r_{k-1}}{q^{k-1}} + \cdots + q_1.
-\end{align*} -``` - -So the term ``A(x)/q(x)^k`` can be expressed in terms of a sum where the numerators of each term have degree less than ``q(x)``, as expected by the statement of the theorem. - - - -## Integrating the terms in a partial fraction decomposition - -We discuss, by example, how each type of possible term in a partial -fraction decomposition has an antiderivative. Hence, rational -functions will *always* have an antiderivative that can be computed. - -### Linear factors - -For $j=1$, if $q_i$ is linear, then $a_{ij}/q_i^j$ must look like a constant over a linear term, or something like: - -```julia; -p = a/(x-c) -``` - -This has a logarithmic antiderivative: - -```julia; -integrate(p, x) -``` - - -For $j > 1$, we have powers. - -```julia; -@syms j::positive -integrate(a/(x-c)^j, x) -``` - - -### Quadratic factors - -When $q_i$ is quadratic, it looks like $ax^2 + bx + c$. Then $a_{ij}$ -can be a constant or a linear polynomial. The latter can be written as -$Ax + B$. - - -The integral of the following general form is presented below: - -```math -\frac{Ax + B}{(ax^2 + bx + c)^j}. -``` - -With `SymPy`, we consider a few cases of the following form, which results from a shift of `x`: - -```math -\frac{Ax + B}{((ax)^2 \pm 1)^j} -``` - -This reduction comes from completing the square: there are $d$ and $e$ so that $ax^2 + bx + c = a(x-d)^2 + e = e\left(\left(\sqrt{a/e}\,(x-d)\right)^2 + 1\right)$ when $e > 0$ (and with a minus sign when $e < 0$). - -The integrals of the type $Ax/((ax)^2 \pm 1)$ can be completed by $u$-substitution, with $u=(ax)^2 \pm 1$. - -For example, - -```julia; -integrate(A*x/((a*x)^2 + 1)^4, x) -``` - -The integrals of the type $B/((ax)^2\pm 1)$ are completed by -trigonometric substitution and various reduction formulas. They can get involved, but are tractable.
For -example: - - -```julia; -integrate(B/((a*x)^2 + 1)^4, x) -``` -and - -```julia; -integrate(B/((a*x)^2 - 1)^4, x) -``` - ----- - - -In [Bronstein](http://www-sop.inria.fr/cafe/Manuel.Bronstein/publications/issac98.pdf) this characterization can be found - "This method, which dates back to Newton, Leibniz and Bernoulli, should not be -used in practice, yet it remains the method found in most calculus texts and is -often taught. Its major drawback is the factorization of the denominator of the -integrand over the real or complex numbers." We can also find the following formulas which formalize the above exploratory calculations (``j>1`` and ``b^2 - 4c < 0`` below): - -```math -\begin{align*} -\int \frac{A}{(x-a)^j}\,dx &= \frac{A}{1-j}\frac{1}{(x-a)^{j-1}}\\ -\int \frac{A}{x-a}\,dx &= A\log(x-a)\\ -\int \frac{Bx+C}{x^2 + bx + c}\,dx &= \frac{B}{2} \log(x^2 + bx + c) + \frac{2C-bB}{\sqrt{4c-b^2}}\cdot \arctan\left(\frac{2x+b}{\sqrt{4c-b^2}}\right)\\ -\int \frac{Bx+C}{(x^2 + bx + c)^j}\,dx &= \frac{B' x + C'}{(x^2 + bx + c)^{j-1}} + \int \frac{C''}{(x^2 + bx + c)^{j-1}}\,dx -\end{align*} -``` - - - - - -The first returns a rational function; the second yields a logarithm -term; the third yields a logarithm and an arctangent term; while the -last, which has explicit constants available, provides a reduction -that can be recursively applied. - -That is, integrating ``f(x)/g(x)``, a rational function, will yield an output that looks like the following, where the functions are polynomials: - -```math -\int f(x)/g(x)\,dx = P(x) + \frac{C(x)}{D(x)} + \sum v_i \log(V_i(x)) + \sum w_j \arctan(W_j(x)) -``` - - -(Bronstein also sketches the modern method which is to use a Hermite reduction to express ``\int (f/g) dx = p/q + \int (r/h) dx``, where ``h`` is square free (the "`j`" are all ``1``).
The latter can be written over the complex numbers as logarithmic terms of the form ``\log(x-a)``, the "``a``"s found following a method due to Trager and Lazard, and Rioboo, which is mentioned in the SymPy documentation as the method used.) - - - -#### Examples - -Find an antiderivative for $1/(x\cdot(x^2+1)^2)$. - -A partial fraction decomposition is: - -```julia; -q = (x * (x^2 + 1)^2) -apart(1/q) -``` - -We see three terms. The first and second will be done by $u$-substitution, the third by a logarithm: - -```julia; -integrate(1/q, x) -``` - ----- - -Find an antiderivative of $1/(x^2 - 2x-3)$. - -We again just let `SymPy` do the work. A partial fraction decomposition is given by: - -```julia; -𝒒 = (x^2 - 2x - 3) -apart(1/𝒒) -``` - -We see what should yield two logarithmic terms: - -```julia; -integrate(1/𝒒, x) -``` - - -!!! note - `SymPy` will find ``\log(x)`` as an antiderivative for ``1/x``, but more - generally, ``\log(\lvert x\rvert)`` is one. - - -##### Example - -The answers found can become quite involved. [Corless](https://arxiv.org/pdf/1712.01752.pdf), Moir, Maza, and Xie use this example which at first glance seems tame enough: - -```julia -ex = (x^2 - 1) / (x^4 + 5x^2 + 7) -``` - -But the integral is something best suited to a computer algebra system: - -```julia -integrate(ex, x) -``` - -## Questions - -###### Question - -The partial fraction decomposition of $1/(x(x-1))$ must be of the form $A/x + B/(x-1)$. - -What is $A$? (Use `SymPy` or just put the sum over a common denominator and solve for $A$ and $B$.) - -```julia; hold=true; echo=false -val = -1 -numericq(val) -``` - -What is $B$? - -```julia; hold=true; echo=false -val = 1 -numericq(val) -``` - -###### Question - -The following gives the partial fraction decomposition for a rational expression: - -```math -\frac{3x+5}{(1-2x)^2} = \frac{A}{1-2x} + \frac{B}{(1-2x)^2}.
-``` - -Find $A$ (being careful with the sign): - -```julia; hold=true; echo=false -numericq(-3/2) -``` - -Find $B$: - -```julia; hold=true; echo=false -numericq(13/2) -``` - -###### Question - -The following specifies the general partial fraction decomposition for a rational expression: - -```math -\frac{1}{(x+1)(x-1)^2} = \frac{A}{x+1} + \frac{B}{x-1} + \frac{C}{(x-1)^2}. -``` - -Find $A$: - -```julia; hold=true; echo=false -numericq(1/4) -``` - -Find $B$: - -```julia; hold=true; echo=false -numericq(-1/4) -``` - -Find $C$: - -```julia; hold=true; echo=false -numericq(1/2) -``` - - -###### Question - -Compute the following exactly: - -```math -\int_0^1 \frac{(x-2)(x-3)}{(x-4)^2\cdot(x-5)} dx -``` - -Is $-6\log(5) - 5\log(3) - 1/6 + 11\log(4)$ the answer? - -```julia; hold=true; echo=false -yesnoq("yes") -``` - -###### Question - -In the assumptions for the partial fraction decomposition is the fact that $p(x)$ and $q(x)$ share no common factors. Suppose this isn't the case and in fact we have: - -```math -\frac{p(x)}{q(x)} = \frac{(x-c)^m s(x)}{(x-c)^n t(x)}. -``` - -Here $s$ and $t$ are polynomials such that $s(c)$ and $t(c)$ are non-zero. - -If $m > n$, then why can we cancel out the $(x-c)^n$ and not have a concern? - -```julia; hold=true; echo=false -choices = [ -"`SymPy` allows it.", -L"The value $c$ is a removable singularity, so the integral will be identical.", -L"The resulting function has an identical domain and is equivalent for all $x$." -] -answ = 2 -radioq(choices, answ, keep_order=true) -``` - - -If $m = n$, then why can we cancel out the $(x-c)^n$ and not have a concern? - -```julia; hold=true; echo=false -choices = [ -"`SymPy` allows it.", -L"The value $c$ is a removable singularity, so the integral will be identical.", -L"The resulting function has an identical domain and is equivalent for all $x$." -] -answ = 2 -radioq(choices, answ, keep_order=true) -``` - - -If $m < n$, then why can we cancel out the $(x-c)^m$ and not have a concern?
- -```julia; hold=true; echo=false -choices = [ -"`SymPy` allows it.", -L"The value $c$ is a removable singularity, so the integral will be identical.", -L"The resulting function has an identical domain and is equivalent for all $x$." -] -answ = 3 -radioq(choices, answ, keep_order=true) -``` - - -###### Question - -The partial fraction decomposition, as presented, factors the denominator polynomial into linear and quadratic factors over the real numbers. Alternatively, factoring over the complex numbers is possible, resulting in terms like: - -```math -\frac{a + ib}{x - (\alpha + i \beta)} + \frac{a - ib}{x - (\alpha - i \beta)} -``` - -How to see that these give rise to real answers on integration is the point of this question. - -Breaking the terms up over ``a`` and ``b`` we have: - -```math -\begin{align*} -I &= \frac{a}{x - (\alpha + i \beta)} + \frac{a}{x - (\alpha - i \beta)} \\ -II &= i\frac{b}{x - (\alpha + i \beta)} - i\frac{b}{x - (\alpha - i \beta)} -\end{align*} -``` - -Integrating ``I`` leads to two logarithmic terms, which are combined to give: - -```math -\int I dx = a\cdot \log((x-(\alpha+i\beta)) \cdot (x - (\alpha-i\beta))) -``` - -This involves no complex numbers, as: - -```julia; hold=true; echo=false -choices = ["The complex numbers are complex conjugates, so the term in the logarithm will simply be ``x^2 - 2\\alpha x + \\alpha^2 + \\beta^2``", - "The ``\\beta`` are ``0``, as the polynomials in question are real"] -radioq(choices, 1) -``` - - -The term ``II`` benefits from this computation (attributed to Rioboo by [Corless et.
al](https://arxiv.org/pdf/1712.01752.pdf)) - -```math -\frac{d}{dx} i \log(\frac{X+iY}{X-iY}) = 2\frac{d}{dx}\arctan(\frac{X}{Y}) -``` - -Applying this with ``X=x - \alpha`` and ``Y=-\beta`` shows that ``\int II dx`` will be - -```julia; hold=true; echo=false -choices = ["``-2b\\arctan((x - \\alpha)/(\\beta))``", - "``2b\\sec^2(-(x-\\alpha)/(-\\beta))``"] -radioq(choices, 1) -``` diff --git a/CwJ/integrals/process.jl b/CwJ/integrals/process.jl deleted file mode 100644 index a9e21b8..0000000 --- a/CwJ/integrals/process.jl +++ /dev/null @@ -1,41 +0,0 @@ - -fnames = [ - "area", - "ftc", - - "substitution", - "integration_by_parts", - "partial_fractions", # XX add in trig integrals (cos()sin() stuff? mx or ^m... XXX - "improper_integrals", ## - - "mean_value_theorem", - "area_between_curves", - "center_of_mass", - "volumes_slice", - #"volumes_shell", ## XXX add this in if needed, but not really that excited to now XXX - "arc_length", - "surface_area" -] - - - -function process_file(nm, twice=false) - include("$nm.jl") - mmd_to_md("$nm.mmd") - markdownToHTML("$nm.md") - twice && markdownToHTML("$nm.md") -end - -process_files(twice=false) = [process_file(nm, twice) for nm in fnames] - - - - -""" -## TODO integrals - -* add in volumes shell??? -* mean value theorem is light? 
-* could add surface area problems - -""" diff --git a/CwJ/integrals/riemann.js b/CwJ/integrals/riemann.js deleted file mode 100644 index 3b2ffe7..0000000 --- a/CwJ/integrals/riemann.js +++ /dev/null @@ -1,31 +0,0 @@ -const b = JXG.JSXGraph.initBoard('jsxgraph', { - boundingbox: [-0.5,0.3,1.5,-1/4], axis:true -}); - -var g = function(x) { return x*x*x*x + 10*x*x - 60* x + 100} -var f = function(x) {return 1/Math.sqrt(g(x))}; - -var type = "right"; -var l = 0; -var r = 1; -var rsum = function() { - return JXG.Math.Numerics.riemannsum(f,n.Value(), type, l, r); -}; -var n = b.create('slider', [[0.1, -0.05],[0.75,-0.05], [2,1,50]],{name:'n',snapWidth:1}); - - - - -var graph = b.create('functiongraph', [f, l, r]); -var os = b.create('riemannsum', - [f, - function(){ return n.Value();}, - type, l, r - ], - {fillColor:'#ffff00', fillOpacity:0.3}); - - - -b.create('text', [0.1,0.25, function(){ - return 'Riemann sum='+(rsum().toFixed(4)); -}]); diff --git a/CwJ/integrals/substitution.jmd b/CwJ/integrals/substitution.jmd deleted file mode 100644 index dc6e90c..0000000 --- a/CwJ/integrals/substitution.jmd +++ /dev/null @@ -1,770 +0,0 @@ -# Substitution - -This section uses these add-on packages: - -```julia -using CalculusWithJulia -using Plots -using SymPy - -``` - -```julia; echo=false; results="hidden" -using CalculusWithJulia.WeaveSupport - -const frontmatter = ( - title = "Substitution", - description = "Calculus with Julia: Substitution", - tags = ["CalculusWithJulia", "integrals", "substitution"], -); -nothing -``` - ----- - -The technique of $u$-[substitution](https://en.wikipedia.org/wiki/Integration_by_substitution) is derived from reversing the chain rule: $[f(g(x))]' = f'(g(x)) g'(x)$. - -Suppose that $g$ is continuous and $u(x)$ is differentiable with ``u'(x)`` being Riemann integrable. Then both these integrals are defined: - -```math -\int_a^b g(u(t)) \cdot u'(t) dt, \quad \text{and}\quad \int_{u(a)}^{u(b)} g(x) dx. -``` - -We wish to show they are equal. 
- -Let $G$ be an antiderivative of $g$, which exists as $g$ is assumed to be continuous. (By the Fundamental Theorem part I.) Consider the composition $G \circ u$. The chain rule gives: - -```math -[G \circ u]'(t) = G'(u(t)) \cdot u'(t) = g(u(t)) \cdot u'(t). -``` - -So, - -```math -\begin{align*} -\int_a^b g(u(t)) \cdot u'(t) dt &= \int_a^b (G \circ u)'(t) dt\\ -&= (G\circ u)(b) - (G\circ u)(a) \quad\text{(the FTC, part II)}\\ -&= G(u(b)) - G(u(a)) \\ -&= \int_{u(a)}^{u(b)} g(x) dx. \quad\text{(the FTC part II)} -\end{align*} -``` - - -That is, this substitution formula applies: - -> ``\int_a^b g(u(x)) u'(x) dx = \int_{u(a)}^{u(b)} g(x) dx.`` - -Further, for indefinite integrals, - -> ``\int f(g(x)) g'(x) dx = \int f(u) du.`` - - - -We have seen a special case of substitution where $u(x) = x-c$ in the formula $\int_{a-c}^{b-c} g(x) dx= \int_a^b g(x-c)dx$. - - - -The main use of this is to take complicated things inside of the function $g$ out of the function (the $u(x)$) by renaming them, then accounting for the change of name. - -Some examples are in order. - -Consider: - -```math -\int_0^{\pi/2} \cos(x) e^{\sin(x)} dx. -``` - -Clearly the $\sin(x)$ inside the exponential is an issue. If we let $u(x) = \sin(x)$, then $u'(x) = \cos(x)$, and this becomes - -```math -\int_0^{\pi/2} u'(x) e^{u(x)} dx = -\int_{u(0)}^{u(\pi/2)} e^x dx = e^x \big|_{\sin(0)}^{\sin(\pi/2)} = e^1 - e^0. -``` - -This all worked, as the problem was such that it was more or less obvious what to choose for $u$ and $G$. - -### Integration by substitution - -The process of identifying the result of the chain rule in the -function to integrate is not automatic, but rather a bit of an art. The basic -step is to try some values and hope one works. Typically, this is taught by -"substituting" in some value for part of the expression (basically the -$u(x)$) and seeing what happens.
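-For instance (an added illustration), with ``\int 2x\cos(x^2) dx`` the renaming ``u = x^2`` gives ``du = 2x dx``, and the ``2x dx`` present in the integrand is replaced wholesale:

```math
\int 2x \cos(x^2) dx = \int \cos(u) du = \sin(u) + C = \sin(x^2) + C.
```

Differentiating ``\sin(x^2)`` by the chain rule recovers ``2x\cos(x^2)``, closing the loop.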
- -In the above problem, $\int_0^{\pi/2} \cos(x) e^{\sin(x)} dx$, we -might just rename $\sin(x)$ to be $u$ (suppressing the "of $x$" -part). Then we need to rewrite the "$dx$" part of the integral. We -know in this case that $du/dx = \cos(x)$. In terms of differentials, -this gives $du = \cos(x) dx$. But this allows us to substitute in with -$u$ and $du$ as is possible: - -```math -\int_0^{\pi/2} \cos(x) e^{\sin(x)} dx = \int_0^{\pi/2} e^{\sin(x)} \cdot \cos(x) dx = \int_{u(0)}^{u(\pi/2)} e^u du. -``` - ----- - -Let's illustrate with a new problem: $\int_0^2 4x e^{x^2} dx$. - -Again, we see that the $x^2$ inside the exponential is a complication. Letting $u = x^2$ we have $du = 2x dx$. We have $4xdx$ in the -original problem, so we will end up with $2du$: - -```math -\int_0^2 4x e^{x^2} dx = 2\int_0^2 e^{x^2} \cdot 2x dx = 2\int_{u(0)}^{u(2)} e^u du = 2 \int_0^4 e^u du = -2 e^u\big|_{u=0}^4 = 2(e^4 - 1). -``` - ----- - -Consider now $\int_0^1 2x^2 \sqrt{1 + x^3} dx$. Here we see that the $1 + x^3$ makes the square root term complicated. If we call this $u$, then what is $du$? Clearly, $du = 3x^2 dx$, or $(1/3)du = x^2 dx$, so we can rewrite this as: - -```math -\int_0^1 2x^2 \sqrt{1 + x^3} dx = \int_{u(0)}^{u(1)} 2 \sqrt{u} (1/3) du = 2/3 \cdot \frac{u^{3/2}}{3/2} \big|_1^2 = -\frac{4}{9} \cdot(2^{3/2} - 1). -``` - - ----- - -Consider $\int_0^{\pi} \cos(x)^3 \sin(x) dx$. The $\cos(x)$ function inside the cube is the complication. We let $u(x) = \cos(x)$ and see what that implies: $du = -\sin(x) dx$, so $\sin(x) dx = -du$, and we see $\sin(x) dx$ is part of the question. So the above becomes: - -```math -\int_0^{\pi} \cos(x)^3 \sin(x) dx = -\int_{u(0)}^{u(\pi)} u^3 du = -\int_1^{-1} u^3 du = \int_{-1}^1 u^3 du = \frac{u^4}{4}\big|_{-1}^1 = 0. -``` - -After substitution the integrand is an odd function over an interval symmetric about $0$, which means the
A graph of this function shows
that about $x = \pi/2$ the function has odd-like symmetry, so the answer of $0$ is supported by the plot:


```julia;hold=true
f(x) = cos(x)^3 * sin(x)
plot(f, 0, 1pi)
```



----

Consider $\int_1^e \log(x)/x dx$. There isn't really an "inside" function here, but instead just a tricky $\log(x)$. If we let $u=\log(x)$, what happens? We get $du = 1/x \cdot dx$, which we see present in the original. So with this, we have:

```math
\int_1^e \frac{\log(x)}{x} dx = \int_{u(1)}^{u(e)} u du = \frac{u^2}{2}\big|_0^1 = \frac{1}{2}.
```

##### Example: Transformations

We saw that the area intrinsically discussed in the definite integral
$A=\int_a^b f(x-c) dx$ is unaffected by shifts, in that $A =
\int_{a-c}^{b-c} f(x) dx$. What about more general transformations?
For example: if $g(x) = (1/h) \cdot f((x-c)/h)$ for values $c$ and $h$, what is
the integral over $a$ to $b$ in terms of the function $f(x)$?

If $A = \int_a^b (1/h) \cdot f((x-c)/h) dx$ then we let $u = (x-c)/h$. With this, $du = 1/h \cdot dx$. This allows a straight substitution:

```math
A = \int_a^b \frac{1}{h} f(\frac{x-c}{h}) dx = \int_{(a-c)/h}^{(b-c)/h} f(u) du.
```

So the answer is: the area under the transformed function over $a$ to $b$
is the area of the function over the transformed region.


For example, consider the "hat" function $f(x) = 1 - \lvert x \rvert $
when $-1 \leq x \leq 1$ and $0$ otherwise. The area under $f$ is just
$1$ - the graph forms a triangle with base of length $2$ and height
$1$. If we take any values of $c$ and $h$, what do we find for the
area under the curve of the transformed function?

Let $u(x) = (x-c)/h$ and $g(x) = (1/h) f(u(x))$. Then, as $du = (1/h) dx$,

```math
\begin{align}
\int_{c-h}^{c+h} g(x) dx
&= \int_{c-h}^{c+h} \frac{1}{h} f(u(x)) dx\\
&= \int_{u(c-h)}^{u(c+h)} f(u) du\\
&= \int_{-1}^1 f(u) du\\
&= 1.
\end{align}
```

So the area of this transformed function is still $1$.
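This invariance can be spot-checked numerically. A sketch, assuming `QuadGK` is available; the values ``c=2, h=3`` and ``c=-1, h=1/2`` are arbitrary choices:

```julia
using QuadGK

hat(x) = abs(x) <= 1 ? 1 - abs(x) : 0.0   # the "hat" function
g(x, c, h) = hat((x - c)/h) / h           # shift by c, scale by h, rescale by 1/h
a1, _ = quadgk(x -> g(x, 2, 3), 2 - 3, 2 + 3)
a2, _ = quadgk(x -> g(x, -1, 1/2), -1 - 1/2, -1 + 1/2)
(a1, a2)   # both are 1, the area under the original hat function
```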
The shifting by
$c$ we know doesn't affect the area; the scaling by $h$ inside of $f$
does, but is balanced out by the division by $h$ outside of $f$.

##### Example: Speed versus velocity

The "velocity" of an object includes a sense of direction in addition to the sense of magnitude. The "speed" just includes the sense of magnitude. Speed is always non-negative, whereas velocity is a signed quantity.

As mentioned previously, position is the integral of velocity, as expressed precisely through this equation:

```math
x(t) = \int_0^t v(u) du + x(0).
```

What is the integral of speed?

If $v(t)$ is the velocity, then $s(t) = \lvert v(t) \rvert$ is the speed. If integrating either $s(t)$ or $v(t)$, the integrals would agree when $v(t) \geq 0$. However, when $v(t) \leq 0$, the position backtracks so $x(t)$ decreases, where the integral of $s(t)$ would only increase.

This integral

```math
td(t) = \int_0^t s(u) du = \int_0^t \lvert v(u) \rvert du,
```

gives the *total distance* traveled.

To illustrate with a simple example, if a car drives East for one hour
at 60 miles per hour, then heads back West for an hour at 60 miles per
hour, the car's position after two hours is $x(2) = x(0)$, with a change in position $x(2) - x(0) = 0$. Whereas, the
total distance traveled is $120$ miles. (Gas is paid on total
distance, not change in position!) What are the formulas for speed
and velocity? Clearly $s(t) = 60$, a constant, whereas here $v(t) =
60$ for $0 \leq t \leq 1$ and $-60$ for $1 < t \leq 2$.



Suppose $v(t)$ is given by $v(t) = (t-2)^3/3 - 4(t-2)/3$. If $x(0)=0$,
find the position after 3 time units and the total distance traveled.


We let $u(t) = t - 2$ so $du=dt$. The position is given by

```math
\int_0^3 ((t-2)^3/3 - 4(t-2)/3) dt = \int_{u(0)}^{u(3)} (u^3/3 - 4/3 u) du =
(\frac{u^4}{12} - \frac{4}{3}\frac{u^2}{2}) \big|_{-2}^1 = \frac{3}{4}.
```

Integrating the speed is similar, but we have to work harder:


```math
\int_0^3 \lvert v(t) \rvert dt = \int_0^3 \lvert ((t-2)^3/3 - 4(t-2)/3) \rvert dt =
\int_{-2}^1 \lvert u^3/3 - 4u/3 \rvert du.
```

But $u^3/3 - 4u/3 = (1/3) \cdot u(u-2)(u+2)$, so between $-2$ and $0$
it is positive and between $0$ and $1$ negative, so this integral is:

```math
\begin{align*}
\int_{-2}^0 (u^3/3 - 4u/3 ) du + \int_{0}^1 -(u^3/3 - 4u/3) du
&= (\frac{u^4}{12} - \frac{4}{3}\frac{u^2}{2}) \big|_{-2}^0 - (\frac{u^4}{12} - \frac{4}{3}\frac{u^2}{2}) \big|_{0}^1\\
&= \frac{4}{3} + \frac{7}{12}\\
&= \frac{23}{12}.
\end{align*}
```

##### Example

In probability, the normal distribution plays an outsized role. This distribution is characterized by a family of *density* functions:

```math
f(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi}}\frac{1}{\sigma} \exp(-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2).
```

Integrals involving this function are typically transformed by substitution. For example:

```math
\begin{align*}
\int_a^b f(x; \mu, \sigma) dx
&= \int_a^b \frac{1}{\sqrt{2\pi}}\frac{1}{\sigma} \exp(-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2) dx \\
&= \int_{u(a)}^{u(b)} \frac{1}{\sqrt{2\pi}} \exp(-\frac{1}{2}u^2) du \\
&= \int_{u(a)}^{u(b)} f(u; 0, 1) du,
\end{align*}
```

where ``u = (x-\mu)/\sigma``, so ``du = (1/\sigma) dx``.

This shows that integrals involving a normal density with parameters ``\mu`` and ``\sigma`` can be computed using the *standard* normal density with ``\mu=0`` and ``\sigma=1``. Unfortunately, there is no elementary antiderivative for ``\exp(-u^2/2)``, so integrals for the standard normal must be numerically approximated.
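The reduction to the standard normal can itself be checked numerically. A sketch, assuming `QuadGK` is available; the particular values ``\mu=1``, ``\sigma=2``, ``a=1``, ``b=3`` are just illustrative choices:

```julia
using QuadGK

# the normal density f(x; μ, σ)
φ(x, μ, σ) = exp(-((x - μ)/σ)^2 / 2) / (σ * sqrt(2pi))
μ, σ, a, b = 1, 2, 1, 3
lhs, _ = quadgk(x -> φ(x, μ, σ), a, b)                   # general density
rhs, _ = quadgk(u -> φ(u, 0, 1), (a - μ)/σ, (b - μ)/σ)   # standard density, transformed limits
lhs - rhs   # essentially 0
```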

There is a function `erf` in the `SpecialFunctions` package (which is loaded by `CalculusWithJulia`) that computes:

```math
\int_0^x \frac{2}{\sqrt{\pi}} \exp(-t^2) dt
```

A further change of variables by ``t = u/\sqrt{2}`` (with ``\sqrt{2}dt = du``) gives:

```math
\begin{align*}
\int_a^b f(x; \mu, \sigma) dx &=
\int_{t(u(a))}^{t(u(b))} \frac{\sqrt{2}}{\sqrt{2\pi}} \exp(-t^2) dt\\
&= \frac{1}{2} \int_{t(u(a))}^{t(u(b))} \frac{2}{\sqrt{\pi}} \exp(-t^2) dt
\end{align*}
```

Up to a factor of ``1/2`` this is `erf`.

So we would have, for example, with ``\mu=1``, ``\sigma=2``, ``a=1`` and ``b=3`` that:

```math
\begin{align*}
t(u(a)) &= (1 - 1)/2/\sqrt{2} = 0\\
t(u(b)) &= (3 - 1)/2/\sqrt{2} = \frac{1}{\sqrt{2}}\\
\int_1^3 f(x; 1, 2)
&= \frac{1}{2} \int_0^{1/\sqrt{2}} \frac{2}{\sqrt{\pi}} \exp(-t^2) dt.
\end{align*}
```

Or

```julia
1/2 * erf(1/sqrt(2))
```

!!! note "The `Distributions` package"

    The above calculation is for illustration purposes. The add-on package `Distributions` makes much quicker work of such a task for the normal distribution and many other distributions from probability and statistics.

## SymPy and substitution

The `integrate` function in `SymPy` can handle most problems which involve substitution. Here are a few examples:


* This integral, $\int_0^2 4x/\sqrt{x^2 +1}dx$, involves a substitution for $x^2 + 1$:

```julia;
@syms x::real t::real
integrate(4x / sqrt(x^2 + 1), (x, 0, 2))
```


* This integral, $\int_e^{e^2} 1/(x\log(x)) dx$, involves a substitution of $u=\log(x)$. Here we see the answer:

```julia; hold=true
f(x) = 1/(x*log(x))
integrate(f(x), (x, sympy.E, sympy.E^2))
```

(We used `sympy.E` - and not `e` - to avoid any conversion to floating point, which could yield an inexact answer.)


The antiderivative is interesting here; it is an *iterated* logarithm.

```julia;
integrate(1/(x*log(x)), x)
```



### Failures...


Not every integral problem lends itself to solution by
substitution. For example, we can use substitution to evaluate the
integral of $xe^{-x^2}$, but not for $e^{-x^2}$ or $x^2e^{-x^2}$. The
first has no familiar antiderivative; the second is done by a
different technique.

Even when substitution can be used, `SymPy` may not be able to
algorithmically identify it. The main algorithm used can determine if
expressions involving rational functions, radicals, logarithms, and
exponential functions are integrable. Missing from this list are
absolute values.


For some such problems, we can help `SymPy` out by breaking the integral into pieces where we know the sign of the expression.

For substitution problems, we can also help out. For example, to find an antiderivative for

```math
\int(1 + \log(x)) \sqrt{1 + (x\log(x))^2} dx
```

a quick attempt with `SymPy` turns up nothing:


```julia;
𝒇(x) = (1 + log(x)) * sqrt(1 + (x*log(x))^2 )
integrate(𝒇(x), x)
```

But were we to try $u=x\log(x)$, we'd see that this simplifies to $\int \sqrt{1 + u^2} du$, which has some hope of having an antiderivative.

We can help `SymPy` out by substitution:

```julia;
u(x) = x * log(x)
@syms w dw
ex = 𝒇(x)
ex₁ = ex(u(x) => w, diff(u(x),x) => dw)
```

This verifies the above. Can it be integrated in `w`? The "`dw`" is only for familiarity, `SymPy` doesn't use this, so we set it to 1 then integrate:

```julia;
ex₂ = ex₁(dw => 1)
ex₃ = integrate(ex₂, w)
```

Finally, we put back in the `u(x)` to get an antiderivative.

```julia;
ex₃(w => u(x))
```

!!! note
    Lest it be thought this is an issue with `SymPy` and not other
    systems, this example was [borrowed](http://faculty.uml.edu/jpropp/142/Integration.pdf) from an
    illustration for helping Mathematica.


## Trigonometric substitution

Wait, in the last example an antiderivative for $\sqrt{1 + u^2}$ was found. But how? We haven't discussed this yet.


This can be found using *trigonometric* substitution. In this example,
we know that $1 + \tan(\theta)^2$ simplifies to $\sec(\theta)^2$, so
we might *try* a substitution of $\tan(u)=x$. This would simplify
$\sqrt{1 + x^2}$ to $\sqrt{1 + \tan(u)^2} = \sqrt{\sec(u)^2}$ which is
$\lvert \sec(u) \rvert$. What of $du$? The chain rule gives
$\sec(u)^2 du = dx$. In short we get:

```math
\int \sqrt{1 + x^2} dx = \int \sec(u)^2 \lvert \sec(u) \rvert du = \int \sec(u)^3 du,
```

if we know ``\sec(u) \geq 0``.


This leaves still the question of integrating $\sec(u)^3$, which we aren't (yet) prepared to discuss, but we see that this type of substitution can re-express an integral in a new way that may pay off.

#### Examples

Let's see some examples where a trigonometric substitution is all that is needed.

##### Example

Consider $\int 1/(1+x^2) dx$. The integrand is the derivative of a familiar function, but if that isn't observed, we might notice the $1+x^2$ and try to simplify that. First, an attempt at a $u$-substitution:


Letting $u = 1+x^2$ we get $du = 2x dx$, which gives $\int \frac{1}{u} \cdot \frac{1}{2x} du$. We aren't able to rewrite the leftover "$2x$" part in terms of $u$ successfully, so this attempt is for naught.


Now we try a trigonometric substitution, taking advantage of the identity $1+\tan(x)^2 = \sec(x)^2$. Letting $\tan(u) = x$ yields $\sec(u)^2 du = dx$ and we get:

```math
\int \frac{1}{1+x^2} dx = \int \frac{1}{1 + \tan(u)^2} \sec(u)^2 du = \int 1 du = u.
```

But $\tan(u) = x$, so in terms of $x$, an antiderivative is just $\tan^{-1}(x)$, or the arctangent.
Here we verify with `SymPy`:

```julia;
integrate(1/(1+x^2), x)
```

The general form allows ``a^2 + (bx)^2`` in the denominator (squared so both are positive and the answer is nicer):

```julia; hold=true
@syms a::real, b::real, x::real
integrate(1 / (a^2 + (b*x)^2), x)
```


##### Example

The expression $1-x^2$ can be attacked by the substitution $\sin(u) = x$ as then $1-x^2 = 1-\sin(u)^2 = \cos(u)^2$. Here we see this substitution being used successfully:

```math
\begin{align*}
\int \frac{1}{\sqrt{9 - x^2}} dx &= \int \frac{1}{\sqrt{9 - (3\sin(u))^2}} \cdot 3\cos(u) du\\
&=\int \frac{1}{3\sqrt{1 - \sin(u)^2}}\cdot3\cos(u) du \\
&= \int du \\
&= u \\
&= \sin^{-1}(x/3).
\end{align*}
```


Further substitution allows the following integral to be solved for an antiderivative:

```julia; hold=true
@syms a::real, b::real
integrate(1 / sqrt(a^2 - b^2*x^2), x)
```

##### Example

The expression $x^2 - 1$ is a bit different; this lends itself to $\sec(u) = x$ for a substitution, as $\sec(u)^2 - 1 = \tan(u)^2$. For example, we try $\sec(u) = x$ to integrate:

```math
\begin{align*}
\int \frac{1}{\sqrt{x^2 - 1}} dx &= \int \frac{1}{\sqrt{\sec(u)^2 - 1}} \cdot \sec(u)\tan(u) du\\
&=\int \frac{1}{\tan(u)}\sec(u)\tan(u) du\\
&= \int \sec(u) du.
\end{align*}
```


This doesn't seem that helpful, but the antiderivative of $\sec(u)$ is
$\log\lvert (\sec(u) + \tan(u))\rvert$, so we can proceed to get:

```math
\begin{align*}
\int \frac{1}{\sqrt{x^2 - 1}} dx &= \int \sec(u) du\\
&= \log\lvert (\sec(u) + \tan(u))\rvert\\
&= \log\lvert x + \sqrt{x^2-1} \rvert.
\end{align*}
```

SymPy gives a different representation using the arccosine:

```julia; hold=true
@syms a::positive, b::positive, x::real
integrate(1 / sqrt(a^2*x^2 - b^2), x)
```



##### Example

The equation of an ellipse is $x^2/a^2 + y^2/b^2 = 1$. Suppose $a,b>0$.
The area under
the function $b \sqrt{1 - x^2/a^2}$ between $-a$ and $a$
will then be half the area of the ellipse. Find the area enclosed by
the ellipse.

We need to compute:

```math
2\int_{-a}^a b \sqrt{1 - x^2/a^2} dx =
4 b \int_0^a\sqrt{1 - x^2/a^2} dx.
```

Letting $\sin(u) = x/a$ gives $a\cos(u)du = dx$ and an antiderivative is found with:

```math
4 b \int_0^a \sqrt{1 - x^2/a^2} dx = 4b \int_0^{\pi/2} \sqrt{1-\sin(u)^2}\, a \cos(u) du
= 4ab \int_0^{\pi/2} \cos(u)^2 du
```

The identity $\cos(u)^2 = (1 + \cos(2u))/2$ makes this tractable:

```math
\begin{align*}
4ab \int_0^{\pi/2} \cos(u)^2 du
&= 4ab\int_0^{\pi/2}(\frac{1}{2} + \frac{\cos(2u)}{2}) du\\
&= 4ab(\frac{1}{2}u + \frac{\sin(2u)}{4})\big|_0^{\pi/2}\\
&= 4ab (\pi/4 + 0) = \pi ab.
\end{align*}
```

Keeping in mind that a circle with radius $a$ is an ellipse with
$b=a$, we see that this gives the correct answer for a circle.

## Questions

###### Question

For $\int \sin(x) \cos(x) dx$, let $u=\sin(x)$. What is the resulting substitution?

```julia; hold=true; echo=false
choices = [
"``\\int u du``",
"``\\int u (1 - u^2) du``",
"``\\int u \\cos(x) du``"
]
answ = 1
radioq(choices, answ)
```

###### Question

For $\int \tan(x)^4 \sec(x)^2 dx$ what $u$-substitution makes this easy?

```julia; hold=true; echo=false
choices = [
"``u=\\tan(x)``",
"``u=\\tan(x)^4``",
"``u=\\sec(x)``",
"``u=\\sec(x)^2``"
]
answ = 1
radioq(choices, answ)
```

###### Question

For $\int x \sqrt{x^2 - 1} dx$ what $u$-substitution makes this easy?

```julia; hold=true; echo=false
choices = [
"``u=x^2 - 1``",
"``u=x^2``",
"``u=\\sqrt{x^2 - 1}``",
"``u=x``"
]
answ = 1
radioq(choices, answ)
```

###### Question

For $\int x^2(1-x)^2 dx$ will the substitution $u=1-x$ prove effective?

```julia; hold=true; echo=false
yesnoq("no")
```

What about expanding the factored polynomial to get a fourth degree polynomial; will this prove effective?

```julia; hold=true; echo=false
yesnoq("yes")
```

###### Question

For $\int (\log(x))^3/x dx$ the substitution $u=\log(x)$ reduces this to what?

```julia; hold=true; echo=false
choices = [
"``\\int u^3 du``",
"``\\int u du``",
"``\\int u^3/x du``"
]
answ = 1
radioq(choices, answ)
```

###### Question

For $\int \tan(x) dx$ what substitution will prove effective?

```julia; hold=true; echo=false
choices = [
"``u=\\cos(x)``",
"``u=\\sin(x)``",
"``u=\\tan(x)``"
]
answ = 1
radioq(choices, answ)
```

###### Question

Integrating $\int_0^1 x \sqrt{1 - x^2} dx$ can be done by using the $u$-substitution $u=1-x^2$. This yields an integral

```math
\int_a^b \frac{-\sqrt{u}}{2} du.
```

What are $a$ and $b$?

```julia; hold=true; echo=false
choices = [
"``a=0,~ b=1``",
"``a=1,~ b=0``",
"``a=0,~ b=0``",
"``a=1,~ b=1``"
]
answ = 2
radioq(choices, answ)
```

###### Question

The integral $\int \sqrt{1 - x^2} dx$ lends itself to what substitution?

```julia; hold=true; echo=false
choices = [
"``\\sin(u) = x``",
"``\\tan(u) = x``",
"``\\sec(u) = x``",
"``u = 1 - x^2``"
]
answ = 1
radioq(choices, answ)
```

###### Question

The integral $\int x/(1+x^2) dx$ lends itself to what substitution?

```julia; hold=true; echo=false
choices = [
"``u = 1 + x^2``",
"``\\sin(u) = x``",
"``\\tan(u) = x``",
"``\\sec(u) = x``"
]
answ = 1
radioq(choices, answ)
```

###### Question

The integral $\int dx / \sqrt{1 - x^2}$ lends itself to what substitution?

```julia; hold=true; echo=false
choices = [
"``\\sin(u) = x``",
"``\\tan(u) = x``",
"``\\sec(u) = x``",
"``u = 1 - x^2``"
]
answ = 1
radioq(choices, answ)
```



###### Question

The integral $\int dx / \sqrt{x^2 - 16}$ lends itself to what substitution?

```julia; hold=true; echo=false
choices = [
"``4\\sec(u) = x``",
"``\\sec(u) = x``",
"``4\\sin(u) = x``",
"``\\sin(u) = x``"]
answ = 1
radioq(choices, answ)
```

###### Question

The integral $\int dx / (a^2 + x^2)$ lends itself to what substitution?


```julia; hold=true; echo=false
choices = [
"``a\\tan(u) = x``",
"``\\tan(u) = x``",
"``a\\sec(u) = x``",
"``\\sec(u) = x``"]
answ = 1
radioq(choices, answ)
```

###### Question

The integral $\int_{1/2}^1 \sqrt{1 - x^2}dx$ can be approached with the substitution $\sin(u) = x$ giving:

```math
\int_a^b \cos(u)^2 du.
```

What are $a$ and $b$?

```julia; hold=true; echo=false
choices = [
"``a=\\pi/6,~ b=\\pi/2``",
"``a=\\pi/4,~ b=\\pi/2``",
"``a=\\pi/3,~ b=\\pi/2``",
"``a=1/2,~ b= 1``"
]
answ = 1
radioq(choices, answ)
```

###### Question

How would we verify that $\log\lvert (\sec(u) + \tan(u))\rvert$ is an antiderivative for $\sec(u)$?

```julia; hold=true; echo=false
choices = [
L"We could differentiate $\sec(u)$.",
L"We could differentiate $\log\lvert (\sec(u) + \tan(u))\rvert$ "]
answ = 2
radioq(choices, answ)
```
diff --git a/CwJ/integrals/surface_area.jmd b/CwJ/integrals/surface_area.jmd
deleted file mode 100644
index a6bf3c3..0000000
--- a/CwJ/integrals/surface_area.jmd
+++ /dev/null
@@ -1,562 +0,0 @@
# Surface Area

This section uses these add-on packages:

```julia
using CalculusWithJulia
using Plots
using SymPy
using QuadGK
```

```julia; echo=false; results="hidden"
using CalculusWithJulia.WeaveSupport

const frontmatter = (
    title = "Surface Area",
    description = "Calculus with Julia: Surface Area",
    tags = ["CalculusWithJulia", "integrals", "surface area"],
);
fig_size=(800, 600)
nothing
```

----

## Surfaces of revolution

```julia; hold=true; echo=false
imgfile = "figures/gehry-hendrix.jpg"
caption = """

The exterior of the Jimi Hendrix Museum in Seattle has the signature
style of its architect Frank Gehry.
The surface is composed of
patches. A general method to find the amount of material to cover the
surface - the surface area - might be to add up the area of *each* of the
patches. However, in this section we will see that for surfaces of
revolution, there is an easier way. (Photo credit to
[firepanjewellery](http://firepanjewellery.com/).)
"""

ImageFile(:integrals, imgfile, caption)
```

> The surface area generated by rotating the graph of $f(x)$ between $a$ and $b$ about the $x$-axis
> is given by the integral
>
> ```math
> \int_a^b 2\pi f(x) \cdot \sqrt{1 + f'(x)^2} dx.
> ```
>
> If the curve is parameterized by $(g(t), f(t))$ between $a$ and $b$ then the surface area is
>
> ```math
> \int_a^b 2\pi f(t) \cdot \sqrt{g'(t)^2 + f'(t)^2} dt.
> ```
> These formulas do not add in the surface area of either of the ends.


```julia; hold=true; echo=false
F₀(u,v) = [u, u*cos(v), u*sin(v)] # a cone
us = range(0, 1, length=25)
vs = range(0, 2pi, length=25)
ws = unzip(F₀.(us, vs')) # make square
surface(ws..., legend=false)
plot!([-0.5,1.5], [0,0],[0,0])
```

The above figure shows a cone (the line ``y=x``) presented as a surface of revolution about the ``x``-axis.


To see why this formula is as it is, we look at the parameterized case, the first
one being a special instance with $g(t) = t$.

Let a partition of $[a,b]$ be given by $a = t_0 < t_1
< t_2 < \cdots < t_n = b$. This breaks the curve into a collection of
line segments. Consider the line segment connecting $(g(t_{i-1}),
f(t_{i-1}))$ to $(g(t_i), f(t_i))$. Rotating this around the $x$ axis
will generate something approximating a disc, but in reality will be
the frustum of a cone. What will be the surface area?

Consider a right-circular cone parameterized by an angle $\theta$ and
the largest radius $r$ (so that the height satisfies
$r/h=\tan(\theta)$).
If this cone were made of paper, cut along a side,
and laid out flat, it would form a sector of a circle, whose area
would be $(1/2)R^2\gamma$ where $R$ is the radius of the circle (also the
side length of our cone), and $\gamma$ an angle that we can figure out
from $r$ and $\theta$. To do this, we note that the arc length of the
circle's edge is $R\gamma$ and also the circumference of the bottom of
the cone, so $R\gamma = 2\pi r$. With all this, we can solve to get $A
= \pi r^2/\sin(\theta)$. But we have a frustum of a cone with radii
$r_0$ and $r_1$, so the surface area is a difference: $A =
\pi (r_1^2 - r_0^2) /\sin(\theta)$.

Relating this to our values in terms of $f$ and $g$, we have
$r_1=f(t_i)$, $r_0 = f(t_{i-1})$, and $\sin(\theta) = \Delta f /
\sqrt{(\Delta g)^2 + (\Delta f)^2}$, where $\Delta f = f(t_i) -
f(t_{i-1})$ and similarly for $\Delta g$.

Putting this all together, we get that the surface area generated by
rotating the line segment around the $x$ axis is

```math
\text{sa}_i = \pi (f(t_i)^2 - f(t_{i-1})^2) \cdot \sqrt{(\Delta g)^2 + (\Delta f)^2} / \Delta f =
\pi (f(t_i) + f(t_{i-1})) \cdot \sqrt{(\Delta g)^2 + (\Delta f)^2}.
```

(This is $2 \pi$ times the average radius times the slant height.)

As was done in the derivation of the formula for arc length, these
pieces are multiplied both top and bottom by $\Delta t =
t_{i} - t_{i-1}$. Carrying the bottom inside the square root and
noting that by the mean value theorem $\Delta g/\Delta t = g'(\xi)$
and $\Delta f/\Delta t = f'(\psi)$ for some $\xi$ and $\psi$ in
$[t_{i-1}, t_i]$, this becomes:

```math
\text{sa}_i = \pi (f(t_i) + f(t_{i-1})) \cdot \sqrt{(g'(\xi))^2 + (f'(\psi))^2} \cdot (t_i - t_{i-1}).
```

Adding these up, $\text{sa}_1 + \text{sa}_2 + \cdots + \text{sa}_n$,
we get a Riemann sum approximation to the integral

```math
\text{SA} = \int_a^b 2\pi f(t) \sqrt{g'(t)^2 + f'(t)^2} dt.
```

If we assume integrability of the integrand, then as our partition
size goes to zero, this approximate surface area converges to the
value given by the limit. (As with arc length, this needs a technical
adjustment to the Riemann integral theorem, as here we are evaluating the
integrand function at four points ($t_i$, $t_{i-1}$, $\xi$ and
$\psi$) and not just at some $c_i$.)

```julia; hold=true; echo=false; cache=true
## {{{approximate_surface_area}}}

xs,ys = range(-1, stop=1, length=50), range(-1, stop=1, length=50)
f(x,y)= 2 - (x^2 + y^2)

dr = [1/2, 3/4]
df = [f(dr[1],0), f(dr[2],0)]

function sa_approx_graph(i)
    p = plot(xs, ys, f, st=[:surface], legend=false)
    for theta in range(0, stop=i/10*2pi, length=10*i )
        path3d!(p,sin(theta)*dr, cos(theta)*dr, df)
    end
    p
end
n = 10

anim = @animate for i=1:n
    sa_approx_graph(i)
end

imgfile = tempname() * ".gif"
gif(anim, imgfile, fps = 1)


caption = L"""

Surface of revolution of $f(x) = 2 - x^2$ about the $y$ axis. The line segments are the images of rotating the secant line connecting $(1/2, f(1/2))$ and $(3/4, f(3/4))$. These trace out the frustum of a cone which approximates the corresponding surface area of the surface of revolution. In the limit, this approximation becomes exact and a formula for the surface area of surfaces of revolution can be used to compute the value.

"""

ImageFile(imgfile, caption)
```

#### Examples

Let's see that the surface area of an open cone follows from this
formula, even though we just saw how to get this value.

A cone can be envisioned as rotating the function $f(x) = x\tan(\theta)$ between $0$ and $h$ around the $x$ axis. This integral yields the surface area:

```math
\begin{align*}
\int_0^h 2\pi f(x) \sqrt{1 + f'(x)^2}dx
&= \int_0^h 2\pi x \tan(\theta) \sqrt{1 + \tan(\theta)^2}dx \\
&= 2\pi\tan(\theta)\sqrt{1 + \tan(\theta)^2}\, \frac{x^2}{2} \big|_0^h \\
&= \pi \tan(\theta) \sec(\theta) h^2 \\
&= \pi r^2 / \sin(\theta).
\end{align*}
```

(There are many ways to express this; we used $r$ and $\theta$ to
match the work above. If the cone is parameterized by a height $h$ and
radius $r$, then the surface area of the sides is $\pi r\sqrt{h^2 +
r^2}$. If the base is included, there is an additional $\pi r^2$
term.)

##### Example

Let the graph of $f(x) = x^2$ from $x=0$ to $x=1$ be rotated around the $x$ axis. What is the resulting surface area?

```math
\text{SA} = \int_a^b 2\pi f(x) \sqrt{1 + f'(x)^2}dx = \int_0^1 2\pi x^2 \sqrt{1 + (2x)^2} dx.
```

This integral is done by a trig substitution, but gets involved. We let `SymPy` do it:

```julia;
@syms x
F = integrate(2 * PI * x^2 * sqrt(1 + (2x)^2), x)
```

We show `F` only to demonstrate that indeed the integral is a bit
involved. The actual surface area follows from a *definite* integral, which we get through the fundamental theorem of calculus:

```julia;
F(1) - F(0)
```

### Plotting surfaces of revolution

The commands to plot a surface of revolution will be described more clearly later; for now we present them simply as a pattern to be followed in case plots are desired. Suppose the curve in the ``x-y`` plane is given parametrically by ``(g(u), f(u))`` for ``a \leq u \leq b``.


To be concrete, we parameterize the circle centered at ``(6,0)`` with radius ``2`` by:

```julia
g(u) = 6 + 2sin(u)
f(u) = 2cos(u)
a, b = 0, 2pi
```

The plot of this curve is:

```julia; hold=true
us = range(a, b, length=100)
plot(g.(us), f.(us), xlims=(-0.5, 9), aspect_ratio=:equal, legend=false)
plot!([0,0],[-3,3], color=:red, linewidth=5) # y axis emphasis
plot!([3,9], [0,0], color=:green, linewidth=5) # x axis emphasis
```

Though parametric plots have a convenience constructor, `plot(g, f, a, b)`, we constructed the points with `Julia`'s broadcasting notation, as we will need to do for a surface of revolution.
The `xlims` are adjusted to show the ``y`` axis, which is emphasized with a layered line. The line is drawn by specifying two points, ``(x_0, y_0)`` and ``(x_1, y_1)``, in the form `[x0,x1]` and `[y0,y1]`.

Now, to rotate this about the ``y`` axis, creating a surface plot, we have the following pattern:

```julia
S(u,v) = [g(u)*cos(v), g(u)*sin(v), f(u)]
us = range(a, b, length=100)
vs = range(0, 2pi, length=100)
ws = unzip(S.(us, vs')) # reorganize data
surface(ws..., zlims=(-6,6), legend=false)

plot!([0,0], [0,0], [-3,3], color=:red, linewidth=5) # y axis emphasis
```

The `unzip` function is not part of base `Julia`, rather part of
`CalculusWithJulia`. This function rearranges data into a form
consumable by the plotting methods like `surface`. In this case, the
result of `S.(us,vs')` is a grid (matrix) of points, the result of
`unzip` is three grids of values, one for the ``x`` values, one for
the ``y`` values, and one for the ``z`` values. A manual adjustment
to the `zlims` is used, as `aspect_ratio` does not have an effect with
the `plotly()` backend and errors on 3d graphics with `pyplot()`.

To rotate this about the ``x`` axis, we have this pattern:

```julia; hold=true
S(u,v) = [g(u), f(u)*cos(v), f(u)*sin(v)]
us = range(a, b, length=100)
vs = range(0, 2pi, length=100)
ws = unzip(S.(us,vs'))
surface(ws..., legend=false)

plot!([3,9], [0,0],[0,0], color=:green, linewidth=5) # x axis emphasis
```

The above pattern covers the case of rotating the graph of a function ``f(x)`` over ``[a,b]`` by taking ``g(t)=t``.


##### Example

Rotate the graph of $x^x$ from $0$ to $3/2$ around the $x$ axis. What is the surface area generated?

We work numerically for this one, as no antiderivative is forthcoming. Recall, the accompanying `CalculusWithJulia` package defines `f'` to return the automatic derivative through the `ForwardDiff` package.

```julia; hold=true
f(x) = x^x
a, b = 0, 3/2
val, _ = quadgk(x -> 2pi * f(x) * sqrt(1 + f'(x)^2), a, b)
val
```

(The function is not defined at $x=0$ mathematically, but on the computer it evaluates to $1$, the limiting value. Even were this not the case, the `quadgk` function doesn't evaluate the function at the points `a` and `b` that are specified.)


```julia; hold=true
g(u) = u
f(u) = u^u
S(u,v) = [g(u), f(u)*cos(v), f(u)*sin(v)]
us = range(0, 3/2, length=100)
vs = range(0, pi, length=100) # not 2pi (to see inside)
ws = unzip(S.(us,vs'))
surface(ws..., alpha=0.75)
```

We compare this answer to that of the frustum of a cone with radii $1$
and $(3/2)^{3/2}$, formed by rotating the line segment connecting $(0,f(0))$
with $(3/2,f(3/2))$. From looking at the graph of the surface, these values should be comparable. The surface area of
the cone part is $\pi (r_1^2 - r_0^2) / \sin(\theta) = \pi (r_1 + r_0)
\cdot \sqrt{(\Delta h)^2 + (r_1-r_0)^2}$.

```julia; hold=true
f(x) = x^x
r0, r1 = f(0), f(3/2)
pi * (r1 + r0) * sqrt((3/2)^2 + (r1-r0)^2)
```

##### Example

What is the surface area generated by Gabriel's Horn, the solid formed by
rotating $1/x$ for $x \geq 1$ around the $x$ axis?

```math
\text{SA} = \int_a^b 2\pi f(x) \sqrt{1 + f'(x)^2}dx =
\lim_{M \rightarrow \infty} \int_1^M 2\pi \frac{1}{x} \sqrt{1 + (-1/x^2)^2} dx.
```

We do this with `SymPy`:

```julia;
@syms M
ex = integrate(2PI * (1/x) * sqrt(1 + (-1/x^2)^2), (x, 1, M))
```

The limit as $M$ gets large is of interest. The only term that might get out of hand is the inverse hyperbolic sine term, `asinh(M^2)`. We check its limit:


```julia;
limit(asinh(M^2), M => oo)
```

So indeed it does. There is nothing to balance this out, so the integral will be infinite, as this shows:

```julia;
limit(ex, M => oo)
```

This figure would have infinite surface, were it possible to actually construct an infinitely long solid. (But it has been shown to have *finite* volume.)
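That contrast can be sketched numerically (assuming `QuadGK` is available): truncating the horn at ``x=M``, the volume ``\int_1^M \pi (1/x)^2 dx = \pi(1 - 1/M)`` stays below ``\pi``, while the surface-area integrand exceeds ``2\pi/x``, so the surface area grows at least like ``2\pi\log(M)``:

```julia
using QuadGK

vol(M) = quadgk(x -> pi * (1/x)^2, 1, M)[1]                       # volume of horn truncated at M
sa(M)  = quadgk(x -> 2pi * (1/x) * sqrt(1 + (1/x^2)^2), 1, M)[1]  # surface area up to M
vol(10^6), sa(10^6)   # volume approaches π; surface area keeps growing, roughly like 2π log(M)
```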

##### Example

The curve described parametrically by $g(t) = 2(1 + \cos(t))\cos(t)$
and $f(t) = 2(1 + \cos(t))\sin(t)$ from $0$ to $\pi$ is rotated about
the $x$ axis. Find the resulting surface area.

The graph shows half a heart; the resulting surface will resemble an apple.

```julia; hold=true
g(t) = 2(1 + cos(t)) * cos(t)
f(t) = 2(1 + cos(t)) * sin(t)
plot(g, f, 0, 1pi)
```


The integrand simplifies to $8\sqrt{2}\pi \sin(t) (1 + \cos(t))^{3/2}$. This lends itself to $u$-substitution with $u=\cos(t)$.

```math
\begin{align*}
\int_0^\pi 8\sqrt{2}\pi \sin(t) (1 + \cos(t))^{3/2} dt
&= 8\sqrt{2}\pi \int_1^{-1} (1 + u)^{3/2} (-1) du\\
&= 8\sqrt{2}\pi (2/5) (1+u)^{5/2} \big|_{-1}^1\\
&= 8\sqrt{2}\pi (2/5) 2^{5/2} = \frac{2^7 \pi}{5}.
\end{align*}
```

## The first Theorem of Pappus

The [first](http://tinyurl.com/le3lvb9) theorem of Pappus provides a
simpler means to compute the surface area if the distance of the centroid
from the axis ($\rho$) and the arc length of the curve ($L$) are
both known. In that case, the surface area satisfies:

```math
\text{SA} = 2 \pi \rho L
```

That is, the surface area is simply the circumference of the circle
traced out by the centroid of the curve times the length of the
curve - the distances rotated are collapsed to that of just the
centroid.

##### Example

The surface area of an open cone can be computed, as the arc length
is $\sqrt{h^2 + r^2}$ and the centroid of the line is a distance $r/2$
from the axis. This gives SA$=2\pi (r/2) \sqrt{h^2 + r^2} = \pi r
\sqrt{h^2 + r^2}$.

##### Example

We can get the surface area of a torus from this formula.

The torus is found by rotating the curve $(x-b)^2 + y^2 = a^2$ about
the $y$ axis. The centroid is a distance $b$ from the axis and the arc length is $2\pi a$, so the
surface area is $2\pi (b) (2\pi a) = 4\pi^2 a b$.
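As a numeric check of Pappus' result (a sketch, assuming `QuadGK` is available), parameterize the circle so that ``f(t) = b + a\cos(t)`` is the distance to the axis and ``g(t) = a\sin(t)`` runs along it; the surface-of-revolution formula then recovers ``4\pi^2 ab``:

```julia
using QuadGK

a, b = 2, 6
g(t) = a * sin(t)        # coordinate along the axis of rotation
f(t) = b + a * cos(t)    # distance from the axis
dg(t) = a * cos(t)       # g'
df(t) = -a * sin(t)      # f'
val, _ = quadgk(t -> 2pi * f(t) * sqrt(dg(t)^2 + df(t)^2), 0, 2pi)
val - 4pi^2 * a * b   # essentially 0
```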
-
-A torus with ``a=2`` and ``b=6``
-
-```julia; hold=true; echo=false
-a,b = 2, 6
-F₀(u,v) = [(b + a*cos(u))*cos(v), (b + a*cos(u))*sin(v), a*sin(u)]
-us = vs = range(0, 2pi, length=35)
-ws = unzip(F₀.(us, vs'))
-surface(ws..., legend=false, zlims=(-12,12))
-```
-
-##### Example
-
-The surface area of a sphere will be SA$=2\pi \rho (\pi r) = 2 \pi^2 r
-\cdot \rho$. What is $\rho$? The centroid of an arc formula can be
-derived in a manner similar to that of the centroid of a region. The
-formulas are:
-
-```math
-\begin{align*}
-\text{cm}_x &= \frac{1}{L} \int_a^b g(t) \sqrt{g'(t)^2 + f'(t)^2} dt\\
-\text{cm}_y &= \frac{1}{L} \int_a^b f(t) \sqrt{g'(t)^2 + f'(t)^2} dt.
-\end{align*}
-```
-
-Here, $L$ is the arc length of the curve.
-
-For the sphere parameterized by $g(t) = r \cos(t)$, $f(t) = r\sin(t)$, we get that these become
-
-```math
-\text{cm}_x = \frac{1}{L}\int_0^\pi r\cos(t) \sqrt{r^2(\sin(t)^2 + \cos(t)^2)} dt = \frac{1}{L}r^2 \int_0^\pi \cos(t) dt = 0.
-```
-
-
-```math
-\text{cm}_y = \frac{1}{L}\int_0^\pi r\sin(t) \sqrt{r^2(\sin(t)^2 + \cos(t)^2)} dt = \frac{1}{L}r^2 \int_0^\pi \sin(t) dt = \frac{1}{\pi r} r^2 \cdot 2 = \frac{2r}{\pi}.
-```
-
-Combining this, we see that the surface area of a sphere is $2 \pi^2 r (2r/\pi) = 4\pi r^2$, by Pappus' Theorem.
-
-## Questions
-
-
-##### Question
-
-
-The graph of $f(x) = \sin(x)$ from $0$ to $\pi$ is rotated around the
-$x$ axis. After a $u$-substitution, what integral would give the surface area generated?
-
-```julia; hold=true; echo=false
-choices = [
-"``-\\int_1^{-1} 2\\pi \\sqrt{1 + u^2} du``",
-"``-\\int_1^{-1} 2\\pi u \\sqrt{1 + u^2} du``",
-"``-\\int_1^{-1} 2\\pi u^2 \\sqrt{1 + u} du``"
-]
-answ = 1
-radioq(choices, answ)
-```
-
-Though the integral can be computed by hand, give a numeric value. 
-
-```julia; hold=true; echo=false
-f(x) = sin(x)
-a, b = 0, pi
-val, _ = quadgk(x -> 2pi* f(x) * sqrt(1 + f'(x)^2), a, b)
-numericq(val)
-```
-
-##### Question
-
-The graph of $f(x) = \sqrt{x}$ from $0$ to $4$ is rotated around the
-$x$ axis. Numerically find the surface area generated.
-
-```julia; hold=true; echo=false
-f(x) = sqrt(x)
-a, b = 0, 4
-val, _ = quadgk(x -> 2pi* f(x) * sqrt(1 + f'(x)^2), a, b)
-numericq(val)
-```
-
-##### Question
-
-Find the surface area generated by revolving the graph of the function
-$f(x) = x^3/9$ from $x=0$ to $x=2$ around the $x$ axis. This can be done by hand or numerically.
-
-```julia; hold=true; echo=false
-f(x) = x^3/9
-a, b = 0, 2
-val, _ = quadgk(x -> 2pi* f(x) * sqrt(1 + f'(x)^2), a, b)
-numericq(val)
-```
-
-##### Question
-
-(From Stewart.) If a loaf of bread is in the form of a sphere of radius $1$, the
-amount of crust for a slice depends on the width, but not where in the
-loaf it is sliced.
-
-That is, this integral with $f(x) = \sqrt{1 - x^2}$ and $u, u+h$ in $[-1,1]$ does not depend on $u$:
-
-```math
-A = \int_u^{u+h} 2\pi f(x) \sqrt{1 + f'(x)^2} dx.
-```
-
-If we let $f(x) = y$ then $f'(x) = -x/y$. With this, what does the integral above come down to after cancellations:
-
-```julia; hold=true; echo=false
-choices = [
-"``\\int_u^{u+h} 2\\pi dx``",
-"``\\int_u^{u+h} 2\\pi y dx``",
-"``\\int_u^{u+h} 2\\pi x dx``"
-]
-answ = 1
-radioq(choices, answ)
-```
-
-##### Question
-
-Find the surface area of the dome of a sphere generated by rotating
-the curve generated by $g(t) = \cos(t)$ and $f(t) = \sin(t)$ for $t$
-in $0$ to $\pi/6$.
-
-Numerically find the value.
-
-```julia; hold=true; echo=false
-g(t) = cos(t)
-f(t) = sin(t)
-a, b = 0, pi/6
-val, _ = quadgk(t -> 2pi* f(t) * sqrt(g'(t)^2 + f'(t)^2), a, b)
-numericq(val)
-```
-
-
-##### Question
-
-The [astroid](http://www-history.mcs.st-and.ac.uk/Curves/Astroid.html)
-is parameterized by $g(t) = a\cos(t)^3$ and $f(t) = a \sin(t)^3$. 
Let
-$a=1$ and rotate the curve from $t=0$ to $t=\pi$ around the $x$
-axis. What is the surface area generated?
-
-
-```julia; hold=true; echo=false
-g(t) = cos(t)^3
-f(t) = sin(t)^3
-a, b = 0, pi
-val, _ = quadgk(t -> 2pi* f(t) * sqrt(g'(t)^2 + f'(t)^2), a, b)
-numericq(val)
-```
-
-
-##### Question
-
-For the curve parameterized by $g(t) = a\cos(t)^5$ and $f(t) = a \sin(t)^5$, let
-$a=1$ and rotate the curve from $t=0$ to $t=\pi$ around the $x$
-axis. Numerically find the surface area generated.
-
-
-```julia; hold=true; echo=false
-g(t) = cos(t)^5
-f(t) = sin(t)^5
-a, b = 0, pi
-val, _ = quadgk(t -> 2pi* f(t) * sqrt(g'(t)^2 + f'(t)^2), a, b)
-numericq(val)
-```
diff --git a/CwJ/integrals/volumes_slice.jmd b/CwJ/integrals/volumes_slice.jmd
deleted file mode 100644
index 5a595f9..0000000
--- a/CwJ/integrals/volumes_slice.jmd
+++ /dev/null
@@ -1,782 +0,0 @@
-# Volumes by slicing
-
-This section uses these add-on packages:
-
-```julia
-using CalculusWithJulia
-using Plots
-using QuadGK
-using Unitful, UnitfulUS
-using Roots
-using SymPy
-```
-
-```julia; echo=false; results="hidden"
-using CalculusWithJulia.WeaveSupport
-
-import LinearAlgebra: norm
-
-const frontmatter = (
-    title = "Volumes by slicing",
-    description = "Calculus with Julia: Volumes by slicing",
-    tags = ["CalculusWithJulia", "integrals", "volumes by slicing"],
-);
-nothing
-```
-
-----
-
-```julia; hold=true; echo=false
-imgfile = "figures/michelin-man.jpg"
-caption = """
-
-Hey Michelin Man, how much does that costume weigh?
-
-"""
-ImageFile(:integrals, imgfile, caption)
-```
-
-
-An ad for a summer job says work as the Michelin Man! Sounds
-promising, but how much will that costume weigh? A very hot summer may
-make walking around in a heavy costume quite uncomfortable.
-
-A back-of-the-envelope calculation would start by
-
-* Mentally separating out each "tire" and lining them up one by one.
-
-* Counting the number of "tires" (or rings), say $n$. 
-
-* Estimating the radius for each tire, say $r_i$ for $1 \leq i \leq n$.
-
-* Estimating the height for each tire, say $h_i$ for $1 \leq i \leq n$.
-
-Then the volume would be found by adding:
-
-```math
-V = \pi \cdot r_1^2 \cdot h_1 + \pi \cdot r_2^2 \cdot h_2 + \cdots + \pi \cdot r_n^2 \cdot h_n.
-```
-
-The weight would come by multiplying the volume by some appropriate density.
-
-Looking at the sum though, we see the makings of an approximate
-integral. If the heights were to get infinitely small, we might expect
-this to approach something like $V=\int_a^b \pi r(h)^2 dh$.
-
-In fact, we have in general:
-
-> **Volume of a figure with a known cross section**: The volume of a solid with known cross-sectional
-> area $A_{xc}(x)$ from $x=a$ to $x=b$ is given by
->
-> `` V = \int_a^b A_{xc}(x) dx.``
->
-> This assumes $A_{xc}(x)$ is integrable.
-
-This formula is derived by approximating the volume by "slabs" with
-volume $A_{xc}(x) \Delta x$ and using the Riemann integral's definition to
-pass to the limit. The discs of the Michelin man are an example, where
-the cross-sectional area is just that of a circle, or $\pi r^2$.
-
-## Solids of revolution
-
-We begin with some examples of a special class of solids - solids of
-revolution. These have an axis of symmetry about which the slabs are
-then just circular disks.
-
-Consider the volume contained in this glass; it will depend on the radius at different values of $x$:
-
-```julia; hold=true; echo=false
-imgfile = "figures/integration-glass.jpg"
-caption = L"""
-
-A wine glass oriented so that it is seen as generated by revolving a
-curve about the $x$ axis. The radius of revolution varies as a function of $x$
-between about $0$ and $6.2$cm.
-
-"""
-ImageFile(:integrals, imgfile, caption)
-```
-
-
-If $r(x)$ is the radius as a function of $x$, then the cross sectional
-area is $\pi r(x)^2$ so the volume is given by:
-
-```math
-V = \int_a^b \pi r(x)^2 dx.
-```
-
-!!! 
note - The formula is for a rotation around the $x$-axis, but can easily be generalized to rotating around any line (say the $y$-axis or $y=x$, ...) just by adjusting what $r(x)$ is taken to be. - - -For a numeric example, we consider the original Red -[Solo](http://en.wikipedia.org/wiki/Red_Solo_Cup) Cup. The dimensions -of the cup were basically: a top diameter of $d_1 = 3~ \frac{3}{4}$ -inches, a bottom diameter of $d_0 = 2~ \frac{1}{2}$ inches and a height of $h = -4~ \frac{3}{4}$ inches. - -The central axis is straight down. If we rotate the cup so this is the $x$-axis, then we can get - -```math -r(x) = \frac{d_0}{2} + \frac{d_1/2 - d_0/2}{h}x = \frac{5}{4} + \frac{5}{38}x -``` - -The volume in cubic inches will be: - -```math -V = \int_0^h \pi r(x)^2 dx -``` - -This is - -```julia; -d0, d1, h = 2.5, 3.75, 4.75 -rad(x) = d0/2 + (d1/2 - d0/2)/h * x -vol, _ = quadgk(x -> pi * rad(x)^2, 0, h) -``` - -So $36.9 \text{in}^3$. How many ounces is that? It is useful to know -that 1 [gallon](http://en.wikipedia.org/wiki/Gallon) of water is -defined as $231$ cubic inches, contains $128$ ounces, and weighs $8.34$ -pounds. - -So our cup holds this many ounces: - -```julia; -ozs = vol / 231 * 128 -``` - -Full it is about $20$ ounces, though this doesn't really account for the volume taken up by the bottom of the cup, etc. - -If you are poor with units, `Julia` can provide some help through the `Unitful` package. Here the additional `UnitfulUS` package must also be included, as was done above, to access fluid ounces: - -```julia -vol * u"inch"^3 |> us"floz" -``` - - -Before Solo "squared" the cup, the Solo cup had markings that - [some thought](http://www.snopes.com/food/prepare/solocups.asp) - indicated certain volume amounts. - -```julia; hold=true; echo=false -imgfile = "figures/red-solo-cup.jpg" -caption = "Markings on the red Solo cup indicated various volumes" -ImageFile(:integrals, imgfile, caption) -``` - -What is the height for $5$ ounces (for a glass of wine)? 
-
-$12$ ounces (for a beer unit)?
-
-Here the volume is fixed, but the height is not. For $v$ ounces, we need to convert to cubic inches. The conversion is
-$1$ ounce is $231/128 \text{in}^3$.
-
-So we need to solve $v \cdot (231/128) = \int_0^h\pi r(x)^2 dx$ for $h$ when $v=5$ and $v=12$.
-
-Let's express volume as a function of $h$:
-
-```julia;
-Vol(h) = quadgk(x -> pi * rad(x)^2, 0, h)[1]
-```
-
-Then to solve we have:
-
-```julia;
-v₅ = 5
-h5 = find_zero(h -> Vol(h) - v₅ * 231 / 128, 4)
-```
-
-and
-
-```julia;
-v₁₂ = 12
-h12 = find_zero(h -> Vol(h) - v₁₂ * 231 / 128, 4)
-```
-
-As a percentage of the total height, these are:
-
-```julia;
-h5/h, h12/h
-```
-
-!!! note
-    Were performance at issue, Newton's method might also have been considered here, as the derivative is easily computed by the fundamental theorem of calculus.
-
-##### Example
-
-By rotating the line segment $x/r + y/h=1$ that sits in the first
-quadrant around the $y$ axis, we will generate a right-circular
-cone. Its volume can be expressed through the above formula
-by noting the radius, as a function of $y$, will be $R = r(1 -
-y/h)$. This gives the well-known volume of a cone:
-
-```julia; hold=true
-@syms r h x y
-R = r*(1 - y/h)
-integrate(pi*R^2, (y, 0, h))
-```
-
-The frustum of a cone is simply viewed as a cone with its top cut off. If the full cone would have height $h_1$ but is cut off at height $h_0$, the volume that remains is the difference of two cone volumes, $\int_0^{h_1} \pi r(y)^2 dy - \int_{h_0}^{h_1} \pi r(y)^2 dy$, the second integral being the volume of the removed tip, itself a cone.
-
-It is not unusual to parameterize a cone by the angle $\theta$ it
-makes and the height. Since $r/h=\tan\theta$, this gives the formula
-$V = \pi/3\cdot h^3\tan(\theta)^2$.
-
-##### Example
-
-[Gabriel's](http://tinyurl.com/8a6ygv) horn is a geometric figure of mathematics - but not the real world - which has infinite height, but not volume! The figure is found by rotating the curve $y=1/x$ around the $x$ axis from $1$ to $\infty$. 
If the volume formula holds, what is the volume of this "horn?"
-
-```julia;
-radius(x) = 1/x
-quadgk(x -> pi*radius(x)^2, 1, Inf)[1]
-```
-
-That is a value very reminiscent of $\pi$, which it is, as $\int_1^\infty 1/x^2 dx = -1/x\big|_1^\infty=1$.
-
-!!! note
-    The interest in this figure is that soon we will be able to show that
-    it has **infinite** surface area, leading to the
-    [paradox](http://tinyurl.com/osawwqm) that it seems possible to fill
-    it with paint, but not paint the outside.
-
-##### Example
-
-A movie studio hand is asked to find a prop vase to be used as a
-[Ming vase](http://en.wikipedia.org/wiki/Chinese_ceramics) in an
-upcoming scene. The dimensions specified are for the outside diameter
-in centimeters and are given by
-
-```math
-d(h) = \begin{cases}
-2 \sqrt{26^2 - (h-20)^2} & 0 \leq h \leq 44\\
-20 \cdot e^{-(h - 44)/10} & 44 < h \leq 50.
-\end{cases}
-```
-
-If the vase were solid, what would be the volume?
-
-We define `d` using a ternary operator to handle the two cases:
-
-```julia;
-d(h) = h <= 44 ? 2*sqrt(26^2 - (h-20)^2) : 20 * exp(-(h-44)/10)
-rad(h) = d(h)/2
-```
-
-The volume in cm$^3$ is then:
-
-```julia;
-Vₜ, _ = quadgk(h -> pi * rad(h)^2, 0, 50)
-```
-
-For the actual shoot, the vase is to be filled with ash, to simulate a
-funeral urn. (It will then be knocked over in a humorous manner, of
-course.) How much ash is needed if the vase has walls that are $1/2$
-centimeter thick?
-
-We need to subtract $1/2$ from the radius, adjust the lower limit to account for the thickness of the base, and then recompute:
-
-```julia;
-V_int, _ = quadgk(h -> pi * (rad(h) - 1/2)^2, 1/2, 50)
-```
-
-A liter of volume is $1000 \text{cm}^3$. So this is about $68$ liters, or more than $15$
-gallons. Perhaps the dimensions given were a bit off.
-
-
-While we are here, the actual volume of the material in the vase can be computed by subtraction. 
- -```julia; -Vₜ - V_int -``` - -### The washer method - -Returning to the Michelin Man, in our initial back-of-the-envelope calculation we didn't account for the fact that a tire isn't a disc, as it has its center cut out. Returning, suppose $R_i$ is the outer radius and $r_i$ the inner radius. Then each tire has volume - -```math -\pi R_i^2 h_i - \pi r_i^2 h_i = \pi (R_i^2 - r_i^2) h_i. -``` - -Rather than use $\pi r(x)^2$ for a cross section, we would use $\pi (R(x)^2 - r(x)^2)$. - -In general we call a shape like the tire a "washer" and use this formula for a washer's cross section -$A_{xc}(x) = \pi(R(x)^2 - r(x)^2)$. - -Then the volume for the solid of revolution whose cross sections are washers would be: - -```math -V = \int_a^b \pi \cdot (R(x)^2 - r(x)^2) dx. -``` - -##### Example - -An artist is working with a half-sphere of material, and wishes to -bore out a conical shape. What would be the resulting volume, if the -two figures are modeled by - -```math -R(x) = \sqrt{1^2 - (x-1)^2}, \quad r(x) = x, -``` - -with $x$ ranging from $x=0$ to $1$? - -The answer comes by integrating: - -```julia; hold=true -Rad(x) = sqrt(1 - (x-1)^2) -rad(x) = x -V, _ = quadgk(x -> pi*(Rad(x)^2 - rad(x)^2), 0, 1) -``` - -## Solids with known cross section - -The Dart cup company now produces the red solo cup with a -[square](http://www.solocup.com/products/squared-plastic-cup/) cross -section. Suppose the dimensions are the same: a top diameter of $d_1 = -3 3/4$ inches, a bottom diameter of $d_0 = 2 1/2$ inches and a height -of $h = 4 3/4$ inches. What is the volume now? - -The difference, of course, is that cross sections now have area $d^2$, as opposed to $\pi r^2$. This leads to some difference, which we quantify, as follows: - -```julia; hold=true -d0, d1, h = 2.5, 3.75, 4.75 -d(x) = d0 + (d1 - d0)/h * x -vol, _ = quadgk(x -> d(x)^2, 0, h) -vol / 231 * 128 -``` - -This shape would have more volume - the cross sections are bigger. Presumably -the dimensions have changed. 
Without going out and buying a cup, let's
-assume the cross-sectional *diagonal* remained the same, not the
-side length. This means the largest dimension is the same as the old
-diameter, while the side of the square is smaller by a factor of
-$\sqrt{2}$. What would this do to the volume?
-
-We could do this two ways: divide $d_0$ and $d_1$ by $\sqrt{2}$ and
-recompute. However, each cross section of this narrower cup would
-simply be $\sqrt{2}^2 = 2$ times smaller, so the total volume would be
-halved, about $13$ ounces. As $26.04$ is too big and $13.02$ is too
-small, some other overall dimensions must be used.
-
-##### Example
-
-For a general cone, we use this [definition](http://en.wikipedia.org/wiki/Cone):
-
-> A cone is the solid figure bounded by a base in a plane and by a
-> surface (called the lateral surface) formed by the locus of all
-> straight line segments joining the apex to the perimeter of the
-> base.
-
-Let $h$ be the distance from the apex to the base. Consider cones with the property that all planes parallel to the base intersect the cone with the same shape, though perhaps a different scale. This figure shows an example, with the rays coming from the apex defining the volume.
-
-```julia; hold=true; echo=false
-h = 5
-R, r, rho = 1, 1/4, 1/4
-f(t) = (R-r) * cos(t) + rho * cos((R-r)/r * t)
-g(t) = (R-r) * sin(t) - rho * sin((R-r)/r * t)
-ts = range(0, 2pi, length=100)
-
-p = plot(f.(ts), g.(ts), zero.(ts), legend=false)
-for t ∈ range(0, 2pi, length=25)
-    plot!(p, [0,f(t)], [0,g(t)], [h, 0], linecolor=:red)
-end
-p
-```
-
-
-A right circular cone is one where this shape is a circle. This
-definition can be more general, as a square-based right pyramid is also
-such a cone. After possibly reorienting the cone in space so the base
-is at $u=0$ and the apex at $u=h$ the volume of the cone can be found
-from:
-
-```math
-V = \int_0^h A_{xc}(u) du. 
-```
-
-The cross sectional area $A_{xc}(u)$ satisfies a formula in terms of $A_{xc}(0)$, the area of the base:
-
-```math
-A_{xc}(u) = A_{xc}(0) \cdot (1 - \frac{u}{h})^2
-```
-
-So, with the substitution $v = 1 - u/h$, the integral becomes:
-
-```math
-V = \int_0^h A_{xc}(u) du = A_{xc}(0) \int_0^h (1 - \frac{u}{h})^2 du = A_{xc}(0) \cdot h \int_0^1 v^2 dv = A_{xc}(0) \frac{h}{3}.
-```
-
-This gives a general formula for the volume of such cones.
-
-
-
-### Cavalieri's method
-
-[Cavalieri's](http://tinyurl.com/oda9xd9) Principle is "Suppose two
-regions in three-space (solids) are included between two parallel
-planes. If every plane parallel to these two planes intersects both
-regions in cross-sections of equal area, then the two regions have
-equal volumes." (Wikipedia).
-
-
-With the formula for the volume of solids based on cross sections,
-this is a trivial observation, as the functions giving the
-cross-sectional area are identical. Still, it can be surprising.
-Consider a sphere with an interior cylinder bored out of it. (The
-[Napkin](http://tinyurl.com/o237v83) ring problem.) The bore has
-height $h$ - for larger radius spheres this means very wide bores.
-
-
-```julia; hold=true; echo=false
-#The following illustrates $R=5$ and $h=8$. 
-
-R = 5; h1 = 8
-
-theta = asin(h1/2/R)
-thetas = range(-theta, stop=theta, length=100)
-ts = range(-pi, stop=pi, length=100)
-y = h1/4
-
-p = plot(legend=false, aspect_ratio=:equal);
-plot!(p, R*cos.(ts), R*sin.(ts));
-plot!(p, R*cos.(thetas), R*sin.(thetas), color=:orange);
-
-plot!(p, [R*cos(theta), R*cos(theta)], [h1/2, -h1/2], color=:orange);
-plot!(p, [R*cos(theta), sqrt(R^2 - y^2)], [y, y], color=:orange)
-
-plot!(p, [0, R*cos(theta)], [0,0], color=:red);
-plot!(p, [0, R*cos(theta)], [0,h1/2], color=:red);
-
-annotate!(p, [(.5, -2/3, "sqrt(R²- (h/2)²)"),
-              (R*cos(theta)-.6, h1/4, "h/2"),
-              (1.5, 1.75*tan(theta), "R")])
-
-p
-
-```
-
-
-The small orange line is rotated, so using the washer method we get the cross sections given by $\pi(r_0^2 - r_i^2)$, the outer and inner radii, as a function of $y$.
-
-The outer radius has points $(x,y)$ satisfying $x^2 + y^2 = R^2$, so it is $\sqrt{R^2 - y^2}$. The inner radius has a constant value, and as indicated in the figure, is $\sqrt{R^2 - (h/2)^2}$, by the Pythagorean theorem.
-
-Thus the cross sectional area is
-
-```math
-\pi( (\sqrt{R^2 - y^2})^2 - (\sqrt{R^2 - (h/2)^2})^2 )
-= \pi ((R^2 - y^2) - (R^2 - (h/2)^2))
-= \pi ((\frac{h}{2})^2 - y^2)
-```
-
-As this does not depend on $R$, and the limits of integration would
-always be $-h/2$ to $h/2$, by Cavalieri's principle the volume of the
-solid will be independent of $R$ too.
-
-To actually compute this volume, we take $R=h/2$, so that the bore
-hole is just a line of no volume; the resulting volume is then that of
-a sphere with radius $h/2$, or $4/3\pi(h/2)^3 = \pi h^3/6$.
-
-## The second theorem of Pappus
-
-The second theorem of [Pappus](http://tinyurl.com/l43vw4) says that if
-a plane figure $F$ is rotated around an axis to form a solid of
-revolution, the total volume can be written as $2\pi r A(F)$, where
-$r$ is the distance the centroid is from the axis of revolution, and
-$A(F)$ is the area of the plane figure. In short, the distance
-traveled by the centroid times the area.
-
-This can make some computations trivial. For example, we can make
-a torus (or donut) by rotating the circle $(x-2)^2 + y^2 = 1$ about the
-$y$ axis. As the centroid is clearly $(2, 0)$, with $r=2$ in the above
-formula, and the area of the circle is $\pi 1^2$, the volume of the
-donut is $2\pi(2)(\pi) = 4\pi^2$.
-
-##### Example
-
-Above, we found the volume of a cone, as it is a solid of revolution,
-through the general formula. However, parameterizing the cone as the
-revolution of a triangle with vertices $(0,0)$, $(r, 0)$, and $(0,h)$
-and using the formula for the center of mass in the $x$ direction of
-such a triangle, $r/3$, we get that the volume of a cone with
-height $h$ and radius $r$ is $2\pi (r/3)\cdot (rh/2) = \pi r^2 h/3$, in agreement with the calculus based computation.
-
-## Questions
-
-###### Question
-
-
-Consider this big Solo cup:
-
-```julia; hold=true; echo=false
-imgfile ="figures/big-solo-cup.jpg"
-caption = " Big solo cup. "
-ImageFile(:integrals, imgfile, caption)
-```
-
-It has approximate dimensions: smaller radius $5$ feet, upper radius $8$ feet and height $15$ feet. How many gallons is it?
-At $8$ pounds a gallon this would be pretty heavy!
-
-Two facts are useful:
-
-* a cubic foot is 7.48052 gallons
-* the radius as a function of height is $r(h) = 5 + (3/15)\cdot h$
-
-```julia; hold=true; echo=false
-gft = 7.48052
-rad(h) = 5 + (3/15)*h
-a,err = quadgk(h -> pi*rad(h)^2, 0, 15)
-val = a*gft
-numericq(val, 1e1)
-```
-
-
-###### Question
-
-In *Glass Shape Influences Consumption Rate* for Alcoholic
-[Beverages](http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0043007)
-the authors demonstrate that the shape of the glass can have an effect
-on the rate of consumption, presumably because people drink faster when they
-aren't sure how much they have left. In particular, they comment that
-people have difficulty judging the half-finished-by-volume mark. 
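The difficulty is easy to simulate. The following sketch — stdlib Python rather than the `Julia` of this text; the linearly widening glass profile and its dimensions are invented for illustration — uses bisection to find the height at which such a glass is half full by volume, and that height lands well above half the glass's height:

```python
from math import pi

def radius(x):
    # a made-up, linearly widening glass profile
    return 1 + x / 5

def volume(h, n=4000):
    # V(h) = integral of pi*radius(x)^2 from 0 to h, midpoint rule
    dx = h / n
    return sum(pi * radius((i + 0.5) * dx) ** 2 for i in range(n)) * dx

H = 10.0                # total height of the hypothetical glass
half = volume(H) / 2    # half of the total volume

lo, hi = 0.0, H         # bisection: find h with volume(h) == half
for _ in range(60):
    mid = (lo + hi) / 2
    if volume(mid) < half:
        lo = mid
    else:
        hi = mid

print(mid, H / 2)       # the half-volume mark sits well above H/2
```

Since the glass widens toward the top, most of the volume sits high up, so the half-volume mark is noticeably above the halfway height.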
-
-This figure shows some of the wide variety of beer-serving glasses:
-
-```julia; hold=true; echo=false
-imgfile ="figures/beer_glasses.jpg"
-caption = "A variety of different serving glasses for beer."
-ImageFile(:integrals, imgfile, caption)
-```
-
-We work with metric units, as there is a natural relation between
-volume in cm$^3$ and liquid measure ($1$ liter = $1000$ cm$^3$, so a $16$-oz
-pint glass is roughly $450$ cm$^3$.)
-
-Let two glasses be given as follows. A typical pint glass with linearly increasing radius:
-
-```math
-r(h) = 3 + \frac{1}{5}h, \quad 0 \leq h \leq b;
-```
-
-and a curved-edge one:
-
-```math
-s(h) = 3 + \log(1 + h), \quad 0 \leq h \leq b
-```
-
-The following functions find the volume as a function of height, $h$:
-
-
-```julia;
-r1(h) = 3 + h/5
-s1(h) = 3 + log(1 + h)
-r_vol(h) = quadgk(x -> pi*r1(x)^2, 0, h)[1]
-s_vol(h) = quadgk(x -> pi*s1(x)^2, 0, h)[1]
-```
-
-* For the straight-sided glass find $h$ so that the volume is $450$.
-
-```julia; echo=false
-h450 = find_zero(h -> r_vol(h) - 450, 10)
-numericq(h450)
-```
-
-* For the straight-sided glass find $h$ so that the volume is $225$ (half full).
-
-```julia; echo=false
-h225 = find_zero(h -> r_vol(h) - 225, 10)
-numericq(h225)
-```
-
-* For the straight-sided glass, what is the percentage of the total height when the glass is half full? (For a cylinder it would just be $50$.)
-
-```julia; echo=false
-numericq(h225/h450 * 100, 2, units="percent")
-```
-
-* People often confuse the half-way by height amount for the half way
-  by volume, as it is for the cylinder. Take the height for the
-  straight-sided glass filled with $450$ ml, divide it by $2$, then
-  compute the percentage of volume at the half way height to the
-  original.
-
-```julia; echo=false
-numericq(r_vol(h450/2)/450*100, 2, units="percent")
-```
-
-----
-
-
-* For the curved-sided glass find $h$ so that the volume is $450$. 
-
-```julia; echo=false
-h_450 = find_zero(h -> s_vol(h) - 450, 10)
-numericq(h_450)
-```
-
-* For the curved-sided glass find $h$ so that the volume is $225$ (half full).
-
-```julia; echo=false
-h_225 = find_zero(h -> s_vol(h) - 225, 10)
-numericq(h_225)
-```
-
-* For the curved-sided glass, what is the percentage of the total height when the glass is half full? (For a cylinder it would just be $50$.)
-
-```julia; echo=false
-numericq(h_225/h_450 * 100, 2, units="percent")
-```
-
-* People often confuse the half-way by height amount for the half way
-  by volume, as it is for the cylinder. Take the height for the
-  curved-sided glass filled with $450$ ml, divide it by $2$, then
-  compute the percentage of volume at the half way height to the
-  original.
-
-```julia; echo=false
-numericq(s_vol(h_450/2)/450*100, 2, units="percent")
-```
-
-###### Question
-
-A right [pyramid](http://en.wikipedia.org/wiki/Pyramid_%28geometry%29) has its apex (top point) above the centroid of its base, and for our purposes, each of its cross sections. Suppose a pyramid has square base of dimension $w$ and height of dimension $h$.
-
-Will this integral give the volume:
-
-```math
-V = \int_0^h w^2 (1 - \frac{y}{h})^2 dy?
-```
-
-```julia; hold=true; echo=false
-yesnoq("yes")
-```
-
-What is the volume?
-
-```julia; hold=true; echo=false
-choices = [
-"``1/3 \\cdot b\\cdot h``",
-"``1/3 \\cdot w^2\\cdot h``",
-"``l\\cdot w \\cdot h/ 3``"
-]
-answ = 2
-radioq(choices, answ)
-```
-
-###### Question
-
-An ellipsoid is formed by rotating the region in the first and second
-quadrants bounded by the ellipse $(x/2)^2 + (y/3)^2=1$ and the $x$
-axis around the $x$ axis. What is the volume of this ellipsoid? Find it numerically. 
-
-```julia; hold=true; echo=false
-f(x) = 3*sqrt( 1 - (x/2)^2 )
-val, _ = quadgk(x -> pi * f(x)^2, -2, 2)
-numericq(val)
-```
-
-###### Question
-
-An ellipsoid is formed by rotating the region in the first and second
-quadrants bounded by the ellipse $(x/a)^2 + (y/b)^2=1$ and the $x$
-axis around the $x$ axis. What is the volume of this ellipsoid? Find it symbolically.
-
-```julia; hold=true; echo=false
-choices = [
-"``4/3 \\cdot \\pi a b^2``",
-"``4/3 \\cdot \\pi a^2 b``",
-"``\\pi/3 \\cdot a b^2``"
-]
-answ = 1
-radioq(choices, answ)
-```
-
-###### Question
-
-A solid is generated by rotating the region enclosed by the graph
-$y=\sqrt{x}$, the lines $x=1$, $x=2$, and $y=1$ about the $x$
-axis. Find the volume of the solid.
-
-```julia; hold=true; echo=false
-Ra(x) = sqrt(x)
-ra(x) = 1
-a,b = 1,2
-val, _ = quadgk(x -> pi * (Ra(x)^2 - ra(x)^2), a,b)
-numericq(val)
-```
-
-###### Question
-
-The region enclosed by the graphs of $y=x^3 - 1$ and $y=x-1$ is rotated around the $y$ axis. What is the volume of the solid?
-
-```julia; hold=true;
-@syms x
-plot(x^3 - 1, 0, 1, legend=false)
-plot!(x-1)
-```
-
-```julia; hold=true; echo=false
-Ra(y) = cbrt(y+1)
-ra(y) = y + 1
-a, b = -1, 0
-val, _ = quadgk(y -> pi * (Ra(y)^2 - ra(y)^2), a, b)
-numericq(val)
-```
-
-###### Question
-
-Rotate the region in the first quadrant bounded by $y=e^x$, the line $x=\log(2)$, and the coordinate axes about the line $x=\log(2)$.
-
-(Be careful, the radius in the formula $V=\int_a^b \pi r(u)^2 du$ is measured from the line $x=\log(2)$; for $y$ values below $1$ it is the constant $\log(2)$.)
-
-```julia; hold=true; echo=false
-a, b = 0, exp(log(2))
-ra(y) = y <= 1 ? log(2) : log(2) - log(y)
-val, _ = quadgk(y -> pi * ra(y)^2, a, b)
-numericq(val)
-```
-
-###### Question
-
-Find the volume of rotating the region bounded by the line $y=x$, $x=1$ and the $x$-axis around the line $y=x$. (The Theorem of Pappus is convenient, as is the fact that the centroid of the triangular region lies at $(2/3, 1/3)$.) 
-
-```julia; hold=true; echo=false
-cm = [2/3, 1/3]
-c = [1/2, 1/2]
-rr = norm(cm - c)
-A = 1/2 * 1 * 1
-val = 2pi * rr * A
-numericq(val)
-```
-
-###### Question
-
-Rotate the region bounded by the line $y=x$ and the function $f(x) = x^2$ about the line $y=x$. What is the resulting volume?
-
-You can integrate with respect to the length $u$ along the line $y=x$ ($u$ from $0$ to $\sqrt{2}$). The radius then can be found by intersecting the line perpendicular to $y=x$ at $u$ with the curve $f(x)$. This will do so:
-
-```julia;
-theta = pi/4 ## we write y=x as y = x * tan(pi/4) for more generality, as this allows other slants.
-
-f(x) = x^2
-𝒙(u) = find_zero(x -> u*sin(theta) - 1/tan(theta) * (x - u*cos(theta)) - f(x), (u*cos(theta), 1))
-𝒓(u) = sqrt((u*cos(theta) - 𝒙(u))^2 + (u*sin(theta) - f(𝒙(u)))^2)
-```
-
-(Though in this case you can also find `𝒓(u)` using the quadratic formula.)
-
-With this, find the volume.
-
-```julia; hold=true; echo=false
-a, b = 0, sqrt(2)
-val, _ = quadgk(u -> pi*𝒓(u)^2, a, b)
-numericq(val)
-```
-
-----
-
-Repeat (find the volume) only this time with the function $f(x) = x^{20}$. 
- -```julia; hold=true; echo=false -a, b = 0, sqrt(2) -f(x) = x^20 -xval(u) = find_zero(x -> u*sin(theta) - 1/tan(theta) * (x - u*cos(theta)) - f(x), (0,sqrt(2))) -rad(u) = sqrt((u*cos(theta) - xval(u))^2 + (u*sin(theta) - f(xval(u)))^2) -val, _ = quadgk(u -> pi*rad(u)^2, a, b) -numericq(val) -``` diff --git a/CwJ/limits/Project.toml b/CwJ/limits/Project.toml deleted file mode 100644 index efe47db..0000000 --- a/CwJ/limits/Project.toml +++ /dev/null @@ -1,10 +0,0 @@ -[deps] -DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0" -ForwardDiff = "f6369f11-7733-5829-9624-2563aa707210" -IntervalArithmetic = "d1acc4aa-44c8-5952-acd4-ba5d80a2a253" -IntervalRootFinding = "d2bf35a9-74e0-55ec-b149-d360ff49b807" -Plots = "91a5bcdd-55d7-5caf-9e0b-520d859cae80" -PyPlot = "d330b81b-6aea-500a-939a-2ce795aea3ee" -QuadGK = "1fd47b50-473d-5c70-9696-f719f8f3bcdc" -Roots = "f2b01f46-fcfa-551c-844a-d8ac1e96c665" -SymPy = "24249f21-da20-56a4-8eb1-6a02cf4ae2e6" diff --git a/CwJ/limits/bisection.js b/CwJ/limits/bisection.js deleted file mode 100644 index bf49929..0000000 --- a/CwJ/limits/bisection.js +++ /dev/null @@ -1,69 +0,0 @@ -var l = -1.5; -var r = 1.75; -var N = 8; - -const b = JXG.JSXGraph.initBoard('jsxgraph', { - boundingbox: [l, 6.0, r,-2.0], axis:true -}); - -var f = function(x) {return Math.pow(x,5) - x - 1}; - -var graph = b.create('functiongraph', [f, l, r]); - -slider = b.create('slider', [[0.25, 1], [1.0, 1], [0,0,N-1]], - {snapWidth:1, - suffixLabel:"n = "}); - -var intervals = [[0,1.5]]; - -for (i = 1; i < N; i++) { - var old = intervals[i-1]; - var ai = old[0]; - var bi = old[1]; - var ci = (ai + bi)/2; - var fa = f(ai); - var fb = f(bi); - var fc = f(ci); - if (fc == 0) { - var newint = [ci, ci]; - } else if (fa * fc < 0) { - var newint = [ai, ci]; - } else { - var newint = [ci, bi]; - } - intervals.push(newint); -}; - -b.create('functiongraph', [f, - function() { - var n = slider.Value(); - return intervals[n][0]; - }, - function() { - var n = slider.Value(); - return 
intervals[n][1]; - } - ], {strokeWidth:5}); - -var seg = b.create("segment", [function() { - var n = slider.Value(); - var ai = intervals[n][0]; - return [ai, 0]; -}, - function() { - var n = slider.Value(); - var bi = intervals[n][1]; - return [bi, 0]; - }], {strokeWidth: 5}); - -b.create("point", [function() { - var n = slider.Value(); - var ai = intervals[n][0] - return ai; -}, 0], {name:"a_n"}); - -b.create("point", [function() { - var n = slider.Value(); - var bi = intervals[n][1] - return bi; -}, 0], {name: "b_n"}); diff --git a/CwJ/limits/cache/continuity.cache b/CwJ/limits/cache/continuity.cache deleted file mode 100644 index 6e11310..0000000 Binary files a/CwJ/limits/cache/continuity.cache and /dev/null differ diff --git a/CwJ/limits/cache/intermediate_value_theorem.cache b/CwJ/limits/cache/intermediate_value_theorem.cache deleted file mode 100644 index 25bd8fe..0000000 Binary files a/CwJ/limits/cache/intermediate_value_theorem.cache and /dev/null differ diff --git a/CwJ/limits/cache/limits.cache b/CwJ/limits/cache/limits.cache deleted file mode 100644 index ef4026f..0000000 Binary files a/CwJ/limits/cache/limits.cache and /dev/null differ diff --git a/CwJ/limits/cache/limits_extensions.cache b/CwJ/limits/cache/limits_extensions.cache deleted file mode 100644 index 70b4f6c..0000000 Binary files a/CwJ/limits/cache/limits_extensions.cache and /dev/null differ diff --git a/CwJ/limits/continuity.jmd b/CwJ/limits/continuity.jmd deleted file mode 100644 index 0bdb3bd..0000000 --- a/CwJ/limits/continuity.jmd +++ /dev/null @@ -1,449 +0,0 @@ -# Continuity - -This section uses these add-on packages: - -```julia -using CalculusWithJulia -using Plots -using SymPy -``` - -```julia; echo=false; results="hidden" -using CalculusWithJulia.WeaveSupport - -const frontmatter = ( - title = "Continuity", - description = "Calculus with Julia: Continuity", - tags = ["CalculusWithJulia", "limits", "continuity"], -); - -nothing -``` - ----- - -The definition Google finds for 
*continuous* is *forming an unbroken whole; without interruption*.
-
-The concept in calculus, as transferred to functions, is
-similar. Roughly speaking, a continuous function is one whose graph
-could be drawn without having to lift (or interrupt) the pencil drawing it.
-
-Consider these two graphs:
-
-```julia; hold=true; echo=false
-plt = plot([-1,0], [-1,-1], color=:black, legend=false, linewidth=5)
-plot!(plt, [0, 1], [ 1, 1], color=:black, linewidth=5)
-plt
-```
-
-and
-
-```julia; hold=true; echo=false
-plot([-1,-.1, .1, 1], [-1,-1, 1, 1], color=:black, legend=false, linewidth=5)
-```
-
-Though similar at some level - they agree at nearly every value of
-$x$ - the first has a "jump" from $-1$ to $1$ instead of the gradual
-transition in the second one. The first is not continuous at $0$ - a
-break is needed to draw it - whereas the second is continuous.
-
-
-
-A formal definition of continuity was a bit harder to come by. At
-[first](http://en.wikipedia.org/wiki/Intermediate_value_theorem) the
-concept was that for any $y$ between any two values in the range of
-$f(x)$, the function should take on the value $y$ for some $x$. Clearly
-this could distinguish the two graphs above, as one takes no values in
-$(-1,1)$, whereas the other - the continuous one - takes on all values in that range.
-
-
-However, [Cauchy](http://en.wikipedia.org/wiki/Cours_d%27Analyse)
-defined continuity by $f(x + \alpha) - f(x)$ being small whenever
-$\alpha$ was small. This basically rules out "jumps" and proves more
-useful as a tool to describe continuity.
-
-
-The [modern](http://en.wikipedia.org/wiki/Continuous_function#History)
-definition simply pushes the details to the definition of the limit:
-
-> A function $f(x)$ is continuous at $x=c$ if $\lim_{x \rightarrow c}f(x) = f(c)$.
-
-This says three things:
-
-* The limit exists at $c$.
-
-* The function is defined at $c$ ($c$ is in the domain).
-
-* The value of the limit is the same as $f(c)$.
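These three conditions can be probed numerically. The following is a minimal, hedged sketch (illustrative only - sampling near a point suggests, but cannot prove, that the limit equals the function value):

```julia
# Check the definition numerically at c = 1 for the continuous f(x) = x^2:
# values of f just to the left and right of c should approach f(c).
f(x) = x^2
c = 1.0
hs = [0.1, 0.01, 0.001, 1e-6]
left  = [f(c - h) for h in hs]   # values approaching c from the left
right = [f(c + h) for h in hs]   # values approaching c from the right
# the discrepancies shrink with h, consistent with the limit being f(c)
abs(last(left) - f(c)), abs(last(right) - f(c))
```

Both discrepancies are on the order of `2e-6` for `h = 1e-6`, consistent with continuity at $c=1$.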
-
-
-This speaks to continuity at a point; we can extend this to continuity over an interval $(a,b)$ by saying:
-
-> A function $f(x)$ is continuous over $(a,b)$ if at each point $c$ with $a < c < b$, $f(x)$ is continuous at $c$.
-
-Finally, as with limits, it can be convenient to speak of *right*
-continuity and *left* continuity at a point, where the limit in the
-definition is replaced by a right or left limit, as appropriate.
-
-
-!!! warning
-    The limit in the definition of continuity is the basic limit and not an extended sense where
-    infinities are accounted for.
-
-##### Examples of continuity
-
-Most familiar functions are continuous everywhere.
-
-* For example, a monomial function $f(x) = ax^n$ for non-negative, integer $n$ will be continuous. This is because the limit exists everywhere, the domain of $f$ is all $x$, and there are no jumps.
-
-* Similarly, the basic trigonometric functions $\sin(x)$, $\cos(x)$ are continuous everywhere.
-
-* So are the exponential functions $f(x) = a^x, a > 0$.
-
-* The hyperbolic sine ($(e^x - e^{-x})/2$) and cosine ($(e^x + e^{-x})/2$) are, as $e^x$ is.
-
-* The hyperbolic tangent is, as $\cosh(x) > 0$ for all $x$.
-
-Some familiar functions are *mostly* continuous but not everywhere.
-
-* For example, $f(x) = \sqrt{x}$ is continuous on $(0,\infty)$ and right continuous at $0$, but it is not defined for negative $x$, so can't possibly be continuous there.
-
-* Similarly, $f(x) = \log(x)$ is continuous on $(0,\infty)$, but it is not defined at $x=0$, so is not right continuous at $0$.
-
-* The tangent function $\tan(x) = \sin(x)/\cos(x)$ is continuous everywhere *except* the points $x$ with $\cos(x) = 0$ ($\pi/2 + k\pi, k$ an integer).
-
-* The hyperbolic co-tangent is not continuous at $x=0$, where $\sinh$ is $0$.
-
-* The semicircle $f(x) = \sqrt{1 - x^2}$ is *continuous* on $(-1, 1)$. It is not continuous at $-1$ and $1$, though it is right continuous at $-1$ and left continuous at $1$.
- -##### Examples of discontinuity - -There are various reasons why a function may not be continuous. - -* The function $f(x) = \sin(x)/x$ has a limit at $0$ but is not defined at $0$, so is not continuous at $0$. The function can be redefined to make it continuous. - -* The function $f(x) = 1/x$ is continuous everywhere *except* $x=0$ where *no* limit exists. - -* A rational function $f(x) = p(x)/q(x)$ will be continuous everywhere except where $q(x)=0$. (The function ``f`` may still have a limit where ``q`` is ``0``, should factors cancel, but ``f`` won't be defined at such values.) - -* The function - -```math -f(x) = \begin{cases} - -1 & x < 0 \\ - 0 & x = 0 \\ - 1 & x > 0 -\end{cases} -``` - -is implemented by `Julia`'s `sign` function. It has a value at $0$, -but no limit at $0$, so is not continuous at $0$. Furthermore, the -left and right limits exist at $0$ but are not equal to $f(0)$ so the -function is not left or right continuous at $0$. It is continuous everywhere except at $x=0$. - -* Similarly, the function defined by this graph - -```julia; hold=true; echo=false -plot([-1,-.01], [-1,-.01], legend=false, color=:black) -plot!([.01, 1], [.01, 1], color=:black) -scatter!([0], [1/2], markersize=5, markershape=:circle) -``` - -is not continuous at $x=0$. It has a limit of $0$ at $0$, a function -value $f(0) =1/2$, but the limit and the function value are not equal. - -* The `floor` function, which rounds down to the nearest integer, is also not continuous at the integers, but is right continuous at the integers, as, for example, $\lim_{x \rightarrow 0+} f(x) = f(0)$. 
This graph emphasizes the right continuity by placing a point for the value of the function when there is a jump:
-
-```julia; hold=true; echo=false
-x = [0,1]; y=[0,0]
-plt = plot(x.-2, y.-2, color=:black, legend=false)
-plot!(plt, x.-1, y.-1, color=:black)
-plot!(plt, x.-0, y.-0, color=:black)
-plot!(plt, x.+1, y.+1, color=:black)
-plot!(plt, x.+2, y.+2, color=:black)
-scatter!(plt, [-2,-1,0,1,2], [-2,-1,0,1,2], markersize=5, markershape=:circle)
-plt
-```
-
-
-* The function $f(x) = 1/x^2$ is not continuous at $x=0$: $f(x)$ is not defined at $x=0$ and $f(x)$ has no limit at $x=0$ (in the usual sense).
-
-* On the Wikipedia page for [continuity](https://en.wikipedia.org/wiki/Continuous_function) the example of Dirichlet's function is given:
-
-```math
-f(x) =
-\begin{cases}
-0 & \text{if } x \text{ is irrational,}\\
-1 & \text{if } x \text{ is rational.}
-\end{cases}
-```
-
-
-This function is discontinuous at *every* $c$: any interval about $c$ will
-contain *both* rational and irrational numbers, so the function's values will
-not stay within a small neighborhood of any single potential limit $L$.
-
-##### Example
-
-Let a function be defined by cases:
-
-```math
-f(x) = \begin{cases}
-3x^2 + c & x \geq 0,\\
-2x-3 & x < 0.
-\end{cases}
-```
-
-What value of $c$ will make $f(x)$ a continuous function?
-
-We note that for $x < 0$ and for $x > 0$ the function is a simple polynomial, so is continuous. At $x=0$, to be continuous we need the limit to exist and be equal to $f(0)$, which is $c$. A limit exists if the left and right limits are equal. This means we need to solve for $c$ to make the left and right limits equal. We do this next with a bit of overkill in this case:
-
-```julia;
-@syms x c
-ex1 = 3x^2 + c
-ex2 = 2x-3
-del = limit(ex1, x=>0, dir="+") - limit(ex2, x=>0, dir="-")
-```
-
-We need to solve for $c$ to make `del` zero:
-
-```julia;
-solve(del, c)
-```
-
-This gives the value of $c$.
-
-## Rules for continuity
-
-As we've seen, functions can be combined in several ways. 
How do these relate with continuity? - -Suppose $f(x)$ and $g(x)$ are both continuous on $I$. Then - -* The function $h(x) = a f(x) + b g(x)$ is continuous on $I$ for any real numbers $a$ and $b$; - -* The function $h(x) = f(x) \cdot g(x)$ is continuous on $I$; and - -* The function $h(x) = f(x) / g(x)$ is continuous at all points $c$ in $I$ **where** $g(c) \neq 0$. - -* The function $h(x) = f(g(x))$ is continuous at $x=c$ *if* $g(x)$ is continuous at $c$ *and* $f(x)$ is continuous at $g(c)$. - -So, continuity is preserved for all of the basic operations except when dividing by $0$. - -##### Examples - -* Since a monomial $f(x) = ax^n$ ($n$ a non-negative integer) is continuous, by the first rule, any polynomial will be continuous. - -* Since both $f(x) = e^x$ and $g(x)=\sin(x)$ are continuous everywhere, so will be $h(x) = e^x \cdot \sin(x)$. - -* Since $f(x) = e^x$ is continuous everywhere and $g(x) = -x$ is continuous everywhere, the composition $h(x) = e^{-x}$ will be continuous everywhere. - -* Since $f(x) = x$ is continuous everywhere, the function $h(x) = 1/x$ - a ratio of continuous functions - will be continuous everywhere *except* possibly at $x=0$ (where it is not continuous). - -* The function $h(x) = e^{x\log(x)}$ will be continuous on $(0,\infty)$, the same domain that $g(x) = x\log(x)$ is continuous. This function (also written as $x^x$) has a right limit at $0$ (of $1$), but is not right continuous, as $h(0)$ is not defined. - - -## Questions - -###### Question - -Let $f(x) = \sin(x)$ and $g(x) = \cos(x)$. Which of these is not continuous everywhere? - -```math -f+g,~ f-g,~ f\cdot g,~ f\circ g,~ f/g -``` - -```julia; hold=true; echo=false -choices = ["``f+g``", "``f-g``", "``f\\cdot g``", "``f\\circ g``", "``f/g``"] -answ = length(choices) -radioq(choices, answ) -``` - -###### Question - -Let $f(x) = \sin(x)$, $g(x) = \sqrt{x}$. - -When will $f\circ g$ be continuous? 
- -```julia; hold=true; echo=false -choices = [L"For all $x$", L"For all $x > 0$", L"For all $x$ where $\sin(x) > 0$"] -answ = 2 -radioq(choices, answ, keep_order=true) -``` - -When will $g \circ f$ be continuous? - -```julia; hold=true; echo=false -choices = [L"For all $x$", L"For all $x > 0$", L"For all $x$ where $\sin(x) > 0$"] -answ = 3 -radioq(choices, answ, keep_order=true) -``` - -###### Question - -The composition $f\circ g$ will be continuous everywhere provided: - -```julia; hold=true; echo=false -choices = [ -L"The function $g$ is continuous everywhere", -L"The function $f$ is continuous everywhere", -L"The function $g$ is continuous everywhere and $f$ is continuous on the range of $g$", -L"The function $f$ is continuous everywhere and $g$ is continuous on the range of $f$"] -answ = 3 -radioq(choices, answ, keep_order=true) -``` - -###### Question - -At which values is $f(x) = 1/\sqrt{x-2}$ not continuous? - -```julia; hold=true; echo=false -choices=[ -L"When $x > 2$", -L"When $x \geq 2$", -L"When $x \leq 2$", -L"For $x \geq 0$"] -answ = 3 -radioq(choices, answ) -``` - - -###### Question - -A value $x=c$ is a *removable singularity* for $f(x)$ if $f(x)$ is not -continuous at $c$ but will be if $f(c)$ is redefined to be $\lim_{x -\rightarrow c} f(x)$. - - -The function $f(x) = (x^2 - 4)/(x-2)$ has a removable singularity at -$x=2$. What value would we redefine $f(2)$ to be, to make $f$ a -continuous function? - - - -```julia; hold=true; echo=false -f(x) = (x^2 -4)/(x-2); -numericq(f(2.00001), .001) -``` - - - - - -###### Question - -The highly oscillatory function - -```math -f(x) = x^2 (\cos(1/x) - 1) -``` - -has a removable singularity at $x=0$. What value would we redefine -$f(0)$ to be, to make $f$ a continuous function? - - - -```julia; hold=true; echo=false -numericq(0, .001) -``` - - -###### Question - -Let $f(x)$ be defined by - -```math -f(x) = \begin{cases} -c + \sin(2x - \pi/2) & x > 0\\ -3x - 4 & x \leq 0. 
-\end{cases}
-```
-
-What value of $c$ will make $f(x)$ continuous?
-
-```julia; hold=true; echo=false
-val = (3*0 - 4) - (sin(2*0 - pi/2))
-numericq(val)
-```
-
-###### Question
-
-Suppose $f(x)$, $g(x)$, and $h(x)$ are continuous functions on $(a,b)$. If $a < c < b$, are you sure that $\lim_{x \rightarrow c} f(g(x))$ is $f(g(c))$?
-
-```julia; hold=true; echo=false
-choices = [L"No, as $g(c)$ may not be in the interval $(a,b)$",
-"Yes, composition of continuous functions results in a continuous function, so the limit is just the function value."
-]
-answ=1
-radioq(choices, answ)
-```
-
-###### Question
-
-Consider the function $f(x)$ given by the following graph
-
-```julia; hold=true; echo=false
-xs = range(0, stop=2, length=50)
-plot(xs, [sqrt(1 - (x-1)^2) for x in xs], legend=false, xlims=(0,4))
-plot!([2,3], [1,0])
-scatter!([3],[0], markersize=5)
-plot!([3,4],[1,0])
-scatter!([4],[0], markersize=5)
-```
-
-The function $f(x)$ is continuous at $x=1$?
-
-```julia; hold=true; echo=false
-yesnoq(true)
-```
-
-The function $f(x)$ is continuous at $x=2$?
-
-```julia; hold=true; echo=false
-yesnoq(false)
-```
-
-The function $f(x)$ is right continuous at $x=3$?
-
-```julia; hold=true; echo=false
-yesnoq(false)
-```
-
-The function $f(x)$ is left continuous at $x=4$?
-
-```julia; hold=true; echo=false
-yesnoq(true)
-```
-
-###### Question
-
-Let $f(x)$ and $g(x)$ be continuous functions whose graphs over $[0,1]$ are given by:
-
-```julia; hold=true; echo=false
-xs = range(0, 1, length=251)
-plot(xs, [sin.(2pi*xs) cos.(2pi*xs)], layout=2, title=["f" "g"], legend=false)
-```
-
-What is $\lim_{x \rightarrow 0.25} f(g(x))$?
-
-```julia; hold=true; echo=false
-val = sin(2pi * cos(2pi * 1/4))
-numericq(val)
-```
-
-What is $\lim_{x \rightarrow 0.25} g(f(x))$?
-
-```julia; hold=true; echo=false
-val = cos(2pi * sin(2pi * 1/4))
-numericq(val)
-```
-
-What is $\lim_{x \rightarrow 0.5} f(g(x))$? 
- -```julia; hold=true; echo=false -choices = ["Can't tell", -"``-1.0``", -"``0.0``" -] -answ = 1 -radioq(choices, answ) -``` diff --git a/CwJ/limits/figures/cannonball.jpg b/CwJ/limits/figures/cannonball.jpg deleted file mode 100644 index dc37318..0000000 Binary files a/CwJ/limits/figures/cannonball.jpg and /dev/null differ diff --git a/CwJ/limits/figures/hardrock-100.png b/CwJ/limits/figures/hardrock-100.png deleted file mode 100644 index 7960963..0000000 Binary files a/CwJ/limits/figures/hardrock-100.png and /dev/null differ diff --git a/CwJ/limits/intermediate_value_theorem.jmd b/CwJ/limits/intermediate_value_theorem.jmd deleted file mode 100644 index 313cb36..0000000 --- a/CwJ/limits/intermediate_value_theorem.jmd +++ /dev/null @@ -1,1076 +0,0 @@ -# Implications of continuity - -This section uses these add-on packages: - -```julia -using CalculusWithJulia -using Plots -using Roots -using SymPy -``` - -```julia; echo=false; results="hidden" -using CalculusWithJulia.WeaveSupport - -const frontmatter = ( - title = "Implications of continuity", - description = "Calculus with Julia: Implications of continuity", - tags = ["CalculusWithJulia", "limits", "implications of continuity"], -); - -fig_size=(800, 600) -nothing -``` - ----- - -Continuity for functions is a valued property which carries -implications. In this section we discuss two: the intermediate value -theorem and the extreme value theorem. These two theorems speak to -some fundamental applications of calculus: finding zeros of a function and finding -extrema of a function. - -## Intermediate Value Theorem - -> The *intermediate value theorem*: If $f$ is continuous on $[a,b]$ -> with, say, $f(a) < f(b)$, then for any $y$ with $f(a) \leq y \leq f(b)$ -> there exists a $c$ in $[a,b]$ with $f(c) = y$. 
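The theorem's content can also be illustrated numerically. A hedged, plain-`Julia` sketch (the helper `ivt_point` is ours for illustration, not from any package): for the continuous $f(x) = x^3$ on $[0,2]$, any $y$ between $f(0)=0$ and $f(2)=8$ is attained, and repeated halving locates a $c$ with $f(c) \approx y$:

```julia
# Locate a c in [a, b] with f(c) = y, assuming f is continuous and
# f(a) < y < f(b); each step halves the interval while keeping y bracketed.
function ivt_point(f, a, b, y)
    lo, hi = a, b
    for _ in 1:60
        m = (lo + hi) / 2
        f(m) < y ? (lo = m) : (hi = m)
    end
    (lo + hi) / 2
end

c = ivt_point(x -> x^3, 0.0, 2.0, 5.0)
c^3   # ≈ 5, the chosen intermediate value
```

This halving idea reappears below as the bisection method, which is the special case $y = 0$.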
-
-```julia; hold=true; echo=false; cache=true
-### {{{IVT}}}
-
-function IVT_graph(n)
-    f(x) = sin(pi*x) + 9x/10
-    a,b = [0,3]
-
-    xs = range(a,stop=b, length=50)
-
-
-    ## cheat -- pick an x, then find a y
-    Δ = .2
-    x = range(a + Δ, stop=b - Δ, length=6)[n]
-    y = f(x)
-
-    plt = plot(f, a, b, legend=false, size=fig_size)
-    plot!(plt, [0,x,x], [f(x),f(x),0], color=:orange, linewidth=3)
-
-    plt
-
-end
-
-n = 6
-anim = @animate for i=1:n
-    IVT_graph(i)
-end
-
-imgfile = tempname() * ".gif"
-gif(anim, imgfile, fps = 1)
-
-
-caption = L"""
-
-Illustration of the intermediate value theorem. The theorem implies that any randomly chosen $y$
-value between $f(a)$ and $f(b)$ will have at least one $x$ in $[a,b]$
-with $f(x)=y$.
-
-"""
-
-ImageFile(imgfile, caption)
-```
-
-In the early years of calculus, the intermediate value theorem was
-intricately connected with the definition of continuity; now it is a
-consequence.
-
-The basic proof starts with a set of points in $[a,b]$: $C = \{x
-\text{ in } [a,b] \text{ with } f(x) \leq y\}$. The set is not empty
-(as $a$ is in $C$) so it *must* have a least upper bound, call it $c$
-(this requires the completeness property of the real numbers). By
-continuity of $f$, it can be shown that $\lim_{x \rightarrow c-} f(x)
-= f(c) \leq y$ and $\lim_{x \rightarrow c+} f(x) = f(c) \geq y$, which
-forces $f(c) = y$.
-
-
-
-### Bolzano and the bisection method
-
-Suppose we have a continuous function $f(x)$ on $[a,b]$ with $f(a) <
-0$ and $f(b) > 0$. Then as $f(a) < 0 < f(b)$, the intermediate value
-theorem guarantees the existence of a $c$ in $[a,b]$ with $f(c) =
-0$. This was a special case of the intermediate value theorem proved
-by Bolzano first. Such $c$ are called *zeros* of the function $f$.
-
-We use this fact when building a "sign chart" of a polynomial function.
-Between any two consecutive real zeros the polynomial cannot
-change sign. (Why?) 
So a "test point" can be used to determine the
-sign of the function over an entire interval.
-
-
-Here, we use the Bolzano theorem to give an algorithm - the *bisection method* - to locate the value $c$ under the assumption $f$ is continuous on $[a,b]$ and changes sign between $a$ and $b$.
-
-```julia; hold=true; echo=false; cache=true
-## {{{bisection_graph}}}
-function bisecting_graph(n)
-    f(x) = x^2 - 2
-    a,b = [0,2]
-
-    err = 2.0^(1-n)
-    title = "b - a = $err"
-    xs = range(a, stop=b, length=100)
-    plt = plot(f, a, b, legend=false, size=fig_size, title=title)
-
-    if n >= 1
-        for i in 1:n
-            c = (a+b)/2
-            if f(a) * f(c) < 0
-                a,b=a,c
-            else
-                a,b=c,b
-            end
-        end
-    end
-    plot!(plt, [a,b],[0,0], color=:orange, linewidth=3)
-    scatter!(plt, [a,b], [f(a), f(b)], color=:orange, markersize=5, markershape=:circle)
-
-    plt
-
-end
-
-
-n = 9
-anim = @animate for i=1:n
-    bisecting_graph(i-1)
-end
-
-imgfile = tempname() * ".gif"
-gif(anim, imgfile, fps = 1)
-
-
-caption = L"""
-
-Illustration of the bisection method to find a zero of a function. At
-each step the interval has $f(a)$ and $f(b)$ with opposite signs, so
-that the intermediate value theorem guarantees a zero.
-
-"""
-
-ImageFile(imgfile, caption)
-```
-
-
-Call $[a,b]$ a *bracketing* interval if $f(a)$ and $f(b)$ have different signs.
-We remark that having different signs can be expressed mathematically as $f(a) \cdot f(b) < 0$.
-
-We can narrow down where a zero is in $[a,b]$ by following this recipe:
-
-* Pick a midpoint of the interval, for concreteness $c = (a+b)/2$.
-
-* If $f(c) = 0$ we are done, having found a zero in $[a,b]$.
-
-* Otherwise it must be that either $f(a)\cdot f(c) < 0$ or $f(c) \cdot f(b) < 0$. If $f(a) \cdot f(c) < 0$, then let $b=c$ and repeat the above. Otherwise, let $a=c$ and repeat the above.
-
-At each step the bracketing interval is narrowed -- indeed split in half
-as defined -- or a zero is found.
-
-For the real numbers this algorithm never stops unless a zero is
-found. 
A "limiting" process is used to say that if it doesn't stop, it
-will converge to some value.
-
-However, using floating point numbers leads to differences from the
-real-number situation. In this case, due to the ultimate granularity of the
-approximation of floating point values to the real numbers, the
-bracketing interval eventually can't be subdivided; that is, no $c$ is found among
-the floating point numbers with $a < c < b$. So there is a natural
-stopping criterion: stop when there is an exact zero, when the
-bracketing interval gets too small to subdivide, or when the interval is as small as desired.
-
-We can write a relatively simple program to implement this algorithm:
-
-```julia;
-function simple_bisection(f, a, b)
-    if f(a) == 0 return(a) end
-    if f(b) == 0 return(b) end
-    if f(a) * f(b) > 0 error("[a,b] is not a bracketing interval") end
-
-    tol = 1e-14  # small number (but should depend on size of a, b)
-    c = a/2 + b/2
-
-    while abs(b-a) > tol
-        if f(c) == 0 return(c) end
-
-        if f(a) * f(c) < 0
-            a, b = a, c
-        else
-            a, b = c, b
-        end
-
-        c = a/2 + b/2
-
-    end
-    c
-end
-```
-
-This function uses a `while` loop to repeat the process of subdividing
-$[a,b]$. A `while` loop will repeat until the condition is no longer
-`true`. The above will stop for reasonably sized floating point
-values (within $(-100, 100)$, say), but, as written, ignores the fact
-that the gap between floating point values depends on their magnitude.
-
-The value $c$ returned *need not* be an exact zero. Let's see:
-
-```julia;
-c = simple_bisection(sin, 3, 4)
-```
-
-This value of $c$ is a floating-point approximation to $\pi$, but is not *quite* a zero:
-
-```julia;
-sin(c)
-```
-
-(Even `pi` itself is not a "zero" due to floating point issues.)
-
-
-### The `find_zero` function.
-
-The `Roots` package has a function `find_zero` that implements the
-bisection method when called as `find_zero(f, (a,b))` where $[a,b]$
-is a bracket. Its use is similar to `simple_bisection` above. 
This package is loaded when `CalculusWithJulia` is. We illustrate the usage of `find_zero`
-in the following:
-
-```julia;
-xstar = find_zero(sin, (3, 4))
-```
-
-!!! warning
-    Notice, the call `find_zero(sin, (3, 4))` again fits the template `action(function, args...)` that we see repeatedly. The `find_zero` function can also be called through `fzero`. The use of `(3, 4)` to specify the interval is not necessary. For example `[3,4]` would work equally well. (Anything where `extrema` is defined works.)
-
-This function utilizes some facts about floating point values to
-guarantee that the answer is either an *exact* zero or a value where the function changes sign between adjacent floating point values; that is, the signs at the previous and next floating point values differ:
-
-```julia
-sin(xstar), sign(sin(prevfloat(xstar))), sign(sin(nextfloat(xstar)))
-```
-
-##### Example
-
-The polynomial $p(x) = x^5 - x + 1$ has a zero between $-2$ and $-1$. Find it.
-
-```julia;
-p(x) = x^5 - x + 1
-c₀ = find_zero(p, (-2, -1))
-(c₀, p(c₀))
-```
-
-We see, as before, that $p(c)$ is not quite $0$. But it can be easily checked that `p` is negative at the previous floating point number, while `p` is seen to be positive at the returned value:
-
-```julia
-p(c₀), sign(p(prevfloat(c₀))), sign(p(nextfloat(c₀)))
-```
-
-
-
-##### Example
-
-The function $q(x) = e^x - x^4$ has a zero between $5$ and $10$, as this graph shows:
-
-```julia;
-q(x) = exp(x) - x^4
-plot(q, 5, 10)
-```
-
-Find the zero numerically. The plot shows $q(5) < 0 < q(10)$, so $[5,10]$ is a bracket. We thus have:
-
-```julia;
-find_zero(q, (5, 10))
-```
-
-
-
-##### Example
-
-Find all real zeros of $f(x) = x^3 -x + 1$ using the bisection method.
-
-We show next that symbolic values can be used with `find_zero`, should that be useful.
- -First, we produce a plot to identify a bracketing interval - -```julia; -@syms x -plot(x^3 - x + 1, -3, 3) -``` - -It appears (and a plot over $[0,1]$ verifies) that there is one zero between $-2$ and $-1$. It is found with: - -```julia; -find_zero(x^3 - x + 1, (-2, -1)) -``` - - - - -##### Example - -The equation $\cos(x) = x$ has just one solution, as can be seen in this plot: - -```julia; -𝒇(x) = cos(x) -𝒈(x) = x -plot(𝒇, -pi, pi) -plot!(𝒈) -``` - -Find it. - -We see from the graph that it is clearly between $0$ and $2$, so all we need is a function. (We have two.) The trick is to observe that solving $f(x) = g(x)$ is the same problem as solving for $x$ where $f(x) - g(x) = 0$. So we define the difference and use that: - -```julia; -𝒉(x) = 𝒇(x) - 𝒈(x) -find_zero(𝒉, (0, 2)) -``` - -#### Using parameterized functions (`f(x,p)`) with `find_zero` - -Geometry will tell us that ``\cos(x) = x/p`` for *one* ``x`` in ``[0, \pi/2]`` whenever ``p>0``. We could set up finding this value for a given ``p`` by making ``p`` part of the function definition, but as an illustration of passing parameters, we leave `p` as a parameter (in this case, as a second value with default of ``1``): - -```julia; hold=true; -f(x, p=1) = cos(x) - x/p -I = (0, pi/2) -find_zero(f, I), find_zero(f, I, p=2) -``` - -The second number is the solution when `p=2`. - - - -##### Example - -We wish to compare two trash collection plans - -* Plan 1: You pay ``47.49`` plus ``0.77`` per bag. - -* Plan 2: You pay ``30.00`` plus ``2.00`` per bag. - -There are some cases where plan 1 is cheaper and some where plan 2 is. Categorize them. - - -Both plans are *linear models* and may be written in *slope-intercept* form: - -```julia; -plan1(x) = 47.49 + 0.77x -plan2(x) = 30.00 + 2.00x -``` - -Assuming this is a realistic problem and an average American household -might produce ``10``-``20`` bags of trash a month (yes, that seems too much!) 
-we plot in that range: - -```julia; -plot(plan1, 10, 20) -plot!(plan2) -``` - - -We can see the intersection point is around ``14`` and that if a family -generates between ``0``-``14`` bags of trash per month that plan ``2`` would be -cheaper. - -Let's get a numeric value, using a simple bracket and an anonymous function: - -```julia; -find_zero(x -> plan1(x) - plan2(x), (10, 20)) -``` - -##### Example, the flight of an arrow - -The flight of an arrow can be modeled using various functions, -depending on assumptions. Suppose an arrow is launched in the air from -a height of ``0`` feet above the ground at an angle of $\theta = -\pi/4$. With a suitable choice for the initial velocity, a model -without wind resistance for the height of the arrow at a distance $x$ -units away may be: - -```math -j(x) = \tan(\theta) x - (1/2) \cdot g(\frac{x}{v_0 \cos\theta})^2. -``` - -In `julia` we have, taking $v_0=200$: - -```julia; -j(x; theta=pi/4, g=32, v0=200) = tan(theta)*x - (1/2)*g*(x/(v0*cos(theta)))^2 -``` - - -With a velocity-dependent wind resistance given by $\gamma$, again with some units, a similar -equation can be constructed. It takes a different form: - -```math -d(x) = (\frac{g}{\gamma v_0 \cos(\theta)} + \tan(\theta)) \cdot x + - \frac{g}{\gamma^2}\log(\frac{v_0\cos(\theta) - \gamma x}{v_0\cos(\theta)}) -``` - -Again, $v_0$ is the initial velocity and is taken to be $200$ -and $\gamma$ a resistance, which we take to be $1$. With this, we have -the following `julia` definition (with a slight reworking of $\gamma$): - -```julia; -function d(x; theta=pi/4, g=32, v0=200, gamma=1) - a = gamma * v0 * cos(theta) - (g/a + tan(theta)) * x + g/gamma^2 * log((a-gamma^2 * x)/a) -end -``` - -For each model, we wish to find the value of $x$ after launching where -the height is modeled to be ``0``. That is how far will the arrow travel -before touching the ground? - - -For the model without wind resistance, we can graph the function -easily enough. 
Let's guess the distance is no more than ``500`` feet:
-
-```julia;
-plot(j, 0, 500)
-```
-
-Well, we haven't even seen the peak yet. Better to do a little spade
-work first. This is a quadratic function, so we can use `roots` from `SymPy` to find the roots:
-
-```julia;
-roots(j(x))
-```
-
-
-We see that $1250$ is the largest root. So we plot over this domain to visualize the flight:
-
-```julia;
-plot(j, 0, 1250)
-```
-
-
-
-As for the model with wind resistance, a quick plot over the same interval, $[0, 1250]$ yields:
-
-```julia;
-plot(d, 0, 1250)
-```
-
-This graph eventually goes negative and then stops. This is due to the asymptote in the model when `(a - gamma^2*x)/a` is zero. To plot the trajectory until it returns to ``0``, we need to identify the value of the zero.
-This model is non-linear and we don't have the simplicity of using `roots` to find out the answer, so we solve for when $a-\gamma^2 x$ is $0$:
-
-```julia;
-gamma = 1
-a = 200 * cos(pi/4)
-b = a/gamma^2
-```
-
-Note that the function is infinite at `b`:
-
-```julia;
-d(b)
-```
-
-
-From the graph, we can see the zero is around `b`. As `d(b)` is `-Inf` we can use the bracket `(b/2,b)`:
-
-```julia;
-x1 = find_zero(d, (b/2, b))
-```
-
-The answer is approximately $140.7$.
-
-(The bisection method only needs to know the sign of the function. Other bracketing methods would have issues with an endpoint with an infinite function value. To use them, some value between the zero and `b` would be needed.)
-
-
-Finally, we plot both graphs at once to see that it was a very windy
-day indeed.
-
-```julia;
-plot(j, 0, 1250, label="no wind")
-plot!(d, 0, x1, label="windy day")
-```
-
-##### Example: bisection and non-continuity
-
-The Bolzano theorem assumes a continuous function $f$, and when
-applicable, yields an algorithm to find a guaranteed zero.
-
-However, the algorithm itself does not know that the function is continuous or
-not, only that the function changes sign. 
As such, it can produce
-answers that are not "zeros" when used with discontinuous
-functions.
-
-In general, a function over floating point values could be considered a large table of mappings: each of the ``2^{64}`` floating point values gets assigned a value. This is a discrete mapping; the computer sees nothing related to continuity.
-
-> The concept of continuity, if needed, must be verified by the user of the algorithm.
-
-We have seen this when plotting rational functions or functions with vertical asymptotes. The default algorithms just connect points with lines. The user must manage the discontinuity (by assigning some values `NaN`, say); the algorithms used do not.
-
-
-In this particular case, the bisection algorithm can still be fruitful
-even when the function is not continuous, as the algorithm will yield
-information about crossing values of $0$, possibly at
-discontinuities. But the user of the algorithm must be aware that the
-answers are only guaranteed to be zeros of the function if the
-function is continuous and the algorithm did not check for that
-assumption.
-
-As an example, let $f(x) = 1/x$. Clearly the interval $[-1,1]$ is a
-"bracketing" interval as $f(x)$ changes sign between $a$ and $b$. What
-does the algorithm yield:
-
-```julia;
-fᵢ(x) = 1/x
-x0 = find_zero(fᵢ, (-1, 1))
-```
-
-
-
-The function is not defined at the answer, but we do have the fact
-that just to the left of the answer (`prevfloat`) and just to the
-right of the answer (`nextfloat`) the function changes sign:
-
-```julia;
-sign(fᵢ(prevfloat(x0))), sign(fᵢ(nextfloat(x0)))
-```
-
-So, the "bisection method" applied here finds a point where the function crosses
-$0$, either by continuity or by jumping over the $0$. (A `jump`
-discontinuity at $x=c$ is defined by the left and right limits of $f$
-at $c$ existing but being unequal. The algorithm can find $c$ when
-this type of function jumps over $0$.)
-
-
-### The `find_zeros` function
-
-The bisection method suggests a naive means to search for all zeros within
-an interval $(a, b)$: split the interval into many small intervals and for each that is a
-bracketing interval find a zero. This simple description has three
-flaws: it might miss values where the function doesn't actually
-cross the $x$ axis; it might miss values where the function just dips
-to the other side; and it might miss multiple values in the same small
-interval.
-
-Still, with some engineering, this can be a useful approach, the caveats
-notwithstanding. This idea is implemented in the `find_zeros` function of the `Roots` package. The function is
-called via `find_zeros(f, (a, b))` but here the interval
-$[a,b]$ is not necessarily a bracketing interval.
-
-To see it in action, we have:
-
-```julia; hold=true;
-f(x) = cos(10*pi*x)
-find_zeros(f, (0, 1))
-```
-
-Or for a polynomial:
-
-```julia; hold=true;
-f(x) = x^5 - x^4 + x^3 - x^2 + 1
-find_zeros(f, (-10, 10))
-```
-
-(Here $-10$ and $10$ were arbitrarily chosen. Cauchy's bound on polynomial roots could be used to be more systematic.)
-
-##### Example: Solving f(x) = g(x)
-
-Use `find_zeros` to find when $e^x = x^5$ in the interval $[-20, 20]$. Verify the answers.
-
-To proceed with `find_zeros`, we define $f(x) = e^x - x^5$, as $f(x) = 0$ precisely when $e^x = x^5$.
-The zeros are then found with:
-
-```julia;
-f₁(x) = exp(x) - x^5
-zs = find_zeros(f₁, (-20,20))
-```
-
-
-The output of `find_zeros` is a vector of values. Checking that each value
-is an approximate zero can be done with the "." (broadcast) syntax:
-
-
-```julia;
-f₁.(zs)
-```
-
-(For a continuous function, the values returned by `find_zeros` should be
-approximate zeros. Bear in mind that if $f$ is not
-continuous the algorithm might find jumping points that are not zeros and may not even be in the domain of the function.)
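The subdivide-and-bracket idea can be sketched in plain `Julia`. This toy version (the name `naive_find_zeros` is ours; it is *not* the `Roots` implementation and shares all the caveats above) splits the interval and bisects each bracketing piece:

```julia
# Split [a, b] into n pieces; bisect every piece over which f changes sign.
# Zeros where f touches without crossing, or multiple zeros within one
# piece, will be missed -- exactly the flaws noted above.
function naive_find_zeros(f, a, b; n = 1000)
    xs = range(a, b, length = n + 1)
    zs = Float64[]
    for (l, r) in zip(xs[1:end-1], xs[2:end])
        if f(l) == 0
            push!(zs, l)
        elseif f(l) * f(r) < 0
            lo, hi = l, r
            for _ in 1:60               # bisect down to floating point noise
                m = (lo + hi) / 2
                f(lo) * f(m) <= 0 ? (hi = m) : (lo = m)
            end
            push!(zs, (lo + hi) / 2)
        end
    end
    zs
end

naive_find_zeros(x -> cos(10pi*x), 0, 1)   # 10 zeros, near odd multiples of 1/20
```

The actual `find_zeros` adds the engineering (adaptive subdivision, polishing of candidates) that this sketch omits.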

### An alternate interface to `find_zero`

The `find_zero` function in the `Roots` package is an interface to one of several methods. For now we focus on the *bracketing* methods; later we will see others. Bracketing methods include `Roots.Bisection()`, the basic bisection method (though with a different sense of "middle" than ``(a+b)/2``), used by default above; `Roots.A42()`, which will typically converge much faster than simple bisection; `Roots.Brent()` for the classic method of Brent; and `FalsePosition()` for a family of *regula falsi* methods. These can all be used by specifying the method in a call to `find_zero`.

Alternatively, `Roots` implements the `CommonSolve` interface popularized by its use in the `DifferentialEquations.jl` ecosystem, a wildly successful area for `Julia`. The basic setup is two steps: set up a "problem," then solve the problem.

To set up a problem, we call `ZeroProblem` with the function and an initial interval, as in:

```julia
f₅(x) = x^5 - x - 1
prob = ZeroProblem(f₅, (1,2))
```

Then we can "solve" this problem with `solve`. For example:

```julia
solve(prob), solve(prob, Roots.Brent()), solve(prob, Roots.A42())
```

Though the answers are identical, the methods employed were not. The first call, with an unspecified method, defaults to bisection.




## Extreme value theorem

The Extreme Value Theorem is another consequence of continuity.

To discuss the extreme value theorem, we define an *absolute maximum*.

> The absolute maximum of $f(x)$ over an interval $I$, when it exists, is the value $f(c)$, $c$ in $I$,
> where $f(x) \leq f(c)$ for any $x$ in $I$.
>
> Similarly, an *absolute minimum* of
> $f(x)$ over an interval $I$ can be defined, when it exists, by a value ``f(c)`` where ``c`` is in ``I`` *and*
> ``f(c) \leq f(x)`` for any ``x`` in ``I``.


Related but different is the concept of *relative* or *local* extrema:

> A local maximum for ``f`` is a value ``f(c)`` where ``c`` is in **some** *open* interval ``I=(a,b)``, ``I`` in the domain of ``f``, and ``f(c)`` is an absolute maximum for ``f`` over ``I``. Similarly, a local minimum for ``f`` is a value ``f(c)`` where ``c`` is in **some** *open* interval ``I=(a,b)``, ``I`` in the domain of ``f``, and ``f(c)`` is an absolute minimum for ``f`` over ``I``.

The term *local extrema* is used to describe either a local maximum or local minimum.

The key point is that the extrema are values in the *range* that are realized by some value in the *domain* (possibly more than one).

This chart of the [Hardrock 100](http://hardrock100.com/) illustrates the two concepts.

```julia; echo=false
###{{{hardrock_profile}}}
imgfile = "figures/hardrock-100.png"
caption = """
Elevation profile of the Hardrock 100 ultramarathon. Treating the elevation profile as a function, the absolute maximum is just about 14,000 feet and the absolute minimum about 7600 feet. These are of interest to the runner for different reasons. Also of interest would be each local maxima and local minima - the peaks and valleys of the graph - and the total elevation climbed - the latter so important/unforgettable its value makes it into the chart's title.
"""

ImageFile(:limits, imgfile, caption)
```


The extreme value theorem discusses an assumption that ensures
absolute maximum and absolute minimum values exist.

> The *extreme value theorem*: If $f(x)$ is continuous over a closed
> interval $[a,b]$ then $f$ has an absolute maximum and an absolute
> minimum over $[a,b]$.

(By continuous over $[a,b]$ we mean continuous on $(a,b)$ and right
continuous at $a$ and left continuous at $b$.)

The assumption that $[a,b]$ includes its endpoints (that is, it is closed) is crucial to the
guarantee.
There are functions which are continuous on open intervals
for which this result is not true. For example, $f(x) = 1/x$ on $(0,1)$. This
function will have no smallest value or largest value, as defined above.

The extreme value theorem is an important theoretical tool for
investigating maxima and minima of functions.



##### Example

The function $f(x) = \sqrt{1-x^2}$ is continuous on the interval
$[-1,1]$ (in the sense above). It then has an absolute maximum, which we can
see to be $1$, occurring at the interior point $0$. The absolute minimum
is $0$; it occurs at each endpoint.

##### Example

The function $f(x) = x \cdot e^{-x}$ on the closed interval $[0, 5]$ is continuous. Hence it has an absolute maximum, which a graph shows to be about $0.37$ (the exact value is $1/e$, occurring at $x=1$). It has an absolute minimum, clearly the value $0$, occurring at the left endpoint.

```julia; hold=true;
plot(x -> x * exp(-x), 0, 5)
```

##### Example

The tangent function has no *guarantee* of an absolute maximum
or minimum over $(-\pi/2, \pi/2)$, as the interval is not closed, so the
theorem does not apply. In fact, it has neither extreme value - it has vertical asymptotes at each endpoint of this interval.


##### Example

The function $f(x) = x^{2/3}$ over the interval $[-2,2]$ has a cusp at $0$. However, it is continuous on this closed interval, so it must have an absolute maximum and absolute minimum. They can be seen from the graph to occur at the endpoints and the cusp at $x=0$, respectively:

```julia;hold=true;
plot(x -> (x^2)^(1/3), -2, 2)
```

(The use of just `x^(2/3)` would fail; can you guess why?)


##### Example

A New York Times [article](https://www.nytimes.com/2016/07/30/world/europe/norway-considers-a-birthday-gift-for-finland-the-peak-of-an-arctic-mountain.html) discusses an idea of Norway moving its border some 490 feet north and 650 feet east in order to have the peak of Mount Halti be the highest point in Finland, as currently it would be on the boundary.
Mathematically, this hints at a higher dimensional version of the extreme value theorem.


## Continuity and closed and open sets

We comment on two implications of continuity that can be generalized to more general settings.


The two intervals ``(a,b)`` and ``[a,b]`` differ as the latter includes the endpoints. The extreme value theorem shows this distinction can make a big difference in what can be said regarding *images* of such intervals.

In particular, if ``f`` is continuous and ``I = [a,b]`` with ``a`` and ``b`` finite (``I`` is *closed* and bounded) then the *image* of ``I``, sometimes denoted ``f(I) = \{y: y=f(x) \text{ for } x \in I\}``, has the property that it will be an interval and will include its endpoints (that is, ``f(I)`` is also closed and bounded).

That ``f(I)`` is an interval is a consequence of the intermediate value theorem. That ``f(I)`` contains its endpoints is the extreme value theorem.

On the real line, sets that are closed and bounded are "compact," a term that generalizes to other settings.

> Continuity implies that the *image* of a compact set is compact.

Now let ``(c,d)`` be an *open* interval in the range of ``f``. An open interval is an open set. On the real line, an open set is one where each point in the set, ``a``, has some ``\delta`` such that if ``|b-a| < \delta`` then ``b`` is also in the set.

> Continuity implies that the *preimage* of an open set is an open set.

The *preimage* of an open set, ``I``, is ``\{a: f(a) \in I\}`` (all ``a`` with an image in ``I``). To see why the preimage is open, take any ``a`` in the preimage and set ``y = f(a)``, a point of ``I``. As ``I`` is open, there is an ``\epsilon`` such that ``|x-y| < \epsilon`` implies ``x`` is in ``I``.
As ``f`` is continuous at ``a``, given this ``\epsilon`` there is a ``\delta`` such that ``|b-a| <\delta`` implies ``|f(b) - f(a)| < \epsilon``, or ``|f(b)-y| < \epsilon``, which means that ``f(b)`` is in ``I``, so ``b`` is in the preimage. Every point of the preimage has such a ``\delta``-neighborhood, so the preimage is an open set.
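
Before moving to questions, the extrema guaranteed by the theorem can be approximated numerically by brute force: sample the function over a fine grid on the closed interval and take the largest and smallest sampled values. This is only an approximation (the grid size below is an arbitrary choice), shown here for the earlier example ``f(x) = x e^{-x}`` over ``[0,5]``:

```julia
# Approximate the absolute extrema of f(x) = x*exp(-x) over [0, 5]
# by evaluating on a fine grid (an approximation, not a proof).
xs = range(0, 5, length=10_001)
ys = [x * exp(-x) for x in xs]
M, i1 = findmax(ys)   # absolute maximum and its index
m, i2 = findmin(ys)   # absolute minimum and its index
(M, xs[i1]), (m, xs[i2])
```

The maximum is about ``0.37`` near ``x=1`` and the minimum is ``0`` at the endpoint ``x=0``, consistent with the graph.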


## Questions


###### Question

There is a negative zero in the interval $[-10, 0]$ for the function
$f(x) = e^x - x^4$. Find its value numerically:

```julia; hold=true; echo=false
f(x) = exp(x) - x^4
val = find_zero(f, (-10, 0));
numericq(val, 1e-3)
```


###### Question

There is a zero in the interval $[0, 5]$ for the function
$f(x) = e^x - x^4$. Find its value numerically:

```julia; hold=true; echo=false
f(x) = exp(x) - x^4
val = find_zero(f, (0, 5));
numericq(val, 1e-3)
```

###### Question

Let $f(x) = x^2 - 10 \cdot x \cdot \log(x)$. This function has two
zeros on the positive $x$ axis. You are asked to find the largest
(graph and bracket...).


```julia; hold=true; echo=false
b = 10
f(x) = x^2 - b * x * log(x)
val = find_zero(f, (10, 500))
numericq(val, 1e-3)
```

###### Question

The `airyai` function has infinitely many negative roots, as the
function oscillates when $x < 0$, and *no* positive roots. Find the
*second largest root* using the graph to bracket the answer, and then
solve.

```julia;
plot(airyai, -10, 10) # `airyai` loaded in `SpecialFunctions` by `CalculusWithJulia`
```


The second largest root is:

```julia; hold=true; echo=false
val = find_zero(airyai, (-5, -4));
numericq(val, 1e-8)
```

###### Question

(From [Strang](http://ocw.mit.edu/ans7870/resources/Strang/Edited/Calculus/Calculus.pdf), p. 37)

Certainly $x^3$ equals $3^x$ at $x=3$. Find the largest value for which $x^3 = 3^x$.

```julia; hold=true; echo=false
val = maximum(find_zeros(x -> x^3 - 3^x, (0, 20)))
numericq(val)
```

Compare $x^2$ and $2^x$. They meet at $2$; where do they meet again?

```julia; hold=true; echo=false
choices = ["Only before 2", "Only after 2", "Before and after 2"]
answ = 3
radioq(choices, answ)
```

Just by graphing, find a number $b$ with $2 < b < 3$ such that for
values less than $b$ there is a zero of $b^x - x^b$ beyond $b$, and for values more than $b$ there isn't.

```julia; hold=true; echo=false
choices=[
"``b \\approx 2.2``",
"``b \\approx 2.5``",
"``b \\approx 2.7``",
"``b \\approx 2.9``"]
answ = 3
radioq(choices, answ)
```




###### Question: What goes up must come down...

```julia; hold=true; echo=false
### {{{cannonball_img}}}
figure= "figures/cannonball.jpg"
caption = """
Trajectories of potential cannonball fires with air-resistance included. (http://ej.iop.org/images/0143-0807/33/1/149/Full/ejp405251f1_online.jpg)
"""
ImageFile(:limits, figure, caption)
```

In 1638, according to Amir D. [Aczel](http://books.google.com/books?id=kvGt2OlUnQ4C&pg=PA28&lpg=PA28&dq=mersenne+cannon+ball+tests&source=bl&ots=wEUd7e0jFk&sig=LpFuPoUvODzJdaoug4CJsIGZZHw&hl=en&sa=X&ei=KUGcU6OAKJCfyASnioCoBA&ved=0CCEQ6AEwAA#v=onepage&q=mersenne%20cannon%20ball%20tests&f=false),
an experiment was performed in the French countryside. A monk, Marin
Mersenne, launched a cannonball straight up into the air in an attempt
to help Descartes prove facts about the rotation of the earth. Though
the experiment was not successful, Mersenne later observed that the
time for the cannonball to go up was less than the time to come
down. ["Vertical Projection in a Resisting Medium: Reflections on Observations of Mersenne".](http://www.maa.org/publications/periodicals/american-mathematical-monthly/american-mathematical-monthly-contents-junejuly-2014)

This isn't the case for simple ballistic motion, where the time to go
up equals the time to come down. We can "prove" this numerically. For simple ballistic
motion:

```math
f(t) = -\frac{1}{2} \cdot 32 t^2 + v_0t.
```

The times to go up and come down are found from
the two zeros of this function. The peak time is related to a zero of
a function given by `f'`, which for now we'll take as a mystery
operation, but later will be known as the derivative. (The notation assumes `CalculusWithJulia` has been loaded.)

Let $v_0= 390$.
The three times in question can be found from the zeros of `f` and `f'`. What are they? - -```julia; hold=true; echo=false -choices = ["``(0.0, 12.1875, 24.375)``", - "``(-4.9731, 0.0, 4.9731)``", - "``(0.0, 625.0, 1250.0)``"] -answ = 1 -radioq(choices, answ) -``` - - - -###### Question What goes up must come down... (again) - -For simple ballistic motion you find that the time to go up is the -time to come down. For motion within a resistant medium, such as air, -this isn't the case. Suppose a model for the height as a function of time is given by - -```math -h(t) = (\frac{g}{\gamma^2} + \frac{v_0}{\gamma})(1 - e^{-\gamma t}) - \frac{gt}{\gamma} -``` - -([From "On the trajectories of projectiles depicted in early ballistic Woodcuts"](http://www.researchgate.net/publication/230963032_On_the_trajectories_of_projectiles_depicted_in_early_ballistic_woodcuts)) - -Here $g=32$, again we take $v_0=390$, and $\gamma$ is a drag -coefficient that we will take to be $1$. This is valid when $h(t) -\geq 0$. In `Julia`, rather than hard-code the parameter values, for -added flexibility we can pass them in as keyword arguments: - -```julia; -h(t; g=32, v0=390, gamma=1) = (g/gamma^2 + v0/gamma)*(1 - exp(-gamma*t)) - g*t/gamma -``` - -Now find the three times: $t_0$, the starting time; $t_a$, the time at -the apex of the flight; and $t_f$, the time the object returns to the -ground. - -```julia; hold=true; echo=false -t0 = 0.0 -tf = find_zero(h, (10, 20)) -ta = find_zero(D(h), (t0, tf)) -choices = ["``(0, 13.187, 30.0)``", - "``(0, 32.0, 390.0)``", - "``(0, 2.579, 13.187)``"] -answ = 3 -radioq(choices, answ) -``` - -###### Question - -Part of the proof of the intermediate value theorem rests on knowing what the limit is of $f(x)$ when $f(x) > y$ for all $x$. What can we say about $L$ supposing $L = \lim_{x \rightarrow c+}f(x)$ under this assumption on $f$? 
- -```julia; hold=true; echo=false -choices = [L"It must be that $L > y$ as each $f(x)$ is.", -L"It must be that $L \geq y$", -L"It can happen that $L < y$, $L=y$, or $L>y$"] -answ = 2 -radioq(choices, 2, keep_order=true) -``` - -###### Question - -The extreme value theorem has two assumptions: a continuous function -and a *closed* interval. Which of the following examples fails to -satisfy the consequence of the extreme value theorem because the interval is not closed? -(The consequence - the existence of an absolute maximum and minimum - can happen even if the theorem does not apply.) - -```julia; hold=true; echo=false -choices = [ -"``f(x) = \\sin(x),~ I=(-2\\pi, 2\\pi)``", -"``f(x) = \\sin(x),~ I=(-\\pi, \\pi)``", -"``f(x) = \\sin(x),~ I=(-\\pi/2, \\pi/2)``", -"None of the above"] -answ = 3 -radioq(choices, answ, keep_order=true) -``` - - -###### Question - -The extreme value theorem has two assumptions: a continuous function -and a *closed* interval. Which of the following examples fails to -satisfy the consequence of the extreme value theorem because the function is not continuous? - -```julia; hold=true; echo=false -choices = [ -"``f(x) = 1/x,~ I=[1,2]``", -"``f(x) = 1/x,~ I=[-2, -1]``", -"``f(x) = 1/x,~ I=[-1, 1]``", -"none of the above"] -answ = 3 -radioq(choices, answ, keep_order=true) -``` - - -###### Question - - -The extreme value theorem has two assumptions: a continuous function -and a *closed* interval. Which of the following examples fails to -satisfy the consequence of the extreme value theorem because the function is not continuous? - -```julia; hold=true; echo=false -choices = [ -"``f(x) = \\text{sign}(x),~ I=[-1, 1]``", -"``f(x) = 1/x,~ I=[-4, -1]``", -"``f(x) = \\text{floor}(x),~ I=[-1/2, 1/2]``", -"none of the above"] -answ = 4 -radioq(choices, answ, keep_order=true) -``` - - -###### Question - -The function $f(x) = x^3 - x$ is continuous over the interval -$I=[-2,2]$. Find a value $c$ for which $M=f(c)$ is an absolute maximum -over $I$. 
- -```julia; hold=true; echo=false -val = 2 -numericq(val) -``` - - -###### Question - - -The function $f(x) = x^3 - x$ is continuous over the interval -$I=[-1,1]$. Find a value $c$ for which $M=f(c)$ is an absolute maximum -over $I$. - -```julia; hold=true; echo=false -val = -sqrt(3)/3 -numericq(val) -``` - - -###### Question - -Consider the continuous function $f(x) = \sin(x)$ over the closed interval $I=[0, 10\pi]$. Which of these is true? - -```julia; hold=true; echo=false -choices = [ -L"There is no value $c$ for which $f(c)$ is an absolute maximum over $I$.", -L"There is just one value of $c$ for which $f(c)$ is an absolute maximum over $I$.", -L"There are many values of $c$ for which $f(c)$ is an absolute maximum over $I$." -] -answ = 3 -radioq(choices, answ, keep_order=true) -``` - - -###### Question - -Consider the continuous function $f(x) = \sin(x)$ over the closed interval $I=[0, 10\pi]$. Which of these is true? - -```julia; hold=true; echo=false -choices = [ -L"There is no value $M$ for which $M=f(c)$, $c$ in $I$ for which $M$ is an absolute maximum over $I$.", -L"There is just one value $M$ for which $M=f(c)$, $c$ in $I$ for which $M$ is an absolute maximum over $I$.", -L"There are many values $M$ for which $M=f(c)$, $c$ in $I$ for which $M$ is an absolute maximum over $I$." -] -answ = 2 -radioq(choices, answ, keep_order=true) -``` - - - -###### Question - -The extreme value theorem says that on a closed interval a continuous -function has an extreme value $M=f(c)$ for some $c$. Does it also say -that $c$ is unique? Which of these examples might help you answer this? - -```julia; hold=true; echo=false -choices = [ -"``f(x) = \\sin(x),\\quad I=[-\\pi/2, \\pi/2]``", -"``f(x) = \\sin(x),\\quad I=[0, 2\\pi]``", -"``f(x) = \\sin(x),\\quad I=[-2\\pi, 2\\pi]``"] -answ = 3 -radioq(choices, answ) -``` - -##### Question - -The zeros of the equation $\cos(x) \cdot \cosh(x) = 1$ are related to vibrations of rods. 
Using `find_zeros`, what is the largest zero in the interval $[0, 6\pi]$? - -```julia; hold=true; echo=false -val = maximum(find_zeros(x -> cos(x) * cosh(x) - 1, (0, 6pi))) -numericq(val) -``` - -##### Question - -A parametric equation is specified by a parameterization $(f(t), g(t)), a \leq t \leq b$. The parameterization will be continuous if and only if each function is continuous. - -Suppose $k_x$ and $k_y$ are positive integers and $a, b$ are positive numbers, will the [Lissajous](https://en.wikipedia.org/wiki/Parametric_equation#Lissajous_Curve) curve given by $(a\cos(k_x t), b\sin(k_y t))$ be continuous? - -```julia; hold=true; echo=false -yesnoq(true) -``` - -Here is a sample graph for $a=1, b=2, k_x=3, k_y=4$: - -```julia; hold=true; -a,b = 1, 2 -k_x, k_y = 3, 4 -plot(t -> a * cos(k_x *t), t-> b * sin(k_y * t), 0, 4pi) -``` diff --git a/CwJ/limits/limit-example.js b/CwJ/limits/limit-example.js deleted file mode 100644 index 7308d5d..0000000 --- a/CwJ/limits/limit-example.js +++ /dev/null @@ -1,19 +0,0 @@ -const b = JXG.JSXGraph.initBoard('jsxgraph', { - boundingbox: [-6, 1.2, 6,-1.2], axis:true -}); - -var f = function(x) {return Math.sin(x) / x;}; -var graph = b.create("functiongraph", [f, -6, 6]) -var seg = b.create("line", [[-6,0], [6,0]], {fixed:true}); - -var X = b.create("glider", [2, 0, seg], {name:"x", size:4}); -var P = b.create("point", [function() {return X.X()}, function() {return f(X.X())}], {name:""}); -var Q = b.create("point", [0, function() {return P.Y();}], {name:"f(x)"}); - -var segup = b.create("segment", [P,X], {dash:2}); -var segover = b.create("segment", [P, [0, function() {return P.Y()}]], {dash:2}); - - -txt = b.create('text', [2, 1, function() { - return "x = " + X.X().toFixed(4) + ", f(x) = " + P.Y().toFixed(4); -}]); diff --git a/CwJ/limits/limits.jmd b/CwJ/limits/limits.jmd deleted file mode 100644 index 2a2d5e7..0000000 --- a/CwJ/limits/limits.jmd +++ /dev/null @@ -1,1593 +0,0 @@ -# Limits - -This section uses the following 
add-on packages:

```julia
using CalculusWithJulia
using Plots
using Richardson # for extrapolation
using SymPy # for symbolic limits
```

```julia; echo=false; results="hidden"
using CalculusWithJulia.WeaveSupport

const frontmatter = (
    title = "Limits",
    description = "Calculus with Julia: Limits",
    tags = ["CalculusWithJulia", "limits", "limits"],
);
fig_size=(800, 600)
nothing
```

----

A historic problem in mathematics was to find the area under
the graph of ``f(x)=x^2`` over the interval ``[0,1]``.



There wasn't a ready-made formula for the area of this
shape, as there was for a triangle or a square. However,
[Archimedes](http://en.wikipedia.org/wiki/The_Quadrature_of_the_Parabola)
found a method to compute areas enclosed by a parabola and line
segments that cross the parabola.

```julia; hold=true; echo=false; cache=true
###{{{archimedes_parabola}}}

f(x) = x^2
colors = [:black, :blue, :orange, :red, :green, :orange, :purple]

## Area of parabola
function make_triangle_graph(n)
    title = "Area of parabolic cup ..."
- n==1 && (title = "\${Area = }1/2\$") - n==2 && (title = "\${Area = previous }+ 1/8\$") - n==3 && (title = "\${Area = previous }+ 2\\cdot(1/8)^2\$") - n==4 && (title = "\${Area = previous }+ 4\\cdot(1/8)^3\$") - n==5 && (title = "\${Area = previous }+ 8\\cdot(1/8)^4\$") - n==6 && (title = "\${Area = previous }+ 16\\cdot(1/8)^5\$") - n==7 && (title = "\${Area = previous }+ 32\\cdot(1/8)^6\$") - - - - plt = plot(f, 0, 1, legend=false, size = fig_size, linewidth=2) - annotate!(plt, [(0.05, 0.9, text(title,:left))]) # if in title, it grows funny with gr - n >= 1 && plot!(plt, [1,0,0,1, 0], [1,1,0,1,1], color=colors[1], linetype=:polygon, fill=colors[1], alpha=.2) - n == 1 && plot!(plt, [1,0,0,1, 0], [1,1,0,1,1], color=colors[1], linewidth=2) - for k in 2:n - xs = range(0,stop=1, length=1+2^(k-1)) - ys = map(f, xs) - k < n && plot!(plt, xs, ys, linetype=:polygon, fill=:black, alpha=.2) - if k == n - plot!(plt, xs, ys, color=colors[k], linetype=:polygon, fill=:black, alpha=.2) - plot!(plt, xs, ys, color=:black, linewidth=2) - end - end - plt -end - - -n = 7 -anim = @animate for i=1:n - make_triangle_graph(i) -end - -imgfile = tempname() * ".gif" -gif(anim, imgfile, fps = 1) - - -caption = L""" -The first triangle has area $1/2$, the second has area $1/8$, then $2$ have area $(1/8)^2$, $4$ have area $(1/8)^3$, ... -With some algebra, the total area then should be $1/2 \cdot (1 + (1/4) + (1/4)^2 + \cdots) = 2/3$. -""" - -ImageFile(imgfile, caption) -``` - - -The figure illustrates a means to compute the area bounded by the -parabola, the line ``y=1`` and the line ``x=0`` using triangles. It -suggests that this area can be found by adding the following sum - -```math -A = 1/2 + 1/8 + 2 \cdot (1/8)^2 + 4 \cdot (1/8)^3 + \cdots -``` - - -This value is ``2/3``, so the area under the curve would be -``1/3``. 
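
The geometric sum above can be checked with a quick partial-sum computation:

```julia
# Partial sum of 1/2 * (1 + 1/4 + (1/4)^2 + ⋯); 30 terms is plenty
# for 64-bit floating point.
total = (1/2) * sum((1/4)^k for k in 0:30)
total, 2/3
```

The partial sum agrees with ``2/3`` to floating point precision.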
Forget about this specific value - which through more modern machinery becomes routine - and focus for a minute on the
method: a problem is solved by a suggestion of an infinite process, in
this case the creation of more triangles to approximate the
unaccounted for area. This is the so-called method of
[exhaustion](http://en.wikipedia.org/wiki/Method_of_exhaustion), known
since the 5th century BC.

Archimedes used this method to solve a wide
range of area problems related to basic geometric shapes, including a
more general statement of what we described above.

The ``\cdots`` in the sum expression indicate that this process continues and that
the answer lies at the end of an *infinite* process. To make this line of
reasoning rigorous requires the concept of a limit. The concept of a
limit is then an old one, but it wasn't until the age of calculus
that it was formalized.



Next, we illustrate how Archimedes approximated ``\pi`` -- the ratio of the circumference of a circle to its diameter -- using interior and exterior ``n``-gons whose perimeters could be computed.
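
With modern trigonometry (not available to Archimedes) those perimeters are easy to compute: for a circle of diameter ``1``, the inscribed regular ``n``-gon has perimeter ``n\sin(\pi/n)`` and the circumscribed one ``n\tan(\pi/n)``. A quick sketch (circular, of course, in that it uses `pi` to bound ``\pi``):

```julia
# Perimeters of regular n-gons inscribed in and circumscribed about a
# circle of diameter 1; these bound the circumference, π.
inscribed(n)     = n * sin(pi / n)
circumscribed(n) = n * tan(pi / n)
[(n, inscribed(n), circumscribed(n)) for n in (12, 24, 48, 96)]
```

For ``n = 96`` the gap between the two bounds is already under ``0.002``.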



```julia; hold=true; echo=false
## Archimedes approximation for pi

blue, green, purple, red = :royalblue, :forestgreen, :mediumorchid3, :brown3


function archimedes!(p, n, xy=(0,0), radius=1; color=blue)

    x₀,y₀=xy
    ts = range(0, 2pi, length=100)



    plot!(p, x₀ .+ sin.(ts), y₀ .+ cos.(ts), linewidth=2)

    α = ((2π)/n)/2
    αs = (-pi/2 + α):2α:(3pi/2 + α)
    r = radius/cos(α)

    xs = x₀ .+ r*cos.(αs)
    ys = y₀ .+ r*sin.(αs)

    plot!(p, xs, ys, linewidth=2, alpha=0.6)
    plot!(p, xs, ys,
          fill=true,
          fillcolor=color,
          alpha=0.4)

    r = radius
    xs = x₀ .+ r*cos.(αs)
    ys = y₀ .+ r*sin.(αs)

    plot!(p, xs, ys, linewidth=2, alpha=0.6)
    plot!(p, xs, ys,
          fill=true,
          fillcolor=color,
          alpha=0.8)

    p
end

ns = [4,5,7,9]
xs = [1, 3.5, 1, 3.5]
ys = [3.5, 3.5, 1, 1]
p = plot(;xlims=(-.25, 4.75), ylims=(-0.25, 4.75),
         axis=nothing,
         xaxis=false,
         yaxis=false,
         legend=false,
         padding = (0.0, 0.0),
         background_color = :transparent,
         foreground_color = :black,
         aspect_ratio=:equal)

for (x, y, n, col) ∈ zip(xs, ys, ns, (blue, green, purple, red))
    archimedes!(p, n, (x, y), color=col)
end

caption = L"""
The ratio of the circumference of a circle to its diameter, $\pi$, can be approximated from below and above by computing the perimeters of inscribed and circumscribed $n$-gons. Archimedes computed the perimeters for $n$ being $12$, $24$, $48$, and $96$ to determine that $3~10/71 \leq \pi \leq 3~1/7$.
"""

ImageFile(p, caption)
```


Here Archimedes uses *bounds* to constrain an unknown value. Had he been able to compute these bounds for larger and larger ``n``, the value of ``\pi`` could have been more accurately determined. In a "limit" it would be squeezed to a specific value, which we now know is an irrational number.

Continuing these concepts,
[Fermat](http://en.wikipedia.org/wiki/Adequality) in the 1600s
essentially took a limit to find the slope of a tangent line to a
polynomial curve.
Newton, in the late 1600s, exploited the idea in his
development of calculus (as did Leibniz). Yet it wasn't until the
1800s that
[Bolzano](http://en.wikipedia.org/wiki/Limit_of_a_function#History),
Cauchy, and Weierstrass put the idea on a firm footing.


To make things more precise, we begin by discussing the limit of a univariate function as ``x`` approaches ``c``.

Informally, if a limit exists it is the value that ``f(x)`` gets close to as ``x`` gets close to - but not equal to - ``c``.

The modern formulation is due to Weierstrass:


> The limit of ``f(x)`` as ``x`` approaches ``c`` is ``L`` if for every real ``\epsilon > 0``,
> there exists a real ``\delta > 0`` such that for all real ``x``, ``0 < \lvert x − c \rvert < \delta``
> implies ``\lvert f(x) − L \rvert < \epsilon``. The notation used is ``\lim_{x \rightarrow c}f(x) = L``.

We comment on this later.


Cauchy begins his incredibly influential
[treatise](http://gallica.bnf.fr/ark:/12148/bpt6k90196z/f17.image) on
calculus by considering two examples, the limit as ``x`` goes to ``0`` of

```math
\frac{\sin(x)}{x} \quad\text{and}\quad (1 + x)^{1/x}.
```

These take the indeterminate forms $0/0$ and $1^\infty$, which are
found by just putting $0$ in for $x$. An expression does not need to
be defined at $c$ to discuss its limit there, as these two aren't at ``c=0``. Cauchy
illustrates two methods to approach the questions above. The first is
to pull out an inequality:

```math
\frac{\sin(x)}{\sin(x)} > \frac{\sin(x)}{x} > \frac{\sin(x)}{\tan(x)}
```

which is equivalent to:

```math
1 > \frac{\sin(x)}{x} > \cos(x)
```

This bounds the expression $\sin(x)/x$ between $1$ and $\cos(x)$, and as $x$ gets close to $0$, the value of $\cos(x)$ "clearly" goes to $1$, hence $L$ must be $1$. This is an application of the squeeze theorem, the same idea Archimedes implied when bounding the value for ``\pi`` above and below.
- -The above bound comes from this figure, for small ``x > 0``: - -```julia; hold=true; echo=false -p = plot(x -> sqrt(1 - x^2), 0, 1, legend=false, aspect_ratio=:equal, - linewidth=3, color=:black) -θ = π/6 -y,x = sincos(θ) -col=RGBA(0.0,0.0,1.0, 0.25) -plot!(range(0,x, length=2), zero, fillrange=u->y/x*u, color=col) -plot!(range(x, 1, length=50), zero, fillrange = u -> sqrt(1 - u^2), color=col) -plot!([x,x],[0,y], linestyle=:dash, linewidth=3, color=:black) -plot!([x,1],[y,0], linestyle=:dot, linewidth=3, color=:black) -plot!([1,1], [0,y/x], linewidth=3, color=:black) -plot!([0,1], [0,y/x], linewidth=3, color=:black) -plot!([0,1], [0,0], linewidth=3, color=:black) -Δ = 0.05 -annotate!([(0,0+Δ,"A"), (x-Δ,y+Δ/4, "B"), (1+Δ/2,y/x, "C"), - (1+Δ/2,0+Δ/2,"D")]) -annotate!([(.2*cos(θ/2), 0.2*sin(θ/2), "θ")]) -imgfile = tempname() * ".png" -savefig(p, imgfile) -caption = "Triangle ABD has less area than the shaded wedge, which has less area than triangle ACD. Their respective areas are ``(1/2)\\sin(\\theta)``, ``(1/2)\\theta``, and ``(1/2)\\tan(\\theta)``. The inequality used to show ``\\sin(x)/x`` is bounded below by ``\\cos(x)`` and above by ``1`` comes from a division by ``(1/2) \\sin(x)`` and taking reciprocals. -" -ImageFile(imgfile, caption) -``` - - -To discuss the case of $(1+x)^{1/x}$ it proved convenient to assume ``x = 1/m`` for integer values of ``m``. At the time of Cauchy, log tables were available to identify the approximate value of the limit. Cauchy computed the following value from logarithm tables. - - -```julia; hold=true; -x = 1/10000 -(1 + x)^(1/x) -``` - -A table can show the progression to this value: - -```julia; hold=true; -f(x) = (1 + x)^(1/x) -xs = [1/10^i for i in 1:5] -[xs f.(xs)] -``` - -This progression can be seen to be increasing. 
Cauchy, in his treatise, sees this through the expansion:

```math
\begin{align*}
(1 + \frac{1}{m})^m &= 1 + \frac{1}{1} + \frac{1}{1\cdot 2}(1 - \frac{1}{m}) + \\
& \frac{1}{1\cdot 2\cdot 3}(1 - \frac{1}{m})(1 - \frac{2}{m}) + \cdots \\
&+
\frac{1}{1 \cdot 2 \cdot \cdots \cdot m}(1 - \frac{1}{m}) \cdot \cdots \cdot (1 - \frac{m-1}{m}).
\end{align*}
```

These values are clearly increasing as ``m`` increases. Cauchy showed the value was bounded between ``2`` and ``3`` and had the approximate value above. Then he showed the restriction to integers was not necessary. Later we will use this definition for the exponential function:

```math
e^x = \lim_{n \rightarrow \infty} (1 + \frac{x}{n})^n,
```

with a suitably defined limit.


These two cases illustrate that though the definition of the limit
exists, the computation of a limit is generally found by other means,
and the intuition for the value of the limit can be gained numerically.

### Indeterminate forms

First it should be noted that for most of the functions encountered, the concept of a limit at a typical point $c$ is nothing more than just function evaluation at $c$. This is because, at a typical point, the functions are nicely behaved (what we will soon call "*continuous*"). However, most questions asked about limits involve points that are not typical. For these, the result of evaluating the function at $c$ is typically undefined, and the value comes in one of several *indeterminate forms*: $0/0$, $\infty/\infty$, $0 \cdot \infty$, $\infty - \infty$, $0^0$, $1^\infty$, and $\infty^0$.

`Julia` can help - at times - identify these indeterminate forms, as many such operations produce `NaN`. For example:

```julia;
0/0, Inf/Inf, 0 * Inf, Inf - Inf
```

However, the values with powers generally do not help, as the IEEE standard has `0^0` evaluating to 1:

```julia;
0^0, 1^Inf, Inf^0
```

The `NaN` heuristic can also be unreliable in other cases, as floating point issues may mask the true evaluation.
Still, as a cheap trick it can work. So, the limit as $x$ goes to $1$ of $\sin(x)/x$ is simply found by evaluation:

```julia; hold=true;
x = 1
sin(x) / x
```

But at ``x=0`` we get an indicator that there is an issue with just evaluating the function:


```julia; hold=true;
x = 0
sin(x) / x
```

The above is really just a heuristic. For some functions it is just
not informative. For example, $f(x) = \sqrt{x}$ is only defined on $[0,
\infty)$. There is technically no limit at $0$, per se, as the function
is not defined around $0$. Other functions jump at values and will
not have a limit there, despite having well defined values. The `floor`
function is the function that rounds down to the nearest integer. At
integer values there will be a jump (and hence no limit), even though the function is
defined.

## Graphical approaches to limits

Let's return to the function $f(x) = \sin(x)/x$. This function was studied by Euler as part of his solution to the [Basel](http://en.wikipedia.org/wiki/Basel_problem) problem. He knew that near $0$, $\sin(x) \approx x$, so the ratio is close to $1$ if $x$ is near $0$. Hence, the intuition is $\lim_{x \rightarrow 0} \sin(x)/x = 1$, as Cauchy wrote. We can verify this limit graphically two ways. First, a simple graph shows no issue at $0$:


```julia; hold=true;
f(x) = sin(x)/x
xs, ys = unzip(f, -pi/2, pi/2) # get points used to plot `f`
plot(xs, ys)
scatter!(xs, ys)
```

The $y$ values of the graph seem to go to $1$ as the $x$ values get
close to ``0``. (That the graph looks defined at $0$ is due to the fact
that the points sampled to make the graph do not include $0$, as shown through the `scatter!` command -- which can be checked via `minimum(abs, xs)`.)
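
A numeric check complements the graph. In the style of the table Cauchy used for ``(1+x)^{1/x}`` above, we can evaluate ``\sin(x)/x`` at values shrinking towards - but never reaching - ``0``:

```julia
# Evaluate sin(x)/x at x = 1/10, 1/100, ...; the values head towards 1.
f(x) = sin(x) / x
xs = [1/10^i for i in 1:5]
[xs f.(xs)]
```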
-
-We can also verify Euler's intuition through this graph:
-
-```julia; hold=true;
-plot(sin, -pi/2, pi/2)
-plot!(identity) # `identity` is the function y = x, just as `zero` is y = 0
-```
-
-That the two are indistinguishable near $0$ makes it easy to see that their ratio should be going towards $1$.
-
-A parametric plot shows the same: we see below that the slope at ``(0,0)`` is *basically* ``1``, because the two functions vary at the same rate when each is near ``0``.
-
-```julia; hold=true;
-plot(sin, identity, -pi/2, pi/2) # parametric plot
-```
-
-
-The graphical approach to limits - plotting $f(x)$ around $c$ and observing if the $y$ values seem to converge to a value $L$ when $x$ gets close to $c$ - allows us to quickly gauge whether a function seems to have a limit at $c$, though the precise value of $L$ may be hard to identify.
-
-
-##### Example
-
-This example illustrates the same limit a different way. Sliding the ``x`` value towards ``0`` shows ``f(x) = \sin(x)/x`` approaches a value of ``1``.
-
-
-```=html
-``` - -```ojs -//| echo: false -//| output: false - -JXG = require("jsxgraph") - -b = JXG.JSXGraph.initBoard('jsxgraph', { - boundingbox: [-6, 1.2, 6,-1.2], axis:true -}); - -f = function(x) {return Math.sin(x) / x;}; -graph = b.create("functiongraph", [f, -6, 6]) -seg = b.create("line", [[-6,0], [6,0]], {fixed:true}); - -X = b.create("glider", [2, 0, seg], {name:"x", size:4}); -P = b.create("point", [function() {return X.X()}, function() {return f(X.X())}], {name:""}); -Q = b.create("point", [0, function() {return P.Y();}], {name:"f(x)"}); - -segup = b.create("segment", [P,X], {dash:2}); -segover = b.create("segment", [P, [0, function() {return P.Y()}]], {dash:2}); - - -txt = b.create('text', [2, 1, function() { - return "x = " + X.X().toFixed(4) + ", f(x) = " + P.Y().toFixed(4); -}]); -``` - - -##### Example - - - -Consider now the following limit - -```math -\lim_{x \rightarrow 2} \frac{x^2 - 5x + 6}{x^2 +x - 6} -``` - -Noting that this is a ratio of nice polynomial functions, we first -check whether there is anything to do: - -```julia; hold=true; -f(x) = (x^2 - 5x + 6) / (x^2 + x - 6) -c = 2 -f(c) -``` - -The `NaN` indicates that this function is indeterminate at $c=2$. A -quick plot gives us an idea that the limit exists and is roughly -$-0.2$: - -```julia; hold=true; -c, delta = 2, 1 -plot(x -> (x^2 - 5x + 6) / (x^2 + x - 6), c - delta, c + delta) -``` - - -The graph looks "continuous." In fact, the value $c=2$ is termed a -*removable singularity* as redefining $f(x)$ to be $-0.2$ when -$x=2$ results in a "continuous" function. - -As an aside, we can redefine `f` using the "ternary operator": - -```julia; eval=false -f(x) = x == 2.0 ? -0.2 : (x^2 - 5x + 6) / (x^2 + x - 6) -``` - -This particular case is a textbook example: one can easily factor -$f(x)$ to get: - -```math -f(x) = \frac{(x-2)(x-3)}{(x-2)(x+3)} -``` - -Written in this form, we clearly see that this is the same function as -$g(x) = (x-3)/(x+3)$ when $x \neq 2$. 
The function $g(x)$ is "continuous" at $x=2$. So were one to redefine $f(x)$ at $x=2$ to be $g(2) = (2 - 3)/(2 + 3) = -0.2$, it would be made continuous; hence the term *removable singularity*.
-
-## Numerical approaches to limits
-
-Cauchy's investigation of $\lim_{x \rightarrow 0}(1 + x)^{1/x}$ by evaluating the function at $1/10000$ can be done much more easily nowadays. Like a graphical approach, a numerical approach can give insight into a limit and often a good numeric estimate.
-
-The basic idea is to create a sequence of $x$ values going towards $c$ and then investigate if the corresponding $y$ values are eventually near some $L$.
-
-This is best seen by example. Suppose we are asked to investigate
-
-```math
-\lim_{x \rightarrow 25} \frac{\sqrt{x} - 5}{\sqrt{x - 16} - 3}.
-```
-
-We first define a function and check if there are issues at ``25``:
-
-```julia;
-f(x) = (sqrt(x) - 5) / (sqrt(x-16) - 3)
-```
-
-```julia;
-c = 25
-f(c)
-```
-
-So yes, an issue of the indeterminate form $0/0$. We investigate numerically by making a set of numbers getting close to $c$. This is most easily done by making numbers close to $0$ and adding them to or subtracting them from $c$. Some natural candidates are negative powers of ``10``:
-
-```julia;
-hs = [1/10^i for i in 1:8]
-```
-
-We can add these to $c$ and then evaluate:
-
-```julia;
-xs = c .+ hs
-ys = f.(xs)
-```
-
-To visualize, we can put them in a table using `[xs ys]` notation:
-
-```julia;
-[xs ys]
-```
-
-The $y$-values seem to be getting near $0.6$.
-
-Since limits are defined by the expression $0 < \lvert x-c\rvert < \delta$, we should also look at values smaller than $c$. There isn't much difference (note the `.-` sign in `c .- hs`):
-
-```julia; hold=true;
-xs = c .- hs
-ys = f.(xs)
-[xs ys]
-```
-
-Same story. The numeric evidence supports a limit of $L=0.6$.
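The numeric evidence can be confirmed algebraically: multiplying by the conjugates of the numerator and the denominator cancels the common factor ``x - 25``:

```math
\frac{\sqrt{x} - 5}{\sqrt{x - 16} - 3}
= \frac{x - 25}{\sqrt{x} + 5} \cdot \frac{\sqrt{x - 16} + 3}{x - 25}
= \frac{\sqrt{x - 16} + 3}{\sqrt{x} + 5}.
```

At ``x=25`` the right-hand side evaluates to ``(3+3)/(5+5) = 3/5 = 0.6``, in agreement with the table.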
-
-##### Example: the secant line
-
-Let $f(x) = x^x$ and consider the ratio:
-
-```math
-\frac{f(c + h) - f(c)}{h}
-```
-
-As $h$ goes to $0$, this will take the form $0/0$ in most cases, and it does in the particular case of $f(x) = x^x$ and $c=1$. The expression has a geometric interpretation: it is the slope of the secant line connecting the two points $(c,f(c))$ and $(c+h, f(c+h))$.
-
-To look at the limit in this example, we have (recycling the values in `hs`):
-
-```julia; hold=true;
-c = 1
-f(x) = x^x
-ys = [(f(c + h) - f(c)) / h for h in hs]
-[hs ys]
-```
-
-The limit looks like $L=1$. A similar check on the left will confirm this numerically.
-
-
-
-
-
-### Issues with the numeric approach
-
-The numeric approach often gives a good intuition as to the existence of a limit and its value. However, it can be misleading. Consider this limit question:
-
-```math
-\lim_{x \rightarrow 0} \frac{1 - \cos(x)}{x^2}.
-```
-
-We can see that it is indeterminate of the form $0/0$:
-
-```julia;
-g(x) = (1 - cos(x)) / x^2
-g(0)
-```
-
-What is the value of $L$, if it exists? A quick attempt numerically yields:
-
-```julia;
-𝒙s = 0 .+ hs
-𝒚s = [g(x) for x in 𝒙s]
-[𝒙s 𝒚s]
-```
-
-Hmm, the values in `𝒚s` appear to be going to $0.5$, but then end up at $0$. Is the limit $0$ or $1/2$? The answer is $1/2$. The last $0$ is an artifact of floating point arithmetic, and the last few deviations from `0.5` are due to loss of precision in subtraction. To investigate, we look more carefully at the numerator and denominator separately:
-
-```julia;
-y1s = [1 - cos(x) for x in 𝒙s]
-y2s = [x^2 for x in 𝒙s]
-[𝒙s y1s y2s]
-```
-
-Looking at the bottom of the second column reveals the error. The value of `1 - cos(1.0e-8)` is `0` and not a value around `5e-17`, as would be expected from the pattern above it. This is because the smallest floating point value less than `1.0` is more than `5e-17` units away, so `cos(1e-8)` is evaluated to be `1.0`. There just isn't enough granularity to get this close to $0$.
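The lack of granularity can be demonstrated directly. (An added check; `prevfloat` returns the closest floating point number below its argument.)

```julia
1 - cos(1e-8)         # 0.0; the true value is about 5e-17
1.0 - prevfloat(1.0)  # the spacing just below 1.0, about 1.1e-16
```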
-
-Not that we needed to. The answer would have been clear if we had stopped with `x=1e-6`, say.
-
-In general, some functions will frustrate the numeric approach. It is best to be wary of results. At a minimum they should confirm what a quick graph shows, though even that isn't enough, as this next example shows.
-
-
-##### Example
-
-Let ``h(x)`` be defined by
-
-```math
-h(x) = x^2 + 1 + \log(| 11 \cdot x - 15 |)/99.
-```
-
-The question is to investigate
-
-```math
-\lim_{x \rightarrow 15/11} h(x)
-```
-
-A plot shows the answer appears to be straightforward:
-
-```julia;echo=false
-h(x) = x^2 + 1 + log(abs(11*x - 15))/99
-plot(h, 15/11 - 1, 15/11 + 1)
-```
-
-Taking values near ``15/11`` shows nothing too unusual:
-
-```julia; hold=true;
-c = 15/11
-hs = [1/10^i for i in 4:3:16]
-xs = c .+ hs
-[xs h.(xs)]
-```
-
-(Though both the graph and the table hint at something a bit odd.)
-
-However, the limit in this case is ``-\infty`` (or DNE), as there is an asymptote at ``c=15/11``. The problem is that the asymptote due to the logarithm is extremely narrow and happens between the floating point values to the left and right of ``15/11``.
-
-### Richardson extrapolation
-
-The [`Richardson`](https://github.com/JuliaMath/Richardson.jl) package provides an `extrapolate` function to extrapolate a function `f(x)` to `f(x0)`, as a numeric limit does. We illustrate its use by example:
-
-```julia; hold=true
-f(x) = sin(x)/x
-extrapolate(f, 1)
-```
-
-The answer involves two terms: the first is the estimated limit, the second an estimate for the error in the estimation of `f(0)`.
-
-The values the method chooses could be viewed as follows:
-
-```julia; term=true
-extrapolate(1) do x # using `do` notation for the function
-    @show x
-    sin(x)/x
-end
-```
-
-
-
-The `extrapolate` function avoids the numeric problems encountered earlier with this example:
-
-```julia; hold=true
-f(x) = (1 - cos(x)) / x^2
-extrapolate(f, 1)
-```
-
-To find limits at a value of `c` not equal to `0`, we set the `x0` argument.
For example,
-
-```julia; hold=true
-f(x) = (sqrt(x) - 5) / (sqrt(x-16) - 3)
-c = 25
-extrapolate(f, 26, x0=25)
-```
-
-This value can also be `Inf`, in anticipation of infinite limits to be discussed in a subsequent section:
-
-```julia; hold=true
-f(x) = (x^2 - 2x + 1)/(x^3 - 3x^2 + 2x + 1)
-extrapolate(f, 10, x0=Inf)
-```
-
-(The starting value should be to the right of any zeros of the denominator.)
-
-
-## Symbolic approach to limits
-
-
-The `SymPy` package provides a `limit` function for finding the limit of an expression in a given variable. It must be loaded, as was done initially. The `limit` function's use requires the expression, the variable, and a value for $c$. (Similar to the three things in the notation $\lim_{x \rightarrow c}f(x)$.)
-
-For example, the limit at $0$ of $(1-\cos(x))/x^2$ is easily handled:
-
-```julia
-@syms x::real
-limit((1 - cos(x)) / x^2, x => 0)
-```
-
-The pair notation (`x => 0`) is used to indicate the variable and the value it is going to.
-
-##### Example
-
-We look again at this function which, despite having a vertical asymptote at ``x=15/11``, has the property that it is positive for all floating point values, making both a numeric and graphical approach impossible:
-
-```math
-f(x) = x^2 + 1 + \log(| 11 \cdot x - 15 |)/99.
-```
-
-We find the limit symbolically at ``c=15/11`` as follows, taking care to use the exact value `15//11` and not the *floating point* approximation returned by `15/11`:
-
-```julia; hold=true;
-f(x) = x^2 + 1 + log(abs(11x - 15))/99
-limit(f(x), x => 15 // 11)
-```
-
-##### Example
-
-Find the [limits](http://en.wikipedia.org/wiki/L%27H%C3%B4pital%27s_rule):
-
-```math
-\lim_{x \rightarrow 0} \frac{2\sin(x) - \sin(2x)}{x - \sin(x)}, \quad
-\lim_{x \rightarrow 0} \frac{e^x - 1 - x}{x^2}, \quad
-\lim_{\rho \rightarrow 1} \frac{x^{1-\rho} - 1}{1 - \rho}.
-```
-
-We have for the first:
-
-```julia;
-limit( (2sin(x) - sin(2x)) / (x - sin(x)), x => 0)
-```
-
-The second is similarly done, though here we define a function for variety:
-
-```julia; hold=true;
-f(x) = (exp(x) - 1 - x) / x^2
-limit(f(x), x => 0)
-```
-
-Finally, for the third we define a new variable and proceed:
-
-```julia;
-@syms rho::real
-limit( (x^(1-rho) - 1) / (1 - rho), rho => 1)
-```
-
-This last limit demonstrates that the `limit` function of `SymPy` can readily evaluate limits that involve parameters, though at times some assumptions on the parameters may be needed, as was done through `rho::real`.
-
-However, for some cases the assumptions will not be enough, as they are broad. (E.g., something might be true for some values of the parameter and not others, and these values aren't captured in the assumptions.) So the user must be mindful that when parameters are involved, the answer may not reflect all possible cases.
-
-##### Example: floating point conversion issues
-
-The Gruntz [algorithm](http://www.cybertester.com/data/gruntz.pdf) implemented in `SymPy` for symbolic limits is quite powerful. However, some care must be exercised to avoid undesirable conversions from exact values to floating point values.
-
-
-In a previous example, we used `15//11` and not `15/11`, as the former converts to an *exact* symbolic value for use in `SymPy`, but the latter would be approximated in floating point *before* this conversion, so the exactness would be lost.
-
-To illustrate further, let's look at the limit as $x$ goes to $\pi/2$ of $j(x) = \cos(x) / (x - \pi/2)$. We follow our past practice:
-
-```julia
-j(x) = cos(x) / (x - pi/2)
-j(pi/2)
-```
-
-The value is not `NaN`, but rather `Inf`. This is because `cos(pi/2)` is not exactly $0$, as it should be mathematically, since `pi/2` is rounded to a floating point number. This minor difference is important.
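A direct evaluation shows the small, nonzero value involved (an added check):

```julia
cos(pi/2)  # 6.123233995736766e-17 -- tiny, but not 0
```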
If we try to correct for this by using `PI` we have:
-
-```julia;
-limit(j(x), x => PI/2)
-```
-
-The value is not right, as this simple graph suggests the limit is in fact $-1$:
-
-```julia;
-plot(j, pi/4, 3pi/4)
-```
-
-The difference between `pi` and `PI` can be significant, and though usually `pi` is silently converted to `PI`, it doesn't happen here, as the division by `2` happens first, which turns `pi` into an approximate floating point number. Hence, `SymPy` is giving the correct answer for the problem it is given; it just isn't the problem we wanted to look at.
-
-Trying again, being more aware of how `pi` and `PI` differ, we have:
-
-```julia; hold=true;
-f(x) = cos(x) / (x - PI/2)
-limit(f(x), x => PI/2)
-```
-
-(The value `pi` is able to be exactly converted to `PI` when used in `SymPy`, as it is of type `Irrational` and is not a floating point value. However, the expression `pi/2` converts `pi` to a floating point value and then divides by `2`, hence the loss of exactness when used symbolically.)
-
-##### Example: left and right limits
-
-Right and left limits will be discussed in the next section; here we give an example of the idea. The mathematical convention is to say a limit exists if both the left *and* right limits exist and are equal. Informally, a right (left) limit at ``c`` only considers values of ``x`` greater (less) than ``c``. The `limit` function of `SymPy` finds directional limits by default, a right limit, where ``x > c``.
-
-The left limit can be found by passing the argument `dir="-"`. Passing `dir="+-"` (and not `"-+"`) will compute the mathematical limit, throwing an error in `Python` if no limit exists.
-
-```julia
-limit(ceil(x), x => 0), limit(ceil(x), x => 0, dir="-")
-```
-
-
-This accurately shows the limit does not exist mathematically, but `limit(ceil(x), x => 0)` does exist (as it finds a right limit).
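The `floor` function mentioned earlier behaves the same way. This added example assumes the `SymPy` setup above; the two one-sided limits should differ, so no mathematical limit exists at ``0``:

```julia
using SymPy            # already loaded in this section
@syms x::real
# right limit (the default) and left limit of floor at 0; expect 0 and -1
limit(floor(x), x => 0), limit(floor(x), x => 0, dir="-")
```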
- - -## Rules for limits - -The `limit` function doesn't compute limits from the definition, -rather it applies some known facts about functions within a set of -rules. Some of these rules are the following. Suppose the individual limits of $f$ and $g$ always exist (and are finite) below. - -```math -\begin{align*} -\lim_{x \rightarrow c} (a \cdot f(x) + b \cdot g(x)) &= a \cdot - \lim_{x \rightarrow c} f(x) + b \cdot \lim_{x \rightarrow c} g(x) - &\\ -%% -\lim_{x \rightarrow c} f(x) \cdot g(x) &= \lim_{x \rightarrow c} - f(x) \cdot \lim_{x \rightarrow c} g(x) - &\\ -%% -\lim_{x \rightarrow c} \frac{f(x)}{g(x)} &= - \frac{\lim_{x \rightarrow c} f(x)}{\lim_{x \rightarrow c} g(x)} - &(\text{provided }\lim_{x \rightarrow c} g(x) \neq 0)\\ -\end{align*} -``` - -These are verbally described as follows, when the individual limits exist and are finite then: - -* Limits involving sums, differences or scalar multiples of functions - *exist* **and** can be **computed** by first doing the individual - limits and then combining the answers appropriately. - -* Limits of products exist and can be found by computing the limits of the - individual factors and then combining. - -* Limits of ratios *exist* and can be found by computing the limit of the - individual terms and then dividing **provided** you don't divide by - ``0``. 
The last part is really important, as this rule is no help with the common indeterminate form ``0/0``.
-
-
-In addition, consider the composition:
-
-```math
-\lim_{x \rightarrow c} f(g(x))
-```
-
-Suppose that
-
-* The outer limit, ``\lim_{x \rightarrow b} f(x) = L``, exists, and
-* the inner limit, ``\lim_{x \rightarrow c} g(x) = b``, exists **and**
-* for some neighborhood around ``c`` (not including ``c``) ``g(x)`` is not ``b``.
-
-Then the limit exists and equals ``L``:
-
-`` \lim_{x \rightarrow c} f(g(x)) = \lim_{u \rightarrow b} f(u) = L.``
-
-An alternative is to assume ``f(x)`` is defined at ``b`` and equal to ``L`` (which is the definition of continuity), but that isn't the assumption above, hence the need to exclude ``g`` from taking on a value of ``b`` (where ``f`` may not be defined) near ``c``.
-
-
-These rules, together with the fact that our basic algebraic functions have limits that can be found by simple evaluation, mean that many limits are easy to compute.
-
-##### Example: composition
-
-For example, consider for some non-zero $k$ the following limit:
-
-```math
-\lim_{x \rightarrow 0} \frac{\sin(kx)}{x}.
-```
-
-This is clearly related to the function $f(x) = \sin(x)/x$, which has a limit of ``1`` as ``x \rightarrow 0``. Writing ``\sin(kx)/x = k \cdot f(kx)``, we see the function in question is ``k \cdot f(g(x))`` with ``g(x) = kx``. As ``g(x) \rightarrow 0``, though not taking the value ``0`` except when ``x=0``, the limit above is ``k \lim_{x \rightarrow 0} f(kx) = k \lim_{u \rightarrow 0} f(u) = k \cdot 1 = k``.
-
-
-Basically, when taking a limit as $x$ goes to $0$ we can multiply $x$ by any non-zero constant and figure out the limit for that. (It is as though we "go to" $0$ faster or slower, but are still going to $0$.)
-
-
-Similarly,
-
-```math
-\lim_{x \rightarrow 0} \frac{\sin(x^2)}{x^2} = 1,
-```
-
-as this is the limit of ``f(g(x))`` with ``f`` as above and ``g(x) = x^2``. We need ``g(x) \rightarrow 0`` as ``x \rightarrow 0`` with ``g(x)`` taking the value ``0`` only at ``x=0``, which is the case.
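The composition rule can be checked symbolically for a parameter. A small added example (it assumes `SymPy` is loaded as above; `k` is declared positive so its sign is determined):

```julia
using SymPy                  # already loaded in this section
@syms x::real k::positive
limit(sin(k*x)/x, x => 0)    # expect the symbolic value k
```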
-
-##### Example: products
-
-Consider this complicated limit found on this [Wikipedia](http://en.wikipedia.org/wiki/L%27H%C3%B4pital%27s_rule) page.
-
-```math
-\lim_{x \rightarrow 1/2} \frac{\sin(\pi x)}{\pi x} \cdot \frac{\cos(\pi x)}{1 - (2x)^2}.
-```
-
-We know the first factor has a limit found by evaluation: $2/\pi$, so it is really just a constant. The second we can compute:
-
-```julia;
-l(x) = cos(PI*x) / (1 - (2x)^2)
-limit(l, 1//2)
-```
-
-Putting these together, we would get $1/2$, which we could have computed directly in this case:
-
-```julia;
-limit(sin(PI*x)/(PI*x) * l(x), x => 1//2)
-```
-
-##### Example: ratios
-
-Consider again the limit of $\cos(\pi x) / (1 - (2x)^2)$ at $c=1/2$. A graph of both the top and bottom functions shows the indeterminate form $0/0$:
-
-```julia;
-plot(cos(pi*x), 0.4, 0.6)
-plot!(1 - (2x)^2)
-```
-
-However, following Euler's insight that $\sin(x)/x$ will have a limit at $0$ of $1$, as $\sin(x) \approx x$ and $x/x$ has a limit of $1$ at $c=0$, we can see that $\cos(\pi x)$ looks like $-\pi\cdot (x - 1/2)$ and $(1 - (2x)^2)$ looks like $-4(x-1/2)$ around $x=1/2$:
-
-```julia;
-plot(cos(pi*x), 0.4, 0.6)
-plot!(-pi*(x - 1/2))
-```
-
-
-```julia;
-plot(1 - (2x)^2, 0.4, 0.6)
-plot!(-4(x - 1/2))
-```
-
-So around $c=1/2$ the ratio should look like $-\pi (x-1/2) / ( -4(x - 1/2)) = \pi/4$, which indeed it does, as that is the limit.
-
-This is the basis of L'Hôpital's rule, which we will return to once the derivative is discussed.
-
-
-##### Example: sums
-
-If it is known that the following limit exists by some means:
-
-```math
-L = 0 = \lim_{x \rightarrow 0} \frac{e^{\csc(x)}}{e^{\cot(x)}} - (1 + \frac{1}{2}x + \frac{1}{8}x^2)
-```
-
-Then this limit will exist
-
-```math
-M = \lim_{x \rightarrow 0} \frac{e^{\csc(x)}}{e^{\cot(x)}}
-```
-
-
-Why? We can express the function ``e^{\csc(x)}/e^{\cot(x)}`` as the above function plus the polynomial ``1 + x/2 + x^2/8``.
The above is then the sum of two functions whose limits exist and are finite; hence, we can conclude that ``M = 0 + 1 = 1``.
-
-### The [squeeze](http://en.wikipedia.org/wiki/Squeeze_theorem) theorem
-
-We note one more limit law. Suppose we wish to compute ``\lim_{x \rightarrow c}f(x)`` and we have two other functions, ``l`` and ``u``, satisfying:
-
-* for all ``x`` near ``c`` (possibly not including ``c``) ``l(x) \leq f(x) \leq u(x)``.
-* These limits exist and are equal: ``L = \lim_{x \rightarrow c} l(x) = \lim_{x \rightarrow c} u(x)``.
-
-Then the limit of ``f`` must also be ``L``.
-
-```julia; hold=true; echo=false
-
-function squeeze_example(x)
-    x₀ = 0.5
-    plot(cos, 0, x₀, label="cos")
-    plot!(x -> sin(x)/x, label = "sin(x)/x")
-    plot!(x -> 1, label = "y=1")
-    plot!([x,x], [ cos(x₀), 1], linestyle=:dash, label="")
-    scatter!([x,x,x], [cos(x), sin(x)/x, 1], label="")
-end
-
-anim = @animate for x ∈ (0.4, 0.3, 0.2, 0.1, 0.05, 0.01)
-    squeeze_example(x)
-end
-
-imgfile = tempname() * ".gif"
-gif(anim, imgfile, fps = 1)
-
-
-caption = """
-As ``x`` goes to ``0``, the values of ``sin(x)/x`` are squeezed between ``\\cos(x)`` and ``1`` which both converge to ``1``.
-"""
-ImageFile(imgfile, caption)
-```
-
-
-## Limits from the definition
-
-The formal definition of a limit involves clarifying what it means for $f(x)$ to be "close to $L$" when $x$ is "close to $c$". These are quantified by the inequalities $0 < \lvert x-c\rvert < \delta$ and $\lvert f(x) - L\rvert < \epsilon$. The second does not have the restriction that it is greater than $0$, as indeed $f(x)$ can equal $L$. The order is important: it says for any idea of close for $f(x)$ to $L$, an idea of close must be found for $x$ to $c$.
-
-The key is identifying a value for $\delta$ for a given value of $\epsilon$.
-
-A simple case is the linear case. Consider the function $f(x) = 3x + 2$. Verify that the limit at $c=1$ is $5$.
-
-We show "numerically" that $\delta = \epsilon/3$ works.
- -```julia; hold=true; -f(x) = 3x + 2 -c, L = 1, 5 -epsilon = rand() # some number in (0,1) -delta = epsilon / 3 -xs = c .+ delta * rand(100) # 100 numbers, c < x < c + delta -as = [abs(f(x) - L) < epsilon for x in xs] -all(as) # are all the as true? -``` - -These lines produce a random $\epsilon$, the resulting $\delta$, and then verify for 100 numbers -within $(c, c+\delta)$ that the inequality $\lvert f(x) - L \rvert < \epsilon$ -holds for each. Running them again and again should always produce -`true` if $L$ is the limit and $\delta$ is chosen properly. - -(Of course, we should also verify values to the left of $c$.) - - -(The random numbers are technically in ``[0,1)``, so in theory `epsilon` could be `0`. So the above approach would be more solid if some guard, such as `epsilon = max(eps(), rand())`, was used. As the formal definition is the domain of paper-and-pencil, we don't fuss.) - - -In this case, $\delta$ is easy to guess, as the function is linear and -has slope $3$. This basically says the $y$ scale is 3 times the $x$ -scale. For non-linear functions, finding $\delta$ for a given -$\epsilon$ can be a challenge. 
For the function $f(x) = x^3$, -illustrated below, a value of $\delta=\epsilon^{1/3}$ is used for $c=0$: - -```julia; hold=true; echo=false; cache=true -## {{{ limit_e_d }}} -function make_limit_e_d(n) - f(x) = x^3 - - xs = range(-.9, stop=.9, length=50) - ys = map(f, xs) - - - plt = plot(f, -.9, .9, legend=false, size=fig_size) - if n == 0 - nothing - else - k = div(n+1,2) - epsilon = 1/2^k - delta = cbrt(epsilon) - if isodd(n) - plot!(plt, xs, 0*xs .+ epsilon, color=:orange) - plot!(plt, xs, 0*xs .- epsilon, color=:orange) - else - plot!(delta * [-1, 1], epsilon * [ 1, 1], color=:orange) - plot!(delta * [ 1, -1], epsilon * [-1,-1], color=:orange) - plot!(delta * [-1, -1], epsilon * [-1, 1], color=:red) - plot!(delta * [ 1, 1], epsilon * [-1, 1], color=:red) - end - end - plt -end - - -n = 11 -anim = @animate for i=1:n - make_limit_e_d(i-1) -end - -imgfile = tempname() * ".gif" -gif(anim, imgfile, fps = 1) - - -caption = L""" - -Demonstration of $\epsilon$-$\delta$ proof of $\lim_{x \rightarrow 0} -x^3 = 0$. For any $\epsilon>0$ (the orange lines) there exists a -$\delta>0$ (the red lines of the box) for which the function $f(x)$ -does not leave the top or bottom of the box (except possibly at the -edges). In this example $\delta^3=\epsilon$. 
- -""" - -ImageFile(imgfile, caption) -``` - - -## Questions - - -###### Question - -From the graph, find the limit: - -```math -L = \lim_{x\rightarrow 1} \frac{x^2−3x+2}{x^2−6x+5} -``` - - -```julia; hold=true; echo=false -f(x) = (x^2 - 3x +2) / (x^2 - 6x + 5) -plot(f, 0,2) -``` - - -```julia; hold=true; echo=false -answ = 1/4 -numericq(answ, 1e-1) -``` - - -###### Question - -From the graph, find the limit $L$: - -```math -L = \lim_{x \rightarrow -2} \frac{x}{x+1} \frac{x^2}{x^2 + 4} -``` - -```julia; hold=true; echo=false -f(x) = x/(x+1)*x^2/(x^2+4) -plot(f, -3, -1.25) -``` - -```julia; hold=true; echo=false -f(x) = x/(x+1)*x^2/(x^2+4) -val = f(-2) -numericq(val, 1e-1) -``` - - - -###### Question - -Graphically investigate the limit - -```math -L = \lim_{x \rightarrow 0} \frac{e^x - 1}{x}. -``` - -What is the value of $L$? - - -```julia; hold=true; echo=false -f(x) = (exp(x) - 1)/x -p = plot(f, -1, 1) -``` - - -```julia; hold=true; echo=false -val = N(limit((exp(x)-1)/x, x => 0)) -numericq(val, 1e-1) -``` - - - -###### Question - -Graphically investigate the limit - -```math -\lim_{x \rightarrow 0} \frac{\cos(x) - 1}{x}. -``` - -The limit exists, what is the value? - -```julia; hold=true; echo=false -val = 0 -numericq(val, 1e-2) -``` - -###### Question - -Select the graph for which there is no limit at ``a``. 
-
-```julia; hold=true; echo=false
-let
-    p1 = plot(;axis=nothing, legend=false)
-    title!(p1, "(a)")
-    plot!(p1, x -> x^2, 0, 2, color=:black)
-    plot!(p1, zero, linestyle=:dash)
-    annotate!(p1,[(1,0,"a")])
-
-    p2 = plot(;axis=nothing, legend=false)
-    title!(p2, "(b)")
-    plot!(p2, x -> 1/(1-x), 0, .95, color=:black)
-    plot!(p2, x-> -1/(1-x), 1.05, 2, color=:black)
-    plot!(p2, zero, linestyle=:dash)
-    annotate!(p2,[(1,0,"a")])
-
-    p3 = plot(;axis=nothing, legend=false)
-    title!(p3, "(c)")
-    plot!(p3, sinpi, 0, 2, color=:black)
-    plot!(p3, zero, linestyle=:dash)
-    annotate!(p3,[(1,0,"a")])
-
-    p4 = plot(;axis=nothing, legend=false)
-    title!(p4, "(d)")
-    plot!(p4, x -> x^x, 0, 2, color=:black)
-    plot!(p4, zero, linestyle=:dash)
-    annotate!(p4,[(1,0,"a")])
-
-    l = @layout[a b; c d]
-    p = plot(p1, p2, p3, p4, layout=l)
-    imgfile = tempname() * ".png"
-    savefig(p, imgfile)
-    hotspotq(imgfile, (1/2,1), (1/2,1))
-end
-```
-
-###### Question
-
-The following limit is commonly used:
-
-```math
-\lim_{h \rightarrow 0} \frac{e^{x + h} - e^x}{h} = L.
-```
-
-Factoring out $e^x$ from the top and using rules of limits this becomes:
-
-```math
-L = e^x \lim_{h \rightarrow 0} \frac{e^h - 1}{h}.
-```
-
-What is $L$?
-
-
-```julia; hold=true; echo=false
-choices = ["``0``", "``1``", "``e^x``"]
-answ = 3
-radioq(choices, answ)
-```
-
-
-
-
-
-###### Question
-
-The following limit is commonly used:
-
-```math
-\lim_{h \rightarrow 0} \frac{\sin(x + h) - \sin(x)}{h} = L.
-```
-
-The answer should depend on $x$, though it is possible it is a constant. Using the sum formula for sine and the rules of limits, this can be written as:
-
-```math
-L = \cos(x) \lim_{h \rightarrow 0}\frac{\sin(h)}{h} + \sin(x) \lim_{h \rightarrow 0}\frac{\cos(h)-1}{h}.
-```
-
-Using the last result, what is the value of $L$?
- -```julia; hold=true; echo=false -choices = ["``\\cos(x)``", "``\\sin(x)``", "``1``", "``0``", "``\\sin(h)/h``"] -answ = 1 -radioq(choices, answ) -``` - - -###### Question - -Find the limit as $x$ goes to $2$ of - -```math -f(x) = \frac{3x^2 - x -10}{x^2 - 4} -``` - -```julia; hold=true; echo=false -f(x) = (3x^2 - x - 10)/(x^2 - 4); -val = convert(Float64, N(limit(f(x), x => 2))) -numericq(val) -``` - - -###### Question - -Find the limit as $x$ goes to $-2$ of - -```math -f(x) = \frac{\frac{1}{x} + \frac{1}{2}}{x^3 + 8} -``` - -```julia; hold=true; echo=false -f(x) = ((1/x) + (1/2))/(x^3 + 8) -numericq(-1/48, .001) -``` - - -###### Question - -Find the limit as $x$ goes to $27$ of - -```math -f(x) = \frac{x - 27}{x^{1/3} - 3} -``` - -```julia; hold=true; echo=false -f(x) = (x - 27)/(x^(1//3) - 3) -val = N(limit(f(x), x => 27)) -numericq(val) -``` - -###### Question - -Find the limit - -```math -L = \lim_{x \rightarrow \pi/2} \frac{\tan (2x)}{x - \pi/2} -``` - - -```julia; hold=true; echo=false -f(x) = tan(2x)/(x-PI/2) -val = N(limit(f(x), x => PI/2)) -numericq(val) -``` - - -###### Question - -The limit of $\sin(x)/x$ at $0$ has a numeric value. This depends upon -the fact that $x$ is measured in radians. Try to find this limit: -`limit(sind(x)/x, x => 0)`. What is the value? - -```julia; hold=true; echo=false -choices = [q"0", q"1", q"pi/180", q"180/pi"] -answ = 3 -radioq(choices, answ) -``` - - -What is the limit `limit(sinpi(x)/x, x => 0)`? - -```julia; hold=true; echo=false -choices = [q"0", q"1", q"pi", q"1/pi"] -answ = 3 -radioq(choices, answ) -``` - -###### Question: limit properties - -There are several properties of limits that allow one to break down -more complicated problems into smaller subproblems. 
For example,
-
-```math
-\lim (f(x) + g(x)) = \lim f(x) + \lim g(x)
-```
-
-is notation to indicate that one can take the limit of the sum of two functions, or take the limit of each first and then add, and the answer will be unchanged, provided all the limits in question exist.
-
-Use one or the other to find the limit of $f(x) = \sin(x) + \tan(x) + \cos(x)$ as $x$ goes to $0$.
-
-```julia; hold=true; echo=false
-f(x) = sin(x) + tan(x) + cos(x)
-numericq(f(0), 1e-5)
-```
-
-###### Question
-
-The key assumption made above in being able to write
-
-```math
-\lim_{x\rightarrow c} f(g(x)) = L,
-```
-
-when ``\lim_{x\rightarrow b} f(x) = L`` and ``\lim_{x\rightarrow c}g(x) = b`` is *continuity*.
-
-This [example](https://en.wikipedia.org/wiki/Limit_of_a_function#Limits_of_compositions_of_functions) shows why it is important.
-
-Take
-
-```math
-f(x) = \begin{cases}
-0 & x \neq 0\\
-1 & x = 0
-\end{cases}
-```
-
-We have ``\lim_{x\rightarrow 0}f(x) = 0``, as ``0`` is clearly a removable discontinuity. So were the above applicable, we would have ``\lim_{x \rightarrow 0}f(f(x)) = 0``. But this is not true. What is the limit at ``0`` of ``f(f(x))``?
-
-```julia, echo=false
-numericq(1)
-```
-
-
-
-###### Question
-
-Does this function have a limit as $h$ goes to $0$ from the right (that is, assume $h>0$)?
-
-```math
-\frac{h^h - 1}{h}
-```
-
-
-```julia; hold=true; echo=false
-choices = [
-"Yes, the value is `-9.2061`",
-"Yes, the value is `-11.5123`",
-"No, the value heads to negative infinity"
-];
-answ = 3;
-radioq(choices, answ)
-```
-
-###### Question
-
-Compute the limit
-
-```math
-\lim_{x \rightarrow 1} \frac{x}{x-1} - \frac{1}{\ln(x)}.
-```
-
-```julia; hold=true; echo=false
-f(x) = x/(x-1) - 1/log(x)
-val = convert(Float64, N(limit(f(x), x => 1)))
-numericq(val)
-```
-
-###### Question
-
-Compute the limit
-
-```math
-\lim_{x \rightarrow 1/2} \frac{1}{\pi} \frac{\cos(\pi x)}{1 - (2x)^2}.
-```
-
-```julia; hold=true; echo=false
-f(x) = 1/PI * cos(PI*x)/(1 - (2x)^2)
-val = N(limit(f(x), x => 1//2))
-numericq(val)
-```
-
-###### Question
-
-Some limits involve parameters. For example, suppose we define `ex` as follows:
-
-```julia; hold=true;
-@syms m::real k::real
-ex = (1 + k*x)^(m/x)
-```
-
-What is `limit(ex, x => 0)`?
-
-```julia; hold=true; echo=false
-choices = ["``e^{km}``", "``e^{k/m}``", "``k/m``", "``m/k``", "``0``"]
-answ = 1
-radioq(choices, answ)
-```
-
-###### Question
-
-For a given ``a``, what is
-
-```math
-L = \lim_{x \rightarrow 0+} (1 + a\cdot (e^{-x} -1))^{(1/x)}
-```
-
-```julia; hold=true; echo=false
-choices = ["``e^{-a}``", "``e^a``", "``a``", "``L`` does not exist"]
-radioq(choices, 1)
-```
-
-###### Question
-
-For positive integers ``m`` and ``n`` what is
-
-```math
-\lim_{x \rightarrow 1} \frac{x^{1/m}-1}{x^{1/n}-1}?
-```
-
-```julia; hold=true; echo=false
-choices = ["``m/n``", "``n/m``", "``mn``", "The limit does not exist"]
-radioq(choices, 2)
-```
-
-###### Question
-
-What does `SymPy` find for the limit of `ex` (`limit(ex, x => 0)`), as defined here:
-
-```julia; hold=true
-@syms x a
-ex = (a^x - 1)/x
-```
-
-```julia; hold=true; echo=false
-choices = ["``\\log(a)``", "``a``", "``e^a``", "``e^{-a}``"]
-radioq(choices, 1)
-```
-
-Should `SymPy` have needed an assumption like
-
-```julia
-@syms a::positive
-```
-
-```julia, echo=false
-yesnoq("yes")
-```
-
-###### Question: The squeeze theorem
-
-Let's look at the function $f(x) = x \sin(1/x)$. A graph around $0$ can be made with:
-
-```julia; hold=true;
-f(x) = x == 0 ? NaN : x * sin(1/x)
-c, delta = 0, 1/4
-plot(f, c - delta, c + delta)
-plot!(abs)
-plot!(x -> -abs(x))
-```
-
-This graph clearly oscillates near $0$. To the graph of $f$, we added graphs of both $g(x) = \lvert x\rvert$ and $h(x) = - \lvert x\rvert$. From this graph it is easy to see by the "squeeze theorem" that the limit at $x=0$ is $0$. Why?

```julia; hold=true; echo=false
choices=[L"""The functions $g$ and $h$ both have a limit of $0$ at $x=0$ and the function $f$ is in
between both $g$ and $h$, so must also have a limit of $0$.
""",
L"The functions $g$ and $h$ squeeze each other as $g(x) > h(x)$",
L"The function $f$ has no limit - it oscillates too much near $0$"]
answ = 1
radioq(choices, answ)
```

(The [Wikipedia](https://en.wikipedia.org/wiki/Squeeze_theorem) entry for the squeeze theorem has this unverified, but colorful detail:

> In many languages (e.g. French, German, Italian, Hungarian and Russian), the squeeze theorem is also known as the two policemen (and a drunk) theorem, or some variation thereof. The story is that if two policemen are escorting a drunk prisoner between them, and both officers go to a cell, then (regardless of the path taken, and the fact that the prisoner may be wobbling about between the policemen) the prisoner must also end up in the cell.

)

###### Question

Archimedes, in finding bounds on the value of ``\pi``, used ``n``-gons with sides ``12, 24, 48,`` and ``96``. This was so the trigonometry involved could be solved exactly for the interior angles (e.g. ``n=12`` has an interior angle of ``\pi/6``, which has `sin` and `cos` computable by simple geometry; see [Damini and Abhishek](https://arxiv.org/pdf/2008.07995.pdf)). These exact solutions led to subsequent bounds. A more modern approach to bound the circumference of a circle of radius ``r`` using an ``n``-gon with interior angle ``\theta`` would be to use the trigonometric functions.
An upper bound would be found using the triangle with angle ``\theta/2``, opposite side ``x``, and adjacent side ``r``:

```julia
@syms theta::real r::real
```

```julia; hold=true;
x = r * tan(theta/2)
n = 2PI/theta # using PI to avoid floating point roundoff in 2pi
# C < n * 2x
upper = n*2x
```


A lower bound would use the triangle with angle ``\theta/2``, hypotenuse ``r``, and opposite side ``x``:

```julia; hold=true;
x = r*sin(theta/2)
n = 2PI/theta
# C > n * 2x
lower = n*2x
```

Using the above, find the limit of `upper` and `lower`. Are the two equal and equal to a familiar value?

```julia; hold=true; echo=false
yesnoq("yes")
```


(If so, then the squeeze theorem would say that ``\pi`` is the common limit.)
diff --git a/CwJ/limits/limits_extensions.jmd b/CwJ/limits/limits_extensions.jmd
deleted file mode 100644
index d2920d4..0000000
--- a/CwJ/limits/limits_extensions.jmd
+++ /dev/null
@@ -1,978 +0,0 @@
# Limits, issues, extensions of the concept

This section uses the following add-on packages:

```julia
using CalculusWithJulia
using Plots
using SymPy
```


```julia; echo=false; results="hidden"
using CalculusWithJulia.WeaveSupport
using DataFrames

const frontmatter = (
    title = "Limits, issues, extensions of the concept",
    description = "Calculus with Julia: Limits, issues, extensions of the concept",
    tags = ["CalculusWithJulia", "limits", "limits, issues, extensions of the concept"],
);
nothing
```

----

The limit of a function at $c$ need not exist for one of many
different reasons. Some of these reasons can be handled with
extensions to the concept of the limit; others are just problematic in
terms of limits. This section covers examples of each.


Let's begin with a function that is just problematic.
Consider - -```math -f(x) = \sin(1/x) -``` - -As this is a composition of nice functions it will have a limit -everywhere except possibly when $x=0$, as then $1/x$ may not have a -limit. So rather than talk about where it is nice, let's consider the -question of whether a limit exists at $c=0$. - - -A graph shows the issue: - -```julia; hold=true; echo=false -f(x) = sin(1/x) -plot(f, range(-1, stop=1, length=1000)) -``` - -The graph oscillates between $-1$ and $1$ infinitely many times on -this interval - so many times, that no matter how close one zooms in, -the graph on the screen will fail to capture them all. Graphically, -there is no single value of $L$ that the function gets close to, as it -varies between all the values in $[-1,1]$ as $x$ gets close to $0$. A -simple proof that there is no limit, is to take any $\epsilon$ less -than $1$, then with any $\delta > 0$, there are infinitely many $x$ -values where $f(x)=1$ and infinitely many where $f(x) = -1$. That is, -there is no $L$ with $|f(x) - L| < \epsilon$ when $\epsilon$ is less than $1$ for all $x$ near $0$. - -This function basically has too many values it gets close to. Another -favorite example of such a function is the function that is $0$ if $x$ -is rational and $1$ if not. This function will have no limit anywhere, -not just at $0$, and for basically the same reason as above. - - -The issue isn't oscillation though. Take, for example, the function -$f(x) = x \cdot \sin(1/x)$. This function again has a limit everywhere -save possibly $0$. But in this case, there is a limit at $0$ of -$0$. This is because, the following is true: - -```math --|x| \leq x \sin(1/x) \leq |x|. 
```

The following figure illustrates:

```julia; hold=true;
f(x) = x * sin(1/x)
plot(f, -1, 1)
plot!(abs)
plot!(x -> -abs(x))
```


The [squeeze](http://en.wikipedia.org/wiki/Squeeze_theorem) theorem of
calculus is the formal reason $f$ has a limit at $0$, as both the
upper function, $|x|$, and the lower function, $-|x|$, have a limit of
$0$ at $0$.

## Right and left limits

Another example where $f(x)$ has no limit is the function $f(x) = x /|x|, x \neq 0$. This
function is $-1$ for negative $x$ and $1$ for positive $x$. Again,
this function will have a limit everywhere except possibly at $x=0$,
where division by $0$ is possible.

Its graph is

```julia; hold=true;
f(x) = abs(x)/x
plot(f, -2, 2)
```

The sharp jump at $0$ is misleading - again, the plotting algorithm
just connects the points; it doesn't handle what is a fundamental
discontinuity well - the function is not defined at $0$ and jumps
from $-1$ to $1$ there. Similarly to our example of $\sin(1/x)$, near
$0$ the function gets close to both $1$ and $-1$, so will have no
limit. (Again, just take $\epsilon$ smaller than $1$.)

But unlike the previous example, this function *would* have a limit if
the definition didn't consider values of $x$ on both sides of $c$. The
limit on the right side would be $1$, the limit on the left side would
be $-1$. This distinction is useful, so there is an extension of the idea of a
limit to *one-sided limits*.


Let's loosen up the language in the definition of a limit to read:

> The limit of $f(x)$ as $x$ approaches $c$ is $L$ if for every
> neighborhood, $V$, of $L$ there is a neighborhood, $U$, of $c$ for
> which $f(x)$ is in $V$ for every $x$ in $U$, except possibly $x=c$.

The $\epsilon-\delta$ definition has $V = (L-\epsilon, L + \epsilon)$
and $U=(c-\delta, c+\delta)$. This is a rewriting of $L-\epsilon <
f(x) < L + \epsilon$ as $|f(x) - L| < \epsilon$.

Now for the definition:


> A function $f(x)$ has a limit on the right of $c$, written $\lim_{x
> \rightarrow c+}f(x) = L$, if for every $\epsilon > 0$ there exists a
> $\delta > 0$ such that whenever $0 < x - c < \delta$ it holds that
> $|f(x) - L| < \epsilon$. That is, $U$ is $(c, c+\delta)$.

Similarly, a limit on the left is defined where $U=(c-\delta, c)$.

The `SymPy` function `limit` has a keyword argument `dir="+"` or
`dir="-"` to request that a one-sided limit be formed. The default is `dir="+"`. Passing `dir="+-"` will compute both one-sided limits and throw an error if the two are not equal, consistent with no limit existing.

```julia;
@syms x
```

```julia;hold=true
f(x) = abs(x)/x
limit(f(x), x=>0, dir="+"), limit(f(x), x=>0, dir="-")
```


!!! warning
    That means the mathematical limit need not exist when `SymPy`'s `limit` returns an answer, as `SymPy` is only carrying out a one-sided limit. Explicitly passing `dir="+-"` or checking that both `limit(ex, x=>c)` and `limit(ex, x=>c, dir="-")` are equal would be needed to confirm a limit exists mathematically.


The relation between the two concepts is that a function has a limit at $c$ if
and only if the left and right limits exist and are equal. This
function $f$ has both existing, but the two limits are not equal.


There are other such functions that jump. Another useful one is the
floor function, which just rounds down to the nearest integer. A graph shows the basic shape:

```julia;
plot(floor, -5,5)
```

Again, the (nearly) vertical lines are an artifact of the graphing
algorithm and not actual points that solve $y=f(x)$. The floor
function has limits except at the integers. There the left and right
limits differ.

Consider the limit at $c=0$. If $0 < x < 1/2$, say, then $f(x) = 0$ as
we round down, so the right limit will be $0$. However, if $-1/2 < x <
0$, then $f(x) = -1$, again as we round down, so the left limit
will be $-1$.
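These two one-sided limits at ``0`` can be spot-checked with `SymPy` (a quick sketch added here, assuming `floor` carries over to symbolic values, as it does in `SymPy`):

```julia
using SymPy
@syms x
# one-sided limits of the floor function at 0:
# the right limit should be 0, the left limit -1
limit(floor(x), x => 0, dir="+"), limit(floor(x), x => 0, dir="-")
```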
Again, with this example both the left and right limits
exist, but at the integer values they are not equal, as they differ
by 1.


Some functions only have one-sided limits, as they are not defined in
an interval around $c$. There are many examples, but we will take
$f(x) = x^x$ and consider ``c=0``. This function is not well defined for $x <
0$, so it is typical to just take the domain to be $x > 0$. Still, it
has a right limit $\lim_{x \rightarrow 0+} x^x = 1$. `SymPy` can verify:

```julia;
limit(x^x, x => 0, dir="+")
```

This agrees with the IEEE convention of assigning `0^0` to be `1`.

However, not all such functions with indeterminate forms of $0^0$ will
have a limit of $1$.

##### Example

Consider this funny graph:

```julia; hold=true; echo=false
xs = range(0, stop=1, length=50)

plot(x->x^2, -2, -1, legend=false)
plot!(exp, -1, 0)
plot!(x -> 1-2x, 0, 1)
plot!(sqrt, 1, 2)
plot!(x -> 1-x, 2, 3)
```

Describe the limits at $-1$, $0$, and $1$.

* At $-1$ we see a jump; there is no limit but instead a left limit of $1$ and a right limit of $e^{-1} \approx 0.37$.

* At $0$ we see a limit of $1$.

* Finally, at $1$ again there is a jump, so no limit. Instead, the left limit is $-1$ and the right limit $1$.




## Limits at infinity

The loose definition of a horizontal asymptote is "a line such that
the distance between the curve and the line approaches $0$ as they
tend to infinity." This sounds like it should be defined by a
limit. The issue is that the limit would be at $\pm\infty$ and not
some finite $c$. This requires the idea of a neighborhood of $c$, $0 < |x-c| < \delta$, to be
reworked.

The basic idea for a limit at $+\infty$ is that for any $\epsilon$,
there exists an $M$ such that when $x > M$ it must be that $|f(x) - L|
< \epsilon$. For a horizontal asymptote, the line would be
$y=L$. Similarly, a limit at $-\infty$ can be defined with $x < M$
being the condition.
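To make the ``M``-``\epsilon`` pairing concrete, here is a small numeric check (a sketch, not part of the formal definition): for ``f(x) = 1/x`` with ``L = 0``, any ``M \geq 1/\epsilon`` works, which we can sample:

```julia
f(x) = 1/x
L, epsilon = 0, 0.01
M = 1/epsilon                       # claim: x > M forces |f(x) - L| < epsilon
xs = M .+ (1:5) .* 10.0             # a few sample points beyond M
all(abs.(f.(xs) .- L) .< epsilon)   # true for these samples
```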


Let's consider some cases.

The function $f(x) = \sin(x)$ will not have a limit at $+\infty$ for
exactly the same reason that $f(x) = \sin(1/x)$ does not have a limit
at $c=0$ - it just oscillates between $-1$ and $1$ so never
eventually gets close to a single value.

`SymPy` gives an odd answer here indicating the range of values:

```julia;
limit(sin(x), x => oo)
```

(We used `SymPy`'s `oo` for $\infty$ and not `Inf`.)

----


However, a damped oscillation, such as $f(x) = e^{-x} \sin(x)$, will have a limit:

```julia;
limit(exp(-x)*sin(x), x => oo)
```


----

Rational functions have the expected limit. In this
example the numerator and denominator have equal degree ($m = n$), so we get a horizontal asymptote that is not $y=0$:

```julia;
limit((x^2 - 2x +2)/(4x^2 + 3x - 2), x=>oo)
```
----

Though rational functions can have at most one horizontal asymptote, this isn't true for all functions. Consider the function $f(x) = x / \sqrt{x^2 + 4}$. It has different limits depending on whether ``x`` goes to ``\infty`` or ``-\infty``:

```julia;hold=true;
f(x) = x / sqrt(x^2 + 4)
limit(f(x), x=>oo), limit(f(x), x=>-oo)
```

(A simpler example showing this behavior is just the function $x/|x|$ considered earlier.)

##### Example: Limits at infinity and right limits at ``0``

Given a function ``f``, the question of whether this exists:

```math
\lim_{x \rightarrow \infty} f(x)
```

can be reduced to the question of whether this limit exists:

```math
\lim_{x \rightarrow 0+} f(1/x)
```

So whether ``\lim_{x \rightarrow 0+} \sin(1/x)`` exists is equivalent to whether ``\lim_{x\rightarrow \infty} \sin(x)`` exists, which clearly does not due to the oscillatory nature of ``\sin(x)``.


Similarly, one can make this reduction

```math
\lim_{x \rightarrow c+} f(x) =
\lim_{x \rightarrow 0+} f(c + x) =
\lim_{x \rightarrow \infty} f(c + \frac{1}{x}).
-``` - -That is, right limits can be analyzed as limits at ``\infty`` or right limits at ``0``, should that prove more convenient. - - - - - -## Limits of infinity - -Vertical asymptotes are nicely defined with horizontal asymptotes by -the graph getting close to some line. However, the formal definition -of a limit won't be the same. For a vertical asymptote, the value of -$f(x)$ heads towards positive or negative infinity, not some finite -$L$. As such, a neighborhood like $(L-\epsilon, L+\epsilon)$ will no -longer make sense, rather we replace it with an expression like $(M, -\infty)$ or $(-\infty, M)$. As in: the limit of $f(x)$ as $x$ -approaches $c$ is *infinity* if for every $M > 0$ there exists a -$\delta>0$ such that if $0 < |x-c| < \delta$ then $f(x) > M$. Approaching $-\infty$ would conclude with $f(x) < -M$ for all $M>0$. - -##### Examples - -Consider the function $f(x) = 1/x^2$. This will have a limit at every -point except possibly $0$, where division by $0$ is possible. In this -case, there is a vertical asymptote, as seen in the following graph. The limit at $0$ is $\infty$, in -the extended sense above. For $M>0$, we can take any $0 < \delta < -1/\sqrt{M}$. The following graph shows $M=25$ where the function -values are outside of the box, as $f(x) > M$ for those $x$ values with $0 < |x-0| < 1/\sqrt{M}$. - -```julia; hold=true; echo=false -f(x) = 1/x^2 -M = 25 -delta = 1/sqrt(M) - -f(x) = 1/x^2 > 50 ? NaN : 1/x^2 -plot(f, -1, 1, legend=false) -plot!([-delta, delta], [M,M], color=colorant"orange") -plot!([-delta, -delta], [0,M], color=colorant"red") -plot!([delta, delta], [0,M], color=colorant"red") -``` - ----- - -The function $f(x)=1/x$ requires us to talk about left and right limits of infinity, with the natural generalization. 
We can see that the left limit at $0$ is $-\infty$ and the right limit $\infty$:

```julia; hold=true; echo=false
f(x) = 1/x
plot(f, 1/50, 1, color=:blue, legend=false)
plot!(f, -1, -1/50, color=:blue)
```

`SymPy` agrees:

```julia; hold=true;
f(x) = 1/x
limit(f(x), x=>0, dir="-"), limit(f(x), x=>0, dir="+")
```



----

Consider the function $g(x) = x^x(1 + \log(x)), x > 0$. Does this have a *right* limit at $0$?

A quick graph shows that a limit may be $-\infty$:

```julia;
g(x) = x^x * (1 + log(x))
plot(g, 1/100, 1)
```

We can check with `SymPy`:

```julia;
limit(g(x), x=>0, dir="+")
```

## Limits of sequences

After all this, we still can't formalize the basic question asked in
the introduction to limits: what is the area contained in a parabola? For that
we developed a sequence of sums: $s_n = 1/2 \cdot ((1/4)^0 + (1/4)^1 + (1/4)^2 +
\cdots + (1/4)^n)$. This isn't a function of $x$, but rather depends
only on non-negative integer values of $n$. However, the same idea as
a limit at infinity can be used to define a limit.

> Let $a_0,a_1, a_2, \dots, a_n, \dots$ be a sequence of values indexed by $n$.
> We have $\lim_{n \rightarrow \infty} a_n = L$ if for every $\epsilon > 0$ there exists an $M>0$ where if $n > M$ then $|a_n - L| < \epsilon$.

Common language is that the sequence *converges* when the limit exists and otherwise *diverges*.

The above is essentially the same as a limit *at* infinity for a function,
but in this case the function's domain is only the non-negative
integers.

`SymPy` is happy to compute limits of sequences. Defining this one involving a sum is best done with the `summation` function:

```julia;
@syms i::integer n::(integer, positive)
s(n) = 1//2 * summation((1//4)^i, (i, 0, n)) # rationals make for an exact answer
limit(s(n), n=>oo)
```

##### Example

The limit

```math
\lim_{x \rightarrow 0} \frac{e^x - 1}{x} = 1,
```

is an important limit.
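Before deriving it, the limit can be spot-checked numerically (a quick sketch; `expm1` computes ``e^x - 1`` accurately for small ``x``):

```julia
f(x) = expm1(x) / x          # (e^x - 1)/x, computed stably near 0
[f(10.0^-k) for k in 1:5]    # the values approach 1
```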
Using the definition of ``e^x`` as the limit of a sequence:

```math
e^x = \lim_{n \rightarrow \infty} (1 + \frac{x}{n})^n,
```

we can establish the limit using the squeeze theorem. First,

```math
A = |(1 + \frac{x}{n})^n - 1 - x| = |\Sigma_{k=0}^n {n \choose k}(\frac{x}{n})^k - 1 - x| = |\Sigma_{k=2}^n {n \choose k}(\frac{x}{n})^k|,
```

the first two terms cancelling off. The above comes from the binomial expansion theorem for a polynomial. Now ``{n \choose k} \leq n^k``, so we have

```math
A \leq \Sigma_{k=2}^n |x|^k = |x|^2 \frac{1 - |x|^{n-1}}{1 - |x|} \leq
\frac{|x|^2}{1 - |x|},
```

using the *geometric* sum formula with ``|x| < 1``:

```julia; hold=true
@syms x n i
summation(x^i, (i,0,n))
```


As this holds for all ``n``, as ``n`` goes to ``\infty`` we have:

```math
|e^x - 1 - x| \leq \frac{|x|^2}{1 - |x|}
```

Dividing both sides by ``|x|`` and noting that as ``x \rightarrow 0``, ``|x|/(1-|x|)`` goes to ``0`` by continuity, the squeeze theorem gives the limit:

```math
\lim_{x \rightarrow 0} \left(\frac{e^x -1}{x} - 1\right) = 0.
```


That ``{n \choose k} \leq n^k`` can be viewed as follows: the left side counts the number of combinations of ``k`` choices from ``n`` distinct items, which is less than the number of permutations of ``k`` choices, which in turn is less than the number of ways to choose ``k`` items from ``n`` distinct ones *with* replacement -- which is what ``n^k`` counts.





### Some limit theorems for sequences

The limit discussion first defined limits of scalar univariate functions at a point ``c`` and then added generalizations. The pedagogical approach can be reversed by starting the discussion with limits of sequences and then generalizing from there. This approach relies on a few theorems to be gathered along the way that are mentioned here for the curious reader:

* Convergent sequences are bounded.
* All *bounded* monotone sequences converge.
* Every bounded sequence has a convergent subsequence.
(Bolzano-Weierstrass)
* The limit of ``f`` at ``c`` exists and equals ``L`` if and only if for *every* sequence ``x_n`` in the domain of ``f`` converging to ``c`` the sequence ``s_n = f(x_n)`` converges to ``L``.


## Summary

The following table captures the various changes to the definition of
the limit to accommodate some of the possible behaviors.

```julia; echo=false
limit_type=[
"limit",
"right limit",
"left limit",
L"limit at $\infty$",
L"limit at $-\infty$",
L"limit of $\infty$",
L"limit of $-\infty$",
"limit of a sequence"
]

Notation=[
L"\lim_{x\rightarrow c}f(x) = L",
L"\lim_{x\rightarrow c+}f(x) = L",
L"\lim_{x\rightarrow c-}f(x) = L",
L"\lim_{x\rightarrow \infty}f(x) = L",
L"\lim_{x\rightarrow -\infty}f(x) = L",
L"\lim_{x\rightarrow c}f(x) = \infty",
L"\lim_{x\rightarrow c}f(x) = -\infty",
L"\lim_{n \rightarrow \infty} a_n = L"
]

Vs = [
L"(L-\epsilon, L+\epsilon)",
L"(L-\epsilon, L+\epsilon)",
L"(L-\epsilon, L+\epsilon)",
L"(L-\epsilon, L+\epsilon)",
L"(L-\epsilon, L+\epsilon)",
L"(M, \infty)",
L"(-\infty, M)",
L"(L-\epsilon, L+\epsilon)"
]

Us = [
L"(c - \delta, c+\delta)",
L"(c, c+\delta)",
L"(c - \delta, c)",
L"(M, \infty)",
L"(-\infty, M)",
L"(c - \delta, c+\delta)",
L"(c - \delta, c+\delta)",
L"(M, \infty)"
]

d = DataFrame(Type=limit_type, Notation=Notation, V=Vs, U=Us)
table(d)
```

[Ross](https://doi.org/10.1007/978-1-4614-6271-2) summarizes this by enumerating the 15 different *related* definitions for ``\lim_{x \rightarrow a} f(x) = L`` that arise from ``L`` being either finite, ``-\infty``, or ``+\infty`` and ``a`` being any of ``c``, ``c-``, ``c+``, ``-\infty``, or ``+\infty``.

## Rates of growth

Consider two functions ``f`` and ``g`` to be *comparable* if there are positive integers ``m`` and ``n`` with *both*

```math
\lim_{x \rightarrow \infty} \frac{f(x)^m}{g(x)} = \infty \quad\text{and }
\lim_{x \rightarrow \infty} \frac{g(x)^n}{f(x)} = \infty.
```

The first says ``g`` is eventually bounded by a power of ``f``, the second that ``f`` is eventually bounded by a power of ``g``.

Here we consider which families of functions are *comparable*.


First consider ``f(x) = x^3`` and ``g(x) = x^4``. We can take ``m=2`` and ``n=1`` to verify that ``f`` and ``g`` are comparable:

```julia
fx, gx = x^3, x^4
limit(fx^2/gx, x=>oo), limit(gx^1 / fx, x=>oo)
```

Similarly for any pairs of powers, so we could conclude ``f(x) = x^n`` and ``g(x) =x^m`` are comparable. (However, as is easily observed, for ``m`` and ``n`` both positive integers ``\lim_{x \rightarrow \infty} x^{m+n}/x^m = \infty`` and ``\lim_{x \rightarrow \infty} x^{m}/x^{m+n} = 0``, consistent with our discussion on rational functions that higher-order polynomials dominate lower-order polynomials.)


Now consider ``f(x) = x`` and ``g(x) = \log(x)``. These are not comparable, as there will be no ``n`` large enough. We might say ``x`` dominates ``\log(x)``.

```julia
limit(log(x)^n / x, x => oo)
```

As ``x`` could be replaced by any monomial ``x^k``, we can say "powers" grow faster than "logarithms".


Now consider ``f(x)=x`` and ``g(x) = e^x``. These are not comparable, as there will be no ``m`` large enough:

```julia
@syms m::(positive, integer)
limit(x^m / exp(x), x => oo)
```

That is, ``e^x`` grows faster than any power of ``x``.


Now, if ``a, b > 1`` then ``f(x) = a^x`` and ``g(x) = b^x`` will be comparable.
Take ``m`` so that ``a^m > b`` and ``n`` so that ``b^n > a``, as then, say,

```math
\frac{(a^x)^m}{b^x} = \frac{a^{xm}}{b^x} = \frac{(a^m)^x}{b^x} = (\frac{a^m}{b})^x,
```

which will go to ``\infty`` as ``x \rightarrow \infty`` as ``a^m/b > 1``.


Finally, consider ``f(x) = \exp(x^2)`` and ``g(x) = \exp(x)^2``. Are these comparable?
No, as no ``n`` is large enough:

```julia; hold=true;
@syms x n::(positive, integer)
fx, gx = exp(x^2), exp(x)^2
limit(gx^n / fx, x => oo)
```

A negative test for comparability is the following: if

```math
\lim_{x \rightarrow \infty} \frac{\log(|f(x)|)}{\log(|g(x)|)} = 0,
```

then ``f`` and ``g`` are not comparable (and ``g`` grows faster than ``f``). Applying this to the last two functions (with ``\exp(x)^2`` in the numerator), we have

```math
\lim_{x \rightarrow \infty}\frac{\log(\exp(x)^2)}{\log(\exp(x^2))} =
\lim_{x \rightarrow \infty}\frac{2\log(\exp(x))}{x^2} =
\lim_{x \rightarrow \infty}\frac{2x}{x^2} = 0,
```

so ``f(x) = \exp(x^2)`` grows faster than ``g(x) = \exp(x)^2``.


----

Keeping in mind that logarithms grow slower than powers, which grow slower than exponentials (base ``a > 1``), can help in understanding growth at ``\infty``, much as a comparison of leading terms does for rational functions.


We can immediately put this to use to compute ``\lim_{x\rightarrow 0+} x^x``. We first express this problem using ``x^x = (\exp(\ln(x)))^x = e^{x\ln(x)}``. Rewriting ``u(x) = \exp(\ln(u(x)))``, which only uses the basic inverse relation between the two functions, can often be a useful step.


As ``f(x) = e^x`` is a suitably nice function (continuous) so that the limit of a composition can be computed through the limit of the inside function, ``x\ln(x)``, it is enough to see what ``\lim_{x\rightarrow 0+} x\ln(x)`` is. We *re-express* this as a limit at ``\infty``:

```math
\lim_{x\rightarrow 0+} x\ln(x) = \lim_{x \rightarrow \infty} (1/x)\ln(1/x) =
\lim_{x \rightarrow \infty} \frac{-\ln(x)}{x} = 0
```

The last equality follows, as the function ``x`` dominates the function ``\ln(x)``. So by the limit rule involving compositions we have: ``\lim_{x\rightarrow 0+} x^x = e^0 = 1``.

## Questions

###### Question

Select the graph for which the limit at ``a`` is infinite.
- -```julia; hold=true; echo=false -p1 = plot(;axis=nothing, legend=false) -title!(p1, "(a)") -plot!(p1, x -> x^2, 0, 2, color=:black) -plot!(p1, zero, linestyle=:dash) -annotate!(p1,[(1,0,"a")]) - -p2 = plot(;axis=nothing, legend=false) -title!(p2, "(b)") -plot!(p2, x -> 1/(1-x), 0, .95, color=:black) -plot!(p2, x-> -1/(1-x), 1.05, 2, color=:black) -plot!(p2, zero, linestyle=:dash) -annotate!(p2,[(1,0,"a")]) - -p3 = plot(;axis=nothing, legend=false) -title!(p3, "(c)") -plot!(p3, sinpi, 0, 2, color=:black) -plot!(p3, zero, linestyle=:dash) -annotate!(p3,[(1,0,"a")]) - -p4 = plot(;axis=nothing, legend=false) -title!(p4, "(d)") -plot!(p4, x -> x^x, 0, 2, color=:black) -plot!(p4, zero, linestyle=:dash) -annotate!(p4,[(1,0,"a")]) - -l = @layout[a b; c d] -p = plot(p1, p2, p3, p4, layout=l) -imgfile = tempname() * ".png" -savefig(p, imgfile) -hotspotq(imgfile, (1/2,1), (1/2,1)) -``` - -###### Question - -Select the graph for which the limit at ``\infty`` appears to be defined. - -```julia; hold=true; echo=false -p1 = plot(;axis=nothing, legend=false) -title!(p1, "(a)") -plot!(p1, x -> x^2, 0, 2, color=:black) -plot!(p1, zero, linestyle=:dash) - -p2 = plot(;axis=nothing, legend=false) -title!(p2, "(b)") -plot!(p2, x -> 1/(1-x), 0, .95, color=:black) -plot!(p2, x-> -1/(1-x), 1.05, 2, color=:black) -plot!(p2, zero, linestyle=:dash) - -p3 = plot(;axis=nothing, legend=false) -title!(p3, "(c)") -plot!(p3, sinpi, 0, 2, color=:black) -plot!(p3, zero, linestyle=:dash) - -p4 = plot(;axis=nothing, legend=false) -title!(p4, "(d)") -plot!(p4, x -> x^x, 0, 2, color=:black) -plot!(p4, zero, linestyle=:dash) - -l = @layout[a b; c d] -p = plot(p1, p2, p3, p4, layout=l) -imgfile = tempname() * ".png" -savefig(p, imgfile) -hotspotq(imgfile, (1/2,1), (1/2,1)) -``` - - -###### Question - -Consider the function $f(x) = \sqrt{x}$. - -Does this function have a limit at every $c > 0$? 
- -```julia; hold=true; echo=false -booleanq(true, labels=["Yes", "No"]) -``` - -Does this function have a limit at $c=0$? - - -```julia; hold=true; echo=false -booleanq(false, labels=["Yes", "No"]) -``` - - -Does this function have a right limit at $c=0$? - -```julia; hold=true; echo=false -booleanq(true, labels=["Yes", "No"]) -``` - -Does this function have a left limit at $c=0$? - -```julia; hold=true; echo=false -booleanq(false, labels=["Yes", "No"]) -``` - -##### Question - -Find $\lim_{x \rightarrow \infty} \sin(x)/x$. - -```julia; hold=true; echo=false -numericq(0) -``` - - -###### Question - -Find $\lim_{x \rightarrow \infty} (1-\cos(x))/x^2$. - -```julia; hold=true; echo=false -numericq(0) -``` - - -###### Question - -Find $\lim_{x \rightarrow \infty} \log(x)/x$. - -```julia; hold=true; echo=false -numericq(0) -``` - - - - - -###### Question - -Find $\lim_{x \rightarrow 2+} (x-3)/(x-2)$. - -```julia; hold=true; echo=false -choices=["``L=-\\infty``", "``L=-1``", "``L=0``", "``L=\\infty``"] -answ = 1 -radioq(choices, answ) -``` - -Find $\lim_{x \rightarrow -3-} (x-3)/(x+3)$. - - - -```julia; hold=true; echo=false -choices=["``L=-\\infty``", "``L=-1``", "``L=0``", "``L=\\infty``"] -answ = 4 -radioq(choices, answ) -``` - -###### Question - -Let ``f(x) = \exp(x + \exp(-x^2))`` and ``g(x) = \exp(-x^2)``. Compute: - -```math -\lim_{x \rightarrow \infty} \frac{\ln(f(x))}{\ln(g(x))}. -``` - -```julia; hold=true;echo=false -@syms x -ex = log(exp(x + exp(-x^2))) / log(exp(-x^2)) -val = N(limit(ex, x => oo)) -numericq(val) -``` - -###### Question - -Consider the following expression: - -```julia; -ex = 1/(exp(-x + exp(-x))) - exp(x) -``` - -We want to find the limit, ``L``, as ``x \rightarrow \infty``, which we assume exists below. - -We first rewrite `ex` using `w` as `exp(-x)`: - -```julia -@syms w -ex1 = ex(exp(-x) => w) -``` - -As ``x \rightarrow \infty``, ``w \rightarrow 0+``, so the limit at ``0+`` of `ex1` is of interest. 

Use this fact to find ``L``:

```julia
limit(ex1 - (w/2 - 1), w=>0)
```

``L`` is:

```julia; hold=true; echo=false
numericq(-1)
```

(This awkward approach is generalizable: replacing the limit as ``w \rightarrow 0`` of an expression with the limit of a polynomial in `w` that is easy to identify.)




###### Question

As mentioned, for limits that depend on specific values of parameters `SymPy` may have issues.
As an example, `SymPy` has an issue with this limit, whose answer depends on the value of ``k``:

```math
\lim_{x \rightarrow 0+} \frac{\sin(\sin(x^2))}{x^k}.
```



Note, regardless of ``k`` you find:

```julia; hold=true;
@syms x::real k::integer
limit(sin(sin(x^2))/x^k, x=>0)
```

For which value(s) of ``k`` in ``1,2,3`` is this actually the correct answer? (Do the above ``3`` times using a specific value of `k`, not a symbolic one.)

```julia, echo=false
choices = ["``1``", "``2``", "``3``", "``1,2``", "``1,3``", "``2,3``", "``1,2,3``"]
radioq(choices, 1, keep_order=true)
```


###### Question: No limit

Some functions do not have a limit. Make a graph of $\sin(1/x)$ from $0.0001$ to $1$ and look at the output. Why does a limit not exist?

```julia; hold=true; echo=false
choices=["The limit does exist - it is any number from -1 to 1",
"Err, the limit does exist and is 1",
"The function oscillates too much and its y values do not get close to any one value",
"Any function that oscillates does not have a limit."]
answ = 3
radioq(choices, answ)
```



###### Question ``0^0`` is not *always* ``1``

Is the form $0^0$ really indeterminate? As mentioned, `0^0` evaluates to `1`.


Consider this limit:

```math
\lim_{x \rightarrow 0+} x^{k\cdot x} = L.
```

Consider different values of $k$ to see if this limit depends on $k$ or not. What is $L$?
- - -```julia; hold=true; echo=false -choices = ["``1``", "``k``", "``\\log(k)``", "The limit does not exist"] -answ = 1 -radioq(choices, answ) -``` - - -Now, consider this limit: - -```math -\lim_{x \rightarrow 0+} x^{1/\log_k(x)} = L. -``` - -In `julia`, $\log_k(x)$ is found with `log(k,x)`. The default, `log(x)` takes $k=e$ so gives the natural log. So, we would define `h`, for a given `k`, with - -```julia; echo=false -k = 10 # say. Replace with actual value -h(x) = x^(1/log(k, x)) -``` - - - -Consider different values of $k$ to see if the limit depends on $k$ or not. What is $L$? - - -```julia; hold=true; echo=false -choices = ["``1``", "``k``", "``\\log(k)``", "The limit does not exist"] -answ = 2 -radioq(choices, answ) -``` - -###### Question - -Limits *of* infinity *at* infinity. We could define this concept quite -easily mashing together the two definitions. Suppose we did. Which of -these ratios would have a limit of infinity at infinity: - -```math -x^4/x^3,\quad x^{100+1}/x^{100}, \quad x/\log(x), \quad 3^x / 2^x, \quad e^x/x^{100} -``` - -```julia; hold=true; echo=false -choices=[ -"the first one", -"the first and second ones", -"the first, second and third ones", -"the first, second, third, and fourth ones", -"all of them"] -answ = 5 -radioq(choices, answ, keep_order=true) -``` - - -###### Question - -A slant asymptote is a line $mx + b$ for which the graph of $f(x)$ -gets close to as $x$ gets large. We can't express this directly as a -limit, as "$L$" is not a number. How might we? 

```julia; hold=true; echo=false
choices = [
L"We can talk about the limit at $\infty$ of $f(x) - (mx + b)$ being $0$",
L"We can talk about the limit at $\infty$ of $f(x) - mx$ being $b$",
L"We can say $f(x) - (mx+b)$ has a horizontal asymptote $y=0$",
L"We can say $f(x) - mx$ has a horizontal asymptote $y=b$",
"Any of the above"]
answ = 5
radioq(choices, answ, keep_order=true)
```

###### Question

Suppose a sequence of points $x_n$ converges to $a$ in the limiting sense. For a function $f(x)$, the sequence of points $f(x_n)$ may or may not converge. One alternative definition of a [limit](https://en.wikipedia.org/wiki/Limit_of_a_function#In_terms_of_sequences) due to Heine is that $\lim_{x \rightarrow a}f(x) = L$ if *and* only if **all** sequences $x_n \rightarrow a$ have $f(x_n) \rightarrow L$.

Consider the function $f(x) = \sin(1/x)$, $a=0$, and the two sequences implicitly defined by $1/x_n = \pi/2 + n \cdot (2\pi)$ and $1/y_n = 3\pi/2 + n \cdot(2\pi)$, $n = 0, 1, 2, \dots$.

What is $\lim_{x_n \rightarrow 0} f(x_n)$?

```julia; hold=true; echo=false
numericq(1)
```

What is $\lim_{y_n \rightarrow 0} f(y_n)$?
- -```julia; hold=true; echo=false -numericq(-1) -``` - -This shows that - -```julia; hold=true; echo=false -choices = [L" $f(x)$ has a limit of $1$ as $x \rightarrow 0$", -L" $f(x)$ has a limit of $-1$ as $x \rightarrow 0$", -L" $f(x)$ does not have a limit as $x \rightarrow 0$" -] -answ = 3 -radioq(choices, answ) -``` diff --git a/CwJ/limits/process.jl b/CwJ/limits/process.jl deleted file mode 100644 index 67f63e8..0000000 --- a/CwJ/limits/process.jl +++ /dev/null @@ -1,26 +0,0 @@ -using CwJWeaveTpl - -fnames = [ - "limits", - "limits_extensions", - # - "continuity", - "intermediate_value_theorem" - ] - - -process_file(nm; cache=:off) = CwJWeaveTpl.mmd(nm * ".jmd", cache=cache) - -function process_files(;cache=:user) - for f in fnames - @show f - process_file(f, cache=cache) - end -end - - - -""" -## TODO limits - -""" diff --git a/CwJ/makie-demos/Project.toml b/CwJ/makie-demos/Project.toml deleted file mode 100644 index 2c60422..0000000 --- a/CwJ/makie-demos/Project.toml +++ /dev/null @@ -1,9 +0,0 @@ -[deps] -AbstractPlotting = "537997a7-5e4e-5d89-9595-2241ea00577e" -Distributions = "31c24e10-a181-5473-b8eb-7969acd0382f" -ForwardDiff = "f6369f11-7733-5829-9624-2563aa707210" -GLMakie = "e9467ef8-e4e7-5192-8a1a-b1aee30e663a" -LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e" -Revise = "295af30f-e4ad-537b-8983-00126c2a3abe" -Roots = "f2b01f46-fcfa-551c-844a-d8ac1e96c665" -WGLMakie = "276b4fcb-3e11-5398-bf8b-a0c2d153d008" diff --git a/CwJ/makie-demos/README.md b/CwJ/makie-demos/README.md deleted file mode 100644 index 60ecfaa..0000000 --- a/CwJ/makie-demos/README.md +++ /dev/null @@ -1,15 +0,0 @@ -# Demos - -A collection of little demos made using Makie graphics, which allows interactivity through the dragging of points or the use of simple controls. 
-
-* `tangent-line`: see how the slope of the secant line converges to the slope of the tangent line as `h` goes to 0
-
-* `optimization`: identify the optimal crossing point to minimize time when there are different velocities north and south of the x axis.
-
-* `inscribed-area`: identify the maximal inscribed rectangle.
-
-* `integration`: compare visually the left Riemann approximation, the trapezoid method, and Simpson's method for different values of `n`.
-
-* `spirograph`: adjust parameters for the plotting of [spirograph](https://en.wikipedia.org/wiki/Spirograph) patterns.
-
-* `bezier`: create Bezier curves by dragging control points.
diff --git a/CwJ/makie-demos/bezier.jl b/CwJ/makie-demos/bezier.jl
deleted file mode 100644
index 6a85ebf..0000000
--- a/CwJ/makie-demos/bezier.jl
+++ /dev/null
@@ -1,112 +0,0 @@
-using AbstractPlotting
-using AbstractPlotting.MakieLayout
-using GLMakie
-
-using LinearAlgebra
-
-
-function bezier()
-descr = """
-Bezier Curves: B(t) = ∑(binomial(n,i) * tⁱ * (1-t)⁽ⁿ⁻ⁱ⁾ * Pᵢ)
-"""
-
-## From http://juliaplots.org/MakieReferenceImages/gallery//edit_polygon/index.html:
-function add_move!(scene, points, pplot)
-    idx = Ref(0); dragstart = Ref(false); startpos = Base.RefValue(Point2f0(0))
-    on(events(scene).mousedrag) do drag
-        if ispressed(scene, Mouse.left)
-            if drag == Mouse.down
-                plot, _idx = mouse_selection(scene)
-                if plot == pplot
-                    idx[] = _idx; dragstart[] = true
-                    startpos[] = to_world(scene, Point2f0(scene.events.mouseposition[]))
-                end
-            elseif drag == Mouse.pressed && dragstart[] && checkbounds(Bool, points[], idx[])
-                pos = to_world(scene, Point2f0(scene.events.mouseposition[]))
-
-                # a little weird, but we work with the components,
-                # not the vector
-                z = zero(eltype(pos))
-                x, y = pos
-
-                ptidx = idx[]
-
-                x = clamp(x, -1, 1)
-                y = clamp(y, -1, 1)
-
-                points[][idx[]] = [x, y]
-                points[] = points[]
-            end
-        else
-            dragstart[] = false
-        end
-        return
-    end
-end
-
-upperpoints = Point2f0[(0,0), (5, 0), (5, 5), (0,5)]
-lowerpoints = 
(Point2f0[(0,0), (5, 0), (5, -5), (0,-5)]) - -points = Node(Point2f0[(1, 4), (3, 0), (4,-4.0)]) - -# where we lay our scene: -scene, layout = layoutscene() -layout.halign = :left -layout.valign = :top - - -p = layout[1, 1:2] = LScene(scene) -rowsize!(layout, 1, Auto(1)) -colsize!(layout, 1, Auto(1)) -colsize!(layout, 2, Auto(1)) - -npts = layout[2,1:2] = LSlider(scene, range=3:12, startvalue=4) -layout[3,1:2] = LText(scene, chomp(descr)) - -# points = Node(Point2f0[(-1/2, -1/2), -# (-1/2, 1/2), -# (1/2, 1/2), -# (1/2, -1/2)]) - -#npts = 6 -#ts = range(3pi/2, -pi/2, length=npts+2) -#points = Node(Point2f0[(cos(t),sin(t)) for t in ts[2:end-1]]) - -points = lift(npts.value) do val - ts = range(3pi/2, -pi/2, length=val+2) - Point2f0[(cos(t),sin(t)) for t in ts[2:end-1]] -end - - - - - - -bcurve = lift(points) do pts - n = length(pts) - 1 - B = t -> begin - tot = 0.0 - for (i′, Pᵢ) in enumerate(pts) - i = i′ - 1 - tot += binomial(n, i) * t^i * (1-t)^(n-i) * Pᵢ - end - tot - end - ts = range(0, 1, length=200) - Point2f0[B(t) for t in ts] -end - - - -lines!(p.scene, bcurve, strokecolor=:black, strokewidth=15) -lines!(p.scene, points, strokecolor=:gray90, strokewidth=5, linestyle=:dash) -scatter!(p.scene, points) -xlims!(p.scene, (-1,1)) -ylims!(p.scene, (-1,1)) -add_move!(p.scene, points, p.scene[end]) - - - -scene - -end diff --git a/CwJ/makie-demos/inscribed-area.jl b/CwJ/makie-demos/inscribed-area.jl deleted file mode 100644 index d853944..0000000 --- a/CwJ/makie-demos/inscribed-area.jl +++ /dev/null @@ -1,128 +0,0 @@ -using AbstractPlotting -using AbstractPlotting.MakieLayout -using GLMakie - -using Roots - -## Assumes f(a), f(b) are zero -## only 1 or 2 solutions for f(x) = f(c) for any c in [a,b] -f(x) = 1 - x^4 -a = -1 -b = 1 - - descr = """ -Adjust the point (c, f(c)) with c > 0 to find the inscribed rectangle with maximal area -""" - - -function _inscribed_area(c) - zs = fzeros(u -> f(u) - f(c), a, b) - length(zs) <= 1 ? 
0 : abs(zs[1] - zs[2]) * f(c) -end -D(f) = c -> (f(c + 1e-4) - f(c))/1e-4 -function answer() - h = 1e-4 - zs = fzeros(D(_inscribed_area), 0, b-h) - a,i = findmax(_inscribed_area.(zs)) - a -end - -Answer = answer() - - -## From http://juliaplots.org/MakieReferenceImages/gallery//edit_polygon/index.html: -function add_move!(scene, points, pplot) - idx = Ref(0); dragstart = Ref(false); startpos = Base.RefValue(Point2f0(0)) - on(events(scene).mousedrag) do drag - if ispressed(scene, Mouse.left) - if drag == Mouse.down - plot, _idx = mouse_selection(scene) - if plot == pplot - idx[] = _idx; dragstart[] = true - startpos[] = to_world(scene, Point2f0(scene.events.mouseposition[])) - end - elseif drag == Mouse.pressed && dragstart[] && checkbounds(Bool, points[], idx[]) - - if idx[] == 3 - pos = to_world(scene, Point2f0(scene.events.mouseposition[])) - - # we work with components not vector - x,y = pos - c = clamp(x, a, b) - zs = fzeros(u -> f(u) - f(c), a , b) - - if length(zs) == 1 - c′ = c = first(zs) - else - c′, c = zs - end - - - points[][1] = [c′, 0] - points[][2] = [c′, f(c′)] - points[][3] = [c, f(c)] - points[][4] = [c, 0] - points[] = points[] - end - end - else - dragstart[] = false - end - return - end -end - - -c = b/2 -c′ = first(fzeros(u -> f(u) - f(c), a , b)) -area = round(abs(c - c′) * f(c), digits=4) - -points = Node(Point2f0[(c′,0), (c′, f(c′)), (c, f(c)), (c, 0)]) - - - -# where we lay our scene: -scene, layout = layoutscene() -layout.halign = :left -layout.valign = :top - -p = layout[0,0] = LScene(scene) - -colsize!(layout, 1, Auto(1)) -rowsize!(layout, 1, Auto(1)) - -label = layout[end, 1] = LText(scene, "Area = $area") -layout[end+1,1] = LText(scene, chomp(descr)) - -polycolor = lift(points) do pts - c′ = pts[1][1] - c = pts[3][1] - area = round(abs(c - c′)*f(c), digits=4) - - lbl = "Area = $area" - label.text[] = lbl - - if abs(area - Answer) <= 1e-4 - :green - else - :gray75 - end - - - -end - - -lines!(p.scene, a..b, f, strokecolor=:red, 
strokewidth=15) -lines!(p.scene, a..b, zero, strokecolor=:black, strokewidth=10) -poly!(p.scene, points, color = polycolor) -scatter!(p.scene, points, color = :white, strokewidth = 10, markersize = 0.05, strokecolor = :black, raw = true) -xlims!(p.scene, (a, b)) - - -add_move!(p.scene, points, p.scene[end]) - - - - -scene diff --git a/CwJ/makie-demos/integration.jl b/CwJ/makie-demos/integration.jl deleted file mode 100644 index bdfe3ee..0000000 --- a/CwJ/makie-demos/integration.jl +++ /dev/null @@ -1,142 +0,0 @@ -using AbstractPlotting -using AbstractPlotting.MakieLayout -using GLMakie - -using QuadGK - -function riemann(f::Function, a::Real, b::Real, n::Int; method="right") - if method == "right" - meth = f -> (lr -> begin l,r = lr; f(r) * (r-l) end) - elseif method == "left" - meth = f -> (lr -> begin l,r = lr; f(l) * (r-l) end) - elseif method == "trapezoid" - meth = f -> (lr -> begin l,r = lr; (1/2) * (f(l) + f(r)) * (r-l) end) - elseif method == "simpsons" - meth = f -> (lr -> begin l,r=lr; (1/6) * (f(l) + 4*(f((l+r)/2)) + f(r)) * (r-l) end) - end - - xs = a .+ (0:n) * (b-a)/n - - sum(meth(f), zip(xs[1:end-1], xs[2:end])) -end - - -""" - integration(f) - -Show graphically the left Riemann approximation, the trapezoid approximation, and Simpson's approximation to the integral of `f` over [-1,1]. - -Assumes `f` is non-negative. 
-""" -function integration(f=nothing) - if f == nothing - f = x -> x^2*exp(-x/3) - end - - a, b = -1, 1 - - -function left_riemann_pts(n) - xs = range(a, b, length=n+1) - pts = Point2f0[(xs[1], 0)] - for i in 1:n - xᵢ, xᵢ₊₁ = xs[i], xs[i+1] - fᵢ = f(xᵢ) - push!(pts, (xᵢ, fᵢ)) - push!(pts, (xᵢ₊₁, fᵢ)) - end - push!(pts, (xs[end], 0)) - pts -end - - -function trapezoid_pts(n) - xs = range(a, b, length=n+1) - pts = Point2f0[(xs[1], 0)] - for i in 1:n - xᵢ, xᵢ₊₁ = xs[i], xs[i+1] - fᵢ = f(xᵢ) - push!(pts, (xᵢ, f(xᵢ))) - end - push!(pts, (xs[end], f(xs[end]))) - push!(pts, (xs[end], 0)) - pts -end - -function simpsons_pts(n) - xs = range(a, b, length=n+1) - pts = Point2f0[(xs[1], 0), (xs[1], f(xs[1]))] - - for i in 1:n - xi, xi1 = xs[i], xs[i+1] - m = xi/2 + xi1/2 - p = x -> f(xi)*(x-m)*(x - xi1)/(xi-m)/(xi-xi1) + f(m) * (x-xi)*(x-xi1)/(m-xi)/(m-xi1) + f(xi1) * (x-xi)*(x-m) / (xi1-xi) / (xi1-m) - - xs′ = range(xi, xi1, length=10) - for j in 2:10 - x = xs′[j] - push!(pts, (x, p(x))) - end - end - push!(pts, (xs[end], 0)) - pts -end - - -# where we lay our scene: -scene, layout = layoutscene() -layout.halign = :left -layout.valign = :top - - -p1 = layout[1,1] = LScene(scene) -p2 = layout[1,2] = LScene(scene) -p3 = layout[1,3] = LScene(scene) - -n = layout[2,1] = LSlider(scene, range=2:25, startvalue=2) -output = layout[2, 2:3] = LText(scene, "...") - -lpts = lift(n.value) do n - left_riemann_pts(n) -end - -poly!(p1.scene, lpts, color=:gray75) -lines!(p1.scene, a..b, f, color=:black, strokewidth=10) -title(p1.scene, "Left Riemann") - - -tpts = lift(n.value) do n - trapezoid_pts(n) -end -poly!(p2.scene, tpts, color=:gray75) -lines!(p2.scene, a..b, f, color=:black, strokewidth=10) -title(p2.scene, "Trapezoid") - -spts = lift(n.value) do n - simpsons_pts(n) -end -poly!(p3.scene, spts, color=:gray75) -lines!(p3.scene, a..b, f, color=:black, strokewidth=10) -title(p3.scene, "Simpson's") - -on(n.value) do n - actual,err = quadgk(f, -1, 1) - lrr = riemann(f, -1, 1, n, 
method="left") - trap = riemann(f, -1, 1, n, method="trapezoid") - simp = riemann(f, -1, 1, n, method="simpsons") - - Δleft = round(abs(lrr - actual), digits=8) - Δtrap = round(abs(trap - actual), digits=8) - Δsimp = round(abs(simp - actual), digits=8) - - txt = "Riemann: $Δleft, Trapezoid: $Δtrap, Simpson's: $Δsimp" - output.text[] = txt -end -n.value[] = n.value[] - - - - - -scene -end diff --git a/CwJ/makie-demos/optimization.jl b/CwJ/makie-demos/optimization.jl deleted file mode 100644 index ab26eae..0000000 --- a/CwJ/makie-demos/optimization.jl +++ /dev/null @@ -1,151 +0,0 @@ -using AbstractPlotting -using AbstractPlotting.MakieLayout -using GLMakie - -using LinearAlgebra -using Roots -using ForwardDiff -D(f) = x -> ForwardDiff.derivative(f, float(x)) - -descr = """ -An old optimization problem is to find the shortest distance between two -points when the rate of travel differs due to some medium (darker means slower). -In this example, the relative rate can be adjusted (with the slider), and the -various points (by clicking on and dragging a point). From there, the -user can adjust the crossing point to identify the time. 
-""" - -## From http://juliaplots.org/MakieReferenceImages/gallery//edit_polygon/index.html: -function add_move!(scene, points, pplot) - idx = Ref(0); dragstart = Ref(false); startpos = Base.RefValue(Point2f0(0)) - on(events(scene).mousedrag) do drag - if ispressed(scene, Mouse.left) - if drag == Mouse.down - plot, _idx = mouse_selection(scene) - if plot == pplot - idx[] = _idx; dragstart[] = true - startpos[] = to_world(scene, Point2f0(scene.events.mouseposition[])) - end - elseif drag == Mouse.pressed && dragstart[] && checkbounds(Bool, points[], idx[]) - pos = to_world(scene, Point2f0(scene.events.mouseposition[])) - - # very wierd, but we work with components - # not vector - z = zero(eltype(pos)) - x,y = pos - - ptidx = idx[] - - x = clamp(x, 0, 5) - - if ptidx == 1 - y = clamp(y, z, 5) - elseif ptidx == 2 - y = z - elseif ptidx == 3 - y = clamp(y, -5, z) - end - - points[][idx[]] = [x,y] - points[] = points[] - end - else - dragstart[] = false - end - return - end -end - -upperpoints = Point2f0[(0,0), (5, 0), (5, 5), (0,5)] -lowerpoints = (Point2f0[(0,0), (5, 0), (5, -5), (0,-5)]) - -points = Node(Point2f0[(1, 4), (3, 0), (4,-4.0)]) - -# where we lay our scene: -scene, layout = layoutscene() -layout.halign = :left -layout.valign = :top - - -p = layout[1:3, 1] = LScene(scene) -rowsize!(layout, 1, Auto(1)) -colsize!(layout, 1, Auto(1)) - - -flayout = GridLayout() -layout[1,2] = flayout - -flayout[1,1] = LText(scene, chomp(descr)) - -details = flayout[2, 1] = LText(scene, "...") - -λᵣ = flayout[3,1] = LText(scene, "λ = v₁/v₂ = 1") -λ = flayout[4,1] = LSlider(scene, range = -3:0.2:3, startvalue = 0.0) - - - - -tm = lift(λ.value, points) do λ, pts - x0,y0 = pts[1] - x, y = pts[2] - x1, y1 = pts[3] - - v1 = 1 - v2 = v1/2.0^λ - - t(x) = sqrt((x-x0)^2 + y0^2)/v1 + sqrt((x1-x)^2 + y1^2)/v2 - val = t(x) - - details.text[] = "Time: $(round(val, digits=2)) units" - - val - -end - -a = lift(λ.value, points) do λ, pts - x0,y0 = pts[1] - x1, y1 = pts[3] - v1 = 1 - v2 = 
v1/2.0^λ - - t(x) = sqrt((x-x0)^2 + y0^2)/v1 + sqrt((x1-x)^2 + y1^2)/v2 - x′ = fzero(D(t), x0, x1) - t(x′) -end - -λcolor = lift(λ.value) do val - # val = v1/v2 ∈ [1/8, 8] - n = floor(Int, 50 - val/3 * 25) - Symbol("gray" * string(n)) -end - -linecolor = lift(a) do val - abs(val - tm[]) <= 1e-2 ? :green : :white -end - - -on(λ.value) do val - v = round(2.0^(val), digits=2) - txt = "λ = v₁/v₂ = $v" - λᵣ.text[] = txt - - -end - - - -poly!(p.scene, upperpoints, color = :gray50) # neutral color -poly!(p.scene, lowerpoints, color = λcolor) # color depends on λ - -lines!(p.scene, Point2f0[(0,0), (5, 0)], color=:black, strokewidth=5, strokecolor=:black, raw=true) -lines!(p.scene, points, color = linecolor, strokewidth = 10, markersize = 0.05, strokecolor = :black, raw = true) -scatter!(p.scene, points, color = linecolor, strokewidth = 10, markersize = 0.05, strokecolor = :black, raw = true) -xlims!(p.scene, (0,5)) -ylims!(p.scene, (-5,5)) - -add_move!(p.scene, points, p.scene[end]) - - - -scene - diff --git a/CwJ/makie-demos/spirograph.jl b/CwJ/makie-demos/spirograph.jl deleted file mode 100644 index 6fb6e01..0000000 --- a/CwJ/makie-demos/spirograph.jl +++ /dev/null @@ -1,49 +0,0 @@ -using AbstractPlotting -using AbstractPlotting.MakieLayout -using GLMakie - -# GUI for spirograph -# https://en.wikipedia.org/wiki/Spirograph - -function x(t; R=1, k=1/4, l=1/4) - R*[(1-k)*cos(t) + l*k*cos((1-k)/k*t), (1-k)*sin(t) - l*k*sin((1-k)/k*t)] -end - -# where we lay our scene: -scene, layout = layoutscene() - -flyt = GridLayout() -flyt.halign[] = :left # fails? 
-flyt.valign[] = :top - -layout[1,1] = flyt -p = layout[1,2] = LAxis(scene) -rowsize!(layout, 1, Relative(1)) -colsize!(layout, 2, Relative(2/3)) - -flyt[1,1] = LText(scene, "t") -ts = flyt[1,2] = LSlider(scene, range = 2pi:pi/8:40pi) - -flyt[2, 1] = LText(scene, "k = r/R") -k = flyt[2,2] = LSlider(scene, range = 0.01:0.01:1.0, startvalue=1/4) - -flyt[3,1] = LText(scene, "l=ρ/r") -l = flyt[3,2] = LSlider(scene, range = 0.01:0.01:1.0, startvalue=1/4) - - -data = lift(ts.value, k.value, l.value) do ts,k,l - - ts′ = range(0, ts, length=1000) - xys = Point2f0.(x.(ts′, R=1, k=k, l=l)) - -end - -lines!(p, data) -xlims!(p, (-1, 1)) -ylims!(p, (-1, 1)) - - - - -scene - diff --git a/CwJ/makie-demos/tangent-line.jl b/CwJ/makie-demos/tangent-line.jl deleted file mode 100644 index 4f1972c..0000000 --- a/CwJ/makie-demos/tangent-line.jl +++ /dev/null @@ -1,102 +0,0 @@ -using AbstractPlotting -using AbstractPlotting.MakieLayout -using GLMakie - -using ForwardDiff -Base.adjoint(f::Function) = x -> ForwardDiff.derivative(f, float(x)) - - -function tangent_line(f=nothing, a=0, b=pi) - - if f == nothing - f = x -> sin(x) - end - -descr = """ -The tangent line has a slope approximated by the slope of secant lines. 
-This demo allows the points c and c+h to be adjusted to see the two lines""" - -## From http://juliaplots.org/MakieReferenceImages/gallery//edit_polygon/index.html: -function add_move!(scene, points, pplot) - idx = Ref(0); dragstart = Ref(false); startpos = Base.RefValue(Point2f0(0)) - on(events(scene).mousedrag) do drag - if ispressed(scene, Mouse.left) - if drag == Mouse.down - plot, _idx = mouse_selection(scene) - if plot == pplot - idx[] = _idx; dragstart[] = true - startpos[] = to_world(scene, Point2f0(scene.events.mouseposition[])) - end - elseif drag == Mouse.pressed && dragstart[] && checkbounds(Bool, points[], idx[]) - pos = to_world(scene, Point2f0(scene.events.mouseposition[])) - - # we work with components not vector - x,y = pos - - x = clamp(x, a, b) - y = f(x) - - points[][idx[]] = [x,y] - points[] = points[] - end - else - dragstart[] = false - end - return - end -end - - -c, h = pi/4, .5 -points = Node(Point2f0[(c, f(c)), (c+h, f(c+h))]) - - -# where we lay our scene: -scene, layout = layoutscene() -layout.halign = :left -layout.valign = :top - -p = layout[1,1:2] = LScene(scene) - -rowsize!(layout, 1, Auto(1)) -colsize!(layout, 1, Auto(1)) - -layout[2,1:2] = LText(scene, descr) - - -secline = lift(points) do pts - c, ch = pts - x0, y0 = c - x1, y1 = ch - m = (y1 - y0)/(x1 - x0) - sl = x -> y0 + m * (x - x0) - Point2f0[(a, sl(a)), (b, sl(b))] -end - -tangentline = lift(points) do pts - c, ch = pts - x0, y0 = c - m = f'(x0) - tl = x -> y0 + m * (x - x0) - Point2f0[(a, tl(a)), (b, tl(b))] -end - - -lines!(p.scene, a..b, f, strokecolor=:red, strokewidth=15) -lines!(p.scene, secline, color = :blue, strokewidth = 10, raw=true) -lines!(p.scene, tangentline, color = :red, strokewidth = 10, raw=true) -scatter!(p.scene, points, color = :white, strokewidth = 10, markersize = 0.05, strokecolor = :black, raw = true) -xlims!(p.scene, (a, b)) -#ylims!(p.scene, (0, 1.5)) - - -add_move!(p.scene, points, p.scene[end]) - - - -scene - - end - - -tangent_line() diff 
--git a/CwJ/misc/Project.toml b/CwJ/misc/Project.toml deleted file mode 100644 index d00b321..0000000 --- a/CwJ/misc/Project.toml +++ /dev/null @@ -1,9 +0,0 @@ -[deps] -CalculusWithJulia = "a2e0e22d-7d4c-5312-9169-8b992201a882" -HCubature = "19dc6840-f33b-545b-b366-655c7e3ffd49" -ImplicitEquations = "95701278-4526-5785-aba3-513cca398f19" -LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e" -Polynomials = "f27b6e38-b328-58d1-80ce-0feddd5e7a45" -PyPlot = "d330b81b-6aea-500a-939a-2ce795aea3ee" -QuadGK = "1fd47b50-473d-5c70-9696-f719f8f3bcdc" -SymPy = "24249f21-da20-56a4-8eb1-6a02cf4ae2e6" diff --git a/CwJ/misc/bibliography.md b/CwJ/misc/bibliography.md deleted file mode 100644 index 6737f0a..0000000 --- a/CwJ/misc/bibliography.md +++ /dev/null @@ -1,93 +0,0 @@ -# Bibliography, etc. - -(A work in progress...) - -## Historical Books - -* Oeuvres complètes d'Augustin Cauchy. Série 2, tome 4 -Calcul Diferentiel -[link](http://gallica.bnf.fr/ark:/12148/bpt6k90196z/f16.image) - -* -Analyse des infiniment petits, pour l'intelligence des lignes courbes -by L'Hospital, marquis de, 1661-1704 - -[link](https://archive.org/details/infinimentpetits1716lhos00uoft) - -* Fermat on maxmin -http://science.larouchepac.com/fermat/fermat-maxmin.pdf - -* Argobast (1800, primary book for a long time) - -http://books.google.com/books?id=YoPq8uCy5Y8C&printsec=frontcover&source=gbs_ge_summary_r&cad=0#v=onepage&q&f=false - -## Open source text books - -Refer to [open](http://danaernst.com/resources/free-and-open-source-textbooks/) source textbooks to find - -* Strang -http://ocw.mit.edu/resources/res-18-001-calculus-online-textbook-spring-2005/textbook/ - -* Oliver Knill teaching notes -http://www.math.harvard.edu/~knill/teaching/summer2018/handouts.html - -* Open Stax -https://math.libretexts.org/Bookshelves/Calculus/Book%3A_Calculus_(OpenStax) - -* David Guichard (also Neal Koblitz) -http://www.whitman.edu/mathematics/calculus/ - -* Marsden, Weinstein 
-http://www.cds.caltech.edu/~marsden/volume/Calculus/ - - -* Joyner Differential Calculus with Sage based on Granville's text -http://wdjoyner.com/teach/calc1-sage/ - -* Sage for undergraudate -http://wdjoyner.com/teach/calc1-sage/ - -* AI Math -http://aimath.org/textbooks/approved-textbooks/ - - -## Articles - -http://www.ams.org/samplings/feature-column/fc-2012-02 - -http://www.maa.org/external_archive/joma/Volume7/Aktumen/Polygon.html - -* Bressoud - FTC -http://www.math.harvard.edu/~knill/teaching/math1a_2011/exhibits/bressoud/ - -* [Katz](http://www.jstor.org/stable/2689856) and [Katz](https://www.jstor.org/stable/2690275) - - - -## Websites - -* Math insight https://mathinsight.org/ has many informative pages to peruse - -* http://www.math.wpi.edu/IQP/BVCalcHist/calc4.html#_Toc407004376 - -* earliest uses of symbols in calculus -http://jeff560.tripod.com/calculus.html - -* Famous curves index https://www-history.mcs.st-and.ac.uk/Curves/Curves.html. See also [Kokoska](https://elepa.files.wordpress.com/2013/11/fifty-famous-curves.pdf). - -## Videos - -* https://www.coursera.org/learn/calculus1 - -* http://ocw.mit.edu/resources/res-18-005-highlights-of-calculus-spring-2010/ - -* http://ocw.mit.edu/courses/mathematics/18-01sc-single-variable-calculus-fall-2010/ - -* draining conical tank -https://www.youtube.com/watch?v=2jQ1jA8uJuU - -* proof of trapezoid rule -http://www.maa.org/sites/default/files/An_Elementary_Proof30705.pdf - -* Some notes on `Plots.jl` -https://www.math.purdue.edu/~allen450/Plotting-Tutorial.html diff --git a/CwJ/misc/calculus_with_julia.jmd b/CwJ/misc/calculus_with_julia.jmd deleted file mode 100644 index 6a24104..0000000 --- a/CwJ/misc/calculus_with_julia.jmd +++ /dev/null @@ -1,115 +0,0 @@ -# The `CalculusWithJulia` package - -To run the commands in these notes, some external packages must be installed and loaded. 
-
-The `Pluto` interface does this in the background, so there is nothing to do but execute the cells that call `using` or `import`. For `Julia` post version `1.7`, this installation will be initiated for you when `using` is called in the REPL terminal.
-
-
-For other interfaces, using the `CalculusWithJulia` package requires that it first be installed. From the command line, this can be done with this key sequence:
-
-```julia; eval=false
-] add CalculusWithJulia
-```
-
-Or, using the `Pkg` package, the commands would be
-
-```julia; eval=false
-import Pkg
-Pkg.add("CalculusWithJulia")
-```
-
-Installation only needs to be done once.
-
-----
-
-However, for each new `Julia` session the package must be *loaded*, as with the following command:
-
-```julia;
-using CalculusWithJulia
-```
-
-```julia; echo=false; results="hidden"
-using CalculusWithJulia.WeaveSupport
-```
-
-That is all. The rest of this page just provides some details for the interested reader.
-
-
-## The package concept
-
-
-The `Julia` language provides the building blocks for the wider `Julia` ecosystem of packages that enhance and extend the language's applicability.
-
-`Julia` is extended through "packages." Some of these, such as packages for certain math constants and some linear algebra operations, are part of all `Julia` installations and must simply be loaded to be used. Others, such as packages for finding integrals or (automatic) derivatives, are provided by users and must first be *installed* before being used.
-
-### Package installation
-
-Package installation is straightforward, as `Julia` has a package, `Pkg`, that facilitates this.
-
-Since `Julia` version 1.7, just attempting to load a package through `using PackageName` at the *command line* will either load an installed package *or* query for an uninstalled package to be installed before loading. So installation just requires confirming a prompt. 
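The escape-mode command shown above also has an ordinary function form. As a sketch, the `]` commands correspond to functions in the standard `Pkg` library (the commented lines would modify the environment, so they are left inert here):

```julia
# Functional equivalents of the `]` pkg-mode commands.
# Each pair below has the same effect; `Pkg` is a standard library.
import Pkg

Pkg.status()           # like `] status`: list the installed packages
# Pkg.add("QuadGK")    # like `] add QuadGK`: install a registered package
# Pkg.update()         # like `] up`: update installed packages
```

The function form is what a script would use; the `]` mode is more convenient interactively.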
For more control, the command line and `IJulia` provide access to the functions in `Pkg` through the escape command `]`. For example, to find the status of all currently installed packages, the following command can be executed:
-
-```julia; eval=false
-] status
-```
-
-External packages are *typically* installed from GitHub, and if they are registered, installation is as easy as calling `add`:
-
-```julia; eval=false
-] add QuadGK
-```
-
-That command will consult `Julia`'s general registry for the location of the `QuadGK` package, use this location to download the necessary files, build and install any needed dependencies, and then make the package available for use.
-
-For these notes, when the `CalculusWithJulia` package is installed, it will also install many of the other packages that are needed.
-
-
-See [Pkg](https://docs.julialang.org/en/v1/stdlib/Pkg/index.html) for more details, such as how to update the set of available packages.
-
-
-### Using a package
-
-The features of an installed package are not available until the package is brought into the current session. A package need only be *installed* once, but it must be loaded in each session.
-
-To load a package, the `using` keyword is provided:
-
-```julia;
-using QuadGK
-```
-
-The above command will make available all *exported* function names from the `QuadGK` package so they can be used directly, as in:
-
-```julia;
-quadgk(sin, 0, pi)
-```
-
-(A command to find the integral of $f(x) = \sin(x)$ over $[0, \pi]$.)
-
-
-
-### Package details
-
-When a package is *first* loaded after installation, or after some other change, it will go through a *pre-compilation* process. Depending on the package size, this can take a moment to several seconds. This won't happen the second time the package is loaded.
-
-
-However, each subsequent time a package is loaded some further compilation is done, so it can still take some time for a package to load. 
Mostly this is not noticeable, though with the plotting package used in these notes, it is.
-
-
-When a package is loaded, all of its dependent packages are also loaded, but their functions are not immediately available to the user.
-
-
-In *typical* `Julia` usage, each needed package is loaded on demand. This is faster and also keeps the namespace (the collection of variable and function names) smaller, avoiding collisions. However, for these notes, the `CalculusWithJulia` package will load a few of the packages needed for the entire set of notes, not just the current section. This is to make it a bit *easier* for the *beginning* user.
-
-One issue with loading several packages is the possibility that more than one will export a function with the same name, causing a collision. Moreover, at times, there can be dependency conflicts between packages. A suggested workflow is to use projects and, within each project, a minimal set of packages. In Pluto, this is done behind the scenes.
-
-The `Julia` language is designed around having several "generic" functions, each with many different methods depending on their usage. This design allows many different implementations for operations such as addition or multiplication, yet the user only needs to call one function name. Packages can easily extend these generic functions by providing their own methods for their own new types of data. For example, `SymPy`, which adds symbolic math features to `Julia` (using a Python package), extends both `+` and `*` for use with symbolic objects.
-
-This design works great when the "generic" usage matches the needs of the package authors, but there are two common issues that arise:
-
-* The extension of a generic is for a type defined outside the author's package. This is known as "type piracy" and is frowned on, as it can lead to subtle errors. The `CalculusWithJulia` package practices this in one case: using `'` to indicate derivatives for `Function` objects. 
-
-* The desired generic function is not part of base `Julia`. An example might be the `solve` function. This name has a well-defined mathematical usage (e.g., "solve for $x$"), but base `Julia` has no generic `solve` for packages to extend. As it is used by `SymPy` and `DifferentialEquations`, among others, the ecosystem has a stub package, `CommonSolve`, allowing the sharing of this "verb."
diff --git a/CwJ/misc/getting_started_with_julia.jmd b/CwJ/misc/getting_started_with_julia.jmd
deleted file mode 100644
index 9c0778e..0000000
--- a/CwJ/misc/getting_started_with_julia.jmd
+++ /dev/null
@@ -1,118 +0,0 @@
-# Getting started with Julia
-
-
-```julia; echo=false; results="hidden"
-using CalculusWithJulia
-using CalculusWithJulia.WeaveSupport
-nothing
-```
-
-Julia is a freely available, open-source programming language aimed at technical computing.
-
-As it is open source, indeed with a liberal MIT license, it can be installed for free on many types of computers (though not phones or tablets).
-
-## Running Julia through the web
-
-There are a few services for running `Julia` through the web. Mentioned here is [Binder](https://mybinder.org), which provides a web-based interface to `Julia` built around `Jupyter`. `Jupyter` is a wildly successful platform for interacting with different open-source software programs.
-
-[launch binder](https://mybinder.org/v2/gh/CalculusWithJulia/CwJScratchPad.git/master)
-
-
-Clicking the launch link above will open a web page which provides a blank notebook, save for a package used by these notes. However, `Binder` is nowhere near as reliable as a local installation.
-
-
-
-## Installing Julia locally
-
-Installing `Julia` locally is no more difficult than installing other software.
-
-Binaries of `Julia` are provided at [julialang.org](http://julialang.org/downloads/). Julia has an official released version and a developmental version.
Unless there is a compelling reason, the latest released version should be downloaded and installed for use.
-
-For Windows users, there is a `juliaup` program for managing the installation of Julia.
-
-
-The base `Julia` provides a *command-line interface*, or REPL (read-evaluate-print loop).
-
-
-
-
-## Basic interactive usage
-
-Once installed, `Julia` can be started by clicking on an icon or typing `julia` at the command line. Either will open a *command-line interface* for a user to interact with a `Julia` process. The basic workflow is easy: commands are typed and then sent to a `Julia` process when the "return" key is pressed for a complete expression. Then the output is displayed.
-
-
-A command is typed following the *prompt*. An example might be `2 + 2`. To send the command to the `Julia` interpreter, the "return" key is pressed. A complete expression or expressions will then be parsed and evaluated (executed). If the expression is not complete, `julia`'s prompt will still accept input to complete the expression. Type `2 +` to see. (The expression `2 +` is not complete, as the infix operator `+` expects two arguments, one on its left and one on its right.)
-
-```julia; eval=false
-               _
-   _       _ _(_)_     |  Documentation: https://docs.julialang.org
-  (_)     | (_) (_)    |
-   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
-  | | | | | | |/ _` |  |
-  | | |_| | | | (_| |  |  Version 1.7.0 (2021-11-30)
- _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
-|__/                   |
-
-julia> 2 + 2
-4
-```
-
-
-Above, `julia>` is the prompt. These notes will not include the prompt, so that copying-and-pasting can be more easily done. Input and output cells display similarly, though with differences in coloring. For example:
-
-```julia;
-2 + 2
-```
-
-While many prefer a command line for interacting with `Julia`, when learning, a notebook interface is suggested. 
(An IDE like [Julia for Visual Studio Code](https://www.julia-vscode.org/) might be preferred for experienced programmers). In [Julia interfaces](./julia_interfaces.html), we describe two different notebook interfaces that are available through add-on packages. - - - -## Add-on packages - -`Julia` is well on its way towards 10,000 external add-on packages -that enhance the offerings of base `Julia`. We refer to one, -`CalculusWithJulia`, that is designed to accompany these -notes. [Installation notes](./calculus_with_julia.html) are available. - - - -In `Julia` graphics are provided only by add-on packages -- there is no built-in -graphing. This is the case under `Pluto` or `Jupyter` or the command line. - -In these notes, we use the `Plots` package and its default backend. The -`Plots` package provides a common interface to several different -backends; this choice is easily changed. The `gr` backend is used in these notes, though for interactive use the `Plotly` backend has advantages; for more complicated graphics, `pyplot` has some advantages; for publication `PGFPlotsX` has advantages. - -The package, if installed, is loaded as any other package: - -```julia; -using Plots -``` - -With that in hand, to make a graph of a function over a range, we follow this pattern: - -```julia; -plot(sin, 0, 2pi) -``` diff --git a/CwJ/misc/julia_interfaces.jmd b/CwJ/misc/julia_interfaces.jmd deleted file mode 100644 index a68d1f3..0000000 --- a/CwJ/misc/julia_interfaces.jmd +++ /dev/null @@ -1,113 +0,0 @@ -# Julia interfaces - -```julia; echo=false; results="hidden" -using CalculusWithJulia -using CalculusWithJulia.WeaveSupport -using Plots -nothing -``` - - -`Julia` can be used in many different manners. This page describes a few. - - -## The `REPL` - - -Base `Julia` comes with a `REPL` package, which provides a means to interact with `Julia` at the command line. 
- 
-```julia; eval=false
- _
- _ _ _(_)_ | Documentation: https://docs.julialang.org
- (_) | (_) (_) |
- _ _ _| |_ __ _ | Type "?" for help, "]?" for Pkg help.
- | | | | | | |/ _` | |
- | | |_| | | | (_| | | Version 1.7.0 (2021-11-30)
- _/ |\__'_|_|_|\__'_| | Official https://julialang.org/ release
-|__/ |
-
-julia> 2 + 2
-4
-```
-
-The `julia>` prompt is where commands are typed. The `return` key will send a command to the interpreter and the results are displayed in the REPL terminal.
-
-The REPL has many features for editing, for interacting with the package manager, or for interacting with the shell. However, it is command-line based, with no support for mouse interaction. For that, other options are available.
-
-## `Pluto`
-
-The `Pluto` package provides a notebook interface for interacting with `Julia`, which has a few idiosyncrasies as compared to other interfaces.
-
-Pluto is started from the REPL terminal with these two commands:
-
-```julia; eval=false
-using Pluto
-Pluto.run()
-```
-
-
-Primarily, the variables in the notebook are **reactive**, meaning if a variable's value is modified, all references to that variable are also updated. This reactive nature makes it very easy to see the results of slight modifications, and, when coupled with HTML controls, allows easy user interfaces to be developed.
-
-As a result, a variable name may only be used once in the top-level scope. (Names can be reused inside functions, which create their own scope, and in "`let`" blocks, a trick used within these notes.) In the notes, subscripting and unicode variants are used for symbols which are typically repurposed (e.g., `x` or `f`).
-
-Pluto cells may only contain one command, the result of which is displayed *above* the cell. This one command can be a `begin` or `let` block to join multiple statements.
-
-Pluto has a built-in package management system that manages the installation of packages on demand.
-
-`Pluto` notebooks are easily run locally. 
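The one-command-per-cell rule can be sketched with a `begin` block. This is plain `Julia` as well, not Pluto-specific syntax: `begin` groups several statements and returns the value of the last one (the names `a`, `b`, and `total` here are illustrative):

```julia
# One "cell" worth of work: `begin` joins several statements; the
# value of the last statement is what the cell would display.
total = begin
    a = 2
    b = 3
    a + b
end
```

Here `total` is bound to `5`, the value of the final statement `a + b`.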
- 
-`Pluto` notebooks are just `.jl` scripts, so can easily be shared.
-
-## `IJulia`
-
-"Project [Jupyter](https://jupyter.org/) exists to develop open-source software, open-standards, and services for interactive computing across dozens of programming languages." The `IJulia` package allows `Julia` to be one of these programming languages. This package must be installed prior to use.
-
-The Jupyter Project provides two web-based interfaces to `Julia`: the Jupyter notebook and the newer JupyterLab. The [binder](https://mybinder.org/) project uses Jupyter notebooks for its primary interface to `Julia`. To use a binder notebook, follow this link:
-
-[launch binder](https://mybinder.org/v2/gh/CalculusWithJulia/CwJScratchPad.git/master)
-
-
-To run locally, these interfaces are available once `IJulia` is installed. Since version 1.7, the following commands should do this:
-
-```julia; eval=false;
-using IJulia
-notebook()
-```
-
-Should that not work, then this should as well:
-
-```julia; eval=false
-using Pkg
-Pkg.add("PyCall")
-Pkg.add("IJulia")
-```
-
-----
-
-
-The notebook interface has "cells" where one or more commands can be entered.
-
-
-In `IJulia`, a block of commands is sent to the kernel (the `Julia` interpreter) by typing "shift+return" or clicking on a "run" button. The output is printed below a cell, including graphics.
-
-When a cell is evaluating, the leading `[]` has an asterisk (`[*]`), showing the notebook is awaiting the results of the calculation.
-
-Once a cell is evaluated, the leading `[]` has a number inserted (e.g., `[1]`, as in the figure). This number indicates the order of cell evaluation. Once a notebook is interacted with, the state of the namespace need not reflect the top-to-bottom order of the notebook, but rather reflects the order of cell evaluations. 
- 
-To be specific, a variable like `x` may be redefined in a cell above where the variable is initially defined, and this redefinition will hold the current value known to the interpreter. As well, a notebook, when reloaded, may have unevaluated cells with output showing. These will not influence the state of the kernel until they are evaluated.
-
-When a cell's commands are evaluated, the last command executed is displayed. If it is desirable that multiple values be displayed, they can be packed into a tuple. This is done by using commas to separate values. `IJulia` also supports other means to print output (e.g., `@show`, `display`, `print`, ...).
-
-To run all cells in a notebook from top to bottom, the "run all" command under the "Cell" menu is available.
-
-If a calculation takes much longer than anticipated, the kernel can be interrupted through a menu item of "Kernel".
-
-If the kernel appears unresponsive, it can be restarted through a menu item of "Kernel".
-
-Notebooks can be saved (as `*.ipynb` files) for sharing or for reuse. Notebooks can be printed as HTML pages and, if the proper underlying software is available, as formatted pages.
-
-JupyterLab, a variant, has more features, commonly associated with an integrated development environment (IDE).
-
-## VSCode
-
-[Julia for Visual Studio Code](https://www.julia-vscode.org/) provides support for the `Julia` programming language for [VS Code](https://code.visualstudio.com/). VS Code is an open-sourced code editor supported by Microsoft. VS Code provides a cross-platform interface to `Julia` geared towards programming within the language. 
diff --git a/CwJ/misc/logo-60-by-48.png b/CwJ/misc/logo-60-by-48.png deleted file mode 100644 index 505917a..0000000 Binary files a/CwJ/misc/logo-60-by-48.png and /dev/null differ diff --git a/CwJ/misc/logo.jl b/CwJ/misc/logo.jl deleted file mode 100644 index cdc3797..0000000 --- a/CwJ/misc/logo.jl +++ /dev/null @@ -1,53 +0,0 @@ -using Plots - -# https://github.com/JuliaLang/julia-logo-graphics -blue, green, purple, red = :royalblue, :forestgreen, :mediumorchid3, :brown3 - - -function archimedes!(p, n, xy=(0,0), radius=1; color=blue) - - x₀,y₀=xy - ts = range(0, 2pi, length=100) - - plot!(p, x₀ .+ sin.(ts), y₀ .+ cos.(ts), linewidth=2) - - α = ((2π)/n)/2 - αs = (-pi/2 + α):2α:(3pi/2 + α) - r = radius/cos(α) - - xs = x₀ .+ r*cos.(αs) - ys = y₀ .+ r*sin.(αs) - - plot!(p, xs, ys, - fill=true, - fillcolor=color, - alpha=0.4) - - r = radius - xs = x₀ .+ r*cos.(αs) - ys = y₀ .+ r*sin.(αs) - - plot!(p, xs, ys, - fill=true, - fillcolor=color, - alpha=0.8) - - p -end - -gr() -Δ = 2.75 -p = plot(;xlims=(-Δ,Δ), ylims=(-Δ,Δ), - axis=nothing, - xaxis=false, - yaxis=false, - legend=false, - padding = (0.0, 0.0), - background_color = :transparent, - foreground_color = :black, - aspect_ratio=:equal) -archimedes!(p, 5, (-1.5, -1); color=red ) -archimedes!(p, 8, (0, 1); color=green ) -archimedes!(p, 13, (1.5, -1); color=purple ) - -savefig(p, "logo.png") diff --git a/CwJ/misc/logo.png b/CwJ/misc/logo.png deleted file mode 100644 index 8499570..0000000 Binary files a/CwJ/misc/logo.png and /dev/null differ diff --git a/CwJ/misc/quick_notes.jmd b/CwJ/misc/quick_notes.jmd deleted file mode 100644 index 07ae6f4..0000000 --- a/CwJ/misc/quick_notes.jmd +++ /dev/null @@ -1,1097 +0,0 @@ -# Quick introduction to Calculus with Julia - -The `Julia` programming language with a design that makes it well suited as a supplement for the learning of calculus, as this collection of notes is intended to illustrate. 
- 
-
-As `Julia` is open source, it can be downloaded and used like many other programming languages.
-
-
-
-Julia can be used through the internet for free using the [mybinder.org](https://mybinder.org) service. This link: [launch binder](https://mybinder.org/v2/gh/CalculusWithJulia/CwJScratchPad.git/master) will take you to a website that allows this. Just click on the `CalcululsWithJulia.ipynb` file after launching Binder by clicking on the badge. Binder provides the Jupyter interface.
-
-
-
-
-----
-
-
-
-Here are some `Julia` usages to create calculus objects.
-
-The `Julia` packages loaded below are all loaded when the `CalculusWithJulia` package is loaded.
-
-A `Julia` package is loaded with the `using` command:
-
-```julia;
-using LinearAlgebra
-```
-
-The `LinearAlgebra` package comes with a `Julia` installation. Other packages can be added. Something like:
-
-```julia; eval=false
-using Pkg
-Pkg.add("SomePackageName")
-```
-
-These notes have an accompanying package, `CalculusWithJulia`, that when installed, as above, also installs most of the necessary packages to perform the examples.
-
-Packages need only be installed once, but they must be loaded into *each* session for which they will be used.
-
-```julia;
-using CalculusWithJulia
-```
-
-Packages can also be loaded through `import PackageName`. Importing does not add the exported objects of a package into the namespace, so it is used when there are possible name collisions.
-
-## Types
-
-Objects in `Julia` are "typed." Common numeric types are `Float64` and `Int64`, for floating point numbers and integers. Less used here are types like `Rational{Int64}`, specifying rational numbers with a numerator and denominator as `Int64`; or `Complex{Float64}`, specifying a complex number with floating point components. Julia also has `BigFloat` and `BigInt` for arbitrary precision types. Typically, operations use "promotion" to ensure the combination of types is appropriate. 
Other useful types are `Function`, an abstract type describing functions; `Bool` for true and false values; `Sym` for symbolic values (through `SymPy`); and `Vector{Float64}` for vectors with floating point components.
-
-For the most part the type will not be so important, but it is useful to know that for some function calls the type of the argument will decide what method ultimately gets called. (This allows symbolic types to interact with Julia functions in an idiomatic manner.)
-
-## Functions
-
-### Definition
-
-Functions can be defined in four basic ways:
-
-* one-statement functions follow traditional mathematics notation:
-
-```julia; hold=true
-f(x) = exp(x) * 2x
-```
-
-
-
-* multi-statement functions are defined with the `function` keyword. The `end` statement ends the definition. The last evaluated command is returned. There is no need for an explicit `return` statement, though it can be useful for control flow.
-
-```julia;
-function g(x)
-    a = sin(x)^2
-    a + a^2 + a^3
-end
-```
-
-
-* Anonymous functions, useful, for example, as arguments to other functions or as return values, are defined using an arrow, `->`, as follows:
-
-```julia
-fn = x -> sin(2x)
-fn(pi/2)
-```
-
-In the following, the defined function, `Derivative`, returns an anonymously defined function that uses a `Julia` package, loaded with `CalculusWithJulia`, to take a derivative:
-
-```julia;
-Derivative(f::Function) = x -> ForwardDiff.derivative(f, x) # ForwardDiff is loaded in CalculusWithJulia
-```
-
-(The `D` function of `CalculusWithJulia` implements something similar.)
-
-* Anonymous functions may also be created using the `function` keyword.
-
-
-
-For mathematical functions $f: R^n \rightarrow R^m$ when $n$ or $m$ is bigger than 1 we have:
-
-* When $n =1$ and $m > 1$ we use a "vector" for the return value
-
-```julia; hold=true
-r(t) = [sin(t), cos(t), t]
-```
-
-(An alternative would be to create a vector of functions.) 
- 
-* When $n > 1$ and $m=1$ we use multiple arguments or pass the arguments in a container. This pattern is common, as it allows both calling styles.
-
-```julia; hold=true
-f(x, y, z) = x*y + y*z + z*x
-f(v) = f(v...)
-```
-
-Some functions need to be passed a container of values; for this, the last definition is useful to expand the values. Splatting takes a container and treats the values like individual arguments.
-
-Alternatively, indexing can be used directly, as in:
-
-```julia; hold=true
-f(x) = x[1]*x[2] + x[2]*x[3] + x[3]*x[1]
-```
-
-* For vector fields ($n,m > 1$) a combination is used:
-
-
-```julia; hold=true
-F(x,y,z) = [-y, x, z]
-F(v) = F(v...)
-```
-
-### Calling a function
-
-Functions are called using parentheses to group the arguments.
-
-```julia; hold=true
-f(t) = sin(t)*sqrt(t)
-sin(1), sqrt(1), f(1)
-```
-
-When a function has multiple arguments, yet the value passed in is a container holding the arguments, splatting is used to expand the arguments, as is done in the definition `F(v) = F(v...)`, above.
-
-### Multiple dispatch
-
-`Julia` can have many methods for a single generic function. (E.g., it can have many different implementations of addition when the `+` sign is encountered.) The *type*s of the arguments and the number of arguments are used for dispatch.
-
-Here the number of arguments is used:
-
-```julia;
-Area(w, h) = w * h   # area of rectangle
-Area(w) = Area(w, w) # area of square using area of rectangle definition
-```
-
-Calling `Area(5)` will call `Area(5,5)` which will return `5*5`.
-
-Similarly, the definition for a vector field:
-
-```julia; hold=true
-F(x,y,z) = [-y, x, z]
-F(v) = F(v...)
-```
-
-takes advantage of multiple dispatch to allow either a vector argument or individual arguments.
-
-Type parameters can be used to restrict the type of arguments that are permitted. The `Derivative(f::Function)` definition illustrates how the `Derivative` function, defined above, is restricted to `Function` objects. 
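Dispatch on argument *types* can be sketched with a hypothetical `describe` function (the name and methods are illustrative, not part of the notes' packages):

```julia
# Two methods for one generic function; Julia picks the method
# whose type annotation matches the argument.
describe(x::Integer)       = "an integer"
describe(x::AbstractFloat) = "a floating point number"

describe(3), describe(3.0)  # → ("an integer", "a floating point number")
```

The same call syntax, `describe(...)`, selects a different method depending on whether the argument is an integer or a floating point number.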
- - -### Keyword arguments - -Optional arguments may be specified with keywords, when the function is defined to use them. Keywords are separated from positional arguments using a semicolon, `;`: - -```julia; -circle(x; r=1) = sqrt(r^2 - x^2) -circle(0.5), circle(0.5, r=10) -``` - -The main (but not sole) use of keyword arguments will be with plotting, where various plot attribute are passed as `key=value` pairs. - - - -## Symbolic objects - -The add-on `SymPy` package allows for symbolic expressions to be used. Symbolic values are defined with `@syms`, as below. - - -```julia -using SymPy -``` - -```julia; -@syms x y z -x^2 + y^3 + z -``` - -Assumptions on the variables can be useful, particularly with simplification, as in - -```julia; hold=true -@syms x::real y::integer z::positive -``` - -Symbolic expressions flow through `Julia` functions symbolically - -```julia; -sin(x)^2 + cos(x)^2 -``` - -Numbers are symbolic once `SymPy` interacts with them: - -```julia; -x - x + 1 # 1 is now symbolic -``` - -The number `PI` is a symbolic `pi`. - -```julia; -sin(PI), sin(pi) -``` - - -Use `Sym` to create symbolic numbers, `N` to find a `Julia` number from a symbolic number: - -```julia; -1 / Sym(2) -``` - -```julia; -N(PI) -``` - -Many generic `Julia` functions will work with symbolic objects through multiple dispatch (e.g., `sin`, `cos`, ...). Sympy functions that are not in `Julia` can be accessed through the `sympy` object using dot-call notation: - -```julia; -sympy.harmonic(10) -``` - -Some Sympy methods belong to the object and a called via the pattern `object.method(...)`. This too is the case using SymPy with `Julia`. For example: - -```julia; hold=true -A = [x 1; x 2] -A.det() # determinant of symbolic matrix A -``` - - -## Containers - -We use a few different containers: - -* Tuples. These are objects grouped together using parentheses. They need not be of the same type - -```julia; -x1 = (1, "two", 3.0) -``` - -Tuples are useful for programming. 
For example, they are used to return multiple values from a function.
-
-* Vectors. These are objects of the same type (typically) grouped together using square brackets, values separated by commas:
-
-```julia;
-x2 = [1, 2, 3.0] # 3.0 makes these all floating point
-```
-
-Unlike tuples, the expected arithmetic from Linear Algebra is implemented for vectors.
-
-* Matrices. Like vectors, these combine values of the same type, only they are 2-dimensional. Use spaces to separate values along a row; semicolons to separate rows:
-
-```julia;
-x3 = [1 2 3; 4 5 6; 7 8 9]
-```
-
-* Row vectors. A vector is 1-dimensional, though it may be identified as a column of a two-dimensional matrix. A row vector is a two-dimensional matrix with a single row:
-
-```julia;
-x4 = [1 2 3.0]
-```
-
-These have *indexing* using square brackets:
-
-```julia;
-x1[1], x2[2], x3[3]
-```
-
-Matrices are usually indexed by row and column:
-
-```julia;
-x3[1,2] # row one, column two
-```
-
-For vectors and matrices - but not tuples, as they are immutable - indexing can be used to change a value in the container:
-
-```julia;
-x2[1], x3[1,1] = 2, 2
-```
-
-Vectors and matrices are arrays. As hinted above, arrays have mathematical operations, such as addition and subtraction, defined for them. Tuples do not. 
- 
-
-Destructuring is an alternative to indexing to get at the entries in certain containers:
-
-```julia; hold=true
-a,b,c = x2
-```
-
-### Structured collections
-
-An arithmetic progression, $a, a+h, a+2h, ..., b$ can be produced *efficiently* using the range operator `a:h:b`:
-
-```julia;
-5:10:55 # an object that describes 5, 15, 25, 35, 45, 55
-```
-
-If `h=1` it can be omitted:
-
-```julia;
-1:10 # an object that describes 1,2,3,4,5,6,7,8,9,10
-```
-
-The `range` function can *efficiently* describe $n$ evenly spaced points between `a` and `b`:
-
-```julia;
-range(0, pi, length=5) # range(a, stop=b, length=n) for version 1.0
-```
-
-This is useful for creating regularly spaced values needed for certain plots.
-
-## Iteration
-
-
-The `for` keyword is useful for iteration. Here is a traditional for loop, as `i` loops over each entry of the vector `[1,2,3]`:
-
-```julia;
-for i in [1,2,3]
-    println(i)
-end
-```
-
-!!! note
-    Technical aside: For assignment within a for loop at the global level, a `global` declaration may be needed to ensure proper scoping.
-
-List comprehensions are similar, but are useful as they perform the iteration and collect the values:
-
-```julia;
-[i^2 for i in [1,2,3]]
-```
-
-Comprehensions can also be used to make matrices:
-
-```julia;
-[1/(i+j) for i in 1:3, j in 1:4]
-```
-
-(The three rows are for `i=1`, then `i=2`, and finally for `i=3`.)
-
-Comprehensions apply an *expression* to each entry in a container through iteration. Applying a function to each entry of a container can be facilitated by:
-
-* Broadcasting. 
Using `.` before an operation instructs `Julia` to match up sizes (possibly extending to do so) and then apply the operation element by element:
-
-```julia;
-xs = [1,2,3]
-sin.(xs) # sin(1), sin(2), sin(3)
-```
-
-This example pairs off the values in `bases` and `xs`:
-
-```julia
-bases = [5,5,10]
-log.(bases, xs) # log(5, 1), log(5,2), log(10, 3)
-```
-
-This example broadcasts the scalar value for the base with `xs`:
-
-```julia
-log.(5, xs)
-```
-
-Row and column vectors can fill in:
-
-```julia;
-ys = [4 5] # a row vector
-h(x,y) = (x,y)
-h.(xs, ys) # broadcasting a column and row vector makes a matrix, then applies h.
-```
-
-This should be contrasted to the case when both `xs` and `ys` are (column) vectors, as then they pair off (and here cause a dimension mismatch, as they have different lengths):
-
-```julia;
-h.(xs, [4,5])
-```
-
-* The `map` function is similar; it applies a function to each element:
-
-```julia;
-map(sin, [1,2,3])
-```
-
-!!! note
-    Many different computer languages implement `map`; broadcasting is less common. `Julia`'s use of the dot syntax to indicate broadcasting is reminiscent of MATLAB, but is quite different.
-
-## Plots
-
-
-The following commands use the `Plots` package. The `Plots` package expects a choice of backend. We will use `gr` unless otherwise noted, but others can be substituted by calling an appropriate command, such as `pyplot()` or `plotly()`.
-
-```julia;
-using Plots
-```
-
-!!! 
note
-    The `plotly` backend and `gr` backends are available by default. The `plotly` backend has some interactivity, `gr` is for static plots. The `pyplot` package is used for certain surface plots, when `gr` cannot be used.
-
-
-
-### Plotting a univariate function $f:R \rightarrow R$
-
-* using `plot(f, a, b)`
-
-```julia;
-plot(sin, 0, 2pi)
-```
-
-Or
-
-```julia; hold=true
-f(x) = exp(-x/2pi)*sin(x)
-plot(f, 0, 2pi)
-```
-
-Or with an anonymous function
-
-```julia;
-plot(x -> sin(x) + sin(2x), 0, 2pi)
-```
-
-!!! note
-    The time to first plot can be lengthy! This can be removed by creating a custom `Julia` image, but that is not introductory level stuff. As well, standalone plotting packages offer quicker first plots, but the simplicity of `Plots` is preferred. Subsequent plots are not so time consuming, as the initial time is spent compiling functions, so their re-use is speedy.
-
-
-
-Arguments of interest include:
-
-| Attribute      | Value                                                    |
-|:--------------:|:--------------------------------------------------------:|
-| `legend`       | A boolean; specify `false` to inhibit drawing a legend   |
-| `aspect_ratio` | Use `:equal` to have the x and y axes use the same scale |
-| `linewidth`    | Integers greater than 1 will thicken lines drawn         |
-| `color`        | A color may be specified by a symbol (leading `:`).      |
-|                | E.g., `:black`, `:red`, `:blue`                          |
-
-
-
-* using `plot(xs, ys)`
-
-The lower-level interface to `plot` involves directly creating x and y values to plot:
-
-```julia; hold=true
-xs = range(0, 2pi, length=100)
-ys = sin.(xs)
-plot(xs, ys, color=:red)
-```
-
-
-* plotting a symbolic expression
-
-A symbolic expression of a single variable can be plotted just as a function is:
-
-```julia; hold=true
-@syms x
-plot(exp(-x/2pi)*sin(x), 0, 2pi)
-```
-
-* Multiple functions
-
-The `!` Julia convention to modify an object is used by the `plot` command, so `plot!` will add to the existing plot:
-
-```julia; hold=true
-plot(sin, 0, 2pi, color=:red)
-plot!(cos, 0, 2pi, color=:blue)
-plot!(zero, color=:green) # no a, b; they are inherited from the graph.
-```
-
-The `zero` function is just 0 (more generally useful when the type of a number is important, but used here to emphasize the $x$ axis).
-
-### Plotting a parameterized (space) curve function $f:R \rightarrow R^n$, $n = 2$ or $3$
-
-* Using `plot(xs, ys)`
-
-Let $f(t) = e^{t/2\pi} \langle \cos(t), \sin(t)\rangle$ be a parameterized function. 
Then the $t$ values can be generated as follows: - -```julia; hold=true -ts = range(0, 2pi, length = 100) -xs = [exp(t/2pi) * cos(t) for t in ts] -ys = [exp(t/2pi) * sin(t) for t in ts] -plot(xs, ys) -``` - -* using `plot(f1, f2, a, b)`. If the two functions describing the components are available, then - -```julia; hold=true -f1(t) = exp(t/2pi) * cos(t) -f2(t) = exp(t/2pi) * sin(t) -plot(f1, f2, 0, 2pi) -``` - -* Using `plot_parametric`. If the curve is described as a function of `t` with a vector output, then the `CalculusWithJulia` package provides `plot_parametric` to produce a plot: - -```julia; -r(t) = exp(t/2pi) * [cos(t), sin(t)] -plot_parametric(0..2pi, r) -``` - -The low-level approach doesn't quite work as easily as desired: - -```julia; hold=true -ts = range(0, 2pi, length = 4) -vs = r.(ts) -``` - -As seen, the values are a vector of vectors. To plot a reshaping needs to be done: - -```julia; hold=true -ts = range(0, 2pi, length = 100) -vs = r.(ts) -xs = [vs[i][1] for i in eachindex(vs)] -ys = [vs[i][2] for i in eachindex(vs)] -plot(xs, ys) -``` - -This approach is faciliated by the `unzip` function in `CalculusWithJulia` (and used internally by `plot_parametric`): - -```julia; -ts = range(0, 2pi, length = 100) -plot(unzip(r.(ts))...) -``` - - - -* Plotting an arrow - -An arrow in 2D can be plotted with the `quiver` command. 
We show the `arrow(p, v)` (or `arrow!(p,v)` function) from the `CalculusWithJulia` package, which has an easier syntax (`arrow!(p, v)`, where `p` is a point indicating the placement of the tail, and `v` the vector to represent): - -```julia;hold=true -plot_parametric(0..2pi, r) -t0 = pi/8 -arrow!(r(t0), r'(t0)) -``` - -### Plotting a scalar function $f:R^2 \rightarrow R$ - -The `surface` and `contour` functions are available to visualize a scalar function of $2$ variables: - -* A surface plot - - - -```julia; hold=true -f(x, y) = 2 - x^2 + y^2 -xs = ys = range(-2,2, length=25) -surface(xs, ys, f) -``` - -The function generates the $z$ values, this can be done by the user and then passed to the `surface(xs, ys, zs)` format: - -```julia; hold=true -f(x, y) = 2 - x^2 + y^2 -xs = ys = range(-2,2, length=25) -surface(xs, ys, f.(xs, ys')) -``` - - - - - -* A contour plot - -The `contour` function is like the `surface` function. - -```julia; hold=true -xs = ys = range(-2,2, length=25) -f(x, y) = 2 - x^2 + y^2 -contour(xs, ys, f) -``` - -The values can be computed easily enough, being careful where the transpose is needed: - -```julia; hold=true -xs = ys = range(-2,2, length=25) -f(x, y) = 2 - x^2 + y^2 -contour(xs, ys, f.(xs, ys')) -``` - - -* An implicit equation. The constraint $f(x,y)=c$ generates an - implicit equation. While `contour` can be used for this type of - plot - by adjusting the requested contours - the `ImplicitPlots` - package does this to make a plot of the equations ``f(x,y) = 0``" - - -```julia; hold=true -using ImplicitPlots -f(x,y) = sin(x*y) - cos(x*y) -implicit_plot(f) -``` - - -### Plotting a parameterized surface $f:R^2 \rightarrow R^3$ - - - -The `pyplot` (and `plotly`) backends allow plotting of parameterized surfaces. 
- -The low-level `surface(xs,ys,zs)` is used, and can be specified directly as follows: - -```julia; hold=true -X(theta, phi) = sin(phi)*cos(theta) -Y(theta, phi) = sin(phi)*sin(theta) -Z(theta, phi) = cos(phi) -thetas = range(0, pi/4, length=20) -phis = range(0, pi, length=20) -surface(X.(thetas, phis'), Y.(thetas, phis'), Z.(thetas, phis')) -``` - - - - - -### Plotting a vector field $F:R^2 \rightarrow R^2$. - -The `CalculusWithJulia` package provides `vectorfieldplot`, used as: - -```julia; hold=true -F(x,y) = [-y, x] -vectorfieldplot(F, xlim=(-2, 2), ylim=(-2,2), nx=10, ny=10) -``` - -There is also `vectorfieldplot3d`. - - -## Limits - -Limits can be investigated numerically by forming tables, eg.: - -```julia; hold=true -xs = [1, 1/10, 1/100, 1/1000] -f(x) = sin(x)/x -[xs f.(xs)] -``` - -Symbolically, `SymPy` provides a `limit` function: - -```julia; hold=true -@syms x -limit(sin(x)/x, x => 0) -``` - -Or - -```julia; hold=true -@syms h x -limit((sin(x+h) - sin(x))/h, h => 0) -``` - -## Derivatives - -There are numeric and symbolic approaches to derivatives. For the numeric approach we use the `ForwardDiff` package, which performs automatic differentiation. - - -### Derivatives of univariate functions - -Numerically, the `ForwardDiff.derivative(f, x)` function call will find the derivative of the function `f` at the point `x`: - -```julia; -ForwardDiff.derivative(sin, pi/3) - cos(pi/3) -``` - -The `CalculusWithJulia` package overides the `'` (`adjoint`) syntax for functions to provide a derivative which takes a function and returns a function, so its usage is familiar - -```julia; hold=true -f(x) = sin(x) -f'(pi/3) - cos(pi/3) # or just sin'(pi/3) - cos(pi/3) -``` - -Higher order derivatives are possible as well, - - -```julia; hold=true -f(x) = sin(x) -f''''(pi/3) - f(pi/3) -``` - - ----- - -Symbolically, the `diff` function of `SymPy` finds derivatives. 
- 
-```julia; hold=true
-@syms x
-f(x) = exp(-x)*sin(x)
-ex = f(x)    # symbolic expression
-diff(ex, x)  # or just diff(f(x), x)
-```
-
-Higher-order derivatives can be specified as well:
-
-```julia; hold=true
-@syms x
-ex = exp(-x)*sin(x)
-
-diff(ex, x, x)
-```
-
-Or with a number:
-
-```julia; hold=true
-@syms x
-ex = exp(-x)*sin(x)
-
-diff(ex, x, 5)
-```
-
-The variable is important, as this allows parameters to be symbolic:
-
-```julia; hold=true
-@syms mu sigma x
-diff(exp(-((x-mu)/sigma)^2/2), x)
-```
-
-### Partial derivatives
-
-There is no direct partial derivative function provided by `ForwardDiff`; rather, we use the result of the `ForwardDiff.gradient` function, which finds the partial derivatives for each variable. To use this, the function must be defined in terms of a point or vector.
-
-```julia; hold=true
-f(x,y,z) = x*y + y*z + z*x
-f(v) = f(v...) # this is needed for ForwardDiff.gradient
-ForwardDiff.gradient(f, [1,2,3])
-```
-
-We can see directly that $\partial{f}/\partial{x} = y + z$. At the point $(1,2,3)$, this is $5$, as returned above.
-
-----
-
-Symbolically, `diff` is used for partial derivatives:
-
-```julia; hold=true
-@syms x y z
-ex = x*y + y*z + z*x
-diff(ex, x) # ∂f/∂x
-```
-
-> Gradient
-
-As seen, the `ForwardDiff.gradient` function finds the gradient at a point. In `CalculusWithJulia`, the gradient is extended to return a function when called with no additional arguments:
-
-```julia; hold=true
-f(x,y,z) = x*y + y*z + z*x
-f(v) = f(v...)
-gradient(f)(1,2,3) - gradient(f, [1,2,3])
-```
-
-The `∇` symbol, formed by entering `\nabla[tab]`, is mathematical syntax for the gradient, and is defined in `CalculusWithJulia`.
-
-```julia; hold=true
-f(x,y,z) = x*y + y*z + z*x 
-∇(f)(1,2,3) # same as gradient(f, [1,2,3]) -``` - ----- - -In `SymPy`, there is no gradient function, though finding the gradient is easy through broadcasting: - -```julia; hold=true -@syms x y z -ex = x*y + y*z + z*x -diff.(ex, [x,y,z]) # [diff(ex, x), diff(ex, y), diff(ex, z)] -``` - -The `CalculusWithJulia` package provides a method for `gradient`: - -```julia; hold=true -@syms x y z -ex = x*y + y*z + z*x - -gradient(ex, [x,y,z]) -``` - -The `∇` symbol is an alias. It can guess the order of the free symbols, but generally specifying them is needed. This is done with a tuple: - -```julia; hold=true -@syms x y z -ex = x*y + y*z + z*x - -∇((ex, [x,y,z])) # for this, ∇(ex) also works -``` - - -### Jacobian - -The Jacobian of a function $f:R^n \rightarrow R^m$ is a $m\times n$ matrix of partial derivatives. Numerically, `ForwardDiff.jacobian` can find the Jacobian of a function at a point: - -```julia; hold=true -F(u,v) = [u*cos(v), u*sin(v), u] -F(v) = F(v...) # needed for ForwardDiff.jacobian -pt = [1, pi/4] -ForwardDiff.jacobian(F , pt) -``` - ----- - -Symbolically, the `jacobian` function is a method of a *matrix*, so the calling pattern is different. (Of the form `object.method(arguments...)`.) - -```julia; hold=true -@syms u v -F(u,v) = [u*cos(v), u*sin(v), u] -F(v) = F(v...) - -ex = F(u,v) -ex.jacobian([u,v]) -``` - -As the Jacobian can be identified as the matrix with rows given by the transpose of the gradient of the component, it can be computed directly, but it is more difficult: - -```julia; hold=true -@syms u::real v::real -F(u,v) = [u*cos(v), u*sin(v), u] -F(v) = F(v...) - -vcat([diff.(ex, [u,v])' for ex in F(u,v)]...) -``` - -### Divergence - -Numerically, the divergence can be computed from the Jacobian by adding the diagonal elements. This is a numerically inefficient, as the other partial derivates must be found and discarded, but this is generally not an issue for these notes. 
The following uses `tr` (the trace from the `LinearAlgebra` package) to sum the diagonal.
-
-```julia; hold=true
-F(x,y,z) = [-y, x, z]
-F(v) = F(v...)
-pt = [1,2,3]
-tr(ForwardDiff.jacobian(F, pt))
-```
-
-The `CalculusWithJulia` package provides `divergence` to compute the divergence and provides the `∇ ⋅` notation (`\nabla[tab]\cdot[tab]`):
-
-```julia; hold=true
-F(x,y,z) = [-y, x, z]
-F(v) = F(v...)
-
-divergence(F, [1,2,3])
-(∇⋅F)(1,2,3) # not ∇⋅F(1,2,3) as that evaluates F(1,2,3) before the divergence
-```
-
-----
-
-
-Symbolically, the divergence can be found directly:
-
-```julia; hold=true
-@syms x y z
-ex = [-y, x, z]
-
-sum(diff.(ex, [x,y,z])) # sum of [diff(ex[1], x), diff(ex[2],y), diff(ex[3], z)]
-```
-
-The `divergence` function can be used for symbolic expressions:
-
-```julia; hold=true
-@syms x y z
-ex = [-y, x, z]
-
-divergence(ex, [x,y,z])
-∇⋅(ex, [x,y,z]) # For this, ∇ ⋅ F(x,y,z) also works
-```
-
-### Curl
-
-The curl can be computed from the off-diagonal elements of the Jacobian. The calculation follows the formula. The `CalculusWithJulia` package provides `curl` to compute this:
-
-```julia; hold=true
-F(x,y,z) = [-y, x, 1]
-F(v) = F(v...)
-
-curl(F, [1,2,3])
-```
-
-As well, if no point is specified, a function is returned for which a point may be specified using ``3`` coordinates or a vector.
-
-```julia; hold=true
-F(x,y,z) = [-y, x, 1]
-F(v) = F(v...)
-
-curl(F)(1,2,3), curl(F)([1,2,3])
-```
-
-Finally, the `∇ ×` notation (`\nabla[tab]\times[tab]`) is available:
-
-```julia; hold=true
-F(x,y,z) = [-y, x, 1]
-F(v) = F(v...)
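-# by hand: ∇×F = (∂F₃/∂y - ∂F₂/∂z, ∂F₁/∂z - ∂F₃/∂x, ∂F₂/∂x - ∂F₁/∂y),
-# which for F = (-y, x, 1) is the constant vector [0, 0, 2]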
-
-(∇×F)(1,2,3)
-```
-
-For symbolic expressions, the `∇ ×` notation is available **if** the symbolic vector contains all ``3`` variables:
-
-```julia; hold=true
-@syms x y z
-F = [-y, x, z] # but not [-y, x, 1] which errs; use `curl` with variables specified
-
-curl([-y, x, 1], (x,y,z)), ∇×F
-```
-
-## Integrals
-
-Numeric integration is provided by the `QuadGK` package, for univariate integrals, and the `HCubature` package, for higher-dimensional integrals.
-
-```julia;
-using QuadGK, HCubature
-```
-
-### Integrals of univariate functions
-
-A definite integral may be computed numerically using `quadgk`:
-
-```julia;
-quadgk(sin, 0, pi)
-```
-
-The answer and an estimate of the worst-case error are returned.
-
-If interior singularities are avoided, improper integrals are computed as well:
-
-```julia;
-quadgk(x->1/x^(1/2), 0, 1)
-```
-
-
------
-
-SymPy provides the `integrate` function to compute both definite and indefinite integrals.
-
-
-```julia; hold=true
-@syms a::real x::real
-integrate(exp(a*x)*sin(x), x)
-```
-
-As with `diff`, the variable of integration is specified.
-
-Definite integrals use a tuple, `(variable, a, b)`, to specify the variable and the range to integrate over:
-
-```julia; hold=true
-@syms a::real x::real
-integrate(sin(a + x), (x, 0, PI)) # ∫_0^PI sin(a+x) dx
-```
-
-
-
-
-### 2D and 3D iterated integrals
-
-Two- and three-dimensional integrals over box-like regions are computed numerically with the `hcubature` function from the `HCubature` package. If the box is $[x_1, y_1]\times[x_2,y_2]\times\cdots\times[x_n,y_n]$, then the limits are specified through tuples of the form $(x_1,x_2,\dots,x_n)$ and $(y_1,y_2,\dots,y_n)$.
-
-```julia; hold=true
-f(x,y) = x*y^2
-f(v) = f(v...)
-
-hcubature(f, (0,0), (1, 2)) # computes ∫₀¹∫₀² f(x,y) dy dx
-```
-
-The calling pattern for more dimensions is identical.
-
-```julia; hold=true
-f(x,y,z) = x*y^2*z^3
-f(v) = f(v...)
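-# the integrand separates, so the exact value for comparison is
-# (∫₀¹ x dx)(∫₀² y² dy)(∫₀³ z³ dz) = (1/2)(8/3)(81/4) = 27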
- -hcubature(f, (0,0,0), (1, 2,3)) # computes ∫₀¹∫₀²∫₀³ f(x,y,z) dz dy dx -``` - -The box-like region requirement means a change of variables may be necessary. For example, to integrate over the region $x^2 + y^2 \leq 1; x \geq 0$, polar coordinates can be used with $(r,\theta)$ in $[0,1]\times[-\pi/2,\pi/2]$. When changing variables, the Jacobian enters into the formula, through - -$$~ -\iint_{G(S)} f(\vec{x}) dV = \iint_S (f \circ G)(\vec{u}) |\det(J_G)(\vec{u})| dU. -~$$ - -Here we implement this: - -```julia; hold=true -f(x,y) = x*y^2 -f(v) = f(v...) -Phi(r, theta) = r * [cos(theta), sin(theta)] -Phi(rtheta) = Phi(rtheta...) -integrand(rtheta) = f(Phi(rtheta)) * det(ForwardDiff.jacobian(Phi, rtheta)) -hcubature(integrand, (0.0,-pi/2), (1.0, pi/2)) -``` - - ----- - -Symbolically, the `integrate` function allows additional terms to be specified. For example, the above could be done through: - -```julia; hold=true -@syms x::real y::real -integrate(x * y^2, (y, -sqrt(1-x^2), sqrt(1-x^2)), (x, 0, 1)) -``` - - -### Line integrals - -A line integral of $f$ parameterized by $\vec{r}(t)$ is computed by: - -$$~ -\int_a^b (f\circ\vec{r})(t) \| \frac{dr}{dt}\| dt. -~$$ - -For example, if $f(x,y) = 2 - x^2 - y^2$ and $r(t) = 1/t \langle \cos(t), \sin(t) \rangle$, then the line integral over $[1,2]$ is given by: - -```julia; hold=true -f(x,y) = 2 - x^2 - y^2 -f(v) = f(v...) -r(t) = [cos(t), sin(t)]/t - -integrand(t) = (f∘r)(t) * norm(r'(t)) -quadgk(integrand, 1, 2) -``` - -To integrate a line integral through a vector field, say $\int_C F \cdot\hat{T} ds=\int_C F\cdot \vec{r}'(t) dt$ we have, for example, - -```julia; hold=true -F(x,y) = [-y, x] -F(v) = F(v...) -r(t) = [cos(t), sin(t)]/t -integrand(t) = (F∘r)(t) ⋅ r'(t) -quadgk(integrand, 1, 2) -``` - ----- - -Symbolically, there is no real difference from a 1-dimensional integral. Let $\phi = 1/\|r\|$ and integrate the gradient field over one turn of the helix $\vec{r}(t) = \langle \cos(t), \sin(t), t\rangle$. 
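-
-Because a gradient field is being integrated, the fundamental theorem of line integrals says the value is $\phi(\vec{r}(2\pi)) - \phi(\vec{r}(0))$. A quick numeric check of this claim (a sketch, with `phi` and `r` as just described):
-
-```julia; hold=true
-phi(x,y,z) = 1/sqrt(x^2 + y^2 + z^2)
-r(t) = [cos(t), sin(t), t]
-phi(r(2pi)...) - phi(r(0)...) # the endpoint values determine the line integral
-```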
- -```julia; hold=true -@syms x::real y::real z::real t::real -phi(x,y,z) = 1/sqrt(x^2 + y^2 + z^2) -r(t) = [cos(t), sin(t), t] -∇phi = diff.(phi(x,y,z), [x,y,z]) -∇phi_r = subs.(∇phi, x.=> r(t)[1], y.=>r(t)[2], z.=>r(t)[3]) -rp = diff.(r(t), t) -global helix = simplify(∇phi_r ⋅ rp ) -``` - -Then - -```julia; -@syms t::real -integrate(helix, (t, 0, 2PI)) -``` - -### Surface integrals - - -The surface integral for a parameterized surface involves a surface element $\|\partial\Phi/\partial{u} \times \partial\Phi/\partial{v}\|$. This can be computed numerically with: - -```julia; -Phi(u,v) = [u*cos(v), u*sin(v), u] -Phi(v) = Phi(v...) - -function SE(Phi, pt) - J = ForwardDiff.jacobian(Phi, pt) - J[:,1] × J[:,2] -end - -norm(SE(Phi, [1,2])) -``` - -To find the surface integral ($f=1$) for this surface over $[0,1] \times [0,2\pi]$, we have: - -```julia; -hcubature(pt -> norm(SE(Phi, pt)), (0.0,0.0), (1.0, 2pi)) -``` - -Symbolically, the approach is similar: - -```julia; -@syms u::real v::real -exₚ = Phi(u,v) -Jₚ = exₚ.jacobian([u,v]) -SurfEl = norm(Jₚ[:,1] × Jₚ[:,2]) |> simplify -``` - -Then - -```julia; -integrate(SurfEl, (u, 0, 1), (v, 0, 2PI)) -``` - -Integrating a vector field over the surface, would be similar: - -```julia; hold=true -F(x,y,z) = [x, y, z] -ex = F(Phi(u,v)...) ⋅ (Jₚ[:,1] × Jₚ[:,2]) -integrate(ex, (u,0,1), (v, 0, 2PI)) -``` diff --git a/CwJ/misc/toc.jmd b/CwJ/misc/toc.jmd deleted file mode 100644 index 7ab6aed..0000000 --- a/CwJ/misc/toc.jmd +++ /dev/null @@ -1,327 +0,0 @@ - -```julia; echo=false -import CalculusWithJulia -logo_url = "https://raw.githubusercontent.com/jverzani/CalculusWithJulia.jl/master/CwJ/misc/logo.png" -txt = """ -
-<img src="$logo_url" alt="Card image cap">
-
-"""
-CalculusWithJulia.WeaveSupport.HTMLoutput(txt)
-```
-
-# Calculus with Julia
-
-
-`CalculusWithJulia.jl` is a package for a set of notes for learning [calculus](http://en.wikipedia.org/wiki/Calculus) using the `Julia` language. The package contains some support functions and the files that generate the notes being read now.
-
-Since the mid-1990s there has been a push to teach calculus using many different points of view. The [Harvard](http://www.math.harvard.edu/~knill/pedagogy/harvardcalculus/) style rule of four says that as much as possible the conversation should include a graphical, numerical, algebraic, and verbal component. These notes use the programming language [Julia](http://julialang.org) to illustrate the graphical, numerical, and, at times, the algebraic aspects of calculus.
-
-There are many examples of integrating a computer algebra system (such as `Mathematica`, `Maple`, or `Sage`) into the calculus conversation. Computer algebra systems can be magical. The popular [WolframAlpha](http://www.wolframalpha.com/) website calls on the full power of `Mathematica` while allowing an informal syntax that is flexible enough to be used as a backend for Apple's Siri feature. ("Siri, what is the graph of x squared minus 4?")
-For learning purposes, computer algebra systems model very well the algebraic/symbolic treatment of the material while providing means to illustrate the numeric aspects.
-These notes are a bit different in that `Julia` is primarily used for the numeric style of computing, with the algebraic/symbolic treatment added on. Doing the symbolic treatment by hand can be very beneficial while learning, and computer algebra systems make those exercises seem kind of pointless, as the finished product can be produced much more easily.
-
-Our real goal is to get at the concepts using technology as much as possible without getting bogged down in the mechanics of the computer language. 
We feel `Julia` has a very natural syntax that makes the initial start-up not much more difficult than using a calculator. The notes restrict themselves to a reduced set of computational concepts. This set is sufficient for working many of the problems in calculus, but does not thoroughly cover many aspects of programming. (Those who are interested can go off on their own, and `Julia` provides a rich opportunity to do so.) Within this restricted set are operators that reduce many of the computations of calculus to a function call of the form `action(function, arguments...)`. With a small collection of actions that can be composed, many of the problems associated with introductory calculus can be attacked.
-
-
-These notes are presented in pages covering a fairly focused concept, in a spirit similar to a section of a book. Just like a book, there are try-it-yourself questions at the end of each page. All have a limited number of self-graded answers. These notes borrow ideas from many sources, including [Strang](https://ocw.mit.edu/resources/res-18-001-calculus-online-textbook-spring-2005/), [Knill](http://www.math.harvard.edu/~knill/teaching), [Schey](https://www.amazon.com/Div-Grad-Curl-All-That/dp/0393925161/), Thomas Calculus, Rogawski and Adams, and several Wikipedia pages.
-
-
-## Getting started with Julia
-
-Before beginning, we need to get started with Julia. This is akin to going out and buying a calculator, though it won't take as long.
-
-- [Getting started](misc/getting_started_with_julia.html)
-
-----
-
-[launch binder](https://mybinder.org/v2/gh/CalculusWithJulia/CwJScratchPad.git/master)
-
-Julia can be used through the internet for free using the [mybinder.org](https://mybinder.org) service.
-Click on the `CalculusWithJulia.ipynb` file after launching Binder by clicking on the badge.
-
-
-
-## Precalculus
-
-Many of the computational skills needed to employ `Julia` successfully in learning calculus are in direct analogy to concepts of mathematics first introduced in precalculus or prior. This precalculus *review* covers some of the basic materials mathematically (though not systematically). More importantly, it illustrates the key computational mechanics we will use throughout.
-
-A quick rundown of the `Julia` concepts presented in this section is in a [Julia overview](precalc/julia_overview.html).
-
-### Number systems
-
-Taking for granted a familiarity with basic calculators, we show in these two sections how `Julia` implements the functionality of a calculator in a manner not so different.
-
-- [Calculator](precalc/calculator.html)
-
-- [Variables](precalc/variables.html)
-
-
-Calculators really only use one type of number - floating point numbers. Floating point numbers are a model for the real numbers. However, there are many different sets of numbers in mathematics. Common ones include the integers, rational numbers, real numbers, and complex numbers. As well, we discuss logical values and vectors of numbers. Though integers are rational numbers, rational numbers are real numbers, and real numbers may be viewed as complex numbers, mathematically these distinctions serve a purpose. `Julia` also makes these distinctions and more.
-
-
-
-- [Number systems](precalc/numbers_types.html)
-
-- [Inequalities and Boolean values](precalc/logical_expressions.html)
-
-
-Vectors as a mathematical object could be postponed for later, but they are introduced here, as the `Julia` implementation makes an excellent choice for a container of one or more values. We also see how to work with more than one value at a time, a useful facility in future work.
-
-- [Vectors](precalc/vectors.html)
-
-An arithmetic progression is a sequence of the form $a, a+h, a+2h, \dots, a+kh$. For example $3, 10, 17, 24, \dots, 52$. 
They prove very useful in describing collections of numbers. We introduce the range operator that models these within `Julia` and broadcasting, mapping, and comprehensions -- various styles that allow one to easily modify the simple sequences. - - -- [Arithmetic progressions](precalc/ranges.html) - -### Functions - -The use of functions within calculus is widespread. This section shows how the basic usage within `Julia` follows very closely to common mathematical usage. It also shows that the abstract concept of a function is quite valuable. - -- [Functions](precalc/functions.html) - -A graphing calculator makes it very easy to produce a graph. `Julia`, using the `Plots` package, makes it even easier and more flexible. - -- [Graphs of functions](precalc/plotting.html) - -- [Transformations of functions](precalc/transformations.html) - -- [Inverse functions](precalc/inversefunctions.html) - -#### Polynomials - -Polynomials play an important role in calculus. They give a family of functions for which the basic operations are well understood. In addition, they can be seen to provide approximations to functions. This section discusses polynomials and introduces the add-on package `SymPy` for manipulating expressions in `Julia` symbolically. (This package uses the SymPy library from Python.) - - -- [Polynomials](precalc/polynomial.html) - - -The roots of a univariate polynomial are the values of $x$ for which $p(x)=0$. Roots are related to its factors. In calculus, the zeros of a derived function are used to infer properties of a function. This section shows some tools in `SymPy` to find factors and roots, when they are available, and introduces the `Roots` package for estimating roots numerically. - - -- [Polynomial roots](precalc/polynomial_roots.html) - -The `Polynomials` package provides methods for working with polynomials of different types. 
-
-- [The `Polynomials` package](precalc/polynomials_package.html)
-
-A rational expression is the ratio of two polynomial expressions. This section covers some additional details that arise when graphing such expressions.
-
-- [Rational functions](precalc/rational_functions.html)
-
-#### Exponential and logarithmic functions
-
-- [Exponential and logarithmic functions](precalc/exp_log_functions.html)
-
-
-#### Trigonometric functions
-
-Trigonometric functions are used to describe triangles, circles, and oscillatory behaviors. This section provides a brief review.
-
-- [Trigonometric functions](precalc/trig_functions.html)
-
-## Limits and Continuity
-
-The notion of a limit is at the heart of the two main operations of calculus: differentiation and integration.
-
-- [Limits](limits/limits.html)
-
-- [Examples and extensions of the basic limit definition](limits/limits_extensions.html)
-
-
-Continuous functions are at the center of any discussion of calculus concepts. These sections define them and illustrate a few implications for continuous functions.
-
-- [Continuity](limits/continuity.html)
-
-- [Implications of continuity](limits/intermediate_value_theorem.html) includes the intermediate value theorem, the extreme value theorem, and the bisection method.
-
-## Derivatives
-
-The derivative of a function is a derived function that for each $x$ yields the slope of the *tangent line* of the graph of $f$ at $(x,f(x))$.
-
-- [Derivatives](derivatives/derivatives.html)
-
-- [Numeric derivatives](derivatives/numeric_derivatives.html)
-
-- [Symbolic derivatives](derivatives/symbolic_derivatives.html)
-
-
-The derivative of a function has certain features. These next sections explore one of the first uses of the derivative - using its zeros to characterize the original function. 
-
-- [The Mean Value Theorem](derivatives/mean_value_theorem.html)
-
-- [Optimization](derivatives/optimization.html)
-
-- [First and second derivatives](derivatives/first_second_derivatives.html)
-
-- [Curve sketching](derivatives/curve_sketching.html)
-
-
-The tangent line to the graph of a function at a point has slope given through the derivative. That the tangent line is the best linear approximation to the curve yields some insight into the curve through knowledge of just the tangent lines.
-
-- [Linearization](derivatives/linearization.html)
-
-- [Newton's method](derivatives/newtons_method.html)
-
-- [Derivative-free zero-finding methods](derivatives/more_zeros.html)
-
-- [L'Hospital's rule](derivatives/lhospitals_rule.html)
-
-The derivative finds use outside of the traditional way of specifying a function or relationship. These two sections look at some different cases.
-
-- [Implicit differentiation](derivatives/implicit_differentiation.html)
-
-- [Related rates](derivatives/related_rates.html)
-
-A generalization of the tangent line as the "best" approximation to a function by a line leads to the concept of the Taylor polynomial.
-
-- [Taylor polynomials](derivatives/taylor_series_polynomials.html)
-
-## Integration
-
-The integral is initially defined in terms of an associated area and then generalized. The Fundamental Theorem of Calculus allows this area to be computed easily through a related function and specifies the relationship between the integral and the derivative.
-
-- [Area](integrals/area.html)
-
-- [The Fundamental Theorem of Calculus](integrals/ftc.html)
-
-Integration is not algorithmic; rather, problems can involve an array of techniques. Many of these are implemented in `SymPy`. These sections introduce the main techniques that find widespread usage. 
- -- [Substitution](integrals/substitution.html) - -- [Integration by parts](integrals/integration_by_parts.html) - -- [Partial fractions](integrals/partial_fractions.html) - -- [Improper integrals](integrals/improper_integrals.html) - -### Applications - - - -Various applications of the integral are presented. The first two sections continue with the idea that an integral is related to area. From there, it is seen that volumes, arc-lengths, and surface areas may be expressed in terms of related integrals. - -- [Mean Value Theorem for integrals](integrals/mean_value_theorem.html) - -- [Area between curves](integrals/area_between_curves.html) - -- [Center of mass](integrals/center_of_mass.html) - -- [Volumes by slicing](integrals/volumes_slice.html) - -- [Arc length](integrals/arc_length.html) - -- [Surface Area](integrals/surface_area.html) - -### Ordinary differential equations - -Ordinary differential equations are an application of integration and the fundamental theorem of calculus. - -- [ODEs](ODEs/odes.html) - -- [Euler method](ODEs/euler.html) - -- [The problem-algorithm-solve interface](ODEs/solve.html) - -- [The DifferentialEquations suite of packages](ODEs/differential_equations.html) - -## Multivariable calculus - -Univariate functions take a single number as an input and return a number as the output. Notationally, we write $f: R \rightarrow R$. More generally, a function might have several input variables and might return several output variables, notationally $F: R^n \rightarrow R^m$, for positive, integer values of $n$ and $m$. Special cases are when $n=1$ (a space curve) or when $m=1$ (a scalar-valued function). Many of the concepts of calculus for univariate functions carry over, with suitable modifications. - - -Polar coordinates are an often useful alternative to describing location in the $x$-$y$ plane. 
-
-- [Polar Coordinates](differentiable_vector_calculus/polar_coordinates.html)
-
-The calculus of functions involving more than $1$ variable is greatly simplified by the introduction of vectors and matrices. These objects, and their associated properties, allow many of the concepts of calculus of a single variable to be carried over.
-
-- [Vectors](differentiable_vector_calculus/vectors.html)
-
-### Differentiable vector calculus
-
-In general, we will consider multivariable functions from $R^n$ into $R^m$ (functions of $n$ variables that return $m$ different values), but it is helpful to specialize to two cases first. These are vector-valued functions ($f: R \rightarrow R^n$) and scalar functions ($f:R^n \rightarrow R$).
-
-- [Vector-valued functions](differentiable_vector_calculus/vector_valued_functions.html)
-
-- [Scalar functions and their derivatives](differentiable_vector_calculus/scalar_functions.html)
-
-
-We discuss applications of the derivative for scalar functions. These include linearization, optimization, and constrained optimization.
-
-- [Applications for scalar functions](differentiable_vector_calculus/scalar_functions_applications.html)
-
-The derivative of a multivariable function is discussed here. We will see that with the proper notation, many formulas from single-variable calculus will hold with slight modifications.
-
-- [Vector fields](differentiable_vector_calculus/vector_fields.html)
-
-
-
-### Integral vector calculus
-
-Integral vector calculus begins by generalizing the integral that computes area to one that computes volume (and its generalization to higher dimensions). The integration concept is then extended to integration over curves and surfaces. With this, generalizations of the fundamental theorem of calculus are discussed.
-
-We begin with the generalization of the Riemann integral for computing area to the computation of volume and its higher-dimensional interpretations. 
-
-- [Double and triple integrals](integral_vector_calculus/double_triple_integrals.html)
-
-Line and surface integrals are computed by 1- and 2-dimensional integrals, but offer new interpretations, especially when vector fields are considered.
-
-- [Line and surface integrals](integral_vector_calculus/line_integrals.html)
-
-There are three main operations in differential vector calculus: the gradient, the divergence, and the curl. This is an introduction to the latter two.
-
-- [Divergence and curl](integral_vector_calculus/div_grad_curl.html)
-
-
-The fundamental theorem of calculus states that a definite integral over an interval can be computed using a related function and the boundary points of the interval. The fundamental theorem of line integrals is a higher-dimensional analog. In this section, related theorems are considered: Green's theorem in $2$ dimensions and Stokes' theorem and the divergence theorem in $3$ dimensions.
-
-- [Green's theorem, Stokes' theorem, and the divergence theorem](integral_vector_calculus/stokes_theorem.html)
-
-
-
-----
-
-Here is a quick review of the math topics discussed in vector calculus.
-
-- [Review of vector calculus](integral_vector_calculus/review.html)
-
-For reference purposes, there are examples of creating graphics for `Plots`, `Makie`, and `PlotlyLight`.
-
-- [Two- and three-dimensional graphics with Plots](differentiable_vector_calculus/plots_plotting.html)
-
-- [Two- and three-dimensional graphics with Makie](alternatives/makie_plotting.html)
-
-- [Two- and three-dimensional graphics with PlotlyLight](alternatives/plotly_plotting.html)
-
-
-## Bibliography
-
-- [Bibliography](misc/bibliography.html)
-
-## A quick review
-
-- [Quick notes](misc/quick_notes.html)
-
-A review of the `Julia` concepts used within these notes.
-
-
-## Miscellaneous
-
-- Some different [interfaces](misc/julia_interfaces.html) to `Julia`.
-
-- The [CalculusWithJulia](misc/calculus_with_julia.html) package. 
-
-- [Unicode symbol](misc/unicode.html) usage in `Julia`.
-
-
-----
-
-## Contributing, commenting, ...
-
-
-This is a work in progress. To report an issue, make a comment, or suggest something new, please file an [issue](https://github.com/jverzani/CalculusWithJulia.jl/issues/). In your message add the tag `@jverzani` to ensure it is not overlooked. Otherwise, an email to `verzani` at `math.csi.cuny.edu` will also work.
-
-To make edits to the documents directly, a pull request with the modified `*.jmd` files in the `CwJ` directory should be made. Minor edits to the `*.jmd` files should be possible through the GitHub web interface. In the footer of each page, clicking the pencil icon accompanying "suggest an edit" opens the corresponding `*.jmd` file on GitHub for suggesting modifications. The HTML files are generated independently; that need not be done by the contributor.
diff --git a/CwJ/misc/unicode.jmd b/CwJ/misc/unicode.jmd
deleted file mode 100644
index 22325ea..0000000
--- a/CwJ/misc/unicode.jmd
+++ /dev/null
@@ -1,39 +0,0 @@
-# Usages of Unicode symbols
-
-
-`Julia` allows the use of *Unicode* symbols for variable names and in function calls. Unicode symbols are entered in the pattern `\name[tab]`: a slash, `\`, the name (e.g., `alpha`), and then a press of the `tab` key. 
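-
-For example, `\alpha[tab]` yields `α`, which then acts as an ordinary identifier (a small sketch):
-
-```julia
-α = 45 * pi / 180   # entered as \alpha[tab]
-sin(α)              # α is used like any other variable name
-```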
- -In these notes, the following may appear as variable or function names - - -| `\Name` | Symbol | Usage notes | -|:---------------- |:------ |:------------------------------- | -| `\euler` | `ℯ` | The variable `e` | -| `\pi` | `π` | | -| `\alpha` | `α` | | -| `\beta` | `β` | | -| `\delta` | `δ` | | -| `\Delta` | `Δ` | Change, as in `Δx` | -| `\gamma` | `γ` | | -| `\phi` | `ϕ` | | -| `\Phi` | `Φ` | Used for parameterized surfaces | -| `x\_1` | `x₁` | Subscripts | -| `r\vec` | `r⃗` | Vector annotation | -| `T\hat` | `T̂` | Unit vector annotation | - -The following are associated with derivatives - -| `\Name` | Symbol | Usage notes | -|:---------------- |:------ |:------------------------------- | -| `\partial` | `∂` | | -| `\nabla` | `∇` | del operator in CwJ package | - -The following are *infix* operators - -| `\Name` | Symbol | Usage notes | -|:---------------- |:------ |:------------------------------- | -| `\circ` | `∘` | composition | -| `\cdot` | `⋅` | dot product | -| `\times` | `×` | cross product | - -Infix operators may need parentheses due to precedence rules. For example, to call a composition, one needs `(f ∘ g)(x)` so that composition happens before function evaluation (`g(x)`). diff --git a/CwJ/misc/using-pluto.jmd b/CwJ/misc/using-pluto.jmd deleted file mode 100644 index c99329d..0000000 --- a/CwJ/misc/using-pluto.jmd +++ /dev/null @@ -1,6 +0,0 @@ -# Using Pluto - - - -!!! note - We see in this notebook the use of `let` blocks, which is not typical with `Pluto`. As `Pluto` is reactive -- meaning changes in a variable propagate automatically to variables which reference the changed one -- a variable can only be used *once* per notebook at the top level. The `let` block, like a function body, introduces a separate scope for the binding so `Pluto` doesn't incorporate the binding in its reactive model. This is necessary as we have more than one function named `f`. This is unlike `begin` blocks, which are quite typical in `Pluto`. 
The `begin` blocks allow one or more commands to occur in a cell, as the design of `Pluto` is one object per cell. diff --git a/CwJ/precalc/Project.toml b/CwJ/precalc/Project.toml deleted file mode 100644 index 74acd04..0000000 --- a/CwJ/precalc/Project.toml +++ /dev/null @@ -1,11 +0,0 @@ -[deps] -DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0" -Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4" -IntervalArithmetic = "d1acc4aa-44c8-5952-acd4-ba5d80a2a253" -Measures = "442fdcdd-2543-5da2-b0f3-8c86c306513e" -Plots = "91a5bcdd-55d7-5caf-9e0b-520d859cae80" -Polynomials = "f27b6e38-b328-58d1-80ce-0feddd5e7a45" -PyPlot = "d330b81b-6aea-500a-939a-2ce795aea3ee" -RealPolynomialRoots = "87be438c-38ae-47c4-9398-763eabe5c3be" -Richardson = "708f8203-808e-40c0-ba2d-98a6953ed40d" -Tables = "bd369af6-aec1-5ad0-b16a-f7cc5008161c" diff --git a/CwJ/precalc/cache/calculator.cache b/CwJ/precalc/cache/calculator.cache deleted file mode 100644 index 1d685b1..0000000 Binary files a/CwJ/precalc/cache/calculator.cache and /dev/null differ diff --git a/CwJ/precalc/cache/exp_log_functions.cache b/CwJ/precalc/cache/exp_log_functions.cache deleted file mode 100644 index be89b19..0000000 Binary files a/CwJ/precalc/cache/exp_log_functions.cache and /dev/null differ diff --git a/CwJ/precalc/cache/functions.cache b/CwJ/precalc/cache/functions.cache deleted file mode 100644 index 255d34f..0000000 Binary files a/CwJ/precalc/cache/functions.cache and /dev/null differ diff --git a/CwJ/precalc/cache/inversefunctions.cache b/CwJ/precalc/cache/inversefunctions.cache deleted file mode 100644 index bc3bef4..0000000 Binary files a/CwJ/precalc/cache/inversefunctions.cache and /dev/null differ diff --git a/CwJ/precalc/cache/julia_overview.cache b/CwJ/precalc/cache/julia_overview.cache deleted file mode 100644 index f4f91ea..0000000 Binary files a/CwJ/precalc/cache/julia_overview.cache and /dev/null differ diff --git a/CwJ/precalc/cache/logical_expressions.cache 
b/CwJ/precalc/cache/logical_expressions.cache deleted file mode 100644 index 10c832c..0000000 Binary files a/CwJ/precalc/cache/logical_expressions.cache and /dev/null differ diff --git a/CwJ/precalc/cache/numbers_types.cache b/CwJ/precalc/cache/numbers_types.cache deleted file mode 100644 index 5cff264..0000000 Binary files a/CwJ/precalc/cache/numbers_types.cache and /dev/null differ diff --git a/CwJ/precalc/cache/plotting.cache b/CwJ/precalc/cache/plotting.cache deleted file mode 100644 index 9fd533b..0000000 Binary files a/CwJ/precalc/cache/plotting.cache and /dev/null differ diff --git a/CwJ/precalc/cache/polynomial.cache b/CwJ/precalc/cache/polynomial.cache deleted file mode 100644 index a4139ac..0000000 Binary files a/CwJ/precalc/cache/polynomial.cache and /dev/null differ diff --git a/CwJ/precalc/cache/polynomial_roots.cache b/CwJ/precalc/cache/polynomial_roots.cache deleted file mode 100644 index 8a10874..0000000 Binary files a/CwJ/precalc/cache/polynomial_roots.cache and /dev/null differ diff --git a/CwJ/precalc/cache/ranges.cache b/CwJ/precalc/cache/ranges.cache deleted file mode 100644 index 15e9aba..0000000 Binary files a/CwJ/precalc/cache/ranges.cache and /dev/null differ diff --git a/CwJ/precalc/cache/rational_functions.cache b/CwJ/precalc/cache/rational_functions.cache deleted file mode 100644 index 1ecbb8b..0000000 Binary files a/CwJ/precalc/cache/rational_functions.cache and /dev/null differ diff --git a/CwJ/precalc/cache/testing.cache b/CwJ/precalc/cache/testing.cache deleted file mode 100644 index f686629..0000000 Binary files a/CwJ/precalc/cache/testing.cache and /dev/null differ diff --git a/CwJ/precalc/cache/transformations.cache b/CwJ/precalc/cache/transformations.cache deleted file mode 100644 index 89d8608..0000000 Binary files a/CwJ/precalc/cache/transformations.cache and /dev/null differ diff --git a/CwJ/precalc/cache/trig_functions.cache b/CwJ/precalc/cache/trig_functions.cache deleted file mode 100644 index 6cd4ba5..0000000 Binary files 
a/CwJ/precalc/cache/trig_functions.cache and /dev/null differ diff --git a/CwJ/precalc/cache/variables.cache b/CwJ/precalc/cache/variables.cache deleted file mode 100644 index 4c30b1b..0000000 Binary files a/CwJ/precalc/cache/variables.cache and /dev/null differ diff --git a/CwJ/precalc/cache/vector.cache b/CwJ/precalc/cache/vector.cache deleted file mode 100644 index 690d73e..0000000 Binary files a/CwJ/precalc/cache/vector.cache and /dev/null differ diff --git a/CwJ/precalc/cache/vectors.cache b/CwJ/precalc/cache/vectors.cache deleted file mode 100644 index 1c5c6be..0000000 Binary files a/CwJ/precalc/cache/vectors.cache and /dev/null differ diff --git a/CwJ/precalc/calculator.jmd b/CwJ/precalc/calculator.jmd deleted file mode 100644 index c6137b5..0000000 --- a/CwJ/precalc/calculator.jmd +++ /dev/null @@ -1,1048 +0,0 @@ -# From calculator to computer - - -```julia; echo=false; -using CalculusWithJulia -using CalculusWithJulia.WeaveSupport - -frontmatter = ( - title = "From calculator to computer", - description = "Calculus with Julia: Replacing the calculator with a computer", - tags = ["CalculusWithJulia", "precalc", "replacing the calculator with a computer"], -); - -nothing -``` - - -Let us consider a basic calculator with buttons to add, subtract, -multiply, divide, and take square roots. Using such a simple thing is -certainly familiar for any reader of these notes. Indeed, a -familiarity with a *graphing* calculator is expected. `Julia` makes -these familiar tasks just as easy, offering numerous conveniences along the -way. In this section we describe how. - -The following image is the calculator that Google presents upon searching for "calculator." - -```julia; echo=false; -imgfile = "figures/calculator.png" -caption = "Screenshot of a calculator provided by the Google search engine." 
-ImageFile(:precalc, imgfile, caption) -``` - - -This calculator should have a familiar appearance with a keypad of -numbers, a set of buttons for arithmetic operations, a set of buttons -for some common mathematical functions, a degree/radian switch, and -buttons for interacting with the calculator: `Ans`, `AC` (also `CE`), -and `=`. - -The goal here is to see the counterparts within `Julia` to these features. - - ----- - -For an illustration of a *really* basic calculator, have some fun watching this video: - -```julia; echo=false; -txt = """ -
- -
-""" -CalculusWithJulia.WeaveSupport.HTMLoutput(txt) -``` - -## Operations - -Performing a simple computation on the calculator typically involves -hitting buttons in a sequence, such as "`1`", "`+`", "`2`", "`=`" to compute -`3` from adding `1 + 2`. In `Julia`, the process is not so -different. Instead of pressing buttons, the various values are -typed in. So, we would have: - -```julia; -1 + 2 -``` - -Sending an expression to `Julia`'s interpreter - the equivalent of -pressing the "`=`" key on a calculator - is done at the command line -by pressing the `Enter` or `Return` key, and in `Pluto`, also using the -"play" icon, or the keyboard shortcut `Shift-Enter`. If the current -expression is complete, then `Julia` evaluates it and shows any -output. If the expression is not complete, `Julia`'s response depends -on how it is being called. Within `Pluto`, a message about -"`premature end of input`" is given. If the expression raises an error, -this will be noted. - - - -The basic arithmetic operations on a calculator are "+", "-", "×", -"÷", and "$xʸ$". These have parallels in `Julia` through the *binary* -operators: `+`, `-`, `*`, `/`, and `^`: - -```julia; -1 + 2, 2 - 3, 3 * 4, 4 / 5, 5 ^ 6 -``` - -On some calculators, there is a distinction between minus signs - the -binary minus sign and the unary minus sign to create values such as -$-1$. - -In `Julia`, the same symbol, "`-`", is used for each: - -```julia; --1 - 2 -``` - -An expression like $6 - -3$, subtracting minus three from six, must be handled with some care. With the -Google calculator, the expression must be entered with accompanying -parentheses: $6 -(-3)$. In `Julia`, parentheses may be used, but are not needed. However, if omitted, a -space is required between the two minus signs: - -```julia; -6 - -3 -``` - -(If no space is included, the value "`--`" is parsed like a different, undefined, operation.) 
- - -```julia; echo=false -warning(L""" - -`Julia` only uses one symbol for minus, but web pages may not! Copying -and pasting an expression with a minus sign can lead to hard to -understand errors such as: `invalid character "−"`. There are several -Unicode symbols that look similar to the ASCII minus sign, but are -different. These notes use a different character for the minus sign for -the typeset math (e.g., $1 - \pi$) than for the code within cells -(e.g. `1 - 2`). Thus, copying and pasting the typeset math may not work as expected. - -""") -``` - -### Examples - -##### Example - -For everyday temperatures, the conversion from Celsius to Fahrenheit -($9/5 C + 32$) is well approximated by simply doubling and -adding $30$. Compare these values for an average room temperature, $C=20$, and for a relatively chilly day, $C=5$: - -For $C=20$: - -```julia; -9 / 5 * 20 + 32 -``` - -The easy to compute approximate value is: - -```julia; -2 * 20 + 30 -``` - -The difference is: - -```julia; -(9/5*20 + 32) - (2 * 20 + 30) -``` - -For $C=5$, we have the actual value of: - -```julia; -9 / 5 * 5 + 32 -``` - -and the easy to compute value is simply $40 = 10 + 30$. The difference is - -```julia; -(9 / 5 * 5 + 32) - 40 -``` - - -##### Example - -Add the numbers $1 + 2 + 3 + 4 + 5$. - -```julia; -1 + 2 + 3 + 4 + 5 -``` - -##### Example - -How small is $1/2/3/4/5/6$? It is about $14/10,000$, as this will show: - -```julia; -1/2/3/4/5/6 -``` - -##### Example - -Which is bigger $4^3$ or $3^4$? We can check by computing their difference: - -```julia; -4^3 - 3^4 -``` - -So $3^4$ is bigger. - -##### Example - -A right triangle has sides $a=11$ and $b=12$. Find the length of the - hypotenuse squared. As $c^2 = a^2 + b^2$ we have: - -```julia; -11^2 + 12^2 -``` - -## Order of operations - -The calculator must use some rules to define how it will evaluate its instructions when two or more operations are involved. 
We know mathematically, that when $1 + 2 \cdot 3$ is to be evaluated the multiplication is done first then the addition. - -With the Google Calculator, typing `1 + 2 x 3 =` will give the value -$7$, but *if* we evaluate the `+` sign first, via `1` `+` `2` `=` `x` `3` `=` the -answer will be 9, as that will force the addition of `1+2` before -multiplying. The more traditional way of performing that calculation -is to use *parentheses* to force an evaluation. That is, -`(1 + 2) * 3 =` will produce `9` (though one must type it in, and not use a mouse -to enter). Except for the most primitive of calculators, there are -dedicated buttons for parentheses to group expressions. - -In `Julia`, the entire expression is typed in before being evaluated, -so the usual conventions of mathematics related to the order of -operations may be used. These are colloquially summarized by the -acronym [PEMDAS](http://en.wikipedia.org/wiki/Order_of_operations). - -> **PEMDAS**. This acronym stands for Parentheses, Exponents, -> Multiplication, Division, Addition, Subtraction. The order indicates -> which operation has higher precedence, or should happen first. This -> isn't exactly the case, as "M" and "D" have the same precedence, as -> do "A" and "S". In the case of two operations with equal precedence, -> *associativity* is used to decide which to do. For the operations -> `+`, `-`, `*`, `/` the associativity is left to right, as in the -> left one is done first, then the right. However, `^` has right -> associativity, so `4^3^2` is `4^(3^2)` and not `(4^3)^2`. (Be warned -> that some calculators - and spread sheets, such as Excel - will -> treat this expression with left associativity.) 
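These precedence and associativity rules can be verified directly at the `Julia` prompt; each claim below evaluates without error:

```julia
@assert 1 + 2 * 3 == 7            # multiplication before addition
@assert (1 + 2) * 3 == 9          # parentheses force the addition first
@assert 2 - 3 - 4 == (2 - 3) - 4  # `-` associates left to right
@assert 4^3^2 == 4^(3^2)          # `^` associates right to left, not (4^3)^2
```

Here `@assert` raises an error if a claim is false; silence means the rules hold as described.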
- - -With rules of precedence, an expression like the following has a -clear interpretation to `Julia` without the need for parentheses: - -```julia; -1 + 2 - 3 * 4 / 5 ^ 6 -``` - -Working through PEMDAS we see that `^` is first, then `*` and then `/` -(this due to associativity and `*` being the leftmost expression of -the two) and finally `+` and then `-`, again by associativity -rules. So we should have the same value with: - -```julia; -(1 + 2) - ((3 * 4) / (5 ^ 6)) -``` - -If different parentheses are used, the answer will likely be different. For example, the following forces the operations to be `-`, then `*`, then `+`. The result of that is then divided by `5^6`: - -```julia; -(1 + ((2 - 3) * 4)) / (5 ^ 6) -``` - - -### Examples - -##### Example - -The percentage error in $x$ if $y$ is the correct value is $(x-y)/y \cdot 100$. Compute this if $x=100$ and $y=98.6$. - -```julia; -(100 - 98.6) / 98.6 * 100 -``` - -##### Example - -The marginal cost of producing one unit can be computed by - finding the cost for $n+1$ units and subtracting the cost for - $n$ units. If the cost of $n$ units is $n^2 + 10$, find the marginal cost when $n=100$. - -```julia; -(101^2 + 10) - (100^2 + 10) -``` - -##### Example - -The average cost per unit is the total cost divided by the number of units. Again, if the cost of $n$ units is $n^2 + 10$, find the average cost for $n=100$ units. - -```julia; -(100^2 + 10) / 100 -``` - -##### Example - -The slope of the line through two points is $m=(y_1 - y_0) / (x_1 - x_0)$. For the two points $(1,2)$ and $(3,4)$ find the slope of the line through them. - -```julia; -(4 - 2) / (3 - 1) -``` - -### Two ways to write division - and they are not the same - -The expression $a + b / c + d$ is equivalent to $a + (b/c) + d$ due to the order of operations. It will generally have a different answer than $(a + b) / (c + d)$. - -How would the following be expressed, were it written inline: - -```math -\frac{1 + 2}{3 + 4}? 
-``` - -It would have to be computed through $(1 + 2) / (3 + 4)$. This is because unlike `/`, the implied order of operation in the mathematical notation with the *horizontal division symbol* (the [vinculum](http://tinyurl.com/y9tj6udl)) is to compute the top and the bottom and then divide. That is, the vinculum is a grouping notation like parentheses, only implicitly so. Thus the above expression really represents the more verbose: - - -```math -\frac{(1 + 2)}{(3 + 4)}. -``` - -This lends itself readily to the translation: - -```julia; -(1 + 2) / (3 + 4) -``` - -To emphasize, this is not the same as the value without the parentheses: - -```julia; -1 + 2 / 3 + 4 -``` - -!!! warning - The vinculum also indicates grouping when used with the square root (the top bar), and complex conjugation. That usage is often clear enough, but the usage of the vinculum in division often leads to confusion. The example above is one where the parentheses are often, erroneously, omitted. However, more confusion can arise when there is more than one vinculum. An expression such as $a/b/c$ written inline has no confusion, it is: $(a/b) / c$ as left association is used; but when written with a pair of vincula there is often the typographical convention of a slightly longer vinculum to indicate which is to be considered first. In the absence of that, top-to-bottom association is often implied. - - - -### Infix, postfix, and prefix notation - -The factorial button on the Google calculator creates an expression like `14!` that is then evaluated. The operator, `!`, appears after the value (`14`) that it is applied to. This is called *postfix notation*. When a unary minus sign is used, as in `-14`, the minus sign occurs before the value it operates on. This uses *prefix notation*. These concepts can be extended to binary operations, where a third possibility is provided: *infix notation*, where the operator -is between the two values.
The infix notation is common for our familiar mathematical operations. We write `14 + 2` and not `+ 14 2` or `14 2 +`. (Though if we had an old reverse-Polish notation calculator, we would enter `14 2 +`!) In `Julia`, there are several infix operators, such as `+`, `-`, ... and others that we may be unfamiliar with. These mirror the familiar notation from most math texts. - - -!!! note - In `Julia` many infix operations can also be done in a prefix manner. For example, `14 + 2` can also be evaluated by `+(14, 2)`. There are very few *postfix* operations, though in these notes we will overload one, the `'` operation, to indicate a derivative. - - -## Constants - -The Google calculator has two built-in constants, `e` and `π`. `Julia` provides these as well, though not quite as easily. First, `π` is just `pi`: - -```julia; -pi -``` - -Whereas `e` is not simply the character `e`, but *rather* a [Unicode](../unicode.html) character typed in as `\euler[tab]`. - -```julia -ℯ -``` - -!!! note - However, when the accompanying package, `CalculusWithJulia`, is loaded, the character `e` will refer to a floating point approximation to the Euler constant. - - -In the sequel, we will just use `e` for this constant (though more commonly the `exp` function), with the reminder that base `Julia` alone does not reserve this symbol. - -Mathematically these are irrational values with decimal expansions that do not repeat. `Julia` represents these values internally with additional accuracy beyond that which is displayed. Math constants can be used as though they were numbers, as is done with this expression: - -```julia; -ℯ^(1/(2*pi)) -``` - - -!!! warning - That is true in most cases. There are occasional (basically rare) spots where using `pi` by itself causes an error where `1*pi` will not. The reason is `1*pi` will create a floating point value from the irrational object, `pi`.
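A quick sketch of the distinction the warning draws: `pi` is stored as a special `Irrational` type and is only converted to a floating point value when combined with other numbers:

```julia
@assert pi isa Irrational      # an exact, special-purpose type
@assert 1 * pi isa Float64     # arithmetic forces a floating point conversion
@assert 1 * pi == Float64(pi)  # the conversion gives the usual Float64 value
```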
- - -### Numeric literals - -For some special cases, Julia implements *multiplication* without a multiplication symbol. This is when the value on the left is a number, as in `2pi`, which has an equivalent value to `2*pi`. *However* the two are not equivalent, in that multiplication with *numeric literals* does not have the same precedence as regular multiplication - it is higher. This has practical importance when used in division or powers. For instance, these two are **not** the same: - -```julia; -1/2pi, 1/2*pi -``` - -Why? Because the first `2pi` is performed before division, as multiplication with numeric literals has higher precedence than regular multiplication, which is at the same level as division. - -To confuse things even more, consider - -```julia; -2pi^2pi -``` - -Is this the same as `2 * (pi^2) * pi` or `(2pi)^(2pi)`? The former would be the case if powers had higher precedence than literal multiplication, the latter would be the case were it the reverse. In fact, the correct answer is `2 * (pi^(2*pi))`: - -```julia; -2pi^2pi, 2 * (pi^2) * pi, (2pi)^(2pi), 2 * (pi^(2pi)) -``` - -This follows usual mathematical convention, but is a source of potential confusion. It is best to be explicit about multiplication, save for the simplest of cases. - - -## Functions - -On the Google calculator, the square root button has a single purpose: for the current value find a square root if possible, and if not signal an error (such as what happens if the value is negative). For more general powers, the $x^y$ key can be used. - -In `Julia`, functions are used to perform the actions that a specialized button may do on the calculator. `Julia` provides many standard mathematical functions - more than there could be buttons on a calculator - and allows the user to easily define their own functions. For example, `Julia` provides the same set of functions as on Google's calculator, though with different names.
For logarithms, -$\ln$ becomes `log` and $\log$ is `log10` (computer programs almost -exclusively reserve `log` for the natural log); for factorials, $x!$, -there is `factorial`; for powers $\sqrt{}$ becomes `sqrt`, $EXP$ -becomes `exp`, and $x^y$ is computed with the infix operator `^`. For the trigonometric -functions, the basic names are similar: `sin`, `cos`, `tan`. These -expect radians. For angles in degrees, the convenience functions -`sind`, `cosd`, and `tand` are provided. On the calculator, inverse -functions like $\sin^{-1}(x)$ are done by combining $Inv$ with -$\sin$. With `Julia`, the function name is `asin`, an abbreviation for -"arcsine." (Which is a good thing, as the notation using a power of -$-1$ is often a source of confusion and is not supported by `Julia` without work.) Similarly, there -are `asind`, `acos`, `acosd`, `atan`, and `atand` functions available -to the `Julia` user. - -The following table summarizes the above: - -```julia; echo=false -using DataFrames -calc = [ -L" $+$, $-$, $\times$, $\div$", -L"x^y", -L"\sqrt{}, \sqrt[3]{}", -L"e^x", -L" $\ln$, $\log$", -L"\sin, \cos, \tan, \sec, \csc, \cot", -"In degrees, not radians", -L"\sin^{-1}, \cos^{-1}, \tan^{-1}", -L"n!", -] - - -julia = [ -"`+`, `-`, `*`, `/`", -"`^`", -"`sqrt`, `cbrt`", -"`exp`", -"`log`, `log10`", -"`sin`, `cos`, `tan`, `sec`, `csc`, `cot`", -"`sind`, `cosd`, `tand`, `secd`, `cscd`, `cotd`", -"`asin`, `acos`, `atan`", -"`factorial`" -] - -CalculusWithJulia.WeaveSupport.table(DataFrame(Calculator=calc, Julia=julia)) -``` - - - - -Using a function is very straightforward. A function is called using parentheses, in a manner visually similar to how a function is called mathematically. So if we consider the `sqrt` function, we have: - -```julia; -sqrt(4), sqrt(5) -``` - -The function is referred to by name (`sqrt`) and called with parentheses. Any arguments are passed into the function using commas to separate values, should there be more than one. 
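A few of the names in the table above can be spot-checked (comparisons use `≈` where the degree-based and radian-based computations may differ in the last digit):

```julia
@assert sqrt(100) == 10.0
@assert log10(1000) ≈ 3
@assert sind(30) ≈ sin(pi/6)  # degrees on the left, radians on the right
@assert asind(sind(52)) ≈ 52  # the inverse functions use an `a` prefix
@assert factorial(4) == 24
```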
When there are numerous values for a function, the arguments may need to be given in a specific order or may possibly be specified with *keywords*. (A semicolon can be used instead of a comma to separate keyword arguments.) - -Some more examples: - -```julia; -exp(2), log(10), sqrt(100), 10^(1/2) -``` - - -!!! note - Parentheses have many roles. We've just seen that parentheses may be used for grouping, and now we see they are used to indicate a function is being called. These are familiar from their parallel usage in traditional math notation. In `Julia`, a third usage is common, the making of a "tuple," or a container of different objects, for example `(1, sqrt(2), pi)`. In these notes, the output of multiple commands separated by commas is a printed tuple. - - - - -### Multiple arguments - -For the logarithm, we mentioned that `log` is the natural log and `log10` implements the logarithm base 10. There is also `log2`. However, in general there is no `logb` for any base `b`. Instead, the basic `log` function can take *two* arguments. When it does, the first is the base, and the second the value to take the logarithm of. This avoids forcing the user to remember that $\log_b(x) = \log(x)/\log(b)$. - -So we have all these different, but related, uses to find logarithms: - -```julia; -log(e), log(2, e), log(10, e), log(e, 2) -``` - -In `Julia`, the "generic" function `log` not only has different implementations for different types of arguments (real or complex), but also has a different implementation depending on the number of arguments. - -### Examples - - -##### Example - -A right triangle has sides $a=11$ and $b=12$. Find the length of the hypotenuse. As $c^2 = a^2 + b^2$ we have: - -```julia; -sqrt(11^2 + 12^2) -``` - -##### Example - -A formula from statistics for the *standard deviation* of a binomial random variable with parameters $p$ and $n$ is $\sqrt{n p (1-p)}$. Compute this value for $p=1/4$ and $n=10$.
- -```julia; -sqrt(10 * 1/4 * (1 - 1/4)) -``` - -##### Example - -Find the distance between the points $(-3, -4)$ and $(5,6)$. Using the distance formula $\sqrt{(x_1-x_0)^2+(y_1-y_0)^2}$, we have: - -```julia; -sqrt((5 - -3)^2 + (6 - -4)^2) -``` - - -##### Example - -The formula to compute the resistance of two resistors in parallel is -given by: $1/(1/r_1 + 1/r_2)$. Suppose the resistance is $10$ in one resistor -and $20$ in the other. What is the resistance in parallel? - -```julia; -1 / (1/10 + 1/20) -``` - -## Errors - -Not all computations on a calculator are valid. For example, the Google calculator will display `Error` as the output of $0/0$ or $\sqrt{-1}$. These are also errors mathematically, though the second is not if the complex numbers are considered. - -In `Julia`, there is a richer set of error types. The value `0/0` will in fact not be an error, but rather a value `NaN`. This is a special floating point value indicating "not a number" and is the result for various operations. The output of $\sqrt{-1}$ (computed via `sqrt(-1)`) will indicate a domain error: - -```julia; -sqrt(-1) -``` - -For integer or real-valued inputs, the `sqrt` function expects non-negative values, so that the output will always be a real number. - -There are other types of errors. Overflow is a common one on most -calculators. The value of $1000!$ is actually *very* large (over 2500 -digits large). On the Google calculator it returns `Infinity`, a -slight stretch. For `factorial(1000)` `Julia` returns an -`OverflowError`. This means that the answer is too large to be -represented as a regular integer. - -```julia; -factorial(1000) -``` - -How `Julia` handles overflow is a study in tradeoffs. For integer operations -that demand high performance, `Julia` does not check for overflow. So, -for example, if we are not careful strange answers can be -had. 
Consider the difference here between powers of 2: - -```julia; -2^62, 2^63 -``` - -On a machine with $64$-bit integers, the first of these two values is correct, the second, clearly wrong, as the answer given is negative. This is due to overflow. The cost of checking is considered too high, so no error is thrown. The user is expected to have a sense that they need to be careful when their values are quite large. (Or the user can use floating point numbers, which though not always exact, can represent much bigger values and are exact for a reasonably wide range of integer values.) - - -!!! warning - In a turnaround from a classic blues song, we can think of `Julia` as built for speed, not for comfort. All of these errors above could be worked around so that the end user doesn't see them. However, this would require slowing things down, either through checking of operations or allowing different types of outputs for similar types of inputs. These tradeoffs are made in favor of performance. For the most part, the tradeoffs don't get in the way, but learning where to be careful takes some time. Error messages often suggest a proper alternative. - - -##### Example - -Did Homer Simpson disprove [Fermat's Theorem](http://www.npr.org/sections/krulwich/2014/05/08/310818693/did-homer-simpson-actually-solve-fermat-s-last-theorem-take-a-look)? - -Fermat's Last Theorem states there are no solutions over the integers to $a^n + b^n = c^n$ when $n > 2$. In the photo accompanying the linked article, we see: - -```math -3987^{12} + 4365^{12} - 4472^{12}. -``` - -If you were to do this on most calculators, the answer would be $0$. Were this true, it would show that there is at least one solution to $a^{12} + b^{12} = c^{12}$ over the integers - hence Fermat would be wrong. So is it $0$? - -Well, let's try something with `Julia` to see.
Being clever, we check if $(3987^{12} + 4365^{12})^{1/12} = 4472$: - -```julia; -(3987^12 + 4365^12)^(1/12) -``` - -Not even close. Case closed. But wait? This number to be found must be *at least* as big as $3987$ and we got $28$. Doh! Something can't be right. Well, maybe integer powers are being an issue. (The largest $64$-bit integer is less than $10^{19}$ and we can see that $(4\cdot 10^3)^{12}$ is bigger than $10^{36})$. Trying again using floating point values for the base, we see: - - -```julia; -(3987.0^12 + 4365.0^12)^(1/12) -``` - -Ahh, we see something really close to $4472$, but not exactly. Why do -most calculators get this last part wrong? It isn't that they don't -use floating point, but rather the difference between the two numbers: - -```julia; -(3987.0^12 + 4365.0^12)^(1/12) - 4472 -``` - -is less than $10^{-8}$ so on a display with $8$ digits may be rounded to $0$. - -Moral: with `Julia` and with calculators, we still have to be mindful not to blindly accept an answer. - - - - - - -## Questions - -###### Question - - -Compute $22/7$ with `Julia`. - -```julia; hold=true; echo=false; -val = 22/7 -numericq(val) -``` - -###### Question - - -Compute $\sqrt{220}$ with `Julia`. - -```julia; hold=true; echo=false; -val = sqrt(220) -numericq(val) -``` - -###### Question - - -Compute $2^8$ with `Julia`. - -```julia; hold=true; echo=false; -val = 2^8 -numericq(val) -``` - -###### Question - -Compute the value of - -```math -\frac{9 - 5 \cdot (3-4)}{6 - 2}. 
-``` - -```julia; hold=true; echo=false; -val = (9-5*(3-4)) / (6-2) -numericq(val) -``` - -###### Question - - -Compute the following using `Julia`: - -```math -\frac{(.25 - .2)^2}{(1/4)^2 + (1/3)^2} -``` - -```julia; hold=true; echo=false; -val = (.25 - .2)^2/((1/4)^2 + (1/3)^2); -numericq(val) -``` - -###### Question - - -Compute the decimal representation of the following using `Julia`: - -```math -1 + \frac{1}{2} + \frac{1}{2^2} + \frac{1}{2^3} + \frac{1}{2^4} -``` - -```julia; hold=true; echo=false; -val = sum((1/2).^(0:4)); -numericq(val) -``` - - -###### Question - -Compute the following using `Julia`: - -```math -\frac{3 - 2^2}{4 - 2\cdot3} -``` - -```julia; hold=true; echo=false; -val = (3 - 2^2)/(4 - 2*3); -numericq(val) -``` - -###### Question - -Compute the following using `Julia`: - -```math -(1/2) \cdot 32 \cdot 3^2 + 100 \cdot 3 - 20 -``` - -```julia; hold=true; echo=false; -val = (1/2)*32*3^2 + 100*3 - 20; -numericq(val) -``` - - -###### Question - - -Wich of the following is a valid `Julia` expression for - -```math -\frac{3 - 2}{4 - 1} -``` - -that uses the least number of parentheses? - -```julia; hold=true; echo=false; -choices = [ -q"(3 - 2)/ 4 - 1", -q"3 - 2 / (4 - 1)", -q"(3 - 2) / (4 - 1)"] -answ = 3 -radioq(choices, answ) -``` - -###### Question - - -Wich of the following is a valid `Julia` expression for - -```math -\frac{3\cdot2}{4} -``` - -that uses the least number of parentheses? - -```julia; hold=true; echo=false; -choices = [ -q"3 * 2 / 4", -q"(3 * 2) / 4" -] -answ = 1 -radioq(choices, answ) -``` - -###### Question - - -Which of the following is a valid `Julia` expression for - -```math -2^{4 - 2} -``` - -that uses the least number of parentheses? - -```julia; hold=true; echo=false; -choices = [ -q"2 ^ 4 - 2", -q"(2 ^ 4) - 2", -q"2 ^ (4 - 2)"] -answ = 3 -radioq(choices, answ) -``` - -###### Question - -In the U.S. version of the Office, the opening credits include a calculator calculation. 
The key sequence shown is `9653 +` which produces `11532`. What value was added to? - -```julia; hold=true; echo=false; -val = 11532 - 9653 -numericq(val) -``` - -###### Question - - -We saw that `1 / 2 / 3 / 4 / 5 / 6` is about $14$ divided by $10,000$. But what would be a more familiar expression representing it: - - - -```julia; hold=true; echo=false; -choices = [ -q"1 / (2 / 3 / 4 / 5 / 6)", -q"1 / 2 * 3 / 4 * 5 / 6", -q"1 /(2 * 3 * 4 * 5 * 6)"] -answ = 3 -radioq(choices, answ) -``` - -###### Question - - -One of these three expressions will produce a different answer, select -that one: - -```julia; hold=true; echo=false; -choices = [ -q"2 - 3 - 4", -q"(2 - 3) - 4", -q"2 - (3 - 4)" -]; -answ = 3; -radioq(choices, answ) -``` - - -###### Question - -One of these three expressions will produce a different answer, select -that one: - -```julia; hold=true; echo=false; -choices = [ -q"2 - 3 * 4", -q"(2 - 3) * 4", -q"2 - (3 * 4)" -]; -answ = 2; -radioq(choices, answ) -``` - - -###### Question - -One of these three expressions will produce a different answer, select -that one: - - -```julia; hold=true; echo=false; -choices = [ -q"-1^2", -q"(-1)^2", -q"-(1^2)" -]; -answ = 2; -radioq(choices, answ) -``` - - -###### Question - -What is the value of $\sin(\pi/10)$? - -```julia; hold=true; echo=false; -val = sin(pi/10) -numericq(val) -``` -###### Question - - -What is the value of $\sin(52^\circ)$? - -```julia; hold=true; echo=false; -val = sind(52) -numericq(val) -``` - -###### Question - -What is the value of - -```math -\frac{\sin(\pi/3) - 1/2}{\pi/3 - \pi/6} -``` - -```julia; hold=true; echo=false; -val = (sin(pi/3) - 1/2)/(pi/3 - pi/6) -numericq(val) -``` - - - -###### Question - - -Is $\sin^{-1}(\sin(3\pi/2))$ equal to $3\pi/2$? (The "arc" functions -do no use power notation, but instead a prefix of `a`.) 
- -```julia; hold=true; echo=false; -yesnoq(false) -``` - -###### Question - - -What is the value of `round(3.5000)` - -```julia; hold=true; echo=false; -numericq(round(3.5)) -``` - -###### Question - - -What is the value of `sqrt(32 - 12)` - -```julia; hold=true; echo=false; -numericq(sqrt(32-12)) -``` - -###### Question - - -Which is greater $e^\pi$ or $\pi^e$? - - -```julia; hold=true; echo=false; -choices = [ -raw"``e^{\pi}``", -raw"``\pi^{e}``" -]; -answ = exp(pi) - pi^exp(1) > 0 ? 1 : 2; -radioq(choices, answ) -``` - -###### Question - - -What is the value of $\pi - (x - \sin(x)/\cos(x))$ when $x=3$? - - - -```julia; hold=true; echo=false; -x = 3; -answ = x - sin(x)/cos(x); -numericq(pi - answ) -``` - -###### Question - - -Factorials in `Julia` are computed with the function `factorial`, not the postfix operator `!`, as with math notation. What is $10!$? - -```julia; hold=true; echo=false; -val = factorial(10) -numericq(val) -``` - -###### Question - -Will `-2^2` produce `4` (which is a unary `-` evaluated *before* `^`) or `-4` (which is a unary `-` evaluated *after* `^`)? - -```julia; hold=true; echo=false; -choices = [q"4", q"-4"] -answ = 2 -radioq(choices, answ) -``` - -###### Question - -A twitter post from popular mechanics generated some attention. - - -![](https://raw.githubusercontent.com/jverzani/CalculusWithJuliaNotes.jl/master/CwJ/precalc/figures/order_operations_pop_mech.png) - -What is the answer? - -```julia; hold=true; echo=false; -val = 8/2*(2+2) -numericq(val) -``` - -Does this expression return the *correct* answer using proper order of operations? - -```julia; -8÷2(2+2) -``` - -```julia; hold=true; echo=false; -yesnoq(false) -``` - -Why or why not: - -```julia; hold=true; echo=false; -choices = [ -"The precedence of numeric literal coefficients used for implicit multiplication is higher than other binary operators such as multiplication (`*`), and division (`/`, `\\`, and `//`)", -"Of course it is correct." 
-] -answ=1 -radioq(choices, answ) -``` diff --git a/CwJ/precalc/exp_log_functions.jmd b/CwJ/precalc/exp_log_functions.jmd deleted file mode 100644 index 17d8bcb..0000000 --- a/CwJ/precalc/exp_log_functions.jmd +++ /dev/null @@ -1,687 +0,0 @@ -# Exponential and logarithmic functions - -This section uses the following add-on packages: - -```julia -using CalculusWithJulia -using Plots -``` - -```julia; echo=false; results="hidden" -using CalculusWithJulia.WeaveSupport - -const frontmatter = ( - title = "Exponential and logarithmic functions", - description = "Calculus with julia", - tags = ["CalculusWithJulia", "precalc", "exponential and logarithmic functions"], -); - -nothing -``` - ----- - -The family of exponential functions is used to model growth and decay. The family of logarithmic functions is defined here as the inverse of the exponential functions, but its reach extends far beyond that. - - -## Exponential functions - -The family of exponential functions is defined by $f(x) = a^x$, $-\infty < x < \infty$ and $a > 0$. For $0 < a < 1$ these functions decay or decrease, for $a > 1$ the functions grow or increase, and if $a=1$ the function is constantly $1$. - -For a given $a$, defining $a^n$ for positive integers is straightforward, as it means multiplying $n$ copies of $a$. From this, for *integer powers*, the key properties of exponents: $a^x \cdot a^y = a^{x+y}$, and $(a^x)^y = a^{x \cdot y}$ are immediate consequences. For example, with ``x=3`` and ``y=2``: - -```math -\begin{align*} -a^3 \cdot a^2 &= (a\cdot a \cdot a) \cdot (a \cdot a) \\ -&= (a \cdot a \cdot a \cdot a \cdot a) \\ -&= a^5 = a^{3+2},\\ -(a^3)^2 &= (a\cdot a \cdot a) \cdot (a\cdot a \cdot a)\\ -&= (a\cdot a \cdot a \cdot a\cdot a \cdot a) \\ -&= a^6 = a^{3\cdot 2}. -\end{align*} -``` - - - -For $a \neq 0$, $a^0$ is defined to be $1$. - -For positive, integer values of $n$, we have by definition that $a^{-n} = 1/a^n$.
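These rules of exponents are easy to check numerically. The sketch below uses a floating point base so that large powers are not a concern; the negative-power comparison uses `≈`, as the two sides may round differently:

```julia
a = 1.5
@assert a^3 * a^2 == a^5      # a^x ⋅ a^y = a^(x+y)
@assert (a^3)^2 == a^(3 * 2)  # (a^x)^y = a^(x⋅y)
@assert a^0 == 1
@assert a^(-2) ≈ 1 / a^2
```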
- -For $n$ a positive integer, we can -define $a^{1/n}$ to be the unique positive solution to $x^n=a$. - -Using the key properties of exponents we can extend this to a definition -of $a^x$ for any rational $x$. - -Defining $a^x$ for any real number requires some more sophisticated -mathematics. - -One method is to use a -[theorem](http://tinyurl.com/zk86c8r) that says a *bounded* -monotonically increasing sequence will converge. (This uses the -[Completeness -Axiom](https://en.wikipedia.org/wiki/Completeness_of_the_real_numbers).) -Then for $a > 1$ we have if $q_n$ is a sequence of rational numbers -increasing to $x$, then $a^{q_n}$ will be a bounded sequence of -increasing numbers, so will converge to a number defined to be -$a^x$. Something similar is possible for the $0 < a < 1$ case. - -This definition can be done to ensure the rules of exponents hold for -$a > 0$: - -```math -a^{x + y} = a^x \cdot a^y, \quad (a^x)^y = a^{x \cdot y}. -``` - - - -In `Julia` these functions are implemented using `^`. A special value -of the base, ``e``, may be defined as well in terms of a limit. The -exponential function ``e^x`` is implemented in `exp`. - - -```julia;hold=true; -plot(x -> (1/2)^x, -2, 2, label="1/2") -plot!(x -> 1^x, label="1") -plot!(x -> 2^x, label="2") -plot!(x -> exp(x), label="e") -``` - -We see examples of some general properties: - -* The domain is all real $x$ and the range is all *positive* $y$ - (provided $a \neq 1$). -* For $0 < a < 1$ the functions are monotonically decreasing. -* For $a > 1$ the functions are monotonically increasing. -* If $1 < a < b$ and $x > 0$ we have $a^x < b^x$. - - -##### Example - -[Continuously](http://tinyurl.com/gsy939y) compounded interest allows -an initial amount $P_0$ to grow over time according to -$P(t)=P_0e^{rt}$. Investigate the difference between investing $1,000$ -dollars in an account which earns $2$% as opposed to an account which -earns $8$% over $20$ years. 
-
-The $r$ in the formula is the interest rate, so $r=0.02$ or
-$r=0.08$. To compare the differences we have:
-
-```julia;
-r2, r8 = 0.02, 0.08
-P0 = 1000
-t = 20
-P0 * exp(r2*t), P0 * exp(r8*t)
-```
-
-As can be seen, there is quite a bit of difference.
-
-In ``1494``, [Pacioli](http://tinyurl.com/gsy939y) gave the "Rule of ``72``",
-stating that to find the number of years it takes an investment to
-double when continuously compounded one should divide the interest rate
-into $72$.
-
-This is a rule of thumb rather than a precise formula; the number is
-closer to $69$, but $72$ has many divisors, which makes the
-approximation easy to compute. Let's see how accurate it is:
-
-```julia;
-t2, t8 = 72/2, 72/8
-exp(r2*t2), exp(r8*t8)
-```
-So fairly close - after $72/r$ years the amount is $2.05...$ times more
-than the initial amount.
-
-##### Example
-
-[Bacterial growth](https://en.wikipedia.org/wiki/Bacterial_growth)
-(according to Wikipedia) is the asexual reproduction, or cell
-division, of a bacterium into two daughter cells, in a process called
-binary fission. During the log phase "the number of new bacteria
-appearing per unit time is proportional to the present population."
-The article states that "Under controlled conditions, *cyanobacteria*
-can double their population four times a day..."
-
-Given an initial population of $P_0$ bacteria, a formula for the
-number after $n$ *hours* is $P(n) = P_0 2^{n/6}$, where $6 = 24/4$ is
-the doubling time in hours.
-
-After two days what multiple of the initial amount is present if
-conditions are appropriate?
-
-```julia;hold=true;
-n = 2 * 24
-2^(n/6)
-```
-
-That would be enormous growth. Don't worry: "Exponential growth
-cannot continue indefinitely, however, because the medium is soon
-depleted of nutrients and enriched with wastes."
-
-!!! note
-    The value of `2^n` and `2.0^n` is different in `Julia`. The former remains an integer and is subject to integer overflow for `n > 62`. 
As used above, `2^(n/6)` will not overflow for larger `n`, as when the exponent is a floating point value, the base is promoted to a floating point value.
-
-
-##### Example
-
-The famous [Fibonacci](https://en.wikipedia.org/wiki/Fibonacci_number)
-numbers are $1,1,2,3,5,8,13,\dots$, where $F_{n+1}=F_n+F_{n-1}$. These
-numbers increase. To see how fast, if we *guess* that the growth is
-eventually exponential and assume $F_n \approx c \cdot a^n$, then our
-equation is approximately $ca^{n+1} = ca^n + ca^{n-1}$. Factoring out
-common terms gives ``ca^{n-1} \cdot (a^2 - a - 1) = 0``. The term
-``a^{n-1}`` is always positive, so any solution would satisfy
-$a^2 - a - 1 = 0$. The positive solution is
-$(1 + \sqrt{5})/2 \approx 1.618$.
-
-That is evidence that $F_n \approx c\cdot 1.618^n$. (See
-[Relation to golden ratio](https://en.wikipedia.org/wiki/Fibonacci_number#Relation_to_the_golden_ratio)
-for a related, but more explicit, exact formula.)
-
-##### Example
-
-In the previous example, the exponential family of functions is used
-to describe growth. Polynomial functions also increase. Could these be
-used instead? If so that would be great, as they are easier to reason
-about.
-
-The key fact is that exponential growth is much greater than
-polynomial growth. That is, for large enough $x$ and for any fixed
-$a>1$ and positive integer $n$ it is true that $a^x \gg x^n$.
-
-Later we will see an easy way to certify this statement.
-
-
-##### The mathematical constant ``e``
-
-
-Euler's number, ``e``, may be defined several ways. One way is to
-define ``e^x`` as the limit of ``(1+x/n)^n`` as ``n`` goes to
-infinity. Then ``e=e^1``. The value is an irrational number. This
-number turns out to be the natural base to use for many problems
-arising in Calculus. In `Julia` there are a few mathematical constants
-that get special treatment, so that when needed, extra precision is
-available. The value `e` is not immediately assigned to this value,
-rather `ℯ` is. This is typed
-`\euler[tab]`. 
The label `e` is thought too important for other uses
-to reserve the name for representing a single number. However, users
-can issue the command `using Base.MathConstants` and `e` will be
-available to represent this number. When the `CalculusWithJulia`
-package is loaded, the value `e` is defined to be the floating point
-number returned by `exp(1)`. This loses the feature of arbitrary
-precision, but has other advantages.
-
-
-A [cute](https://www.mathsisfun.com/numbers/e-eulers-number.html) appearance of ``e`` is in this problem: Let ``a>0``. Cut ``a`` into ``n`` equal pieces and then multiply them. What ``n`` will produce the largest value? Note that the formula is ``(a/n)^n`` for a given ``a`` and ``n``.
-
-Suppose ``a=5``; then for ``n=1,2,3`` we get:
-
-```julia; hold=true;
-a = 5
-(a/1)^1, (a/2)^2, (a/3)^3
-```
-
-We'd need to compare more, but at this point ``n=2`` is the winner when ``a=5``.
-
-With calculus, we will be able to see that the function ``f(x) = (a/x)^x`` will be maximized at ``x = a/e``, but for now we approach this in an exploratory manner. Suppose ``a=5``, then we have:
-
-
-```julia;hold=true;
-a = 5
-n = 1:10
-f(n) = (a/n)^n
-@. [n f(n) (a/n - e)] # @. just allows broadcasting
-```
-
-We can see more clearly that ``n=2`` gives the largest value of ``f`` and that ``a/2`` is the value closest to ``e``. This would be the case for any ``a>0``: pick ``n`` so that ``a/n`` is closest to ``e``.
-
-
-##### Example: The limits to growth
-
-The ``1972`` book [The limits to growth](https://donellameadows.org/wp-content/userfiles/Limits-to-Growth-digital-scan-version.pdf) by Meadows et al. discusses the implications of exponential growth. It begins by stating their conclusion (emphasis added): "If the present *growth* trends in world population, industrialization, pollution, food production, and resource depletion continue *unchanged*, the limits to growth on this planet will be reached sometime in the next *one hundred* years." 
They note it is possible to alter these growth trends. We are now halfway into this time period.
-
-Let's consider one of their examples, the concentration of carbon dioxide in the atmosphere. In their Figure ``15`` they show data from ``1860`` onward of CO``_2`` concentration extrapolated out to the year ``2000``. At [climate.gov](https://www.climate.gov/news-features/understanding-climate/climate-change-atmospheric-carbon-dioxide) we can see actual measurements from ``1960`` to ``2020``. Numbers read off each graph are plotted in the code below:
-
-
-```julia;
-co2_1970 = [(1860, 293), (1870, 293), (1880, 294), (1890, 295), (1900, 297),
-            (1910, 298), (1920, 300), (1930, 303), (1940, 305), (1950, 310),
-            (1960, 313), (1970, 320), (1980, 330), (1990, 350), (2000, 380)]
-co2_2021 = [(1960, 318), (1970, 325), (1980, 338), (1990, 358), (2000, 370),
-            (2010, 390), (2020, 415)]
-
-xs,ys = unzip(co2_1970)
-plot(xs, ys, legend=false)
-
-𝒙s, 𝒚s = unzip(co2_2021)
-plot!(𝒙s, 𝒚s)
-
-r = 0.002
-x₀, P₀ = 1960, 313
-plot!(x -> P₀ * exp(r * (x - x₀)), 1950, 1990, linewidth=5, alpha=0.25)
-
-𝒓 = 0.005
-𝒙₀, 𝑷₀ = 2000, 370
-plot!(x -> 𝑷₀ * exp(𝒓 * (x - 𝒙₀)), 1960, 2020, linewidth=5, alpha=0.25)
-```
-
-(The `unzip` function is from the `CalculusWithJulia` package and will be explained in a subsequent section.) We can see that the projections from the year ``1970`` hold up fairly well.
-
-On this plot we added two *exponential* models: at ``1960``, a *roughly* ``0.2`` percent per year growth (a rate mentioned in an accompanying caption), and at ``2000``, a roughly ``0.5`` percent per year growth. The former barely keeps up with the data.
-
-The word **roughly** above could be made exact. Suppose we knew that between ``1960`` and ``1970`` the concentration went from ``313`` to ``320``. If this followed an exponential model, then ``r`` above would satisfy:
-
-```math
-P_{1970} = P_{1960} e^{r \cdot (1970 - 1960)}
-```
-
-or on division ``320/313 = e^{r\cdot 10}``. 
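
Anticipating the natural logarithm, introduced in the next section as the inverse of ``e^x``, the growth rate in this last equation can be recovered numerically:

```julia
# solve 320/313 = exp(r * 10) for r by "undoing" exp with log
r = log(320/313) / 10  # roughly 0.0022, i.e. about 0.2 percent per year
```
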
Solving for ``r`` can be done -- as explained next -- and yields ``0.002211\dots``.
-
-
-
-## Logarithmic functions
-
-As the exponential functions are strictly *decreasing* when $0 < a <
-1$ and strictly *increasing* when $a>1,$ in both cases an inverse
-function will exist. (When $a=1$ the function is a constant and is not
-one-to-one.) The domain of an exponential function is all real $x$ and
-the range is all *positive* $y$, so these are switched around for the
-inverse function. Explicitly: the inverse function to ``f(x)=a^x`` will have domain ``(0,\infty)`` and range ``(-\infty, \infty)`` when ``a > 0, a \neq 1``.
-
-The inverse function will solve for $x$ in the equation $a^x = y$. The
-answer, formally, is the logarithm base $a$, written $\log_a(x)$.
-
-That is $a^{\log_a(x)} = x$ for ``x > 0`` and $\log_a(a^x) = x$ for all ``x``.
-
-
-How a logarithm is mathematically defined will have to wait;
-the family of functions - one for each $a>0$, $a \neq 1$ - is implemented in `Julia` through the function
-`log(a,x)`. There are special cases requiring just one argument: `log(x)` will compute the natural
-log, base $e$ - the inverse of $f(x) = e^x$; `log2(x)` will compute the
-log base $2$ - the inverse of $f(x) = 2^x$; and `log10(x)` will compute
-the log base $10$ - the inverse of $f(x)=10^x$. (Also `log1p` computes an accurate value of ``\log(1 + p)`` when ``p \approx 0``.)
-
-To see this in an example, we plot for base $2$ the exponential
-function $f(x)=2^x$, its inverse, and the logarithm function with base
-$2$:
-
-```julia;hold=true;
-f(x) = 2^x
-xs = range(-2, stop=2, length=100)
-ys = f.(xs)
-plot(xs, ys, color=:blue, label="2ˣ") # plot f
-plot!(ys, xs, color=:red, label="f⁻¹") # plot f^(-1)
-xs = range(1/4, stop=4, length=100)
-plot!(xs, log2.(xs), color=:green, label="log₂") # plot log2
-```
-
-Though we made three graphs, only two are seen, as the graph of `log2`
-matches that of the inverse function.
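
The inverse relationship seen in the graph can also be checked numerically at a point or two (`≈` is used to allow for floating-point round-off):

```julia
f(x) = 2^x
log2(f(3.5)) ≈ 3.5  # log2 undoes f
f(log2(10)) ≈ 10    # and f undoes log2
```
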
- -Note that we needed a bit of care to plot the inverse function -directly, as the domain of $f$ is *not* the domain of $f^{-1}$. Again, in -this case the domain of $f$ is all $x$, but the domain of $f^{-1}$ is only all *positive* $x$ values. - -Knowing that `log2` implements an inverse function allows us to solve -many problems involving doubling. - - -##### Example - -An [old](https://en.wikipedia.org/wiki/Wheat_and_chessboard_problem) story about doubling is couched in terms of doubling grains of wheat. To simplify the story, suppose each day an amount of grain is doubled. How many days of doubling will it take ``1`` grain to become ``1`` million grains? - -The number of grains after one day is $2$, two days is $4$, three days -is $8$ and so after $n$ days the number of grains is $2^n$. To answer -the question, we need to solve $2^x = 1,000,000$. The logarithm -function yields ``20`` days (after rounding up): - -```julia; -log2(1_000_000) -``` - -##### Example - -The half-life of a radioactive material is the time it takes for half the material to decay. Different materials have quite different half lives with some quite long, and others quite short. See [half lives](https://en.wikipedia.org/wiki/List_of_radioactive_isotopes_by_half-life) for some details. - -The carbon ``14`` isotope is a naturally occurring isotope on Earth, -appearing in trace amounts. Unlike Carbon ``12`` and ``13`` it decays, in this -case with a half life of ``5730`` years (plus or minus ``40`` years). In a -[technique](https://en.wikipedia.org/wiki/Radiocarbon_dating) due to -Libby, measuring the amount of Carbon 14 present in an organic item -can indicate the time since death. The amount of Carbon ``14`` at death is -essentially that of the atmosphere, and this amount decays over -time. So, for example, if roughly half the carbon ``14`` remains, then the death occurred -about ``5730`` years ago. 
-
-
-A formula for the amount of carbon ``14`` remaining $t$ years after death would be $P(t) = P_0 \cdot 2^{-t/5730}$.
-
-If $1/10$ of the original carbon ``14`` remains, how old is the item? This amounts to solving $2^{-t/5730} = 1/10$. We have: $-t/5730 = \log_2(1/10)$ or:
-
-```julia;
--5730 * log2(1/10)
-```
-
-!!! note
-    (Historically) Libby and James Arnold proceeded to test the radiocarbon dating theory by analyzing samples with known ages. For example, two samples taken from the tombs of two Egyptian kings, Zoser and Sneferu, independently dated to ``2625`` BC plus or minus ``75`` years, were dated by radiocarbon measurement to an average of ``2800`` BC plus or minus ``250`` years. These results were published in Science in ``1949``. Within ``11`` years of their announcement, more than ``20`` radiocarbon dating laboratories had been set up worldwide. Source: [Wikipedia](http://tinyurl.com/p5msnh6).
-
-
-### Properties of logarithms
-
-
-The basic graphs of logarithms ($a > 1$) are all similar: larger
-bases lead to slower-growing functions, but all satisfy $\log_a(1)
-= 0$:
-
-```julia;
-plot(log2, 1/2, 10, label="2") # base 2
-plot!(log, 1/2, 10, label="e") # base e
-plot!(log10, 1/2, 10, label="10") # base 10
-```
-
-
-Now, what do the properties of exponents imply about logarithms?
-
-Consider the sum $\log_a(u) + \log_a(v)$. If we raise $a$ to this
-power, we have using the powers of exponents and the inverse nature of
-$a^x$ and $\log_a(x)$ that:
-
-```math
-a^{\log_a(u) + \log_a(v)} = a^{\log_a(u)} \cdot a^{\log_a(v)} = u \cdot v.
-```
-
-Taking $\log_a$ of *both* sides yields $\log_a(u) + \log_a(v)=\log_a(u\cdot v)$. That is, logarithms turn products into sums (of logs).
-
-Similarly, the relation $(a^{x})^y =a^{x \cdot y}, a > 0$ can be used to see that
-$\log_a(b^x) = x \cdot\log_a(b)$. This follows, as applying $a^x$ to each side yields the
-same answer.
-
-Due to the inverse relationship between $a^x$ and $\log_a(x)$ we have:
-
-```math
-a^{\log_a(b^x)} = b^x.
-```
-
-Due to the rules of exponents, we have:
-
-```math
-a^{x \log_a(b)} = a^{\log_a(b) \cdot x} = (a^{\log_a(b)})^x = b^x.
-```
-
-Finally, since $a^x$ is one-to-one (when $a>0$ and $a \neq 1$), if $a^{\log_a(b^x)}=a^{x \log_a(b)}$ it must be that $\log_a(b^x) = x \log_a(b)$. That is, logarithms turn powers into products.
-
-Lastly, we use the inverse property of logarithms and powers to show that logarithms can be defined for any base. Say $a, b > 0$, neither equal to $1$. Then $\log_a(x) = \log_b(x)/\log_b(a)$. Again, to verify this we apply $a^x$ to both sides to see we get the same answer:
-
-```math
-a^{\log_a(x)} = x,
-```
-
-this by the inverse property. Whereas, by expressing $a=b^{\log_b(a)}$ we have:
-
-```math
-a^{(\log_b(x)/\log_b(a))} = (b^{\log_b(a)})^{(\log_b(x)/\log_b(a))} =
-b^{\log_b(a) \cdot \log_b(x)/\log_b(a) } = b^{\log_b(x)} = x.
-```
-
-In short, we have these three properties of logarithmic functions:
-
-If $a, b$ are positive bases; $u,v$ are positive numbers; and $x$ is any real number then:
-
-```math
-\begin{align*}
-\log_a(uv) &= \log_a(u) + \log_a(v), \\
-\log_a(u^x) &= x \log_a(u), \text{ and} \\
-\log_a(u) &= \log_b(u)/\log_b(a).
-\end{align*}
-```
-
-##### Example
-
-Before the ubiquity of electronic calculating devices, the need to
-compute was still present. Ancient civilizations had abacuses to make
-addition easier. For multiplication and powers a [slide
-rule](https://en.wikipedia.org/wiki/Slide_rule) could be used.
-It is easy to represent addition physically with two straight pieces
-of wood - just represent a number with a distance and align the two
-pieces so that the distances are sequentially arranged. To multiply
-then was as easy: represent the logarithm of a number with a distance
-then add the logarithms. The sum of the logarithms is the logarithm of
-the *product* of the original two values. 
Converting back to a number
-answers the question. The conversion back and forth is done by simply
-labeling the wood using a logarithmic scale. The slide rule was
-[invented](http://tinyurl.com/qytxo3e) soon after Napier's initial publication
-on the logarithm in 1614.
-
-
-##### Example
-
-Returning to the Rule of ``72``, what should the exact number be?
-
-The amount of time to double an investment that grows according to
-$P_0 e^{rt}$ solves $P_0 e^{rt} = 2P_0$ or $rt = \log_e(2)$. So we get
-$t=\log_e(2)/r$. As $\log_e(2)$ is
-
-```julia;
-log(e, 2)
-```
-
-the actual rule should be the "Rule of $69.314...$."
-
-## Questions
-
-###### Question
-
-Suppose a population doubles every $4$ days. If the population starts
-with $2$ individuals, what is its size after $4$ weeks?
-
-```julia; hold=true; echo=false
-n = 4*7/4
-val = 2 * 2^n
-numericq(val)
-```
-
-###### Question
-
-A bouncing ball rebounds to a height of $5/6$ of the previous peak
-height. If the ball is dropped at a height of $3$ feet, how high will
-it bounce after $5$ bounces?
-
-```julia; hold=true; echo=false
-val = 3 * (5/6)^5
-numericq(val)
-```
-
-###### Question
-
-Which is bigger $e^2$ or $2^e$?
-
-```julia; hold=true; echo=false
-choices = ["``e^2``", "``2^e``"]
-answ = e^2 - 2^e > 0 ? 1 : 2
-radioq(choices, answ)
-```
-
-
-###### Question
-
-Which is bigger $\log_8(9)$ or $\log_9(10)$?
-
-```julia; hold=true; echo=false
-choices = [raw"``\log_8(9)``", raw"``\log_9(10)``"]
-answ = log(8,9) > log(9,10) ? 1 : 2
-radioq(choices, answ)
-```
-
-###### Question
-
-If $x$, $y$, and $z$ satisfy $2^x = 3^y$ and $4^y = 5^z$, what is the
-ratio $x/z$?
-
-```julia; hold=true; echo=false
-choices = [
-raw"``\frac{\log(3)\log(5)}{\log(2)\log(4)}``",
-raw"``2/5``",
-raw"``\frac{\log(5)\log(4)}{\log(3)\log(2)}``"
-]
-answ = 1
-radioq(choices, answ)
-```
-
-###### Question
-
-Does $12$ satisfy $\log_2(x) + \log_3(x) = \log_4(x)$?
-
-```julia; hold=true; echo=false
-answ = log(2,12) + log(3,12) == log(4, 12)
-yesnoq(answ)
-```
-
-
-###### Question
-
-The [Richter](https://en.wikipedia.org/wiki/Richter_magnitude_scale)
-magnitude is determined from the logarithm of the amplitude of waves
-recorded by seismographs (Wikipedia). The formula is $M=\log_{10}(A) -
-\log_{10}(A_0)$ where $A_0$ depends on the epicenter distance. Suppose an
-event has $A=100$ and $A_0=1/100$. What is $M$?
-
-```julia; hold=true; echo=false
-A, A0 = 100, 1/100
-val = M = log10(A) - log10(A0)
-numericq(val)
-```
-
-If the magnitude of one earthquake is $9$ and the magnitude of another
-earthquake is $7$, how many times stronger is $A$ if $A_0$ is the same
-for each?
-
-```julia; hold=true; echo=false
-choices = ["``1000`` times", "``100`` times", "``10`` times", "the same"]
-answ = 2
-radioq(choices, answ, keep_order=true)
-```
-
-###### Question
-
-The [Loudest band](https://en.wikipedia.org/wiki/Loudest_band) can
-be measured in [decibels](https://en.wikipedia.org/wiki/Decibel). In ``1976`` the Who recorded $126$
-db and in ``1986`` Motorhead recorded $130$ db. Suppose both measurements
-record power through the formula $db = 10 \log_{10}(P)$. What is the
-ratio of the Motorhead $P$ to the $P$ for the Who?
-
-```julia; hold=true; echo=false
-db_who, db_motorhead = 126, 130
-db2P(db) = 10^(db/10)
-P_who, P_motorhead = db2P.((db_who, db_motorhead))
-val = P_motorhead / P_who
-numericq(val)
-```
-
-###### Question
-
-Based on this graph:
-
-```julia; hold=true;
-plot(log, 1/4, 4, label="log")
-f(x) = x - 1
-plot!(f, 1/4, 4, label="x-1")
-```
-
-Which statement appears to be true?
- -```julia; hold=true; echo=false -choices = [ - raw"``x \geq 1 + \log(x)``", - raw"``x \leq 1 + \log(x)``" -] -answ = 1 -radioq(choices, answ) -``` - - -###### Question - -Consider this graph: - -```julia; hold=true; -f(x) = log(1-x) -g(x) = -x - x^2/2 -plot(f, -3, 3/4, label="f") -plot!(g, -3, 3/4, label="g") -``` - -What statement appears to be true? - -```julia; hold=true; echo=false -choices = [ -raw"``\log(1-x) \geq -x - x^2/2``", -raw"``\log(1-x) \leq -x - x^2/2``" -] -answ = 1 -radioq(choices, answ) -``` - -###### Question - -Suppose $a > 1$. If $\log_a(x) = y$ what is $\log_{1/a}(x)$? (The -reciprocal property of exponents, $a^{-x} = (1/a)^x$, is at play here.) - -```julia; hold=true; echo=false -choices = ["``-y``", "``1/y``", "``-1/y``"] -answ = 1 -radioq(choices, answ) -``` - -Based on this, the graph of $\log_{1/a}(x)$ is the graph of -$\log_a(x)$ under which transformation? - -```julia; hold=true; echo=false -choices = [ -L"Flipped over the $x$ axis", -L"Flipped over the $y$ axis", -L"Flipped over the line $y=x$" -] -answ = 1 -radioq(choices, answ) -``` - - -###### Question - -Suppose $x < y$. 
Then for $a > 0$, $a^y - a^x$ is equal to: - -```julia; hold=true; echo=false -choices = [ - raw"``a^x \cdot (a^{y-x} - 1)``", - raw"``a^{y-x}``", - raw"``a^{y-x} \cdot (a^x - 1)``" -] -answ = 1 -radioq(choices, answ) -``` - -Using $a > 1$ we have: - -```julia; hold=true; echo=false -choices = [ - L"as $a^{y-x} > 1$ and $y-x > 0$, $a^y > a^x$", - L"as $a^x > 1$, $a^y > a^x$", - "``a^{y-x} > 0``" -] -answ=1 -radioq(choices, answ) -``` - -If $a < 1$ then: - -```julia; hold=true; echo=false -choices = [ -L"as $a^{y-x} < 1$ as $y-x > 0$, $a^y < a^x$", -L"as $a^x < 1$, $a^y < a^x$", -"``a^{y-x} < 0``" -] -answ = 1 -radioq(choices, answ) -``` diff --git a/CwJ/precalc/figures/c02-1970.png b/CwJ/precalc/figures/c02-1970.png deleted file mode 100644 index 1c5f9a6..0000000 Binary files a/CwJ/precalc/figures/c02-1970.png and /dev/null differ diff --git a/CwJ/precalc/figures/c02-2021.png b/CwJ/precalc/figures/c02-2021.png deleted file mode 100644 index 01244a9..0000000 Binary files a/CwJ/precalc/figures/c02-2021.png and /dev/null differ diff --git a/CwJ/precalc/figures/calculator.png b/CwJ/precalc/figures/calculator.png deleted file mode 100644 index baf1f7d..0000000 Binary files a/CwJ/precalc/figures/calculator.png and /dev/null differ diff --git a/CwJ/precalc/figures/leading_term.gif b/CwJ/precalc/figures/leading_term.gif deleted file mode 100644 index 4af4f1a..0000000 Binary files a/CwJ/precalc/figures/leading_term.gif and /dev/null differ diff --git a/CwJ/precalc/figures/order_operations_pop_mech.png b/CwJ/precalc/figures/order_operations_pop_mech.png deleted file mode 100644 index 0bbec61..0000000 Binary files a/CwJ/precalc/figures/order_operations_pop_mech.png and /dev/null differ diff --git a/CwJ/precalc/figures/summary-sum-and-difference-of-two-angles.jpg b/CwJ/precalc/figures/summary-sum-and-difference-of-two-angles.jpg deleted file mode 100644 index cbd7477..0000000 Binary files a/CwJ/precalc/figures/summary-sum-and-difference-of-two-angles.jpg and /dev/null differ 
diff --git a/CwJ/precalc/functions.jmd b/CwJ/precalc/functions.jmd deleted file mode 100644 index 0b925a3..0000000 --- a/CwJ/precalc/functions.jmd +++ /dev/null @@ -1,1256 +0,0 @@ -# Functions - -This section will use the following add-on packages: - -```julia -using CalculusWithJulia, Plots -``` - -```julia; echo=false; results="hidden" -using CalculusWithJulia.WeaveSupport - -const frontmatter = ( - title = "Functions", - description = "Calculus with Julia: Functions", - tags = ["CalculusWithJulia", "precalc", "functions"], -); - -nothing -``` - ---- - -A mathematical [function](http://en.wikipedia.org/wiki/Function_(mathematics)) is defined abstractly by: - -> **Function:** A function is a *relation* which assigns to each element in the domain a *single* element in the range. A **relation** is a set of ordered pairs, $(x,y)$. The set of first coordinates is the domain, the set of second coordinates the range of the relation. - -That is, a function gives a correspondence between values in its domain with values in its range. - -This definition is abstract, as functions can be very general. With -single-variable calculus, we generally specialize to real-valued -functions of a single variable (*univariate, scalar functions*). These -typically have the correspondence given by a rule, such as $f(x) = -x^2$ or $f(x) = \sqrt{x}$. The function's domain may be implicit (as in all $x$ -for which the rule is defined) or may be explicitly given as part of the -rule. The function's range is then the image of its domain, or the set of all -$f(x)$ for each $x$ in the domain ($\{f(x): x \in \text{ domain}\}$). - - -Some examples of mathematical functions are: - -```math -f(x) = \cos(x), \quad g(x) = x^2 - x, \quad h(x) = \sqrt{x}, \quad -s(x) = \begin{cases} -1 & x < 0\\1&x>0\end{cases}. -``` - -For these examples, the domain of both $f(x)$ and $g(x)$ is all real -values of $x$, where as for $h(x)$ it is implicitly just the set of -non-negative numbers, -$[0, \infty)$. 
Finally, for $s(x)$, we can see that the function is defined for every $x$ except $0$, so the domain is all nonzero $x$.
-
-In general the range is harder to identify than the domain, and this is the case for these functions too. For $f(x)$ we may know the $\cos$ function is trapped in $[-1,1]$
-and it is intuitively clear that all values in that set are
-possible. The function $h(x)$ would have range
-$[0,\infty)$. The $s(x)$ function is either $-1$ or $1$, so only has two possible values in its range. What about $g(x)$? It is a parabola that opens upward, so any $y$ values below the $y$ value of its vertex will not appear in the range. In this case, the symmetry indicates that the vertex will be at $(1/2, -1/4)$, so the range is $[-1/4, \infty)$.
-
-
-!!! note
-    **Thanks to Euler (1707-1783):** The formal idea of a function is a relatively modern concept in mathematics. According to [Dunham](http://www.maa.org/sites/default/files/pdf/upload_library/22/Ford/dunham1.pdf),
-    Euler defined a function as an "analytic expression composed in any way
-    whatsoever of the variable quantity and numbers or constant
-    quantities." He goes on to indicate that as Euler matured, so did
-    his notion of function, ending up closer to the modern idea of a
-    correspondence not necessarily tied to a particular formula or
-    “analytic expression.” He finishes by saying: "It is fair to say
-    that we now study functions in analysis because of him."
-
-
-We will see that defining functions within `Julia` can be as simple a concept as Euler started with, but that the more abstract concept has a great advantage that is exploited in the design of the language.
-
-## Defining simple mathematical functions
-
-The notation `Julia` uses to define simple mathematical functions could not be more closely related to how they are written mathematically. 
For example, the functions $f(x)$, $g(x)$, and $h(x)$ above may be defined by:
-
-```julia;
-f(x) = cos(x)
-g(x) = x^2 - x
-h(x) = sqrt(x)
-```
-
-The left-hand side of the equals sign names the function being defined; the statement as a whole is an assignment. In this use, a function with a given signature is defined and attached to a method table for the given function name. The right-hand side is simply `Julia` code to compute the *rule* corresponding to the function.
-
-Calling the function also follows standard math notation:
-
-```julia;
-f(pi), g(2), h(4)
-```
-
-For typical cases like the three above, there isn't really much new to learn.
-
-
-!!! note
-    The equals sign in `Julia` always indicates either an assignment or a
-    mutation of the object on the left side. The definition of a function
-    above is an *assignment*, in that a function is added (or modified) in
-    a table holding the methods associated with the function's name.
-
-    The equals sign restricts the expressions available on the *left*-hand
-    side to a) a variable name, for assignment; b) mutating an object at an index,
-    as in `xs[1]`; c) mutating a property of a struct; or d) a function assignment
-    following this form `function_name(args...)`.
-
-    Whereas function
-    definitions and usage in `Julia` mirror standard math notation,
-    equations in math are not so mirrored in `Julia`. In mathematical
-    equations, the left-hand side of an equation is typically a complicated
-    algebraic expression. Not so with `Julia`, where the left-hand side of
-    the equals sign is prescribed and quite limited.
-
-
-### The domain of a function
-
-Functions in `Julia` have an implicit domain, just as they do mathematically. In the case of $f(x)$ and $g(x)$, the right-hand side is defined for all real values of $x$, so the domain is all $x$. For $h(x)$ this isn't the case, of course. 
Trying to call $h(x)$ when $x < 0$ will give an error:
-
-```julia;
-h(-1)
-```
-
-The `DomainError` is one of many different error types `Julia` has; in this case it is quite apt: the value $-1$ is not in the domain of the function.
-
-
-
-
-
-### Equations, functions, calling a function
-
-Mathematically we tend to blur the distinction between the equation
-
-```math
-y = 5/9 \cdot (x - 32)
-```
-
-and the function
-
-```math
-f(x) = 5/9 \cdot (x - 32)
-```
-
-In fact, the graph of a function $f(x)$ is simply defined as the graph of the
-equation $y=f(x)$. There is a distinction in `Julia` as a command such as
-
-```julia;
-x = -40
-y = 5/9 * (x - 32)
-```
-
-will evaluate the right-hand side with the value of `x` bound at the
-time of assignment to `y`, whereas assignment to a function
-
-```julia;hold=true
-f(x) = 5/9 * (x - 32)
-f(72) ## room temperature
-```
-
-will create a function object with a value of `x` determined at a later
-time - the time the function is called. So the value of `x` defined when the function is created is not
-important here (as the value of `x` used by `f` is passed in as an argument).
-
-
-
-Within `Julia`, we make note of the distinction between a function
-object versus a function call. In the definition `f(x)=cos(x)`, the
-variable `f` refers to a function object, whereas the expression
-`f(pi)` is a function call. This mirrors the math notation where an
-$f$ is used when properties of a function are being emphasized (such
-as $f \circ g$ for composition) and $f(x)$ is used when the values
-related to the function are being emphasized (such as saying "the plot
-of the equation $y=f(x)$").
-
-Distinguishing these three related but different concepts (equations, function objects, and function calls) is important when modeling on the computer.
-
-### Cases
-
-The definition of $s(x)$ above has two cases:
-
-```math
-s(x) = \begin{cases} -1 & x < 0\\ 1 & x > 0. 
\end{cases}
-```
-
-We learn to read this as: when $x$ is less than $0$, then the answer is
-$-1$. If $x$ is greater than $0$ the answer is $1.$ Often - but not
-in this example - there is an "otherwise" case to catch those values
-of $x$ that are not explicitly mentioned. As there is no such
-"otherwise" case here, we can see that this function has no definition
-when $x=0$. This function is often called the "sign" function and is
-also defined by $\lvert x\rvert/x$. (`Julia`'s `sign` function actually
-defines `sign(0)` to be `0`.)
-
-How do we create conditional statements in `Julia`? Programming languages generally have "if-then-else" constructs to handle conditional evaluation. In `Julia`, the following code will handle the above condition:
-
-```julia;eval=false
-if x < 0
-    -1
-elseif x > 0
-    1
-end
-```
-
-The "otherwise" case would be caught with an `else` addition. So, for example, this would implement `Julia`'s definition of `sign` (which also assigns $0$ to $0$):
-
-```julia;eval=false
-if x < 0
-    -1
-elseif x > 0
-    1
-else
-    0
-end
-```
-
-
-The conditions for the `if` statements are expressions that evaluate to either `true` or `false`, such as those generated by the Boolean operators `<`, `<=`, `==`, `!=`, `>=`, and `>`.
-
-
-If you are familiar with `if` conditions, they are natural to use. However, for simpler cases of "if-else" `Julia` provides the more convenient *ternary* operator: `cond ? if_true : if_false`. (The name comes from the fact that there are three arguments specified.) The ternary operator checks the condition: if true, the first expression is returned; if false, the second expression is returned. Only the selected expression is evaluated. (The [short-circuit](http://julia.readthedocs.org/en/latest/manual/control-flow/#short-circuit-evaluation) operators behave similarly.)
-
-For example, here is one way to define an absolute value function:
-
-```julia;
-abs_val(x) = x >= 0 ? 
x : -x
-```
-
-The condition is `x >= 0` - or is `x` non-negative? If so, the value `x` is used, otherwise `-x` is used.
-
-
-Here is a means to implement a function which takes the larger of `x` or `10`:
-
-```julia;
-bigger_10(x) = x > 10 ? x : 10.0
-```
-
-(This could also utilize the `max` function: `f(x) = max(x, 10.0)`.)
-
-Or similarly, a function to represent a cell phone plan where the first ``500`` minutes are ``20`` dollars and every additional minute is ``5`` cents:
-
-```julia;
-cellplan(x) = x < 500 ? 20.0 : 20.0 + 0.05 * (x-500)
-```
-
-!!! warning
-    Type stability. These last two definitions used `10.0` and `20.0`
-    instead of the integers `10` and `20` for the answer. Why the extra
-    typing? When `Julia` can predict the type of the output from the type
-    of inputs, it can be more efficient. So when possible, we help out and
-    ensure the output is always the same type.
-
-##### Example
-
-The ternary operator can be used to define an explicit domain. For example, a falling body might have height given by $h(t) = 10 - 16t^2$. This model only applies for non-negative $t$ and non-negative $h$ values. So, in particular $0 \leq t \leq \sqrt{10/16}$. To implement this function we might have:
-
-```julia;
-hᵣ(t) = 0 <= t <= sqrt(10/16) ? 10.0 - 16t^2 : error("t is not in the domain")
-```
-
-#### Nesting ternary operators
-
-The function `s(x)` isn't quite so easy to implement, as there isn't an "otherwise" case. We could use an `if` statement, but instead illustrate using a second, nested ternary operator:
-
-```julia;
-s(x) = x < 0 ? -1 :
-       x > 0 ?  1 : error("0 is not in the domain")
-```
-
-With nested ternary operators, the advantage over the `if` condition
-is not always compelling, but for simple cases the ternary operator is
-quite useful.
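
As a quick check of this piecewise style, the cell phone plan above can be called with a value in each case (the function is re-defined here so this block is self-contained):

```julia
cellplan(x) = x < 500 ? 20.0 : 20.0 + 0.05 * (x - 500)
cellplan(450), cellplan(550)  # (20.0, 22.5)
```
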
-
-## Functions defined with the "function" keyword
-
-
-For more complicated functions, say one with a few
-steps to compute, an alternate form for defining a function can be
-used:
-
-```verbatim
-function function_name(function_arguments)
-  ...function_body...
-end
-```
-
-The last value computed is returned unless the `function_body`
-contains an explicit `return` statement.
-
-For example, the following is a more verbose way to define $sq(x) = x^2$:
-
-```julia;
-function sq(x)
-    return x^2
-end
-```
-
-The line `return x^2` could have just been `x^2`, as it is the last (and only) line evaluated.
-
-!!! note
-    The `return` keyword is not a function, so is not called with parentheses. An empty `return` statement will return a value of `nothing`.
-
-
-
-##### Example
-
-
-Imagine we have the following complicated function related to the trajectory of a [projectile](http://www.researchgate.net/publication/230963032_On_the_trajectories_of_projectiles_depicted_in_early_ballistic_woodcuts) with wind resistance:
-
-```math
-  f(x) = \left(\frac{g}{k v_0\cos(\theta)} + \tan(\theta) \right) x + \frac{g}{k^2}\ln\left(1 - \frac{k}{v_0\cos(\theta)} x \right)
-```
-
-Here $g$ is the gravitational constant $9.8$ and $v_0$, $\theta$, and $k$ are parameters, which we take to be $200$, $45$ degrees, and $1/2$ respectively. With these values, the above function can be computed when $x=100$ with:
-
-```julia;
-function trajectory(x)
-    g, v0, theta, k = 9.8, 200, 45*pi/180, 1/2
-    a = v0 * cos(theta)
-
-    (g/(k*a) + tan(theta))* x + (g/k^2) * log(1 - k/a*x)
-end
-```
-
-```julia
-trajectory(100)
-```
-
-By using a multi-line function our work is much easier to look over for errors.
-
-##### Example: the secant method for finding a solution to $f(x) = 0$.
-
-This next example shows how using functions to collect a set of computations for simpler reuse can be very helpful. 
-
-An old method for finding a zero of an equation is the [secant
-method](https://en.wikipedia.org/wiki/Secant_method). We illustrate
-the method with the function $f(x) = x^2 - 2$. In an upcoming example we
-will see how to create a function to evaluate the secant line between
-$(a,f(a))$ and $(b, f(b))$ at any point. In this example, we define a
-function to compute the $x$ coordinate of where the secant line
-crosses the $x$ axis. This can be defined as follows:
-
-```julia;
-function secant_intersection(f, a, b)
-   # solve 0 = f(b) + m * (x-b) where m is the slope of the secant line
-   # x = b - f(b) / m
-   m = (f(b) - f(a)) / (b - a)
-   b - f(b) / m
-end
-```
-
-We utilize this as follows. Suppose we wish to solve $f(x) = 0$ and we
-have two "rough" guesses for the answer. In our example, we wish to
-solve $q(x) = x^2 - 2$ and our "rough" guesses are $1$ and $2$. Call
-these values $a$ and $b$. We *improve* our rough guesses by finding a
-value $c$ which is the intersection point of the secant line.
-
-```julia;
-q(x) = x^2 - 2
-𝒂, 𝒃 = 1, 2
-𝒄 = secant_intersection(q, 𝒂, 𝒃)
-```
-
-In our example, we see that in trying to find an answer to $f(x) = 0$
-( $\sqrt{2}\approx 1.414\dots$) our value found from the intersection
-point is a better guess than either $a=1$ or $b=2$:
-
-```julia;echo=false;
-plot(q, 𝒂, 𝒃, linewidth=5, legend=false)
-plot!(zero, 𝒂, 𝒃)
-plot!([𝒂, 𝒃], q.([𝒂, 𝒃]))
-scatter!([𝒄], [q(𝒄)])
-```
-
-Still, `q(𝒄)` is not really close to $0$:
-
-```julia;
-q(𝒄)
-```
-
-*But* it is much closer than either $q(a)$ or $q(b)$, so it is an improvement. This suggests renaming $a$ and $b$ with the old $b$ and $c$ values and trying again; we might do better still:
-
-```julia;hold=true
-𝒂, 𝒃 = 𝒃, 𝒄
-𝒄 = secant_intersection(q, 𝒂, 𝒃)
-q(𝒄)
-```
-
-Yes, now the function value at this new $c$ is even closer to $0$. Trying a few more times we see we just get closer and closer. 
Here we start again to see the progress:
-
-```julia;hold=true;
-𝒂,𝒃 = 1, 2
-for step in 1:6
-  𝒂, 𝒃 = 𝒃, secant_intersection(q, 𝒂, 𝒃)
-  current = (c=𝒃, qc=q(𝒃))
-  @show current
-end
-```
-
-Now our guess $c$ is basically the same as `sqrt(2)`. Repeating the above leads to only a slight improvement in the guess, as we are about as close as floating point values will allow.
-
-Here we see a visualization with all these points. As can be seen, it
-quickly converges at the scale of the visualization, as we can't see
-much closer than `1e-2`.
-
-```julia;hold=true;echo=false;
-f(x) = x^2 - 2
-a, b = 1, 2
-c = secant_intersection(f, a, b)
-
-p = plot(f, a, b, linewidth=5, legend=false)
-plot!(p, zero, a, b)
-
-plot!(p, [a,b], f.([a,b]));
-scatter!(p, [c], [f(c)])
-
-a, b = b, c
-c = secant_intersection(f, a, b)
-plot!(p, [a,b], f.([a,b]));
-scatter!(p, [c], [f(c)])
-
-
-a, b = b, c
-c = secant_intersection(f, a, b)
-plot!(p, [a,b], f.([a,b]));
-scatter!(p, [c], [f(c)])
-p
-```
-
-
-
-In most cases, this method can fairly quickly find a zero provided two good starting points are used.
-
-## Parameters, function context (scope), keyword arguments
-
-Consider two functions implementing the slope-intercept form and point-slope form of a line:
-
-```math
-f(x) = m \cdot x + b, \quad g(x) = y_0 + m \cdot (x - x_0).
-```
-
-Both functions use the variable $x$, but there is no confusion, as we learn that this is just a dummy variable to be substituted for and so could have any name. Both also share a variable $m$ for a slope. Where does that value come from? In practice, there is a context that gives an answer. Despite the same name, there is no expectation that the slope will be the same for each function if the context is different. So when parameters are involved, a function involves a rule and a context to give specific values to the parameters. Euler initially said that functions are composed of "the variable quantity and numbers or constant quantities." 
We still use the term "variable," but instead of "constant quantities" we now say "parameters."
-
-
-
-Something similar is also true with `Julia`. Consider the example of writing a function to model a linear equation with slope $m=2$ and $y$-intercept $3$. A typical means to do this would be to define constants, and then use the familiar formula:
-
-```julia;
-m, b = 2, 3
-mxb(x) = m*x + b
-```
-
-This will work as expected. For example, `mxb(0)` will be `b`, or $3$, and `mxb(2)` will be $7$:
-
-```julia;
-mxb(0), mxb(2)
-```
-
-All fine, but what if somewhere later the values for $m$ and $b$ were *redefined*, say with ``m,b = 3,2``?
-
-
-Now what happens with `mxb(0)`? When `mxb` was defined, `b` was $3$, but
-now if we were to call `mxb`, `b` is ``2``. Which value will we get? More
-generally, when `mxb` is being evaluated, in what context does `Julia`
-look up the bindings for the variables it encounters? It could be that
-the values are assigned when the function is defined, or it could be
-that the values for the parameters are resolved when the function is
-called. If the latter, what context will be used?
-
-
-Before discussing this, let's just see in this case:
-
-```julia;hold=true
-m, b = 3, 2
-mxb(0)
-```
-
-So the `b` is found from the currently stored value. This fact can be exploited: we can write template-like functions, such as `f(x)=m*x+b`, and reuse them just by updating the parameters separately.
-
-
-
-How `Julia` resolves what a variable refers to is described in detail
-in the manual page
-[Scope of Variables](http://julia.readthedocs.org/en/latest/manual/variables-and-scoping/). In
-this case, the function definition finds variables in the context of
-where the function was defined, the main workspace. As seen, this
-context can be modified after the function definition and prior to the
-function call. It is only when `b` is needed that the context is
-consulted, so the most recent binding is retrieved. 
Contexts (more
-formally known as environments) allow the user to repurpose variable
-names without there being name collision. For example, we typically
-use `x` as a function argument, and different contexts allow this `x`
-to refer to different values.
-
-
-Mostly this works as expected, but at times it can be complicated to
-reason about. In our example, definitions of the parameters can be
-forgotten, or the same variable name may have been used for some other
-purpose. The potential issue is with the parameters; the value for `x`
-is straightforward, as it is passed into the function. However, we can
-also pass the parameters, such as $m$ and $b$, as arguments. For
-parameters, we suggest using
-[keyword](http://julia.readthedocs.org/en/latest/manual/functions/#keyword-arguments)
-arguments. These allow the specification of parameters, but also give
-a default value. This can make usage explicit, yet still
-convenient. For example, here is an alternate way of defining a line
-with parameters `m` and `b`:
-
-```julia;
-mxplusb(x; m=1, b=0) = m*x + b
-```
-
-The right-hand side is identical to before, but the left-hand side is
-different. Arguments defined *after* a semicolon are keyword
-arguments. They are specified as `var=value` (or `var::Type=value` to
-restrict the type) where the value is used as the default, should a
-value not be specified when the function is called.
-
-Calling a function with keyword arguments can be identical to before:
-
-```julia;
-mxplusb(0)
-```
-
-During this call, values for `m` and `b` are found from how the
-function is called, not the main workspace. In this case, nothing is
-specified so the defaults of $m=1$ and $b=0$ are used. Whereas, this
-call will use the user-specified values for `m` and `b`:
-
-```julia;
-mxplusb(0; m=3, b=2)
-```
-
-Keywords are used to mark the parameters whose values are to be changed from the default. 
Though one can use *positional arguments* for parameters - and there are good reasons to do so - using keyword arguments is a good practice if performance isn't paramount, as their usage is more explicit yet the defaults mean that a minimum amount of typing needs to be done.
-
-##### Example
-
-In the example for multi-line functions we hard-coded many variables inside the body of the function. In practice it can be better to pass these in as parameters along the lines of:
-
-```julia;hold=true
-function trajectory(x; g = 9.8, v0 = 200, theta = 45*pi/180, k = 1/2)
-    a = v0 * cos(theta)
-    (g/(k*a) + tan(theta))* x + (g/k^2) * log(1 - k/a*x)
-end
-trajectory(100)
-```
-
-
-### The `f(x,p)` style for parameterization
-
-An alternative to keyword arguments is to bundle the parameters into a container and pass them as a single argument to the function. The idiom in `Julia` is to use the *second* argument for parameters, or `f(x, p)` for the function argument specification. This style is used in the very popular `SciML` suite of packages.
-
-For example, here we use a *named tuple* to pass parameters to `trajectory`:
-
-```julia;hold=true
-function trajectory(x, p)
-    g, v0, theta, k = p.g, p.v0, p.theta, p.k  # unpack parameters
-
-    a = v0 * cos(theta)
-    (g/(k*a) + tan(theta))* x + (g/k^2) * log(1 - k/a*x)
-end
-
-p = (g=9.8, v0=200, theta = 45*pi/180, k=1/2)
-trajectory(100, p)
-```
-
-The style isn't so different from using keyword arguments, save the extra step of unpacking the parameters. The *big* advantage is consistency -- the function is always called in an identical manner regardless of the number of parameters (or variables).
-
-
-## Multiple dispatch
-
-The concept of a function is of much more general use than its
-restriction to mathematical functions of a single real variable. A
-natural application comes from describing basic properties of
-geometric objects. 
The following function definitions likely will -cause no great concern when skimmed over: - -```julia;hold=true -Area(w, h) = w * h # of a rectangle -Volume(r, h) = pi * r^2 * h # of a cylinder -SurfaceArea(r, h) = pi * r * (r + sqrt(h^2 + r^2)) # of a right circular cone, including the base -``` - -The right-hand sides may or may not be familiar, but it should be -reasonable to believe that if push came to shove, the formulas could be looked -up. However, the left-hand sides are subtly different - they have two -arguments, not one. In `Julia` it is trivial to define functions with -multiple arguments - we just did. - - -Earlier we saw the `log` function can use a second argument to express -the base. This function is basically defined by `log(b,x)=log(x)/log(b)`. The `log(x)` value is the natural log, and this definition -just uses the change-of-base formula for logarithms. - -But not so fast, on the left side is a function with two arguments and on the right side the functions have one argument - yet they share the same name. How does `Julia` know which to use? `Julia` uses the number, order, and *type* of the positional arguments passed to a function to determine which function definition to use. This is technically known as [multiple dispatch](http://en.wikipedia.org/wiki/Multiple_dispatch) or **polymorphism**. As a feature of the language, it can be used to greatly simplify the number of functions the user must learn. The basic idea is that many functions are "generic" in that they have methods which will work differently in different scenarios. - -!!! warning - Multiple dispatch is very common in mathematics. For example, we learn different ways to add: integers (fingers, carrying), real numbers (align the decimal points), rational numbers (common denominators), complex numbers (add components), vectors (add components), polynomials (combine like monomials), ... yet we just use the same `+` notation for each operation. 
The concepts are related, the details different. - - -`Julia` is similarly structured. `Julia` terminology would be to call the operation "`+`" a *generic function* and the different implementations *methods* of "`+`". This allows the user to just need to know a smaller collection of generic concepts yet still have the power of detail-specific implementations. To see how many different methods are defined in the base `Julia` language for the `+` operator, we can use the command `methods(+)`. As there are so many ($\approx 200$) and that number is growing, we illustrate how many different logarithm methods are implemented for "numbers:" - -```julia; -methods(log, (Number,)) -``` - -(The arguments have *type annotations* such as `x::Float64` or -`x::BigFloat`. `Julia` uses these to help resolve which method should -be called for a given set of arguments. This allows for different -operations depending on the variable type. For example, in this case, -the `log` function for `Float64` values uses a fast algorithm, whereas -for `BigFloat` values an algorithm that can handle multiple precision -is used.) - -##### Example: An application of composition and multiple dispatch - -As mentioned `Julia`'s multiple dispatch allows multiple functions with the same name. The function that gets selected depends not just on the type of the arguments, but also on the number of arguments given to the function. We can exploit this to simplify our tasks. For example, consider this optimization problem: - -> For all rectangles of perimeter ``20``, what is the one with largest area? - -The start of this problem is to represent the area in terms of one variable. We see next that composition can simplify this task, which when done by hand requires a certain amount of algebra. 
-
-Representing the area of a rectangle in terms of two variables is easy, as the familiar formula of width times height applies:
-
-```julia;
-Area(w, h) = w * h
-```
-
-But the other fact about this problem - that the perimeter is $20$ - means that height depends on width. For this question, we can see that $P=2w + 2h$ so that - as a function - `height` depends on `w` as follows:
-
-```julia;
-height(w) = (20 - 2*w)/2
-```
-
-By hand we would substitute this last expression into that for the area and simplify (to get $A=w\cdot (20-2 \cdot w)/2 = -w^2 + 10w$). However, within `Julia` we can let *composition* do the substitution and leave the algebraic simplification for `Julia` to do:
-
-
-```julia;
-Area(w) = Area(w, height(w))
-```
-
-This might seem odd, but just as with `log`, we now have two *different* but related
-functions named `Area`. Julia will decide which to use based on the
-number of arguments when the function is called. This setup allows both to
-be used on the same line, as above. This usage style is not so common with
-many computer languages, but is a feature of `Julia` which is built around
-the concept of *generic* functions with multiple dispatch rules to
-decide which method to call.
-
-
-For example, jumping ahead a bit, the `plot` function of `Plots` expects functions of a single numeric
-variable. Behind the scenes, then, the single-argument method `Area(w)` will be used in this graph:
-
-```julia;
-plot(Area, 0, 10)
-```
-
-From the graph, we can see that the width for maximum area is $w=5$ and so $h=5$ as well.
-
-
-## Function application
-
-The typical calling pattern for a function simply follows *mathematical* notation, that is, `f(x)` calls the function `f` with the argument `x`. There are times -- especially with function composition -- that an alternative *piping* syntax is desirable. `Julia` provides the *infix* operation `|>` for piping, defining it by `|>(x, f) = f(x)`. This allows composition to work left to right, instead of right to left. 
For example, these two calls produce the same answer: - -```julia -exp(sin(log(3))), 3 |> log |> sin |> exp -``` - - -## Other types of functions - -`Julia` has both *generic* functions *and* *anonymous* functions. Generic functions participate in *multiple dispatch*, a central feature of `Julia`. Anonymous functions are very useful with higher-order programming (passing functions as arguments). These notes occasionally take advantage of anonymous functions for convenience. - - -### Anonymous functions - -Simple mathematical functions have a domain and range which are a subset of the real numbers, and generally have a concrete mathematical rule. However, the definition of a function is much more abstract. We've seen that functions for computer languages can be more complicated too, with, for example, the possibility of multiple input values. Things can get more abstract still. - -Take for example, the idea of the shift of a function. The following mathematical definition of a new function $g$ related to a function $f$: - -```math -g(x) = f(x-c) -``` - -has an interpretation - the graph of $g$ will be the same as the graph of $f$ shifted to the right by $c$ units. That is $g$ is a transformation of $f$. From one perspective, the act of replacing $x$ with $x-c$ transforms a function into a new function. Mathematically, when we focus on transforming functions, the word [operator](http://en.wikipedia.org/wiki/Operator_%28mathematics%29) is sometimes used. This concept of transforming a function can be viewed as a certain type of function, in an abstract enough way. The relation would be to just pair off the functions $(f,g)$ where $g(x) = f(x-c)$. - -With `Julia` we can represent such operations. The simplest thing would be to do something like: - -```julia;hold=true -f(x) = x^2 - 2x -g(x) = f(x -3) -``` - -Then $g$ has the graph of $f$ shifted by 3 units to the right. Now `f` above refers to something in the main workspace, in this example a specific function. 
Better would be to allow `f` to be an argument of a function, like this:
-
-```julia;
-function shift_right(f; c=0)
-  function(x)
-    f(x - c)
-  end
-end
-```
-
-That takes some parsing. In the body of `shift_right` is the
-definition of a function. But this function has no name -- it is
-*anonymous*. But what it does should be clear - it subtracts $c$ from
-$x$ and evaluates $f$ at this new value. Since the last expression
-creates a function, this function is returned by `shift_right`.
-
-So we could have done something more complicated like:
-
-```julia;
-f(x) = x^2 - 2x
-l = shift_right(f, c=3)
-```
-
-Then `l` is a function that is derived from `f`.
-
-!!! note
-    The value of `c` used when `l` is called is the one passed to `shift_right`. Functions like `l` that are returned by other functions also are called *closures*, as the context they are evaluated within includes the context of the function that constructs them.
-
-
-Anonymous functions can be created with the `function` keyword, but we will use the "arrow" notation, `arg -> body`, to create them. The above could have been defined as:
-
-```julia
-shift_right_alt(f; c=0) = x -> f(x-c)
-```
-
-When the `->` is seen, a function is being created.
-
-
-!!! warning
-    Generic versus anonymous functions. Julia has two types of functions,
-    generic ones, as defined by `f(x)=x^2`, and anonymous ones, as defined
-    by `x -> x^2`. One gotcha is that `Julia` does not like to use the
-    same variable name for the two types. In general, Julia is a dynamic
-    language, meaning variable names can be reused with different types
-    of variables. But generic functions take more care, as when a new
-    method is defined it gets added to a method table. So repurposing the
-    name of a generic function for something else is not allowed. Similarly,
-    repurposing an already defined variable name for a generic function is
-    not allowed. 
This comes up when we use functions that return functions,
-    as there are different styles that can be used. When we defined `l =
-    shift_right(f, c=3)` the value of `l` is assigned an anonymous
-    function. This binding can be reused to define other variables.
-    However, we could have defined the function `l` through `l(x) =
-    shift_right(f, c=3)(x)`, being explicit about what happens to the
-    variable `x`. This would add a method to the generic function `l`. Meaning, we
-    would get an error if we tried to assign a value to `l`, such as with an
-    expression like `l = 3`. We generally employ the latter style, even though
-    it involves a bit more typing, as we tend to stick to methods of generic
-    functions for consistency.
-
-
-##### Example: the secant line
-
-A secant line is a line through two points on the graph of a function. If we have a function $f(x)$, and two $x$-values $x=a$ and $x=b$, then we can find the slope between the points $(a,f(a))$ and $(b, f(b))$ with:
-
-```math
-m = \frac{f(b) - f(a)}{b - a}.
-```
-
-The point-slope form of a line then gives the equation of the secant line as $y = f(a) + m \cdot (x - a)$.
-
-To model this in `Julia`, we would want to turn the inputs `f`, `a`, `b` into a function that implements the secant line (functions are much easier to work with than equations). Here is how we can do it:
-
-```julia;
-function secant(f, a, b)
-   m = (f(b) - f(a)) / (b-a)
-   x -> f(a) + m * (x - a)
-end
-```
-
-The body of the function nearly mirrors the mathematical treatment. The main difference is in place of $y = \dots$ we have an `x -> ...` to create an anonymous function.
-
-To illustrate the use, suppose $f(x) = x^2 - 2$ and we have the secant line between $a=1$ and $b=2$. The value at $x=3/2$ is given by:
-
-```julia;hold=true
-f(x) = x^2 - 2
-a,b = 1, 2
-secant(f,a,b)(3/2)
-```
-
-The last line employs double parentheses. The first pair, `secant(f,a,b)`, returns a function and the second pair, `(3/2)`, are used to call the returned function. 
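As a quick sanity check, the returned line should agree with $f$ at both endpoints $a$ and $b$. A short sketch, repeating the `secant` definition so the snippet stands alone:

```julia
# The secant line interpolates f at both endpoints a and b
function secant(f, a, b)
    m = (f(b) - f(a)) / (b - a)
    x -> f(a) + m * (x - a)
end

f(x) = x^2 - 2
sl = secant(f, 1, 2)
sl(1) == f(1), sl(2) == f(2)   # (true, true)
```

Both comparisons hold exactly here, as the slope works out to the integer-valued `3.0`.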
-
-
-#### Closures
-
-One main use of anonymous functions is to make [closures](https://en.wikipedia.org/wiki/Closure_(computer_programming)). We've touched on two concepts: functions with parameters *and* functions as arguments to other functions. The creation of a function for a given set of parameters may be needed. Anonymous functions are used to create **closures** which capture the values of the parameters. For a simple example, `mxplusb` parameterizes any line, but to use a function to represent a specific line, a new function can be created:
-
-```julia; hold=true
-mxplusb(x; m=0, b=0) = m*x + b
-specific_line(m,b) = x -> mxplusb(x; m=m, b=b)
-```
-
-The returned object will have its parameters (`m` and `b`) fixed when used.
-
-In `Julia`, the functions `Base.Fix1` and `Base.Fix2` are provided to take functions of two variables and create callable objects of just one variable, with the other argument fixed. This partial function application is provided for some of the logical comparison operators, which can be useful with filtering, say.
-
-For example, `<(2)` is a funny-looking way of expressing the function `x -> x < 2`. (Think of `x < y` as `<(x,y)` and then "fix" the value of `y` to be `2`.) This is useful with filtering by a predicate function, for example:
-
-```julia
-filter(<(2), 0:4)
-```
-
-which picks off the values of `0` and `1` in a somewhat obscure, but less verbose, way than `filter(x -> x < 2, 0:4)`.
-
-
-The `Fix2` function is also helpful when using the `f(x, p)` form for passing parameters to a function. The result of `Base.Fix2(f, p)` is a function with its parameters fixed that can be passed along for plotting or other uses.
-
-
-### The `do` notation
-
-Many functions in `Julia` accept a function as the first argument. A common pattern for calling some function is `action(f, args...)` where `action` is the function that will act on another function `f` using the value(s) in `args...`. 
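For instance, the built-in `map` follows this pattern, with the function to apply first and the collection second:

```julia
# `map(f, collection)`: the function comes first, the data second
map(x -> x^2, 1:3)   # returns [1, 4, 9]
```
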
The `do` notation is syntactic sugar for creating an anonymous function; it is useful when more complicated function bodies are needed.
-
-Here is an artificial example to illustrate a task we won't have cause to use in these notes, but one that is an important skill in some contexts. The `do` notation can be confusing to read, as it moves the function definition to the end and not the beginning, but it is convenient to write and is used very often with the task of this example.
-
-To save some text to a file requires a few steps: opening the file; writing to the file; closing the file. The `open` function does the first. One method has this signature `open(f::Function, args...; kwargs...)` and is documented to "Apply the function f to the result of `open(args...; kwargs...)` and close the resulting file descriptor upon completion." This is great: the open and close stages are handled by `Julia` and only the writing is up to the user.
-
-The writing is done in the body of a function, so the `do` notation allows the creation of the function to be handled anonymously. In this context, the argument to this function will be an `IO` handle, which is typically called `io`.
-
-So the pattern would be
-
-```julia; eval=false
-open("somefile.txt", "w") do io
-    write(io, "Four score and seven")
-    write(io, "years ago...")
-end
-```
-
-The name of the file to open appears, how the file is to be opened (`w` means write, `r` would mean read), and then a function with argument `io` which writes two lines to `io`.
-
-
-
-## Questions
-
-###### Question
-
-State the domain and range of $f(x) = |x + 2|$. 
-
-```julia; hold=true;echo=false;
-choices = [
-"Domain is all real numbers, range is all real numbers",
-"Domain is all real numbers, range is all non-negative numbers",
-"Domain is all non-negative numbers, range is all real numbers",
-"Domain is all non-negative numbers, range is all non-negative numbers"
-]
-answ = 2
-radioq(choices, answ)
-```
-
-###### Question
-
-State the domain and range of $f(x) = 1/(x-2)$.
-
-
-```julia; hold=true;echo=false;
-choices = [
-"Domain is all real numbers, range is all real numbers",
-L"Domain is all real numbers except $2$, range is all real numbers except $0$",
-L"Domain is all non-negative numbers except $0$, range is all real numbers except $2$",
-L"Domain is all non-negative numbers except $-2$, range is all non-negative numbers except $0$"
-]
-answ = 2
-radioq(choices, answ)
-```
-
-###### Question
-
-Which of these functions has a domain of all real $x$, but a range of $y > 0$?
-
-```julia; hold=true;echo=false;
-choices = [
-raw"``f(x) = 2^x``",
-raw"``f(x) = 1/x^2``",
-raw"``f(x) = |x|``",
-raw"``f(x) = \sqrt{x}``"]
-answ = 1
-radioq(choices, answ)
-```
-
-
-###### Question
-
-Which of these commands will make a function for $f(x) = \sin(x + \pi/3)$?
-
-```julia; hold=true;echo=false;
-choices = [q"f = sin(x + pi/3)",
-q"function f(x) = sin(x + pi/3)",
-q"f(x) = sin(x + pi/3)",
-q"f: x -> sin(x + pi/3)",
-q"f x = sin(x + pi/3)"]
-answ = 3
-radioq(choices, answ)
-```
-
-###### Question
-
-Which of these commands will create a function for $f(x) = (1 + x^2)^{-1}$? 
-
-```julia; hold=true;echo=false;
-choices = [q"f(x) = (1 + x^2)^(-1)",
-q"function f(x) = (1 + x^2)^(-1)",
-q"f(x) := (1 + x^2)^(-1)",
-q"f[x] = (1 + x^2)^(-1)",
-q"def f(x): (1 + x^2)^(-1)"
-]
-answ = 1
-radioq(choices, answ)
-```
-
-
-###### Question
-
-Will the following `Julia` commands create a function for
-
-```math
-f(x) = \begin{cases}
-30 & x < 500\\
-30 + 0.10 \cdot (x-500) & \text{otherwise.}
-\end{cases}
-```
-
-```julia
-phone_plan(x) = x < 500 ? 30.0 : 30 + 0.10 * (x-500);
-```
-
-```julia; hold=true;echo=false;
-booleanq(true, labels=["Yes", "No"])
-```
-
-
-###### Question
-
-The expression `max(0, x)` will be `0` if `x` is negative, but otherwise will take the value of `x`. Is this the same?
-
-```julia
-a_max(x) = x < 0 ? x : 0.0;
-```
-
-
-```julia; hold=true;echo=false;
-yesnoq(false)
-```
-
-
-###### Question
-
-In statistics, the normal distribution has two parameters $\mu$ and $\sigma$ appearing as:
-
-```math
-f(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi\sigma}} e^{-\frac{1}{2}\frac{(x-\mu)^2}{\sigma}}.
-```
-
-Does this function implement this with the default values of $\mu=0$ and $\sigma=1$?
-
-```julia
-a_normal(x; mu=0, sigma=1) = 1/sqrt(2pi*sigma) * exp(-(1/2)*(x-mu)^2/sigma)
-```
-
-
-```julia; hold=true;echo=false;
-booleanq(true, labels=["Yes", "No"])
-```
-
-What value of $\mu$ is used if the function is called as `a_normal(x, sigma=2.7)`?
-
-```julia; hold=true;echo=false;
-numericq(0)
-```
-
-
-What value of $\mu$ is used if the function is called as `a_normal(x, mu=70)`?
-
-```julia; hold=true;echo=false;
-numericq(70)
-```
-
-What value of $\mu$ is used if the function is called as `a_normal(x, mu=70, sigma=2.7)`?
-
-```julia; hold=true;echo=false;
-numericq(70)
-```
-
-
-###### Question
-
-`Julia` has keyword arguments (as just illustrated) but also
-positional arguments. These are matched by how the function is
-called. 
For example,
-
-```julia;
-A(w, h) = w * h
-```
-
-when called as `A(10, 5)` will use `10` for `w` and `5` for `h`, as the
-order of `w` and `h` matches that of `10` and `5` in the call.
-
-This is clear enough, but in fact positional arguments can have
-default values (then called
-[optional](http://julia.readthedocs.org/en/latest/manual/functions/#optional-arguments)
-arguments). For example,
-
-```julia;
-B(w, h=5) = w * h
-```
-
-This actually creates two methods: `B(w,h)` for when the call is, say, `B(10,5)`, and `B(w)` when the call is `B(10)`.
-
-Suppose a function `C` is defined by
-
-```julia;
-C(x, mu=0, sigma=1) = 1/sqrt(2pi*sigma) * exp(-(1/2)*(x-mu)^2/sigma)
-```
-
-This is *nearly* identical to the last question, save for a comma
-instead of a semicolon after the `x`.
-
-What value of `mu` is used by the call `C(1, 70, 2.7)`?
-
-```julia; hold=true;echo=false;
-numericq(70)
-```
-
-What value of `mu` is used by the call `C(1, 70)`?
-
-```julia; hold=true;echo=false;
-numericq(70)
-```
-
-What value of `mu` is used by the call `C(1)`?
-
-```julia; hold=true;echo=false;
-numericq(0)
-```
-
-Will the call `C(1, mu=70)` use a value of `70` for `mu`?
-
-```julia; hold=true;echo=false;
-choices = ["Yes, this will work just as it does for keyword arguments",
-"No, there will be an error that the function does not accept keyword arguments"]
-answ = 2
-radioq(choices, answ)
-```
-
-###### Question
-
-This function mirrors that of the built-in `clamp` function:
-
-```julia
-klamp(x, a, b) = x < a ? a : (x > b ? b : x)
-```
-
-Can you tell what it does? 
-
-```julia; hold=true;echo=false;
-choices = [
-"If `x` is in `[a,b]` it returns `x`, otherwise it returns `a` when `x` is less than `a` and `b` when `x` is greater than `b`.",
-"If `x` is in `[a,b]` it returns `x`, otherwise it returns `NaN`",
-"`x` is the larger of the minimum of `x` and `a` and the value of `b`, aka `max(min(x,a),b)`"
-]
-answ = 1
-radioq(choices, answ)
-```
-
-
-###### Question
-
-`Julia` has syntax for the composition of functions $f$ and $g$ using the Unicode operator `∘` entered as `\circ[tab]`.
-
-The notation to call a composition follows the math notation, where parentheses are necessary to separate the act of composition from the act of calling the function:
-
-```math
-(f \circ g)(x)
-```
-
-
-For example
-
-```julia;
-(sin ∘ cos)(pi/4)
-```
-
-What happens if you forget the extra parentheses and were to call `sin ∘ cos(pi/4)`?
-
-
-```julia; hold=true;echo=false;
-choices = [
-L"You still get $0.649...$",
-"You get a `MethodError`, as `cos(pi/4)` is evaluated as a number and `∘` is not defined for functions and numbers",
-"You get a `generic` function, but this won't be callable. If tried, it will give a method error."
-]
-answ = 2
-radioq(choices, answ, keep_order=true)
-```
-
-###### Question
-
-The [pipe](http://julia.readthedocs.org/en/latest/stdlib/base/#Base.|>) notation `ex |> f` takes the output of `ex` and uses it as the input to the function `f`. That is composition. What is the value of this expression `1 |> sin |> cos`?
-
-```julia; hold=true;echo=false;
-choices = [
-"It is `0.6663667453928805`, the same as `cos(sin(1))`",
-"It is `0.5143952585235492`, the same as `sin(cos(1))`",
-"It gives an error"]
-answ = 1
-radioq(choices, answ, keep_order=true)
-```
-
-###### Question
-
-`Julia` has implemented this *limited* set of algebraic operations on functions: `∘` for *composition* and `!` for *negation*. (Read `!` as "not.") The latter is useful for "predicate" functions (ones that return either `true` or `false`). 
What is output by this command? - -```julia;eval=false -fn = !iseven -fn(3) -``` - -```julia; hold=true;echo=false; -choices = ["`true`","`false`"] -answ = 1 -radioq(choices, answ) -``` - -###### Question - -Generic functions in `Julia` allow many algorithms to work without change for different number types. For example, [3000](https://pdfs.semanticscholar.org/1ef4/ee58a159dc7e437e190ec2839fb9a654596c.pdf) years ago, floating point numbers wouldn't have been used to carry out the secant method computations, rather rational numbers would have been. We can see the results of using rational numbers with no change to our key function, just by starting with rational numbers for `a` and `b`: - - -```julia; hold=true; -secant_intersection(f, a, b) = b - f(b) * (b - a) / (f(b) - f(a)) # rewritten -f(x) = x^2 - 2 -a, b = 1//1, 2//1 -c = secant_intersection(f, a, b) -``` - -Now `c` is `4//3` and not `1.333...`. This works as the key operations -used: division, squaring, subtraction all have different -implementations for rational numbers that preserve this type. - -Repeat the secant method two more times to find a better approximation for $\sqrt{2}$. What is the value of `c` found? - -```julia; hold=true;echo=false; -choices = [q"4//3", q"7//5", q"58//41", q"816//577"] -answ = 3 -radioq(choices, answ, keep_order=true) -``` - -How small is the value of $f(c)$ for this value? - -```julia; hold=true;echo=false; -val = f(58/41) -numericq(val) -``` - -How close is this answer to the true value of $\sqrt{2}$? - -```julia; hold=true;echo=false; -choices = [L"about $8$ parts in $100$", L"about $1$ parts in $100$", L"about $4$ parts in $10,000$", L"about $2$ parts in $1,000,000$"] -answ = 3 -radioq(choices, answ, keep_order=true) -``` - -(Finding a good approximation to $\sqrt{2}$ would be helpful to builders, for example, as it could be used to verify the trueness of a square room, say.) - -###### Question - -`Julia` does not have surface syntax for the *difference* of functions. 
This is a common thing to want when solving equations. The tools available solve $f(x)=0$, but problems may present as solving for $h(x) = g(x)$ or even $h(x) = c$, for some constant. Which of these solutions is **not** helpful if $h$ and $g$ are already defined?
-
-```julia; hold=true;echo=false;
-choices = ["Just use `f = h - g`",
-"Define `f(x) = h(x) - g(x)`",
-"Use `x -> h(x) - g(x)` when the difference is needed"
-]
-answ = 1
-radioq(choices, answ)
-```
-
-###### Question
-
-Identifying the range of a function can be a difficult task. We see in this question that in some cases, a package can be of assistance.
-
-A mathematical interval is a set of values of the form
-
-* an open interval: ``a < x < b``, or ``(a,b)``;
-* a closed interval: ``a \leq x \leq b``, or ``[a,b]``;
-* or a half-open interval: ``a < x \leq b`` or ``a \leq x < b``, respectively ``(a,b]`` or ``[a,b)``.
-
-They all contain all real numbers between the endpoints; the distinction is whether the endpoints are included or not.
-
-A domain is some set, but typically that set is an interval such as *all real numbers* (``(-\infty,\infty)``), *all non-negative numbers* (``[0,\infty)``), or, say, *all positive numbers* (``(0,\infty)``).
-
-The `IntervalArithmetic` package provides an easy means to define closed intervals using the symbol `..`, but this is also used by the already loaded `CalculusWithJulia` package in a different manner, so we use the fully qualified named constructor in the following to construct intervals:
-
-```julia
-import IntervalArithmetic
-```
-
-```julia
-I1 = IntervalArithmetic.Interval(-Inf, Inf)
-```
-
-```julia
-I2 = IntervalArithmetic.Interval(0, Inf)
-```
-
-The main feature of the package is not to construct intervals, but rather to *rigorously* bound with an interval the image of a closed interval under a function. That is, for a function ``f`` and *closed* interval ``[a,b]``, a bound for the set ``\{f(x) \text{ for } x \text{ in } [a,b]\}``. 
When `[a,b]` is the domain of ``f``, then this is a bound for the range of ``f``.
-
-
-For example, the function ``f(x) = x^2 + 2`` has a domain of all real ``x``; the range can be found with:
-
-```julia
-ab = IntervalArithmetic.Interval(-Inf, Inf)
-u(x) = x^2 + 2
-u(ab)
-```
-
-For this problem, the actual range can easily be identified. Does the bound computed match exactly?
-
-```julia; hold=true; echo=false
-yesnoq("yes")
-```
-
-Does `sin(0..pi)` **exactly** match the interval of ``[-1,1]``?
-
-```julia; hold=true; echo=false
-yesnoq("no")
-```
-
-Guess why or why not?
-
-```julia; hold=true; echo=false
-choices = ["Well it does, because ``[-1,1]`` is the range",
-    """It does not. The bound found is a provably known bound. The small deviation is due to the possible errors in evaluation of the `sin` function near the floating point approximation of `pi`.
-"""]
-radioq(choices, 2)
-```
-
-Now consider the evaluation
-
-```julia; hold=true;
-f(x) = x^x
-I = IntervalArithmetic.Interval(0, Inf)
-f(I)
-```
-
-Make a graph of `f`. Does the interval found above provide a nearly exact estimate of the true range (as the previous two questions have)?
-
-```julia; hold=true; echo=false
-yesnoq("no")
-```
-
-Any thoughts on why?
-
-```julia; hold=true;echo=false
-choices = ["""
-The guarantee of `IntervalArithmetic` is a *bound* on the interval, not the *exact* interval. In the case where the variable `x` appears more than once, it is treated formulaically as an *independent* quantity (meaning it has its full set of values considered in each instance) which is not the actual case mathematically. This is the "dependence problem" in interval arithmetic.""",
-    """
-The interval is a nearly exact estimate, as guaranteed by `IntervalArithmetic`. 
-
-"""]
-radioq(choices, 1)
-```
diff --git a/CwJ/precalc/inversefunctions.jmd b/CwJ/precalc/inversefunctions.jmd
deleted file mode 100644
index 1db2406..0000000
--- a/CwJ/precalc/inversefunctions.jmd
+++ /dev/null
@@ -1,660 +0,0 @@
-# The Inverse of a Function
-
-In this section we will use these add-on packages:
-
-```julia
-using CalculusWithJulia
-using Plots
-```
-
-```julia; echo=false; results="hidden"
-using CalculusWithJulia.WeaveSupport
-
-const frontmatter = (
-    title = "The Inverse of a Function",
-    description = "Calculus with Julia: The Inverse of a Function",
-    tags = ["CalculusWithJulia", "precalc", "the inverse of a function"],
-);
-
-nothing
-```
-
-----
-
-A (univariate) mathematical function relates or associates values of
-$x$ to values $y$ using the notation $y=f(x)$. A key point is that a given
-$x$ is associated with just one $y$ value, though a given $y$ value
-may be associated with several different $x$ values. (Graphically, this is the vertical line test.)
-
-We may conceptualize such a relation in many ways: through an algebraic
-rule; through the graph of $f;$ through a description of what $f$
-does; or through a table of paired values, say. For the moment, let's
-consider a function as a rule that takes in a value of $x$ and outputs a
-value $y$. If a rule is given defining the function, the computation
-of $y$ is straightforward. A different question is not so easy: for a
-given value $y$, what value - or *values* - of $x$ (if any) produce an output of
-$y$? That is, what $x$ value(s) satisfy $f(x)=y$?
-
-*If* for each $y$ in some set of values there is just one $x$ value, then this operation associates to each value $y$ a single value $x$, so it too is a function. When that is the case we call this an *inverse* function.
-
-Why is this useful? When available, it can help us solve equations. If we can write our equation as $f(x) = y$, then we can "solve" for $x$ through $x = g(y)$, where $g$ is this inverse function.
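-
-For instance, $f(x) = x^3 + 1$ has the inverse $g(y) = \sqrt[3]{y - 1}$, so the equation $f(x) = 9$ is solved by evaluating $g$ at $9$ (a small numeric sketch using the built-in `cbrt`):
-
-```julia; hold=true;
-f(x) = x^3 + 1
-g(y) = cbrt(y - 1)   # the inverse function of f
-x = g(9)             # solves x^3 + 1 = 9, giving x = 2.0
-f(x)                 # recovers 9.0
-```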
-
-Let's explore when we can "solve" for an inverse function.
-
-Consider the graph of the function $f(x) = 2^x$:
-
-```julia;hold=true
-f(x) = 2^x
-plot(f, 0, 4, legend=false)
-plot!([2,2,0], [0,f(2),f(2)])
-```
-
-The graph of a function is a representation of points $(x,f(x))$, so
-to *find* $f(c)$ from the graph, we begin on the $x$ axis at $c$, move
-vertically to the graph (the point $(c, f(c))$), and then move
-horizontally to the $y$ axis, intersecting it at $f(c)$. The figure
-shows this for $c=2$, from which we can read that $f(c)$ is about
-$4$. This is how an $x$ is associated to a single $y$.
-
-If we were to *reverse* the direction, starting at $f(c)$ on the $y$
-axis and then moving horizontally to the graph, and then vertically to
-the $x$-axis, we end up at a value $c$ with the correct $f(c)$. This
-operation will form a function **if** the initial movement
-horizontally is guaranteed to find *no more than one* value on the graph. That
-is, to have an inverse function, there can not be two $x$ values
-corresponding to a given $y$ value. This observation is often
-visualized through the "horizontal line test" - the graph of a function with an
-inverse function can intersect a horizontal line in at most one
-place.
-
-
-More formally, a function is called *one-to-one* *if* for any two $a
-\neq b$, it must be that $f(a) \neq f(b)$. Many functions are
-one-to-one, many are not. Familiar one-to-one functions are linear functions ($f(x)=a \cdot x + b$ with $a\neq 0$), odd powers of $x$ ($f(x)=x^{2k+1}$), and functions of the form $f(x)=x^{1/n}$ for $x \geq 0$. In contrast, all *even* functions are *not* one-to-one, as $f(x) = f(-x)$ for any nonzero $x$ in the domain of $f$.
-
-A class of functions that are guaranteed to be one-to-one are the
-*strictly* increasing functions (which satisfy $a < b$ implies $f(a) <
-f(b)$). Similarly, strictly decreasing functions are one-to-one. 
The
-term strictly *monotonic* is used to describe either strictly
-increasing or strictly decreasing. By the above observations, a
-strictly monotonic function will have an inverse function.
-
-The function $2^x$, graphed above, is strictly increasing, so it will
-have an inverse function. That is, we can solve for $x$ in an equation
-like $2^x = 9$ using the inverse function of $f(x) = 2^x$, provided we
-can identify the inverse function.
-
-## How to solve for an inverse function?
-
-If we know an inverse function exists, how can we find it?
-
-If our function is given by a graph, the process above describes how to find the inverse function.
-
-However, typically we have a rule describing our function. What is the
-process then? A simple example helps illustrate. The *linear*
-function $f(x) = 9/5\cdot x + 32$ is strictly increasing, hence has an
-inverse function. What should it be? Let's describe the action of $f$:
-it multiplies $x$ by $9/5$ and then adds $32$. To "invert" this, we
-*first* invert the adding of $32$ by subtracting $32$; then we
-"invert" multiplying by $9/5$ by *dividing* by $9/5$. Hence $g(x)=(x-32)/(9/5)$.
-We would generally simplify this, but let's not for now. If we
-view a function as a composition of many actions, then we find the
-inverse by composing the inverse of these actions in **reverse**
-order. The reverse order might seem confusing, but this is how we get
-dressed and undressed: to dress we put on socks and then shoes. To
-undress we take off the shoes and then take off the socks.
-
-When we solve algebraically for $x$ in $y=9/5 \cdot x + 32$ we do the same thing as we do verbally: we subtract $32$ from each side, and then divide by $9/5$ to isolate $x$:
-
-```math
-\begin{align}
-y &= 9/5 \cdot x + 32\\
-y - 32 &= 9/5 \cdot x\\
-(y-32) / (9/5) &= x.
-\end{align}
-```
-
-From this, we see that the function $g(y) = (y-32) / (9/5)$ is the inverse function of $f(x) = 9/5\cdot x + 32$. 
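-
-As a quick numeric check of this algebra, composing the candidate inverse with $f$ should return the starting value (a small sketch; the names `f` and `g` mirror the text):
-
-```julia; hold=true;
-f(x) = 9/5 * x + 32
-g(y) = (y - 32) / (9/5)
-g(f(100)), f(g(212))   # both recover the inputs, up to floating point
-```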
-
-*Usually* univariate functions are written with $x$ as the dummy variable, so it is typical to write $g(x) = (x-32) / (9/5)$ as the inverse function.
-
-*Usually* we use the name $f^{-1}$ for the inverse function of $f$, so this would be most often [seen](http://tinyurl.com/qypbueb) as $f^{-1}(x) = (x-32)/(9/5)$ or after simplification $f^{-1}(x) = (5/9) \cdot (x-32)$.
-
-!!! note
-    The use of a negative exponent on the function name is *easily* confused for the notation for a reciprocal when it is used on a mathematical *expression*. An example might be the notation $(1/x)^{-1}$. As this is an expression this would simplify to $x$ and not the inverse of the *function* $f(x)=1/x$ (which is $f^{-1}(x) = 1/x$).
-
-
-
-##### Example
-
-Suppose a transformation of $x$ is given by $y = f(x) = (ax + b)/(cx+d)$. This function is invertible for most choices of the parameters. Find the inverse and describe its domain.
-
-From the expression $y=f(x)$ we *algebraically* solve for $x$:
-
-```math
-\begin{align*}
-y &= \frac{ax +b}{cx+d}\\
-y \cdot (cx + d) &= ax + b\\
-ycx - ax &= b - yd\\
-(cy-a) \cdot x &= b - dy\\
-x &= -\frac{dy - b}{cy-a}.
-\end{align*}
-```
-
-We see that to solve for $x$ we need to divide by $cy-a$, so this expression can not be zero.
-So, using $x$ as the dummy variable, we have
-
-```math
-f^{-1}(x) = -\frac{dx - b}{cx-a},\quad cx-a \neq 0.
-```
-
-
-
-
-##### Example
-
-The function $f(x) = (x-1)^5 + 2$ is strictly increasing and so will have an inverse function. Find it.
-
-Again, we solve algebraically starting with $y=(x-1)^5 + 2$ and solving for $x$:
-
-```math
-\begin{align*}
-y &= (x-1)^5 + 2\\
-y - 2 &= (x-1)^5\\
-(y-2)^{1/5} &= x - 1\\
-(y-2)^{1/5} + 1 &= x.
-\end{align*}
-```
-
-We see that $f^{-1}(x) = 1 + (x - 2)^{1/5}$. The fact that the power $5$ is an odd power is important, as this ensures the fifth root of a value - in the above, $y-2$ - has a unique (real) solution. 
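-
-The last example can also be checked numerically. Since `^(1/5)` raises a `DomainError` for negative real bases, this sketch builds a real fifth root with `copysign` (the helper name `fifthroot` is ours):
-
-```julia; hold=true;
-f(x) = (x - 1)^5 + 2
-fifthroot(y) = copysign(abs(y)^(1/5), y)   # real fifth root, valid for negative y too
-finv(y) = 1 + fifthroot(y - 2)
-finv(f(3)), finv(f(-1))   # ≈ (3.0, -1.0)
-```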
-
-##### Example
-
-The function $f(x) = x^x, x \geq 1/e$ is strictly
-increasing. However, trying to algebraically solve for an inverse
-function will quickly run into problems (without using specially
-defined functions). The existence of an inverse does not imply there
-will always be luck in trying to find a mathematical rule defining the
-inverse.
-
-## Functions which are not always invertible
-
-Consider the function $f(x) = x^2$. The graph - a parabola - is clearly not *monotonic*. Hence no inverse function exists. Yet, we can solve equations $y=x^2$ quite easily: $x=\sqrt{y}$ *or* $x=-\sqrt{y}$. We know the square root undoes the squaring, but we need to be a little more careful to say the square root is the inverse of the squaring function.
-
-The issue is that there are generally *two* possible answers. To avoid this, we might choose to only take the *non-negative* answer. To make this all work as above, we restrict the domain of $f(x)$ and now consider the related function $f(x)=x^2, x \geq 0$. This is now a monotonic function, so it will have an inverse function. This is clearly $f^{-1}(x) = \sqrt{x}$. (The ``\sqrt{x}`` being defined as the principal square root or the unique *non-negative* answer to ``u^2-x=0``.)
-
-The [inverse function theorem](https://en.wikipedia.org/wiki/Inverse_function_theorem) basically says that if $f$ is *locally* monotonic, then an inverse function will exist *locally*. By "local" we mean in a neighborhood of a point $c$.
-
-##### Example
-
-Consider the function $f(x) = (1+x^2)^{-1}$. This bell-shaped function is even (symmetric about $0$), so can not possibly be one-to-one. However, if the domain is restricted to $[0,\infty)$ it is. The restricted function is strictly decreasing and its inverse is found, as follows:
-
-
-```math
-\begin{align*}
-y &= \frac{1}{1 + x^2}\\
-1+x^2 &= \frac{1}{y}\\
-x^2 &= \frac{1}{y} - 1\\
-x &= \sqrt{(1-y)/y}, \quad 0 < y \leq 1. 
-\end{align*}
-```
-
-Then $f^{-1}(x) = \sqrt{(1-x)/x}$ where $0 < x \leq 1$. The somewhat
-complicated restriction for the domain coincides with the range of
-$f(x)$. We shall see next that this is no coincidence.
-
-## Formal properties of the inverse function
-
-Consider again the graph of a monotonic function, in this case $f(x) = x^2 + 2, x \geq 0$:
-
-```julia;hold=true
-f(x) = x^2 + 2
-plot(f, 0, 4, legend=false)
-plot!([2,2,0], [0,f(2),f(2)])
-```
-
-The graph is shown over the interval $(0,4)$, but the *domain* of $f(x)$ is all $x \geq 0$. The *range* of $f(x)$ is clearly $2 \leq y < \infty$.
-
-The lines layered on the plot show how to associate an $x$ value to a $y$ value or vice versa (as $f(x)$ is one-to-one). The domain of the inverse function, then, is all the $y$ values for which a corresponding $x$ value exists: this is clearly all values greater than or equal to $2$. The *range* of the inverse function can be seen to be all the images for these values of $y$, which would be all $x \geq 0$. This gives the relationship:
-
-> the *range* of $f(x)$ is the *domain* of $f^{-1}(x)$; furthermore, the *domain* of $f(x)$ is the *range* for $f^{-1}(x)$.
-
-From this we can see that if we start at $x$ and apply $f$ we get $y$; if we then apply $f^{-1}$ we will get back to $x$. So we have:
-
-> For all $x$ in the domain of $f$: $f^{-1}(f(x)) = x$.
-
-Similarly, were we to start on the $y$ axis, we would see:
-
-> For all $x$ in the domain of $f^{-1}$: $f(f^{-1}(x)) = x$.
-
-In short, $f^{-1} \circ f$ and $f \circ f^{-1}$ are both identity functions, though on possibly different domains.
-
-## The graph of the inverse function
-
-The graph of $f(x)$ is a representation of all values $(x,y)$ where $y=f(x)$. As the inverse flips around the role of $x$ and $y$ we have:
-
-> If $(x,y)$ is a point on the graph of $f(x)$, then $(y,x)$ will be a point on the graph of $f^{-1}(x)$.
-
-
-Let's see this in action. Take the function $2^x$. 
We can plot it by generating points to plot as follows:
-
-```julia; hold=true;
-f(x) = 2^x
-xs = range(0, 2, length=50)
-ys = f.(xs)
-plot(xs, ys, color=:blue, label="f")
-plot!(ys, xs, color=:red, label="f⁻¹") # the inverse
-```
-
-By flipping around the $x$ and $y$ values in the `plot!` command, we
-produce the graph of the inverse function - when viewed as a function
-of $x$. We can see that the domain of the inverse function (in red) is
-clearly different from that of the function (in blue).
-
-The inverse function graph can be viewed as a symmetry of the graph of
-the function. Flipping the graph for $f(x)$ around the line $y=x$ will
-produce the graph of the inverse function. Here we see this for the graph
-of $f(x) = x^{1/3}$ and its inverse function:
-
-```julia; hold=true;
-f(x) = cbrt(x)
-xs = range(-2, 2, length=150)
-ys = f.(xs)
-plot(xs, ys, color=:blue, aspect_ratio=:equal, legend=false)
-plot!(ys, xs, color=:red)
-plot!(identity, color=:green, linestyle=:dash)
-x, y = 1/2, f(1/2)
-plot!([x,y], [y,x], color=:green, linestyle=:dot)
-```
-
-We drew a line connecting $(1/2, f(1/2))$ to $(f(1/2),1/2)$. We can
-see that it crosses the line $y=x$ perpendicularly, indicating that
-points are symmetric about this line. (The plotting argument
-`aspect_ratio=:equal` ensures that the $x$ and $y$ axes are on the
-same scale, so that this type of line will look perpendicular.)
-
-
-One consequence of this symmetry is that if $f$ is strictly increasing, then so is its inverse.
-
-
-!!! note
-    In the above we used `cbrt(x)` and not `x^(1/3)`. The latter usage assumes that $x \geq 0$ as it isn't guaranteed that for all real exponents the answer will be a real number. The `cbrt` function knows there will always be a real answer and provides it.
-
-
-### Lines
-
-The slope of $f(x) = 9/5 \cdot x + 32$ is clearly $9/5$ and the slope of the inverse function $f^{-1}(x) = 5/9 \cdot (x-32)$ is clearly $5/9$ - or the reciprocal. 
This makes sense, as the slope is the rise over the run, and by flipping the $x$ and $y$ values we merely flip over the rise and the run.
-
-Now consider the graph of the *tangent line* to a function. This concept will be better defined later; for now, it is a line "tangent" to the graph of $f(x)$ at a point $x=c$.
-
-For concreteness, we consider $f(x) = \sqrt{x}$ at $c=2$. The tangent line will have slope $1/(2\sqrt{2})$ and will go through the point $(2, f(2))$. We graph the function, its tangent line, and their inverses:
-
-```julia; hold=true;
-f(x) = sqrt(x)
-c = 2
-tl(x) = f(c) + 1/(2 * sqrt(2)) * (x - c)
-xs = range(0, 3, length=150)
-ys = f.(xs)
-zs = tl.(xs)
-plot(xs, ys, color=:blue, legend=false)
-plot!(xs, zs, color=:blue) # the tangent line
-plot!(ys, xs, color=:red) # the inverse function
-plot!(zs, xs, color=:red) # inverse of tangent line
-```
-
-What do we see? In blue, we can see the familiar square root graph along with a "tangent" line through the point $(2, f(2))$. The red graph of $f^{-1}(x) = x^2, x \geq 0$ is seen and, perhaps surprisingly, a tangent line. This is at the point $(f(2), 2)$. We know the slope of this tangent line is the reciprocal of the slope of the blue tangent line. This gives this informal observation:
-
-> If the graph of $f(x)$ has a tangent line at $(c, f(c))$ with slope $m$, then the graph of $f^{-1}(x)$ will have a tangent line at $(f(c), c)$ with slope $1/m$.
-
-This is reminiscent of the formula for the slope of a perpendicular line, $-1/m$, but quite different, as this formula implies the two lines have either both positive slopes or both negative slopes, unlike the relationship in slopes between a line and a perpendicular line.
-
-The key here is that the shape of $f(x)$ near $x=c$ is somewhat related to the shape of $f^{-1}(x)$ at $f(c)$. 
In this case, if we use the tangent line as a fill-in for how steep a function is, we see from the relationship that if $f(x)$ is "steep" at $x=c$, then $f^{-1}(x)$ will be "shallow" at $x=f(c)$.
-
-
-
-## Questions
-
-###### Question
-
-Is it possible for a function to have two different inverses?
-
-```julia; hold=true; echo=false
-choices = [L"No, for all $x$ in the domain of an inverse, the value of any inverse will be the same, hence all inverse functions would be identical.",
-L"Yes, the function $f(x) = x^2, x \geq 0$ will have a different inverse than the same function $f(x) = x^2, x \leq 0$"]
-answ = 1
-radioq(choices, answ)
-```
-
-###### Question
-
-A function takes a value $x$, adds $1$, divides by $2$, and then subtracts $1$. Is the function "one-to-one"?
-
-```julia; hold=true; echo=false
-choices = [L"Yes, the function is the linear function $f(x)=(x+1)/2 - 1$ and so is monotonic.",
-L"No, the function is $1$ then $2$ then $1$, but not \"one-to-one\""
-]
-answ = 1
-radioq(choices, answ)
-```
-
-###### Question
-
-Is the function $f(x) = x^5 - x - 1$ one-to-one?
-
-```julia; hold=true; echo=false
-choices=[L"Yes, a graph over $(-100, 100)$ will show this.",
-L"No, a graph over $(-2,2)$ will show this."
-]
-answ = 2
-radioq(choices, answ)
-```
-
-###### Question
-
-A function is given by the table
-
-```verbatim
-x | y
---------
-1 | 3
-2 | 4
-3 | 5
-4 | 3
-5 | 4
-6 | 5
-```
-
-Is the function one-to-one?
-
-```julia; hold=true; echo=false
-yesnoq(false)
-```
-
-###### Question
-
-A function is defined by its graph.
-
-```julia; hold=true; echo=false
-f(x) = x - sin(x)
-plot(f, 0, 6pi)
-```
-
-Over the domain shown, is the function one-to-one?
-
-```julia; hold=true; echo=false
-yesnoq(true)
-```
-
-
-###### Question
-
-Suppose $f(x) = x^{-1}$.
-
-What is $g(x) = (f(x))^{-1}$?
-
-```julia; hold=true; echo=false
-choices = ["``g(x) = x``", "``g(x) = x^{-1}``"]
-answ = 1
-radioq(choices, answ)
-```
-
-What is $g(x) = f^{-1}(x)$?
-
-```julia; hold=true; echo=false
-choices = ["``g(x) = x``", "``g(x) = x^{-1}``"]
-answ = 2
-radioq(choices, answ)
-```
-
-###### Question
-
-A function, $f$, is given by its graph:
-
-```julia; echo=false;
-k(x) = sin(pi/4 * x)
-plot(k, -2, 2)
-```
-
-What is the value of $f(1)$?
-
-```julia; hold=true; echo=false
-val = k(1)
-numericq(val, 0.2)
-```
-
-What is the value of $f^{-1}(1)$?
-
-```julia; hold=true; echo=false
-val = 2
-numericq(val, 0.2)
-```
-
-What is the value of $(f(1))^{-1}$?
-
-```julia; hold=true; echo=false
-val = 1/k(1)
-numericq(val, 0.2)
-```
-
-What is the value of $f^{-1}(1/2)$?
-
-```julia; hold=true; echo=false
-val = 2/3
-numericq(val, 0.2)
-```
-
-###### Question
-
-A function is described as follows: for $x > 0$ it takes the square root, adds $1$ and divides by $2$.
-
-What is the inverse of this function?
-
-```julia; hold=true; echo=false
-choices=[
-L"The function that multiplies by $2$, subtracts $1$ and then squares the value.",
-L"The function that divides by $2$, adds $1$, and then takes the square root of the value.",
-L"The function that takes the square of the value, then subtracts $1$, and finally multiplies by $2$."
-]
-answ = 1
-radioq(choices, answ)
-```
-
-###### Question
-
-A function, $f$, is specified by a table:
-
-```verbatim
-x | y
--------
-1 | 2
-2 | 3
-3 | 5
-4 | 8
-5 | 13
-```
-
-What is $f(3)$?
-
-```julia; hold=true; echo=false
-numericq(5)
-```
-
-What is $f^{-1}(3)$?
-
-```julia; hold=true; echo=false
-numericq(2)
-```
-
-What is $f(5)^{-1}$?
-
-```julia; hold=true; echo=false
-numericq(1/13)
-```
-
-What is $f^{-1}(5)$?
-
-```julia; hold=true; echo=false
-numericq(3)
-```
-
-###### Question
-
-Find the inverse function of $f(x) = (x^3 + 4)/5$.
-
-```julia; hold=true; echo=false
-choices = [
-"``f^{-1}(x) = (5x-4)^{1/3}``",
-"``f^{-1}(x) = (5x-4)^3``",
-"``f^{-1}(x) = 5/(x^3 + 4)``"
-]
-answ = 1
-radioq(choices, answ)
-```
-
-
-###### Question
-
-Find the inverse function of $f(x) = x^\pi + e, x \geq 0$.
- -```julia; hold=true; echo=false -choices = [ -raw"``f^{-1}(x) = (x-e)^{1/\pi}``", -raw"``f^{-1}(x) = (x-\pi)^{e}``", -raw"``f^{-1}(x) = (x-e)^{\pi}``" -] -answ = 1 -radioq(choices, answ) -``` - -###### Question -What is the *domain* of the inverse function for $f(x) = x^2 + 7, x \geq 0$? - -```julia; hold=true; echo=false -choices = [ -raw"``[7, \infty)``", -raw"``(-\infty, \infty)``", -raw"``[0, \infty)``"] -answ = 1 -radioq(choices, answ) -``` - -###### Question - -What is the *range* of the inverse function for $f(x) = x^2 + 7, x \geq 0$? - - -```julia; hold=true; echo=false -choices = [ -raw"``[7, \infty)``", -raw"``(-\infty, \infty)``", -raw"``[0, \infty)``"] -answ = 3 -radioq(choices, answ) -``` - -###### Question - -From the plot, are blue and red inverse functions? - -```julia; hold=true; echo=false -f(x) = x^3 -xs = range(0, 2, length=100) -ys = f.(xs) -plot(xs, ys, color=:blue, legend=false) -plot!(ys, xs, color=:red) -plot!(x->x, linestyle=:dash) -``` - -```julia; hold=true; echo=false -yesnoq(true) -``` - - -From the plot, are blue and red inverse functions? - -```julia; hold=true; echo=false -f(x) = x^3 - x - 1 -xs = range(-2,2, length=100) -ys = f.(xs) -plot(xs, ys, color=:blue, legend=false) -plot!(-xs, -ys, color=:red) -plot!(x->x, linestyle=:dash) -``` - -```julia; hold=true; echo=false -yesnoq(false) -``` - - -###### Question - -The function $f(x) = (ax + b)/(cx + d)$ is known as a [Mobius](http://tinyurl.com/oemweyj) transformation and can be expressed as a composition of $4$ functions, $f_4 \circ f_3 \circ f_2 \circ f_1$: - -* where $f_1(x) = x + d/c$ is a translation, -* where $f_2(x) = x^{-1}$ is inversion and reflection, -* where $f_3(x) = ((bc-ad)/c^2) \cdot x$ is scaling, -* and $f_4(x) = x + a/c$ is a translation. - -For $x=10$, what is $f(10)$? 
-
-```julia; echo=false
-𝒂,𝒃,𝒄,𝒅 = 1,2,3,5
-f1(x) = x + 𝒅/𝒄; f2(x) = 1/x; f3(x) = (𝒃*𝒄-𝒂*𝒅)/𝒄^2 * x; f4(x)= x + 𝒂/𝒄
-𝒇(x;a=𝒂,b=𝒃,c=𝒄,d=𝒅) = (a*x+b) / (c*x + d)
-numericq(𝒇(10))
-```
-
-For $x=10$, what is $f_4(f_3(f_2(f_1(10))))$?
-
-```julia; hold=true; echo=false
-numericq(f4(f3(f2(f1(10)))))
-```
-
-The last two answers should be the same, why?
-
-```julia; hold=true; echo=false
-choices = [
-    L"As $f_4(f_3(f_2(f_1(x))))=(f_4 \circ f_3 \circ f_2 \circ f_1)(x)$",
-    L"As $f_4(f_3(f_2(f_1(x))))=(f_1 \circ f_2 \circ f_3 \circ f_4)(x)$",
-    "As the latter is more complicated than the former."
-]
-answ=1
-radioq(choices, answ)
-```
-
-
-Let $g_1$, $g_2$, $g_3$, and $g_4$ denote the inverse functions. Clearly, $g_1(x) = x - d/c$ and $g_4(x) = x - a/c$, as the inverse of adding a constant is subtracting the constant.
-
-What is $g_2(x)=f_2^{-1}(x)$?
-
-```julia; hold=true; echo=false
-choices = ["``g_2(x) = x^{-1}``", "``g_2(x) = x``", "``g_2(x) = x -1``"]
-answ = 1
-radioq(choices, answ)
-```
-
-What is $g_3(x)=f_3^{-1}(x)$?
-
-```julia; hold=true; echo=false
-choices = [
-    raw"``c^2/(b\cdot c - a\cdot d) \cdot x``",
-    raw"``(b\cdot c-a\cdot d)/c^2 \cdot x``",
-    raw"``c^2 x``"]
-answ = 1
-radioq(choices, answ)
-```
-
-Given these, what is the value of $g_4(g_3(g_2(g_1(f_4(f_3(f_2(f_1(10))))))))$?
-
-```julia; echo=false
-g1(x) = x - 𝒅/𝒄; g2(x) = 1/x; g3(x) = 1/((𝒃*𝒄-𝒂*𝒅)/𝒄^2) *x; g4(x)= x - 𝒂/𝒄
-val1 = g4(g3(g2(g1(f4(f3(f2(f1(10))))))))
-numericq(val1)
-```
-
-What about the value of $g_1(g_2(g_3(g_4(f_4(f_3(f_2(f_1(10))))))))$?
-
-```julia; hold=true; echo=false
-val = g1(g2(g3(g4(f4(f3(f2(f1(10))))))))
-numericq(val)
-```
diff --git a/CwJ/precalc/julia_overview.jmd b/CwJ/precalc/julia_overview.jmd
deleted file mode 100644
index 091b281..0000000
--- a/CwJ/precalc/julia_overview.jmd
+++ /dev/null
@@ -1,570 +0,0 @@
-# Overview of Julia commands
-
-```julia; echo=false; results="hidden"
-using CalculusWithJulia
-using CalculusWithJulia.WeaveSupport
-const frontmatter = (
-    title = "Overview of Julia commands",
-    description = "Calculus with Julia: Overview of Julia commands",
-    tags = ["CalculusWithJulia", "precalc", "overview of julia commands"],
-);
-
-nothing
-```
-
-The [`Julia`](http://www.julialang.org) programming language is well suited as a computer accompaniment while learning the concepts of calculus. The following overview covers the language-specific aspects of the pre-calculus part of the [Calculus with Julia](https://calculuswithjulia.github.io) notes.
-
-## Installing `Julia`
-
-`Julia` is an *open source* project which allows anyone with a supported computer to use it. To install locally, the [downloads](https://julialang.org/downloads/) page has several different binaries for installation. Additionally, the downloads page contains a link to a docker image. For Microsoft Windows, the new [juliaup](https://github.com/JuliaLang/juliaup) installer may be of interest; it is available from the Windows Store.
-`Julia` can also be compiled from source.
-
-`Julia` can also be run through the web. The [https://mybinder.org/](https://mybinder.org/) service in particular allows free access, though limited in terms of allotted memory and with a relatively short timeout for inactivity.
-
-
-[Launch Binder](https://mybinder.org/v2/gh/CalculusWithJulia/CwJScratchPad.git/master)
-
-## Interacting with `Julia`
-
-At a basic level, `Julia` provides a means to read commands or instructions, evaluate those commands, and then print or return the results. 
At a user level, there are many different ways to interact with the reading and printing. For example:
-
-* The REPL. The `Julia` terminal is the built-in means to interact with `Julia`. A `Julia` terminal has a command prompt, after which commands are typed and then sent to be evaluated by the `enter` key. The terminal may look something like the following where `2+2` is evaluated:
-
-----
-
-```verbatim
-$ julia
-               _
-   _       _ _(_)_     |  Documentation: https://docs.julialang.org
-  (_)     | (_) (_)    |
-   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
-  | | | | | | |/ _` |  |
-  | | |_| | | | (_| |  |  Version 1.7.0 (2021-11-30)
- _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
-|__/                   |
-
-julia> 2 + 2
-4
-```
-
-----
-
-* An IDE. For programmers, an integrated development environment is often used to manage bigger projects. `Julia` has `Juno` and `VSCode`.
-
-* A notebook. The [Project Jupyter](https://jupyter.org/) provides a notebook interface for interacting with `Julia` and a more `IDE` style `jupyterlab` interface. A Jupyter notebook has cells where commands are typed and immediately following is the printed output returned by `Julia`. The output of a cell depends on the state of the kernel when the cell is computed, not the order of the cells in the notebook. Cells have a number attached, showing the execution order. The `Jupyter` notebook is used by `binder` and can be used locally through the `IJulia` package. This notebook has the ability to display many different types of outputs in addition to plain text, such as images, marked up math text, etc.
-
-* The [Pluto](https://github.com/fonsp/Pluto.jl) package provides a *reactive* notebook interface. Reactive means when one "cell" is modified and executed, the new values cascade to all other dependent cells which in turn are updated. This is very useful for exploring a parameter space, say. 
Pluto notebooks can be exported as HTML files which make them easy to read online and -- by clever design -- embed the `.jl` file that can run through `Pluto` if it is downloaded. - - -The `Pluto` interface has some idiosyncrasies that need explanation: - -* Cells can only have one command within them. Multiple-command cells must be contained in a `begin` block or a `let` block. - -* By default, the cells are *reactive*. This means when a variable in one cell is changed, then any references to that variable are also updated -- like a spreadsheet. This is fantastic for updating several computations at once. However, it means variable names cannot be repeated within a page. Pedagogically, it is convenient to use variable names and function names (e.g., `x` and `f`) repeatedly, but this is only possible *if* they are within a `let` block or a function body. - -* To not repeat names, but to be able to reference a value from cell-to-cell, some Unicode variants are used within a page. Visually these look familiar, but typing the names requires some understanding of Unicode input. The primary usages are *bold italic* (e.g., `\bix[tab]` or `\bif[tab]`) or *bold face* (e.g. `\bfx[tab]` or `\bff[tab]`). - -* The notebooks snapshot the packages they depend on, which is great for reproducibility, but may mean older versions are silently used. - - - - -## Augmenting base `Julia` - -The base `Julia` installation has many features, but leaves many others to `Julia`'s package ecosystem. These notes use packages to provide plotting, symbolic math, access to special functions, numeric routines, and more. - -Within `Pluto`, using add-on packages is very simple, as `Pluto` downloads and installs packages when they are requested through a `using` or `import` directive. - ----- - -For other interfaces to `Julia`, some more detail is needed. - -The `Julia` package manager makes add-on packages very easy to install.
- -`Julia` comes with just a few built-in packages, one being `Pkg` which manages subsequent package installation. To add more packages, we first must *load* the `Pkg` package. This is done by issuing the following command: - -```julia -using Pkg -``` - -The `using` command loads the specified package and makes all its *exported* values available for direct use. There is also the `import` command which allows the user to select which values should be imported from the package, if any, and otherwise gives access to the new functionality through the dot syntax. - -Packages need to be loaded just once per session. - -To use `Pkg` to "add" another package, we would have a command like: - -```julia; eval=false -Pkg.add("CalculusWithJulia") -``` - -This command instructs `Julia` to look at its *general registry* for the `CalculusWithJulia.jl` package, download it, then install it. Once installed, a package only needs to be brought into play with the `using` or `import` commands. - -!!! note - In a terminal setting, there is a package mode, entered by typing `]` as the leading character and exited by pressing the backspace key at a blank line. This mode allows direct access to `Pkg` with a simpler syntax. The command above would be just `add CalculusWithJulia`. - - - -Packages can be updated through the command `Pkg.up()`, and removed with `Pkg.rm(pkgname)`. - -By default, packages are installed in a common area. It may be desirable to keep packages for projects isolated. For this, the `Pkg.activate` command can be used. This feature allows a means to have reproducible environments even if `Julia` or the packages used are upgraded, possibly introducing incompatibilities.
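A short sketch of this workflow (the project name `MyCalcProject` is made up for illustration): once a project is activated, subsequent `Pkg` commands record packages in that project's `Project.toml` rather than in the default shared environment.

```julia; eval=false
using Pkg
Pkg.activate("MyCalcProject")  # create, or re-use, this project's environment
Pkg.add("Plots")               # recorded in MyCalcProject/Project.toml only
Pkg.status()                   # lists the packages of the *active* environment
```

Starting `Julia` later with `julia --project=MyCalcProject` (or calling `Pkg.activate` again) restores the same set of packages.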
- - -For these notes, the following packages, among others, are used: - -```julia; eval=false -Pkg.add("CalculusWithJulia") # for some simplifying functions and a few packages (SpecialFunctions, ForwardDiff) -Pkg.add("Plots") # for basic plotting -Pkg.add("SymPy") # for symbolic math -Pkg.add("Roots") # for solving `f(x)=0` -Pkg.add("QuadGK") # for integration -Pkg.add("HCubature") # for higher-dimensional integration -``` - - -## `Julia` commands - -In a `Jupyter` notebook or `Pluto` notebook, commands are typed into a -notebook cell: - -```julia; -2 + 2 # use shift-enter to evaluate -``` - -Commands are executed by using `shift-enter` or a run button near the cell. - -In `Jupyter`, multiple commands per cell are allowed. In `Pluto`, a `begin` or `let` block is used to collect multiple commands into a single cell. -Commands may be separated by new lines or semicolons. - -On a given line, anything **after** a `#` is a *comment* and is not processed. - -The results of the last command executed will be displayed in an -output area. Separating values by commas allows more than one value to be -displayed. Plots are displayed when the plot object is returned by the last executed command. - -In `Jupyter`, the state of the notebook is determined by the cells -executed along with their order. The state of a `Pluto` notebook is a -result of all the cells in the notebook being executed. The cell order -does not impact this and can be rearranged by the user. - -## Numbers, variable types - -`Julia` has many different number types beyond the floating point type employed by most calculators. These include - -* Floating point numbers: `0.5` - -* Integers: `2` - -* Rational numbers: `1//2` - -* Complex numbers: `2 + 0im` - -`Julia`'s parser finds the appropriate type for the value when read in.
The following all create the number ``1`` first as an integer, then a rational, then a floating point number, again as floating point number, and finally as a complex number: - -```julia -1, 1//1, 1.0, 1e0, 1 + 0im -``` - - -As much as possible, operations involving certain types of numbers will produce output of a given type. For example, both of these divisions produce a floating point answer, even though mathematically, they need not: - -```julia; -2/1, 1/2 -``` - - -Some powers with negative bases, like `(-3.0)^(1/3)`, are not defined. However, `Julia` provides the special-case function `cbrt` (and `sqrt`) for handling these. - -Integer operations may silently overflow, producing odd answers, at first glance: - -```julia; -2^64 -``` - -(Though the output is predictable, if overflow is taken into consideration appropriately.) - -When different types of numbers are mixed, `Julia` will usually promote the values to a common type before the operation: - -```julia; -(2 + 1//2) + 0.5 -``` - -`Julia` will first add `2` and `1//2` promoting `2` to rational before doing so. Then add the result, `5//2` to `0.5` by promoting `5//2` to the floating point number `2.5` before proceeding. - -`Julia` uses a special type to store a handful of irrational constants such as `pi`. The special type allows these constants to be treated without round off, until they mix with other floating point numbers. There are some functions that require these be explicitly promoted to floating point. This can be done by calling `float`. - -The standard mathematical operations are implemented by `+`, `-`, `*`, `/`, `^`. Parentheses are used for grouping. - - -### Vectors - -A vector is an indexed collection of similarly typed values. Vectors can be constructed with square brackets (syntax for concatenation): - -```julia; -[1, 1, 2, 3, 5, 8] -``` - -Values will be promoted to a common type (or type `Any` if none exists). 
For example, this vector will have type `Float64` due to the `1/3` computation: - -```julia -[1, 1//2, 1/3] -``` - - -(Vectors are used as a return type from some functions, as such, some familiarity is needed.) - -Regular arithmetic sequences can be defined by either: - -* Range operations: `a:h:b` or `a:b` which produces a generator of values starting at `a` separated by `h` (`h` is `1` in the last form) until they reach `b`. - -* The `range` function: `range(a, b, length=n)` which produces a generator of `n` values between `a` and `b`; - -These constructs return range objects. A range object *compactly* stores the values it references. To see all the values, they can be collected with the `collect` function, though this is rarely needed in practice. - - -Random sequences are formed by `rand`, among others: - -```julia -rand(3) -``` - -The call `rand()` returns a single random number (in ``[0,1)``.) - -## Variables - -Values can be assigned variable names, with `=`. There are some variants - -```julia; -u = 2 -a_really_long_name = 3 -a0, b0 = 1, 2 # multiple assignment -a1 = a2 = 0 # chained assignment, sets a2 and a1 to 0 -``` - -The names can be short, as above, or more verbose. Variable names can't start -with a number, but can include numbers. Variables can also include -[Unicode](../misc/unicode.html) or even be an emoji. - -```julia; -α, β = π/3, π/4 -``` -We can then use the variables to reference the values: - -```julia; -u + a_really_long_name + a0 - b0 + α -``` - -Within `Pluto`, names are idiosyncratic: within the global scope, only a single usage is possible per notebook; functions and variables can be freely renamed; structures can be redefined or renamed; ... - -Outside of `Pluto`, names may be repurposed, even with values of -different types (`Julia` is a dynamic language), save for (generic) function -names, which have some special rules and can only be redefined as -another function. Generic functions are central to `Julia`'s -design. 
Generic functions use a method table to dispatch on, so once a -name is assigned to a generic function, it cannot be used as a -variable name; the reverse is also true. - -## Functions - -Functions in `Julia` are first-class objects. In these notes, we often pass them as arguments to other functions. There are many built-in functions and it is easy to define new functions. - -We "call" a function by passing argument(s) to it, grouped by parentheses: - -```julia; -sqrt(10) -sin(pi/3) -log(5, 100) # log base 5 of 100 -``` - -Without parentheses, the name (usually) refers to a generic name and the output lists the number of available implementations (methods). - -```julia; -log -``` - -### Built-in functions - -`Julia` has numerous built-in [mathematical](http://julia.readthedocs.io/) functions; we review a few here: - -#### Powers, logs, and roots - -Besides `^`, there are `sqrt` and `cbrt` for roots. In addition, there are basic functions for exponential and logarithmic functions: - -```verbatim -sqrt, cbrt -exp -log # base e -log10, log2 # also log(b, x) -``` - -#### Trigonometric functions - -The ``6`` standard trig functions are implemented, as are versions for degree arguments, the inverse functions, and the hyperbolic analogs. - -```verbatim -sin, cos, tan, csc, sec, cot -asin, acos, atan, acsc, asec, acot -sinh, cosh, tanh, csch, sech, coth -asinh, acosh, atanh, acsch, asech, acoth -``` - -If degrees are preferred, the following are defined to work with arguments in degrees: - -```verbatim -sind, cosd, tand, cscd, secd, cotd -``` - - -#### Useful functions - -Other useful and familiar functions are defined: - -- `abs`: absolute value - -- `sign`: is ``\lvert x \rvert/x`` except at ``x=0``, where it is ``0``.
- -- `floor`, `ceil`: the greatest integer less than or equal to, or the least integer greater than or equal to, the value - -- `max(a,b)`, `min(a,b)`: larger (or smaller) of `a` or `b` - -- `maximum(xs)`, `minimum(xs)`: largest or smallest of the collection referred to by `xs` - ----- - -In a Pluto session, the "Live docs" area shows inline documentation for the current object. - -For other uses of `Julia`, the built-in documentation for an object is -accessible through a leading `?`, say, `?sign`. There is also the `@doc` macro, for example: - -```julia; eval=false -@doc sign -``` - ----- - -### User-defined functions - -Simple mathematical functions can be defined using standard mathematical notation: - -```julia; -f(x) = -16x^2 + 100x + 2 -``` - -The argument `x` is passed into the body of the function. - -Other values are found from the environment where defined: - -```julia; hold=true; -a = 1 -f(x) = 2*a + x -f(3) # 2 * 1 + 3 -a = 4 -f(3) # now 2 * 4 + 3 -``` - -User-defined functions can have ``0``, ``1``, or more arguments: - -```julia; -area(w, h) = w*h -``` - -Julia makes different *methods* for *generic* function names, so function definitions whose argument specification is different are for different uses, even if the name is the same. This is *polymorphism*. The practical use is that it means users need only remember a much smaller set of function names, as attempts are made to give common expectations to the same name. (That is, `+` should be used only for "add"ing objects, however defined.)
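As a small illustration (the function name `describe` is made up for this sketch), two definitions with different argument-type annotations become two *methods* of the one generic function, and each call dispatches on the type of its argument:

```julia; hold=true;
describe(x::Integer) = "the integer $x"
describe(x::AbstractFloat) = "the floating point number $x"
describe(2), describe(2.0)  # ("the integer 2", "the floating point number 2.0")
```

The `methods` function lists the implementations attached to a generic name, as in `methods(describe)`.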
- -Functions can be defined with *keyword* arguments that may have defaults specified: - -```julia; hold=true; -f(x; m=1, b=0) = m*x + b # note ";" -f(1) # uses m=1, b=0 -> 1 * 1 + 0 -f(1, m=10) # uses m=10, b=0 -> 10 * 1 + 0 -f(1, m=10, b=5) # uses m=10, b=5 -> 10 * 1 + 5 -``` - - -Longer functions can be defined using the `function` keyword, the last command executed is returned: - -```julia; -function 𝒇(x) - y = x^2 - z = y - 3 - z -end -``` - - - -Functions without names, *anonymous functions*, are made with the `->` syntax as in: - -```julia; -x -> cos(x)^2 - cos(2x) -``` - -These are useful when passing a function to another function or when -writing a function that *returns* a function. - -## Conditional statements - -`Julia` provides the traditional `if-else-end` statements, but more conveniently has a `ternary` operator for the simplest case: - -```julia; -our_abs(x) = (x < 0) ? -x : x -``` - -## Looping - -Iterating over a collection can be done with the traditional `for` loop. However, there are list comprehensions to mimic the definition of a set: - -```julia; -[x^2 for x in 1:10] -``` - -Comprehensions can be filtered through the `if` keyword - -```julia -[x^2 for x in 1:10 if iseven(x)] -``` - -This is more efficient than creating the collection then filtering, as is done with: - -```julia -filter(iseven, [x^2 for x in 1:10]) -``` - - -## Broadcasting, mapping - -A function can be applied to each element of a vector through mapping or broadcasting. The latter is implemented in a succinct notation. Calling a function with a "." 
before its opening "(" will apply the function to each individual value in the argument: - -```julia; -xs = [1,2,3,4,5] -sin.(xs) # gives back [sin(1), sin(2), sin(3), sin(4), sin(5)] -``` - -For "infix" operators, the dot precedes the operator, as in this example instructing pointwise multiplication of each element in `xs`: - -```julia; -xs .* xs -``` - -Alternatively, the more traditional `map` can be used: - -```julia; -map(sin, xs) -``` - -## Plotting - -Plotting is *not* built into `Julia`; rather, it is added through add-on -packages. `Julia`'s `Plots` package is an interface to several -plotting packages. We mention `plotly` (built-in) for web-based -graphics, and `pyplot` and `gr` (also built into `Plots`) for other graphics. - -We must load `Plots` before we can plot (and it must be installed before we can load it): - -```julia; -using Plots -``` - -With `Plots` loaded, we can plot a function by passing the function object by name to `plot`, specifying the range of `x` values to show, as follows: - -```julia; -plot(sin, 0, 2pi) # plot a function - by name - over an interval [a,b] -``` - -!!! note - This is in the form of **the** basic pattern employed: `verb(function_object, arguments...)`. The verb in this example is `plot`, the object `sin`, the arguments `0, 2pi` to specify the `[a,b]` domain to plot over. - - -Plotting more than one function over `[a,b]` is achieved through the `plot!` function, which modifies the existing plot (`plot` creates a new one) by adding a new layer: - -```julia; -plot(sin, 0, 2pi) -plot!(cos, 0, 2pi) -plot!(zero, 0, 2pi) # add the line y=0 -``` - -Individual points are added with `scatter` or `scatter!`: - -```julia; -plot(sin, 0, 2pi, legend=false) -plot!(cos, 0, 2pi) -scatter!([pi/4, pi+pi/4], [sin(pi/4), sin(pi + pi/4)]) -``` - -(The extra argument `legend=false` suppresses the automatic legend drawing. There are many other useful arguments to adjust a graphic.
For example, passing `markersize=10` to the `scatter!` command would draw the points larger than the default.) - -Plotting an *anonymous* function is a bit more immediate than the two-step approach of defining a named function then calling `plot` with this as an argument: - -```julia; -plot( x -> exp(-x/pi) * sin(x), 0, 2pi) -``` - - -The `scatter!` function used above takes two vectors of values to describe the points to plot, one for the ``x`` values and one for the matching ``y`` values. The `plot` function can also produce plots with this interface. For example, here we use a comprehension to produce `y` values from the specified `x` values: - -```julia; hold=true; -xs = range(0, 2pi, length=251) -ys = [sin(2x) + sin(3x) + sin(4x) for x in xs] -plot(xs, ys) -``` - - - - -## Equations - -Notation for `Julia` and math is *similar* for functions - but not for equations. In math, an equation might look like: - -```math -x^2 + y^2 = 3 -``` - -In `Julia`, the equals sign is **only** for *assignment* and -*mutation*. The *left-hand* side of an equals sign in `Julia` is -reserved for a) variable assignment; b) function definition (via `f(x) = ...`); c) indexed mutation of a vector or array; d) mutation of -fields in a structure. (Vectors are indexed by a number allowing -retrieval and mutation of the stored value in the container. The -notation mentioned here would be `xs[2] = 3` to mutate the 2nd element -of `xs` to the value `3`.) - -## Symbolic math - -Symbolic math is available through an add-on package `SymPy` (among others). Once loaded, symbolic variables are created with the macro `@syms`: - -```julia -using SymPy -``` - -```julia; -@syms x a b c -``` - -(A macro rewrites values into other commands before they are interpreted. Macros are prefixed with the `@` sign. In this use, the "macro" `@syms` translates `x a b c` into a command involving `SymPy`'s `symbols` function.)
- -Symbolic expressions - unlike numeric expressions - are not immediately evaluated, though they may be simplified: - -```julia; -p = a*x^2 + b*x + c -``` - -To substitute a value, we can use `Julia`'s `pair` notation (`variable=>value`): - -```julia; -p(x=>2), p(x=>2, a=>3, b=>4, c=>1) -``` - -This is convenient notation for calling the `subs` function for `SymPy`. - - -SymPy expressions of a single free variable can be plotted directly: - -```julia; -plot(64 - (1/2)*32 * x^2, 0, 2) -``` - - -* SymPy has functions for manipulating expressions: `simplify`, `expand`, `together`, `factor`, `cancel`, `apart`, ``\dots`` - -* SymPy has functions for basic math: `factor`, `roots`, `solve`, `solveset`, ``\dots`` - -* SymPy has functions for calculus: `limit`, `diff`, `integrate`, ``\dots`` diff --git a/CwJ/precalc/logical_expressions.jmd b/CwJ/precalc/logical_expressions.jmd deleted file mode 100644 index e8405e0..0000000 --- a/CwJ/precalc/logical_expressions.jmd +++ /dev/null @@ -1,558 +0,0 @@ -# Inequalities, Logical expressions - -In this section we use the following package: - -```julia -using CalculusWithJulia # loads the `SpecialFunctions` package -``` - -```julia; echo=false; results="hidden" -using CalculusWithJulia.WeaveSupport -using Plots - -const frontmatter = ( - title = "Inequalities, Logical expressions", - description = "Calculus with Julia: Inequalities, Logical expressions", - tags = ["CalculusWithJulia", "precalc", "inequalities, logical expressions"], -); -nothing -``` - - -## Boolean values - -In mathematics, it is common to test if an expression is true or -false. For example, is the point $(1,2)$ inside the disc $x^2 + y^2 \leq 1$? We would check this by substituting $1$ for $x$ and $2$ for $y$, evaluating both sides of the inequality and then assessing if the -relationship is true or false. In this case, we end up with a comparison of $5 \leq 1$, which we of course know is false.
- -`Julia` provides numeric comparisons that allow this notation to be exactly mirrored: - -```julia;hold=true -x, y = 1, 2 -x^2 + y^2 <= 1 -``` - -The response is `false`, as expected. `Julia` provides -[Boolean](http://en.wikipedia.org/wiki/Boolean_data_type) values -`true` and `false` for such questions. The same process is followed as was described mathematically. - -The set of numeric comparisons is nearly the same as the mathematical -counterparts: `<`, `<=`, `==`, `>=`, `>`. The syntax for less than or -equal can also be represented with the Unicode `≤` (generated by -`\le[tab]`). Similarly, for greater than or equal, there is -`\ge[tab]`. - -!!! warning - The use of `==` is necessary, as `=` is used for assignment and mutation. - - -The `!` operator takes a boolean value and negates it. It uses prefix notation: - -```julia; -!true -``` - -For convenience, `a != b` can be used in place of `!(a == b)`. - -## Algebra of inequalities - -To illustrate, let's see that the algebra of expressions works as expected. - -For example, if $a < b$ then for any $c$ it is also true that $a + c < b + c$. - -We can't "prove" this through examples, but we can investigate it by -the choice of various values of $a$, $b$, and $c$. For example: - -```julia;hold=true -a,b,c = 1,2,3 -a < b, a + c < b + c -``` - -Or in reverse: - -```julia;hold=true -a,b,c = 3,2,1 -a < b, a + c < b + c -``` - -Trying other choices will show that the two answers are either both `false` or both `true`. - -!!! warning - Well, almost... When `Inf` or `NaN` are involved, this may not hold; for example, `1 + Inf < 2 + Inf` is actually `false`. As would be `1 + (typemax(1)-1) < 2 + (typemax(1)-1)`. - - - -So adding or subtracting most any finite value from an inequality will preserve the inequality, just as it does for equations. - -What about multiplication and division? - -Consider the case $a < b$ and $c > 0$. Then $ca < cb$.
Here we investigate using ``3`` random values (which will be positive): - -```julia;hold=true -a,b,c = rand(3) # 3 random numbers in [0,1) -a < b, c*a < c*b -``` - -Whenever these two commands are run, the two logical values should be identical, even though the specific values of `a`, `b`, and `c` will vary. - - -The restriction that $c > 0$ is needed. For example, if $c = -1$, then we have $a < b$ if and only if $-a > -b$. That is the inequality is "flipped." - -```julia;hold=true -a,b = rand(2) -a < b, -a > -b -``` - -Again, whenever this is run, the two logical values should be the same. -The values $a$ and $-a$ are the same distance from $0$, but on opposite sides. Hence if $0 < a < b$, then $b$ is farther from $0$ than $a$, so $-b$ will be farther from $0$ than $-a$, which in this case says $-b < -a$, as expected. - -Finally, we have the case of division. The relation of $x$ and $1/x$ -(for $x > 0$) is that the farther $x$ is from $0$, the closer $1/x$ is -to $0$. So large values of $x$ make small values of $1/x$. This leads -to this fact for $a,b > 0$: $a < b$ if and only if $1/a > 1/b$. - -We can check with random values again: - -```julia;hold=true -a,b = rand(2) -a < b, 1/a > 1/b -``` - -In summary we investigated numerically that the following hold: - -- `a < b` if and only if `a + c < b + c` for all finite `a`, `b`, and `c`. - -- `a < b` if and only if `c*a < c*b` for all finite `a` and `b`, and finite, positive `c`. - -- `a < b` if and only if `-a > -b` for all finite `a` and `b`. - -- `a < b` if and only if `1/a > 1/b` for all finite, positive `a` and `b`. - -### Examples - -We now show some inequalities highlighted on this [Wikipedia](http://en.wikipedia.org/wiki/Inequality_%28mathematics%29) page. - -Numerically investigate the fact $e^x \geq 1 + x$ by showing it is -true for three different values of $x$. 
We pick $x=-1$, $0$, and $1$: - -```julia;hold=true; -x = -1; exp(x) >= 1 + x -x = 0; exp(x) >= 1 + x -x = 1; exp(x) >= 1 + x -``` - -Now, let's investigate that for any distinct real numbers, $a$ and $b$, that - -```math -\frac{e^b - e^a}{b - a} > e^{(a+b)/2} -``` - -For this, we use `rand(2)` to generate two random numbers in $[0,1)$: - -```julia;hold=true -a, b = rand(2) -(exp(b) - exp(a)) / (b-a) > exp((a+b)/2) -``` - -This should evaluate to `true` for any random choice of `a` and `b` returned by `rand(2)`. - - -Finally, let's investigate the fact that the harmonic mean, $2/(1/a + -1/b)$ is less than or equal to the geometric mean, $\sqrt{ab}$, which -is less than or equal to the quadratic mean, $\sqrt{a^2 + -b^2}/\sqrt{2}$, using two randomly chosen values: - -```julia;hold=true -a, b = rand(2) -h = 2 / (1/a + 1/b) -g = (a * b) ^ (1 / 2) -q = sqrt((a^2 + b^2) / 2) -h <= g, g <= q -``` - - - - -## Chaining, combining expressions: absolute values - - -The absolute value notation can be defined through cases: - -```math -\lvert x\rvert = \begin{cases} -x & x \geq 0\\ --x & \text{otherwise}. -\end{cases} -``` - -The interpretation of $\lvert x\rvert$, as the distance on the number line of $x$ -from $0$, means that many relationships are naturally expressed in -terms of absolute values. For example, a simple shift: $\lvert x -c\rvert$ is -related to the distance $x$ is from the number $c$. As common as they are, the concept can still be confusing when inequalities are involved. - - -For example, the expression $\lvert x - 5\rvert < 7$ has solutions which are all -values of $x$ within $7$ units of $5$. This would be the values $-2< x < 12$. -If this isn't immediately intuited, then formally $\lvert x - 5\rvert <7$ -is a compact representation of a chain of inequalities: $-7 < x-5 < 7$. -(Which is really two combined inequalities: $-7 < x-5$ *and* $x-5 < 7$.) 
-We can "add" ``5`` to each side to get $-2 < x < 12$, using the -fact that adding by a finite number does not change the inequality -sign. - - -Julia's precedence for logical -expressions allows such statements to mirror the mathematical -notation: - -```julia; -x = 18 -abs(x - 5) < 7 -``` - -This response of `false` is to be expected, as ``18`` is not within ``7`` units of ``5``, but we could also have written: - -```julia; --7 < x - 5 < 7 -``` - -Read aloud, this would be "minus ``7`` is less than ``x`` minus ``5`` -**and** ``x`` minus ``5`` is less than ``7``". - -The "and" equations can be combined as above with a natural notation. However, -an equation like $\lvert x - 5\rvert > 7$ would emphasize -an **or** and be "``x`` minus ``5`` less than minus ``7`` **or** ``x`` minus ``5`` -greater than ``7``". Expressing this requires some new notation. - - -The *boolean shortcut operators* `&&` and `||` implement "and" and "or." (There are also *bitwise* boolean operators `&` and `|`, but we only describe the former.) - - -Thus we could write $-7 < x-5 < 7$ as - -```julia; -(-7 < x - 5) && (x - 5 < 7) -``` - -and could write $\lvert x-5\rvert > 7$ as - -```julia; -(x - 5 < -7) || (x - 5 > 7) -``` - -(The first expression is false for $x=18$ and the second expression true, so the "or"ed result is `true` and the "and" result is `false`.) - - -##### Example - -One of [DeMorgan's Laws](http://en.wikipedia.org/wiki/De_Morgan%27s_laws) states that -"not (A and B)" is the same as "(not A) or (not B)". This is a kind of -distributive law for "not", but note how the "and" changes to -"or". We can verify this law systematically.
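All ``4`` cases can be checked at once with a comprehension over the two values each of `A` and `B` may take (a small sketch):

```julia; hold=true;
tf = (true, false)
all( !(A && B) == (!A || !B) for A in tf, B in tf )  # true
```

Here the parentheses around `!A || !B` matter, as `==` binds more tightly than `||`.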
For example, the -following shows it true for ``1`` of the ``4`` possible cases for the pair -`A`, `B` to take: - -```julia; -A,B = true, false ## also true, true; false, true; and false, false -!(A && B) == (!A || !B) -``` - -## Precedence - -The question of when parentheses are needed and when they are not is answered by the -[precedence](http://julia.readthedocs.org/en/latest/manual/mathematical-operations/#operator-precedence) rules -implemented. Earlier, we wrote - -```julia; -(x - 5 < -7) || (x - 5 > 7) -``` - -to represent $\lvert x-5\rvert > 7$. Were the parentheses necessary? Let's just check. - -```julia; -x - 5 < -7 || x - 5 > 7 -``` - -So no, they were not in this case. - -An operator (such as `<`, `>`, `||` above) has an associated associativity and precedence. The associativity is whether an expression like `a - b - c` is `(a-b) - c` or `a - (b-c)`. The former is left associative, the latter right. At issue here is *precedence*: with two or more different operations, which happens first, second, ``\dots``. - -The table in the manual on [operator precedence and associativity](https://docs.julialang.org/en/v1/manual/mathematical-operations/#Operator-Precedence-and-Associativity) shows that for these operations "control flow" (the `||` above) is lower than "comparisons" (the `<`, `>`), which are lower than "Addition" (the `-` above). So the expression without parentheses would be equivalent to: - -```julia -((x-5) < -7) || ((x-5) > 7) -``` - -(This is different from the precedence of the bitwise boolean operators, which have `&` with "Multiplication" and `|` with "Addition", so `x-5 < 7 | x - 5 > 7` would need parentheses.) - - -A thorough understanding of the precedence rules can help eliminate -unnecessary parentheses, but in most cases it is easier just to put -them in. - -## Arithmetic with Booleans - -For convenience, basic arithmetic can be performed with Boolean -values: `false` becomes $0$ and `true` becomes $1$.
For example, both these -expressions make sense: - -```julia; -true + true + false, false * 1000 -``` - -The first example shows a common means used to count the number of -`true` values in a collection of Boolean values - just add them. - - -This can be cleverly exploited. For example, the following expression returns `x` when it is positive and $0$ otherwise: - -```julia; -(x > 0) * x -``` - -There is a built in function, `max` that can be used for this: `max(0, x)`. - -This expression returns `x` if it is between $-10$ and $10$ and otherwise $-10$ or $10$ depending on whether $x$ is negative or positive. - -```julia; -(x < -10)*(-10) + (x >= -10)*(x < 10) * x + (x>=10)*10 -``` - -The `clamp(x, a, b)` performs this task more generally, and is used as in `clamp(x, -10, 10)`. - - - -## Questions - -###### Question - -Is `e^pi` or `pi^e` greater? - -```julia; hold=true; echo=false; -choices = [ -"`e^pi` is greater than `pi^e`", -"`e^pi` is equal to `pi^e`", -"`e^pi` is less than `pi^e`" -] -answ = 1 -radioq(choices, answ) -``` - -###### Question - -Is $\sin(1000)$ positive? - -```julia; hold=true; echo=false; -answ = (sin(1000) > 0) -yesnoq(answ) -``` - -###### Question - -Suppose you know $0 < a < b$. What can you say about the relationship between $-1/a$ and $-1/b$? - -```julia; hold=true; echo=false; -choices = [ -"``-1/a < -1/b``", -"``-1/a > -1/b``", -raw"``-1/a \geq -1/b``"] -answ = 3 -radioq(choices, answ) -``` - - -###### Question - -Suppose you know $a < 0 < b$, is it true that $1/a > 1/b$? - -```julia; hold=true; echo=false; -choices = ["Yes, it is always true.", - "It can sometimes be true, though not always.", - L"It is never true, as $1/a$ is negative and $1/b$ is positive"] -answ = 3 -radioq(choices, answ) -``` - -###### Question - -The `airyai` [function](http://en.wikipedia.org/wiki/Airy_function) is -a special function named after a British Astronomer who realized the -function's value in his studies of the rainbow. 
The `SpecialFunctions` -package must be loaded to include this function, which is done with the accompanying package `CalculusWithJulia`. - -```julia; -airyai(0) -``` - -It is known that this function -is always positive for $x > 0$, though not so for negative values of -$x$. Which of these indicates the first negative value : `airyai(-1) <0`, -`airyai(-2) < 0`, ..., or `airyai(-5) < 0`? - -```julia; hold=true; echo=false; -choices = ["`airyai($i) < 0`" for i in -1:-1:-5] -answ = 3 -radioq(choices, answ, keep_order=true) -``` - -###### Question - -By trying three different values of $x > 0$ which of these could possibly be always true: - -```julia; hold=true; echo=false; -choices = ["`x^x <= (1/e)^(1/e)`", - "`x^x == (1/e)^(1/e)`", - "`x^x >= (1/e)^(1/e)`"] -answ = 3 -radioq(choices, answ) -``` - -###### Question - -Student logic says $(x+y)^p = x^p + y^p$. Of course, this isn't -correct for all $p$ and $x$. By trying a few points, which is true -when $x,y > 0$ and $0 < p < 1$: - -```julia; hold=true; echo=false; -choices = ["`(x+y)^p < x^p + y^p`", - "`(x+y)^p == x^p + y^p`", - "`(x+y)^p > x^p + y^p`"] -answ = 1 -radioq(choices, answ) -``` - - -###### Question - -According to Wikipedia, one of the following inequalities is always -true for $a, b > 0$ (as proved by I. Ilani in -JSTOR, AMM, Vol.97, No.1, 1990). Which one? - - -```julia; hold=true; echo=false; -choices = ["`a^a + b^b <= a^b + b^a`", - "`a^a + b^b >= a^b + b^a`", - "`a^b + b^a <= 1`"] -answ = 2 -radioq(choices, answ) -``` - - -###### Question -Is $3$ in the set $\lvert x - 2\rvert < 1/2$? 
-
-```julia; hold=true; echo=false;
-val = abs(3-2) < 1/2
-yesnoq(val)
-```
-
-###### Question
-
-Which of the following is equivalent to $\lvert x - a\rvert > b$:
-
-```julia; hold=true; echo=false;
-choices = [raw"``-b < x - a < b``",
-           raw"`` -b < x-a \text{ and } x - a < b``",
-           raw"``x - a < -b \text{ or } x - a > b``"]
-answ = 3
-radioq(choices, answ)
-```
-
-
-###### Question
-If $\lvert x - \pi\rvert < 1/10$, is $\lvert \sin(x) - \sin(\pi)\rvert < 1/10$?
-
-Guess an answer based on a few runs of
-
-```julia; hold=true; eval=false;
-x = pi + 1/10 * (2rand()-1)
-abs(x - pi) < 1/10, abs(sin(x) - sin(pi)) < 1/10
-```
-
-```julia; hold=true; echo=false;
-booleanq(true)
-```
-
-###### Question
-
-Does `12` satisfy $\lvert x - 3\rvert + \lvert x-9\rvert > 12$?
-
-```julia; hold=true; echo=false;
-x = 12
-val = (abs(x - 3) + abs(x-9) > 12)
-yesnoq(val)
-```
-
-###### Question
-
-Which of these will show De Morgan's law holds when both values are `false`:
-
-```julia; hold=true; echo=false;
-choices = ["`!(false && false) == (!false && !false)`",
-           "`!(false && false) == (false || false)`",
-           "`!(false && false) == (!false || !false)`"]
-answ = 3
-radioq(choices, answ)
-```
-
-###### Question
-
-For floating point numbers there are two special values `Inf` and `NaN`. For which of these is the answer always `false`:
-
-```julia; hold=true; echo=false;
-choices = ["`Inf < 3.0` and `3.0 <= Inf`",
-           "`NaN < 3.0` and `3.0 <= NaN`"]
-answ = 2
-radioq(choices, answ)
-```
-
-###### Question
-
-The IEEE 754 standard is about floating point numbers, for which there are the special values `Inf`, `-Inf`, `NaN`, and, surprisingly, `-0.0` (as a floating point number and not `-0`, an integer). Here are four facts that seem reasonable:
-
-- Positive zero is equal to, but not greater than, negative zero.
-
-- `Inf` is equal to itself and greater than everything else except `NaN`.
-
-- `-Inf` is equal to itself and less than everything else except `NaN`.
-
-- `NaN` is not equal to, not less than, and not greater than anything, including itself.
-
-Do all four seem to be the case within `Julia`? Find your answer by trial and error.
-
-```julia; hold=true; echo=false;
-yesnoq(true)
-```
-
-###### Question
-
-The `NaN` value is meant to signal an error in computation. `Julia` has a value to indicate some data is missing or unavailable. This is `missing`. For `missing` values we have these computations:
-
-```julia;
-true && missing, true || missing
-```
-
-We see the value of `true || missing` is `true`. Why?
-
-```julia; hold=true; echo=false;
-choices = ["""
-In the manual we can read that "In the expression `a || b`, the subexpression `b` is only evaluated if `a` evaluates to false." In this case `a` is `true` and so `a` is returned.
-""",
-"Since the second value is \"`missing`\", only the first is used. So `false || missing` would also be `false`"]
-answ = 1
-radioq(choices, answ)
-```
-
-The value for `true && missing` is `missing`, not a Boolean value. Why is that?
-
-```julia; hold=true; echo=false;
-choices = ["""
-In the manual we can read that "In the expression `a && b`, the subexpression `b` is only evaluated if `a` evaluates to true." In this case, `a` is `true` so `b` is evaluated and returned. As `b` is just `missing`, that is the return value.
-""",
-"Since the second value is \"`missing`\" all such answers would be missing."]
-answ = 1
-radioq(choices, answ)
-```
diff --git a/CwJ/precalc/numbers_types.jmd b/CwJ/precalc/numbers_types.jmd
deleted file mode 100644
index 1e43fb0..0000000
--- a/CwJ/precalc/numbers_types.jmd
+++ /dev/null
@@ -1,652 +0,0 @@
-# Number systems
-
-```julia; echo=false; results="hidden"
-using CalculusWithJulia
-using CalculusWithJulia.WeaveSupport
-const frontmatter = (
-    title = "Number systems",
-    description = "Calculus with Julia: Number systems",
-    tags = ["CalculusWithJulia", "precalc", "number systems"],
-);
-
-nothing
-```
-
-
-In mathematics, there are many different number systems in common
-use. For example, by the end of pre-calculus, all of the following have
-been introduced:
-
-* The integers, $\{\dots, -3, -2, -1, 0, 1, 2, 3, \dots\}$;
-* The rational numbers, $\{p/q: p, q \text{ are integers}, q \neq 0\}$;
-* The real numbers, $\{x: -\infty < x < \infty\}$;
-* The complex numbers, $\{a + bi: a,b \text{ are real numbers and } i^2=-1\}$.
-
-On top of these, we have special subsets, such as the natural numbers $\{1, 2, \dots\}$ (sometimes including ``0``), the even numbers, the odd numbers, the positive numbers, the non-negative numbers, etc.
-
-Mathematically, these number systems are naturally nested within each
-other, as integers are rational numbers, which are real numbers, which
-can be viewed as part of the complex numbers.
-
-
-Calculators typically have just one type of number - floating point
-values. These model the real numbers. `Julia`, on the other hand, has
-a rich type system, and within that has many different number
-types. There are types that model each of the four main systems above,
-and within each type, specializations for how these values are stored.
-
-Most of the details will not be of interest to all, and will be described later.
-
-
-For now, let's consider the number ``1``.
It can be viewed as either an integer, rational,
-real, or complex number. To construct "``1``" in each type within `Julia`
-we have these different styles:
-
-```julia;
-1, 1.0, 1//1, 1 + 0im
-```
-
-The basic number types in `Julia` are `Int`, `Float64`, `Rational`, and `Complex`, though in fact there are many more, and the last two aren't even *concrete* types. This distinction is important, as the type of number dictates how it will be stored and how closely the stored value can be expected to match the mathematical value it models.
-
-Though there are explicit constructors for these types, these notes
-avoid them unless necessary, as `Julia`'s parser can distinguish these
-types through an easy-to-understand syntax:
-
-* integers have no decimal point;
-* floating point numbers have a decimal point (or are in scientific notation);
-* rationals are constructed from integers using the double division operator, `//`; and
-* complex numbers are formed by including a term with the imaginary unit, `im`.
-
-!!! warning
-    Heads up, the difference between `1` and `1.0` is subtle.
-    Even more so, as `1.` will parse as `1.0`.
-    This means some expressions, such as `2.*3`, are ambiguous, as the `.` might be part of the `2` (as in `2. * 3`) or the operation `*` (as in `2 .* 3`).
-
-
-Similarly, each type is printed slightly differently.
-
-The key distinction is between integers and floating points. While
-floating point values include integers, and so can be used exclusively,
-as on a calculator, the difference is that an integer is guaranteed to
-be an exact value, whereas a floating point value, while often an
-exact representation of a number, is often just an
-*approximate* value. This can be an advantage -- floating point values can
-model a much wider range of numbers.
-
-
-Now, in nearly all cases the differences are not noticeable. Take, for instance, this simple calculation involving mixed types.
-
-
-```julia;
-1 + 1.25 + 3//2
-```
-
-The sum of an integer, a floating point number, and a rational number returns a floating point number without a complaint.
-
-This is because, behind the scenes, `Julia` will often "promote" a type to match; for example, to compute `1 + 1.25` the integer `1` will be promoted to a floating point value and the two values are then added. Similarly with `2.25 + 3//2`, where the fraction is promoted to the floating point value `1.5` and addition is carried out.
-
-As floating point numbers may be approximations, some values are not quite what they would be mathematically:
-
-```julia;
-sqrt(2) * sqrt(2) - 2, sin(pi), 1/10 + 1/5 - 3/10
-```
-
-These values are *very* small numbers, but not exactly $0$, as they are mathematically.
-
-----
-
-The only common issue is with powers. `Julia` tries to keep a
-predictable output from the input types (not their values). Here are the two main cases
-that arise where this can cause unexpected results:
-
-
-* integer bases and integer exponents can *easily* overflow. Not only is `m^n` always an integer, it is always an integer with a fixed storage size computed from the sizes of `m` and `n`. So the powers can quickly get too big. This can be especially noticeable on older $32$-bit machines, where "too big" is only $2^{31} = 2,147,483,648$. On $64$-bit machines, this limit is present but much bigger.
-
-Rather than give an error though, `Julia` gives seemingly arbitrary answers, as can be seen in this example on a $64$-bit machine:
-
-```julia;
-2^62, 2^63
-```
-
-(They aren't arbitrary; rather, integer arithmetic is implemented as modular arithmetic.)
-
-This could be worked around, as it is with some programming languages,
-but it isn't, as it would slow down this basic computation. So, it is
-up to the user to be aware of cases where their integer values can
-grow too big.
The suggestion is to use floating point numbers in this -domain, as they have more room, at the cost of sometimes being -approximate values. - - -* the `sqrt` function will give a domain error for negative values: - -```julia; -sqrt(-1.0) -``` - -This is because for real-valued inputs `Julia` expects to return a -real-valued output. Of course, this is true in mathematics until the -complex numbers are introduced. Similarly in `Julia` - to take square -roots of negative numbers, start with complex numbers: - -```julia; -sqrt(-1.0 + 0im) -``` - - -* At one point, `Julia` had an issue with a third type of power: -integer bases and negative integer exponents. For example -`2^(-1)`. This is now special cased, though only for numeric -literals. If `z=-1`, `2^z` will throw a `DomainError`. Historically, -the desire to keep a predictable type for the output (integer) led to -defining this case as a domain error, but its usefulness led to -special casing. - - -## Additional details. - - -What follows is only needed for those seeking more background. - - -Julia has *abstract* number types `Integer`, `Real`, and `Number`. All -four types described above are of type `Number`, but `Complex` is not -of type `Real`. - -However, a specific value is an instance of a *concrete* type. A -concrete type will also include information about how the value is -stored. For example, the *integer* `1` could be stored using $64$ bits -as a signed integers, or, should storage be a concern, as an $8$ bits -signed or even unsigned integer, etc.. If storage -isn't an issue, but exactness at all scales is, then it can be stored in a manner -that allows for the storage to grow using "big" numbers. 
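-
-For instance - a quick sketch using the `big` function from base `Julia` - storage that grows can hold values far beyond any fixed size:
-
-```julia;
-typeof(big(2)^1000)    # a `BigInt`; `2^1000` would overflow a fixed-size integer
-```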
- -These distinctions can be seen in how `Julia` parses these three values: - -* `1234567890` will be a $64$-bit integer (on newer machines), `Int64` -* `12345678901234567890` will be a $128$ bit integer, `Int128` -* `1234567890123456789012345678901234567890` will be a big integer, `BigInt` - -Having abstract types allows programmers to write functions that will -work over a wide range of input values that are similar, but have -different implementation details. - - -### Integers - -Integers are often used casually, as they come about from parsing. As with a calculator, floating point numbers *could* be used for integers, but in `Julia` - and other languages - it proves useful to have numbers known to have *exact* values. In `Julia` there are built-in number types for integers stored in $8$, $16$, $32$, $64$, and $128$ bits and `BigInt`s if the previous aren't large enough. ($8$ bits can hold $8$ binary values representing $1$ of $256=2^8$ possibilities, whereas the larger $128$ bit can hold one of $2^{128}$ possibilities.) Smaller values can be more efficiently used, and this is leveraged at the system level, but not a necessary distinction with calculus where the default size along with an occasional usage of `BigInt` suffice. - -### Floating point numbers - -[Floating point](http://en.wikipedia.org/wiki/Floating_point) numbers -are a computational model for the real numbers. For floating point -numbers, $64$ bits are used by default for both $32$- and $64$-bit systems, though other storage sizes can be requested. This gives -a large ranging - but still finite - set of real numbers that can be -represented. However, there are infinitely many real numbers just -between $0$ and $1$, so there is no chance that all can be represented -exactly on the computer with a floating point value. Floating point -then is *necessarily* an approximation for all but a subset of the -real numbers. 
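-
-A quick illustration of this approximation: converting the literal `0.1` to a "big" float with the `big` function from base `Julia` displays the value that is actually stored, which is close to - but not exactly - one tenth:
-
-```julia;
-big(0.1)    # the `Float64` value nearest 1/10, shown with more digits
-```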
Floating point values can be viewed in normalized
-[scientific notation](http://en.wikipedia.org/wiki/Scientific_notation)
-as $a\cdot 2^b$, where $a$ is the *significand* and $b$ is the
-*exponent*. Save for special values, the significand $a$ is normalized to satisfy $1 \leq \lvert a\rvert <
-2$; the exponent can be taken to be an integer, possibly negative.
-
-As per IEEE Standard 754, the `Float64` type gives 52 bits to the precision (with an additional implied one), 11 bits to the exponent, and the other bit is used to represent the sign. Positive, finite, floating point numbers have a range approximately between $10^{-308}$ and $10^{308}$, as 308 is about $\log_{10}(2^{1023})$. The numbers are not evenly spread out over this range but, rather, are much more concentrated closer to $0$.
-
-
-!!! warning "More on floating point numbers"
-    You can discover more about the range of floating point values provided by calling a few different functions.
-    * `typemax(0.0)` gives the largest value for the type (`Inf` in this case).
-    * `prevfloat(Inf)` gives the largest finite one; in general `prevfloat` returns the next smallest floating point value.
-    * `nextfloat(-Inf)`, similarly, gives the smallest finite floating point value; in general `nextfloat` returns the next largest floating point value.
-    * `nextfloat(0.0)` gives the closest positive value to 0.
-    * `eps()` gives the distance to the next floating point number bigger than `1.0`. This is sometimes referred to as machine precision.
-
-#### Scientific notation
-
-Floating point numbers may print in a familiar manner:
-
-```julia;
-x = 1.23
-```
-
-or may be represented in scientific notation:
-
-```julia;
-6.02 * 10.0^23
-```
-
-The special coding `aeb` (or, if the exponent is negative, `ae-b`) is
-used to represent the number $a \cdot 10^b$ ($1 \leq a < 10$).
This
-notation can be used directly to specify a floating point value:
-
-```julia;
-avogadro = 6.02e23
-```
-
-Here `e` is decidedly *not* the Euler number; rather, it is syntax to separate the exponent from the significand.
-
-The first way of representing this number required using `10.0` and not `10`, as the integer power would return an integer, and even on 64-bit systems that is only valid up to `10^18`. Using scientific notation avoids having to concentrate on such limitations.
-
-##### Example
-
-Floating point values in scientific notation will always be normalized. This is easy for the computer to do, but tedious to do by hand. Here we see:
-
-```julia;
-4e30 * 3e40
-```
-
-```julia;
-3e40 / 4e30
-```
-
-The power in the first answer is ``71``, not ``70 = 30+40``, as the product of ``4`` and ``3`` is ``12``, or `1.2e1`. (We also see the artifact of `1.2` not being exactly representable in floating point.)
-
-##### Example: 32-bit floating point
-
-In some uses, such as on a GPU, ``32``-bit floating point (single precision) is also
-common. These values may be specified with an `f` in place of the `e`
-in scientific notation:
-
-```julia
-1.23f0
-```
-
-As with the use of `e`, some exponent is needed after the `f`, even if it is `0`.
-
-
-#### Special values: Inf, -Inf, NaN
-
-The coding of floating point numbers also allows for the special
-values `Inf` and `-Inf` to represent positive and negative
-infinity. As well, a special value `NaN` ("not a number") is used to
-represent a value that arises when an operation is not closed (e.g.,
-$0.0/0.0$ yields `NaN`). (Technically `NaN` has several possible "values," a point ignored here.) Except for negative bases, the floating point
-numbers with the addition of `Inf` and `NaN` are closed under the
-operations `+`, `-`, `*`, `/`, and `^`.
Here are some computations -that produce `NaN`: - -```julia; -0/0, Inf/Inf, Inf - Inf, 0 * Inf -``` - -Whereas, these produce an infinity - -```julia; -1/0, Inf + Inf, 1 * Inf -``` - -Finally, these are mathematically undefined, but still yield a finite value with `Julia`: - -```julia; -0^0, Inf^0 -``` - -#### Floating point numbers and real numbers - -Floating point numbers are an abstraction for the real numbers. For -the most part this abstraction works in the background, though there -are cases where one needs to have it in mind. Here are a few: - -* For real and rational numbers, between any two numbers $a < b$, - there is another real number in between. This is not so for floating - point numbers which have a finite precision. (Julia has some - functions for working with this distinction.) - -* Floating point numbers are approximations for most values, even - simple rational ones like $1/3$. This leads to oddities such as this value - not being $0$: - -```julia; -sqrt(2)*sqrt(2) - 2 -``` - -It is no surprise that an irrational number, like $\sqrt{2}$, can't be represented **exactly** within floating point, but it is perhaps surprising that simple numbers can not be, so $1/3$, $1/5$, $\dots$ are approximated. Here is a surprising-at-first consequence: - -```julia; -1/10 + 2/10 == 3/10 -``` - -That is adding `1/10` and `2/10` is not exactly `3/10`, as expected mathematically. -Such differences are usually very small and are generally attributed to rounding error. The user needs to be mindful when testing for equality, as is done above with the `==` operator. - -* Floating point addition is not necessarily associative, that is the property $a + (b+c) = (a+b) + c$ may not hold exactly. For example: - -```julia; -1/10 + (2/10 + 3/10) == (1/10 + 2/10) + 3/10 -``` - -* For real numbers subtraction of similar-sized numbers is not exceptional, for example $1 - \cos(x)$ is positive if $0 < x < \pi/2$, say. This will not be the case for floating point values. 
If $x$ is close enough to $0$, then $\cos(x)$ and $1$ will be so close, that they will be represented by the same floating point value, `1.0`, so the difference will be zero: - -```julia; -1.0 - cos(1e-8) -``` - -### Rational numbers - -Rational numbers can be used when the exactness of the number is more -important than the speed or wider range of values offered by floating -point numbers. In `Julia` a rational number is comprised of a -numerator and a denominator, each an integer of the same type, and -reduced to lowest terms. The operations of addition, subtraction, -multiplication, and division will keep their answers as rational -numbers. As well, raising a rational number to a positive, integer -value will produce a rational number. - -As mentioned, these are constructed using double slashes: - -```julia; -1//2, 2//1, 6//4 -``` - -Rational numbers are exact, so the following are identical to their mathematical counterparts: - -```julia; -1//10 + 2//10 == 3//10 -``` - -and associativity: - -```julia; -(1//10 + 2//10) + 3//10 == 1//10 + (2//10 + 3//10) -``` - -Here we see that the type is preserved under the basic operations: - -```julia; -(1//2 + 1//3 * 1//4 / 1//5) ^ 6 -``` - -For powers, a non-integer exponent is converted to floating point, so this operation is defined, though will always return a floating point value: - -```julia; -(1//2)^(1//2) # the first parentheses are necessary as `^` will be evaluated before `//`. -``` - - -##### Example: different types of real numbers - -This table shows what attributes are implemented for the different types. 
- -```julia; echo=false; -using DataFrames -attributes = ["construction", "exact", "wide range", "has infinity", "has `-0`", "fast", "closed under"] -integer = [q"1", "true", "false", "false", "false", "true", "`+`, `-`, `*`, `^` (non-negative exponent)"] -rational = ["`1//1`", "true", "false", "false", "false", "false", "`+`, `-`, `*`, `/` (non zero denominator),`^` (integer power)"] -float = [q"1.0", "not usually", "true", "true", "true", "true", "`+`, `-`, `*`, `/` (possibly `NaN`, `Inf`),`^` (non-negative base)"] -d = DataFrame(Attributes=attributes, Integer=integer, Rational=rational, FloatingPoint=float) -table(d) -``` - - - - -### Complex numbers - -Complex numbers in `Julia` are stored as two numbers, a real and imaginary part, each some type of `Real` number. The special constant `im` is used to represent $i=\sqrt{-1}$. This makes the construction of complex numbers fairly standard: - -```julia; -1 + 2im, 3 + 4.0im -``` - -(These two aren't exactly the same, the `3` is promoted from an integer to a float to match the `4.0`. Each of the components must be of the same type of number.) - -Mathematically, complex numbers are needed so that certain equations can be satisfied. For example $x^2 = -2$ has solutions $-\sqrt{2}i$ and $\sqrt{2}i$ over the complex numbers. Finding this in `Julia` requires some attention, as we have both `sqrt(-2)` and `sqrt(-2.0)` throwing a `DomainError`, as the `sqrt` function expects non-negative real arguments. However first creating a complex number does work: - -```julia; -sqrt(-2 + 0im) -``` - -For complex arguments, the `sqrt` function will return complex values (even if the answer is a real number). 
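-
-For example - a small check:
-
-```julia;
-sqrt(4 + 0im), sqrt(-4 + 0im)
-```
-
-The first result is the real number ``2`` stored as a complex value; the second is the purely imaginary value ``2i``.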
- -This means, if you wanted to perform the quadratic equation for any real inputs, your computations might involve something like the following: - -```julia; -a,b,c = 1,2,3 ## x^2 + 2x + 3 -discr = b^2 - 4a*c -(-b + sqrt(discr + 0im))/(2a), (-b - sqrt(discr + 0im))/(2a) -``` - -When learning calculus, the only common usage of complex numbers arises when solving polynomial equations for roots, or zeros, though they are very important for subsequent work using the concepts of calculus. - -!!! note - Though complex numbers are stored as pairs of numbers, the imaginary unit, `im`, is of type `Complex{Bool}`, a type that can be promoted to more specific types when `im` is used with different number types. - - - -## Type stability - -One design priority of `Julia` is that it should be fast. How can -`Julia` do this? In a simple model, `Julia` is an interface between -the user and the computer's processor(s). Processors consume a set of -instructions, the user issues a set of commands. `Julia` is in charge -of the translation between the two. Ultimately `Julia` calls a compiler to create -the instructions. A basic premise is the shorter the instructions, the -faster they are to process. Shorter instructions can come about by -being more explicit about what types of values the instructions -concern. Explicitness means, there is no need to reason about what a -value can be. When `Julia` can reason about the type of value involved -without having to reason about the values themselves, it can work with -the compiler to produce shorter lists of instructions. - -So knowing the type of the output of a function based only on the type -of the inputs can be a big advantage. In `Julia` this is known as -*type stability*. In the standard `Julia` library, this is a primary -design consideration. - - -##### Example: closure - -To motivate this a bit, we discuss how mathematics can be shaped by a -desire to stick to simple ideas. 
A desirable algebraic property of a
-set of numbers and an operation is *closure*. That is, if one takes an
-operation like `+` and then uses it to add two numbers in a set, will
-that result also be in the set? If this is so for any pair of numbers,
-then the set is closed with respect to the operation of addition.
-
-Let's suppose we start with the *natural numbers*: $1,2, \dots$. Natural, in that we can easily represent small values in terms of fingers.
-This set is closed under addition - as a child learns when counting using their fingers. However, if we started with the odd natural numbers, this set would *not* be closed under addition - $3+3=6$.
-
-The natural numbers are not all the numbers we need, as once a desire
-for subtraction is included, we find the set isn't closed. There isn't
-a $0$, needed as $n-n=0$, and there aren't negative numbers. The set of
-integers is needed for closure under addition and subtraction.
-
-The integers are also closed under multiplication, which for integer values can be seen as just regrouping into longer additions.
-
-However, the integers are not closed under division - even if you put
-aside the pesky issue of dividing by $0$. For that, the rational
-numbers must be introduced. So, aside from division by $0$, the rationals are closed under addition, subtraction, multiplication, and division. There is one more fundamental operation though: powers.
-
-Powers are defined for positive integers in a simple enough manner:
-
-```math
-a^n=a \cdot a \cdot a \cdots a \text{ (n times); } a, n \text{ are integers, } n \text{ is positive}.
-```
-
-We can define $a^0$ to be $1$, except for the special case of $0^0$,
-which is left undefined mathematically (though it is also defined as
-`1` within `Julia`). We can extend the above to include negative
-values of $a$, but what about negative values of $n$?
We can't say the
-integers are closed under powers, as a definition consistent with
-the rule $a^{(-n)} = 1/a^n$ requires rational numbers to be
-defined.
-
-Well, in the above `a` could be a rational number; is `a^n` closed for
-rational numbers? No again. Though it is fine for $n$ an integer
-(save the odd case of $0$), simple expressions like $2^{1/2}$ are not
-answered within the rationals. For this, we need to introduce the
-*real* numbers. It is mentioned that
-[Aristotle](http://tinyurl.com/bpqbkap) hinted at the irrationality of
-the square root of $2$. To define terms like $a^{1/n}$ for integer
-values $a, n > 0$, a reference to a solution of the equation $x^n - a = 0$ is
-used. Such equations require the irrational numbers to have solutions
-in general. Hence the need for the real numbers (well, algebraic
-numbers at least, though once the exponent is no longer a rational
-number, the full set of real numbers is needed).
-
-So, save the pesky cases, the real numbers will be closed under
-addition, subtraction, multiplication, division, and powers - provided
-the base is non-negative.
-
-Finally, for that last case, the complex numbers are introduced to give an answer to $\sqrt{-1}$.
-
-
-
-----
-
-
-How does this apply with `Julia`?
-
-The point is, if we restrict our set of inputs, we can get more
-precise values for the output of basic operations, but to accommodate more
-general inputs we need to have bigger output sets.
-
-
-A similar thing happens in `Julia`. For addition, say, the addition of
-two integers of the same type will be an integer of that type. This
-design is not solely for type stability; it also
-avoids checking for overflow.
-
-Another example: the division of two integers will always be a number
-of one type - floating point - as that is the only type that
-ensures the answer will always fit. (The explicit use of
-rationals notwithstanding.)
So even if two integers are the input and
-their answer *could* be an integer, in `Julia` it will be a floating
-point number (cf. `2/1`).
-
-Hopefully this helps explain the subtle issues around powers: in
-`Julia` an integer raised to an integer should be an integer, for
-speed, though certain cases are special cased, like `2^(-1)`. However,
-since a real number raised to a real power always makes sense when
-the base is non-negative, as long as real numbers are used as outputs,
-the expressions `2.0^(-1)` and `2^(-1.0)` are computed and real
-numbers (floating points) are returned. For type stability, even
-though $2.0^1$ could be an integer, a floating point answer is
-returned.
-
-As for negative bases, `Julia` could always return complex numbers,
-but in addition to this being slower, it would be irksome to users. So
-users must opt in. Hence `sqrt(-1.0)` will be an error, but the more
-explicit - but mathematically equivalent - `sqrt(-1.0 + 0im)` will not
-be a domain error; rather, a complex value will be returned.
-
-
-
-## Questions
-
-```julia; echo=false
-choices = ["Integer", "Rational", "Floating point", "Complex", "None, an error occurs"]
-nothing
-```
-
-###### Question
-
-The number created by `pi/2` is?
-
-
-```julia; hold=true; echo=false;
-answ = 3
-radioq(choices, answ, keep_order=true)
-```
-
-###### Question
-
-The number created by `2/2` is?
-
-```julia; hold=true; echo=false;
-answ = 3
-radioq(choices, answ, keep_order=true)
-```
-
-###### Question
-
-The number created by `2//2` is?
-
-```julia; hold=true; echo=false;
-answ = 2
-radioq(choices, answ, keep_order=true)
-```
-
-###### Question
-
-The number created by `1 + 1//2 + 1/3` is?
-
-```julia; hold=true; echo=false;
-answ = 3
-radioq(choices, answ, keep_order=true)
-```
-
-###### Question
-
-
-The number created by `2^3` is?
-
-```julia; hold=true; echo=false;
-answ = 1
-radioq(choices, answ, keep_order=true)
-```
-
-###### Question
-
-
-The number created by `sqrt(im)` is?
- -```julia; hold=true; echo=false; -answ = 4 -radioq(choices, answ, keep_order=true) -``` - -###### Question - - -The number created by `2^(-1)` is? - -```julia; hold=true; echo=false; -answ = 3 -radioq(choices, answ, keep_order=true) -``` - -###### Question - - - -The "number" created by `1/0` is? - -```julia; hold=true; echo=false; -answ = 3 -radioq(choices, answ, keep_order=true) -``` - -###### Question - -Is `(2 + 6) + 7` equal to `2 + (6 + 7)`? - -```julia; hold=true; echo=false; -yesnoq(true) -``` - - -###### Question - -Is `(2/10 + 6/10) + 7/10` equal to `2/10 + (6/10 + 7/10)`? - -```julia; hold=true; echo=false; -yesnoq(false) -``` - -###### Question - -The following *should* compute `2^(-1)`, which if entered directly will return `0.5`. Does it? - -```julia; eval=false -a, b = 2, -1 -a^b -``` - -```julia; hold=true; echo=false; -yesnoq(false) -``` - -(This shows the special casing that is done when powers use literal numbers.) diff --git a/CwJ/precalc/plotting.jmd b/CwJ/precalc/plotting.jmd deleted file mode 100644 index fbcda2a..0000000 --- a/CwJ/precalc/plotting.jmd +++ /dev/null @@ -1,939 +0,0 @@ -# The Graph of a Function - -This section will use the following packages: - - -```julia -using CalculusWithJulia -using Plots -``` - -```julia; echo=false; results="hidden" -using CalculusWithJulia.WeaveSupport -using Roots -using SymPy -using DataFrames - -const frontmatter = ( - title = "The Graph of a Function", - description = "Calculus with Julia: The Graph of a Function", - tags = ["CalculusWithJulia", "precalc", "the graph of a function"], -); - -nothing -``` - ----- - -A scalar, univariate function, such as $f(x) = 1 - x^2/2$, can be thought of in many different ways. For example: - -* It can be represented through a rule of what it does to $x$, as above. This is useful for computing numeric values. - -* it can be interpreted verbally, as in *square* $x$, take half then *subtract* from - one. This can give clarity to what the function does. 
- -* It can be thought of in terms of its properties: a polynomial, continuous, $U$-shaped, an approximation for $\cos(x)$ near $0$, $\dots$ - -* it can be visualized graphically. This is useful for seeing the qualitative behavior of a function. - - - -The graph of a univariate function is just a set of points in the -Cartesian plane. These points come from the relation $(x,f(x))$ that -defines the function. Operationally, a sketch of the graph will -consider a handful of such pairs and then the rest of the points will -be imputed. - -For example, a typical approach to plot $f(x) = 1 - x^2/2$ would be to choose some values for $x$ and find the corresponding values of $y$. This might be organized in a "T"-table: - -```verbatim - x | y --------- --2 | -1 --1 | 1/2 - 0 | 1 - 1 | 1/2 - 2 | -1 - 3 | -7/2 -``` - -These pairs would be plotted in a Cartesian plane and then connected with curved lines. A good sketch is aided by knowing ahead of time that this function describes a parabola which is curving downwards. - -We note that this sketch would not include *all* the pairs $(x,f(x))$, as their extent is infinite, rather a well chosen collection of points over some finite domain. - -## Graphing a function with Julia - -`Julia` has several different options for rendering graphs, all in -external packages. We will focus in these notes on the `Plots` -package, which provides a common interface to several different -plotting backends. (Click through for instructions for plotting with the [Makie](../alternatives/makie_plotting.html) package or the [PlotlyLight](alternatives/plotly_plotting.html) package.) -At the top of this section the accompanying `CalculusWithJulia` package and the `Plots` package were loaded with the `using` command, like this: - -```julia; eval=false -using CalculusWithJulia -using Plots -``` - - -!!! note - `Plots` is a frontend for one of several backends. 
`Plots` comes with a backend for web-based graphics (call `plotly()` to specify that); a backend for static graphs (call `gr()` for that). If the `PyPlot` package is installed, calling `pyplot()` will set that as a backend. For terminal usage, if the `UnicodePlots` package is installed, calling `unicodeplots()` will enable that usage. There are still other backends. - -The `plotly` backend is part of the `Plots` package, as is `gr`. Other backends require installation, such as `PyPlot` and `PlotlyJS`. -We use `gr` in these notes, for the most part. (The `plotly` backend is also quite nice for interactive usage, but doesn't work as well with the static HTML pages.) - - -With `Plots` loaded, it is straightforward to graph a function. - -For example, to graph $f(x) = 1 - x^2/2$ over the interval $[-3,3]$ we have: - -```julia; -f(x) = 1 - x^2/2 -plot(f, -3, 3) -``` - -The `plot` command does the hard work behind the scenes. It needs ``2`` pieces of information declared: - -* **What** to plot. With this invocation, this detail is expressed by passing a function object to `plot` - -* **Where** to plot; the `xmin` and `xmax` values. As with a sketch, - it is impossible in this case to render a graph with all possible - $x$ values in the domain of $f$, so we need to pick some viewing - window. In the example this is $[-3,3]$ which is expressed by - passing the two endpoints as the second and third arguments. - -Plotting a function is then this simple: `plot(f, xmin, xmax)`. - -> *A basic template:* Many operations we meet will take the form -> `action(function, args...)`, as the call to `plot` does. The -> template shifts the focus to the action to be performed. This is a -> [declarative](http://en.wikipedia.org/wiki/Declarative_programming) -> style, where the details to execute the action are only exposed as -> needed. - -!!! note - The time to first plot can feel sluggish, but subsequent plots will be speedy. 
See the technical note at the end of this section for an explanation. - - -Let's see some other graphs. - -The `sin` function over one period is plotted through: - -```julia; -plot(sin, 0, 2pi) -``` - -We can make a graph of $f(x) = (1+x^2)^{-1}$ over $[-3,3]$ with - -```julia;hold=true -f(x) = 1 / (1 + x^2) -plot(f, -3, 3) -``` - -A graph of $f(x) = e^{-x^2/2}$ over $[-2,2]$ is produced with: - -```julia;hold=true -f(x) = exp(-x^2/2) -plot(f, -2, 2) -``` - -We could skip the first step of defining a function by using an *anonymous function*. For example, to plot $f(x) = \cos(x) - x$ over $[0, \pi/2]$ we could do: - -```julia; -plot(x -> cos(x) - x, 0, pi/2) -``` - -Anonymous functions are especially helpful when parameterized functions are involved: - -```julia;hold=true -mxplusb(x; m=1, b=0) = m*x + b -plot(x -> mxplusb(x; m=-1, b=1), -1, 2) -``` - - -Had we parameterized using the `f(x,p)` style, the result would be similar: - -```julia -function mxplusb(x, p) - m, b = p.m, p.b - m * x + b -end -plot(x -> mxplusb(x, (m=-1, b=1)), -1, 2) -``` - - -!!! note - The function object in the general pattern `action(function, args...)` - is commonly specified in one of three ways: by a name, as with `f`; as an - anonymous function; or as the return value of some other action - through composition. - - -Anonymous functions are also created by `Julia's` `do` notation, which is useful when the first argument to function (like `plot`) accepts a function: - -```julia -plot(0, pi/2) do x - cos(x) - x -end -``` - -The `do` notation can be a bit confusing to read when unfamiliar, though its convenience makes it appealing. - -!!! note - Some types we will encounter, such as the one for symbolic values or the special polynomial one, have their own `plot` recipes that allow them to be plotted similarly as above, even though they are not functions. 
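As a quick sanity check on the two parameterization styles shown above, keyword arguments versus a container of parameters, we can confirm they produce identical values. This is a small sketch; the `line_kw` and `line_p` names are ours, not part of any package:

```julia
# keyword-argument style, as in mxplusb(x; m=1, b=0)
line_kw(x; m=1, b=0) = m*x + b

# container-of-parameters style, as in the f(x, p) pattern
line_p(x, p) = p.m * x + p.b

# both styles agree at a handful of test points
xs = range(-1, 2, length=7)
all(line_kw(x; m=-1, b=1) == line_p(x, (m=-1, b=1)) for x in xs)
```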
- - ----- - - - -Making a graph with `Plots` is easy, but producing a graph that is -informative can be a challenge, as the choice of a viewing window can -make a big difference in what is seen. For example, trying to make a -graph of $f(x) = \tan(x)$, as below, will result in a bit of a mess - the -chosen viewing window crosses several places where the function blows up: - -```julia;hold=true -f(x) = tan(x) -plot(f, -10, 10) -``` - - -Though this graph shows the asymptote structure and periodicity, it -doesn't give much insight into each period or even into the fact that -the function is periodic. - -## The details of graph making - -The actual details of making a graph of $f$ over $[a,b]$ are pretty simple and follow the steps in making a "T"-table: - -* A set of $x$ values are created between $a$ and $b$. - -* A corresponding set of $y$ values are created. - -* The pairs $(x,y)$ are plotted as points and connected with straight lines. - -The only real difference is that when drawing by hand, we might know -to curve the lines connecting points based on an analysis of the -function. As `Julia` doesn't consider this, the points are connected -with straight lines -- like a dot-to-dot puzzle. - -In general, the `x` values are often generated by `range` or the `colon` operator and the `y` values produced by mapping or broadcasting a function over the generated `x` values. - - -However, the plotting directive `plot(f, xmin, xmax)` calls an -adaptive algorithm to use more points where needed, as judged by -`PlotUtils.adapted_grid(f, (xmin, xmax))`. It computes both the `x` -and `y` values. This algorithm is wrapped up into the `unzip(f, xmin, -xmax)` function from `CalculusWithJulia`. The algorithm adds more -points where the function is more "curvy" and uses fewer points where -it is "straighter." 
Here we see the linear function is identified as -needing far fewer points than the oscillating function when plotted -over the same range: - -```julia -pts_needed(f, xmin, xmax) = length(unzip(f, xmin, xmax)[1]) -pts_needed(x -> 10x, 0, 10), pts_needed(x -> sin(10x), 0, 10) -``` - -(In fact, the `21` is the minimum number of points used for any function; a linear function only needs two.) - - ----- - -For instances where a *specific* set of ``x`` values is desired to be -used, the `range` function or colon operator can be used to create the -$x$ values and broadcasting used to create the $y$ values. For -example, if we were to plot $f(x) = \sin(x)$ over $[0,2\pi]$ using -$10$ points, we might do: - -```julia; -𝒙s = range(0, 2pi, length=10) -𝒚s = sin.(𝒙s) -``` - -Finally, to plot the set of points and connect with lines, the ``x`` and ``y`` values are passed along as vectors: - -```julia; -plot(𝒙s, 𝒚s) -``` - -This plots the points as pairs and then connects them in order using -straight lines. Basically, it creates a dot-to-dot graph. The above -graph looks primitive, as it doesn't utilize enough points. - - -##### Example: Reflections - -The graph of a function may be reflected through a line, as those seen with a mirror. For example, a reflection through the $y$ axis takes a point $(x,y)$ to the point $(-x, y)$. We can easily see this graphically, when we have sets of $x$ and $y$ values through a judiciously placed minus sign. - -For example, to plot $\sin(x)$ over $(-\pi,\pi)$ we might do: - -```julia; -xs = range(-pi, pi, length=100) -ys = sin.(xs) -plot(xs, ys) -``` - -To reflect this graph through the $y$ axis, we only need to plot `-xs` and not `xs`: - -```julia; -plot(-xs, ys) -``` - -Looking carefully we see there is a difference. (How?) - -There are four very common reflections: - -- reflection through the $y$-axis takes $(x,y)$ to $(-x, y)$. - -- reflection through the $x$-axis takes $(x,y)$ to $(x, -y)$. 

- reflection through the origin takes $(x,y)$ to $(-x, -y)$.

- reflection through the line $y=x$ takes $(x,y)$ to $(y,x)$.

For the $\sin(x)$ graph, we see that reflecting through the $x$ axis
produces the same graph as reflecting through the $y$ axis:

```julia;
plot(xs, -ys)
```

However, reflecting through the origin leaves this graph unchanged:

```julia;
plot(-xs, -ys)
```

> An *even function* is one where reflection through the $y$ axis
> leaves the graph unchanged. That is, $f(-x) = f(x)$. An *odd function*
> is one where a reflection through the origin leaves the
> graph unchanged, or symbolically $f(-x) = -f(x)$.


If we try reflecting the graph of $\sin(x)$ through the line $y=x$, we have:

```julia;
plot(ys, xs)
```

This is the graph of the equation $x = \sin(y)$, but it is not the graph of a function, as the same $x$ can map to more than one $y$ value. (The new graph does not pass the "vertical line" test.)


However, for the sine function we can get a function from this reflection if we choose a narrower viewing window:

```julia;hold=true
xs = range(-pi/2, pi/2, length=100)
ys = sin.(xs)
plot(ys, xs)
```

The graph is that of the "inverse function" for $\sin(x), x \text{ in } [-\pi/2, \pi/2]$.

#### The `plot(xs, f)` syntax

When plotting a univariate function there are three basic patterns that can be employed. We have examples above of:

* `plot(f, xmin, xmax)`, which uses an adaptive algorithm to identify values for ``x`` in the interval `[xmin, xmax]`, and
* `plot(xs, f.(xs))`, which manually chooses the values of ``x`` to plot points for.

Finally, there is a merging of these two, following either of these patterns:

* `plot(f, xs)` *or* `plot(xs, f)`

Both require a manual choice of the ``x``-values to
plot, but the broadcasting is carried out in the `plot` command.
This
style is convenient, for example, to down sample the ``x`` range to
see the plotting mechanics, such as:

```julia;
plot(0:pi/4:2pi, sin)
```


#### NaN values


At times it is not desirable to draw lines between each successive
point, for example when there is a discontinuity in the function or a
vertical asymptote, such as what happens at $0$ with
$f(x) = 1/x$.

The most straightforward plot is dominated by the vertical asymptote at ``x=0``:

```julia
q(x) = 1/x
plot(q, -1, 1)
```

We can attempt to improve this graph by adjusting the viewport. The
*viewport* of a graph is the $x$-$y$ range of the viewing window. By
default, the $y$-part of the viewport is determined by the range of
the function over the specified interval, $[a,b]$. As just seen, this
approach can produce poor graphs. The `ylims=(ymin, ymax)` argument
can modify what part of the ``y`` axis is shown. (Similarly,
`xlims=(xmin, xmax)` will modify the viewport in the ``x`` direction.)

As we see, even with this adjustment, the spurious line connecting the
points with ``x`` values closest to ``0`` is still drawn:

```julia
plot(q, -1, 1, ylims=(-10,10))
```


The dot-to-dot algorithm, at some level, assumes the underlying function is continuous; here ``q(x)=1/x`` is not.


There is a convention for most plotting programs that **if** the $y$
value for a point is `NaN`, then no lines will connect to that point,
`(x,NaN)`. `NaN` conveniently appears in many cases where a plot may
have an issue, though not with $1/x$, as `1/0` is `Inf` and not
`NaN`. (Unlike, say, `0/0`, which is `NaN`.)

Here is one way to plot $q(x) = 1/x$ over $[-1,1]$ taking advantage of this convention:

```julia;hold=true
xs = range(-1, 1, length=251)
ys = q.(xs)
ys[xs .== 0.0] .= NaN
plot(xs, ys)
```

By using an odd number of points, we ensure that $0.0$ is amongst the `xs`. The next to last line replaces the $y$ value that would be infinite with `NaN`.
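Two claims above are easy to verify at the `REPL`: that `1/0` is `Inf` while `0/0` is `NaN`, and that an odd number of evenly spaced points over $[-1,1]$ includes `0.0` exactly. A quick check, not needed for the plot itself:

```julia
# division by zero overflows to Inf; the indeterminate 0/0 is NaN
isinf(1/0), isnan(0/0)   # (true, true)

# an odd-length range over a symmetric interval hits the midpoint exactly
xs = range(-1, 1, length=251)
0.0 in xs
```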
- - -As a recommended alternative, we might modify the function so that if it is too large, the values are replaced by `NaN`. Here is one such function consuming a function and returning a modified function put to use to make this graph: - -```julia -rangeclamp(f, hi=20, lo=-hi; replacement=NaN) = x -> lo < f(x) < hi ? f(x) : replacement -plot(rangeclamp(x -> 1/x), -1, 1) -``` - -(The `clamp` function is a base `Julia` function which clamps a number between `lo` and `hi`, returning `lo` or `hi` if `x` is outside that range.) - - -## Layers - -Graphing more than one function over the same viewing window is often -desirable. Though this is easily done in `Plots` by specifying a vector of -functions as the first argument to `plot` instead of a single function -object, we instead focus on building the graph layer by layer. - - -For example, to see that a polynomial and the cosine function are -"close" near $0$, we can plot *both* $\cos(x)$ and the function $f(x) -= 1 - x^2/2$ over $[-\pi/2,\pi/2]$: - -```julia;hold=true -f(x) = 1 - x^2/2 -plot(cos, -pi/2, pi/2, label="cos") -plot!(f, -pi/2, pi/2, label="f") -``` - - -Another useful function to add to a plot is one to highlight the $x$ -axis. This makes identifying zeros of the function easier. The -anonymous function `x -> 0` will do this. But, perhaps less cryptically, -so will the base function `zero`. For example - -```julia;hold=true -f(x) = x^5 - x + 1 -plot(f, -1.5, 1.4, label="f") -plot!(zero, label="zero") -``` - -(The job of `zero` is to return "``0``" in the appropriate type. There is also a similar `one` function in base `Julia`.) - - - -The `plot!` call adds a layer. We could still specify the limits for the plot, though as this can be computed from the figure, to plot `zero` we let `Plots` do it. - -For another example, suppose we wish to plot the function $f(x)=x\cdot(x-1)$ -over the interval $[-1,2]$ and emphasize with points the fact that $0$ -and $1$ are zeros. 
We can do this with three layers: the first to graph
the function, the second to emphasize the ``x`` axis, and the third to graph the points.

```julia;hold=true
f(x) = x*(x-1)
plot(f, -1, 2, legend=false) # turn off legend
plot!(zero)
scatter!([0,1], [0,0])
```

The ``3`` main functions used in these notes for adding layers are:

* `plot!(f, a, b)` to add the graph of the function `f`; also `plot!(xs, ys)`
* `scatter!(xs, ys)` to add points $(x_1, y_1), (x_2, y_2), \dots$
* `annotate!((x, y, label))` to add a label at $(x,y)$


!!! warning
    Julia has a convention to use functions named with a `!` suffix to
    indicate that they mutate some object. In this case, the object is the
    current graph, though it is implicit. `plot!`, `scatter!`, and
    `annotate!` (among others) all do this by adding a layer.


## Additional arguments

The `Plots` package provides many arguments for adjusting a graphic; here we mention just a few of the [attributes](https://docs.juliaplots.org/latest/attributes/):

* `plot(..., title="main title", xlab="x axis label", ylab="y axis label")`: add title and label information to a graphic
* `plot(..., color="green")`: adjust the color of the drawn figure (the color can be a string, `"green"`, or a symbol, `:green`, among other specifications)
* `plot(..., linewidth=5)`: adjust the width of drawn lines
* `plot(..., xlims=(a,b), ylims=(c,d))`: either or both `xlims` and `ylims` can be used to control the viewing window
* `plot(..., linestyle=:dash)`: change the line style of the plotted lines to dashed lines; also `:dot`, ...
* `plot(..., aspect_ratio=:equal)`: keep the $x$ and $y$ axes on the same scale so that squares look square
* `plot(..., legend=false)`: by default, different layers will be indicated with a legend; this will turn that feature off
* `plot(..., label="a label")`: the `label` attribute will show up when a legend is present.
Using an empty string, `""`, will suppress adding the layer to the legend.

For plotting points with `scatter` or `scatter!`, the markers can be adjusted via

* `scatter(..., markersize=5)`: increase the marker size
* `scatter(..., marker=:square)`: change the marker (specified with a symbol, not a string)

Of course, zero, one, or more of these can be used on any given call to `plot`, `plot!`, `scatter`, or `scatter!`.



## Graphs of parametric equations

If we have two functions $f(x)$ and $g(x)$ there are a few ways to
investigate their joint behavior. As just mentioned, we can graph
both $f$ and $g$ over the same interval using layers. Such a graph
allows an easy comparison of the shape of the two functions and can be
useful in solving $f(x) = g(x)$. For the latter, the graph of $h(x) =
f(x) - g(x)$ is also of value: solutions to $f(x)=g(x)$ appear as
crossing points on the graphs of `f` and `g`, whereas they appear as zeros
(crossings of the $x$-axis) when `h` is plotted.

A different graph can be made to compare the two functions side-by-side. This is
a parametric plot. Rather than plotting points $(x,f(x))$ and
$(x,g(x))$ with two separate graphs, the graph consists of points
$(f(x), g(x))$. We illustrate with some examples below.

##### Example

The most "famous" parametric graph is one that is likely already familiar, as it follows the parametrization of points on the unit circle by the angle made between the ``x`` axis and the ray from the origin through the point. (If not familiar, this will soon be discussed in these notes.)

```julia;
𝒇(x) = cos(x); 𝒈(x) = sin(x)
𝒕s = range(0, 2pi, length=100)
plot(𝒇.(𝒕s), 𝒈.(𝒕s), aspect_ratio=:equal) # make equal axes
```

Any point $(a,b)$ on this graph is represented by $(\cos(t), \sin(t))$
for some value of $t$, and in fact multiple values of $t$, since $t +
2k\pi$ will produce the same $(a,b)$ value as $t$ will.
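The fact that multiple values of $t$ land on the same point can be checked numerically. The following sketch uses plain `cos` and `sin` in place of the fancier names above:

```julia
c(t) = cos(t); s(t) = sin(t)

t = 1.2
for k in 1:3
    # t and t + 2kπ parameterize the same point on the circle
    @assert isapprox(c(t), c(t + 2k*pi); atol=1e-12)
    @assert isapprox(s(t), s(t + 2k*pi); atol=1e-12)
end

# and every such point lies on the unit circle: a^2 + b^2 = 1
c(t)^2 + s(t)^2
```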

Making the parametric plot is similar to creating a plot using lower
level commands. There, a sequence of values is generated to
approximate the $x$ values in the graph (`xs`), the corresponding
function values are created (e.g., `f.(xs)`), and some instruction is
given on how to represent the values, in this case with lines
connecting the points (the default for `plot` with two sets of numbers).

In this next plot, the angle values are chosen to be the familiar ones, so the mechanics of the graph can be emphasized. Only the upper half is plotted:


```julia; hold=true; echo=false;
θs = [0, PI/6, PI/4, PI/3, PI/2, 2PI/3, 3PI/4, 5PI/6, PI]
DataFrame(θ=θs, x=cos.(θs), y=sin.(θs))
```

```julia;hold=true;
θs = [0, pi/6, pi/4, pi/3, pi/2, 2pi/3, 3pi/4, 5pi/6, pi]
plot(𝒇.(θs), 𝒈.(θs), legend=false, aspect_ratio=:equal)
scatter!(𝒇.(θs), 𝒈.(θs))
```

---

As with the plot of a univariate function, there is a convenience interface for these plots: just pass the two functions in:

```julia;
plot(𝒇, 𝒈, 0, 2pi, aspect_ratio=:equal)
```

##### Example

Looking at growth: comparing $x^2$ with $x^3$ can run into issues, as the scale gets big:

```julia;
x²(x) = x^2
x³(x) = x^3
plot(x², 0, 25)
plot!(x³, 0, 25)
```

In the above, `x³` is already $25$ times larger than `x²` at the right edge of
$[0,25]$, and this only gets worse if the viewing window gets
larger. However, the parametric graph is quite different:

```julia;
plot(x², x³, 0, 25)
```

In this graph, since $x^3/x^2 = x$, the ratio of the two coordinates stays reasonable as $x$ gets large.

##### Example

Parametric plots are useful to compare the ratio of values near a
point. In the above example, we see how this is helpful for large
`x`. This example shows it is convenient for a fixed `x`, in this case
`x=0`.

Plot $f(x) = x^3$ and $g(x) = x - \sin(x)$ around $x=0$:

```julia;hold=true
f(x) = x^3
g(x) = x - sin(x)
plot(f, g, -pi/2, pi/2)
```

This graph is *nearly* a straight line.
At the point $(f(0), g(0)) = (0,0)$, we see that both functions are behaving in a similar manner,
though the slope is not $1$, so they do not increase at exactly the
same rate.

##### Example: Etch A Sketch

[Etch A Sketch](http://en.wikipedia.org/wiki/Etch_A_Sketch) is a
drawing toy where two knobs control the motion of a pointer, one
knob controlling the $x$ motion, the other the $y$ motion. The trace
of the movement of the pointer is recorded until the display is
cleared by shaking. Shake-to-clear is now a motion incorporated by some smart-phone apps.

Playing with the toy makes a few things clear:

* Twisting just the left knob (the horizontal or $x$ motion) will move
  the pointer left or right, leaving a horizontal
  line. Parametrically, this would follow the equations $f(t) =
  \xi(t)$ for some $\xi$ and $g(t) = c$.

* Twisting just the right knob (the vertical or $y$ motion) will move
  the pointer up or down, leaving a vertical line. Parametrically, this would follow the
  equations $f(t) = c$ and $g(t) = \psi(t)$ for some $\psi$.

* Drawing a line with a slope different from $0$ or $\infty$ requires
  moving both knobs at the same time. A ``45``$^\circ$ line with slope $m=1$
  can be made by twisting both at the same rate, say through $f(t) =
  ct$, $g(t)=ct$. It doesn't matter how big $c$ is, just that it is
  the same for both $f$ and $g$. Creating a different slope is done by
  twisting at different rates, say $f(t)=ct$ and $g(t)=dt$. The slope
  of the resulting line will be $d/c$.

* Drawing a curve is done by twisting the two knobs with varying rates.

These all apply to parametric plots, as the Etch A Sketch trace is no
more than a plot of $(f(t), g(t))$ over some range of values for $t$,
where $f$ describes the movement in time of the left knob and $g$ the
movement in time of the right.

Now, we revisit the last problem in this context. 
We saw in -the last problem that the parametric graph was nearly a line - so -close the eye can't really tell otherwise. That means that the -growth in both $f(t) = t^3$ and $g(t)=t - \sin(t)$ for $t$ around -$0$ are in a nearly fixed ratio, as otherwise the graph would have more -curve in it. - -##### Example: Spirograph - -Parametric plots can describe a richer set of curves than can plots of -functions. Plots of functions must pass the "vertical-line test", as -there can be at most one $y$ value for a given $x$ value. This is not -so for parametric plots, as the circle example above shows. Plotting sines and cosines this -way is the basis for the once popular -[Spirograph](http://en.wikipedia.org/wiki/Spirograph#Mathematical_basis) toy. The curves -drawn there are parametric plots where the functions come from rolling -a smaller disc either around the outside or inside of a larger disc. - -Here is an example using a parameterization provided on the Wikipedia -page where $R$ is the radius of the larger disc, $r$ the radius of the -smaller disc and $\rho < r$ indicating the position of the pencil -within the smaller disc. - -```julia;hold=true -R, r, rho = 1, 1/4, 1/4 -f(t) = (R-r) * cos(t) + rho * cos((R-r)/r * t) -g(t) = (R-r) * sin(t) - rho * sin((R-r)/r * t) - -plot(f, g, 0, max((R-r)/r, r/(R-r))*2pi) -``` - -In the above, one can fix $R=1$. Then different values for `r` and -`rho` will produce different graphs. These graphs will be periodic if -$(R-r)/r$ is a rational. (Nothing about these equations requires $\rho < r$.) - -## Questions - -###### Question - - -Plot the function $f(x) = x^3 - x$. When is the function positive? - -```julia; hold=true;echo=false -choices = ["`(-Inf, -1)` and `(0,1)`", - "`(-Inf, -0.577)` and `(0.577, Inf)`", - "`(-1, 0)` and `(1, Inf)`" - ]; -answ=3; -radioq(choices, answ) -``` - - -###### Question - - -Plot the function $f(x) = 3x^4 + 8x^3 - 18x^2$. Where (what $x$ value) -is the smallest value? 
(That is, for which input $x$ is the output
$f(x)$ as small as possible?)

```julia; hold=true;echo=false
f(x) = 3x^4 + 8x^3 - 18x^2
val = -3;
numericq(val, 0.25)
```

###### Question


Plot the function $f(x) = 3x^4 + 8x^3 - 18x^2$. When is the function increasing?

```julia; hold=true;echo=false
choices = ["`(-Inf, -3)` and `(0, 1)`",
           "`(-3, 0)` and `(1, Inf)`",
           "`(-Inf, -4.1)` and `(1.455, Inf)`"
           ];
answ=2;
radioq(choices, answ)
```

###### Question

Graphing both `f` and the line ``y=0`` helps focus on the *zeros* of `f`. When
`f(x)=log(x)-2`, plot `f` and the line ``y=0``. Identify the lone zero.

```julia; hold=true;echo=false
val = find_zero(x -> log(x) - 2, 8)
numericq(val, .5)
```


###### Question

Plot the function $f(x) = x^3 - x$ over $[-2,2]$. How many zeros are there?

```julia; hold=true;echo=false
val = 3;
numericq(val, 1e-16)
```

###### Question


The function $f(x) = (x^3 - 2x) / (2x^2 -10)$ is a rational function
with issues when $2x^2 = 10$, that is, when $x = -\sqrt{5}$ or $x = \sqrt{5}$.

Plot this function from $-5$ to $5$. How many times does it cross the $x$ axis?

```julia; hold=true;echo=false
val = 3;
numericq(val, .2)
```

###### Question

A trash collection plan charges a flat rate of 35 dollars a month for
the first 10 bags of trash and 4 dollars a bag thereafter. Which
function will model this?


```julia; hold=true;echo=false
choices = [
"`f(x) = x <= 35.0 ? 10.0 : 10.0 + 35.0 * (x-4)`",
"`f(x) = x <= 4 ? 35.0 : 35.0 + 10.0 * (x-4)`",
"`f(x) = x <= 10 ? 35.0 : 35.0 + 4.0 * (x-10)`"
]
answ = 3
radioq(choices, answ)
```

Make a plot of the model. Graphically estimate how many bags of trash will cost 55 dollars.

```julia; hold=true;echo=false
answ = 15
numericq(answ, .5)
```

###### Question

Plot the functions $f(x) = \cos(x)$ and $g(x) = x$. Estimate the $x$ value where the two graphs intersect.
- -```julia; hold=true;echo=false -val = find_zero(x -> cos(x) -x, .7) -numericq(val, .25) -``` - - -###### Question - -The fact that only a finite number of points are used in a graph can -introduce artifacts. An example can appear when plotting -[sinusoidal](http://en.wikipedia.org/wiki/Aliasing#Sampling_sinusoidal_functions) -functions. An example is the graph of `f(x) = sin(500*pi*x)` over `[0,1]`. - -Make its graph using 250 evenly spaced points, as follows: - -```julia; hold=true;eval=false;results="hidden" -xs = range(0, 1, length=250) -f(x) = sin(500*pi*x) -plot(xs, f.(xs)) -``` - -What is seen? - -```julia; hold=true;echo=false -choices = [L"It oscillates wildly, as the period is $T=2\pi/(500 \pi)$ so there are 250 oscillations.", - "It should oscillate evenly, but instead doesn't oscillate very much near 0 and 1", - L"Oddly, it looks exactly like the graph of $f(x) = \sin(2\pi x)$."] -answ = 3 -radioq(choices, answ) -``` - -The algorithm to plot a function works to avoid aliasing issues. Does the graph generated by `plot(f, 0, 1)` look the same, as the one above? - -```julia; hold=true;echo=false -choices = ["Yes", -"No, but is still looks pretty bad, as fitting 250 periods into a too small number of pixels is a problem.", -"No, the graph shows clearly all 250 periods." -] -answ = 2 -radioq(choices, answ) -``` - - - -###### Question - -Make this parametric plot for the specific values of the parameters `k` and `l`. What shape best describes it? 
- -```julia; hold=true;eval=false -R, r, rho = 1, 3/4, 1/4 -f(t) = (R-r) * cos(t) + rho * cos((R-r)/r * t) -g(t) = (R-r) * sin(t) - rho * sin((R-r)/r * t) - -plot(f, g, 0, max((R-r)/r, r/(R-r))*2pi, aspect_ratio=:equal) -``` - -```julia; hold=true;echo=false -choices = [ -"Four sharp points, like a star", -"Four petals, like a flower", -"An ellipse", -"A straight line" -] -answ = 2 -radioq(choices, answ) -``` - - - -###### Question - -For these next questions, we use this function: - -```julia; -function spirograph(R, r, rho) - f(t) = (R-r) * cos(t) + rho * cos((R-r)/r * t) - g(t) = (R-r) * sin(t) - rho * sin((R-r)/r * t) - - plot(f, g, 0, max((R-r)/r, r/(R-r))*2pi, aspect_ratio=:equal) -end -``` - -Make this plot for the following specific values of the parameters `R`, `r`, and `rho`. What shape best describes it? - -```julia; hold=true;eval=false -R, r, rho = 1, 3/4, 1/4 -``` - -```julia; hold=true;echo=false -choices = [ -"Four sharp points, like a star", -"Four petals, like a flower", -"An ellipse", -"A straight line", -"None of the above" -] -answ = 1 -radioq(choices, answ, keep_order=true) -``` - - -Make this plot for the following specific values of the parameters `R`, `r`, and `rho`. What shape best describes it? - -```julia; hold=true;eval=false -R, r, rho = 1, 1/2, 1/4 -``` - -```julia; hold=true;echo=false -choices = [ -"Four sharp points, like a star", -"Four petals, like a flower", -"An ellipse", -"A straight line", -"None of the above" -] -answ = 3 -radioq(choices, answ,keep_order=true) -``` - - -Make this plot for the specific values of the parameters `R`, `r`, and `rho`. What shape best describes it? 

```julia; hold=true;eval=false
R, r, rho = 1, 1/4, 1
```

```julia; hold=true;echo=false
choices = [
"Four sharp points, like a star",
"Four petals, like a flower",
"A circle",
"A straight line",
"None of the above"
]
answ = 2
radioq(choices, answ, keep_order=true)
```



Make this plot for the specific values of the parameters `R`, `r`, and `rho`. What shape best describes it?

```julia; hold=true;eval=false
R, r, rho = 1, 1/8, 1/4
```

```julia; hold=true;echo=false
choices = [
"Four sharp points, like a star",
"Four petals, like a flower",
"A circle",
"A straight line",
"None of the above"
]
answ = 5
radioq(choices, answ, keep_order=true)
```

----

## Technical note

The slow "time to first plot" in `Julia` is a well-known hiccup that is related to how `Julia` can be so fast. Loading `Plots` and making the first plot are both somewhat time consuming, though the second and subsequent plots are speedy. Why?

`Julia` is an interactive language that attains its speed by compiling functions on the fly using the [LLVM](https://llvm.org) compiler. When `Julia` encounters a new combination of a function method and argument types, it will compile and cache a function for subsequent speedy execution. The first plot is slow, as there are many internal functions that get compiled. This has improved of late, as excessive recompilations have been trimmed down, but it still has a way to go. This is different from "precompilation," which also helps trim down time for initial executions. There are also more technically challenging means to create `Julia` images for faster start up that can be pursued if needed.
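A minimal illustration of this compile-on-first-call behavior, using a plain function rather than plotting (timings vary by machine, but the pattern should hold):

```julia
h(x) = sin(x)^2 + cos(x)^2

t_first  = @elapsed h(1.0)   # includes compiling a method for h(::Float64)
t_second = @elapsed h(1.0)   # the cached, compiled method is reused

t_second <= t_first          # nearly always true
```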
diff --git a/CwJ/precalc/polynomial.jmd b/CwJ/precalc/polynomial.jmd deleted file mode 100644 index 402de24..0000000 --- a/CwJ/precalc/polynomial.jmd +++ /dev/null @@ -1,867 +0,0 @@ -# Polynomials - -In this section we use the following add-on packages: - - -```julia -using SymPy -using Plots -``` - -```julia; echo=false; results="hidden" -using CalculusWithJulia -using CalculusWithJulia.WeaveSupport - -fig_size = (800, 600) #400, 300) - -const frontmatter = ( - title = "Polynomials", - description = "Calculus with Julia: Polynomials", - tags = ["CalculusWithJulia", "precalc", "polynomials"], -); -nothing -``` - - ----- - -Polynomials are a particular class of expressions that are simple -enough to have many properties that can be analyzed. In particular, -the key concepts of calculus: limits, continuity, derivatives, and -integrals are all relatively trivial for polynomial -functions. However, polynomials are flexible enough that they can be -used to approximate a wide variety of functions. Indeed, though we -don't pursue this, we mention that `Julia`'s `ApproxFun` package -exploits this to great advantage. - -Here we discuss some vocabulary and basic facts related to polynomials -and show how the add-on `SymPy` package can be used to model -polynomial expressions within `SymPy`. -`SymPy` provides a Computer Algebra System (CAS) for `Julia`. In this case, by leveraging a mature `Python` package [SymPy](https://www.sympy.org/). -Later we will discuss the `Polynomials` package -for polynomials. - -For our purposes, a *monomial* is simply a non-negative integer power -of $x$ (or some other indeterminate symbol) possibly multiplied by a -scalar constant. For example, $5x^4$ is a monomial, as are constants, -such as $-2=-2x^0$ and the symbol itself, as $x = x^1$. 
In general, -one may consider restrictions on where the constants can come from, -and consider more than one symbol, but we won't pursue this here, -restricting ourselves to the case of a single variable and real -coefficients. - -A *polynomial* is a sum of monomials. After -combining terms with same powers, a non-zero polynomial may be written uniquely -as: - -```math -a_n x^n + a_{n-1}x^{n-1} + \cdots a_1 x + a_0, \quad a_n \neq 0 -``` - -```julia; hold=true; echo=false; cache=true -##{{{ different_poly_graph }}} - - -anim = @animate for m in 2:2:10 - fn = x -> x^m - plot(fn, -1, 1, size = fig_size, legend=false, title="graph of x^{$m}", xlims=(-1,1), ylims=(-.1,1)) -end - -imgfile = tempname() * ".gif" -gif(anim, imgfile, fps = 1) -caption = "Polynomials of varying even degrees over ``[-1,1]``." - -ImageFile(imgfile, caption) -``` - - - -The numbers $a_0, a_1, \dots, a_n$ are the **coefficients** of the -polynomial in the standard basis. With the identifications that $x=x^1$ and $1 = x^0$, the -monomials above have their power match their coefficient's index, -e.g., $a_ix^i$. Outside of the coefficient $a_n$, the other -coefficients may be negative, positive, *or* $0$. Except for the zero -polynomial, the largest power $n$ is called the -[degree](https://en.wikipedia.org/wiki/Degree_of_a_polynomial). The -degree of the [zero](http://tinyurl.com/he6eg6s) polynomial is typically not -defined or defined to be $-1$, so as to make certain statements easier to express. The term -$a_n$ is called the **leading coefficient**. When the leading -coefficient is $1$, the polynomial is called a **monic polynomial**. -The monomial $a_n x^n$ is the **leading term**. - -For example, the polynomial $-16x^2 - 32x + 100$ has degree $2$, -leading coefficient $-16$ and leading term $-16x^2$. It is not monic, -as the leading coefficient is not ``1``. 
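One reason the leading term earns a name of its own is that it dominates the polynomial for large $|x|$. A quick numeric sketch with the example above:

```julia
p(x) = -16x^2 - 32x + 100    # degree 2, leading term -16x^2
leading(x) = -16x^2

# the ratio p(x) / (-16x^2) tends to 1 as x grows
[p(x)/leading(x) for x in (10, 100, 1000)]
```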
-
-Lower degree polynomials have special names: a degree $0$ polynomial
-($a_0$) is a non-zero constant, a degree ``1`` polynomial ($a_0+a_1x$) is called
-linear, a degree $2$ polynomial is quadratic, and a degree $3$ polynomial is called cubic.
-
-## Linear polynomials
-
-A special place is reserved for polynomials with degree ``1``. These are
-linear, as their graphs are straight lines. The general form,
-
-```math
-a_1 x + a_0, \quad a_1 \neq 0,
-```
-
-is often written as $mx + b$, which is the **slope-intercept** form. The slope of a line determines how steeply it rises. The value
-of $m$ can be found from two points through the well-known formula:
-
-```math
-m = \frac{y_1 - y_0}{x_1 - x_0} = \frac{\text{rise}}{\text{run}}
-```
-
-```julia; hold=true; echo=false; cache=true
-### {{{ lines_m_graph }}}
-
-anim = @animate for m in [-5, -2, -1, 1, 2, 5, 10, 20]
-    fn = x -> m * x
-    plot(fn, -1, 1, size = fig_size, legend=false, title="m = $m", xlims=(-1,1), ylims=(-20, 20))
-end
-
-imgfile = tempname() * ".gif"
-gif(anim, imgfile, fps = 1)
-caption = "Graphs of y = mx for different values of m"
-
-ImageFile(imgfile, caption)
-```
-
-The intercept, $b$, comes from the fact that when $x=0$ the expression
-is $b$. That is, the graph of the function $f(x) = mx + b$ will have
-$(0,b)$ as a point on it.
-
-More generally, we have the **point-slope** form of a line, written as a polynomial through
-
-```math
-y_0 + m \cdot (x - x_0).
-```
-
-The slope is $m$ and the line passes through the point $(x_0, y_0)$. Again, the line graphing
-this as a function of $x$ would have the point $(x_0,y_0)$ on it and
-have slope $m$. This form is more useful in calculus, as the
-information we conveniently have is more likely to be related to a
-specific value of $x$, not the special value $x=0$.
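To make the two forms concrete, here is a small sketch (the two points are made up for illustration) that computes the slope from two points and checks that the point-slope form recovers both of them:

```julia
# two made-up points on a line
x0, y0 = 1, 3
x1, y1 = 4, 9

m = (y1 - y0) / (x1 - x0)   # rise over run: 2.0
f(x) = y0 + m * (x - x0)    # the point-slope form as a function

f(x0) == y0 && f(x1) == y1  # true: the line passes through both points
```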
-
-Thinking in terms of transformations, this looks like the function
-$f(x) = x$ (whose graph is a line with slope ``1``) stretched in the $y$
-direction by a factor of $m$, then shifted right by $x_0$ units, and
-then shifted up by $y_0$ units. When $m>1$, this means the line grows
-faster. When $m< 0$, the line $f(x)=x$ is flipped through the
-$x$-axis, so it would head downwards, not upwards like $f(x) = x$.
-
-## Symbolic math in Julia
-
-The indeterminate value `x` (or some other symbol) in a polynomial is
-like a variable in a function and unlike a variable in `Julia`. Variables in `Julia`
-are identifiers, just a means to look up a specific, already determined,
-value. The symbol `x`, rather, is not yet determined; it is essentially a placeholder for a future value. Although we have
-seen that `Julia` makes it very easy to work with mathematical
-functions, it is not the case that base `Julia` makes working with
-expressions of algebraic symbols easy. This makes sense, as `Julia` is
-primarily designed for technical computing, where numeric approaches
-rule the day. However, symbolic math can be used from within `Julia` through add-on packages.
-
-Symbolic math programs include well-known ones like the commercial
-programs Mathematica and Maple. Mathematica powers the popular
-[WolframAlpha](www.wolframalpha.com) website, which turns "natural"
-language into the specifics of a programming language. The open-source
-Sage project is an alternative to these two commercial giants. It
-includes a wide range of open-source math projects available within
-its umbrella framework. (`Julia` can even be run from within the free
-service [cloud.sagemath.com](https://cloud.sagemath.com/projects).) A
-more focused project for symbolic math is the [SymPy](www.sympy.org)
-Python library. SymPy is also used within Sage. However, SymPy
-provides a self-contained library that can be used standalone within a
-Python session.
That is great for `Julia` users, as the `PyCall` and
-`PythonCall` packages glue `Julia` to Python in a seamless
-manner. This allows the `Julia` package `SymPy` to provide
-functionality from SymPy within `Julia`.
-
-!!! note
-    When `SymPy` is installed through the package manager, the underlying `Python`
-    libraries will also be installed.
-
-!!! note
-    The [`Symbolics`](../alternatives/symbolics) package is a rapidly
-    developing `Julia`-only package that provides symbolic math options.
-
-----
-
-To use `SymPy`, we create symbolic objects to be our indeterminate
-symbols. The `symbols` function does this. However, we will use the more convenient `@syms` macro front end for `symbols`.
-
-
-```julia;
-@syms a, b, c, x::real, zs[1:10]
-```
-
-The above shows that multiple symbols can be defined at once. The
-annotation `x::real` instructs `SymPy` to assume `x` is real, as
-otherwise it assumes it is possibly complex. There are many other
-[assumptions](http://docs.sympy.org/dev/modules/core.html#module-sympy.core.assumptions)
-that can be made. The `@syms` macro documentation lists them. The
-`zs[1:10]` tensor notation creates a container with ``10`` different
-symbols. The *macro* `@syms` does not need assignment, as the
-variable(s) are created behind the scenes by the macro.
-
-!!! note
-    Macros in `Julia` are just transformations of the syntax into other syntax. The `@` indicates they behave differently than regular function calls.
-
-
-The `SymPy` package does three basic things:
-
-- It imports some of the functionality provided by `SymPy`, including the ability to create symbolic variables.
-
-- It overloads many `Julia` functions to work seamlessly with symbolic expressions. This makes working with polynomials quite natural.
-
-- It gives access to a wide range of SymPy's functionality through the `sympy` object.
-
-To illustrate, using the just defined `x`, here is how we can create the polynomial $-16x^2 + 100$:
-
-```julia;
-𝒑 = -16x^2 + 100
-```
-
-That is, the expression is created just as you would create it within
-a function body. But here the result is still a symbolic object. We
-have assigned this expression to the variable `𝒑`, and have not defined
-it as a function `𝒑(x)`. Mentally keeping the distinction between symbolic
-expressions and functions is very important.
-
-The `typeof` function shows that `𝒑` is of a symbolic type (`Sym`):
-
-```julia;
-typeof(𝒑)
-```
-
-We can mix and match symbolic objects. This command creates an
-arbitrary quadratic polynomial:
-
-```julia;
-quad = a*x^2 + b*x + c
-```
-
-Again, this is entered in a manner nearly identical to how we see such
-expressions typeset ($ax^2 + bx+c$), though we must remember to
-explicitly place the multiplication operator, as the symbols are not
-numeric literals.
-
-
-We can apply many of `Julia`'s mathematical functions and the result will still be symbolic:
-
-```julia;
-sin(a*(x - b*pi) + c)
-```
-
-Another example might be the following combination:
-
-```julia;
-quad + quad^2 - quad^3
-```
-
-One way to create symbolic expressions is simply to call a `Julia` function with symbolic arguments. The first line in the next example defines a function, the second evaluates it at the symbols `x`, `a`, and `b`, resulting in a symbolic expression `ex`:
-
-```julia
-f(x, m, b) = m*x + b
-ex = f(x, a, b)
-```
-
-
-## Substitution: subs, replace
-
-Algebraically working with symbolic expressions is straightforward. A
-different symbolic task is substitution. For example, replacing each
-instance of `x` in a polynomial with, say, `(x-1)^2`. Substitution
-requires three things to be specified: an expression to work on, a
-variable to substitute, and a value to substitute in.
-
-
-SymPy provides its `subs` function for this.
This function is available in `Julia`, but it is easier to use notation reminiscent of function evaluation. - -To illustrate, to do -the task above for the polynomial $-16x^2 + 100$ we could have: - -```julia; -𝒑(x => (x-1)^2) -``` - -This "call" notation takes pairs (designated by `a=>b`) where the left-hand side is the variable to substitute for, and the right-hand side the new value. -The value to substitute can depend on the variable, as illustrated; be -a different variable; or be a numeric value, such as $2$: - -```julia; -𝒚 = 𝒑(x=>2) -``` - -The result will always be of a symbolic type, even if the answer is -just a number: - -```julia; -typeof(𝒚) -``` - - -If there is just one free variable in an expression, the pair notation can be dropped: - -```julia; -𝒑(4) # substitutes x=>4 -``` - - -##### Example - -Suppose we have the polynomial $p = ax^2 + bx +c$. What would it look -like if we shifted right by $E$ units and up by $F$ units? - -```julia; -@syms E F -p₂ = a*x^2 + b*x + c -p₂(x => x-E) + F -``` - -And expanded this becomes: - -```julia; -expand(p₂(x => x-E) + F) -``` - - -### Conversion of symbolic numbers to Julia numbers - -In the above, we substituted `2` in for `x` to get `y`: - -```julia; hold=true -p = -16x^2 + 100 -y = p(2) -``` - -The value, $36$ is still symbolic, but clearly an integer. If we -are just looking at the output, we can easily translate from the -symbolic value to an integer, as they print similarly. However the -conversion to an integer, or another type of number, does not happen -automatically. If a number is needed to pass along to another `Julia` -function, it may need to be converted. In general, conversions between -different types are handled through various methods of -`convert`. 
However, with `SymPy`, the `N` function will attempt to do
-the conversion for you:
-
-```julia;hold=true
-p = -16x^2 + 100
-N(p(2))
-```
-
-Where `convert(T,x)` requires a specification of the type to convert `x` to, `N` attempts to match the data type used by SymPy to store the number. As such, the output type of `N` may vary (rational, a `BigFloat`, a float, etc.)
-For getting more digits of accuracy, a
-precision can be passed to `N`. The following command will take
-the symbolic value for $\pi$, `PI`, and produce about ``60`` digits worth
-as a `BigFloat` value:
-
-```julia;
-N(PI, 60)
-```
-
-
-Conversion by `N` will fail if the value to be converted contains free
-symbols, as would be expected.
-
-### Converting symbolic expressions into Julia functions
-
-Evaluating a symbolic expression and returning a numeric value can be done by composing the two just discussed concepts. For example:
-
-```julia;
-𝐩 = 200 - 16x^2
-N(𝐩(2))
-```
-
-This approach is direct, but can be slow *if* many such evaluations are needed (such as with a plot). An alternative is to turn the symbolic expression into a `Julia` function and then evaluate that as usual.
-
-The `lambdify` function turns a symbolic expression into a `Julia` function:
-
-```julia;hold=true
-pp = lambdify(𝐩)
-pp(2)
-```
-
-The `lambdify` function uses the name of the similar `SymPy` function, which is named after Python's convention of calling anonymous functions "lambdas." The use above is straightforward. Only slightly more complicated is the use when there are multiple symbolic values. For example:
-
-```julia; hold=true
-p = a*x^2 + b
-pp = lambdify(p)
-pp(1,2,3)
-```
-
-This evaluation matches `a` with `1`, `b` with `2`, and `x` with `3`, as that is the order returned by the function call `free_symbols(p)`.
To adjust that, a second `vars` argument can be given: - -```julia; hold=true -pp = lambdify(p, (x,a,b)) -pp(1,2,3) # computes 2*1^2 + 3 -``` - -## Graphical properties of polynomials - -Consider the graph of the polynomial `x^5 - x + 1`: - -```julia; -plot(x^5 - x + 1, -3/2, 3/2) -``` - -(Plotting symbolic expressions is similar to plotting a function, in -that the expression is passed in as the first argument. The expression -must have only one free variable, as above, or an error will occur.) - - -This graph illustrates the key features of polynomial graphs: - -* there may be values for `x` where the graph crosses the $x$ axis - (real roots of the polynomial); - -* there may be peaks and valleys (local maxima and local minima) - -* except for constant polynomials, the ultimate behaviour for large - values of $\lvert x\rvert$ is either both sides of the graph going to positive - infinity, or negative infinity, or as in this graph one to the - positive infinity and one to negative infinity. In particular, there - is no *horizontal asymptote*. - -To investigate this last point, let's consider the case of the -monomial $x^n$. When $n$ is even, the following animation shows that -larger values of $n$ have greater growth once outside of $[-1,1]$: - -```julia; hold=true; echo=false; cache=true -### {{{ poly_growth_graph }}} - -anim = @animate for m in 0:2:12 - fn = x -> x^m - plot(fn, -1.2, 1.2, size = fig_size, legend=false, xlims=(-1.2, 1.2), ylims=(0, 1.2^12), title="x^{$m} over [-1.2, 1.2]") -end - -imgfile = tempname() * ".gif" -gif(anim, imgfile, fps = 1) -caption = L"Demonstration that $x^{10}$ grows faster than $x^8$, ... and $x^2$ grows faster than $x^0$ (which is constant)." - -ImageFile(imgfile, caption) -``` - - -Of course, this is expected, as, for example, $2^2 < 2^4 < 2^6 < -\cdots$. The general shape of these terms is similar - $U$ shaped, -and larger powers dominate the smaller powers as $\lvert x\rvert$ gets big. 
-
-
-For odd powers of $n$, the graph of the monomial $x^n$ is no longer
-$U$ shaped, but rather constantly increasing. This graph of $x^5$ is
-typical:
-
-```julia;
-plot(x^5, -2, 2)
-```
-
-Again, for larger powers the shape is similar, but the growth is faster.
-
-### Leading term dominates
-
-To see the roots and/or the peaks and valleys of a polynomial requires a
-judicious choice of viewing window, as ultimately the leading term
-will dominate the graph. The following animation of the graph of
-$(x-5)(x-3)(x-2)(x-1)$ illustrates. Subsequent images show a widening
-of the plot window until the graph appears U-shaped.
-
-
-```julia;hold=true; echo=false; cache=true
-### {{{ leading_term_graph }}}
-
-anim = @animate for n in 1:6
-    m = [1, .5, -1, -5, -20, -25]
-    M = [2, 4, 5, 10, 25, 30]
-    fn = x -> (x-1)*(x-2)*(x-3)*(x-5)
-
-    plt = plot(fn, m[n], M[n], size=fig_size, legend=false, linewidth=2, title ="Graph of (x-1)(x-2)(x-3)(x-5) on ($(m[n]), $(M[n]))")
-    if n > 1
-        plot!(plt, fn, m[n-1], M[n-1], color=:red, linewidth=4)
-    end
-end
-
-caption = "The previous graph is highlighted in red. Ultimately the leading term (\$x^4\$ here) dominates the graph."
-imgfile = tempname() * ".gif"
-gif(anim, imgfile, fps=1)
-
-ImageFile(imgfile, caption)
-```
-
-
-The leading term in the animation is $x^4$, of even degree, so the graphic is
-U-shaped; were the leading term of odd degree, the left and right sides would
-each head off to different signs of infinity.
-
-
-To illustrate analytically why the leading term dominates, consider
-the polynomial $2x^5 - x + 1$ and then factor out the largest power,
-$x^5$, leaving a product:
-
-```math
-x^5 \cdot (2 - \frac{1}{x^4} + \frac{1}{x^5}).
-```
-
-For large $\lvert x\rvert$, the last two terms in the product on the right get
-close to $0$, so this expression is *basically* just $2x^5$ - the
-leading term.
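A quick numeric check of this claim (a sketch, with sample points chosen for illustration): the ratio of $2x^5 - x + 1$ to its leading term tends to $1$ as $x$ grows.

```julia
p(x) = 2x^5 - x + 1
leading(x) = 2x^5

# the ratio approaches 1 as |x| grows
ratios = [p(x) / leading(x) for x in (10.0, 100.0, 1000.0)]
all(r -> abs(r - 1) < 1e-3, ratios)   # true: the leading term dominates
```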
- ----- - -The following graphic illustrates the ``4`` basic *overall* shapes that can result when plotting a polynomials as ``x`` grows without bound: - -```julia; echo=false; -plot(; layout=4) -plot!(x -> x^4, -3,3, legend=false, xticks=false, yticks=false, subplot=1, title="n > even, aₙ > 0") -plot!(x -> x^5, -3,3, legend=false, xticks=false, yticks=false, subplot=2, title="n > odd, aₙ > 0") -plot!(x -> -x^4, -3,3, legend=false, xticks=false, yticks=false, subplot=3, title="n > even, aₙ < 0") -plot!(x -> -x^5, -3,3, legend=false, xticks=false, yticks=false, subplot=4, title="n > odd, aₙ < 0") -``` - - -##### Example - -Suppose $p = a_n x^n + \cdots + a_1 x + a_0$ with $a_n > 0$. Then by -the above, eventually for large $x > 0$ we have $p > 0$, as that is the -behaviour of $a_n x^n$. Were $a_n < 0$, then eventually for large -$x>0$, $p < 0$. - -Now consider the related polynomial, $q$, where we multiply $p$ by $x^n$ and substitute in $1/x$ for $x$. This is the "reversed" polynomial, as we see in this illustration for $n=2$: - -```julia;hold=true -p = a*x^2 + b*x + c -n = 2 # the degree of p -q = expand(x^n * p(x => 1/x)) -``` - -In particular, from the reversal, the behavior of $q$ for large $x$ -depends on the sign of $a_0$. As well, due to the $1/x$, the behaviour -of $q$ for large $x>0$ is the same as the behaviour of $p$ for small -*positive* $x$. In particular if $a_n > 0$ but $a_0 < 0$, then `p` is -eventually positive and `q` is eventually negative. - -That is, if $p$ has $a_n > 0$ but $a_0 < 0$ then the graph of $p$ must cross the $x$ axis. - -This observation is the start of Descartes' rule of -[signs](http://sepwww.stanford.edu/oldsep/stew/descartes.pdf), which -counts the change of signs of the coefficients in `p` to say something -about how many possible crossings there are of the $x$ axis by the -graph of the polynomial $p$. 
- - - - -## Factoring polynomials - -Among numerous others, there are two common ways of representing a -non-zero polynomial: - -* expanded form, as in $a_n x^n + a_{n-1}x^{n-1} + \cdots a_1 x + a_0, a_n \neq 0$; or - -* factored form, as in $a\cdot(x-r_1)\cdot(x-r_2)\cdots(x-r_n), a \neq 0$. - -The latter writes $p$ as a product of linear factors, though this is -only possible in general if we consider complex roots. With real roots -only, then the factors are either linear or quadratic, as will be -discussed later. - -There are values to each representation. One value of the expanded -form is that polynomial addition and scalar multiplication is much easier than in factored form. For example, adding polynomials just requires matching -up the monomials of similar powers. For the factored form, polynomial multiplication is much easier than expanded form. For the factored form it is easy to read off *roots* of the polynomial (values of $x$ where $p$ is $0$), as -a product is $0$ only if a term is $0$, so any zero must be a zero of -a factor. Factored form has other technical advantages. For example, -the polynomial $(x-1)^{1000}$ can be compactly represented using the -factored form, but would require ``1001`` coefficients to store in expanded -form. (As well, due to floating point differences, the two would -evaluate quite differently as one would require over a ``1000`` operations -to compute, the other just two.) - -Translating from factored form to expanded form can be done by -carefully following the distributive law of multiplication. For -example, with some care it can be shown that: - -```math -(x-1) \cdot (x-2) \cdot (x-3) = x^3 - 6x^2 +11x - 6. -``` - - -The `SymPy` function `expand` will perform these algebraic -manipulations without fuss: - -```julia; -expand((x-1)*(x-2)*(x-3)) -``` - - -Factoring a polynomial is several weeks worth of lessons, as there is -no one-size-fits-all algorithm to follow. 
There are some tricks that
-are taught: for example factoring differences of perfect squares,
-completing the square, the rational root theorem, $\dots$. But in
-general the solution is not automated. The `SymPy` function `factor`
-will find all rational factors (terms like $(qx-p)$), but will leave
-terms that do not have rational factors alone. For example:
-
-
-```julia;
-factor(x^3 - 6x^2 + 11x - 6)
-```
-
-Or
-
-```julia;
-factor(x^5 - 5x^4 + 8x^3 - 8x^2 + 7x - 3)
-```
-
-But it will not find factorizations that require irrational factors, even when they are easy to see:
-
-```julia;
-x^2 - 2
-```
-
-The factoring $(x-\sqrt{2})\cdot(x + \sqrt{2})$ is not found, as
-$\sqrt{2}$ is not rational.
-
-(For those, it may be possible to solve to get the roots, which
-can then be used to produce the factored form.)
-
-### Polynomial functions and polynomials.
-
-Our definition of a polynomial is in terms of algebraic expressions
-which are easily represented by `SymPy` objects, but not objects from
-base `Julia`. (Later we discuss the `Polynomials` package for representing polynomials. There is also the `AbstractAlgebra` package for a more algebraic treatment of polynomials.)
-
-However, *polynomial functions* are easily represented by `Julia`, for
-example,
-
-```julia;
-f(x) = -16x^2 + 100
-```
-
-The distinction is subtle: the expression is turned into a function
-just by adding the `f(x) =` preface. But to `Julia` there is a big
-distinction. The function form never does any computation until after a value
-of $x$ is passed to it, whereas symbolic expressions can be
-manipulated quite freely before any numeric values are specified.
-
-It is easy to create a symbolic expression from a function - just
-evaluate the function on a symbolic value:
-
-```julia;
-f(x)
-```
-
-This is easy - but can also be confusing. The function object is `f`,
-the expression is `f(x)` - the function evaluated on a symbolic
-object.
Moreover, as seen, the symbolic expression can be evaluated -using the same syntax as a function call: - -```julia; -p = f(x) -p(2) -``` - - For many uses, the distinction is unnecessary to make, as the many functions will work with any callable expression. One such is `plot` -- either -`plot(f, a, b)` or `plot(f(x),a, b)` will produce the same plot using the -`Plots` package. - - -## Questions - -###### Question - -Let $p$ be the polynomial $3x^2 - 2x + 5$. - -What is the degree of $p$? - -```julia;hold=true; echo=false -numericq(2) -``` - -What is the leading coefficient of $p$? - -```julia; echo=false -numericq(3) -``` - -The graph of $p$ would have what $y$-intercept? - -```julia; echo=false -numericq(5) -``` - - -Is $p$ a monic polynomial? - -```julia; echo=false -booleanq(false, labels=["Yes", "No"]) -``` - -Is $p$ a quadratic polynomial? - - -```julia; echo=false -booleanq(true, labels=["Yes", "No"]) -``` - -The graph of $p$ would be $U$-shaped? - - -```julia; echo=false -booleanq(true, labels=["Yes", "No"]) -``` - -What is the leading term of $p$? - -```julia; hold=true; echo=false -choices = ["``3``", "``3x^2``", "``-2x``", "``5``"] -answ = 2 -radioq(choices, answ) -``` - - -###### Question - -Let $p = x^3 - 2x^2 +3x - 4$. - -What is $a_2$, using the standard numbering of coefficient? - -```julia; echo=false -numericq(-2) -``` - -What is $a_n$? - -```julia; echo=false -numericq(1) -``` - -What is $a_0$? - -```julia; echo=false -numericq(-4) -``` - -###### Question - -The linear polynomial $p = 2x + 3$ is written in which form: - -```julia; hold=true; echo=false -choices = ["point-slope form", "slope-intercept form", "general form"] -answ = 2 -radioq(choices, answ) -``` - - - - -###### Question - -The polynomial `p` is defined in `Julia` as follows: - -```julia; hold=true; eval=false -@syms x -p = -16x^2 + 64 -``` - -What command will return the value of the polynomial when $x=2$? 
- -```julia; hold=true; echo=false -choices = [q"p*2", q"p[2]", q"p_2", q"p(x=>2)"] -answ = 4 -radioq(choices, answ) -``` - - -###### Question - -In the large, the graph of $p=x^{101} - x + 1$ will - -```julia; hold=true; echo=false -choices = [ -L"Be $U$-shaped, opening upward", -L"Be $U$-shaped, opening downward", -L"Overall, go upwards from $-\infty$ to $+\infty$", -L"Overall, go downwards from $+\infty$ to $-\infty$"] -answ = 3 -radioq(choices, answ, keep_order=true) -``` - -###### Question - -In the large, the graph of $p=x^{102} - x^{101} + x + 1$ will - -```julia; hold=true; echo=false -choices = [ -L"Be $U$-shaped, opening upward", -L"Be $U$-shaped, opening downward", -L"Overall, go upwards from $-\infty$ to $+\infty$", -L"Overall, go downwards from $+\infty$ to $-\infty$"] -answ = 1 -radioq(choices, answ, keep_order=true) -``` - -###### Question - -In the large, the graph of $p=-x^{10} + x^9 + x^8 + x^7 + x^6$ will - -```julia; hold=true; echo=false -choices = [ -L"Be $U$-shaped, opening upward", -L"Be $U$-shaped, opening downward", -L"Overall, go upwards from $-\infty$ to $+\infty$", -L"Overall, go downwards from $+\infty$ to $-\infty$"] -answ = 2 -radioq(choices, answ, keep_order=true) -``` - -###### Question - -Use `SymPy` to factor the polynomial $x^{11} - x$. How many factors are found? - -```julia;hold=true; echo=false -@syms x -ex = x^11 - x -nf = length(factor(ex).args) -numericq(nf) -``` - -###### Question - -Use `SymPy` to factor the polynomial $x^{12} - 1$. How many factors are found? - -```julia;hold=true; echo=false -@syms x -ex = x^12 - 1 -nf = length(factor(ex).args) -numericq(nf) -``` - - -###### Question - -What is the monic polynomial with roots $x=-1$, $x=0$, and $x=2$? 
- -```julia; hold=true; echo=false -choices = [q"x^3 - 3x^2 + 2x", -q"x^3 - x^2 - 2x", -q"x^3 + x^2 - 2x", -q"x^3 + x^2 + 2x"] -answ = 2 -radioq(choices, 2) -``` - -###### Question - -Use `expand` to expand the expression `((x-h)^3 - x^3) / h` where `x` and `h` are symbolic constants. What is the value: - -```julia; hold=true; echo=false -choices = [ -q"-h^2 + 3hx - 3x^2", -q"h^3 + 3h^2x + 3hx^2 + x^3 -x^3/h", -q"x^3 - x^3/h", -q"0"] -answ = 1 -radioq(choices, answ) -``` diff --git a/CwJ/precalc/polynomial_roots.jmd b/CwJ/precalc/polynomial_roots.jmd deleted file mode 100644 index d2da1a8..0000000 --- a/CwJ/precalc/polynomial_roots.jmd +++ /dev/null @@ -1,1082 +0,0 @@ -# Roots of a polynomial - -In this section we use the following add on packages: - - -```julia -using CalculusWithJulia -using Plots -using SymPy -``` - -```julia; echo=false; results="hidden" -using CalculusWithJulia.WeaveSupport -using Roots -import LinearAlgebra: norm - -const frontmatter = ( - title = "Roots of a polynomial", - description = "Calculus with Julia: Roots of a polynomial", - tags = ["CalculusWithJulia", "precalc", "roots of a polynomial"], -); - -nothing -``` - ----- - -The -[roots](http://en.wikipedia.org/wiki/Properties_of_polynomial_roots) -of a polynomial are the values of $x$ that when substituted into -the expression yield $0$. For example, the polynomial $x^2 - x$ has -two roots, $0$ and $1$. A simple graph verifies this: - -```julia; hold=true; -f(x) = x^2 - x -plot(f, -2, 2) -plot!(zero, -2, 2) -``` - -The graph crosses the $x$-axis at both $0$ and $1$. - -What is known about polynomial roots? Some simple questions might be: - -* Will a polynomial always have a root? -* How many roots can there be? -* How large can the roots be? - -We look at such questions here. - -### The factor theorem - -We begin with a comment that ties together two concepts related to -polynomials. 
It allows us to speak of roots or factors interchangeably: - - -> The [factor theorem](http://en.wikipedia.org/wiki/Factor_theorem) -> relates the *roots* of a polynomial with its *factors*: $r$ is a -> root of $p$ if *and* only if $(x-r)$ is a factor of the polynomial -> $p$. - -Clearly, if $p$ is factored as $a(x-r_1) \cdot (x-r_2) \cdots (x - -r_k)$ then each $r_i$ is a root, as a product involving at least one ``0`` -term will be ``0``. The other implication is a consequence of polynomial -division. - -### Polynomial Division - -[Euclidean division](http://en.wikipedia.org/wiki/Euclidean_division) -of integers $a, b$ uniquely writes $a = b\cdot q + r$ where -$0 \leq r < |b|$. The quotient is $q$ and the remainder $r$. There is -an analogy for polynomial division, where for two polynomial functions $f(x)$ -and $g(x)$ it is possible to write - -```math -f(x) = g(x) \cdot q(x) + r(x) -``` - -where the degree of $r$ is less than the degree of $g(x)$. The -[long-division algorithm](http://en.wikipedia.org/wiki/Long_division) -can be used to find both $q(x)$ and $r(x)$. - -For the special case of a linear factor where $g(x) = x - c$, the -remainder must be of degree $0$ (a non-zero constant) or the $0$ polynomial. The above simplifies to - -```math -f(x) = (x-c) \cdot q(x) + r -``` - -From this, we see that $f(c) = r$. Hence, when $c$ is a root of $f(x)$, then it -must be that $r=0$ and so, $(x-c)$ is a factor. - ----- - -The division algorithm for the case of linear term, $(x-c)$, can be -carried out by the [synthetic division](http://en.wikipedia.org/wiki/Synthetic_division) -algorithm. This algorithm produces $q(x)$ and $r$, a.k.a $f(c)$. The -Wikipedia page describes the algorithm well. 
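The fact that the remainder equals $f(c)$ can be checked numerically with a single multiply-and-add pass over the coefficients, a minimal sketch of the idea behind synthetic division:

```julia
f(x) = x^4 + 2x^2 + 5
c = 2

# coefficients highest power first: a4, a3, a2, a1, a0
coeffs = [1, 0, 2, 0, 5]
r = foldl((Σ, a) -> Σ * c + a, coeffs)   # the nested multiply-and-add

r == f(c) == 29   # true: the remainder on division by (x - 2) is f(2)
```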
-
-The following is an example where $f(x) = x^4 + 2x^2 + 5$ and $g(x) = x-2$:
-
-```verbatim
-2 | 1 0 2  0  5
-  |   2 4 12 24
-  -------------
-    1 2 6 12 29
-```
-
-The polynomial $f(x)$ is coded in terms of its coefficients ($a_n$,
-$a_{n-1}$, $\dots$, $a_1$, $a_0$) and is written on the top row. The
-algorithm then proceeds from left to right. The number just produced
-on the bottom row is multiplied by $c$ and placed under the coefficient
-of $f(x)$. These values are then added to produce the next number. The
-sequence produced above is `1 2 6 12 29`. The last value (`29`) is
-$r=f(c)$, the others encode the coefficients of `q(x)`, which for this problem
-is $q(x)=x^3 + 2x^2 + 6x + 12$. That is, we have written:
-
-```math
-x^4 + 2x^2 + 5 = (x-2) \cdot (x^3 + 2x^2 + 6x + 12) + 29.
-```
-
-As $r$ is not $0$, we can say that $2$ is not a root of $f(x)$.
-
-
-If we were to track down the computation that produced $f(2) = 29$, we would have
-
-```math
-5 + 2 \cdot (0 + 2 \cdot (2 + 2 \cdot (0 + (2 \cdot 1))))
-```
-
-In terms of $c$ and the coefficients $a_0, a_1, a_2, a_3$, and $a_4$ this is
-
-```math
-a_0 + c\cdot(a_1 + c\cdot(a_2 + c\cdot(a_3 + c\cdot a_4))).
-```
-
-The above pattern provides a means to compute $f(c)$ and could easily be generalized for higher degree
-polynomials. This generalization is called [Horner's](http://en.wikipedia.org/wiki/Horner%27s_method) method.
-Horner's method has the advantage of also being faster and more
-accurate when floating point issues are accounted for.
-
-A simple implementation of Horner's algorithm, adjusted for `Julia`'s ``1``-based
-indexing with the coefficients stored as `p = (a0, a1, ..., an)`, would look like this:
-
-```julia
-function horner(p, x)
-    Σ = p[end]                   # start with aₙ
-    for i in (length(p)-1):-1:1
-        Σ = Σ * x + p[i]         # multiply by x, add the next lower coefficient
-    end
-    return Σ
-end
-```
-
-Recording the different values of `Σ` would recover the coefficients of the quotient `q`.
-
-
-`Julia` has a
-built-in method, `evalpoly`, to compute polynomial evaluations this
-way.
To illustrate:
-
-```julia
-@syms x::real  # assumes x is real
-```
-
-```julia; hold=true;
-p = (1, 2, 3, 4, 5)  # 1 + 2x + 3x^2 + 4x^3 + 5x^4
-evalpoly(x, p)
-```
-
-
-
-----
-
-The `SymPy` package can carry out polynomial long division.
-
-This naive attempt to divide won't "just work" though:
-
-```julia;
-(x^4 + 2x^2 + 5) / (x-2)
-```
-
-`SymPy` is fairly conservative in how it simplifies answers, and, as
-written, there is no compelling reason to change the expressions,
-though in our example we want it done.
-
-For this task, `divrem` is available:
-
-```julia;
-quotient, remainder = divrem(x^4 + 2x^2 + 5, x - 2)
-```
-
-The answer is a tuple containing the quotient and remainder. The quotient itself could be found with `div` or `÷` and the remainder with `rem`.
-
-!!! note
-    For those who have worked with SymPy within Python, `divrem` is the `div` method renamed, as `Julia`'s `div` method has the generic meaning of returning the quotient.
-
-
-
-As well, the `apart` function could be used for this task. This function
-computes the
-[partial fraction](http://en.wikipedia.org/wiki/Partial_fraction_decomposition)
-decomposition of a ratio of polynomial functions.
-
-
-```julia;
-apart((x^4 + 2x^2 + 5) / (x-2))
-```
-
-The function `together` would combine such terms, as an "inverse" to
-`apart`. This isn't so much of interest at the moment, but will be
-when techniques of integration are looked at.
-
-
-### The rational root theorem
-
-Factoring polynomials to find roots is a task that most all readers
-here will recognize, and, perhaps, remember not so fondly. One helpful
-trick to find possible roots *by hand* is the [rational root
-theorem](http://en.wikipedia.org/wiki/Rational_root_theorem): if a
-polynomial has integer coefficients with $a_0 \neq 0$, then any
-rational root, $p/q$ (in lowest terms), must have $p$ dividing the constant $a_0$ and
-$q$ dividing the leading term $a_n$.
-
-
-To glimpse why, suppose we have a polynomial with integer coefficients
-and a rational root $p/q$. With this in
-mind, a polynomial with identical roots may be written as $(qx
--p)(a_{n-1}x^{n-1}+\cdots + a_1 x + a_0)$, where each coefficient is an
-integer. Multiplying through, we get that the polynomial is
-$qa_{n-1}x^n + \cdots - pa_0$. So $q$ is a factor of the leading
-coefficient and $p$ is a factor of the constant.
-
-An immediate consequence is that if the polynomial with integer
-coefficients is monic, then any rational root must be an integer.
-
-This gives a finite - though possibly large - set of values that can
-be checked to exhaust the possibility of a rational root. By hand this
-process can be tedious, though it may be sped up using synthetic
-division. This task is one of the mainstays of high school algebra,
-where problems are chosen judiciously to avoid too many possibilities.
-
-However, one of the great triumphs of computer algebra is the ability
-to factor polynomials with integer (or rational) coefficients over the
-rational numbers. This is typically done by first factoring over
-modular numbers (akin to those on a clock face) and has nothing to do
-with the rational root test.
-
-`SymPy` can quickly find such a factorization, even
-for quite large polynomials with rational or integer coefficients.
-
-For example, consider factoring $p = 2x^4 + x^3 -19x^2 -9x +9$. This has
-*possible* rational roots of plus or minus $1$, $3$, or $9$ divided by
-$1$ or $2$ - $12$ possible answers for this modest question. By hand
-that can be a bit of work, but `factor` does it without fuss:
-
-```julia; hold=true
-p = 2x^4 + x^3 - 19x^2 - 9x + 9
-factor(p)
-```
-
-
-
-### The fundamental theorem of algebra
-
-There is a basic fact about the roots of a polynomial of degree
-$n$. Before formally stating it, we consider the earlier observation
-that a polynomial of degree $n$ for large values of $x$ has a graph
-that looks like the leading term.
However, a monomial crosses the $x$ axis only at $0$, so any other roots must be the result of the interaction of lower-order terms. Intuitively, since each term can contribute only one basic shape up or down, there can not be arbitrarily many roots. In fact, a consequence of the [Fundamental Theorem of Algebra](http://en.wikipedia.org/wiki/Fundamental_theorem_of_algebra) (Gauss) is:

> A polynomial of degree $n$ with real or complex coefficients has at most $n$ real roots.

This statement can be proved with the factor theorem and the division algorithm.

In fact the fundamental theorem states that there are exactly $n$ roots, though, in general, one must count multiple roots and possible complex roots to get all $n$. (Consider $x^2$ to see why multiplicity must be accounted for and $x^2 + 1$ to see why complex values may be necessary.)

!!! warning
    Leaving the degree of the ``0`` polynomial undefined conveniently excludes it from this statement, as it has infinitely many roots. Otherwise, the language would need to be qualified to have ``n \geq 0``.

## Finding roots of a polynomial

Knowing that a certain number of roots exist and actually finding those roots are different matters. For the simplest case (the linear case) with $a_0 + a_1x$, we know by solving algebraically that the root is $-a_0/a_1$. (We assume $a_1 \neq 0$.) Of course, when $a_1 \neq 0$, the graph of the polynomial will be a line with non-zero slope, so it will cross the $x$-axis, as the line and this axis are not parallel.

For the quadratic case, there is the famous [quadratic formula](http://en.wikipedia.org/wiki/Quadratic_formula) (known since ``2000`` BC) to find the two roots:

```math
\frac{-b \pm \sqrt{b^2 - 4ac}}{2a}.
```

The discriminant is defined as $b^2 - 4ac$. When this is negative, the square root requires the concept of complex numbers to be defined, and the formula shows the two complex roots are conjugates.
When the discriminant is $0$, the root has multiplicity two, e.g., the polynomial will factor as $a_2(x-r)^2$. Finally, when the discriminant is positive, there will be two distinct, real roots. This figure shows the three cases, illustrated by $x^2 - 1$, $x^2$, and $x^2 + 1$:

```julia;
plot(x^2 - 1, -2, 2, legend=false) # two roots
plot!(x^2, -2, 2)                  # one (double) root
plot!(x^2 + 1, -2, 2)              # no real root
plot!(zero, -2, 2)
```

There are similar formulas for the [cubic](http://en.wikipedia.org/wiki/Cubic_function#General_formula_for_roots) and [quartic](http://en.wikipedia.org/wiki/Quartic_function#General_formula_for_roots) cases. (The [cubic formula](http://arxiv.org/pdf/math/0005026v1.pdf) was known to Cardano in ``1545``, though it came through Tartaglia, and the quartic was solved by Ferrari, Cardano's student.)

In general, there is no such formula using radicals for ``5``th-degree polynomials or higher, a proof first given by Ruffini in ``1803`` with improvement by Abel in ``1824``. Even though the fundamental theorem shows that any polynomial can be factored into linear and quadratic terms, there is no general method as to how. (It is the case that *some* such polynomials may be solvable by radicals, just not all of them.)

The `factor` function of `SymPy` only finds factors of polynomials with integer or rational coefficients corresponding to rational roots. There are alternatives.

Finding roots with `SymPy` can also be done through its `solve` function, which has more general uses, as it can solve simple expressions or more than one expression. Here we illustrate that `solve` can easily handle quadratic expressions:

```julia;
solve(x^2 + 2x - 3)
```

The answer is a vector of values that, when substituted in for the free variable `x`, produce ``0``. The call to `solve` does not have an equals sign.
To solve a more complicated expression of the type $f(x) = g(x)$, one can solve $f(x) - g(x) = 0$, use the `Eq` function, or use `f ~ g`.

When the expression to solve has more than one free variable, the variable to solve for should be explicitly stated with a second argument. For example, here we show that `solve` is aware of the quadratic formula:

```julia;
@syms a b::real c::positive
solve(a*x^2 + b*x + c, x)
```

The `solve` function will respect assumptions made when a variable is defined through `symbols` or `@syms`:

```julia;
solve(a^2 + 1) # works, as a can be complex
```

```julia;
solve(b^2 + 1) # fails, as b is assumed real
```

```julia;
solve(c + 1) # fails, as c is assumed positive
```

Previously, it was mentioned that `factor` only factors polynomials with integer coefficients over the rational numbers. However, `solve` can be used to factor. Here is an example:

```julia;
factor(x^2 - 2)
```

Nothing is found, as the roots are $\pm\sqrt{2}$, irrational numbers.

```julia;
rts = solve(x^2 - 2)
prod(x-r for r in rts)
```

Solving cubics and quartics can be done exactly using radicals. For example, here we see the solutions to a quartic equation can be quite involved, yet still explicit. (We use `y` so that complex-valued solutions, if any, will be found.)

```julia;
@syms y # possibly complex
solve(y^4 - 2y - 1)
```

Third- and fourth-degree polynomials can be solved in general, with increasingly more complicated answers.
The following finds one of the answers for a general third-degree polynomial:

```julia; hold=true;
@syms a[0:3]
p = sum(a*x^(i-1) for (i,a) in enumerate(a))
rts = solve(p, x)
rts[1] # there are three roots
```

Some fifth-degree polynomials are solvable in terms of radicals; however, `solve` will have no luck with this particular fifth-degree polynomial:

```julia;
solve(x^5 - x + 1)
```

(Though there is no formula involving only radicals like the quadratic equation, there is a formula for the roots in terms of a function called the [Bring radical](http://en.wikipedia.org/wiki/Bring_radical).)

### The `roots` function

Related to `solve` is the specialized `roots` function for identifying roots. Unlike `solve`, it will identify multiplicities.

For a polynomial with only one indeterminate the usage is straightforward:

```julia
roots((x-1)^2 * (x-2)^2) # solve doesn't identify multiplicities
```

For a polynomial with symbolic coefficients, the indeterminate must be distinguished from the coefficients. `SymPy` has a `Poly` type to do so. The following call illustrates:

```julia; hold=true
@syms a b c
p = a*x^2 + b*x + c
q = sympy.Poly(p, x) # identify `x` as indeterminate; alternatively p.as_poly(x)
roots(q)
```

!!! note
    The sympy `Poly` function must be found within the underlying `sympy` module, a Python object, hence is qualified as `sympy.Poly`. This is common when using `SymPy`, as only a small handful of the many available functions are turned into `Julia` functions; the rest are used as they would be in Python. (This is similar to, but different from, qualifying by a `Julia` module when there are two conflicting names. An example is the use of the name `roots` in both `SymPy` and `Polynomials` to refer to a function that finds the roots of a polynomial. If both functions were loaded, then the last line in the above example would need to be `SymPy.roots(q)` - note the capitalization.)
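The synthetic division mentioned earlier is also easy to sketch in base `Julia`. This helper (the name `synthetic_division` is ours, not part of `SymPy` or `Polynomials`) divides by ``(x - c)`` and, per the remainder theorem, returns ``p(c)`` as the remainder:

```julia
# Divide a polynomial, given as [a0, a1, ..., an], by (x - c).
# Returns (quotient coefficients, remainder); the remainder equals p(c).
function synthetic_division(coeffs, c)
    q = similar(coeffs, length(coeffs) - 1)
    carry = coeffs[end]                    # start with the leading coefficient
    for i in length(coeffs)-1:-1:1
        q[i] = carry
        carry = coeffs[i] + c * carry      # bring down and accumulate
    end
    q, carry
end

synthetic_division([1, -1, 0, 0, 0, 1], 2)   # x^5 - x + 1 divided by (x - 2)
```

The returned quotient `[15, 8, 4, 2, 1]` encodes ``x^4 + 2x^3 + 4x^2 + 8x + 15`` with remainder ``31``, i.e. the value of ``x^5 - x + 1`` at ``2``.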
### Numerically finding roots

The `solve` function can be used to get numeric approximations to the roots. It is as easy as calling `N` on the solutions:

```julia; hold=true;
rts = solve(x^5 - x + 1 ~ 0)
N.(rts) # note the `.(` to broadcast over all values in rts
```

This polynomial has ``1`` real root found by `solve`, as `x` is assumed to be real.

Here we see another example:

```julia;
ex = x^7 - 3x^6 + 2x^5 - x^3 + 2x^2 + x - 2
solve(ex)
```

This finds two of the seven possible roots; numeric approximations for these are found, as before, by broadcasting `N`:

```julia;
N.(solve(ex))
```

### The `solveset` function

SymPy is phasing in the `solveset` function to replace `solve`, the main reason being that `solve` has too many different output types (a vector, a dictionary, ...). The output of `solveset` is always a set. For tasks like this, which return a finite set, we use the `elements` function to access the individual answers. To illustrate:

```julia;
𝒑 = 8x^4 - 8x^2 + 1
𝒑_rts = solveset(𝒑)
```

The `𝒑_rts` object, a `FiniteSet`, does not allow immediate access to its elements. For that, `elements` returns a vector:

```julia;
elements(𝒑_rts)
```

To get the numeric approximations, we compose these function calls:

```julia;
N.(elements(solveset(𝒑)))
```

## Do numeric methods matter when you can just graph?

It may seem that certain practices related to roots of polynomials are unnecessary, as we could just graph the equation and look for the roots. This feeling is perhaps motivated by the examples given in textbooks to be worked by hand, which necessarily focus on smallish solutions. But, in general, without some sense of where the roots are, an informative graph itself can be hard to produce. That is, technology doesn't displace thinking - it only supplements it.

For example, consider the polynomial $(x-20)^5 - (x-20) + 1$. In this form we might think the roots are near ``20``.
However, were we presented with this polynomial in expanded form, $x^5 - 100x^4 + 4000x^3 - 80000x^2 + 799999x - 3199979$, we might be tempted to just graph it to find roots. A naive approach might plot over $[-10, 10]$:

```julia;
𝐩 = x^5 - 100x^4 + 4000x^3 - 80000x^2 + 799999x - 3199979
plot(𝐩, -10, 10)
```

This seems to indicate a root near ``10``. But look at the scale of the $y$ axis. The value at $-10$ is around $-25,000,000$, so it is really hard to tell if ``𝐩`` is near $0$ when $x=10$, as the range is too large.

A graph over $[10, 20]$ is still unclear:

```julia;
plot(𝐩, 10, 20)
```

We see that what looked like a zero near ``10`` was actually a value around $-100,000$.

Continuing, a plot over $[15, 20]$ still isn't that useful. It isn't until we get close to ``18`` that the large values of the polynomial allow a clear view of the values near $0$. That being said, plotting anything bigger than ``22`` quickly makes the large values hide those near $0$, and might make us think the function has a second or third zero where it dips back down, when in fact there is only ``1``. (We know that, as this is the same $x^5 - x + 1$ shifted to the right by ``20`` units.)

```julia;
plot(𝐩, 18, 22)
```

Not that it can't be done, but graphically solving for a root here can require some judicious choice of viewing window. Even worse is the case where something might graphically look like a root but in fact not be one. Something like $(x-100)^2 + 0.1$ demonstrates this.

For another example, the following polynomial, when plotted over ``[-5,7]``, appears to have two real roots:

```julia
h = x^7 - 16129x^2 + 254x - 1
plot(h, -5, 7)
```

In fact there are three; two are *very* close together:

```julia
N.(solve(h))
```

!!! note
    The difference between the two nearby roots is around `1e-10`.
    For the graph over the interval ``[-5,7]`` there are about ``800`` "pixels" used, so each pixel represents a size of about `1.5e-2`. The cluster of roots would safely be hidden under a single "pixel."

The point of this is that it is useful to know where to look for roots, even if graphing calculators or graphing programs make drawing graphs relatively painless. A better way in this case would be to find the real roots first, and then incorporate that information into the choice of plot window.

## Some facts about the real roots of a polynomial

A polynomial with real coefficients may or may not have real roots. The following discusses some simple checks on the number of real roots and bounds on how big they can be. These can be *roughly* used to narrow viewing windows when graphing polynomials.

### Descartes' rule of signs

The study of polynomial roots is an old one. In ``1637`` Descartes published a *simple* method to determine an upper bound on the number of *positive* real roots of a polynomial.

> [Descartes' rule of signs](http://en.wikipedia.org/wiki/Descartes%27_rule_of_signs): if
> $p=a_n x^n + a_{n-1}x^{n-1} + \cdots + a_1x + a_0$, then the number of
> positive real roots is either equal to the number of
> sign differences between consecutive nonzero coefficients, or is
> less than it by an even number. Repeated roots are counted separately.

One method of proof (sketched at the end of this section) first shows that in synthetic division by $(x-c)$ with $c > 0$, any sign change in $q$ is related to a sign change in $p$, and there must be at least one more in ``p``. This is then used to show that there can be only as many positive roots as sign changes. That the difference comes in pairs is related to complex roots of real polynomials always coming in pairs.

An immediate consequence is that a polynomial whose coefficients are all non-negative will have no positive real roots.
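Counting the sign changes the rule refers to is mechanical. A small base-`Julia` sketch (the helper name `sign_changes` is ours); coefficients are given from lowest to highest degree and zero coefficients are skipped, as in the rule:

```julia
# Count sign changes between consecutive nonzero coefficients.
function sign_changes(coeffs)
    signs = [sign(c) for c in coeffs if c != 0]   # drop zero coefficients
    count(signs[i] != signs[i+1] for i in 1:length(signs)-1)
end

sign_changes([1, -1, 0, 0, 0, 1])   # x^5 - x + 1: pattern +, -, + has 2 changes
```

The count of ``2`` matches the application to ``x^5 - x + 1`` worked in the text: there are either ``2`` or ``0`` positive real roots.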
Applying this to the polynomial $x^5 - x + 1$, the coefficients have signs `+ 0 0 0 - +`, which collapse to the sign pattern `+`, `-`, `+`. This pattern has two changes of sign. The number of *positive* real roots is either ``2`` or ``0``. In fact, there are $0$ in this case.

What about negative roots? Clearly, any negative root of $p$ is a positive root of $q(x) = p(-x)$, as the graph of $q$ is just that of $p$ flipped through the $y$ axis. The coefficients of $q$ are the same as those of $p$, except that the odd-indexed coefficients ($a_1, a_3, \dots$) have changed sign. Continuing with our example, for $q(x) = -x^5 + x + 1$ we get the new sign pattern `-`, `+`, `+`, which yields one sign change. That is, there *must* be a negative real root, and indeed there is, $x \approx -1.1673$.

With this knowledge, we could have known that in an earlier example the graph of `p = x^7 - 16129x^2 + 254x - 1` -- which indicated two positive real roots -- was misleading, as there must be ``1`` or ``3`` by a count of the sign changes.

For another example, if we look at $f(x) = x^5 - 100x^4 + 4000x^3 - 80000x^2 + 799999x - 3199979$ again, we see that there could be ``1``, ``3``, or ``5`` *positive* roots. However, changing the signs of the odd powers leaves all `-` signs, so there are $0$ negative roots. From the graph, we saw just ``1`` real root, not ``3`` or ``5``. We can verify numerically with:

```julia;
j = x^5 - 100x^4 + 4000x^3 - 80000x^2 + 799999x - 3199979
N.(solve(j))
```

### Cauchy's bound on the magnitude of the real roots

Descartes' rule gives a bound on how many real roots there may be. Cauchy provided a bound on how large they can be. Assume our polynomial is monic (if not, divide by $a_n$ to make it so, as this won't affect the roots). Then any real root is no larger in absolute value than $|a_0| + |a_1| + |a_2| + \cdots + |a_n|$. (This bound is expressed in different ways.)
To see precisely [why](https://captainblack.wordpress.com/2009/03/08/cauchys-upper-bound-for-the-roots-of-a-polynomial/) this bound works, suppose $x$ is a root with $|x| > 1$ and let $h$ be the bound. Since $x$ is a root, we can solve ``a_0 + a_1x + \cdots + 1 \cdot x^n = 0`` for $x^n$ as:

```math
x^n = -(a_0 + a_1 x + \cdots + a_{n-1}x^{n-1})
```

Taking absolute values of both sides, and noting each $|a_i| \leq h - 1$, yields:

```math
|x^n| \leq |a_0| + |a_1||x| + |a_2||x^2| + \cdots + |a_{n-1}| |x^{n-1}| \leq (h-1) (1 + |x| + |x^2| + \cdots + |x^{n-1}|).
```

The last sum can be computed using the formula for geometric sums, $(|x^n| - 1)/(|x|-1)$. Rearranging gives the inequality:

```math
|x| - 1 \leq (h-1) \cdot (1 - \frac{1}{|x^n|} ) \leq (h-1),
```

from which it follows that $|x| \leq h$, as desired.

For our polynomial $x^5 - x + 1$, the sum above is $3$. The lone real root, approximately $-1.1673$, satisfies $|-1.1673| \leq 3$.

## Questions

###### Question

What is the remainder of dividing $x^4 - x^3 - x^2 + 2$ by $x-2$?

```julia; hold=true; echo=false
choices = [
    "``x^3 + x^2 + x + 2``",
    "``x-2``",
    "``6``",
    "``0``"
]
answ = 3
radioq(choices, answ)
```

###### Question

What is the remainder of dividing $x^4 - x^3 - x^2 + 2$ by $x^3 - 2x$?

```julia; hold=true; echo=false
choices = [
    "``x - 1``",
    "``x^2 - 2x + 2``",
    "``2``"
]
answ = 2
radioq(choices, answ)
```

###### Question

We have that $x^5 - x + 1 = (x^3 + x^2 - 1) \cdot (x^2 - x + 1) + (-2x + 2)$.

What is the remainder of dividing $x^5 - x + 1$ by $x^2 - x + 1$?

```julia; hold=true; echo=false
choices = [
"``x^2 - x + 1``",
"``x^3 + x^2 - 1``",
"``-2x + 2``"
]
answ = 3
radioq(choices, answ)
```

###### Question

Consider this output from synthetic division

```verbatim
2 | 1 0 0 0 -1  1
  |   2 4 8 16 30
  ---------------
    1 2 4 8 15 31
```

representing $p(x) = q(x)\cdot(x-c) + r$.

What is $p(x)$?
- -```julia; hold=true; echo=false -choices = [ -"``x^5 - x + 1``", -"``2x^4 + 4x^3 + 8x^2 + 16x + 30``", -"``x^5 + 2x^4 + 4x^3 + 8x^2 + 15x + 31``", -"``x^4 +2x^3 + 4x^2 + 8x + 15``", -"``31``"] -answ = 1 -radioq(choices, answ) -``` - - -What is $q(x)$? - -```julia; hold=true; echo=false -choices = [ -"``x^5 - x + 1``", -"``2x^4 + 4x^3 + 8x^2 + 16x + 30``", -"``x^5 + 2x^4 + 4x^3 + 8x^2 + 15x + 31``", -"``x^4 +2x^3 + 4x^2 + 8x + 15``", -"``31``"] -answ = 4 -radioq(choices, answ) -``` - -What is $r$? - -```julia; hold=true; echo=false -choices = [ -"``x^5 - x + 1``", -"``2x^4 + 4x^3 + 8x^2 + 16x + 30``", -"``x^5 + 2x^4 + 4x^3 + 8x^2 + 15x + 31``", -"``x^4 +2x^3 + 4x^2 + 8x + 15``", -"``31``"] -answ = 5 -radioq(choices, answ) -``` - - -###### Question - -Let $p=x^4 -9x^3 +30x^2 -44x + 24$ - -Factor $p$. What are the factors? - -```julia; hold=true; echo=false -choices = [ -L" $2$ and $3$", -L" $(x-2)$ and $(x-3)$", -L" $(x+2)$ and $(x+3)$"] -answ = 2 -radioq(choices, answ) -``` - - -###### Question - -Does the expression $x^4 - 5$ factor over the rational numbers? - -```julia; hold=true; echo=false -yesnoq(false) -``` - -Using `solve`, how many real roots does $x^4 - 5$ have: - -```julia; hold=true; echo=false -numericq(2) -``` - - -###### Question - -The Soviet historian I. Y. Depman claimed that in ``1486``, -Spanish mathematician Valmes was burned at the stake for claiming to -have solved the [quartic -equation](https://en.wikipedia.org/wiki/Quartic_function). Here we -don't face such consequences. - -Find the largest real root of ``x^4 - 10x^3 + 32x^2 - 38x + 15``. - -```julia; hold=true; echo=false -@syms x -p = x^4 - 10x^3 + 32x^2 - 38x + 15 -rts = sympy.real_roots(p) -numericq(N(maximum(rts))) -``` - -###### Question - -What are the numeric values of the real roots of $f(x) = x^6 - 5x^5 + x^4 - 3x^3 + x^2 - x + 1$? 
```julia; hold=true; echo=false
choices = [
q"[-0.434235, -0.434235, 0.188049, 0.188049, 0.578696, 4.91368]",
q"[-0.434235, -0.434235, 0.188049, 0.188049]",
q"[0.578696, 4.91368]",
q"[-0.434235+0.613836im, -0.434235-0.613836im]"]
answ = 3
radioq(choices, answ)
```

###### Question

Polynomials of odd degree must have at least one real root.

Consider the polynomial $x^5 - 3x + 1$. Does it have more than one real root?

```julia; hold=true; echo=false
xs = find_zeros(x -> x^5 - 3x + 1, -10..10)
yesnoq(length(xs) > 1)
```

Consider the polynomial $x^5 - 1.5x + 1$. Does it have more than one real root?

```julia; hold=true;echo=false
xs = find_zeros(x -> x^5 - 1.5x + 1, -10..10)
yesnoq(length(xs) > 1)
```

###### Question

What is the maximum number of positive, real roots that Descartes' bound says $p=x^5 + x^4 - x^3 + x^2 + x + 1$ can have?

```julia; hold=true; echo=false
numericq(2)
```

How many positive, real roots does it actually have?

```julia; hold=true; echo=false
numericq(0)
```

What is the maximum number of negative, real roots that Descartes' bound says $p=x^5 + x^4 - x^3 + x^2 + x + 1$ can have?

```julia; hold=true; echo=false
numericq(3)
```

How many negative, real roots does it actually have?

```julia; hold=true; echo=false
numericq(1)
```

###### Question

Let $f(x) = x^5 - 4x^4 + x^3 - 2x^2 + x$. What does Cauchy's bound say is the largest possible magnitude of a root?

```julia; hold=true; echo=false
answ = 1 + 4 + 1 + 2 + 1
numericq(answ)
```

What is the largest magnitude of a real root?

```julia; hold=true; echo=false
f(x) = x^5 - 4x^4 + x^3 - 2x^2 + x
rts = find_zeros(f, -5..5)
answ = maximum(abs.(rts))
numericq(answ)
```

###### Question

As $1 + 2 + 3 + 4$ is $10$, Cauchy's bound says that the magnitude of the largest real root of $x^3 - ax^2 + bx - c$ is at most $10$, where $a, b, c$ is some arrangement of $2, 3, 4$.
By considering all ``6`` such possible polynomials (such as $x^3 - 3x^2 + 2x - 4$), what is the largest magnitude of a root?

```julia; hold=true; echo=false
function mag()
    p = Permutation(0,2)
    q = Permutation(1,2)
    m = 0
    for perm in (p, q, q*p, p*q, p*q*p, p^2)
        as = perm([2,3,4])
        fn = x -> x^3 - as[1]*x^2 + as[2]*x - as[3]
        rts_ = find_zeros(fn, -10..10)
        a1 = maximum(abs.(rts_))
        m = a1 > m ? a1 : m
    end
    m
end
numericq(mag())
```

###### Question

The roots of the [Chebyshev](https://en.wikipedia.org/wiki/Chebyshev_polynomials) polynomials are helpful for some numeric algorithms. These are a family of polynomials related by $T_{n+1}(x) = 2xT_n(x) - T_{n-1}(x)$ (a recurrence relation in the manner of the Fibonacci sequence). The first two are $T_0(x) = 1$ and $T_1(x) = x$.

* Based on the relation, figure out $T_2(x)$. It is

```julia; hold=true; echo=false
choices = [
    "``2x^2 - 1``",
    "``2x^2``",
    "``x``",
    "``2x``"]
answ = 1
radioq(choices, answ)
```

* True or false: the *degree* of $T_n(x)$ is $n$. (Look at the defining relation and reason this out.)

```julia; hold=true; echo=false
yesnoq(true)
```

* The fifth one is $T_5(x) = 16x^5 - 20x^3 + 5x$. Cauchy's bound says that the largest root has absolute value at most

```julia;
1 + 20/16 + 5/16
```

The Chebyshev polynomials have the property that in fact all $n$ roots are real, distinct, and in $[-1, 1]$. Using `SymPy`, find the magnitude of the largest root:

```julia; hold=true; echo=false
@syms x
p = 16x^5 - 20x^3 + 5x
rts = N.(solve(p))
answ = maximum(norm.(rts))
numericq(answ)
```

* Plotting `p` over the interval $[-2,2]$ does not help graphically identify the roots:

```julia; hold=true;
plot(16x^5 - 20x^3 + 5x, -2, 2)
```

Does graphing over $[-1,1]$ clearly show the $5$ roots?
```julia; hold=true; echo=false
yesnoq(true)
```

## Appendix: Proof of Descartes' rule of signs

[Proof modified from this post](http://www.cut-the-knot.org/fta/ROS2.shtml).

First, we can assume ``p`` is monic (``p_n = 1``, which is *positive*) and that ``p_0`` is nonzero. The latter is no restriction, as we can deflate the polynomial by dividing by ``x`` when ``p_0`` is zero.

Let `var(p)` be the number of sign changes and `pos(p)` the number of positive real roots of ``p``.

First: for a monic ``p``, if ``p_0 < 0`` then `var(p)` is odd, and if ``p_0 > 0`` then `var(p)` is even.

This is true for degree ``n=1``: the two possible sign patterns under the assumption are `+-` (``p_0 < 0``) or `++` (``p_0 > 0``). If it is true for degree ``n-1``, then we can consider the sign pattern of such an ``n``-degree polynomial having one of these patterns: `+...+-` or `+...--` (if ``p_0 < 0``), or `+...++` or `+...-+` (if ``p_0 > 0``). An induction step applied to all but the last sign of these four patterns leads to even, odd, even, odd as the number of sign changes. Incorporating the last sign leads to odd, odd, even, even as the number of sign changes.

Second: for a monic ``p``, if ``p_0 < 0`` then `pos(p)` is *odd*; if ``p_0 > 0`` then `pos(p)` is even.

This is clearly true for **monic** degree-``1`` polynomials: if ``c`` is positive, ``p = x - c`` has one positive real root (an odd number) and ``p = x + c`` has ``0`` positive real roots (an even number). Now, suppose ``p`` has degree ``n`` and is monic. Then, as ``x`` goes to ``\infty``, so does ``p``.

If ``p_0 < 0``, then there must be a positive real root, say ``r``, by Bolzano's intermediate value theorem. Dividing ``p`` by ``(x-r)`` produces a ``q`` of lower degree with ``q_0`` *positive*. By *induction* ``q`` will have an even number of positive roots. Adding in the root ``r`` shows that ``p`` will have an **odd** number of positive roots.

Now consider the case ``p_0 > 0``.
There are two possibilities: either `pos(p)` is zero or positive. If `pos(p)` is ``0``, then there are an even number of positive roots. If `pos(p)` is positive, then call ``r`` one of the real positive roots. Again divide by ``x-r`` to produce ``p = (x-r) \cdot q``. Then ``q_0`` must be *negative* for ``p_0`` to be positive. By *induction* ``q`` must have an odd number of positive roots, meaning ``p`` must have an even number.

So there is parity between `var(p)` and `pos(p)`: if ``p`` is monic and ``p_0 < 0``, then `var(p)` and `pos(p)` are both odd; if ``p_0 > 0``, both are even.

Descartes' rule of signs will be established if it can be shown that `var(p)` is at least as big as `pos(p)`. Suppose ``r`` is a positive real root of ``p`` with ``p = (x-r)q``. We show that `var(p) > var(q)`, which can be repeatedly applied to show that if ``p=(x-r_1)\cdot(x-r_2)\cdot \cdots \cdot (x-r_l) q``, where the ``r_i``s are the positive real roots, then `var(p) >= l + var(q) >= l = pos(p)`.

As ``p = (x-r)q``, the leading term satisfies ``p_nx^n = x \cdot q_{n-1} x^{n-1}``, so ``q_{n-1}`` will also be `+` under our monic assumption. Looking at a possible pattern for the signs of ``q``, we might see the following unfinished synthetic division table for a specific ``q``:

```verbatim
 + ? ? ? ? ? ? ? ?
+  ? ? ? ? ? ? ? ?
  -----------------
 + - - - + - + + 0
```

But actually, we can fill in more, as the second row is formed by multiplying by the positive value ``r``:

```verbatim
 + ? ? ? ? ? ? ? ?
+  + - - - + - + +
  -----------------
 + - - - + - + + 0
```

What's more, using the facts that to get a `0` the two summands must differ in sign, and that for a `?` plus a `+` to yield a `-` the `?` must be `-` (and conversely), the following must be the case for the signs of ``p``:

```verbatim
 + - ? ? + - + ? -
+  + - - - + - + +
  -----------------
 + - - - + - + + 0
```

If the bottom row represents ``q_7, q_6, \dots, q_0`` and the top row ``p_8, p_7, \dots, p_0``, then the sign changes in ``q`` from `+` to `-` are matched by sign changes in ``p``. The ones in ``q`` from `-` to `+` are also matched, regardless of the sign of the first two question marks (though ``p`` could possibly have more). The last sign change in ``p``, between ``p_2`` and ``p_0``, has no counterpart in ``q``, so there is at least one more sign change in ``p`` than in ``q``.

As such, `var(p)` ``\geq 1 +`` `var(q)`.

diff --git a/CwJ/precalc/polynomials_package.jmd b/CwJ/precalc/polynomials_package.jmd
deleted file mode 100644
index c954057..0000000
--- a/CwJ/precalc/polynomials_package.jmd
+++ /dev/null
@@ -1,514 +0,0 @@

# The Polynomials package

This section will use the following add-on packages:

```julia
import CalculusWithJulia
using Plots
using Polynomials
using RealPolynomialRoots
import SymPy # imported only: some functions, e.g. degree, need qualification
```

```julia; echo=false; results="hidden"
using CalculusWithJulia.WeaveSupport

const frontmatter = (
    title = "The Polynomials package",
    description = "Calculus with Julia: The Polynomials package",
    tags = ["CalculusWithJulia", "precalc", "the polynomials package"],
);

nothing
```

----

While `SymPy` can be used to represent polynomials, there are also native `Julia` packages available for this and related tasks. These packages include `Polynomials`, `MultivariatePolynomials`, and `AbstractAlgebra`, among many others. (A search on [juliahub.com](https://juliahub.com) found over ``50`` packages matching "polynomial".) We will look at the `Polynomials` package in the following, as it is straightforward to use and provides the features we need for univariate polynomials.
- - - -## Construction - -The polynomial expression ``p = a_0 + a_1\cdot x + a_2\cdot x^2 + -\cdots + a_n\cdot x^n`` can be viewed mathematically as a vector of -numbers with respect to some "basis", which for standard polynomials, -as above, is just the set of monomials, ``1, x, x^2, \dots, -x^n``. With this viewpoint, the polynomial ``p`` can be identified -with the vector `[a0, a1, a2, ..., an]`. The `Polynomials` package -provides a wrapper for such an identification through the `Polynomial` -constructor. We have previously loaded this add-on package. - -To illustrate, the polynomial ``p = 3 + 4x + 5x^2`` is constructed with - -```julia; -p = Polynomial([3,4,5]) -``` - -where the vector `[3,4,5]` represents the coefficients. The polynomial ``q = 3 + 5x^2 + 7x^4`` has some coefficients that are ``0``, these too must be indicated on construction, so we would have: - -```julia; -q = Polynomial([3,0,5,0,7]) -``` - -The `coeffs` function undoes `Polynomial`, returning the coefficients from a `Polynomial` object. - -```julia; -coeffs(q) -``` - -Once defined, the usual arithmetic operations for polynomials follow: - -```julia; -p + q -``` - -```julia; -p*q + p^2 -``` - -A polynomial has several familiar methods, such as `degree`: - -```julia; -degree(p), degree(q) -``` - -The zero polynomial has degree `-1`, by convention. - -Polynomials may be evaluated using function notation, that is: - -```julia; -p(1) -``` - -This blurs the distinction between a polynomial expression -- a formal object consisting of an indeterminate, coefficients, and the operations of addition, subtraction, multiplication, and non-negative integer powers -- and a polynomial function. 
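The identification of a polynomial with its coefficient vector makes evaluation concrete: base `Julia`'s `evalpoly` applies Horner's scheme to the coefficients and agrees with calling the `Polynomial` object. A small check, using the coefficients of the ``p = 3 + 4x + 5x^2`` from above:

```julia
# Horner evaluation of 3 + 4x + 5x^2 from its coefficient vector;
# this agrees with calling the Polynomial object, e.g. p(2).
cs = [3, 4, 5]
evalpoly(2, cs)   # 3 + 4*2 + 5*2^2 = 31
```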
The polynomial variable, in this case `1x`, can be returned by `variable`:

```julia;
x = variable(p)
```

This variable is a `Polynomial` object, so it can be manipulated as a polynomial; we can then construct polynomials through expressions like:

```julia;
r = (x-2)^2 * (x-1) * (x+1)
```

The product is expanded for storage by `Polynomials`, which may not be desirable for some uses. A new variable can be produced by calling `variable()`; so we could have constructed `p` by:

```julia; hold=true;
x = variable()
3 + 4x + 5x^2
```

A polynomial in factored form, as `r` above is, can be constructed from its roots. Above, `r` has roots ``2`` (twice), ``1``, and ``-1``. Passing these as a vector to `fromroots` re-produces `r`:

```julia;
fromroots([2,2,1,-1])
```

The `fromroots` function is basically the [factor theorem](https://en.wikipedia.org/wiki/Factor_theorem), which links the factored form of the polynomial with the roots of the polynomial: ``(x-k)`` is a factor of ``p`` if and only if ``k`` is a root of ``p``. By combining a factor of the type ``(x-k)`` for each specified root, the polynomial can be constructed by multiplying its factors. For example, using `prod` and a generator, we would have:

```julia; hold=true;
x = variable()
prod(x - k for k in [2,2,1,-1])
```

The `Polynomials` package has different ways to represent polynomials, and a factored form can also be used. For example, the `fromroots` function constructs polynomials from the specified roots and `FactoredPolynomial` leaves these in a factored form:

```julia
fromroots(FactoredPolynomial, [2, 2, 1, -1])
```

This form is helpful for some operations, for example polynomial multiplication and positive integer exponentiation, but not others, such as addition, where polynomials must first be converted to the standard basis to add and are then converted back into a factored form.
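What `fromroots` produces in the standard basis can be sketched by hand: multiplying by each ``(x - r)`` is a small shift-and-subtract on the coefficient vector. (The helper name `coeffs_from_roots` is ours, for illustration only; it mirrors, not reproduces, the package's implementation.)

```julia
# Build standard-basis coefficients from roots by repeatedly
# multiplying the running coefficient vector by (x - r).
function coeffs_from_roots(roots)
    cs = [1]                          # coefficients, highest degree first
    for r in roots
        cs = [cs; 0] .- r .* [0; cs]  # multiply the current polynomial by (x - r)
    end
    reverse(cs)                       # return in low-to-high order [a0, ..., an]
end

coeffs_from_roots([2, 2, 1, -1])
```

For the roots ``2, 2, 1, -1`` this returns the coefficients of ``x^4 - 4x^3 + 3x^2 + 4x - 4``, matching `coeffs(fromroots([2,2,1,-1]))`.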
- ----- - -The indeterminate, or polynomial symbol is a related, but different -concept to `variable`. Polynomials are stored as a collection of -coefficients, an implicit basis, *and* a symbol, in the above this symbol is -`:x`. A polynomial's symbol is checked to ensure that polynomials with different -symbols are not algebraically combined, except for the special case of constant -polynomials. The symbol is specified through a second argument on -construction: - -```julia; -s = Polynomial([1,2,3], "t") -``` - -As `r` uses "`x`", and `s` a "`t`" the two can not be added, say: - -```julia; -r + s -``` - - -## Graphs - -Polynomial objects have a plot recipe defined -- plotting from the `Plots` package should be as easy as calling `plot`: - -```julia; -plot(r, legend=false) # suppress the legend -``` - -The choice of domain is heuristically identified; it and can be manually adjusted, as with: - -```julia; -plot(r, 1.5, 2.5, legend=false) -``` - -## Roots - -The default `plot` recipe checks to ensure the real roots of the polynomial are included in the domain of the plot. To do this, it must identify the roots. This is done *numerically* by the `roots` function, as in this example: - -```julia; hold=true; -x = variable() -p = x^5 - x - 1 -roots(p) -``` - -A consequence of the fundamental theorem of algebra and the factor -theorem is that any fifth degree polynomial with integer coefficients -has ``5`` roots, where possibly some are complex. For real coefficients, -these complex values must come in conjugate pairs, which can be -observed from the output. The lone real root is approximately -`1.1673039782614187`. This value being a numeric approximation to the -irrational root. - - -!!! note - `SymPy` also has a `roots` function. If both `Polynomials` and `SymPy` are used together, calling `roots` must be qualified, as with `Polynomials.roots(...)`. Similarly, `degree` is provided in both, so it too must be qualified. 
-
-
-The `roots` function numerically identifies roots. As such, it is
-susceptible to floating point issues. For example, the following
-polynomial has one root with multiplicity ``5``, but ``5`` distinct
-roots are numerically identified:
-
-```julia; hold=true;
-x = variable()
-p = (x-1)^5
-roots(p)
-```
-
-The `Polynomials` package has the `multroot` function to identify
-roots of polynomials when multiplicities are expected. This function
-is not exported, so is called through:
-
-```julia; hold=true;
-x = variable()
-p = (x-1)^5
-Polynomials.Multroot.multroot(p)
-```
-
-
-Floating point error can also prevent the finding of real roots. For
-example, this polynomial has ``3`` real roots, but `roots` finds only
-``1``, as the two nearby ones are identified as complex:
-
-```julia; hold=true;
-x = variable()
-p = -1 + 254x - 16129x^2 + x^9
-roots(p)
-```
-
-The `RealPolynomialRoots` package, loaded at the top of this section,
-can assist in identifying the real roots of square-free polynomials
-(those with no repeated roots). For example:
-
-```julia; hold=true;
-x = variable()
-ps = coeffs(-1 + 254x - 16129x^2 + x^9)
-st = ANewDsc(ps)
-refine_roots(st)
-```
-
-
-## Fitting a polynomial to data
-
-The fact that two distinct points determine a line is well known.
-Deriving the line is easy. Say we have two points ``(x_0, y_0)`` and
-``(x_1, y_1)``. The *slope* is then
-
-```math
-m = \frac{y_1 - y_0}{x_1 - x_0}, \quad x_1 \neq x_0
-```
-
-The line is then given from the *point-slope* form by, say, ``y= y_0 +
-m\cdot (x-x_0)``. This all assumes ``x_1 \neq x_0``; were that not the
-case, the slope would be infinite (though the vertical line ``x=x_0``
-would still be determined).
-
-A line, ``y=mx+b``, can be a linear polynomial or a constant depending
-on ``m``, so we could say ``2`` points determine a polynomial of
-degree ``1`` or less. Similarly, ``3`` distinct points determine a
-degree ``2`` polynomial or less, ``\dots``, ``n+1`` distinct points
-determine a polynomial of degree ``n`` or less.
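-
-The two-point case can be sketched directly from the point-slope formula above; here the line through ``(1,2)`` and ``(3,8)`` is evaluated at an intermediate value:
-
-```julia; hold=true;
-x0, y0, x1, y1 = 1, 2, 3, 8
-m = (y1 - y0) / (x1 - x0)   # slope: (8-2)/(3-1) = 3.0
-y0 + m * (2.5 - x0)         # point-slope form evaluated at x = 2.5
-```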
Finding a polynomial ``p`` that goes through ``n+1`` points (i.e., ``p(x_i)=y_i`` for each ``i``) is called [polynomial interpolation](https://en.wikipedia.org/wiki/Polynomial_interpolation). The main theorem is:
-
-
-> *Polynomial interpolation theorem*: There exists a unique polynomial of degree ``n`` or less that interpolates the points ``(x_0,y_0), (x_1,y_1), \dots, (x_n, y_n)`` when the ``x_i`` are distinct.
-
-(Uniqueness follows: were ``p`` and ``q`` both to satisfy the above,
-then ``(p-q)(x)`` would be ``0`` at each of the ``x_i`` and of degree
-``n`` or less, so must be the ``0`` polynomial. Existence comes by
-construction. See the Lagrange basis in the questions.)
-
-Knowing we can succeed, we approach the problem of ``3`` points, say
-``(x_0, y_0)``, ``(x_1,y_1)``, and ``(x_2, y_2)``. There is a
-polynomial ``p = a\cdot x^2 + b\cdot x + c`` with ``p(x_i) = y_i``.
-This gives ``3`` equations for the ``3`` unknown values ``a``, ``b``,
-and ``c``:
-
-```math
-\begin{align*}
-a\cdot x_0^2 + b\cdot x_0 + c &= y_0\\
-a\cdot x_1^2 + b\cdot x_1 + c &= y_1\\
-a\cdot x_2^2 + b\cdot x_2 + c &= y_2\\
-\end{align*}
-```
-
-Solving this with `SymPy` is tractable. A comprehension is used below
-to create the ``3`` equations; the `zip` function is a simple means to
-iterate over ``2`` or more iterables simultaneously:
-
-
-```julia
-SymPy.@syms a b c xs[0:2] ys[0:2]
-eqs = [a*xi^2 + b*xi + c ~ yi for (xi,yi) in zip(xs, ys)]
-abc = SymPy.solve(eqs, [a,b,c])
-```
-
-As can be seen, the terms do get quite unwieldy when treated
-symbolically. Numerically, the `fit` function from the `Polynomials`
-package will return the interpolating polynomial. To compare,
-
-```julia
-fit(Polynomial, [1,2,3], [3,1,2])
-```
-
-and we can check that the two give the same answer with, for example:
-
-```julia
-abc[b]((xs .=> [1,2,3])..., (ys .=> [3,1,2])...)
-```
-
-(Ignore the tricky way of substituting in each value of `xs` and `ys`
-for the symbolic values in `x` and `y`.)
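-
-The existence half of the theorem is constructive, via the Lagrange basis detailed in the questions. A minimal sketch (`lagrange_interp` is a name introduced here only for illustration):
-
-```julia; hold=true;
-function lagrange_interp(xs, ys)
-    x = variable()
-    # sum of y_i * l_i(x), with l_i the i-th Lagrange basis polynomial
-    sum(ys[i] * prod((x - xs[j]) / (xs[i] - xs[j])
-                     for j in eachindex(xs) if j != i)
-        for i in eachindex(xs))
-end
-lagrange_interp([1, 2, 3], [3, 1, 2])  # same polynomial as fit(Polynomial, [1,2,3], [3,1,2])
-```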
-
-##### Example: Inverse quadratic interpolation
-
-A related problem, which will arise when finding iterative means to
-solve for zeros of functions, is *inverse* quadratic interpolation.
-That is, finding ``q`` that goes through the points ``(x_0,y_0), (x_1,
-y_1), \dots, (x_n, y_n)`` satisfying ``q(y_i) = x_i``. (That is, ``x``
-and ``y`` are reversed, as with inverse functions.) For the envisioned
-task, the point where the inverse quadratic crosses the ``x`` axis is
-of interest. As the roles of ``x`` and ``y`` are reversed, this
-crossing occurs at ``q(0)``, the constant term of the polynomial (the
-analog of the ``y``-intercept of a typical polynomial). Let's see what
-that is in general by replicating the above steps (though now the
-assumption is that the ``y`` values are distinct):
-
-```julia; hold=true;
-SymPy.@syms a b c xs[0:2] ys[0:2]
-eqs = [a*yi^2 + b*yi + c ~ xi for (xi, yi) in zip(xs,ys)]
-abc = SymPy.solve(eqs, [a,b,c])
-abc[c]
-```
-
-We can graphically see the result for the specific values of `xs` and
-`ys` as follows:
-
-```julia; hold=true; echo=false
-SymPy.@syms a b c xs[0:2] ys[0:2]
-eqs = [a*yi^2 + b*yi + c ~ xi for (xi, yi) in zip(xs,ys)]
-abc = SymPy.solve(eqs, [a,b,c])
-abc[c]
-
-𝒙s, 𝒚s = [1,2,3], [3,1,2]
-q = fit(Polynomial, 𝒚s, 𝒙s) # reverse
-# plot
-us = range(-1/4, 4, length=100)
-vs = q.(us)
-plot(vs, us, legend=false)
-scatter!(𝒙s, 𝒚s)
-plot!(zero)
-x0 = abc[c]((xs .=> 𝒙s)..., (ys .=> 𝒚s)...)
-scatter!([SymPy.N(x0)], [0], markershape=:star)
-```
-
-## Questions
-
-###### Question
-
-Do the polynomials ``p = x^4`` and ``q = x^2 - 2`` intersect?
-
-```julia; hold=true; echo=false;
-x = variable()
-p,q = x^4, x^2 - 2
-st = ANewDsc(coeffs(p-q))
-yesnoq(length(st) > 0)
-```
-
-###### Question
-
-
-Do the polynomials ``p = x^4-4`` and ``q = x^2 - 2`` intersect?
-
-```julia; hold=true; echo=false;
-x = variable()
-p,q = x^4-4, x^2 - 2
-st = ANewDsc(coeffs(p-q))
-yesnoq(length(st) > 0)
-```
-
-###### Question
-
-How many real roots does ``p = 1 + x + x^2 + x^3 + x^4 + x^5`` have?
-
-```julia; hold=true; echo=false;
-x = variable()
-p = 1 + x + x^2 + x^3 + x^4 + x^5
-st = (ANewDsc∘coeffs)(p)
-numericq(length(st))
-```
-
-###### Question
-
-Mathematically we say the ``0`` polynomial has no degree. What
-convention does `Polynomials` use? (Look at
-`degree(zero(Polynomial))`.)
-
-```julia; hold=true; echo=false;
-choices = ["`nothing`", "`-1`", "`0`", "`Inf`", "`-Inf`"]
-answ = 2
-radioq(choices, answ, keep_order=true)
-```
-
-###### Question
-
-Consider the polynomial ``p(x) = a_1 x - a_3 x^3 + a_5 x^5`` where
-
-```math
-\begin{align*}
-a_1 &= 4(\frac{3}{\pi} - \frac{9}{16}) \\
-a_3 &= 2a_1 -\frac{5}{2}\\
-a_5 &= a_1 - \frac{3}{2}.
-\end{align*}
-```
-
-
-* Form the polynomial `p` by first computing the ``a``s and forming `p=Polynomial([0,a1,0,-a3,0,a5])`
-* Form the polynomial `q` by these commands: `x=variable(); q=p(2x/pi)`
-
-The polynomial `q`, a ``5``th-degree polynomial, is a good
-approximation for the [sine](http://www.coranac.com/2009/07/sines/)
-function.
-
-Make graphs of both `q` and `sin`. Over which interval is the
-approximation (visually) a good one?
-
-```julia; hold=true; echo=false
-choices = ["``[0,1]``",
-"``[0,\\pi]``",
-"``[0,2\\pi]``"]
-radioq(choices, 1, keep_order=true)
-```
-
-(This [blog post](https://www.nullhardware.com/blog/fixed-point-sine-and-cosine-for-embedded-systems/) shows how this approximation is valuable under some specific circumstances.)
-
-###### Question
-
-The polynomial
-
-```julia
-fromroots([1,2,3,3,5])
-```
-
-has ``5`` sign changes and ``5`` real roots. For `x = variable()` use
-`div(p, x-3)` to find the result of dividing ``p`` by ``x-3``. How
-many sign changes are there in the new polynomial?
-
-```julia; hold=true; echo=false;
-numericq(4)
-```
-
-
-###### Question
-
-The identification of a collection of coefficients with a polynomial
-depends on an understood **basis**.
A basis for the polynomials of degree ``n`` or less consists of a minimal collection of polynomials for which all the polynomials of degree ``n`` or less can be expressed as a sum of terms, each of which is just a coefficient times a basis member. The typical basis is the ``n+1`` polynomials ``1, x, x^2, \dots, x^n``. However, though every basis must have ``n+1`` members, they need not be these.
-
-A basis used by [Lagrange](https://en.wikipedia.org/wiki/Lagrange_polynomial)
-is the following. Let there be ``n+1`` distinct points ``x_0, x_1,
-\dots, x_n``. For each ``i`` in ``0`` to ``n`` define
-
-```math
-l_i(x) = \prod_{0 \leq j \leq n; j \ne i} \frac{x-x_j}{x_i - x_j} =
-\frac{(x-x_0)\cdot \cdots \cdot (x-x_{i-1}) \cdot (x-x_{i+1}) \cdot \cdots \cdot (x-x_n)}{(x_i-x_0)\cdot \cdots \cdot (x_i-x_{i-1}) \cdot (x_i-x_{i+1}) \cdot \cdots \cdot (x_i-x_n)}.
-```
-
-That is, ``l_i(x)`` is a product of terms like ``(x-x_j)/(x_i-x_j)``
-*except* when ``j=i``.
-
-
-What is the value of ``l_0(x_0)``?
-
-```julia; hold=true; echo=false
-numericq(1)
-```
-
-
-Why?
-
-```julia; hold=true; echo=false
-choices = ["""
-All terms like ``(x-x_j)/(x_0 - x_j)`` will be ``1`` when ``x=x_0`` and these are all the terms in the product defining ``l_0``.
-""",
-    "The term ``(x_0-x_0)`` will be ``0``, so the product will be zero"
-    ]
-radioq(choices, 1)
-```
-
-
-What is the value of ``l_i(x_i)``?
-
-```julia; hold=true; echo=false
-numericq(1)
-```
-
-
-What is the value of ``l_0(x_1)``?
-
-```julia; hold=true; echo=false
-numericq(0)
-```
-
-
-Why?
-
-```julia; hold=true; echo=false
-choices = ["""
-The term like ``(x-x_1)/(x_0 - x_1)`` will be ``0`` when ``x=x_1`` and so the product will be ``0``.
-""",
-    "The term ``(x-x_1)/(x_0-x_1)`` is omitted from the product, so the answer is non-zero."
-    ]
-radioq(choices, 1)
-```
-
-
-What is the value of ``l_i(x_j)`` *if* ``i \ne j``?
-
-```julia; hold=true; echo=false
-numericq(0)
-```
-
-
-Suppose the ``x_0, x_1, \dots, x_n`` are the ``x`` coordinates of
-``n+1`` distinct points ``(x_0,y_0), (x_1, y_1), \dots, (x_n,y_n)``.
-Form the polynomial with the above basis and coefficients being the
-``y`` values. That is, consider:
-
-```math
-p(x) = \sum_{i=0}^n y_i l_i(x) = y_0l_0(x) + y_1l_1(x) + \dots + y_nl_n(x)
-```
-
-What is the value of ``p(x_j)``?
-
-```julia; hold=true; echo=false
-choices = ["``0``", "``1``", "``y_j``"]
-radioq(choices, 3)
-```
-
-This last answer is why ``p`` is called an *interpolating* polynomial,
-and this question shows an alternative to solving a system of linear
-equations for identifying interpolating polynomials.
-
-
-###### Question
-
-The Chebyshev (``T``) polynomials are polynomials which use a
-different basis from the standard basis. Denote the basis elements
-``T_0``, ``T_1``, ... where we have ``T_0(x) = 1``, ``T_1(x) = x``,
-and for bigger indices ``T_{i+1}(x) = 2xT_i(x) - T_{i-1}(x)``. The
-next few are then:
-
-```math
-\begin{align*}
-T_2(x) &= 2xT_1(x) - T_0(x) = 2x^2 - 1\\
-T_3(x) &= 2xT_2(x) - T_1(x) = 2x(2x^2-1) - x = 4x^3 - 3x\\
-T_4(x) &= 2xT_3(x) - T_2(x) = 2x(4x^3-3x) - (2x^2-1) = 8x^4 - 8x^2 + 1
-\end{align*}
-```
-
-With these definitions what is the polynomial associated to the
-coefficients ``[0,1,2,3]`` with this basis?
-
-
-```julia; hold=true; echo=false
-choices = [
-    raw"""
-    It is ``0\cdot 1 + 1 \cdot x + 2 \cdot x^2 + 3\cdot x^3 = x + 2x^2 + 3x^3``
-    """,
-    raw"""
-It is ``0\cdot T_0(x) + 1\cdot T_1(x) + 2\cdot T_2(x) + 3\cdot T_3(x) = 0``
-""",
-    raw"""
-It is ``0\cdot T_0(x) + 1\cdot T_1(x) + 2\cdot T_2(x) + 3\cdot T_3(x) = -2 - 8\cdot x + 4\cdot x^2 + 12\cdot x^3``
-"""]
-radioq(choices, 3)
-```
-
-!!! note
-    The `Polynomials` package has an implementation, so you can check your answer through `convert(Polynomial, ChebyshevT([0,1,2,3]))`.
Similarly, the `SpecialPolynomials` package has these and many other polynomial bases represented.
-
-    The `ApproxFun` package is built on top of polynomials expressed in this basis, as the Chebyshev polynomials have special properties which make them very suitable for approximating functions. `ApproxFun` uses such easier-to-manipulate polynomials to approximate functions very accurately, making it useful for investigating properties of non-linear functions by leveraging properties of polynomials.
diff --git a/CwJ/precalc/ranges.jmd b/CwJ/precalc/ranges.jmd
deleted file mode 100644
index deadbf6..0000000
--- a/CwJ/precalc/ranges.jmd
+++ /dev/null
@@ -1,663 +0,0 @@
-# Ranges and Sets
-
-
-```julia; echo=false; results="hidden"
-using CalculusWithJulia
-using CalculusWithJulia.WeaveSupport
-frontmatter = (
-    title = "Ranges and Sets",
-    description = "Calculus with Julia: Ranges and Sets",
-    tags = ["CalculusWithJulia", "precalc", "ranges and sets"],
-);
-
-nothing
-```
-
-## Arithmetic sequences
-
-Sequences of numbers are prevalent in math. A simple one is just
-counting by ones:
-
-```math
-1, 2, 3, 4, 5, 6, 7, 8, 9, 10, \dots
-```
-
-Or counting by sevens:
-
-```math
-7, 14, 21, 28, 35, 42, 49, \dots
-```
-
-More challenging for humans is [counting backwards](http://www.psychpage.com/learning/library/assess/mse.htm) by 7:
-
-```math
-100, 93, 86, 79, \dots
-```
-
-These are examples of [arithmetic sequences](http://en.wikipedia.org/wiki/Arithmetic_progression). The form of the first $n+1$ terms in such a sequence is:
-
-```math
-a_0, a_0 + h, a_0 + 2h, a_0 + 3h, \dots, a_0 + nh
-```
-
-
-The formula for the $n$th term, $a_n$, can be written in terms of
-$a_0$, or any other $0 \leq m \leq n$, with $a_n = a_m + (n-m)\cdot h$.
-
-
-A typical question might be: the first term of an arithmetic sequence
-is equal to ``200`` and the common difference is equal to ``-10``.
-Find the value of $a_{20}$.
We could find this using $a_n = a_0 + n\cdot h$:
-
-```julia;hold=true
-a0, h, n = 200, -10, 20
-a0 + n * h
-```
-
-More complicated questions involve an unknown first value, as with: an
-arithmetic sequence has a common difference equal to ``10`` and its
-``6``th term is equal to ``52``. Find its ``15``th term, $a_{15}$.
-Here we have to compute $a_0 + 15 \cdot 10$. Either we could find
-$a_0$ (using $52 = a_0 + 6\cdot(10)$) or use the above formula:
-
-```julia;hold=true
-a6, h, m, n = 52, 10, 6, 15
-a15 = a6 + (n-m)*h
-```
-
-### The colon operator
-
-Rather than express sequences by the $a_0$, $h$, and $n$, `Julia` uses
-the starting point (`a`), the difference (`h`), and a *suggested*
-stopping value (`b`). That is, we need three values to specify these
-ranges of numbers: a `start`, a `step`, and a `stop`. `Julia` gives
-a convenient syntax for this: `a:h:b`. When the difference is just
-$1$, all numbers between the start and end are specified by `a:b`, as
-in
-
-```julia;
-1:10
-```
-
-But wait, nothing different printed? This is because `1:10` is
-efficiently stored: only the start and end points are kept, along with
-a recipe to generate the next number from the previous one. To expand
-the values, you have to ask for them to be `collect`ed (though this
-typically isn't needed in practice):
-
-```julia;
-collect(1:10)
-```
-
-
-When a non-default step size is needed, it goes in the middle, as in
-`a:h:b`. For example, counting by sevens from ``1`` to ``50`` is
-achieved by:
-
-```julia;
-collect(1:7:50)
-```
-
-Or counting down from 100:
-
-```julia;
-collect(100:-7:1)
-```
-
-In this last example, we said to end with ``1``, but it ended with
-``2``. The ending value in the range is a suggestion to go up to, but
-not exceed. Negative values for `h` are used to make decreasing
-sequences.
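-
-The colon notation encodes the same data as the formula ``a_n = a_0 + n\cdot h``; a quick sketch tying the two together, reusing the earlier sequence:
-
-```julia; hold=true
-a0, h, n = 200, -10, 20
-seq = a0:h:(a0 + n*h)    # 200, 190, ..., 0
-seq[end] == a0 + n*h     # true: the last term matches the formula
-```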
-
-### The range function
-
-For generating points to make graphs, a natural set of points to
-specify is $n$ evenly spaced points between $a$ and $b$. We can mimic
-creating this set with the range operation by solving for the correct
-step size. We have $a_0=a$ and $a_0 + (n-1) \cdot h = b$. (Why $n-1$
-and not $n$?) Solving yields $h = (b-a)/(n-1)$. To be concrete we
-might ask for ``9`` points between $-1$ and $1$:
-
-```julia;hold=true
-a, b, n = -1, 1, 9
-h = (b-a)/(n-1)
-collect(a:h:b)
-```
-
-Pretty neat. If we were doing this many times - such as once per plot -
-we'd want to encapsulate this into a function, for example:
-
-```julia;
-function evenly_spaced(a, b, n)
-    h = (b-a)/(n-1)
-    collect(a:h:b)
-end
-```
-
-Great, let's try it out:
-
-```julia;
-evenly_spaced(0, 2pi, 5)
-```
-
-Now, our implementation was straightforward, but only because it
-avoids some subtleties. Look at something simple:
-
-```julia;
-evenly_spaced(1/5, 3/5, 3)
-```
-
-It seems to work as expected. But looking just at the arithmetic, it
-isn't quite so clear:
-
-```julia;
-1/5 + 2*1/5 # last value
-```
-
-Floating point roundoff leads to the last value *exceeding* `0.6`, so
-should it be included? Well, here it is pretty clear it *should* be,
-but it is better to have something programmed that hits both `a` and
-`b` and adjusts `h` accordingly.
-
-Enter the base function `range`, which solves this seemingly simple -
-but not really - task. It can use `a`, `b`, and `n`. Like the range
-operation, this function returns a lazy object which can be collected
-to realize the values.
-
-The number of points is specified with keyword arguments, as in:
-
-```julia;
-xs = range(-1, 1, length=9) # or simply range(-1, 1, 9) as of v"1.7"
-```
-
-and
-
-```julia;
-collect(xs)
-```
-
-!!! note
-    There is also the `LinRange(a, b, n)` function which can be more performant than `range`, as it doesn't try to correct for floating point errors.
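-
-A quick sketch of the difference: unlike the naive step computation above, `range` arranges to hit both specified endpoints exactly:
-
-```julia; hold=true
-xs = range(1/5, 3/5, length=3)
-last(xs) == 3/5   # true: the right endpoint is hit exactly
-```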
-
-
-## Modifying sequences
-
-Now we concentrate on some more general ways to modify a sequence to
-produce a new sequence.
-
-### Filtering
-
-For example, another way to get the values between ``0`` and ``100``
-that are multiples of ``7`` is to start with all ``101`` values and
-throw out those that don't match. To check if a number is divisible by
-$7$, we could use the `rem` function. It gives the remainder upon
-division. Multiples of `7` match `rem(m, 7) == 0`. Checking for
-divisibility by seven is unusual enough that there is nothing built in
-for it, but checking for divisibility by $2$ is common, and for that
-there is a built-in function, `iseven`.
-
-The act of throwing out elements of a collection based on some
-condition is called *filtering*. The `filter` function does this in
-`Julia`; the basic syntax being
-`filter(predicate_function, collection)`.
-The "`predicate_function`" is one that returns either
-`true` or `false`, such as `iseven`. The output of `filter` consists
-of the new collection of values - those where the predicate returns
-`true`.
-
-To see it used, let's start with the numbers between `0` and `25`
-(inclusive) and keep those that are even:
-
-```julia;
-filter(iseven, 0:25)
-```
-
-
-To get the numbers between ``1`` and ``100`` that are divisible by $7$
-requires us to write a function akin to `iseven`, which isn't hard
-(e.g., `is_seven(x) = x%7 == 0` or, if being fancy,
-`Base.Fix2(iszero∘rem, 7)`), but isn't something we continue with just
-yet.
-
-For another example, here is an inefficient way to list the prime
-numbers between ``100`` and ``200``. This uses the `isprime` function
-from the `Primes` package:
-
-
-```julia;
-using Primes
-```
-
-```julia;
-filter(isprime, 100:200)
-```
-
-Illustrating `filter` at this point is mainly a motivation to show
-that we can start with a regular set of numbers and then modify or
-filter them. The function takes on more value once we discuss how to
-write predicate functions.
-
-### Comprehensions
-
-Let's return to the case of the set of even numbers between ``0`` and
-``100``. We have many ways to describe this set:
-
-- The collection of numbers $0, 2, 4, 6 \dots, 100$, or the arithmetic
-  sequence with step size ``2``, which is returned by `0:2:100`.
-
-- The numbers between ``0`` and ``100`` that are even, that is
-  `filter(iseven, 0:100)`.
-
-- The set of numbers $\{2k: k=0, \dots, 50\}$.
-
-While `Julia` has a special type for dealing with sets, we will use a
-vector for such a set. (Unlike a set, vectors can have repeated
-values, but as vectors are more widely used, we demonstrate them.)
-Vectors are described more fully in a previous section, but as a
-reminder, vectors are constructed using square brackets: `[]` (a
-special syntax for
-[concatenation](http://docs.julialang.org/en/latest/manual/arrays/#concatenation)).
-Square brackets are used in different contexts within `Julia`; in this
-case we use them to create a *collection*. If we separate single
-values in our collection by commas (or semicolons), we will create a
-vector:
-
-```julia;
-x = [0, 2, 4, 6, 8, 10]
-```
-
-
-That is of course only part of the set of even numbers we want.
-Creating more might be tedious were we to type them all out, as above.
-In such cases, it is best to *generate* the values.
-
-
-For this simple case, a range can be used, but more generally a
-[comprehension](http://julia.readthedocs.org/en/latest/manual/arrays/#comprehensions)
-provides this ability using a construct that closely mirrors a set
-definition, such as $\{2k: k=0, \dots, 50\}$. The simplest use of a
-comprehension takes this form (as we described in the section on
-vectors):
-
-`[expr for variable in collection]`
-
-
-The expression typically involves the variable specified after the
-keyword `for`. The collection can be a range, a vector, or many other
-items that are *iterable*.
Here is how the mathematical set $\{2k: k=0, -\dots, 50\}$ may be generated by a comprehension: - -```julia; -[2k for k in 0:50] -``` - -The expression is `2k`, the variable `k`, and the collection is the range -of values, `0:50`. The syntax is basically identical to how the math -expression is typically read aloud. - - -For some other examples, here is how we can create the first ``10`` numbers divisible by ``7``: - -```julia; -[7k for k in 1:10] -``` - -Here is how we can square the numbers between ``1`` and ``10``: - -```julia; -[x^2 for x in 1:10] -``` - -To generate other progressions, such as powers of ``2``, we could do: - -```julia; -[2^i for i in 1:10] -``` - -Here are decreasing powers of ``2``: - -```julia; -[1/2^i for i in 1:10] -``` - - - -Sometimes, the comprehension does not produce the type of output that -may be expected. This is related to `Julia`'s more limited abilities -to infer types at the command line. If the output type is important, -the extra prefix of `T[]` can be used, where `T` is the desired -type. We will see that this will be needed at times with symbolic math. - - -### Generators - -A typical pattern would be to generate a collection of numbers and then apply a function to them. For example, here is one way to sum the powers of ``2``: - -```julia; -sum([2^i for i in 1:10]) -``` - -Conceptually this is easy to understand, but computationally it is a -bit inefficient. The generator syntax allows this type of task to be -done more efficiently. To use this syntax, we just need to drop the -`[]`: - -```julia; -sum(2^i for i in 1:10) -``` - -(The difference being no intermediate object is created to store the collection of all values specified by the generator.) - -### Filtering generated expressions - -Both comprehensions and generators allow for filtering through the keyword `if`. 
The following shows *one* way to add the prime numbers in $[1,100]$:
-
-```julia;
-sum(p for p in 1:100 if isprime(p))
-```
-
-
-The value on the other side of `if` should be an expression that
-evaluates to either `true` or `false` for a given `p` (like a
-predicate function, but here specified as an expression). The value
-returned by `isprime(p)` is such.
-
-In this example, we use the fact that `rem(k, 7)` returns the
-remainder found from dividing `k` by `7`, and so is `0` when `k` is a
-multiple of `7`:
-
-```julia;
-sum(k for k in 1:100 if rem(k,7) == 0) ## add multiples of 7
-```
-
-The same `if` can be used in a comprehension. For example, this is an
-alternative to `filter` for identifying the numbers divisible by `7`
-in a range of numbers:
-
-```julia;
-[k for k in 1:100 if rem(k,7) == 0]
-```
-
-#### Example: Making change
-
-This example of Stefan Karpinski comes from a
-[blog](http://julialang.org/blog/2016/10/julia-0.5-highlights) post
-highlighting changes to the `Julia` language with version `v"0.5.0"`,
-which added features to comprehensions that made this example
-possible.
-
-First, a simple question: using pennies, nickels, dimes, and quarters,
-how many different ways can we generate one dollar? Clearly $100$
-pennies, or $20$ nickels, or $10$ dimes, or $4$ quarters will do this,
-so the answer is at least four, but how many more than four?
-
-Well, we can use a comprehension to enumerate the possibilities. This
-example illustrates how comprehensions and generators can involve one
-or more variables for the iteration.
-
-
-
-First, we either have $0,1,2,3$, or $4$ quarters; that is, $0$, $25$
-cents, $50$ cents, $75$ cents, or a dollar's worth. If we have, say,
-$1$ quarter, then we need to make up $75$ cents with the rest. If we
-had $3$ dimes, then we need to make up $45$ cents out of nickels and
-pennies; if we then had $6$ nickels, we know we must need $15$
-pennies.
-
-
-The following expression shows how counting this can be done through
-enumeration. Here `q` is the amount contributed by quarters, `d` the
-amount from dimes, `n` the amount from nickels, and `p` the amount
-from pennies. `q` ranges over $0, 25, 50, 75, 100$ or `0:25:100`,
-etc. If we know that the sum of quarters, dimes, and nickels
-contributes a certain amount, then the number of pennies must round
-things up to $100$.
-
-```julia;
-ways = [(q, d, n, p) for q = 0:25:100 for d = 0:10:(100 - q) for n = 0:5:(100 - q - d) for p = (100 - q - d - n)]
-length(ways)
-```
-
-We see ``242`` cases, each distinct. The first $3$ are:
-
-```julia;
-ways[1:3]
-```
-
-
-
-The generating expression reads naturally. It introduces the use of
-multiple `for` statements, each subsequent one depending on the value
-of the previous (working left to right). Now suppose we want to
-ensure that the amount in pennies is less than the amount in nickels,
-etc. We could use `filter` somehow to do this for our last answer, but
-using `if` allows for filtering while the values are being generated.
-Here our condition is simply expressed: `q > d > n > p`:
-
-
-```julia;
-[(q, d, n, p) for q = 0:25:100
-    for d = 0:10:(100 - q)
-    for n = 0:5:(100 - q - d)
-    for p = (100 - q - d - n)
-    if q > d > n > p]
-```
-
-## Random numbers
-
-We have been discussing structured sets of numbers. On the opposite
-end of the spectrum are random numbers. `Julia` makes them easy to
-generate, especially random numbers chosen uniformly from $[0,1)$.
-
-- The `rand()` function returns a randomly chosen number in $[0,1)$.
-
-- The `rand(n)` function returns a vector of `n` randomly chosen numbers in $[0,1)$.
-
-To illustrate, this command will return a single number:
-
-```julia;
-rand()
-```
-
-If the command is run again, it is almost certain that a different
-value will be returned:
-
-```julia;
-rand()
-```
-
-
-This call will return a vector of ``10`` such random numbers:
-
-```julia;
-rand(10)
-```
-
-The `rand` function is easy to use. The only common source of
-confusion is the subtle distinction between `rand()` and `rand(1)`, as
-the latter is a vector of ``1`` random number and the former is just
-``1`` random number.
-
-
-## Questions
-
-###### Question
-
-Which of these will produce the odd numbers between ``1`` and ``99``?
-
-```julia; hold=true;echo=false;
-choices = [
-q"1:99",
-q"1:3:99",
-q"1:2:99"
-]
-answ = 3
-radioq(choices, answ)
-```
-
-
-###### Question
-
-Which of these will create the sequence $2, 9, 16, 23, \dots, 72$?
-
-```julia; hold=true;echo=false;
-choices = [q"2:7:72", q"2:9:72", q"2:72", q"72:-7:2"]
-answ = 1
-radioq(choices, answ)
-```
-
-
-###### Question
-How many numbers are in the sequence produced by `0:19:1000`?
-
-```julia; hold=true;echo=false;
-val = length(collect(0:19:1000))
-numericq(val)
-```
-
-
-###### Question
-
-The range operation (`a:h:b`) can also be used to count down. Which of
-these will do so, counting down from `10` to `1`? (You can call
-`collect` to visualize the generated numbers.)
-
-```julia; hold=true;echo=false;
-choices = [
-"`10:-1:1`",
-"`10:1`",
-"`1:-1:10`",
-"`1:10`"
-]
-answ = 1
-radioq(choices, answ)
-```
-
-###### Question
-
-What is the last number generated by `1:4:7`?
-
-```julia; hold=true;echo=false;
-val = (1:4:7)[end]
-numericq(val)
-```
-
-###### Question
-
-While the range operation can generate vectors by collecting, do the
-objects themselves act like vectors?
-
-Does scalar multiplication work as expected? In particular, is the
-result of `2*(1:5)` *basically* the same as `2 * [1,2,3,4,5]`?
-
-```julia; hold=true;echo=false;
-yesnoq(true)
-```
-
-Does vector addition work as expected?
In particular, is the result of `(1:4) + (2:5)` *basically* the same as `[1,2,3,4] + [2,3,4,5]`?
-
-```julia; hold=true;echo=false;
-yesnoq(true)
-```
-
-What if parentheses are left off? What explains the output of
-`1:4 + 2:5`?
-
-```julia; hold=true;echo=false;
-choices = ["It is just random",
-"Addition happens prior to the use of `:` so this is like `1:(4+2):5`",
-"It gives the correct answer, a generator for the vector `[3,5,7,9]`"
-]
-answ = 2
-radioq(choices, answ)
-```
-
-###### Question
-
-How is `a:b-1` interpreted?
-
-```julia; hold=true;echo=false;
-choices = ["as `a:(b-1)`", "as `(a:b) - 1`, which is `(a-1):(b-1)`"]
-answ = 1
-radioq(choices, answ)
-```
-
-###### Question
-
-Create the sequence $10, 100, 1000, \dots, 1,000,000$ using a list
-comprehension. Which of these works?
-
-```julia; hold=true;echo=false;
-choices = [q"[10^i for i in 1:6]", q"[10^i for i in [10, 100, 1000]]", q"[i^10 for i in [1:6]]"]
-answ = 1
-radioq(choices, answ)
-```
-
-###### Question
-
-Create the sequence $0.1, 0.01, 0.001, \dots, 0.0000001$ using a list
-comprehension. Which of these will work?
-
-```julia; hold=true;echo=false;
-choices = [
-q"[10^-i for i in 1:7]",
-q"[(1/10)^i for i in 1:7]",
-q"[i^(1/10) for i in 1:7]"]
-answ = 2
-radioq(choices, answ)
-```
-
-###### Question
-
-Evaluate the expression $x^3 - 2x + 3$ for each of the values $-5, -4,
-\dots, 4, 5$ using a comprehension. Which of these will work?
-
-```julia; hold=true;echo=false;
-choices = [q"[x^3 - 2x + 3 for i in -5:5]", q"[x^3 - 2x + 3 for x in -(5:5)]", q"[x^3 - 2x + 3 for x in -5:5]"]
-answ = 3
-radioq(choices, answ)
-```
-
-###### Question
-
-How many prime numbers are there between $1100$ and $1200$? (Use
-`filter` and `isprime`.)
-
-```julia; hold=true;echo=false;
-val = length(filter(isprime, 1100:1200))
-numericq(val)
-```
-
-
-###### Question
-
-Which has more prime numbers: the range `1000:2000` or the range
-`11000:12000`?
-

```julia; hold=true;echo=false;
n1 = length(filter(isprime, 1000:2000))
n2 = length(filter(isprime, 11_000:12_000))
booleanq(n1 > n2, labels=[q"1000:2000", q"11000:12000"])
```

###### Question

We can easily sum the terms of an arithmetic progression with the `sum`
function. For example, `sum(1:100)` will add the numbers $1, 2, ..., 100$.


What is the sum of the odd numbers between $0$ and $100$?

```julia; hold=true;echo=false;
val = sum(1:2:99)
numericq(val)
```

###### Question

The sum of the arithmetic progression $a, a+h, \dots, a+n\cdot h$ has
a simple formula. Using a few cases, can you tell if this is the
correct one:

```math
(n+1)\cdot a + h \cdot n(n+1)/2
```

```julia; hold=true;echo=false;
booleanq(true, labels=["Yes, this is true", "No, this is false"])
```


###### Question

A *geometric progression* is of the form $a^0, a^1, a^2, \dots,
a^n$. These are easily generated by comprehensions of the form
`[a^i for i in 0:n]`. Find the sum of the geometric progression $1,
2^1, 2^2, \dots, 2^{10}$.

```julia; hold=true;echo=false;
as = [2^i for i in 0:10]
val = sum(as)
numericq(val)
```

Is your answer of the form $(1 - a^{n+1}) / (1-a)$?

```julia; hold=true;echo=false;
yesnoq(true)
```


###### Question

The [product](http://en.wikipedia.org/wiki/Arithmetic_progression) of
the terms in an arithmetic progression has a known formula. The product
can be found by an expression of the form `prod(a:h:b)`. Find the product of the terms in the sequence $1,3,5,\dots,19$.
- -```julia; hold=true;echo=false; -val = prod(1:2:19) -numericq(val) -``` diff --git a/CwJ/precalc/rational_functions.jmd b/CwJ/precalc/rational_functions.jmd deleted file mode 100644 index 2531dee..0000000 --- a/CwJ/precalc/rational_functions.jmd +++ /dev/null @@ -1,1078 +0,0 @@ -# Rational functions - -This section uses the following add-on packages: - -```julia -using CalculusWithJulia -using SymPy -using Plots -import Polynomials -using RealPolynomialRoots -``` - -The `Polynomials` package is "imported" to avoid naming collisions with `SymPy`; names will need to be qualified. - -```julia; echo=false; results="hidden" -using CalculusWithJulia.WeaveSupport - -const frontmatter = ( - title = "Rational functions", - description = "Calculus with Julia: Rational functions", - tags = ["CalculusWithJulia", "precalc", "rational functions"], -); - -using Roots - -nothing -``` - ----- - -A rational expression is the ratio of two polynomial expressions. -Such expressions arise in many modeling situations. As many facts are -known about polynomial expressions, much can be determined about -rational expressions. This section covers some additional details -that arise when graphing such expressions. - -## Rational functions - - - -The rational numbers are simply ratios of integers, of the form $p/q$ -for non-zero $q$. A rational function is a ratio of *polynomial* -*functions* of the form $p(x)/q(x)$, again $q$ is non-zero, but may -have zeros. - -We know that polynomials have nice behaviors due to the following -facts: - -* Behaviors at $-\infty$, $\infty$ are known just from the leading term. - -* There are possible wiggles up and down, the exact behavior depends - on intermediate terms, but there can be no more than $n-1$ wiggles. - -* The number of real zeros is no more than $n$, the degree of the polynomial. 
- -Rational functions are not quite so nice: - -* behavior at $-\infty$ and $\infty$ can be like a polynomial of any degree, including constants - -* behaviour at any value x can blow up due to division by ``0`` - - rational functions, unlike polynomials, need not be always defined - -* The function may or may not cross zero, even if the range includes - every other point, as the graph of $f(x) =1/x$ will show. - - -Here, as with our discussion on polynomials, we are interested for now in just -a few properties: - -* What happens to $f(x)$ when $x$ gets really big or really small (towards $\infty$ or $-\infty$)? - -* What happens near the values where $q(x) = 0$? - -* When is $f(x) = 0$? - - -These questions can often be answered with a graph, but with rational -functions we will see that care must be taken to produce -a useful graph. - -For example, consider this graph generated from a simple rational -function: - -```math -f(x) = \frac{(x-1)^2 \cdot (x-2)}{(x+3) \cdot (x-3)}. -``` - - -```julia; -f(x) = (x-1)^2 * (x-2) / ((x+3)*(x-3) ) -plot(f, -10, 10) -``` - -We would be hard pressed to answer any of the three questions above -from the graph, though, on inspection, we might think the strange spikes have something -to do with $x$ values where $q(x)=0$. - -The question of big or small $x$ is not answered well with this graph, -as the spikes dominate the scale of the $y$-axis. Setting a much -larger viewing window illuminates this question: - -```julia; -plot(f, -100, 100) -``` - -We can see from this, that the function eventually looks like a -slanted straight line. The *eventual* shape of the graph is something -that can be determined just from the two leading terms. - -The spikes haven't vanished completely. It is just that with only a -few hundred points to make the graph, there aren't any values near -enough to the problem to make a large spike. The spikes happen because -the function has a *vertical asymptote* at these values. 
Though not -quite right, it is reasonable to think of the graph being made -by selecting a few hundred points in the specified domain, computing -the corresponding $y$ values, plotting the pairs, and finally connecting the points with -straight line segments. Near a vertical asymptote the function values -can be arbitrarily large in absolute values, though at the vertical asymptote the -function is undefined. This graph doesn't show such detail. - -The spikes will be related to the points where $q(x) = 0$, though not -necessarily all of them -- not all such points will produce a -vertical asymptote. - -Where the function crosses $0$ is very hard to tell from these -two graphs. As well, other finer features, such as local peaks or valleys, -when present, can be hard to identify as the $y$-scale is set to -accommodate the asymptotes. Working around the asymptotes requires -some extra effort. Strategies are discussed herein. - - - -## Asymptotes - - -Formally, an [asymptote](http://en.wikipedia.org/wiki/Asymptote) of a -curve is a line such that the distance between the curve and the line -approaches $0$ as they tend to infinity. Tending to infinity can -happen as $x \rightarrow \pm \infty$ *or* $y \rightarrow \pm \infty$, the -former being related to *horizontal asymptotes* or *slant asymptotes*, -the latter being related to *vertical asymptotes*. - -### Behaviour as $x \rightarrow \infty$ or $x \rightarrow -\infty$. - -Let's look more closely at our example rational function using -symbolic math. - - -In particular, let's rewrite the expression in terms of its numerator and denominator: - -```julia; -@syms x::real -num = (x-1)^2 * (x-2) -den = (x+3) * (x-3) -``` - -Euclid's -[division](https://en.wikipedia.org/wiki/Polynomial_greatest_common_divisor#Euclidean_division) -algorithm can be used for polynomials $a(x)$ and $b(x)$ to produce $q(x)$ and -$r(x)$ with $a = b\cdot q + r$ *and* the degree of $r(x)$ is less than the -degree of $b(x)$. 
This is in direct analogy to the division algorithm of
integers, only there the value of the remainder, $r$, satisfies $0
\leq r < b$. Given $q(x)$ and $r(x)$ as above, we can reexpress the rational function

```math
\frac{a(x)}{b(x)} = q(x) + \frac{r(x)}{b(x)}.
```

The rational expression on the right-hand side has a denominator of larger degree than its numerator.


The division algorithm is implemented in `Julia` generically through the `divrem` method:

```julia;
q, r = divrem(num, den)
```

This yields the decomposition of `num/den`:

```julia;
q + r/den
```


A similar result can be found using the `apart` function, which can be easier to use if the expression is not given in terms of a separate numerator and denominator.

```julia;
g(x) = (x-1)^2 * (x-2) / ((x+3)*(x-3)) # as a function
h = g(x) # a symbolic expression
apart(h)
```


This decomposition breaks the rational expression into two pieces: $x-4$
and $40/(3x+9) + 2/(3x-9)$. The first piece would have a graph
that is the line with slope $1$ and $y$-intercept $-4$. As $x$ goes to
$\infty$, the second piece will clearly go towards ``0``, as this simple
graph shows:

```julia;
plot(apart(h) - (x - 4), 10, 100)
```

Similarly, a plot over $[-100, -10]$ would show decay towards $0$,
though in that case from below. Combining these two facts then, it is
now no surprise that the graph of the rational function $f(x)$ should
approach a straight line, in this case $y=x-4$ as $x \rightarrow \pm
\infty$.

We can easily do most of this analysis without needing a
computer or algebra. First, we should know the four eventual shapes of a polynomial; that the graph of $y=mx$ is a line with
slope $m$; that the graph of $y = c$ is a constant line at height $c$; and
that the graph of $y=c/x^m$, $m > 0$, will decay towards $0$ as $x
\rightarrow \pm\infty$. The latter should be clear, as $x^m$ gets big,
so its reciprocal goes towards $0$.
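Returning to the division step, the identity ``a = b\cdot q + r`` can be sanity-checked numerically. This is a sketch with plain functions rather than symbolic math; the quotient ``x - 4`` and remainder ``14x - 38`` were worked out by hand from `num` and `den` above (compare with what `divrem` returns):

```julia
# Numeric sanity check of a = b⋅q + r for the division above.
# q(x) = x - 4 and r(x) = 14x - 38 are hand-computed values; treat
# them as assumptions and compare with the output of `divrem`.
a(x) = (x-1)^2 * (x-2)
b(x) = (x+3) * (x-3)
q(x) = x - 4
r(x) = 14x - 38
x0 = 1.7                        # any point with b(x0) ≠ 0
a(x0) ≈ b(x0) * q(x0) + r(x0)   # true
```

The same check at a handful of random points is a quick way to catch an algebra slip when dividing by hand.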
-


The factored form, as $p$ is presented, is a bit hard to work with;
rather, we use the expanded form, which we get through the `cancel`
function

```julia;
cancel(h)
```

We can see that the numerator is of degree ``3`` and the denominator
of degree $2$. The leading terms are $x^3$ and $x^2$, respectively. If
we were to pull those out we would get:

```math
\frac{x^3 \cdot (1 - 4/x + 5/x^2 - 2/x^3)}{x^2 \cdot (1 - 9/x^2)}.
```

The terms $(1 - 4/x + 5/x^2 - 2/x^3)$ and $(1 - 9/x^2)$ go towards $1$
as $x \rightarrow \pm \infty$, as each term with $x$ goes towards
$0$. So the dominant term comes from the ratio of the leading terms,
$x^3$ and $x^2$. This ratio is $x$, so there will be an asymptote around a
line with slope $1$. (The fact that the asymptote is $y=x-4$ takes a
bit more work, as a division step is needed.)

Just by looking at the ratio of the two leading terms, the behaviour
as $x \rightarrow \pm \infty$ can be discerned. If this ratio is of:

* the form $c x^m$ with $m > 1$ then the shape will follow the polynomial growth of the monomial $c x^m$.

* the form $c x^m$ with $m=1$ then there will be a line with slope $c$ as a *slant asymptote*.

* the form $cx^0$ with $m=0$ (or just $c$) then there will be a *horizontal asymptote* $y=c$.

* the form $c/x^{m}$ with $m > 0$ then there will be a horizontal asymptote $y=0$, the $x$-axis.

To expand on the first point, where the degree of the numerator is
greater than that of the denominator, we have from the division
algorithm that if $a(x)$ is the numerator and $b(x)$ the denominator,
then $a(x)/b(x) = q(x) + r(x)/b(x)$ where the degree of $b(x)$ is
greater than the degree of $r(x)$, so the right-most term will have a
horizontal asymptote of $0$. This says that the graph
will eventually approach the graph of $q(x)$, giving more detail than just
saying it follows the shape of the leading term of $q(x)$, at the
expense of the work required to find $q(x)$.
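The case analysis above can be captured in a short helper. This is a hypothetical function written for illustration only (not part of `CalculusWithJulia` or `Polynomials`): given the leading coefficients and degrees of the numerator and denominator, it reports the eventual behavior.

```julia
# Hypothetical helper: classify eventual behavior of a rational function
# from the leading terms cp⋅x^m (numerator) and cq⋅x^n (denominator).
function eventual_behavior(cp, m, cq, n)
    c, d = cp / cq, m - n
    if d > 1
        "grows like the monomial $(c)x^$(d)"
    elseif d == 1
        "slant asymptote with slope $(c)"
    elseif d == 0
        "horizontal asymptote y = $(c)"
    else
        "horizontal asymptote y = 0 (the x-axis)"
    end
end
eventual_behavior(1, 3, 1, 2)   # our example: "slant asymptote with slope 1.0"
```

Of course this only reports the *type* of behavior; pinning down a slant asymptote's intercept still requires the division step.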
-

### Examples


Consider the rational expression

```math
\frac{17x^5 - 300x^4 - 1/2}{x^5 - 2x^4 + 3x^3 - 4x^2 + 5}.
```

The leading term of the numerator is $17x^5$ and the leading term of the denominator is $x^5$. The ratio is $17$ (or $17x^0 = 17x^{5-5}$). As such, we would have a horizontal asymptote $y=17$.

----

If we consider instead this rational expression:

```math
\frac{x^5 - 2x^4 + 3x^3 - 4x^2 + 5}{5x^4 + 4x^3 + 3x^2 + 2x + 1}
```

Then we can see that the ratio of the leading terms is $x^5 / (5x^4) = (1/5)x$. We expect a slant asymptote with slope $1/5$, though we would need to divide to see the exact intercept. This is found with, say:

```julia; hold=true;
p = (x^5 - 2x^4 + 3x^3 - 4x^2 + 5) / (5x^4 + 4x^3 + 3x^2 + 2x + 1)
quo, rem = divrem(numerator(p), denominator(p)) # or apart(p)
quo
```

----

The rational function

```math
\frac{5x^3 + 6x^2 + 2}{x-1}
```

has decomposition $5x^2 + 11x + 11 + 13/(x-1)$:

```julia;
top = 5x^3 + 6x^2 +2
bottom = x-1
quo, rem = divrem(top, bottom)
```

The graph of the ratio has nothing in common with the graph of the quotient for small $x$:

```julia;
plot(top/bottom, -3, 3)
plot!(quo, -3, 3)
```

But the graphs do match for large $x$:

```julia;
plot(top/bottom, 5, 10)
plot!(quo, 5, 10)
```


----

Finally, consider this rational expression in factored form:

```math
\frac{(x-2)^3\cdot(x-4)\cdot(x-3)}{(x-5)^4 \cdot (x-6)^2}.
```

By looking at the powers we can see that the leading term of the
numerator will be $x^5$ and the leading term of the denominator
$x^6$. The ratio is $1/x^1$. As such, we expect a horizontal asymptote
$y=0$, the $x$-axis.

#### Partial fractions

The `apart` function was useful to express a rational function in
terms of a polynomial plus additional rational functions whose
horizontal asymptotes are $0$.
This function computes the partial -fraction -[decomposition](https://en.wikipedia.org/wiki/Partial_fraction_decomposition) -of a rational function. Outside of the initial polynomial, this -decomposition is a reexpression of a rational function into a sum of -rational functions, where the denominators are *irreducible*, or -unable to be further factored (non-trivially) and the numerators have -lower degree than the denominator. Hence the horizontal asymptotes of $0$. - -To see another example we have: - -```julia; hold=true; -p = (x-1)*(x-2) -q = (x-3)^3 * (x^2 - x - 1) -apart(p/q) -``` - -The denominator, $q$, has factors $x-3$ and $x^2 - x - 1$, each -irreducible. The answer is expressed in terms of a sum of rational -functions each with a denominator coming from one of these factors, -possibly with a power. - -### Vertical asymptotes - -As just discussed, the graph of $1/x$ will have a horizontal asymptote. However it will also show a spike at $0$: - -```julia; -plot(1/x, -1, 1) -``` - -Again, this spike is an artifact of the plotting algorithm. The -$y$ values for $x$-values just smaller than $0$ are large negative -values and the $x$ values just larger than $0$ produce large, positive -$y$ values. - -The two points with $x$ components closest to $0$ are connected with a -line, though that is misleading. Here we deliberately use far fewer -points to plot ``1/x`` to show how this happens: - -```julia; hold=true; -f(x) = 1/x -xs = range(-1, 1, length=12) -ys = f.(xs) -plot(xs, ys) -scatter!(xs, ys) -``` -The line $x = 0$ is a *vertical asymptote* for the graph of $1/x$. As -$x$ values get close to $0$ from the right, the $y$ values go towards -$\infty$ and as the $x$ values get close to $0$ on the left, the $y$ -values go towards $-\infty$. - -This has everything to do with the fact that $0$ is a root of the denominator. - -For a rational function $p(x)/q(x)$, the roots of $q(x)$ may or may -not lead to vertical asymptotes. 
For a root $c$, if $p(c)$ is not zero then the line
$x=c$ will be a vertical asymptote. If $c$ is a root of both $p(x)$
and $q(x)$, then we can rewrite the expression as:

```math
\frac{p(x)}{q(x)} = \frac{(x-c)^m r(x)}{(x-c)^n s(x)},
```

where both $r(c)$ and $s(c)$ are nonzero. Knowing $m$ and $n$ (the multiplicities of the root $c$) allows the following to be said:

* If $m < n$ then $x=c$ will be a vertical asymptote.

* If $m \geq n$ then $x=c$ will not be a vertical asymptote. (The value
  $c$ is known as a *removable singularity*.) In this case, the
  graph of $p(x)/q(x)$ and the graph of $(x-c)^{m-n}r(x)/s(x)$ will
  differ, though very slightly, as the latter will include a value for
  $x=c$, whereas $x=c$ is not in the domain of $p(x)/q(x)$.

Finding the multiplicity may or may not be hard, but there is a very
kludgy quick check that is often correct. With `Julia`, if you have a
rational function that has `f(c)` evaluate to `Inf` or `-Inf` then
there will be a vertical asymptote. If the expression evaluates to
`NaN`, more analysis is needed. (The value of `0/0` is `NaN`, whereas
`1/0` is `Inf`.)

For example, the function
``f(x) = ((x-1)^2 \cdot (x-2)) / ((x+3) \cdot(x-3))`` has vertical asymptotes at ``-3`` and ``3``, as its graph
illustrates. Without the graph we could see this as well:

```julia; hold=true;
f(x) = (x-1)^2 * (x-2) / ((x+3)*(x-3) )
f(3), f(-3)
```

#### Graphing with vertical asymptotes

As seen in several graphs, the basic plotting algorithm does a poor
job with vertical asymptotes. For example, it may erroneously connect
function values across an asymptote with a steep vertical line, or the $y$-axis scale can get
so large as to make reading the rest of the graph impossible. There are some tricks to work around this.

Consider again the function $f(x) = ((x-1)^2 \cdot (x-2)) / ((x+3)
\cdot(x-3))$.
Without much work, we can see that $x=3$ and $x=-3$ will
be vertical asymptotes and there will be a slant asymptote with
slope ``1``. How to graph this?

We can avoid the vertical asymptotes in our viewing window. For
example, we could look at the area between the vertical asymptotes, by
plotting over $(-2.9, 2.9)$, say:

```julia;
𝒇(x) = (x-1)^2 * (x-2) / ((x+3)*(x-3) )
plot(𝒇, -2.9, 2.9)
```

This backs off from the asymptotes by $\delta = 0.1$. As $3 - 2.9$ is
$\delta$ and $1/\delta$ is only $10$, the $y$-axis won't get too large, and
indeed it doesn't.

This graph doesn't show well the two zeros at $x=1$ and $x=2$; for
that, a narrower viewing window is needed. By successively panning
throughout the interesting part of the graph, we can get a view of the
function.


We can also clip the `y` axis. The `plot` function can be passed an
argument `ylims=(lo, hi)` to limit which values are plotted. With this,
we can have:

```julia; hold=true
plot(𝒇, -5, 5, ylims=(-20, 20))
```

This isn't ideal, as the large values are still computed, just the
viewing window is clipped. This leaves the vertical asymptotes still
affecting the graph.

There is another way: we could ask `Julia` not to plot $y$ values that
get too large. This is not a big request. If, instead of the value of
`f(x)` when it is large, we use `NaN`, then the
connect-the-dots algorithm will skip those values.

This was discussed in an earlier section where the `rangeclamp` function was introduced to replace large values of `f(x)` (in absolute value) with `NaN`.



```julia;
plot(rangeclamp(𝒇, 30), -25, 25) # rangeclamp is in the CalculusWithJulia package
```

We can see the general shape of the ``3`` curves broken up by the vertical
asymptotes: the two on the sides heading off towards the line $y=x-4$ and
the one in the middle. We still can't see the precise location of the
zeros, but that is typical of graphs that show
asymptotic behaviors.
However, we can clearly tell where to "zoom in", were those of interest.


### Sign charts

When sketching graphs of rational functions by hand, it is useful to use sign charts.
A sign chart of a function indicates when the function is positive,
negative, $0$, or undefined. It typically is represented along the
lines of this one for $f(x) = x^3 - x$:

```verbatim
   -    0    +    0    -    0    +
< ----- -1 ----- 0 ----- 1 ----- >
```


The usual recipe for construction follows these steps:

- Identify when the function is $0$ or undefined. Place those values
  on a number line.

- Identify "test points" within each implied interval (these are $(-\infty, -1)$, $(-1,0)$, $(0,1)$, and $(1, \infty)$ in the example) and check for the sign of $f(x)$ at these test points. Write in `-`, `+`, `0`, or `*`, as appropriate. The value comes from the fact that "continuous" functions may only change sign when they cross $0$ or are undefined.

With the computer, where it is convenient to draw a graph, it might be better to emphasize
the sign on the graph of the function. The `sign_chart` function from `CalculusWithJulia` does this by numerically identifying points where the function is ``0`` or ``\infty`` and indicating the sign as ``x`` crosses over these points.

```julia; echo=false
# in CalculusWithJulia
function sign_chart(f, a, b; atol=1e-6)
    pm(x) = x < 0 ? "-" : x > 0 ? "+" : "0"
    summarize(f,cp,d) = (∞0=cp, sign_change=pm(f(cp-d)) * " → " * pm(f(cp+d)))

    zs = find_zeros(f, a, b)
    pts = vcat(a, zs, b)
    for (u,v) ∈ zip(pts[1:end-1], pts[2:end])
        zs′ = find_zeros(x -> 1/f(x), u, v)
        for z′ ∈ zs′
            flag = false
            for z ∈ zs
                if isapprox(z′, z, atol=atol)
                    flag = true
                    break
                end
            end
            !flag && push!(zs, z′)
        end
    end
    sort!(zs)

    length(zs) == 0 && return []
    m,M = extrema(zs)
    d = min((m-a)/2, (b-M)/2)
    if length(zs) > 1            # guard: `diff` of a single zero is empty
        d′ = minimum(diff(zs))/2
        d = min(d, d′)
    end
    summarize.(f, zs, d)
end
```

```julia; hold=true;
f(x) = x^3 - x
sign_chart(f, -3/2, 3/2)
```


## Pade approximation

One area where rational functions are employed is in approximating
functions. Later, the Taylor polynomial will be seen to be a polynomial that
approximates a function well (where "well" will be described
later). The Pade approximation is similar, though it uses a rational
function of the form $p(x)/q(x)$, where $q(0)=1$ is customary.

Some example approximations are

```math
\sin(x) \approx \frac{x - 7/60 \cdot x^3}{1 + 1/20 \cdot x^2}
```

and

```math
\tan(x) \approx \frac{x - 1/15 \cdot x^3}{1 - 2/5 \cdot x^2}
```


We can look graphically at these approximations:

```julia;
sin_p(x) = (x - (7/60)*x^3) / (1 + (1/20)*x^2)
tan_p(x) = (x - (1/15)*x^3) / (1 - (2/5)*x^2)
plot(sin, -pi, pi)
plot!(sin_p, -pi, pi)
```

```julia;
plot(tan, -pi/2 + 0.2, pi/2 - 0.2)
plot!(tan_p, -pi/2 + 0.2, pi/2 - 0.2)
```

## The `Polynomials` package for rational functions

In the following, we import some functions from the `Polynomials` package. We avoided loading the entire namespace, as there are conflicts with `SymPy`. Here we import some useful functions and the `Polynomial` constructor:

```julia
import Polynomials: Polynomial, variable, lowest_terms, fromroots, coeffs
```

The `Polynomials` package has support for rational functions.
The `//` operator can be used to create rational expressions:

```julia;
𝒙 = variable()
𝒑 = (𝒙-1)*(𝒙-2)^2
𝒒 = (𝒙-2)*(𝒙-3)
𝒑𝒒 = 𝒑 // 𝒒
```

A rational expression is a formal object; a rational function takes the viewpoint that this object will be evaluated by substituting values for the indeterminate. Rational expressions made within `Polynomials` are evaluated just like functions:

```julia;
𝒑𝒒(4) # p(4)/q(4)
```

The rational expressions are not in lowest terms unless requested through the `lowest_terms` method:

```julia;
lowest_terms(𝒑𝒒)
```

For polynomials as simple as these, this computation is not a problem,
but there is the very real possibility that the lowest term
computation may be incorrect. Unlike `SymPy`, which factors
symbolically, `lowest_terms` uses a numeric algorithm and does not, as
would be done by hand or with `SymPy`, factor the polynomial and then cancel common
factors.

The distinction between the two expressions is sometimes made: the
initial expression is not defined at ``x=2``; the reduced one is, so
the two are not identical when viewed as functions of the variable
``x``.


Rational expressions include polynomial expressions, just as the
rational numbers include the integers. The identification there is to
divide by ``1``, thinking of ``3`` as ``3/1``. In `Julia`, we would
just use

```julia;
3//1
```

The integer can be recovered from the rational number using `numerator`:

```julia;
numerator(3//1)
```

Similarly, we can divide a polynomial by the polynomial ``1``, which in `Julia` is returned by `one(p)`, to produce a rational expression:

```julia;
pp = 𝒑 // one(𝒑)
```

And as with rational numbers, the polynomial is recovered by `numerator`:

```julia;
numerator(pp)
```

One difference is the rational number `3//1` also represents other
expressions, say `6/2` or `12/4`, as `Julia`'s rational numbers are
presented in lowest terms, unlike the rational expressions in
`Polynomials`.
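The distinction between the unreduced and reduced expressions can be seen numerically with plain floating-point functions mirroring `𝒑𝒒` and its lowest-terms reduction. This is a sketch; the `NaN` comes from the floating-point rule that `0/0` is `NaN`, not from `Polynomials` itself:

```julia
# The unreduced ratio hits 0/0 at the removable singularity x = 2,
# while the reduced (lowest-terms) form is defined there.
pq(x) = ((x-1) * (x-2)^2) / ((x-2) * (x-3))
lt(x) = ((x-1) * (x-2)) / (x-3)     # after cancelling the common x - 2
pq(2.0), lt(2.0)                    # (NaN, -0.0)
```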
- -Rational functions also have a plot recipe defined for them that -attempts to ensure the basic features are identifiable. As previously -discussed, a plot of a rational function can require some effort to -avoid the values associated to vertical asymptotes taking up too many -of the available vertical pixels in a graph. - -For the polynomial `pq` above, we have from observation that ``1`` and -``2`` will be zeros and ``x=3`` a vertical asymptote. We also can -identify a slant asymptote with slope ``1``. These are hinted at in this graph: - -```julia; -plot(𝒑𝒒) -``` - -To better see the zeros, a plot over a narrower interval, say ``[0,2.5]``, -would be encouraged; to better see the slant asymptote, a plot over -a wider interval, say ``[-10,10]``, would be encouraged. - - -For one more example of the default plot recipe, we redo the graphing of the rational expression we earlier plotted with `rangeclamp`: - -```julia; hold=true; -p,q = fromroots([1,1,2]), fromroots([-3,3]) -plot(p//q) -``` - - - -##### Example: transformations of polynomials; real roots - -We have seen some basic transformations of functions such as shifts and scales. For a polynomial expression we can implement these as follows, taking advantage of polynomial evaluation: - -```julia; hold=true; -x = variable() -p = 3 + 4x + 5x^2 -a = 2 -p(a*x), p(x+a) # scale, shift -``` - -A different polynomial transformation is inversion, or the mapping ``x^d \cdot p(1/x)`` where ``d`` is the degree of ``p``. This will yield a polynomial, as perhaps this example will convince you: - - -```julia; hold=true -p = Polynomial([1, 2, 3, 4, 5]) -d = Polynomials.degree(p) # degree is in SymPy and Polynomials, indicate which -pp = p // one(p) -x = variable(pp) -q = x^d * pp(1/x) -lowest_terms(q) -``` - -We had to use a rational expression so that division by the variable was possible. -The above indicates that the new polynomial, ``q``, is constructed from ``p`` by **reversing** the coefficients. 
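The reversal can be double-checked with plain coefficient vectors and `Base.evalpoly`, which evaluates a polynomial from its ascending coefficients (a sketch independent of the `Polynomials` package):

```julia
# If p has ascending coefficients [a₀, a₁, …, a_d], then q(x) = x^d ⋅ p(1/x)
# has the reversed coefficients.
cs  = [1, 2, 3, 4, 5]      # p = 1 + 2x + 3x² + 4x³ + 5x⁴
qcs = reverse(cs)          # q = 5 + 4x + 3x² + 2x³ + x⁴
d   = length(cs) - 1
x0  = 0.4
evalpoly(x0, qcs) ≈ x0^d * evalpoly(1/x0, cs)   # true
```

In particular, a root ``r \neq 0`` of ``p`` corresponds to the root ``1/r`` of the reversed polynomial.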
-

Inversion is like a funhouse mirror, flipping around parts of the polynomial. For example, the interval
``[1/4,1/2]`` is related to the interval ``[2,4]``. Of interest here is that if ``p(x)`` had a root, ``r``, in ``[1/4,1/2]`` then ``q(x) = x^d \cdot p(1/x)`` would have a root in ``[2,4]`` at ``1/r``.

So these three transformations -- scale, shift, and inversion -- can be defined for polynomials.

Combined, the three can be used to create a [Mobius transformation](https://en.wikipedia.org/wiki/M%C3%B6bius_transformation). For two values ``a`` and ``b``, consider the polynomial derived from ``p`` (again `d=degree(p)`) by:

```math
q = (x+1)^d \cdot p(\frac{ax + b}{x + 1}).
```

Here is a non-performant implementation as a `Julia` function:

```julia;
function mobius_transformation(p, a, b)
    x = variable(p)
    p = p(x + a) # shift
    p = p((b-a)*x) # scale
    p = Polynomial(reverse(coeffs(p))) # invert
    p = p(x + 1) # shift
    p
end
```

We can verify this does what we want through an example with the previously defined `p`:

```julia;
𝐩 = Polynomial([1, 2, 3, 4, 5])
𝐪 = mobius_transformation(𝐩, 4, 6)
```

As contrasted with

```julia; hold=true;
a, b = 4, 6

pq = 𝐩 // one(𝐩)
x = variable(pq)
d = Polynomials.degree(𝐩)
numerator(lowest_terms( (x + 1)^2 * pq((a*x + b)/(x + 1))))
```

----

Now, why is this of any interest?

Mobius transforms are used to map regions into other regions. In this special case, the transform ``\phi(x) = (ax + b)/(x + 1)`` takes the interval ``[0,\infty]`` and sends it to ``[a,b]`` (``0`` goes to ``(a\cdot 0 + b)/(0+1) = b``, whereas ``\infty`` goes to ``ax/x \rightarrow a``). Using this, if ``p(u) = 0``, with ``q(x) = (x+1)^d p(\phi(x))``, then setting ``u = \phi(x)`` we have ``q(x) = (\phi^{-1}(u)+1)^d p(\phi(\phi^{-1}(u))) = (\phi^{-1}(u)+1)^d \cdot p(u) = (\phi^{-1}(u)+1)^d \cdot 0 = 0``. That is, a zero of ``p`` in ``[a,b]`` will appear as a zero of ``q`` in ``[0,\infty)`` at ``\phi^{-1}(u)``.
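A quick numeric check of this root correspondence, with plain functions (a sketch: here ``p = (x-1)(x-3)(x-5)``, and ``\phi^{-1}(u) = (b-u)/(u-a)`` is solved from ``u = \phi(x)``):

```julia
# The root u = 5 of p lies in (a, b) = (4, 6); under φ(x) = (a⋅x + b)/(x + 1)
# it should appear as the positive root φ⁻¹(5) = 1 of (x+1)^d ⋅ p(φ(x)).
p(x)    = (x - 1) * (x - 3) * (x - 5)
a, b    = 4, 6
φ(x)    = (a*x + b) / (x + 1)
φinv(u) = (b - u) / (u - a)
q(x)    = (x + 1)^3 * p(φ(x))
φinv(5), q(φinv(5))              # (1.0, 0.0)
```

The other two roots, ``1`` and ``3``, fall outside ``(4,6)``, so ``\phi^{-1}`` sends them to negative values, consistent with the sign-change count below being exactly ``1``.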
- - - -The Descartes rule of signs applied to ``q`` then will give a bound on the number of possible roots of ``p`` in the interval ``[a,b]``. In the example we did, the Mobius transform for ``a=4, b=6`` is ``15 - x - 11x^2 - 3x^3`` with ``1`` sign change, so there must be exactly ``1`` real root of ``p=(x-1)(x-3)(x-5)`` in the interval ``[4,6]``, as we can observe from the factored form of ``p``. - -Similarly, we can see there are ``2`` or ``0`` roots for ``p`` in the interval ``[2,6]`` by counting the two sign changes here: - -```julia; -mobius_transformation(𝐩, 2,6) -``` - -This observation, along with a detailed analysis provided by [Kobel, Rouillier, and Sagraloff](https://dl.acm.org/doi/10.1145/2930889.2930937) provides a means to find intervals that enclose the real roots of a polynomial. - -The basic algorithm, as presented next, is fairly simple to understand, and hints at the bisection algorithm to come. It is due to Akritas and Collins. Suppose you know the only possible positive real roots are between ``0`` and ``M`` *and* no roots are repeated. Find the transformed polynomial over ``[0,M]``: - -* If there are no sign changes, then there are no roots of ``p`` in ``[0,M]``. -* If there is one sign change, then there is a single root of ``p`` in ``[0,M]``. The interval ``[0,M]`` is said to isolate the root (and the actual root can then be found by other means) -* If there is more than one sign change, divide the interval in two (``[0,M/2]`` and ``[M/2,M]``, say) and apply the same consideration to each. - -Eventually, **mathematically** this will find isolating intervals for each positive real root. (The negative ones can be similarly isolated.) 
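The steps above can be turned into a short, runnable sketch using only base `Julia`, with polynomials represented as ascending coefficient vectors. This is for illustration only, not the algorithm `RealPolynomialRoots` actually implements; the Möbius-transformed coefficients come directly from expanding ``q = \sum_k c_k (ax+b)^k (x+1)^{d-k}``, and simple roots are assumed:

```julia
# Isolate positive real roots by bisection plus Descartes' rule of signs.
# Polynomials are ascending coefficient vectors [c0, c1, ...].
polymul(p, q) = [sum(p[i+1] * q[k-i+1]
                     for i in max(0, k-length(q)+1):min(k, length(p)-1))
                 for k in 0:(length(p) + length(q) - 2)]
polypow(p, n) = n == 0 ? [1.0] : polymul(p, polypow(p, n - 1))

# coefficients of (x+1)^d ⋅ p((a⋅x + b)/(x + 1))
function mobius(cs, a, b)
    d = length(cs) - 1
    q = zeros(d + 1)
    for (k, c) in enumerate(cs)          # c multiplies u^(k-1)
        q .+= c .* polymul(polypow([b, a], k - 1), polypow([1.0, 1.0], d - k + 1))
    end
    q
end

# Descartes: count sign changes among the nonzero coefficients
function nsignchanges(cs)
    s = filter(!iszero, cs)
    count(i -> s[i] * s[i+1] < 0, 1:length(s)-1)
end

# bisect until each subinterval shows 0 or 1 sign change
function isolate(cs, a, b)
    n = nsignchanges(mobius(cs, a, b))
    n == 0 && return Tuple{Float64,Float64}[]
    n == 1 && return [(float(a), float(b))]
    m = (a + b) / 2
    vcat(isolate(cs, a, m), isolate(cs, m, b))
end

isolate([-15.0, 23.0, -9.0, 1.0], 0, 9)   # p = (x-1)(x-3)(x-5)
```

Each returned interval contains exactly one of the roots ``1``, ``3``, ``5``.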
- -Applying these steps to ``p`` with an initial interval, say ``[0,9]``, we would have: - -```julia; hold=true -p = fromroots([1,3,5]) # (x-1)⋅(x-3)⋅(x-5) = -15 + 23*x - 9*x^2 + x^3 -mobius_transformation(p, 0, 9) # 3 -mobius_transformation(p, 0, 9//2) # 2 -mobius_transformation(p, 9//2, 9) # 1 (and done) -mobius_transformation(p, 0, 9//4) # 1 (and done) -mobius_transformation(p, 9//4, 9//2) # 1 (and done) -``` - -So the three roots (``1``, ``3``, ``5``) are isolated by ``[0, 9/4]``, ``[9/4, 9/2]``, and ``[9/2, 9]``. - -### The `RealPolynomialRoots` package. - -For square-free polynomials, the `RealPolynomialRoots` package implements a basic version of the paper of [Kobel, Rouillier, and Sagraloff](https://dl.acm.org/doi/10.1145/2930889.2930937) to identify the real roots of a polynomial using the Descartes rule of signs and the Möbius transformations just described. - -The `ANewDsc` function takes a collection of coefficients representing a polynomial and returns isolating intervals for each real root. For example: - -```julia -p₀ = fromroots([1,3,5]) -st = ANewDsc(coeffs(p₀)) -``` - -These intervals can be refined to give accurate approximations to the roots: - -```julia -refine_roots(st) -``` - - -More challenging problems can be readily handled by this package. The following polynomial - - -```julia -𝒔 = Polynomial([0,1]) # also just variable(Polynomial{Int}) -𝒖 = -1 + 254*𝒔 - 16129*𝒔^2 + 𝒔^15 -``` - -has three real roots, two of which are clustered very close to each other: - -```julia -𝒔𝒕 = ANewDsc(coeffs(𝒖)) -``` - -and - -```julia -refine_roots(𝒔𝒕) -``` - -The SymPy package (`sympy.real_roots`) can accurately identify the -three roots but it can take a **very** long time. The -`Polynomials.roots` function from the `Polynomials` package identifies -the cluster as complex valued. 
Though the implementation in -`RealPolynomialRoots` doesn't handle such large polynomials, the -authors of the algorithm have implementations that can quickly solve -polynomials with degrees as high as ``10,000``. - - - - -## Questions - -###### Question - -The rational expression $(x^3 - 2x + 3) / (x^2 - x + 1)$ would have - -```julia; hold=true; echo=false -choices = [L"A horizontal asymptote $y=0$", -L"A horizontal asymptote $y=1$", -L"A slant asymptote with slope $m=1$"] -answ = 3 -radioq(choices, answ) -``` - -###### Question - - -The rational expression $(x^2 - x + 1)/ (x^3 - 2x + 3)$ would have - -```julia; hold=true; echo=false -choices = [L"A horizontal asymptote $y=0$", -L"A horizontal asymptote $y=1$", -L"A slant asymptote with slope $m=1$"] -answ = 1 -radioq(choices, answ) -``` - - - -###### Question - - -The rational expression $(x^2 - x + 1)/ (x^2 - 3x + 3)$ would have - -```julia; hold=true; echo=false -choices = [L"A horizontal asymptote $y=0$", -L"A horizontal asymptote $y=1$", -L"A slant asymptote with slope $m=1$"] -answ = 2 -radioq(choices, answ) -``` - - -###### Question - - -The rational expression - -```math -\frac{(x-1)\cdot(x-2)\cdot(x-3)}{(x-4)\cdot(x-5)\cdot(x-6)} -``` - -would have - - -```julia; hold=true; echo=false -choices = [L"A horizontal asymptote $y=0$", -L"A horizontal asymptote $y=1$", -L"A slant asymptote with slope $m=1$"] -answ = 2 -radioq(choices, answ) -``` - - -###### Question - - -The rational expression - -```math -\frac{(x-1)\cdot(x-2)\cdot(x-3)}{(x-4)\cdot(x-5)\cdot(x-6)} -``` - -would have - -```julia; hold=true; echo=false -choices = [L"A vertical asymptote $x=1$", -L"A slant asymptote with slope $m=1$", -L"A vertical asymptote $x=5$" -] -answ = 3 -radioq(choices, answ) -``` - - - - -###### Question - -The rational expression - -```math -\frac{x^3 - 3x^2 + 2x}{3x^2 - 6x + 2} -``` - -has a slant asymptote. What is the equation of that line? 
- -```julia; hold=true; echo=false -choices = [ - "``y = 3x``", - "``y = (1/3)x``", - "``y = (1/3)x - (1/3)``" -] -answ = 3 -radioq(choices, answ) -``` - - -###### Question - -Look at the graph of the function ``f(x) = ((x-1)\cdot(x-2)) / ((x-3)\cdot(x-4))`` - -```julia; hold=true; echo=false -f(x) = ((x-1) * (x-2)) / ((x-3) *(x-4)) -delta = 1e-1 -col = :blue -p = plot(f, -1, 3-delta, color=col, legend=false) -plot!(p, f, 3+delta, 4-3delta, color=col) -plot!(p,f, 4 + 5delta, 9, color=col) -p -``` - - -Is the following common conception true: "The graph of a function never crosses its asymptotes." - -```julia; hold=true; echo=false -choices = ["No, the graph clearly crosses the drawn asymptote", -"Yes, this is true"] -answ = 1 -radioq(choices, answ) -``` - -(The wikipedia page indicates that the term "asymptote" was introduced -by Apollonius of Perga in his work on conic sections, but in contrast -to its modern meaning, he used it to mean any line that does not -intersect the given curve. It can sometimes take a while to change perception.) - - - -###### Question - -Consider the two graphs of $f(x) = 1/x$ over $[10,20]$ and $[100, 200]$: - -```julia; hold=true; echo=false -plot(x -> 1/x, 10, 20) -``` - - -```julia; hold=true; echo=false -plot(x -> 1/x, 100, 200) -``` - -The two shapes are basically identical and do not look like straight lines. How does this reconcile with the fact that $f(x)=1/x$ has a horizontal asymptote $y=0$? - -```julia; hold=true; echo=false -choices = ["The horizontal asymptote is not a straight line.", -L"The $y$-axis scale shows that indeed the $y$ values are getting close to $0$.", -L"The graph is always decreasing, hence it will eventually reach $-\infty$." -] -answ = 2 -radioq(choices, answ) -``` - - -###### Question - -The amount of drug in a bloodstream after $t$ hours is modeled by the rational function - -```math -r(t) = \frac{50t^2}{t^3 + 20}, \quad t \geq 0. -``` - -What is the amount of the drug after $1$ hour? 
- -```julia; echo=false -r1(t) = 50t^2 / (t^3 + 20) -``` - -```julia; hold=true; echo=false -val = r1(1) -numericq(val) -``` - -What is the amount of drug in the bloodstream after 24 hours? - -```julia; hold=true; echo=false -val = r1(24) -numericq(val) -``` - -What is more accurate: the peak amount is - -```julia; hold=true; echo=false -choices = ["between ``0`` and ``8`` hours", - "between ``8`` and ``16`` hours", - "between ``16`` and ``24`` hours", - "after one day"] -answ = 1 -radioq(choices, answ) -``` - -This graph has - -```julia; hold=true; echo=false -choices = [L"a slant asymptote with slope $50$", -L"a horizontal asymptote $y=20$", -L"a horizontal asymptote $y=0$", -L"a vertical asymptote with $x = 20^{1/3}$"] -answ = 3 -radioq(choices, answ) -``` - - -###### Question - -The (low-order) Pade approximation for $\sin(x)$ was seen to be $(x - 7/60 \cdot x^3)/(1 + 1/20 \cdot x^2)$. The graph showed that this approximation was fairly close -over $[-\pi, \pi]$. Without graphing would you expect the behaviour of the function and its approximation to be similar for *large* values of $x$? - -```julia; hold=true; echo=false -yesnoq(false) -``` - -Why? 

```julia; hold=true; echo=false
choices = [
L"The $\sin(x)$ oscillates, but the rational function eventually follows $7/60 \cdot x^3$",
L"The $\sin(x)$ oscillates, but the rational function has a slant asymptote",
L"The $\sin(x)$ oscillates, but the rational function has a non-zero horizontal asymptote",
L"The $\sin(x)$ oscillates, but the rational function has a horizontal asymptote of $0$"]
answ = 2
radioq(choices, answ)
```
diff --git a/CwJ/precalc/transformations.jmd b/CwJ/precalc/transformations.jmd
deleted file mode 100644
index 0adc31c..0000000
--- a/CwJ/precalc/transformations.jmd
+++ /dev/null
@@ -1,644 +0,0 @@
# Function manipulations

In this section we will use these add-on packages:

```julia
using CalculusWithJulia
using Plots
```

```julia; echo=false; results="hidden"
using CalculusWithJulia.WeaveSupport
using DataFrames

const frontmatter = (
    title = "Function manipulations",
    description = "Calculus with Julia: Function manipulations",
    tags = ["CalculusWithJulia", "precalc", "function manipulations"],
);
nothing
```

----

Thinking of functions as objects themselves that can be manipulated - rather than just black boxes for evaluation - is a major abstraction of calculus. The main operations to come: the limit *of a function*, the derivative *of a function*, and the integral *of a function* all operate on functions. Hence the idea of an [operator](http://tinyurl.com/n5gp6mf). Here we discuss manipulations of functions from pre-calculus that have proven to be useful abstractions.

## The algebra of functions

We can talk about the algebra of functions. For example, the sum of functions $f$ and $g$ would be a function whose value at $x$ was just $f(x) + g(x)$. More formally, we would have:

```math
(f + g)(x) = f(x) + g(x).
```

We have given meaning to a new function $f+g$ by defining what it does to $x$ with the rule on the right hand side.
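This pointwise recipe is easy to check numerically. A quick sketch, where the helper name `fplusg` is arbitrary and used only for illustration:

```julia
# The pointwise rule (f + g)(x) = f(x) + g(x), checked at one input
f(x) = sin(x)
g(x) = sqrt(x)
fplusg = x -> f(x) + g(x)   # an anonymous function built from f and g

fplusg(4.0), f(4.0) + g(4.0)   # the two agree, by definition
```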
Similarly, we can define operations for subtraction, multiplication, division, and powers.

These mathematical concepts aren't defined for functions in base `Julia`, though they could be, if desired, by commands such as:

```julia;
import Base: +
f::Function + g::Function = x -> f(x) + g(x)
```

This adds a method to the generic `+` function for functions. The type annotations `::Function` ensure this applies only to functions. To see that it would work, we could do odd-looking things like:

```julia;
ss = sin + sqrt
ss(4)
```

Doing this works, as `Julia` treats functions as first-class objects, lending itself to [higher](https://en.wikipedia.org/wiki/Higher-order_programming)-order programming. However, this definition is somewhat limiting, as functions in mathematics and `Julia` can be much more varied than just the univariate functions we have defined addition for. We won't pursue this further.

### Composition of functions

As seen, just like with numbers, it can make sense mathematically to define addition, subtraction, multiplication, and division of functions. Unlike numbers though, we can also define a new operation on functions called **composition** that involves chaining the output of one function to the input of another. Composition is a common practice in life, where the result of some act is fed into another process. For example, making a pie from scratch involves first making a crust, then composing this with a filling. A better abstraction might be how we "surf" the web: the output of one search leads us to another search, whose output leads to yet another - a composition of searches.

Mathematically, a composition of univariate functions $f$ and $g$ is written $f \circ g$ and defined by what it does to a value in the domain of $g$ by:

```math
(f \circ g)(x) = f(g(x)).
```

The output of $g$ becomes the input of $f$.

Composition depends on the order of things.
There is no guarantee that $f \circ g$ should be the same as $g \circ f$. (Putting on socks then shoes is quite different from putting on shoes then socks.) Mathematically, we can see this quite clearly with the functions $f(x) = x^2$ and $g(x) = \sin(x)$. Algebraically we have:

```math
(f \circ g)(x) = \sin(x)^2, \quad (g \circ f)(x) = \sin(x^2).
```

Though they may be *typographically* similar, don't be fooled; the following graph shows that the two functions aren't even close except for $x$ near $0$ (for example, one composition is always non-negative, whereas the other is not):

```julia;hold=true
f(x) = x^2
g(x) = sin(x)
fg = f ∘ g # typed as f \circ[tab] g
gf = g ∘ f # typed as g \circ[tab] f
plot(fg, -2, 2, label="f∘g")
plot!(gf, label="g∘f")
```

!!! note
    Unlike how the basic arithmetic operations are treated, `Julia` defines the infix Unicode operator `\circ[tab]` to represent composition of functions, mirroring mathematical notation. This infix operation takes two functions and returns an anonymous function. It can be useful and will mirror standard mathematical usage up to issues with precedence rules.

Starting with two functions and composing them requires nothing more than a solid grasp of the rules of function evaluation. If $f(x)$ is defined by some rule involving $x$, then $f(g(x))$ just replaces each $x$ in the rule with a $g(x)$.

So if $f(x) = x^2 + 2x - 1$ and $g(x) = e^x - x$ then $f \circ g$ would be (before any simplification)

```math
(f \circ g)(x) = (e^x - x)^2 + 2(e^x - x) - 1.
```

It can be helpful to think of the argument to ``f`` as a "box" that gets filled in by ``g``:

```math
\begin{align*}
g(x) &=e^x - x\\
f(\square) &= (\square)^2 + 2(\square) - 1\\
f(g(x)) &= (g(x))^2 + 2(g(x)) - 1 = (e^x - x)^2 + 2(e^x - x) - 1.
-\end{align*} -``` - - -Here we look at a few compositions: - - -* The function $h(x) = \sqrt{1 - x^2}$ can be seen as $f\circ g$ with $f(x) = \sqrt{x}$ and $g(x) = 1-x^2$. - -* The function $h(x) = \sin(x/3 + x^2)$ can be viewed as $f\circ g$ with $f(x) = \sin(x)$ and $g(x) = x/3 + x^2$. - -* The function $h(x) = e^{-1/2 \cdot x^2}$ can be viewed as $f\circ g$ with $f(x) = e^{-x}$ and $g(x) = (1/2) \cdot x^2$. - -Decomposing a function into a composition of functions is not unique, -other compositions could have been given above. For example, the last -function is also $f(x) = e^{-x/2}$ composed with $g(x) = x^2$. - - -!!! note - The real value of composition is to break down more complicated things into a sequence of easier steps. This is good mathematics, but also good practice more generally. For example, when we approach a problem with the computer, we generally use a smallish set of functions and piece them together (that is, compose them) to find a solution. - - -### Shifting and scaling graphs - -It is very useful to mentally categorize functions within -families. The difference between $f(x) = \cos(x)$ and $g(x) = -12\cos(2(x - \pi/4))$ is not that much - both are cosine functions, -one is just a simple enough transformation of the other. As such, we -expect bounded, oscillatory behaviour with the details of how large -and how fast the oscillations are to depend on the specifics of -the function. Similarly, both these functions $f(x) = 2^x$ and $g(x)=e^x$ -behave like exponential growth, the difference being only in the rate -of growth. There are families of functions that are qualitatively -similar, but quantitatively different, linked together by a few basic transformations. - -There is a set of operations of functions, which does not really -change the type of function. Rather, it basically moves and stretches -how the functions are graphed. 
We discuss these four main transformations of $f$: - -```julia; echo=false - -nms = ["*vertical shifts*","*horizontal shifts*","*stretching*","*scaling*"] -acts = [L"The function $h(x) = k + f(x)$ will have the same graph as $f$ shifted up by $k$ units.", -L"The function $h(x) = f(x - k)$ will have the same graph as $f$ shifted right by $k$ units.", -L"The function $h(x) = kf(x)$ will have the same graph as $f$ stretched by a factor of $k$ in the $y$ direction.", -L"The function $h(x) = f(kx)$ will have the same graph as $f$ compressed horizontally by a factor of $1$ over $k$."] -table(DataFrame(Transformation=nms, Description=acts)) -``` - - -The functions $h$ are derived from $f$ in a predictable way. To implement these transformations within `Julia`, we define operators (functions which transform one function into another). As these return functions, the function bodies are anonymous functions. The basic definitions are similar, save for the `x -> ...` part that signals the creation of an anonymous function to return: - -```julia; -up(f, k) = x -> f(x) + k -over(f, k) = x -> f(x - k) -stretch(f, k) = x -> k * f(x) -scale(f, k) = x -> f(k * x) -``` - -To illustrate, let's define a hat-shaped function as follows: - -```julia; -𝒇(x) = max(0, 1 - abs(x)) -``` - -A plot over the interval ``[-2,2]`` is shown here: - -```julia -plot(𝒇, -2,2) -``` - -The same graph of $f$ and its image shifted up by ``2`` units would be given by: - -```julia; -plot(𝒇, -2, 2, label="f") -plot!(up(𝒇, 2), label="up") -``` - -A graph of $f$ and its shift over by $2$ units would be given by: - -```julia; -plot(𝒇, -2, 4, label="f") -plot!(over(𝒇, 2), label="over") -``` - -A graph of $f$ and it being stretched by $2$ units would be given by: - -```julia; -plot(𝒇, -2, 2, label="f") -plot!(stretch(𝒇, 2), label="stretch") -``` - -Finally, a graph of $f$ and it being scaled by $2$ would be given by: - - -```julia; -plot(𝒇, -2, 2, label="f") -plot!(scale(𝒇, 2), label="scale") -``` - -Scaling by $2$ 
shrinks the non-zero domain; scaling by $1/2$ would stretch it. If this is not intuitive, the definition `x -> f(x/c)` could have been used, which would have the opposite behaviour for scaling.

----

More exciting is what happens if we combine these operations.

A shift right by ``2`` and up by ``1`` is achieved through

```julia;
plot(𝒇, -2, 4, label="f")
plot!(up(over(𝒇,2), 1), label="over and up")
```

Shifting and scaling can be confusing. Here we graph `scale(over(𝒇,2),1/3)`:

```julia;
plot(𝒇, -1,9, label="f")
plot!(scale(over(𝒇,2), 1/3), label="over and scale")
```

This graph is over by $6$ with a width of $3$ on each side of the center. Mathematically, we have $h(x) = f((1/3)\cdot x - 2)$.

Compare this to the same operations in opposite order:

```julia;
plot(𝒇, -1, 5, label="f")
plot!(over(scale(𝒇, 1/3), 2), label="scale and over")
```

This graph first scales the symmetric graph, stretching it from $-3$ to $3$, then shifts it over right by $2$. The resulting function is $f((1/3)\cdot (x-2))$.

As a last example, a common mathematical transformation is

```math
h(x) = \frac{1}{a}f(\frac{x - b}{a}).
```

We can view this as a composition of "scale" by $1/a$, then "over" by $b$, and finally "stretch" by $1/a$:

```julia;hold=true
a = 2; b = 5
𝒉(x) = stretch(over(scale(𝒇, 1/a), b), 1/a)(x)
plot(𝒇, -1, 8, label="f")
plot!(𝒉, label="h")
```

(This transformation keeps the same amount of area in the triangles; can you tell from the graph?)

##### Example

A model for the length of a day in New York City must take into account periodic seasonal effects. A simple model might be a sine curve. However, there would need to be many modifications: obvious ones would be that the period would need to be about $365$ days, the oscillation should be around ``12`` hours, and the amplitude of the oscillations no more than ``12``.

We can be more precise.
According to [dateandtime.info](http://dateandtime.info/citysunrisesunset.php?id=5128581), in ``2015`` the longest day will be June ``21``st, when there will be ``15``h ``5``m ``46``s of sunlight, and the shortest day will be December ``21``st, when there will be ``9``h ``15``m ``19``s of sunlight. On January ``1``, there will be ``9``h ``19``m ``44``s of sunlight.

A model for a transformed sine curve is

```math
a + b\sin(d(x - c)),
```

where $b$ is related to the amplitude, $c$ to the shift, and the period is $T=2\pi/d$. We can find some of these easily from the above:

```julia;
a = 12
b = ((15 + 5/60 + 46/60/60) - (9 + 19/60 + 44/60/60)) / 2
d = 2pi/365
```

If we let January ``1`` be $x=0$, then the first day of spring, March ``21``, is day ``80`` (`Date(2017, 3, 21) - Date(2017, 1, 1) + 1`). This day aligns with the shift of the sine curve. This shift is ``80``:

```julia;
c = 80
```

Putting this together, our graph is "scaled" by $d$, "over" by $c$, "stretched" by $b$, and "up" by $a$. Here we plot it over slightly more than one year so that we can see that the shortest day of light is in late December ($x \approx -10$ or $x \approx 355$).

```julia;
newyork(t) = up(stretch(over(scale(sin, d), c), b), a)(t)
plot(newyork, -20, 385)
```

To test, if we match up with the model powering [dateandtime.info](http://dateandtime.info/citysunrisesunset.php?id=5128581), we note that it predicts "``15``h ``0``m ``4``s" on July ``4``, ``2015``. This is day ``185`` (`Date(2015, 7, 4) - Date(2015, 1, 1) + 1`). Our model prediction has a difference of

```julia;
datetime = 15 + 0/60 + 4/60/60
delta = (newyork(185) - datetime) * 60
```

This is off by a fair amount - almost ``12`` minutes.
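As a sanity check that the fitted parameters at least reproduce the solstice extremes, the model can be restated in a self-contained form and probed directly (a rough check, nothing more):

```julia
# Self-contained restatement of the daylight model: hours = a + b*sin(d*(t - c))
a = 12                                                     # center of oscillation, hours
b = ((15 + 5/60 + 46/3600) - (9 + 19/60 + 44/3600)) / 2    # amplitude from solstice data
c = 80                                                     # day of the spring equinox
d = 2pi / 365                                              # one-year period
newyork(t) = a + b * sin(d * (t - c))

newyork(172) - newyork(355)   # June 21st minus December 21st: about twice the amplitude
```

The maximum and minimum land within a day or so of the actual solstices, so the gross shape is right even though individual days can be off by minutes.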
Clearly a trigonometric model, based on the assumption of circular motion of the earth around the sun, is not accurate enough for precise work, but it does help one understand how summer days are longer than winter days and how the length of a day changes fastest at the spring and fall equinoxes.

##### Example: a growth model in fisheries

The von Bertalanffy growth [equation](https://en.wikipedia.org/wiki/Von_Bertalanffy_function) is $L(t) = L_\infty \cdot (1 - e^{-k\cdot(t-t_0)})$. This family of functions can be viewed as a transformation of the exponential function $f(t)=e^t$. Part is a scaling and shifting (the $e^{-k \cdot (t - t_0)}$ term) along with some shifting and stretching. The various parameters have physical importance which can be measured: $L_\infty$ is the asymptotic (maximum) length for the species or organism, and $k$ is a rate of growth. These parameters may be estimated from data by finding the "closest" curve to a given data set.

##### Example: the pipeline operator

In the last example, we described our sequence as scale, over, stretch, and up, but coded it in reverse order, as the composition $f \circ g$ is applied from right to left. A more convenient notation would be to have syntax that allows the composition of $g$ then $f$ to be written $x \rightarrow g \rightarrow f$. `Julia` provides the [pipeline](http://julia.readthedocs.org/en/latest/stdlib/base/#Base.|>) operator for chaining function calls together.

For example, if $g(x) = \sqrt{x}$ and $f(x) =\sin(x)$ we could call $f(g(x))$ through:

```julia; hold=true
g(x) = sqrt(x)
f(x) = sin(x)
pi/2 |> g |> f
```

The output of the preceding expression is passed as the input to the next. This notation is especially convenient when the enclosing function is not the main focus. (Some programming languages have more developed [fluent interfaces](https://en.wikipedia.org/wiki/Fluent_interface) for chaining function calls.
`Julia` has more powerful chaining macros provided in packages, such as `DataPipes.jl` or `Chain.jl`.)

### Operators

The functions `up`, `over`, etc. are operators that take a function as an argument and return a function. The use of operators fits in with the template `action(f, args...)`. The action is what we are doing, such as `plot`, `over`, and others to come. The function `f` here is just an object that we are performing the action on. For example, a plot takes a function and renders a graph using the additional arguments to select the domain to view, etc.

Creating operators that return functions involves the use of anonymous functions; using these operators is relatively straightforward. Two basic patterns are

- Storing the returned function, then calling it:

```julia; eval=false
l(x) = action1(f, args...)(x)
l(10)
```

- Composing two operators:

```julia; eval=false
action2(action1(f, args...), other_args...)
```

Composition like the above is convenient, but can get confusing if more than one composition is involved.

##### Example: two operators

(See [Krill](http://arxiv.org/abs/1403.5821) for background on this example.) Consider two operations on functions. The first takes the *difference* between adjacent points. We call this `D`:

```julia;
D(f::Function) = k -> f(k) - f(k-1)
```

To see that it works, we take a typical function

```julia;
𝐟(k) = 1 + k^2
```

and check:

```julia
D(𝐟)(3), 𝐟(3) - 𝐟(3-1)
```

That the two are the same value is no coincidence. (Again, pause for a second to make sure you understand why `D(f)(3)` makes sense. If this is unclear, you could name the function `D(f)` and then call this with a value of `3`.)

Now we want a function to cumulatively *sum* the values $S(f)(k) = f(1) + f(2) + \cdots + f(k-1) + f(k)$, as a function of $k$.
Adding up -$k$ terms is easy to do with a generator and the function `sum`: - -```julia; -S(f) = k -> sum(f(i) for i in 1:k) -``` - -To check if this works as expected, compare these two values: - -```julia; -S(𝐟)(4), 𝐟(1) + 𝐟(2) + 𝐟(3) + 𝐟(4) -``` - -So one function adds, the other subtracts. Addition and subtraction -are somehow inverse to each other so should "cancel" out. This holds -for these two operations as well, in the following sense: subtracting -after adding leaves the function alone: - -```julia; -k = 10 # some arbitrary value k >= 1 -D(S(𝐟))(k), 𝐟(k) -``` - -Any positive integer value of `k` will give the same answer (up to -overflow). This says the difference of the accumulation process is -just the last value to accumulate. - -Adding after subtracting also leaves the function alone, save for a vestige of $f(0)$. For example, `k=15`: - -```julia; -S(D(𝐟))(15), 𝐟(15) - 𝐟(0) -``` - -That is the accumulation of differences is just the difference of the end values. - -These two operations are discrete versions of the two main operations -of calculus - the derivative and the integral. This relationship will -be known as the "fundamental theorem of calculus." - -## Questions - -###### Question - -If $f(x) = 1/x$ and $g(x) = x-2$, what is $g(f(x))$? - -```julia; hold=true;echo=false -choices=["``1/(x-2)``", "``1/x - 2``", "``x - 2``", "``-2``"] -answ = 2 -radioq(choices, answ) -``` - -###### Question - -If $f(x) = e^{-x}$ and $g(x) = x^2$ and $h(x) = x-3$, what is $f \circ g \circ h$? 

```julia; hold=true;echo=false
choices=["``e^{-x^2 - 3}``", "``(e^x -3)^2``",
"``e^{-(x-3)^2}``", "``e^x+x^2+x-3``"]
answ = 3
radioq(choices, answ)
```

###### Question
If $h(x) = (f \circ g)(x) = \sin^2(x)$, which is a possibility for $f$ and $g$:

```julia; hold=true;echo=false
choices = [raw"``f(x)=x^2; \quad g(x) = \sin^2(x)``",
raw"``f(x)=x^2; \quad g(x) = \sin(x)``",
raw"``f(x)=\sin(x); \quad g(x) = x^2``"]
answ = 2
radioq(choices, answ)
```

###### Question
Which function would have the same graph as the sine curve shifted over by $4$ and up by $6$?

```julia; hold=true;echo=false
choices = [
raw"``h(x) = 4 + \sin(6x)``",
raw"``h(x) = 6 + \sin(x + 4)``",
raw"``h(x) = 6 + \sin(x-4)``",
raw"``h(x) = 6\sin(x-4)``"]
answ = 3
radioq(choices, answ)
```

###### Question
Let $h(x) = 4x^2$ and $f(x) = x^2$. Which is **not** true:

```julia; hold=true;echo=false
choices = [L"The graph of $h(x)$ is the graph of $f(x)$ stretched by a factor of ``4``",
L"The graph of $h(x)$ is the graph of $f(x)$ scaled by a factor of ``2``",
L"The graph of $h(x)$ is the graph of $f(x)$ shifted up by ``4`` units"]
answ = 3
radioq(choices, answ)
```

###### Question

The transformation $h(x) = (1/a) \cdot f((x-b)/a)$ can be viewed in one sequence:

```julia; hold=true;echo=false
choices = [L"scaling by $1/a$, then shifting by $b$, then stretching by $1/a$",
L"shifting by $a$, then scaling by $b$, and then scaling by $1/a$",
L"shifting by $a$, then scaling by $a$, and then scaling by $b$" ]
answ=1
radioq(choices, answ)
```

###### Question

This is the graph of a transformed sine curve.

```julia;hold=true;echo=false
f(x) = 2*sin(pi*x)
p = plot(f, -2,2)
```

What is the period of the graph?

```julia; hold=true;echo=false
val = 2
numericq(val)
```

What is the amplitude of the graph?

```julia; hold=true;echo=false
val = 2
numericq(val)
```

What is the form of the function graphed?
- -```julia; hold=true;echo=false -choices = [ -raw"``2 \sin(x)``", -raw"``\sin(2x)``", -raw"``\sin(\pi x)``", -raw"``2 \sin(\pi x)``" -] -answ = 4 -radioq(choices, answ) -``` - - - -###### Question - -Consider this expression - -```math -\left(f(1) - f(0)\right) + \left(f(2) - f(1)\right) + \cdots + \left(f(n) - f(n-1)\right) = --f(0) + f(1) - f(1) + f(2) - f(2) + \cdots + f(n-1) - f(n-1) + f(n) = -f(n) - f(0). -``` - -Referring to the definitions of `D` and `S` in the example on operators, which relationship does this support: - -```julia; hold=true;echo=false -choices = [ -q"D(S(f))(n) = f(n)", -q"S(D(f))(n) = f(n) - f(0)" -] -answ = 2 -radioq(choices, answ, keep_order=true) -``` - -###### Question - -Consider this expression: - -```math -\left(f(1) + f(2) + \cdots + f(n-1) + f(n)\right) - \left(f(1) + f(2) + \cdots + f(n-1)\right) = f(n). -``` - -Referring to the definitions of `D` and `S` in the example on operators, which relationship does this support: - -```julia; hold=true;echo=false -choices = [ -q"D(S(f))(n) = f(n)", -q"S(D(f))(n) = f(n) - f(0)" -] -answ = 1 -radioq(choices, answ, keep_order=true) -``` diff --git a/CwJ/precalc/trig_functions.jmd b/CwJ/precalc/trig_functions.jmd deleted file mode 100644 index 0b7b4a2..0000000 --- a/CwJ/precalc/trig_functions.jmd +++ /dev/null @@ -1,813 +0,0 @@ -# Trigonometric functions - - -This section uses the following add-on packages: - -```julia -using CalculusWithJulia -using Plots -using SymPy -``` - -```julia; echo=false; results="hidden" -using CalculusWithJulia.WeaveSupport - -fig_size = (800, 600) - -const frontmatter = ( - title = "Trigonometric functions", - description = "Calculus with Julia: Trigonometric functions", - tags = ["CalculusWithJulia", "precalc", "trigonometric functions"], -); - -nothing -``` - ----- - -We have informally used some of the trigonometric functions in examples -so far. In this section we quickly review their definitions and some -basic properties. 

The trigonometric functions are used to describe relationships between triangles and circles as well as oscillatory motions. With such a wide range of utility it is no wonder that they pop up in many places and their [origins](https://en.wikipedia.org/wiki/Trigonometric_functions#History) date to Hipparchus and Ptolemy over ``2000`` years ago.

## The 6 basic trigonometric functions

We measure angles in radians, where $360$ degrees is $2\pi$ radians. By proportions, $180$ degrees is $\pi$ radians, $90$ degrees is $\pi/2$ radians, $60$ degrees is $\pi/3$ radians, etc. In general, $x$ degrees is $2\pi \cdot x / 360$ radians (or, with cancellation, ``x \cdot \frac{\pi}{180}``).

For a right triangle with angles $\theta$, $\pi/2 - \theta$, and $\pi/2$ (``0 < \theta < \pi/2``), we call the side opposite $\theta$ the "opposite" side, the shorter adjacent side the "adjacent" side, and the longer adjacent side the hypotenuse.

```julia; hide=true; echo=false
p = plot(legend=false, xlim=(-1/4,5), ylim=(-1/2, 3),
         xticks=nothing, yticks=nothing, border=:none)
plot!([0,4,4,0],[0,0,3,0], linewidth=3)
del = .25
plot!([4-del, 4-del,4], [0, del, del], color=:black, linewidth=3)
annotate!([(.75, .25, "θ"), (4.0, 1.25, "opposite"), (2, -.25, "adjacent"), (1.5, 1.25, "hypotenuse")])
```

With these, the basic definitions for the primary trigonometric functions are

```math
\begin{align*}
\sin(\theta) &= \frac{\text{opposite}}{\text{hypotenuse}} &\quad(\text{the sine function})\\
\cos(\theta) &= \frac{\text{adjacent}}{\text{hypotenuse}} &\quad(\text{the cosine function})\\
\tan(\theta) &= \frac{\text{opposite}}{\text{adjacent}}. &\quad(\text{the tangent function})
\end{align*}
```

!!! note
    Many students remember these through [SOH-CAH-TOA](http://mathworld.wolfram.com/SOHCAHTOA.html).

Some algebra shows that $\tan(\theta) = \sin(\theta)/\cos(\theta)$. There are also ``3`` reciprocal functions: the cosecant, secant, and cotangent.
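These ratio definitions can be spot-checked numerically with a ``3``-``4``-``5`` right triangle (a throwaway illustration; the names are arbitrary):

```julia
# Spot-check SOH-CAH-TOA on a 3-4-5 right triangle
opp, adj, hyp = 3, 4, 5
theta = atan(opp / adj)   # the base angle, in radians, whose tangent is 3/4

sin(theta) ≈ opp / hyp, cos(theta) ≈ adj / hyp, tan(theta) ≈ opp / adj  # (true, true, true)
```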
- - -These definitions in terms of sides only apply for $0 \leq \theta \leq \pi/2$. More generally, if we relate any angle taken in the counter clockwise direction for the $x$-axis with a point $(x,y)$ on the *unit* circle, then we can extend these definitions - the point $(x,y)$ is also $(\cos(\theta), \sin(\theta))$. - -```julia; hold=true; echo=false; cache=true -## {{{radian_to_trig}}} - - -function plot_angle(m) - r = m*pi - - ts = range(0, stop=2pi, length=100) - tit = "$m ⋅ pi -> ($(round(cos(r), digits=2)), $(round(sin(r), digits=2)))" - p = plot(cos.(ts), sin.(ts), legend=false, aspect_ratio=:equal,title=tit) - plot!(p, [-1,1], [0,0], color=:gray30) - plot!(p, [0,0], [-1,1], color=:gray30) - - if r > 0 - ts = range(0, stop=r, length=100) - else - ts = range(r, stop=0, length=100) - end - - plot!(p, (1/2 .+ abs.(ts)/10pi).* cos.(ts), (1/2 .+ abs.(ts)/10pi) .* sin.(ts), color=:red, linewidth=3) - l = 1 #1/2 + abs(r)/10pi - plot!(p, [0,l*cos(r)], [0,l*sin(r)], color=:green, linewidth=4) - - scatter!(p, [cos(r)], [sin(r)], markersize=5) - annotate!(p, [(1/4+cos(r), sin(r), "(x,y)")]) - - p -end - - - -## different linear graphs -anim = @animate for m in -4//3:1//6:10//3 - plot_angle(m) -end - -imgfile = tempname() * ".gif" -gif(anim, imgfile, fps = 1) -caption = "An angle in radian measure corresponds to a point on the unit circle, whose coordinates define the sine and cosine of the angle. That is ``(x,y) = (\\cos(\\theta), \\sin(\\theta))``." - -ImageFile(imgfile, caption) -``` - - -### The trigonometric functions in Julia - -Julia has the ``6`` basic trigonometric functions defined through the functions `sin`, `cos`, `tan`, `csc`, `sec`, and `cot`. - -Two right triangles - the one with equal, $\pi/4$, angles; and the -one with angles $\pi/6$ and $\pi/3$ can have the ratio of their sides -computed from basic geometry. 
In particular, this leads to the following values, which are -usually committed to memory: - -```math -\begin{align*} -\sin(0) &= 0, \quad \sin(\pi/6) = \frac{1}{2}, \quad \sin(\pi/4) = \frac{\sqrt{2}}{2}, \quad\sin(\pi/3) = \frac{\sqrt{3}}{2},\text{ and } \sin(\pi/2) = 1\\ -\cos(0) &= 1, \quad \cos(\pi/6) = \frac{\sqrt{3}}{2}, \quad \cos(\pi/4) = \frac{\sqrt{2}}{2}, \quad\cos(\pi/3) = \frac{1}{2},\text{ and } \cos(\pi/2) = 0. -\end{align*} -``` - -Using the circle definition allows these basic values to inform us of -values throughout the unit circle. - - -These all follow from the definition involving the unit circle: - - -* If the angle $\theta$ corresponds to a point $(x,y)$ on the unit circle, then the angle $-\theta$ corresponds to $(x, -y)$. So $\sin(\theta) = - \sin(-\theta)$ (an odd function), but $\cos(\theta) = \cos(-\theta)$ (an even function). - -* If the angle $\theta$ corresponds to a point $(x,y)$ on the unit circle, then rotating by $\pi$ moves the points to $(-x, -y)$. So $\cos(\theta) = x = - \cos(\theta + \pi)$, and $\sin(\theta) = y = -\sin(\theta + \pi)$. - -* If the angle $\theta$ corresponds to a point $(x,y)$ on the unit circle, then rotating by $\pi/2$ moves the points to $(-y, x)$. So $\cos(\theta) = x = \sin(\theta + \pi/2)$. - - - - -The fact that $x^2 + y^2 = 1$ for the unit circle leads to the -"Pythagorean identity" for trigonometric functions: - -```math -\sin(\theta)^2 + \cos(\theta)^2 = 1. -``` - -This basic fact can be manipulated many ways. For example, dividing through by $\cos(\theta)^2$ gives the related identity: $\tan(\theta)^2 + 1 = \sec(\theta)^2$. - - -`Julia`'s functions can compute values for any angles, including these fundamental ones: - -```julia; -[cos(theta) for theta in [0, pi/6, pi/4, pi/3, pi/2]] -``` - -These are floating point approximations, as can be seen clearly in the last value. 
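The Pythagorean identity mentioned above can likewise be checked at these and other angles, up to floating-point round-off:

```julia
# sin(t)^2 + cos(t)^2 is 1 (to floating-point accuracy) at any angle
angles = [0, pi/6, pi/4, pi/3, pi/2, 1.0, 2.5, 100.0]
all(t -> sin(t)^2 + cos(t)^2 ≈ 1, angles)   # true
```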
Symbolic math can be used if exactness matters:

```julia;
cos.([0, PI/6, PI/4, PI/3, PI/2])
```

The `sincos` function computes both `sin` and `cos` simultaneously, which can be more performant when both values are needed.

```julia
sincos(pi/3)
```

!!! note
    For really large values, round-off error can play a big role. For example, the *exact* value of $\sin(1000000 \pi)$ is $0$, but the returned value is not quite $0$: `sin(1_000_000 * pi) = -2.231912181360871e-10`. For exact multiples of $\pi$ with large multiples, the `sinpi` and `cospi` functions are useful.

    (Both functions are computed by first employing periodicity to reduce the problem to a smaller angle. However, for large multiples the floating-point round-off becomes a problem with the usual functions.)

##### Example

Measuring the height of a [tree](https://lifehacker.com/5875184/is-there-an-easy-way-to-measure-the-height-of-a-tree) may be a real-world task for some, but a typical task for nearly all trigonometry students. How might it be done? If a right triangle can be formed where the angle and adjacent side length are known, then the opposite side (the height of the tree) can be solved for with the tangent function. For example, if, standing $100$ feet from the base of the tree, the tip makes a ``15`` degree angle, the height is given by:

```julia; hold=true;
theta = 15 * pi / 180
adjacent = 100
opposite = adjacent * tan(theta)
```

Having some means to compute an angle and then a tangent of that angle handy is not a given, so the linked-to article provides a few other methods taking advantage of similar triangles.

You can also measure distance with your [thumb](http://www.vendian.org/mncharity/dir3/bodyruler_angle/) or fist. How? The fist takes up about $10$ degrees of view when held straight out. So, pacing off backwards until the fist completely occludes the tree will give the distance of the adjacent side of a right triangle.
If that distance is $30$ paces, what is the height of the tree?
-Well, we need some facts. Suppose your pace is $3$ feet. Then the
-adjacent length is $90$ feet. The multiplier is the tangent of $10$
-degrees, or:
-
-```julia;
-tan(10 * pi/180)
-```
-
-which, for sake of memory, we will say is $1/6$ (about a $5$ percent error). So the answer is *roughly* $15$ feet:
-
-```julia;
-30 * 3 / 6
-```
-
-Similarly, you can use your thumb instead of your fist. With your fist you multiply the adjacent side by about $1/6$; with your thumb, by about $1/30$, as this approximates the tangent of $2$ degrees:
-
-```julia;
-1/30, tan(2*pi/180)
-```
-
-This could be reversed. If you know the height of something a distance
-away that is covered by your thumb or fist, then you would multiply
-that height by the appropriate amount to find your distance.
-
-
-### Basic properties
-
-The sine function is defined for all real $\theta$ and has a range of $[-1,1]$. As the corresponding point winds around the unit circle, the position of the $y$ coordinate begins to repeat itself. We say the sine function is *periodic* with period $2\pi$. A graph will illustrate:
-
-```julia;
-plot(sin, 0, 4pi)
-```
-
-
-The graph shows two periods. The wavy aspect of the graph is why this function is used to model periodic motions, such as the amount of sunlight in a day, or the alternating current powering a computer.
-
-From this graph - or considering when the $y$ coordinate is $0$ - we see that the sine function has zeros at any integer multiple of $\pi$, or $k\pi$, $k$ in $\dots,-2,-1, 0, 1, 2, \dots$.
-
-The cosine function is similar, in that it has the same domain and range, but is "out of phase" with the sine curve. A graph of both shows the two are related:
-
-```julia;
-plot(sin, 0, 4pi, label="sin")
-plot!(cos, 0, 4pi, label="cos")
-```
-
-The cosine function is just a shift of the sine function (or vice versa).
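That shift relation, $\cos(x) = \sin(x + \pi/2)$, is easy to spot-check numerically (a quick sketch):

```julia
# cos(x) and sin(x + pi/2) should agree for any x, up to floating point error
xs = range(0, 2pi, length=9)
maximum(abs(cos(x) - sin(x + pi/2)) for x in xs) < 1e-12
```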
We see that the zeros of the cosine function happen at points of the form $\pi/2 + k\pi$, $k$ in $\dots,-2,-1, 0, 1, 2, \dots.$
-
-The tangent function does not have all $\theta$ for its domain; rather, those points where division by $0$ occurs are excluded. These occur when the cosine is $0$, or, again, at $\pi/2 + k\pi$, $k$ in $\dots,-2,-1, 0, 1, 2, \dots.$ The range of the tangent function will be all real $y$.
-
-The tangent function is also periodic, but with period $\pi$, not
-$2\pi$. A graph will show this. Here we avoid the vertical
-asymptotes using `rangeclamp`:
-
-```julia;
-plot(rangeclamp(tan), -10, 10, label="tan")
-```
-
-
-##### Example sums of sines
-
-For the function ``f(x) = \sin(x)`` we have an understanding of the related family of functions defined by linear transformations:
-
-```math
-g(x) = a + b \sin((2\pi n)x)
-```
-
-That is, ``g`` is shifted up by ``a`` units, scaled vertically by a factor of ``b``, and has a period of ``1/n``. We see a simple plot here where we can verify the transformation:
-
-```julia
-g(x; b=1,n=1) = b*sin(2pi*n*x)
-g1(x) = 1 + g(x, b=2, n=3)
-plot(g1, 0, 1)
-```
-
-We can consider the sum of such functions, for example
-
-```julia
-g2(x) = 1 + g(x, b=2, n=3) + g(x, b=4, n=5)
-plot(g2, 0, 1)
-```
-
-Though still periodic, we can see with this simple example that sums of different sine functions can have somewhat complicated graphs.
-
-Sine functions can be viewed as the `x` position of a point traveling around a circle, so `g(x, b=2, n=3)` is the `x` position of a point traveling around a circle of radius ``2`` that completes a circuit in ``1/3`` units of time.
-
-The superposition of the two sine functions that `g2` represents could be viewed as the position of a point moving around a circle whose center is itself moving around another circle.
The following graphic, with ``b_1=1/3, n_1=3, b_2=1/4``, and ``n_2=4``, shows an example that produces the related cosine sum (moving right along the ``x`` axis), the sine sum (moving down along the ``y`` axis), *and* the trace of the position of the point generating these two plots.
-
-
-```julia; hold=true; echo=false; cache=true
-unzip(vs::Vector) = Tuple([[vs[i][j] for i in eachindex(vs)] for j in eachindex(vs[1])])
-function makegraph(t, b₁,n₁, b₂=0, n₂=1)
-
-    f₁ = x -> b₁*[sin(2pi*n₁*x), cos(2pi*n₁*x)]
-    f₂ = x -> b₂*[sin(2pi*n₂*x), cos(2pi*n₂*x)]
-    h = x -> f₁(x) + f₂(x)
-
-    ts = range(0, 2pi, length=1000)
-
-
-    ylims = (-b₁-b₂-2, b₁ + b₂)
-    xlims = (-b₁-b₂, b₁ + b₂ + 2)
-
-    p = plot(; xlim=xlims, ylim=ylims,
-             legend=false,
-             aspect_ratio=:equal)
-
-    α = 0.3
-    # circle 1
-    plot!(p, unzip(f₁.(range(0, 2pi/n₁, length=100))), alpha=α)
-    scatter!(p, unzip([f₁(t)]), markersize=1, alpha=α)
-
-    # circle 2
-    us, vs = unzip(f₂.(range(0, 2pi/n₂, length=100)))
-    a,b = f₁(t)
-    plot!(p, a .+ us, b .+ vs, alpha=α)
-    scatter!(p, unzip([h(t)]), markersize=5)
-
-    # graph of (x,y) over [0,t]
-    ts = range(0, t, length=200)
-    plot!(p, unzip(h.(ts)), linewidth=1, alpha=0.5, linestyle=:dash)
-
-    # graph of x over [0,t]
-    ys′ = -ts
-    xs′ = unzip(h.(ts))[1]
-    plot!(p, xs′, ys′, linewidth=2)
-
-    # graph of y over [0,t]
-    xs′ = ts
-    ys′ = unzip(h.(ts))[2]
-    plot!(p, xs′, ys′, linewidth=2)
-
-
-    p
-end
-
-# create animation
-b₁=1/3; n₁=3; b₂=1/4; n₂=4
-
-anim = @animate for t ∈ range(0, 2.5, length=50)
-    makegraph(t, b₁, n₁, b₂, n₂)
-end
-
-imgfile = tempname() * ".gif"
-gif(anim, imgfile, fps = 5)
-
-caption = "Superposition of sines and cosines represented by an epicycle"
-
-ImageFile(imgfile, caption)
-```
-
-
-
-
-
-
-As can be seen, even a somewhat simple combination can produce complicated graphs (a fact known to [Ptolemy](https://en.wikipedia.org/wiki/Deferent_and_epicycle)). How complicated can such a graph get?
This won't be answered here, but for fun enjoy this video produced by the same technique using more moving parts from the [`Javis.jl`](https://github.com/Wikunia/Javis.jl/blob/master/examples/fourier.jl) package:
-
-```julia; echo=false;
-txt ="""
-
-"""
-tpl = CalculusWithJulia.WeaveSupport.centered_content_tpl
-txt = CalculusWithJulia.WeaveSupport.Mustache.render(tpl, content=txt, caption="Julia logo animated")
-CalculusWithJulia.WeaveSupport.HTMLoutput(txt)
-```
-
-
-### Functions using degrees
-
-Trigonometric functions are functions of angles, which have two common descriptions: in terms of degrees or radians. Degrees are common when right triangles are considered; radians are much more common in general, as the relationship with arc length holds in that $r\theta = l$, where $r$ is the radius of a circle and $l$ the length of the arc formed by angle $\theta$.
-
-The two are related, as a circle has both $2\pi$ radians and ``360`` degrees. So converting from degrees to radians requires multiplying by $2\pi/360$, and converting from radians to degrees multiplying by $360/(2\pi)$. The `deg2rad` and `rad2deg` functions are available for this task.
-
-
-In `Julia`, the functions `sind`, `cosd`, `tand`, `cscd`, `secd`, and `cotd` are available to simplify the task of composing the two operations (that is, `sin(deg2rad(x))` is essentially the same as `sind(x)`).
-
-## The sum-and-difference formulas
-
-Consider the point on the unit circle $(x,y) = (\cos(\theta), \sin(\theta))$. In terms of $(x,y)$ (or $\theta$), is there a way to represent the angle found by rotating an additional $\theta$; that is, what is $(\cos(2\theta), \sin(2\theta))$?
-
-More generally, suppose we have two angles $\alpha$ and $\beta$. Can we
-represent the values of $(\cos(\alpha + \beta), \sin(\alpha + \beta))$
-using the values just involving $\beta$ and $\alpha$ separately?
-
-
-According to [Wikipedia](https://en.wikipedia.org/wiki/Trigonometric_functions#Identities) the following figure (from [mathalino.com](http://www.mathalino.com/reviewer/derivation-of-formulas/derivation-of-sum-and-difference-of-two-angles)) has ideas that date to Ptolemy:
-
-
-```julia; echo=false
-ImageFile(:precalc, "figures/summary-sum-and-difference-of-two-angles.jpg", "Relations between angles")
-```
-
-
-To read this, there are three triangles: the bigger (green with pink part) has hypotenuse $1$ (and adjacent and opposite sides that form the hypotenuses of the other two); the next biggest (yellow) hypotenuse $\cos(\beta)$, adjacent side (of angle $\alpha$) $\cos(\beta)\cdot \cos(\alpha)$, and opposite side $\cos(\beta)\cdot\sin(\alpha)$; and the smallest (pink) hypotenuse $\sin(\beta)$, adjacent side (of angle $\alpha$) $\sin(\beta)\cdot \cos(\alpha)$, and opposite side $\sin(\beta)\sin(\alpha)$.
-
-This figure shows the following sum formulas for sine and cosine:
-
-```math
-\begin{align*}
-\sin(\alpha + \beta) &= \sin(\alpha)\cos(\beta) + \cos(\alpha)\sin(\beta), & (\overline{CE} + \overline{DF})\\
-\cos(\alpha + \beta) &= \cos(\alpha)\cos(\beta) - \sin(\alpha)\sin(\beta). & (\overline{AC} - \overline{DE})
-\end{align*}
-```
-
-Using the fact that $\sin$ is an odd function and $\cos$ an even function, related formulas for the difference $\alpha - \beta$ can be derived.
-
-Taking $\alpha = \beta$ we immediately get the "double-angle" formulas:
-
-```math
-\begin{align*}
-\sin(2\alpha) &= 2\sin(\alpha)\cos(\alpha)\\
-\cos(2\alpha) &= \cos(\alpha)^2 - \sin(\alpha)^2.
-\end{align*}
-```
-
-The latter looks like the Pythagorean identity, but has a minus sign. In fact, the Pythagorean identity is often used to rewrite this, for example $\cos(2\alpha) = 2\cos(\alpha)^2 - 1$ or $1 - 2\sin(\alpha)^2$.
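The three equivalent forms of $\cos(2\alpha)$ can be spot-checked numerically at an arbitrary test angle (a quick sketch):

```julia
# Compare cos(2α) against its three algebraic rewrites at a test angle
α = 0.7
lhs = cos(2α)
forms = (cos(α)^2 - sin(α)^2, 2cos(α)^2 - 1, 1 - 2sin(α)^2)
all(abs(lhs - f) < 1e-12 for f in forms)
```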
- - -Applying the above with $\alpha = \beta/2$, we get that $\cos(\beta) = 2\cos(\beta/2)^2 -1$, which rearranged yields the "half-angle" formula: $\cos(\beta/2)^2 = (1 + \cos(\beta))/2$. - - -##### Example - -Consider the expressions $\cos((n+1)\theta)$ and $\cos((n-1)\theta)$. These can be re-expressed as: - -```math -\begin{align*} -\cos((n+1)\theta) &= \cos(n\theta + \theta) = \cos(n\theta) \cos(\theta) - \sin(n\theta)\sin(\theta), \text{ and}\\ -\cos((n-1)\theta) &= \cos(n\theta - \theta) = \cos(n\theta) \cos(-\theta) - \sin(n\theta)\sin(-\theta). -\end{align*} -``` - -But $\cos(-\theta) = \cos(\theta)$, whereas $\sin(-\theta) = -\sin(\theta)$. Using this, we add the two formulas above to get: - -```math -\cos((n+1)\theta) = 2\cos(n\theta) \cos(\theta) - \cos((n-1)\theta). -``` - -That is the angle for a multiple of $n+1$ can be expressed in terms of the angle with a multiple of $n$ and $n-1$. This can be used recursively to find expressions for $\cos(n\theta)$ in terms of polynomials in $\cos(\theta)$. - -## Inverse trigonometric functions - -The trigonometric functions are all periodic. In particular they are not monotonic over their entire domain. This means there is no *inverse* function applicable. However, by restricting the domain to where the functions are monotonic, inverse functions can be defined: - -* For $\sin(x)$, the restricted domain of $[-\pi/2, \pi/2]$ allows for the arcsine function to be defined. In `Julia` this is implemented with `asin`. - -* For $\cos(x)$, the restricted domain of $[0,\pi]$ allows for the arccosine function to be defined. In `Julia` this is implemented with `acos`. - -* For $\tan(x)$, the restricted domain of $(-\pi/2, \pi/2)$ allows for the arctangent function to be defined. In `Julia` this is implemented with `atan`. - - -For example, the arcsine function is defined for $-1 \leq x \leq 1$ and has a range of $-\pi/2$ to $\pi/2$: - -```julia; -plot(asin, -1, 1) -``` - - -The arctangent has domain of all real $x$. 
It has shape given by: - -```julia; -plot(atan, -10, 10) -``` - -The horizontal asymptotes are $y=\pi/2$ and $y=-\pi/2$. - - -### Implications of a restricted domain - -Notice that $\sin(\arcsin(x)) = x$ for any $x$ in $[-1,1]$, but, of course, not for all $x$, as the output of the sine function can't be arbitrarily large. - -However, $\arcsin(\sin(x))$ is defined for all $x$, but only equals -$x$ when $x$ is in $[-\pi/2, \pi/2]$. The output, or range, of the -$\arcsin$ function is restricted to that interval. - - -This can be limiting at times. A common case is to find the angle in $[0, 2\pi)$ corresponding to a point $(x,y)$. In the simplest case (the first and fourth quadrants) this is just given by $\arctan(y/x)$. But with some work, the correct angle can be found for any pair $(x,y)$. As this is a common desire, the `atan` function with two arguments, `atan(y,x)`, is available. This function returns a value in $(-\pi, \pi]$. - -For example, this will not give back $\theta$ without more work to identify the quadrant: - -```julia; -theta = 3pi/4 # 2.35619... -x,y = (cos(theta), sin(theta)) # -0.7071..., 0.7071... -atan(y/x) -``` - -But, - -```julia; -atan(y, x) -``` - - - - -##### Example - -A (white) light shining through a [prism](http://tinyurl.com/y8sczg4t) will be deflected depending on the material of the prism and the angles involved (refer to the link for a figure). The relationship can be analyzed by tracing a ray through the figure and utilizing Snell's law. If the prism has index of refraction $n$ then the ray will deflect by an amount $\delta$ that depends on the angle, $\alpha$ of the prism and the initial angle ($\theta_0$) according to: - -```math -\delta = \theta_0 - \alpha + \arcsin(n \sin(\alpha - \arcsin(\frac{1}{n}\sin(\theta_0)))). -``` - -If $n=1.5$ (glass), $\alpha = \pi/3$ and $\theta_0=\pi/6$, find the deflection (in radians). 
-
-We have:
-
-```julia; hold=true;
-n, alpha, theta0 = 1.5, pi/3, pi/6
-delta = theta0 - alpha + asin(n * sin(alpha - asin(sin(theta0)/n)))
-```
-
-For small $\theta_0$ and $\alpha$ the deviation is approximated by $(n-1)\alpha$. Compare this approximation to the actual value when $\theta_0 = \pi/10$ and $\alpha=\pi/15$.
-
-
-We have:
-
-```julia; hold=true;
-n, alpha, theta0 = 1.5, pi/15, pi/10
-delta = theta0 - alpha + asin(n * sin(alpha - asin(sin(theta0)/n)))
-delta, (n-1)*alpha
-```
-
-The approximation error is about ``2.7`` percent.
-
-
-##### Example
-
-The AMS has an interesting column on
-[rainbows](http://www.ams.org/publicoutreach/feature-column/fcarc-rainbows)
-the start of which uses some formulas from the previous example. Click through to
-see a ray of light passing through a spherical drop of water, as analyzed by Descartes. The
-deflection of the ray occurs when the incident light hits the drop of
-water, then there is an *internal* deflection of the light, and
-finally when the light leaves, there is another deflection. The total
-deflection (in radians) is $D = (i-r) + (\pi - 2r) + (i-r) = \pi + 2i - 4r$. However, the incident angle $i$ and the refracted angle $r$ are related by Snell's law: $\sin(i) = n \sin(r)$. The value $n$ is the index of refraction and is $4/3$ for water. (It was $3/2$ for glass in the previous example.) This gives
-
-```math
-D = \pi + 2i - 4 \arcsin(\frac{1}{n} \sin(i)).
-```
-
-Graphing this for incident angles between $0$ and $\pi/2$ we have:
-
-```julia; hold=true;
-n = 4/3
-D(i) = pi + 2i - 4 * asin(sin(i)/n)
-plot(D, 0, pi/2)
-```
-
-Descartes was interested in the minimum value of this graph, as it relates to where the light concentrates. This is roughly at $1$ radian or about $57$ degrees:
-
-```julia;
-rad2deg(1.0)
-```
-
-(Using calculus it can be seen to be $\arccos(((n^2-1)/3)^{1/2})$.)
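That closed-form minimizer can be compared against a brute-force search over the graph (a sketch; the definitions of `n` and `D` repeat those above):

```julia
# Compare the calculus-derived minimizer with a brute-force grid search
n = 4/3
D(i) = pi + 2i - 4 * asin(sin(i)/n)
istar = acos(sqrt((n^2 - 1)/3))          # ≈ 1.04 radians
is = range(0.01, pi/2 - 0.01, length=10_000)
ibrute = is[argmin(D.(is))]
abs(istar - ibrute) < 1e-3
```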
-
-##### Example: The Chebyshev Polynomials
-
-Consider again this equation derived with the sum-and-difference formula:
-
-
-```math
-\cos((n+1)\theta) = 2\cos(n\theta) \cos(\theta) - \cos((n-1)\theta).
-```
-
-Let $T_n(x) = \cos(n \arccos(x))$. Letting $\theta = \arccos(x)$ for $-1 \leq x \leq 1$ we get a relation between these functions:
-
-```math
-T_{n+1}(x) = 2x T_n(x) - T_{n-1}(x).
-```
-
-We can simplify a few: For example, when $n=0$ we see immediately that $T_0(x) = 1$, the constant function. Whereas with $n=1$ we get $T_1(x) = \cos(\arccos(x)) = x$. Things get more interesting as we get bigger $n$, for example using the equation above we get $T_2(x) = 2xT_1(x) - T_0(x) = 2x\cdot x - 1 = 2x^2 - 1$. Continuing, we'd get $T_3(x) = 2 x T_2(x) - T_1(x) = 2x(2x^2 - 1) - x = 4x^3 -3x$.
-
-A few things become clear from the above two representations:
-
-* Starting from $T_0(x) = 1$ and $T_1(x)=x$ and using the recursive definition of $T_{n+1}$ we get a family of polynomials where $T_n(x)$ is a degree $n$ polynomial. These are defined for all $x$, not just $-1 \leq x \leq 1$.
-
-* Using the initial definition, we see that the zeros of $T_n(x)$ all occur within $[-1,1]$ and happen when $n\arccos(x) = k\pi + \pi/2$, or $x=\cos((2k+1)/n \cdot \pi/2)$ for $k=0, 1, \dots, n-1$.
-
-Other properties of this polynomial family are not at all obvious. One is that amongst all polynomials of degree $n$ with roots in $[-1,1]$, $T_n(x)$ will be the smallest in magnitude (after we divide by the leading coefficient to make all polynomials considered to be monic). We check this for one case. Take $n=4$, then we have: $T_4(x) = 8x^4 - 8x^2 + 1$.
Compare this with $q(x) = (x+3/5)(x+1/5)(x-1/5)(x-3/5)$ (evenly spaced zeros): - -```julia; -T4(x) = (8x^4 - 8x^2 + 1) / 8 -q(x) = (x+3/5)*(x+1/5)*(x-1/5)*(x-3/5) -plot(abs ∘ T4, -1,1, label="|T₄|") -plot!(abs ∘ q, -1,1, label="|q|") -``` - - -## Hyperbolic trigonometric functions - -Related to the trigonometric functions are the hyperbolic -trigonometric functions. Instead of associating a point $(x,y)$ on the -unit circle with an angle $\theta$, we associate a point $(x,y)$ on -the unit *hyperbola* ($x^2 - y^2 = 1$). We define the hyperbolic -sine ($\sinh$) and hyperbolic cosine ($\cosh$) through $(\cosh(\theta), -\sinh(\theta)) = (x,y)$. - -```julia; echo=false -let - ## inspired by https://en.wikipedia.org/wiki/Hyperbolic_function - # y^2 = x^2 - 1 - top(x) = sqrt(x^2 - 1) - - p = plot(; legend=false, aspect_ratio=:equal) - - x₀ = 2 - xs = range(1, x₀, length=100) - ys = top.(xs) - plot!(p, xs, ys, color=:red) - plot!(p, xs, -ys, color=:red) - - xs = -reverse(xs) - ys = top.(xs) - plot!(p, xs, ys, color=:red) - plot!(p, xs, -ys, color=:red) - - xs = range(-x₀, x₀, length=3) - plot!(p, xs, xs, linestyle=:dash, color=:blue) - plot!(p, xs, -xs, linestyle=:dash, color=:blue) - - a = 1.2 - plot!(p, [0,cosh(a)], [sinh(a), sinh(a)]) - annotate!(p, [(sinh(a)/2, sinh(a)+0.25,"cosh(a)")]) - plot!(p, [cosh(a),cosh(a)], [sinh(a), 0]) - annotate!(p, [(sinh(a) + 1, cosh(a)/2,"sinh(a)")]) - scatter!(p, [cosh(a)], [sinh(a)], markersize=5) - - - ts = range(0, a, length=100) - xs′ = cosh.(ts) - ys′ = sinh.(ts) - - xs = [0, 1, xs′..., 0] - ys = [0, 0, ys′..., 0] - plot!(p, xs, ys, fillcolor=:red, fill=true, alpha=.3) - - p -end -``` - -These values are more commonly expressed using the exponential function as: - -```math -\begin{align*} -\sinh(x) &= \frac{e^x - e^{-x}}{2}\\ -\cosh(x) &= \frac{e^x + e^{-x}}{2}. -\end{align*} -``` - -The hyperbolic tangent is then the ratio of $\sinh$ and $\cosh$. As well, three inverse hyperbolic functions can be defined. 
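Mirroring the circular identity $\sin(\theta)^2 + \cos(\theta)^2 = 1$, points $(\cosh(a), \sinh(a))$ satisfy $x^2 - y^2 = 1$; this follows directly from the exponential formulas above and can be spot-checked numerically (a quick sketch):

```julia
# cosh(a)^2 - sinh(a)^2 == 1 for any a: the hyperbolic analogue of the Pythagorean identity
as = range(-3, 3, length=13)
maximum(abs(cosh(a)^2 - sinh(a)^2 - 1) for a in as) < 1e-12
```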
-
-The `Julia` functions to compute these values are named `sinh`, `cosh`, and `tanh`.
-
-
-## Questions
-
-###### Question
-
-What is bigger, $\sin(1.23456)$ or $\cos(6.54321)$?
-
-```julia; hold=true; echo=false
-a = sin(1.23456) > cos(6.54321)
-choices = [raw"``\sin(1.23456)``", raw"``\cos(6.54321)``"]
-answ = a ? 1 : 2
-radioq(choices, answ, keep_order=true)
-```
-
-###### Question
-
-Let $x=\pi/4$. What is bigger, $\cos(x)$ or $x$?
-
-```julia; hold=true; echo=false
-x = pi/4
-a = cos(x) > x
-choices = [raw"``\cos(x)``", "``x``"]
-answ = a ? 1 : 2
-radioq(choices, answ, keep_order=true)
-```
-
-###### Question
-
-The cosine function is a simple transformation of the sine function. Which one?
-
-```julia; hold=true; echo=false
-choices = [
-raw"``\cos(x) = \sin(x - \pi/2)``",
-raw"``\cos(x) = \sin(x + \pi/2)``",
-raw"``\cos(x) = \pi/2 \cdot \sin(x)``"]
-answ = 2
-radioq(choices, answ)
-```
-
-###### Question
-
-Graph the secant function. The vertical asymptotes are at?
-
-```julia; hold=true; echo=false
-choices = [
-L"The values $k\pi$ for $k$ in $\dots, -2, -1, 0, 1, 2, \dots$",
-L"The values $\pi/2 + k\pi$ for $k$ in $\dots, -2, -1, 0, 1, 2, \dots$",
-L"The values $2k\pi$ for $k$ in $\dots, -2, -1, 0, 1, 2, \dots$"]
-answ = 2
-radioq(choices, answ, keep_order=true)
-```
-
-###### Question
-
-A formula due to [Bhaskara I](http://tinyurl.com/k89ux5q) dates to around 650AD and gives a rational function approximation to the sine function. In degrees, we have
-
-```math
-\sin(x^\circ) \approx \frac{4x(180-x)}{40500 - x(180-x)}, \quad 0 \leq x \leq 180.
-```
-
-Plot both functions over $[0, 180]$. What is the maximum difference between the two to two decimal points? (You may need to plot the difference of the functions to read off an approximate answer.)
-
-```julia; hold=true; echo=false
-numericq(.0015, .01)
-```
-
-###### Question
-
-Solve the following equation for a value of $x$ using `acos`:
-
-```math
-\cos(x/3) = 1/3.
-```
-
-```julia; hold=true; echo=false
-val = 3*acos(1/3)
-numericq(val)
-```
-
-###### Question
-
-For any positive integer $n$ the equation $\cos(x) - nx = 0$ has a solution in $[0, \pi/2]$. Graphically estimate the value when $n=10$.
-
-```julia; hold=true; echo=false
-val = 0.1
-numericq(val)
-```
-
-###### Question
-
-The sine function is an *odd* function.
-
-* The hyperbolic sine is:
-
-```julia; hold=true; echo=false
-choices = ["odd", "even", "neither"]
-answ = 1
-radioq(choices, answ, keep_order=true)
-```
-
-* The hyperbolic cosine is:
-
-```julia; hold=true; echo=false
-choices = ["odd", "even", "neither"]
-answ = 2
-radioq(choices, answ, keep_order=true)
-```
-
-* The hyperbolic tangent is:
-
-```julia; hold=true; echo=false
-choices = ["odd", "even", "neither"]
-answ = 1
-radioq(choices, answ, keep_order=true)
-```
-
-###### Question
-
-The hyperbolic sine satisfies this formula:
-
-```math
-\sinh(\theta + \beta) = \sinh(\theta)\cosh(\beta) + \sinh(\beta)\cosh(\theta).
-```
-
-Is this identical to the pattern for the regular sine function?
-
-```julia; hold=true; echo=false
-yesnoq(true)
-```
-
-The hyperbolic cosine satisfies this formula:
-
-
-```math
-\cosh(\theta + \beta) = \cosh(\theta)\cosh(\beta) + \sinh(\beta)\sinh(\theta).
-```
-
-Is this identical to the pattern for the regular cosine function?
- -```julia; hold=true; echo=false -yesnoq(false) -``` diff --git a/CwJ/precalc/variables.jmd b/CwJ/precalc/variables.jmd deleted file mode 100644 index 6f3bae3..0000000 --- a/CwJ/precalc/variables.jmd +++ /dev/null @@ -1,490 +0,0 @@ -# Variables - -## Assignment - -```julia; echo=false; results="hidden" -using CalculusWithJulia -using CalculusWithJulia.WeaveSupport - -const frontmatter = ( - title = "Variables", - description = "Calculus with Julia: Variables", - tags = ["CalculusWithJulia", "precalc", "variables"], -); - -nothing -``` - -```julia; echo=false; -imgfile = "figures/calculator.png" -caption = "Screenshot of a calculator provided by the Google search engine." -ImageFile(:precalc, imgfile, caption) -``` - - -The Google calculator has a button `Ans` to refer to the answer to the -previous evaluation. This is a form of memory. The last answer is -stored in a specific place in memory for retrieval when `Ans` is -used. In some calculators, more advanced memory features are -possible. For some, it is possible to push values onto a stack of -values for them to be referred to at a later time. This proves useful -for complicated expressions, say, as the expression can be broken into -smaller intermediate steps to be computed. These values can then be -appropriately combined. This strategy is a good one, though the memory -buttons can make its implementation a bit cumbersome. - -With `Julia`, as with other programming languages, it is very easy to -refer to past evaluations. This is done by *assignment* whereby a -computed value stored in memory is associated with a name. The name -can be used to look up the value later. Assignment does not change the value of the object being assigned, it only introduces a reference to it. - -Assignment in `Julia` is handled by the equals sign and takes the general form `variable_name = value`. 
For example, here we assign values to the variables `x` and `y`:
-
-
-```julia;
-x = sqrt(2)
-y = 42
-```
-
-In an assignment, the value of the right hand side is returned, so from the
-display alone it may appear that nothing extra has happened. However, the values
-are there, as can be checked by typing their name:
-
-```julia;
-x
-```
-
-
-Just typing a variable name (without a trailing semicolon) causes the assigned value to be displayed.
-
-
-Variable names can be reused, as here, where we redefine `x`:
-
-```julia;hold=true
-x = 2
-```
-
-!!! note
-    The `Pluto` interface for `Julia` is idiosyncratic, as variables are *reactive*. This interface allows changes to a variable `x` to propagate to all other cells referring to `x`. Consequently, the variable name can only be assigned *once* per notebook **unless** the name is in some other namespace, which can be arranged by including the assignment inside a function or a `let` block.
-
-
-
-`Julia` is referred to as a "dynamic language," which means (in most
-cases) that a variable can be reassigned with a value of a different
-type, as we did with `x`, where first it was assigned a floating
-point value and then an integer value. (Though we meet some cases - generic functions - where `Julia`
-balks at reassigning a variable if the type is different.)
-
-
-
-
-More important than displaying a value is the use of variables to
-build up more complicated expressions.
For example, to compute
-
-```math
-\frac{1 + 2 \cdot 3^4}{5 - 6/7}
-```
-
-we might break it into the grouped pieces implied by the mathematical notation:
-
-```julia;
-top = 1 + 2*3^4
-bottom = 5 - 6/7
-top/bottom
-```
-
-### Examples
-
-##### Example
-
-Imagine we have the following complicated expression related to the trajectory of a [projectile](http://www.researchgate.net/publication/230963032_On_the_trajectories_of_projectiles_depicted_in_early_ballistic_woodcuts) with wind resistance:
-
-```math
- \left(\frac{g}{k v_0\cos(\theta)} + \tan(\theta) \right) t + \frac{g}{k^2}\ln\left(1 - \frac{k}{v_0\cos(\theta)} t \right)
-```
-
-Here $g$ is the gravitational constant $9.8$ and $v_0$, $\theta$, and
-$k$ are parameters, which we take to be $200$, $45$ degrees, and $1/2$
-respectively. With these values, the above expression can be computed
-when $t=100$:
-
-```julia;
-g = 9.8
-v0 = 200
-theta = 45
-k = 1/2
-t = 100
-a = v0 * cosd(theta)
-(g/(k*a) + tand(theta))* t + (g/k^2) * log(1 - (k/a)*t)
-```
-
-By defining a new variable `a` to represent a value that is repeated a few times in the expression, the last command is greatly simplified. Doing so makes it much easier to check for accuracy against the expression to compute.
-
-##### Example
-
-A common expression in mathematics is a polynomial expression, for example $-16s^2 + 32s - 12$. Translating this to `Julia` at $s =3$ we might have:
-
-```julia;
-s = 3
--16*s^2 + 32*s - 12
-```
-
-This looks nearly identical to the mathematical expression, but we inserted `*` to indicate multiplication between the constant and the variable. In fact, this step is not needed, as `Julia` allows numeric literals to have an implied multiplication:
-
-```julia;
--16s^2 + 32s - 12
-```
-
-
-
-
-## Where math and computer notations diverge
-
-It is important to recognize that `=` to `Julia` is not in analogy to
-how $=$ is used in mathematical notation.
The following `Julia` code -is not an equation: - -```julia;hold=true -x = 3 -x = x^2 -``` - -What happens instead? The right hand side is evaluated (`x` is -squared), the result is stored and bound to the variable `x` (so that -`x` will end up pointing to the new value, `9`, and not the original one, `3`); finally the value computed on the -right-hand side is returned and in this case displayed, as there is no -trailing semicolon to suppress the output. - -This is completely unlike the mathematical equation $x = x^2$ which is -typically solved for values of $x$ that satisfy the equation ($0$ and -$1$). - -##### Example - -Having `=` as assignment is usefully exploited when modeling -sequences. For example, an application of Newton's method might end up -with this expression: - -```math -x_{i+1} = x_i - \frac{x_i^2 - 2}{2x_i} -``` - -As a mathematical expression, for each $i$ this defines a new value -for $x_{i+1}$ in terms of a known value $x_i$. This can be used to -recursively generate a sequence, provided some starting point is -known, such as $x_0 = 2$. - -The above might be written instead with: - -```julia;hold=true -x = 2 -x = x - (x^2 - 2) / (2x) -x = x - (x^2 - 2) / (2x) -``` - -Repeating this last line will generate new values of `x` based on the -previous one - no need for subscripts. This is exactly what the -mathematical notation indicates is to be done. - -## Context - -The binding of a value to a variable name happens within some -context. For our simple illustrations, we are assigning values, as -though they were typed at the command line. This stores the binding in -the `Main` module. `Julia` looks for variables in this module when it -encounters an expression and the value is substituted. Other uses, such as when variables are defined within a function, involve different contexts which may not be -visible within the `Main` module. - -!!! note - The `varinfo` function will list the variables currently defined in the - main workspace. 
There is no mechanism to delete a single variable.
-
-!!! warning
-    **Shooting oneself in the foot.** `Julia` allows us to
-    locally redefine variables that are built in, such as the value
-    for `pi` or the function object assigned to `sin`. For example,
-    this is a perfectly valid command `sin=3`. However, it will
-    overwrite the typical value of `sin` so that `sin(3)` will be an
-    error. At the terminal, the binding to `sin` occurs in the `Main`
-    module. This shadows the value of `sin` bound in the `Base`
-    module. Even if redefined in `Main`, the value in `Base` can be used
-    by fully qualifying the name, as in `Base.sin(pi)`. This uses the
-    notation `module_name.variable_name` to look up a binding in a
-    module.
-
-## Variable names
-
-`Julia` has a very wide set of possible
-[names](https://docs.julialang.org/en/stable/manual/variables/#Allowed-Variable-Names-1)
-for variables. Variables are case sensitive and their names can
-include many
-[Unicode](http://en.wikipedia.org/wiki/List_of_Unicode_characters)
-characters. Names must begin with a letter or an appropriate Unicode
-value (but not a number). There are some reserved words, such as `try`
-or `else`, which cannot be assigned to. However, many built-in names
-can be locally overwritten. Conventionally, variable names are lower
-case. For compound names, it is not unusual to see them squished
-together, joined with underscores, or written in camelCase.
-
-```julia;
-value_1 = 1
-a_long_winded_variable_name = 2
-sinOfX = sind(45)
-__private = 2 # a convention
-```
-
-### Unicode names
-
-`Julia` allows variable names to use Unicode identifiers. Such
-names allow `Julia` notation to mirror that of many
-mathematical texts. For example, in calculus the variable $\epsilon$
-is often used to represent some small number. We can assign to a
-symbol that looks like $\epsilon$ using `Julia`'s LaTeX input
-mode. Typing `\epsilon[tab]` will replace the text with the symbol
-within `IJulia` or the command line.
-
-```julia;
-ϵ = 1e-10
-```
-
-
-Entering Unicode names follows the pattern of
-"slash" + LaTeX name + `[tab]` key. Some other ones that are useful
-are `\delta[tab]`, `\alpha[tab]`, and `\beta[tab]`, though there are
-[hundreds](https://github.com/JuliaLang/julia/blob/master/stdlib/REPL/src/latex_symbols.jl)
-of other values defined.
-
-
-For example, we could have defined `theta` (`\theta[tab]`) and `v0` (`v\_0[tab]`) using Unicode to make them match more closely the typeset math:
-
-```julia;
-θ = 45; v₀ = 200
-```
-
-!!! note "Unicode"
-    These notes can be presented as HTML files *or* as `Pluto` notebooks. They often use Unicode alternatives to avoid the `Pluto` requirement of a single use of assigning to a variable name in a notebook without placing the assignment in a `let` block or a function body.
-
-
-!!! note "Emojis"
-    There is even support for tab-completion of [emojis](https://github.com/JuliaLang/julia/blob/master/stdlib/REPL/src/emoji_symbols.jl) such as `\\:snowman:[tab]` or `\\:koala:[tab]`.
-
-##### Example
-
-As mentioned, the value of $e$ is bound to the Unicode value
-`\euler[tab]` and not the letter `e`, so Unicode entry is required to
-access this constant. This isn't quite true. The `MathConstants` module
-defines `e`, as well as a few other values accessed via Unicode. When
-the `CalculusWithJulia` package is loaded, as will often be done in
-these notes, a value of `exp(1)` is assigned to `e`.
-
-
-
-## Tuple assignment
-
-It is a common task to define more than one variable. Multiple definitions can be done in one line, using semicolons to break up the commands, as with:
-
-```julia;hold=true
-a = 1; b = 2; c=3
-```
-
-
-
-For convenience, `Julia` allows an alternate means to define more than one variable at a time. The syntax is similar:
-
-```julia;hold=true
-a, b, c = 1, 2, 3
-b
-```
-
-This sets `a=1`, `b=2`, and `c=3`, as suggested. This construct relies
-on *tuple destructuring*.
The expression on the right hand side forms
a tuple of values. A tuple is a container for different types of
values, and in this case the tuple has 3 values. When the number of
variables on the left-hand side matches the number of values in the
container on the right, the names are assigned one by one.

The value on the right hand side is evaluated, then the assignment occurs. The following exploits this to swap the values assigned to `a` and `b`:

```julia;hold=true
a, b = 1, 2
a, b = b, a
```

#### Example, finding the slope

Find the slope of the line connecting the points $(1,2)$ and $(4,6)$. We begin by defining the values and then applying the slope formula:

```julia;
x0, y0 = 1, 2
x1, y1 = 4, 6
m = (y1 - y0) / (x1 - x0)
```

Of course, this could be computed directly with `(6-2) / (4-1)`, but by using familiar names for the values we can be certain we apply the formula properly.

## Questions

###### Question

Let $a=10$, $b=2.3$, and $c=8$. Find the value of $(a-b)/(a-c)$.

```julia; hold=true; echo=false;
a,b,c = 10, 2.3, 8;
numericq((a-b)/(a-c))
```

###### Question

Let `x = 4`. Compute $y=100 - 2x - x^2$. What is the value:

```julia; hold=true; echo=false;
x = 4
y = 100 - 2x - x^2
numericq(y, 0.1)
```

###### Question

What is the answer to this computation?

```julia; eval=false
a = 3.2; b = 2.3
a^b - b^a
```

```julia; hold=true; echo=false;
a = 3.2; b = 2.3;
val = a^b - b^a;
numericq(val)
```

###### Question

For longer computations, it can be convenient to do them in parts, as
this makes it easier to check for mistakes.

For example, to compute

```math
\frac{p - q}{\sqrt{p(1-p)}}
```

for $p=0.25$ and $q=0.2$ we might do:

```julia; eval=false
p, q = 0.25, 0.2
top = p - q
bottom = sqrt(p*(1-p))
ans = top/bottom
```

What is the result of the above?
```julia; hold=true; echo=false;
p, q = 0.25, 0.2;
top = p - q;
bottom = sqrt(p*(1-p));
answ = top/bottom;
numericq(answ)
```

###### Question

Using variables to record the top and the bottom of the expression, compute the following for $x=3$:

```math
y = \frac{x^2 - 2x - 8}{x^2 - 9x - 20}.
```

```julia; hold=true; echo=false;
x = 3
val = (x^2 - 2x - 8)/(x^2 - 9x - 20)
numericq(val)
```

###### Question

Which of these is not a valid variable name (identifier) in `Julia`:

```julia; hold=true; echo=false;
choices = [
q"5degreesbelowzero",
q"some_really_long_name_that_is_no_fun_to_type",
q"aMiXeDcAsEnAmE",
q"fahrenheit451"
]
answ = 1
radioq(choices, answ)
```

###### Question

Which of these symbols is one of `Julia`'s built-in math constants?

```julia; hold=true; echo=false;
choices = [q"pi", q"oo", q"E", q"I"]
answ = 1
radioq(choices, answ)
```

###### Question

What key sequence will produce this assignment?

```julia; eval=false
δ = 1/10
```

```julia; hold=true; echo=false;
choices=[
q"\delta[tab] = 1/10",
q"delta[tab] = 1/10",
q"$\\delta$ = 1/10"]
answ = 1
radioq(choices, answ)
```

###### Question

Which of these three statements will **not** be a valid way to assign three variables at once:

```julia; hold=true; echo=false;
choices = [
q"a=1, b=2, c=3",
q"a,b,c = 1,2,3",
q"a=1; b=2; c=3"]
answ = 1
radioq(choices, answ)
```

###### Question

The fact that assignment *always* returns the value of the right hand side *and* the fact that the `=` sign associates from right to left means that the following idiom:

```julia; eval=false
x = y = z = 3
```

will always:

```julia; hold=true; echo=false;
choices = ["Assign all three variables at once to a value of `3`",
"Create ``3`` linked values that will stay synced when any value changes",
"Throw an error"
]
answ = 1
radioq(choices, answ)
```

diff --git a/CwJ/precalc/vectors.jmd b/CwJ/precalc/vectors.jmd
deleted file mode 100644
index 9e8fea0..0000000
--- a/CwJ/precalc/vectors.jmd
+++ /dev/null
@@ -1,986 +0,0 @@
# Vectors

```julia; echo=false; results="hidden"
using CalculusWithJulia
using CalculusWithJulia.WeaveSupport
using Plots
using Measures
using LaTeXStrings

#fig_size = (400, 300)
fig_size = (800, 600)

const frontmatter = (
    title = "Vectors",
    description = "Calculus with Julia: Vectors",
    tags = ["CalculusWithJulia", "precalc", "vectors"],
);

nothing
```

Among the first models learned in physics are the equations governing the laws of motion with constant acceleration: $x(t) = x_0 + v_0 t + 1/2 \cdot a t^2$. This is a consequence of Newton's second [law](http://tinyurl.com/8ylk29t) of motion applied to the constant acceleration case. A related formula for the velocity is $v(t) = v_0 + at$. The following figure is produced using these formulas applied to both the vertical position and the horizontal position:

```julia; hold=true; echo=false
px = 0.26mm
x0 = [0, 64]
v0 = [20, 0]
g = [0, -32]

unit(v::Vector) = v / norm(v)
x_ticks = collect(0:10:80)
y_ticks = collect(0:10:80)

function make_plot(t)

    xn = (t) -> x0 + v0*t + 1/2*g*t^2
    vn = (t) -> v0 + g*t
    an = (t) -> g

    t = 1/10 + t*2/10

    ts = range(0, stop=2, length=100)
    xys = map(xn, ts)

    xs, ys = [p[1] for p in xys], [p[2] for p in xys]

    plt = Plots.plot(xs, ys, legend=false, size=fig_size, xlims=(0,45), ylims=(0,70))
    plot!(plt, zero, extrema(xs)...)
    arrow!(xn(t), 10*unit(xn(t)), color="black")
    arrow!(xn(t), 10*unit(vn(t)), color="red")
    arrow!(xn(t), 10*unit(an(t)), color="green")

    plt

end

imgfile = tempname() * ".gif"
caption = """

Position, velocity, and acceleration vectors (scaled) for projectile
motion. Vectors are drawn with tail on the projectile.
The position
vector (black) points from the origin to the projectile, the velocity
vector (red) is in the direction of the trajectory, and the
acceleration vector (green) is a constant pointing downward.

"""

n = 8
anim = @animate for i=1:n
    make_plot(i)
end

gif(anim, imgfile, fps = 1)

ImageFile(imgfile, caption)
```

For the motion in the above figure, the object's $x$ and $y$ values change according to the same rule, but, as the acceleration is different in each direction, we get different formulas, namely: $x(t) = x_0 + v_{0x} t$ and $y(t) = y_0 + v_{0y}t - 1/2 \cdot gt^2$.

It is common to work with *both* formulas at once. Mathematically,
when graphing, we naturally pair off two values using Cartesian
coordinates (e.g., $(x,y)$). Another means of combining related values
is to use a *vector*. The notation for a vector varies, but to
distinguish them from a point we will use $\langle x,~ y\rangle$.
With this notation, we can represent the position,
the velocity, and the acceleration at time $t$ through:

```math
\begin{align}
\vec{x} &= \langle x_0 + v_{0x}t,~ -(1/2) g t^2 + v_{0y}t + y_0 \rangle,\\
\vec{v} &= \langle v_{0x},~ -gt + v_{0y} \rangle, \text{ and }\\
\vec{a} &= \langle 0,~ -g \rangle.
\end{align}
```

Don't spend time thinking about the formulas if they are
unfamiliar. The point emphasized here is that we have used the
notation $\langle x,~ y \rangle$ to collect the two values into a
single object, which we indicate through a label on the variable
name. These are vectors, and we shall see they find use far beyond
this application.

Initially, our primary use of vectors will be as containers, but it is
worthwhile to spend some time to discuss properties of vectors and
their visualization.

A line segment in the plane connects two points $(x_0, y_0)$ and
$(x_1, y_1)$.
The length of a line segment (its magnitude) is given by
the distance formula $\sqrt{(x_1 - x_0)^2 + (y_1 - y_0)^2}$. A line
segment can be given a direction by assigning an initial point and a
terminal point. A directed line segment has both a direction and a
magnitude. A vector is an abstraction where just these two properties
``-`` a **direction** and a **magnitude** ``-`` are intrinsic. While a directed line
segment can be represented by a vector, a single vector describes all
such line segments found by translation. That is, where the vector
is placed when visualized is a matter of convenience; it is not a
characteristic of the vector. In the figure above, all vectors are
drawn with their tails at the position of the projectile over time.

We can visualize a (two-dimensional) vector as an arrow in space. This
arrow has two components. We represent a vector
mathematically as $\langle x,~ y \rangle$. For example, the vector
connecting the point $(x_0, y_0)$ to $(x_1, y_1)$ is $\langle x_1 -
x_0,~ y_1 - y_0 \rangle$.

The magnitude of a vector comes from the distance formula applied to a line segment, and is
$\| \vec{v} \| = \sqrt{x^2 + y^2}$.

```julia; hold=true;echo=false;
## generic vector
p0 = [0,0]
a1 = [4,1]
b1 = [-2,2]
unit(v::Vector) = v / norm(v)

plt = plot(legend=false, size=fig_size)
arrow!(p0, a1, color="blue")
arrow!([1,1], unit(a1), color="red")
annotate!([(2, .4, L"v"), (1.6, 1.05, L"\hat{v}")])

imgfile = tempname() * ".png"
png(plt, imgfile)

caption = "A vector and its unit vector. They share the same direction, but the unit vector has a standardized magnitude."

ImageFile(imgfile, caption)
```

We call the values $x$ and $y$ of the vector $\vec{v} = \langle x,~ y
\rangle$ the components of $\vec{v}$.

Two operations on vectors are fundamental.

* Vectors can be multiplied by a scalar (a real number): $c\vec{v} =
  \langle cx,~ cy \rangle$.
Geometrically, this scales the vector by a
  factor of $\lvert c \rvert$ and switches the direction of the vector
  by ``180`` degrees (in the ``2``-dimensional case) when $c < 0$. A *unit
  vector* is one with magnitude ``1``, and, except for the $\vec{0}$
  vector, can be formed from $\vec{v}$ by dividing $\vec{v}$ by its
  magnitude. A vector's two parts are summarized by its direction
  given by a unit vector **and** its magnitude given by the norm.

* Vectors can be added: $\vec{v} + \vec{w} = \langle v_x + w_x,~ v_y + w_y
  \rangle$. That is, each corresponding component adds to form a new
  vector; similarly for subtraction. The $\vec{0}$ vector then would be just
  $\langle 0,~ 0 \rangle$ and would satisfy $\vec{0} + \vec{v} = \vec{v}$ for any vector
  $\vec{v}$. Vector addition, $\vec{v} + \vec{w}$, is visualized by placing the tail
  of $\vec{w}$ at the tip of $\vec{v}$ and then considering the new vector with
  tail coming from $\vec{v}$ and tip coming from the position of the tip of
  $\vec{w}$. Subtraction is different: place both the tails of $\vec{v}$ and $\vec{w}$
  at the same point; the new vector $\vec{v} - \vec{w}$ then has its tail at the
  tip of $\vec{w}$ and its tip at the tip of $\vec{v}$.
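These componentwise rules can be checked with a quick numeric sketch. (This anticipates `Julia`'s vector syntax, which is introduced properly in the "Vectors in Julia" section below; the particular numbers are chosen only for illustration.)

```julia
v = [1, 2]     # the vector ⟨1, 2⟩
w = [3, -1]    # the vector ⟨3, -1⟩
2v             # scalar multiplication, componentwise: [2, 4]
v + w          # addition, componentwise: [4, 1]
v - w          # subtraction, componentwise: [-2, 3]
```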
```julia;hold=true; echo=false
## vector_addition_image

p0 = [0,0]
a1 = [4,1]
b1 = [-2,2]

plt = Plots.plot(legend=false, size=fig_size)
arrow!(p0, a1, color="blue")
arrow!(p0+a1, b1, color="red")
arrow!(p0, a1+b1, color="black")
annotate!([(2, .25, L"a"), (3, 2.25, L"b"), (1.35, 1.5, L"a+b")])

imgfile = tempname() * ".png"
png(plt, imgfile)

caption = "The sum of two vectors can be visualized by placing the tail of one at the tip of the other"

ImageFile(imgfile, caption)
```

```julia; hold=true; echo=false;
## vector_subtraction_image

p0 = [0,0]
a1 = [4,1]
b1 = [-2,2]

plt = plot(legend=false, size=fig_size)
arrow!(p0, a1, color="blue")
arrow!(p0, b1, color="red")
arrow!(b1, a1-b1, color="black")
annotate!(plt, [(-1, .5, L"a"), (2.45, .5, L"b"), (1, 1.75, L"a-b")])

imgfile = tempname() * ".png"
png(plt, imgfile)

caption = "The difference of two vectors can be visualized by placing both tails at the same point; the difference then points from the tip of the second vector to the tip of the first"

ImageFile(imgfile, caption)
```

The concepts of scalar multiplication and vector addition allow the
decomposition of vectors into standard vectors. The standard unit
vectors in two dimensions are $e_x = \langle 1,~ 0 \rangle$ and $e_y =
\langle 0,~ 1 \rangle$. Any two-dimensional vector can be written
uniquely as $a e_x + b e_y$ for some pair of scalars $a$ and $b$ (or, as $\langle a, b \rangle$). This
is true more generally, where the two vectors need not be the standard unit
vectors ``-`` they can be *any* two non-parallel vectors.
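Finding the scalars in such a decomposition amounts to solving a ``2\times 2`` system of linear equations. A small sketch, with hypothetical vectors chosen so the answer is easy to check (the `\` operator, met later in the course, solves the linear system):

```julia
a, b, c = [1, 2], [2, 1], [4, 3]   # hypothetical example vectors
M = [a b]          # 2×2 matrix with columns a and b
α, β = M \ c       # solve α*a + β*b = c; here α = 2/3, β = 5/3
α*a + β*b          # recovers [4.0, 3.0], up to floating point error
```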
- - - -```julia; hold=true; echo=false -### {{{vector_decomp}}} - -p0 = [0,0] -aa = [1,2] -bb = [2,1] -cc = [4,3] -alpha = 2/3 -beta = 5/3 - -plt = plot(legend=false, size=fig_size) -arrow!(p0, cc, color="black", width=1) -arrow!(p0, aa, color="black", width=1) -arrow!(alpha*aa, bb, color="black", width=1) -arrow!(p0, alpha*aa, color="orange", width=4, opacity=0.5) -arrow!(alpha*aa, beta*bb, color="orange", width=4, opacity=0.5) -#annotate!(collect(zip([2, .5, 1.75], [1.25,1.0,2.25], [L"c",L"2/3 \cdot a", L"5/3 \cdot b"]))) - - -imgfile = tempname() * ".png" -png(plt, imgfile) - -caption = raw""" - -The vector ``\langle 4,3 \rangle`` is written as -``2/3 \cdot\langle 1,2 \rangle + 5/3 \cdot\langle 2,1 \rangle``. -Any vector ``\vec{c}`` -can be written uniquely as -``\alpha\cdot\vec{a} + \beta \cdot \vec{b}`` -provided ``\vec{a}`` and ``\vec{b}`` are not parallel. - -""" - -ImageFile(imgfile, caption) -``` - - -The two operations of scalar multiplication and vector addition are -defined in a component-by-component basis. We will see that there are -many other circumstances where performing the same action on each -component in a vector is desirable. - ----- - -When a vector is placed with its tail at the origin, it can be -described in terms of the angle it makes with the $x$ axis, $\theta$, -and its length, $r$. The following formulas apply: - -```math -r = \sqrt{x^2 + y^2}, \quad \tan(\theta) = y/x. -``` - -If we are given $r$ and $\theta$, then the vector is $v = \langle r \cdot \cos(\theta),~ r \cdot \sin(\theta) \rangle$. 
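A quick numeric check of these conversion formulas, again anticipating the `Julia` syntax of the next section (the two-argument `atan(y, x)` returns the angle of the point ``(x, y)``):

```julia
x, y = 2, 3
r = sqrt(x^2 + y^2)    # the magnitude
θ = atan(y, x)         # the angle made with the positive x axis
r*cos(θ), r*sin(θ)     # recovers (2.0, 3.0), up to floating point error
```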
- -```julia; hold=true; echo=false -## vector_rtheta -p0 = [0,0] - -plt = plot(legend=false, size=fig_size) -arrow!(p0, [2,3], color="black") -arrow!(p0, [2,0], color="orange") -arrow!(p0+[2,0], [0,3], color="orange") -annotate!(plt, collect(zip([.25, 1,1,1.75], [.25, 1.85,.25,1], [L"t",L"r", L"r \cdot \cos(t)", L"r \cdot \sin(t)"]))) #["θ","r", "r ⋅ cos(θ)", "r ⋅ sin(θ)"] - -imgfile = tempname() * ".png" -png(plt, imgfile) - -caption = raw""" - -A vector ``\langle x, y \rangle`` can be written as ``\langle r\cdot -\cos(\theta), r\cdot\sin(\theta) \rangle`` for values ``r`` and -``\theta``. The value ``r`` is a magnitude, the direction parameterized by -``\theta``.""" - -ImageFile(imgfile, caption) -``` - - -## Vectors in Julia - -A vector in `Julia` can be represented by its individual components, -but it is more convenient to combine them into a collection using the -`[,]` notation: - -```julia; -x, y = 1, 2 -v = [x, y] # square brackets, not angles -``` - -The basic vector operations are implemented for vector objects. -For example, the vector `v` has scalar multiplication defined for it: - -```julia; -10 * v -``` - -The `norm` function returns the magnitude of the vector (by default): - -```julia; -import LinearAlgebra: norm -``` - -```julia; -norm(v) -``` - -A unit vector is then found by scaling by the reciprocal of the magnitude: - -```julia; -v / norm(v) -``` - - - -In addition, if `w` is another vector, we can add and subtract: - -```julia; -w = [3, 2] -v + w, v - 2w -``` - -We see above that scalar multiplication, addition, and subtraction can -be done without new notation. This is because the usual operators have -methods defined for vectors. - -Finally, to find an angle $\theta$ from a vector $\langle x,~ y\rangle$, we can employ the `atan` function using two arguments: - -```julia; -norm(v), atan(y, x) # v = [x, y] -``` - -## Higher dimensional vectors - -Mathematically, vectors can be generalized to more than ``2`` -dimensions. 
For example, ``3``-dimensional vectors are common when
modeling events happening in space, and ``4``-dimensional vectors are
common when modeling space and time.

In `Julia` there are many uses for vectors outside of physics
applications. A vector in `Julia` is just a one-dimensional collection
of similarly typed values and a special case of an array. Such objects
find widespread usage. For example:

- In plotting graphs with `Julia`, vectors are used to hold the $x$ and $y$ coordinates of a collection of points to plot and connect with straight lines. There can be hundreds of such points in a plot.

- Vectors are a natural container to hold the roots of a polynomial or zeros of a function.

- Vectors may be used to record the state of an iterative process.

- Vectors are naturally used to represent a data set, such as arise when collecting survey data.

Creating higher-dimensional vectors is similar to creating a
two-dimensional vector; we just include more components:

```julia;
fibs = [1, 1, 2, 3, 5, 8, 13]
```

Later we will discuss different ways to modify the values of a vector
to create new ones, similar to how scalar multiplication does.

As mentioned, vectors in `Julia` are composed of elements of a similar type, but the
type is not limited to numeric values. For example, a vector of
strings might be useful for text processing, a vector of Boolean
values can naturally arise, and some applications are even naturally
represented in terms of vectors of vectors (such as happens when plotting a collection of points). Look at the output of these two vectors:

```julia;
["one", "two", "three"] # Array{T, 1} is shorthand for Vector{T}. Here T - the type - is String
```

```julia;
[true, false, true] # vector of Bool values
```

Finally, we mention that if `Julia` has values of different types, it
will promote them to a common type if possible.
Here we combine three -types of numbers, and see that each is promoted to `Float64`: - -```julia; -[1, 2.0, 3//1] -``` - - -Whereas, in this example where there is no common type to promote the -values to, a catch-all type of `Any` is used to hold the components. - -```julia; -["one", 2, 3.0, 4//1] -``` - - - -## Indexing - -Getting the components out of a vector can be done in a manner similar to multiple assignment: - -```julia; -vs = [1, 2] -v₁, v₂ = vs -``` - -When the same number of variable names are on the left hand side of -the assignment as in the container on the right, each is assigned in -order. - -Though this is convenient for small vectors, it is far from being so -if the vector has a large number of components. However, the vector is -stored in order with a first, second, third, $\dots$ -component. `Julia` allows these values to be referred to by -*index*. This too uses the `[]` notation, though differently. Here is -how we get the second component of `vs`: - -```julia; -vs[2] -``` - -The last value of a vector is usually denoted by $v_n$. In `Julia`, -the `length` function will return $n$, the number of items in the -container. So `v[length(v)]` will refer to the last -component. However, the special keyword `end` will do so as well, when -put into the context of indexing. So `v[end]` is more idiomatic. (Similarly, there is a `begin` keyword that is useful when the vector is not ``1``-based, as is typical but not mandatory.) - - -!!! note "More on indexing" - There is [much more](http://julia.readthedocs.org/en/latest/manual/arrays/#indexing) to indexing than just indexing by a single integer value. For example, the following can be used for indexing: - * a scalar integer (as seen) - * a range - * a vector of integers - * a boolean vector - Some add-on packages extend this further. - -### Assignment and indexing - -Indexing notation can also be used with assignment, meaning it can appear on the left hand side of an equals sign. 
The following
expression replaces the second component with a new value:

```julia;
vs[2] = 10
```

The value of the right hand side is returned, not the value of `vs`. We can check
that `vs` is then $\langle 1,~ 10 \rangle$ by showing it:

```julia;hold=true
vs = [1,2]
vs[2] = 10
vs
```

The assignment `vs[2] = 10` is different from the initial assignment
`vs = [1,2]` in that `vs[2] = 10` **modifies** the container that `vs` is bound
to, whereas `vs = [1,2]` **replaces** the binding for `vs`. The indexed
assignment is then more memory efficient when vectors are large. This
point is also of interest when passing vectors to functions, as a
function may modify components of the vector passed to it, though it
can't replace the container itself.

## Some useful functions for working with vectors.

As mentioned, the `length` function returns the number of components
in a vector. It is one of several useful functions for vectors.

The `sum` and `prod` functions will add and multiply the elements in a vector:

```julia;
v1 = [1,1,2,3,5,8]
sum(v1), prod(v1)
```

The `unique` function will throw out any duplicates:

```julia;
unique(v1) # drop a `1`
```

The functions `maximum` and `minimum` will return the largest and smallest values of an appropriate vector.

```julia;
maximum(v1)
```

(These should not be confused with `max` and `min`, which give the largest or smallest value over all their arguments.)
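The distinction can be seen in a small sketch (the numbers here are just for illustration):

```julia
v = [1, 1, 2, 3, 5, 8]
maximum(v)      # 8: the largest element of the collection
max(1, 8, 5)    # 8: the largest of the individual arguments
max(v...)       # splatting the elements gives the same answer as maximum(v)
```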
The `extrema` function returns both the smallest and largest value of a collection:

```julia;
extrema(v1)
```

Consider now

```julia;
𝒗 = [1,4,2,3]
```

The `sort` function will rearrange the values in `𝒗`:

```julia;
sort(𝒗)
```

The keyword argument `rev=true` can be given to get the values in decreasing order:

```julia;
sort(𝒗, rev=true)
```

For adding a new element to a vector the `push!` method can be used, as in

```julia;
push!(𝒗, 5)
```

To append more than one value, the `append!` function can be used:

```julia;
append!(v1, [6,8,7])
```

These two functions modify or mutate the values stored within the
vector that is passed as an argument. In the `push!` example above,
the value `5` is added to the vector of ``4`` elements. In `Julia`, a
convention is to name mutating functions with a trailing exclamation
mark. (Again, these do not mutate the binding of `𝒗` to the container,
but do mutate the contents of the container.) Some functions come in
both mutating and non-mutating versions; an example is `sort` and
`sort!`.

If only a mutating function is available, like `push!`, and mutation
is not desired, a copy of the vector can be made first. It is not
enough to copy by assignment, as with `w = 𝒗`, since then both `w` and
`𝒗` will be bound to the same memory location. Rather, call `copy` to
make a new container with copied contents, as in `w = copy(𝒗)`.

Creating new vectors of a given size is common in programming, though
not much use will be made here. There are many different functions to
do so: `ones` to make a vector of ones, `zeros` to make a vector of
zeros, `trues` and `falses` to make Boolean vectors of a given size,
and `similar` to make a similar-sized vector (with no particular
values assigned).
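The difference between copying a binding and calling `copy`, mentioned above, can be demonstrated with a short sketch:

```julia
v = [1, 2, 3]
w = v          # w is bound to the *same* container as v
w[1] = 99      # mutating through w also changes v; v is now [99, 2, 3]
u = copy(v)    # u gets its own container with copied contents
u[1] = 1       # this leaves v unchanged
v, u           # ([99, 2, 3], [1, 2, 3])
```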
## Applying functions element by element to values in a vector

Functions such as `sum` or `length` are known as *reductions*, as they reduce the "dimensionality" of the data: a vector is in some sense $1$-dimensional; the sum or length are $0$-dimensional numbers. Applying a reduction is straightforward ``-`` it is just a regular function call.

```julia; hold=true
v = [1, 2, 3, 4]
sum(v), length(v)
```

Other desired operations with vectors act differently. Rather than reduce a collection of values using some formula, the goal is to apply some formula to *each* of the values, returning a modified vector. A simple example might be to square each element, or subtract the average value from each element. An example comes from statistics. When computing a variance, we start with data $x_1, x_2, \dots, x_n$ and along the way form the values $(x_1-\bar{x})^2, (x_2-\bar{x})^2, \dots, (x_n-\bar{x})^2$.

Such things can be done in *many* different ways. Here we describe two, but will primarily utilize the first.

### Broadcasting a function call

If we have a vector, `xs`, and a function, `f`, to apply to each value, there is a simple means to achieve this task. Adding a "dot" between the function name and the parentheses that enclose the arguments instructs `Julia` to "broadcast" the function call. The details allow for more flexibility, but, for this purpose, broadcasting will take each value in `xs` and apply `f` to it, returning a vector of the same size as `xs`. When more than one argument is involved, broadcasting will try to fill out different sized objects.

For example, the following will find, using `sqrt`, the square root of each value in a vector:

```julia;
xs = [1, 1, 3, 4, 7]
sqrt.(xs)
```

This would find the sine of each number in `xs`:

```julia;
sin.(xs)
```

For each function, the `.(` (and not `(`) after the name is the surface syntax for broadcasting.

The `^` operator is an *infix* operator.
Infix operators can be broadcast as well, by placing a `.` immediately before the operator, as in:

```julia;
xs .^ 2
```

Here is an example involving the logarithm of a set of numbers. In astronomy, a logarithm with base $100^{1/5}$ is used for star [brightness](http://tinyurl.com/ycp7k8ay). We can use broadcasting to find this value for several values at once through:

```julia;
ys = [1/5000, 1/500, 1/50, 1/5, 5, 50]
base = (100)^(1/5)
log.(base, ys)
```

Broadcasting with multiple arguments allows for mixing of vectors and scalar values, as above, making it convenient when parameters are used.

As a final example, the task from statistics of centering and then
squaring can be done with broadcasting. We go a bit further, showing how to compute
the [sample variance](http://tinyurl.com/p6wa4r8) of a data set. This has the formula

```math
\frac{1}{n-1}\cdot ((x_1-\bar{x})^2 + \cdots + (x_n - \bar{x})^2).
```

This can be computed, with broadcasting, through:

```julia; hold=true
import Statistics: mean
xs = [1, 1, 2, 3, 5, 8, 13]
n = length(xs)
(1/(n-1)) * sum(abs2.(xs .- mean(xs)))
```

This shows many of the manipulations that can be made with
vectors. Rather than write `.^2`, we follow the definition of `var` and choose the possibly more
performant `abs2` function which, in general, efficiently finds $|x|^2$
for various number types. The `.-` uses broadcasting to subtract a scalar (`mean(xs)`) from a vector (`xs`). Without the `.`, this would error.

!!! note
    The `map` function is very much related to broadcasting, and similarly named functions are found in many different programming languages. (The "dot" broadcast is mostly limited to `Julia` and mirrors a similar usage of a dot in `MATLAB`.) For those familiar with other programming languages, using `map` may seem more natural. Its syntax is `map(f, xs)`.

### Comprehensions

In mathematics, set notation is often used to describe elements in a set.
- -For example, the first ``5`` cubed numbers can be described by: - -```math -\{x^3: x \text{ in } 1, 2,\dots, 5\} -``` - -Comprehension notation is similar. The above could be created in `Julia` with: - -```julia; -𝒙s = [1,2,3,4,5] -[x^3 for x in 𝒙s] -``` - - -Something similar can be done more succinctly: - -```julia; -𝒙s .^ 3 -``` - -However, comprehensions have a value when more complicated expressions are desired as they work with an expression of `𝒙s`, and not a pre-defined or user-defined function. - - -Another typical example of set notation might include a condition, -such as, the numbers divisible by $7$ between $1$ and $100$. Set -notation might be: - -```math -\{x: \text{rem}(x, 7) = 0 \text{ for } x \text{ in } 1, 2, \dots, 100\}. -``` - -This would be read: "the set of $x$ such that the remainder on division by $7$ is $0$ for all x in $1, 2, \dots, 100$." - -In `Julia`, a comprehension can include an `if` clause to mirror, -somewhat, the math notation. For example, the above would become -(using `1:100` as a means to create the numbers ``1,2,\dots, 100``, as will be described in an upcoming section): - -```julia; -[x for x in 1:100 if rem(x,7) == 0] -``` - - -Comprehensions can be a convenient means to describe a collection of -numbers, especially when no function is defined, but the simplicity of -the broadcast notation (just adding a judicious ".") leads to its more -common use in these notes. - - -##### Example: creating a "T" table for creating a graph - -The process of plotting a function is usually first taught by -generating a "T" table: values of $x$ and corresponding values of -$y$. These pairs are then plotted on a Cartesian grid and the points -are connected with lines to form the graph. Generating a "T" table in -`Julia` is easy: create the $x$ values, then create the $y$ values for -each $x$. - -To be concrete, let's generate $7$ points to plot $f(x) = x^2$ over $[-1,1]$. - -The first task is to create the data. 
We will soon see more convenient ways to generate patterned data, but for now, we do this by hand:

```julia;
a, b, n = -1, 1, 7
d = (b-a) // (n-1)
𝐱s = [a, a+d, a+2d, a+3d, a+4d, a+5d, a+6d] # 7 points
```

To get the corresponding $y$ values, we can use a comprehension (or define a function and broadcast):

```julia;
𝐲s = [x^2 for x in 𝐱s]
```

Vectors can be compared together by combining them into a separate container, as follows:

```julia;
[𝐱s 𝐲s]
```

(If there is a space between objects they are horizontally combined. In our construction of vectors using `[]` we used a comma for vertical combination. More generally we should use a `;` for vertical concatenation.)

In the sequel, we will typically use broadcasting for this task, using two
steps: one to define a function, the second to broadcast it.

!!! note
    The style generally employed here is to use plural variable names for a collection of values, such as the vector of $y$ values, and singular names when a single value is being referred to, leading to expressions like "`x in xs`".

## Other container types

Vectors in `Julia` are a container, one of many different types. Another useful type for programming purposes is the *tuple*. If a vector is formed by placing comma-separated values within a `[]` pair (e.g., `[1,2,3]`), a tuple is formed by placing comma-separated values within a `()` pair. A tuple of length $1$ uses a convention of a trailing comma to distinguish it from a parenthesized expression (e.g., `(1,)` is a tuple, `(1)` is just the value `1`).

Tuples are used in programming, as they don't typically require allocated memory to be used, so they can be faster. Internal usages are for function arguments and function return types. Unlike vectors, tuples can be heterogeneous collections. (When commas are used to combine more than one output into a cell, a tuple is being used.)
(Also, a big technical distinction is that tuples are different from vectors and other containers in that tuple types are *covariant* in their parameters, not *invariant*.)

Unlike vectors, the components of a tuple can be given names, which
can then be used for referencing a value, similar to indexing but
possibly more convenient. Named tuples are similar to *dictionaries*,
which are used to associate a key (like a name) with a value.

For example, here a named tuple is constructed, and then its elements referenced:

```julia
nt = (one=1, two="two", three=:three) # heterogeneous values (Int, String, Symbol)
nt.one, nt[2], nt[end] # named tuples have name or index access
```

## Questions

###### Question

Which command will create the vector $\vec{v} = \langle 4,~ 3 \rangle$?

```julia; hold=true; echo=false;
choices = [
q"v = [4,3]",
q"v = {4, 3}",
q"v = '4, 3'",
q"v = (4,3)",
q"v = <4,3>"]
answ = 1
radioq(choices, answ)
```

###### Question

Which command will create the vector with components "4,3,2,1"?

```julia; hold=true; echo=false;
choices = [q"v = [4,3,2,1]", q"v = (4,3,2,1)", q"v = {4,3,2,1}", q"v = '4, 3, 2, 1'", q"v = <4,3,2,1>"]
answ = 1
radioq(choices, answ)
```

###### Question

What is the magnitude of the vector $\vec{v} = \langle 10,~ 15 \rangle$?

```julia; hold=true; echo=false;
v = [10, 15]
val = norm(v)
numericq(val)
```

###### Question

Which of the following is the unit vector in the direction of $\vec{v} = \langle 3,~ 4 \rangle$?

```julia; hold=true; echo=false;
choices = [q"[3, 4]", q"[0.6, 0.8]", q"[1.0, 1.33333]", q"[1, 1]"]
answ = 2
radioq(choices, answ)
```

###### Question

What vector is in the same direction as $\vec{v} = \langle 3,~ 4 \rangle$ but is 10 times as long?
```julia; hold=true; echo=false;
choices = [q"[3, 4]", q"[30, 40]", q"[9.48683, 12.6491]", q"[10, 10]"]
answ = 2
radioq(choices, answ)
```

###### Question

If $\vec{v} = \langle 3,~ 4 \rangle$ and $\vec{w} = \langle 1,~ 2 \rangle$, find $2\vec{v} + 5 \vec{w}$.

```julia; hold=true; echo=false;
choices = [q"[4, 6]", q"[6, 8]", q"[11, 18]", q"[5, 10]"]
answ = 3
radioq(choices, answ)
```

###### Question

Let `v` be defined by:

```julia; hold=true; eval=false
v = [1, 1, 2, 3, 5, 8, 13, 21]
```

What is the length of `v`?

```julia; hold=true; echo=false;
v = [1, 1, 2, 3, 5, 8, 13, 21]
val = length(v)
numericq(val)
```

What is the `sum` of `v`?

```julia; hold=true; echo=false;
v = [1, 1, 2, 3, 5, 8, 13, 21]
val = sum(v)
numericq(val)
```

What is the `prod` of `v`?

```julia; hold=true; echo=false;
v = [1, 1, 2, 3, 5, 8, 13, 21]
val = prod(v)
numericq(val)
```

###### Question

From [transum.org](http://www.transum.org/Maths/Exam/Online_Exercise.asp?Topic=Vectors).
```julia; hold=true; echo=false
p = plot(xlim=(0,10), ylim=(0,5), legend=false, framestyle=:none)
for j in (-3):10
    plot!(p, [j, j + 5], [0, 5*sqrt(3)], color=:blue, alpha=0.5)
    plot!(p, [j - 5, j], [5*sqrt(3), 0], color=:blue, alpha=0.5)
end
for i in 1/2:1/2:3
    plot!(p, [0, 10], sqrt(3)*[i, i], color=:blue, alpha=0.5)
end

quiver!(p, [(3/2, 3/2*sqrt(3))], quiver=[(1, 0)], color=:black, linewidth=5)                # a
quiver!(p, [(2, sqrt(3))], quiver=[(1/2, -sqrt(3)/2)], color=:black, linewidth=5)           # b
quiver!(p, [(3 + 3/2, 3/2*sqrt(3))], quiver=[(3, 0)], color=:black, linewidth=5)            # c
quiver!(p, [(4, sqrt(3))], quiver=[(3/2, -sqrt(3)/2)], color=:black, linewidth=5)           # d
quiver!(p, [(6 + 1/2, sqrt(3)/2)], quiver=[(1/2, sqrt(3)/2)], color=:black, linewidth=5)    # e

delta = 1/4
annotate!(p, [(2, 3/2*sqrt(3) - delta, L"a"),
              (2 + 1/4, sqrt(3), L"b"),
              (3 + 3/2 + 3/2, 3/2*sqrt(3) - delta, L"c"),
              (4 + 3/4, sqrt(3) - sqrt(3)/4 - delta, L"d"),
              (6 + 3/4 + delta, sqrt(3)/2 + sqrt(3)/4 - delta, L"e")])

p
```

The figure shows $5$ vectors.

Express vector **c** in terms of **a** and **b**:

```julia; hold=true; echo=false;
choices = ["3a", "3b", "a + b", "a - b", "b-a"]
answ = 1
radioq(choices, answ)
```

Express vector **d** in terms of **a** and **b**:

```julia; hold=true; echo=false;
choices = ["3a", "3b", "a + b", "a - b", "b-a"]
answ = 3
radioq(choices, answ)
```

Express vector **e** in terms of **a** and **b**:

```julia; hold=true; echo=false;
choices = ["3a", "3b", "a + b", "a - b", "b-a"]
answ = 4
radioq(choices, answ)
```

###### Question

If `xs = [1, 2, 3, 4]` and `f(x) = x^2`, which of these will *not* produce the vector `[1, 4, 9, 16]`?

```julia; hold=true; echo=false;
choices = [q"f.(xs)", q"map(f, xs)", q"[f(x) for x in xs]", "All three of them work"]
answ = 4
radioq(choices, answ, keep_order=true)
```

###### Question

Let $f(x) = \sin(x)$ and $g(x) = \cos(x)$.
In the interval $[0, 2\pi]$ the zeros of $g(x)$ are given by

```julia;
zs = [pi/2, 3pi/2]
```

What construct will give the function values of $f$ at the zeros of $g$?

```julia; hold=true; echo=false;
choices = [q"sin(zs)", q"sin.(zs)", q"sin(.zs)", q".sin(zs)"]
answ = 2
radioq(choices, answ, keep_order=true)
```

###### Question

If `zs = [1,4,9,16]`, which of these commands will return `[1.0, 2.0, 3.0, 4.0]`?

```julia; hold=true; echo=false;
choices = [
q"sqrt(zs)",
q"sqrt.(zs)",
q"zs^(1/2)",
q"zs^(1./2)"
]
answ = 2
radioq(choices, answ, keep_order=true)
```
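As a quick check of that last answer, a sketch (the `try`/`catch` is only there to show that the non-broadcast form fails; the variable names are illustrative):

```julia
zs = [1, 4, 9, 16]

ws = sqrt.(zs)        # the dot broadcasts `sqrt` over each element

# without the dot there is no method of `sqrt` defined for a vector of numbers:
failed = try
    sqrt(zs)
    false
catch err
    err isa MethodError
end
```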