From e1c3af4f372b2eb53766cb713cd4175e32bdb586 Mon Sep 17 00:00:00 2001 From: "Steven G. Johnson" Date: Mon, 8 May 2023 11:45:16 -0400 Subject: [PATCH] pset 6 solutions --- README.md | 2 +- psets/pset6sol.ipynb | 439 +++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 440 insertions(+), 1 deletion(-) create mode 100644 psets/pset6sol.ipynb diff --git a/README.md b/README.md index 41da033..eb39a53 100644 --- a/README.md +++ b/README.md @@ -376,4 +376,4 @@ Fourier series vs. DFT: If we view the DFT as a [Riemann sum](https://en.wikiped * IIR filter stability: poles zₚ must lie inside the unit circle |zₚ| * IIR filter design: non-convex and complicated, lots of algorithms and approximations. * [Kronecker products A⊗B](https://en.wikipedia.org/wiki/Kronecker_product) and ["vectorization" vec(A)](https://en.wikipedia.org/wiki/Vectorization_(mathematics)): expressing "multidimensional linear operations" ([multilinear algebra](https://en.wikipedia.org/wiki/Multilinear_algebra)) as ordinary matrix–vector products. -* pset 6 solutions: coming soon +* [pset 6 solutions](psets/pset6sol.ipynb) diff --git a/psets/pset6sol.ipynb b/psets/pset6sol.ipynb new file mode 100644 index 0000000..590b8e3 --- /dev/null +++ b/psets/pset6sol.ipynb @@ -0,0 +1,439 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "a65bff2e", + "metadata": {}, + "source": [ + "# 18.065 Problem Set 6 Solutions" + ] + }, + { + "cell_type": "markdown", + "id": "46c6dd3c", + "metadata": {}, + "source": [ + "## Problem 1 (10+10 points)\n", + "\n", + "**(a)** In class, we showed that if $\\tilde{y} = \\bar{F} y, \\tilde{a} = \\bar{F} a, \\tilde{x} = \\bar{F} x$ where $F$ is the DFT matrix, then $y = a \\circledast x$ (circular convolution) implies $\\tilde{y} = \\tilde{a} \\, \\mbox{.*} \\, \\tilde{x}$ (element-wise product). Show that the reverse is also true: show that if $\\tilde{y} = \\tilde{a} \\circledast \\tilde{x}$, then $y = \\alpha (a \\, \\mbox{.*} \\, x) $ for some scale factor $\\alpha = ???$. \n", + "\n", + "**(b)** Differentiating through convolutions, in class, involved the \"reversal\" permutation $R$:\n", + "$$\n", + "R\\begin{pmatrix} x_0 \\\\ x_1 \\\\ x_2 \\\\ \\vdots \\\\ x_{N-1} \\end{pmatrix}\n", + "=\\begin{pmatrix} x_0 \\\\ x_{N-1} \\\\ x_{N-2} \\\\ \\vdots \\\\ x_1 \\end{pmatrix}\n", + "$$\n", + "What happens when we combine this with a DFT, e.g. to apply the convolution theorem? Show that $FRx = \\bar{F}x$.\n", + "\n", + "(This is illustrated by the code below: `fft(x)` computes $Fx$ and `bfft(x)` computes $\\bar{F}x$ using the [FFTW.jl package](https://github.com/JuliaMath/FFTW.jl).)" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "e1ce6605", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "true" + ] + }, + "execution_count": 1, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "using FFTW\n", + "R(x) = [x[begin]; x[end:-1:begin+1]] # compute R*x\n", + "x = rand(20)\n", + "fft(R(x)) ≈ bfft(x)" + ] + }, + { + "cell_type": "markdown", + "id": "be151905", + "metadata": {}, + "source": [ + "## Solutions:\n", + "\n", + "**(a)** There are many ways to approach this problem, one approach is simply to re-trace our steps from class.\n", + "\n", + "In class, we showed that the columns of $F$ are eigenvectors of cyclic convolutions, with the corresponding eigenvalues given by $\\bar{F} a$. Hence we obtained $a \\circledast x = F (\\bar{F}a \\,\\mbox{.*}) F^{-1} x = \\frac{1}{N} F (\\bar{F}a \\, \\mbox{.*} \\, \\bar{F} x) = \\bar{F}^{-1} (\\tilde{a} \\,\\mbox{.*}\\, \\tilde{x})$.\n", + "\n", + "Now, we want to show almost the same thing but we want to transform in the opposite direction. That is, we want to replace $F$ with $F^{-1} = \\bar{F}/N$ and vice versa. If we proceed as in class, we can start by showing that the columns of $\\bar{F}$ are eigenvectors of convolutions. But our proof from class for $F$ only relied on the cyclic property $\\omega_N^N = 1$! The proof is completely unchanged if we replace $\\omega_N$ with $\\overline{\\omega_N}$ (corresponding to replacing $F$ with $\\bar{F}$)!\n", + "\n", + "Hence we can simply quote the results above and replace $F$ with $\\bar{F}$:\n", + "$$\n", + "\\tilde{y} = \\tilde{a} \\circledast \\tilde{x} = \\bar{F} (F\\tilde{a}\\, \\mbox{.*}) \\bar{F}^{-1} \\tilde{x} \n", + "= F^{-1} (F\\tilde{a} \\,\\mbox{.*}\\, F\\tilde{x})\n", + "$$\n", + "Since $\\tilde{x} = \\bar{F} x$, that means $x = \\bar{F}^{-1} \\tilde{x} = F \\tilde{x} / N$, i.e. $F\\tilde{x} = Nx$. Similarly for $a$. Substituting that in, we obtain:\n", + "$$\n", + "\\tilde{y} = N^2 F^{-1} (a \\,\\mbox{.*}\\, x) \\Longleftrightarrow \\boxed{y = N (a \\,\\mbox{.*}\\, x)}.\n", + "$$\n", + "which is the desired result with $\\boxed{\\alpha = N}$.\n", + "\n", + "*Optional*: Since it's easy to get this algebra wrong, let's do a quick check of our result in Julia:" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "id": "d365ac92", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "true" + ] + }, + "execution_count": 3, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# brute-force circular convolution\n", + "function conv(a,x)\n", + " N = length(a); N == length(x) || throw(DimensionMismatch())\n", + " return [sum(i -> a[begin+i] * x[begin+mod(j-i, N)], 0:N-1) for j in 0:N-1]\n", + "end\n", + "\n", + "N = 10\n", + "ã = randn(ComplexF64, N)\n", + "x̃ = randn(ComplexF64, N)\n", + "ỹ = conv(ã, x̃)\n", + "a = fft(ã)/N # Fã/N\n", + "x = fft(x̃)/N # Fx̃/N\n", + "y = fft(ỹ)/N # Fỹ/N\n", + "y ≈ N * (a .* x)" + ] + }, + { + "cell_type": "markdown", + "id": "b61d45ed", + "metadata": {}, + "source": [ + "Hooray! Math works!\n", + "\n", + "**(b)**\n", + "\n", + "If $y = Rx$, then $y_n = x_{-n}$ (with indices interpreted $\\mod N$ as usual). If $z = FRx = Fy$, then we have:\n", + "$$\n", + "z_m = \\sum_{n=0}^{N-1} \\omega_N^{mn} y_n = \\sum_{n=0}^{N-1} \\omega_N^{mn} x_{-n} = \\sum_{k=0}^{N-1} \\omega_N^{-mk} x_{k} = (\\bar{F} x)_m\n", + "$$\n", + "where we simply performed a change of variables $k = -n$ in the sum. Hence $z = FRx = \\bar{F}x$ as desired." + ] + }, + { + "cell_type": "markdown", + "id": "fc26407d", + "metadata": {}, + "source": [ + "## Problem 2 (10+10+10 points)\n", + "\n", + "Think way back to grade school, when you learned to multiply two numbers $a_{N_a-1} \\cdots a_2 a_1 a_0 \\times x_{N_x-1} \\cdots x_2 x_1 x_0$, where $a_k$ and $x_k$ are the **decimal digits** of two numbers with $N_a$ and $N_x$ digits, respectively. (That is, $x_{N_x-1} \\cdots x_2 x_1 x_0 = \\sum_k {10}^k x_k$.)\n", + "\n", + "**(a)** Show that, if you ignore carries (i.e. you allow \"digits\" $> 9$) then the *result* $ y_{N_y-1} \\cdots y_2 y_1 y_0$ is in fact a **convolution** of the digits $y = a \\ast x$, but *not* a cyclic convolution — there is no \"wraparound\". (What is $N_y$?)\n", + "\n", + "**(b)** Show that you can write your previous answer as a *cyclic* convolution simply by \"padding\" $a,x,y$ with zeros. That is, if you add some zeros to each of them (in the right place) to extend them to a common length $N$ (what must $N$ be?), then $\\mbox{pad}(y,N) = \\mbox{pad}(a,N) \\circledast \\mbox{pad}(x,N)$.\n", + "\n", + "**(c)** Write Julia code to compute\n", + "$$\n", + "\\underbrace{94403147635990179114}_a \\times \\underbrace{33855185913241354461}_x \\\\\n", + "= \\underbrace{3196036114011618584561027454963352927554}_y\n", + "$$\n", + "by:\n", + "* Implement the function `pad(x,N)` to zero-pad as in part (b)\n", + "* Implement a function `conv(a,x)` to perform a circular convolution using $Fx =$ `fft(x)` and $\\bar{F}x =$ `bfft(x)`. (You may want to call `round.(Int,real(y))` on the result to discard tiny roundoff errors, since the final values should be integers).\n", + "* Implement a function `carry(y)` to perform the carries: any \"digits\" larger than $10$ should have the tens place \"carried\" to the next digit etc. (You may find the [`divrem` function](https://docs.julialang.org/en/v1/base/math/#Base.divrem) useful to divide by 10 and find a remainder.)\n", + "Check that you get the right answer: `bignum(carry(y)) == 3196036114011618584561027454963352927554`." + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "b098b074", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "bignum(a) = 94403147635990179114\n", + "bignum(x) = 33855185913241354461\n", + "bignum(a) * bignum(x) = 3196036114011618584561027454963352927554\n" + ] + }, + { + "data": { + "text/plain": [ + "3196036114011618584561027454963352927554" + ] + }, + "execution_count": 4, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "using FFTW\n", + "\n", + "a = [4, 1, 1, 9, 7, 1, 0, 9, 9, 5, 3, 6, 7, 4, 1, 3, 0, 4, 4, 9]\n", + "x = [1, 6, 4, 4, 5, 3, 1, 4, 2, 3, 1, 9, 5, 8, 1, 5, 5, 8, 3, 3]\n", + "\n", + "bignum(x) = evalpoly(BigInt(10), x)\n", + "@show bignum(a)\n", + "@show bignum(x)\n", + "@show bignum(a) * bignum(x)" + ] + }, + { + "cell_type": "markdown", + "id": "bb0e34b1", + "metadata": {}, + "source": [ + "## Solutions:\n", + "\n", + "**(a)** Think about the usual \"long multiplication algorithm, omitting carries. To get the 1's place of the output, i.e. the \"digit\" $y_0$, we simply multiply the 1's -places of the inputs: $y_0 = a_0 x_0$. To get the 10's place of the ouptut, i.e. the \"digit\" $y_1$, we multiply the 1's place of $a$ with the 10's place of $x$ and vice versa, and add them, i.e. $y_1 = a_0 x_1 + a_1 x_0$. And so forth, giving a pattern like:\n", + "$$\n", + "y_0 = a_0 x_0 \\\\\n", + "y_1 = a_0 x_1 + a_1 x_0 \\\\\n", + "y_2 = a_0 x_2 + a_1 x_1 + a_2 x_0 \\\\\n", + "\\vdots \\\\\n", + "y_m = \\sum_{n=0}^m a_n x_{m-n}\n", + "$$\n", + "which is exactly in the form of a linear convolution! To make the sums look more uniform, we can equivalently think of each number as having an *infinite* number of digits, but most of the digits are zero (e.g. $x_k = 0$ for $k < 0$ or $k \\ge N_x$, where the former correspond to \"zeros after the decimal point\" and the latter correspond to zeros before the most significant digit). With this convention, we can simply write:\n", + "$$\n", + "y_m = \\sum_{n=-\\infty}^\\infty a_n x_{m-n}\n", + "$$\n", + "\n", + "**(b)** We must pad with zeros to a length $N$ that is big enough that the circular \"wrap around\" of a cyclic convolution does not distort the result — the \"wrapped around\" digits must all be zeros. Equivalently, since a convolution is a sequence of dot product where we \"slide\" $x$ past $a$ (or vice versa), the overlap should be zero before $x$ slides past the boundary. The last possible overlap, corresponds to the most significant digit in the output $y$, which is $y_m$ for $m = N_a + N_x - 2$: the output (not including carries) has $N_a + N_x - 1$ digits.\n", + "In short, we simply need to pad to a length long enough to hold all the digits of the output, i.e. to any $\\boxed{N \\ge N_a + N_x - 1}$.\n", + "\n", + "Be careful in part (c), however! Once we include carries, this can introduce *additional* digits, so the carried result may eventually be a longer length.\n", + "\n", + "**(c)** Our implementation and tests are as follows" + ] + }, + { + "cell_type": "code", + "execution_count": 22, + "id": "bdcd7ae6", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "conv (generic function with 2 methods)" + ] + }, + "execution_count": 22, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# pad x with N-length(x) zeros of the same type\n", + "pad(x, N) = [x; zeros(eltype(x), N - length(x))]\n", + " \n", + "# there are many ways to write this,\n", + "# since from problem 1 the convolution theorem takes many forms\n", + "conv(x,y) = ifft(fft(x) .* fft(y))\n", + "# or: conv(x,y) = bfft(fft(x) .* fft(y)) / length(x)\n", + "# or: conv(x,y) = fft(bfft(x) .* bfft(y)) / length(x)\n", + "\n", + "# for integer vectors, we expect an integer result, but\n", + "# FFTs will give us slightly complex/noninteger results due\n", + "# to roundoff errors. Let's correct this as suggested:\n", + "conv(x::AbstractVector{<:Integer}, y::AbstractVector{<:Integer}) =\n", + " round.(promote_type(eltype(x),eltype(y)), real.(conv(float.(x), float.(y))))" + ] + }, + { + "cell_type": "markdown", + "id": "792e6fbb", + "metadata": {}, + "source": [ + "Before we do any carries, let's check that our convolution and zero-padding are correct — evaluating the resulting \"digits\" as an integer using the `bignum` function from above should still work even if we have digits $> 10$." + ] + }, + { + "cell_type": "code", + "execution_count": 23, + "id": "8f411509", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "bignum(y) == bignum(a) * bignum(x) = true\n", + "bignum(y) == bignum(a) * bignum(x) = true\n", + "bignum(y) == bignum(a) * bignum(x) = true\n", + "bignum(y) == bignum(a) * bignum(x) = true\n", + "bignum(y) == bignum(a) * bignum(x) = true\n", + "bignum(y) == bignum(a) * bignum(x) = true\n" + ] + } + ], + "source": [ + "Na = length(a)\n", + "Nx = length(x)\n", + "\n", + "# any length ≥ Na + Nx - 1 should work. let's try a few:\n", + "for N in (Na + Nx - 1) .+ (0:5)\n", + " y = conv(pad(a, N), pad(x, N))\n", + " @show bignum(y) == bignum(a) * bignum(x)\n", + "end" + ] + }, + { + "cell_type": "markdown", + "id": "07051556", + "metadata": {}, + "source": [ + "Now, let's implement the carry function. The key step, for a digit $y_k = {}$ `y[k]`, is that we want to find the quotient $q$ and the remainder $r$ after dividing by 10, which can be accomplished in Julia by:\n", + "```jl\n", + "q, r = divrem(y[k], 10)\n", + "```\n", + "Then, the remainder is the *new k-th digit* ($y_k \\to r$) and the quotient *gets carried* (added to $y_{k+1}$). Note, however, that `y[k]` in this expression *includes any previous carries*." + ] + }, + { + "cell_type": "code", + "execution_count": 29, + "id": "53f03b79", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "carry (generic function with 1 method)" + ] + }, + "execution_count": 29, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "function carry(y)\n", + " c = similar(y, 0) # result array of the same type as y, of zero length\n", + " push!(c, 0) # let's start off with a result of zero.\n", + " for k = 1:length(y)\n", + " q, r = divrem(y[k]+c[k], 10)\n", + " c[k] = r # remainder = new digit (+ any previous carry)\n", + " push!(c, q) # push quotient as the next digit\n", + " end\n", + " return c\n", + "end" + ] + }, + { + "cell_type": "markdown", + "id": "0dabf98c", + "metadata": {}, + "source": [ + "Let's try it:" + ] + }, + { + "cell_type": "code", + "execution_count": 32, + "id": "1a7587ea", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "y = [4, 5, 5, 7, 2, 9, 2, 5, 3, 3, 6, 9, 4, 5, 4, 7, 2, 0, 1, 6, 5, 4, 8, 5, 8, 1, 6, 1, 1, 0, 4, 1, 1, 6, 3, 0, 6, 9, 1, 3]\n", + "y == digits(bignum(a) * bignum(x)) = true\n", + "bignum(y) == bignum(a) * bignum(x) = true\n" + ] + }, + { + "data": { + "text/plain": [ + "true" + ] + }, + "execution_count": 32, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "N = Na + Nx - 1\n", + "y = carry(conv(pad(a, N), pad(x, N)))\n", + "\n", + "@show y\n", + "@show y == digits(bignum(a) * bignum(x))\n", + "@show bignum(y) == bignum(a) * bignum(x)" + ] + }, + { + "cell_type": "markdown", + "id": "5d6e8273", + "metadata": {}, + "source": [ + "Hooray, it's the correct digits $319603 \\ldots 927554$ (starting with the least signifcant!) and passes all of our tests!\n", + "\n", + "Note that the carried result is one digit longer than $N$ digits:" + ] + }, + { + "cell_type": "code", + "execution_count": 33, + "id": "d4dd3e88", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "N = 39\n", + "length(y) = 40\n" + ] + }, + { + "data": { + "text/plain": [ + "40" + ] + }, + "execution_count": 33, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "@show N\n", + "@show length(y)" + ] + } + ], + "metadata": { + "@webio": { + "lastCommId": null, + "lastKernelId": null + }, + "kernelspec": { + "display_name": "Julia 1.8.3", + "language": "julia", + "name": "julia-1.8" + }, + "language_info": { + "file_extension": ".jl", + "mimetype": "application/julia", + "name": "julia", + "version": "1.8.3" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +}