From e1c3af4f372b2eb53766cb713cd4175e32bdb586 Mon Sep 17 00:00:00 2001
From: "Steven G. Johnson" <stevenj@mit.edu>
Date: Mon, 8 May 2023 11:45:16 -0400
Subject: [PATCH] pset 6 solutions

---
 README.md            |   2 +-
 psets/pset6sol.ipynb | 439 +++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 440 insertions(+), 1 deletion(-)
 create mode 100644 psets/pset6sol.ipynb

diff --git a/README.md b/README.md
index 41da033..eb39a53 100644
--- a/README.md
+++ b/README.md
@@ -376,4 +376,4 @@ Fourier series vs. DFT: If we view the DFT as a [Riemann sum](https://en.wikiped
 * IIR filter stability: poles zₚ must lie inside the unit circle |zₚ|
 * IIR filter design: non-convex and complicated, lots of algorithms and approximations.
 * [Kronecker products A⊗B](https://en.wikipedia.org/wiki/Kronecker_product) and ["vectorization" vec(A)](https://en.wikipedia.org/wiki/Vectorization_(mathematics)): expressing "multidimensional linear operations" ([multilinear algebra](https://en.wikipedia.org/wiki/Multilinear_algebra)) as ordinary matrix–vector products.
-* pset 6 solutions: coming soon
+* [pset 6 solutions](psets/pset6sol.ipynb)
diff --git a/psets/pset6sol.ipynb b/psets/pset6sol.ipynb
new file mode 100644
index 0000000..590b8e3
--- /dev/null
+++ b/psets/pset6sol.ipynb
@@ -0,0 +1,439 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "a65bff2e",
+   "metadata": {},
+   "source": [
+    "# 18.065 Problem Set 6 Solutions"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "46c6dd3c",
+   "metadata": {},
+   "source": [
+    "## Problem 1 (10+10 points)\n",
+    "\n",
+    "**(a)** In class, we showed that if $\\tilde{y} = \\bar{F} y, \\tilde{a} = \\bar{F} a, \\tilde{x} = \\bar{F} x$ where $F$ is the DFT matrix, then $y = a \\circledast x$ (circular convolution) implies $\\tilde{y} = \\tilde{a} \\, \\mbox{.*} \\, \\tilde{x}$ (element-wise product).   Show that the reverse is also true: show that if $\\tilde{y} = \\tilde{a} \\circledast \\tilde{x}$, then $y = \\alpha (a \\, \\mbox{.*} \\, x) $ for some scale factor $\\alpha = ???$. \n",
+    "\n",
+    "**(b)** Differentiating through convolutions, in class, involved the \"reversal\" permutation $R$:\n",
+    "$$\n",
+    "R\\begin{pmatrix} x_0 \\\\ x_1 \\\\ x_2 \\\\ \\vdots \\\\ x_{N-1} \\end{pmatrix}\n",
+    "=\\begin{pmatrix} x_0 \\\\ x_{N-1} \\\\ x_{N-2} \\\\ \\vdots \\\\ x_1 \\end{pmatrix}\n",
+    "$$\n",
+    "What happens when we combine this with a DFT, e.g. to apply the convolution theorem?  Show that $FRx = \\bar{F}x$.\n",
+    "\n",
+    "(This is illustrated by the code below: `fft(x)` computes $Fx$ and `bfft(x)` computes $\\bar{F}x$ using the [FFTW.jl package](https://github.com/JuliaMath/FFTW.jl).)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "id": "e1ce6605",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "true"
+      ]
+     },
+     "execution_count": 1,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "using FFTW\n",
+    "R(x) = [x[begin]; x[end:-1:begin+1]] # compute R*x\n",
+    "x = rand(20)\n",
+    "fft(R(x)) ≈ bfft(x)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "be151905",
+   "metadata": {},
+   "source": [
+    "## Solutions:\n",
+    "\n",
+    "**(a)** There are many ways to approach this problem, one approach is simply to re-trace our steps from class.\n",
+    "\n",
+    "In class, we showed that the columns of $F$ are eigenvectors of cyclic convolutions, with the corresponding eigenvalues given by $\\bar{F} a$.  Hence we obtained $a \\circledast x = F (\\bar{F}a \\,\\mbox{.*}) F^{-1} x =  \\frac{1}{N} F (\\bar{F}a \\, \\mbox{.*} \\, \\bar{F} x) = \\bar{F}^{-1} (\\tilde{a} \\,\\mbox{.*}\\, \\tilde{x})$.\n",
+    "\n",
+    "Now, we want to show almost the same thing but we want to transform in the opposite direction.  That is, we want to replace $F$ with $F^{-1} = \\bar{F}/N$ and vice versa.   If we proceed as in class, we can start by showing that the columns of $\\bar{F}$ are eigenvectors of convolutions.   But our proof from class for $F$ only relied on the cyclic property $\\omega_N^N = 1$!  The proof is completely unchanged if we replace $\\omega_N$ with $\\overline{\\omega_N}$ (corresponding to replacing $F$ with $\\bar{F}$)!\n",
+    "\n",
+    "Hence we can simply quote the results above and replace $F$ with $\\bar{F}$:\n",
+    "$$\n",
+    "\\tilde{y} = \\tilde{a} \\circledast \\tilde{x} = \\bar{F} (F\\tilde{a}\\, \\mbox{.*}) \\bar{F}^{-1} \\tilde{x} \n",
+    "=  F^{-1} (F\\tilde{a} \\,\\mbox{.*}\\, F\\tilde{x})\n",
+    "$$\n",
+    "Since $\\tilde{x} = \\bar{F} x$, that means $x = \\bar{F}^{-1} \\tilde{x} = F \\tilde{x} / N$, i.e. $F\\tilde{x} = Nx$.  Similarly for $a$.  Substituting that in, we obtain:\n",
+    "$$\n",
+    "\\tilde{y} = N^2 F^{-1} (a \\,\\mbox{.*}\\, x) \\Longleftrightarrow \\boxed{y = N (a \\,\\mbox{.*}\\, x)}.\n",
+    "$$\n",
+    "which is the desired result with $\\boxed{\\alpha = N}$.\n",
+    "\n",
+    "*Optional*: Since it's easy to get this algebra wrong, let's do a quick check of our result in Julia:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "id": "d365ac92",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "true"
+      ]
+     },
+     "execution_count": 3,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# brute-force circular convolution\n",
+    "function conv(a,x)\n",
+    "    N = length(a); N == length(x) || throw(DimensionMismatch())\n",
+    "    return [sum(i -> a[begin+i] * x[begin+mod(j-i, N)], 0:N-1) for j in 0:N-1]\n",
+    "end\n",
+    "\n",
+    "N = 10\n",
+    "ã = randn(ComplexF64, N)\n",
+    "x̃ = randn(ComplexF64, N)\n",
+    "ỹ = conv(ã, x̃)\n",
+    "a = fft(ã)/N # Fã/N\n",
+    "x = fft(x̃)/N # Fx̃/N\n",
+    "y = fft(ỹ)/N # Fỹ/N\n",
+    "y ≈ N * (a .* x)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "b61d45ed",
+   "metadata": {},
+   "source": [
+    "Hooray!  Math works!\n",
+    "\n",
+    "**(b)**\n",
+    "\n",
+    "If $y = Rx$, then $y_n = x_{-n}$ (with indices interpreted $\\mod N$ as usual).  If $z = FRx = Fy$, then we have:\n",
+    "$$\n",
+    "z_m = \\sum_{n=0}^{N-1} \\omega_N^{mn} y_n = \\sum_{n=0}^{N-1} \\omega_N^{mn} x_{-n} = \\sum_{k=0}^{N-1} \\omega_N^{-mk} x_{k} = (\\bar{F} x)_m\n",
+    "$$\n",
+    "where we simply performed a change of variables $k = -n$ in the sum.  Hence $z = FRx = \\bar{F}x$ as desired."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "fc26407d",
+   "metadata": {},
+   "source": [
+    "## Problem 2 (10+10+10 points)\n",
+    "\n",
+    "Think way back to grade school, when you learned to multiply two numbers $a_{N_a-1} \\cdots a_2 a_1 a_0 \\times x_{N_x-1} \\cdots x_2 x_1 x_0$, where $a_k$ and $x_k$ are the **decimal digits** of two numbers with $N_a$ and $N_x$ digits, respectively.  (That is, $x_{N_x-1} \\cdots x_2 x_1 x_0 = \\sum_k {10}^k x_k$.)\n",
+    "\n",
+    "**(a)** Show that, if you ignore carries (i.e. you allow \"digits\" $> 9$) then the *result* $ y_{N_y-1} \\cdots y_2 y_1 y_0$ is in fact a **convolution** of the digits $y = a \\ast x$, but *not* a cyclic convolution — there is no \"wraparound\".  (What is $N_y$?)\n",
+    "\n",
+    "**(b)** Show that you can write your previous answer as a *cyclic* convolution simply by \"padding\" $a,x,y$ with zeros.  That is, if you add some zeros to each of them (in the right place) to extend them to a common length $N$ (what must $N$ be?), then $\\mbox{pad}(y,N) = \\mbox{pad}(a,N) \\circledast \\mbox{pad}(x,N)$.\n",
+    "\n",
+    "**(c)** Write Julia code to compute\n",
+    "$$\n",
+    "\\underbrace{94403147635990179114}_a \\times \\underbrace{33855185913241354461}_x \\\\\n",
+    "= \\underbrace{3196036114011618584561027454963352927554}_y\n",
+    "$$\n",
+    "by:\n",
+    "* Implement the function `pad(x,N)` to zero-pad as in part (b)\n",
+    "* Implement a function `conv(a,x)` to perform a circular convolution using $Fx =$ `fft(x)` and $\\bar{F}x =$ `bfft(x)`.  (You may want to call `round.(Int,real(y))` on the result to discard tiny roundoff errors, since the final values should be integers).\n",
+    "* Implement a function `carry(y)` to perform the carries: any \"digits\" larger than $10$ should have the tens place \"carried\" to the next digit etc.  (You may find the [`divrem` function](https://docs.julialang.org/en/v1/base/math/#Base.divrem) useful to divide by 10 and find a remainder.)\n",
+    "Check that you get the right answer: `bignum(carry(y)) == 3196036114011618584561027454963352927554`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "id": "b098b074",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "bignum(a) = 94403147635990179114\n",
+      "bignum(x) = 33855185913241354461\n",
+      "bignum(a) * bignum(x) = 3196036114011618584561027454963352927554\n"
+     ]
+    },
+    {
+     "data": {
+      "text/plain": [
+       "3196036114011618584561027454963352927554"
+      ]
+     },
+     "execution_count": 4,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "using FFTW\n",
+    "\n",
+    "a = [4, 1, 1, 9, 7, 1, 0, 9, 9, 5, 3, 6, 7, 4, 1, 3, 0, 4, 4, 9]\n",
+    "x = [1, 6, 4, 4, 5, 3, 1, 4, 2, 3, 1, 9, 5, 8, 1, 5, 5, 8, 3, 3]\n",
+    "\n",
+    "bignum(x) = evalpoly(BigInt(10), x)\n",
+    "@show bignum(a)\n",
+    "@show bignum(x)\n",
+    "@show bignum(a) * bignum(x)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "bb0e34b1",
+   "metadata": {},
+   "source": [
+    "## Solutions:\n",
+    "\n",
+    "**(a)** Think about the usual \"long multiplication algorithm, omitting carries.  To get the 1's place of the output, i.e. the \"digit\" $y_0$, we simply multiply the 1's -places of the inputs: $y_0 = a_0 x_0$.   To get the 10's place of the ouptut, i.e. the \"digit\" $y_1$, we multiply the 1's place of $a$ with the 10's place of $x$ and vice versa, and add them, i.e. $y_1 = a_0 x_1 + a_1 x_0$.  And so forth, giving a pattern like:\n",
+    "$$\n",
+    "y_0 = a_0 x_0 \\\\\n",
+    "y_1 = a_0 x_1 + a_1 x_0 \\\\\n",
+    "y_2 = a_0 x_2 + a_1 x_1 + a_2 x_0 \\\\\n",
+    "\\vdots \\\\\n",
+    "y_m = \\sum_{n=0}^m a_n x_{m-n}\n",
+    "$$\n",
+    "which is exactly in the form of a linear convolution!  To make the sums look more uniform, we can equivalently think of each number as having an *infinite* number of digits, but most of the digits are zero (e.g. $x_k = 0$ for $k < 0$ or $k \\ge N_x$, where the former correspond to \"zeros after the decimal point\" and the latter correspond to zeros before the most significant digit).  With this convention, we can simply  write:\n",
+    "$$\n",
+    "y_m = \\sum_{n=-\\infty}^\\infty a_n x_{m-n}\n",
+    "$$\n",
+    "\n",
+    "**(b)** We must pad with zeros to a length $N$ that is big enough that the circular \"wrap around\" of a cyclic convolution does not distort the result — the \"wrapped around\" digits must all be zeros.  Equivalently, since a convolution is a sequence of dot product where we \"slide\" $x$ past $a$ (or vice versa), the overlap should be zero before $x$ slides past the boundary.  The last possible overlap, corresponds to the most significant digit in the output $y$, which is $y_m$ for $m = N_a + N_x - 2$: the output (not including carries) has $N_a + N_x - 1$ digits.\n",
+    "In short, we simply need to pad to a length long enough to hold all the digits of the output, i.e. to any $\\boxed{N \\ge N_a + N_x - 1}$.\n",
+    "\n",
+    "Be careful in part (c), however!  Once we include carries, this can introduce *additional* digits, so the carried result may eventually be a longer length.\n",
+    "\n",
+    "**(c)** Our implementation and tests are as follows"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 22,
+   "id": "bdcd7ae6",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "conv (generic function with 2 methods)"
+      ]
+     },
+     "execution_count": 22,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# pad x with N-length(x) zeros of the same type\n",
+    "pad(x, N) = [x; zeros(eltype(x), N - length(x))]\n",
+    "    \n",
+    "# there are many ways to write this,\n",
+    "# since from problem 1 the convolution theorem takes many forms\n",
+    "conv(x,y) = ifft(fft(x) .* fft(y))\n",
+    "# or: conv(x,y) = bfft(fft(x) .* fft(y)) / length(x)\n",
+    "# or: conv(x,y) = fft(bfft(x) .* bfft(y)) / length(x)\n",
+    "\n",
+    "# for integer vectors, we expect an integer result, but\n",
+    "# FFTs will give us slightly complex/noninteger results due\n",
+    "# to roundoff errors.  Let's correct this as suggested:\n",
+    "conv(x::AbstractVector{<:Integer}, y::AbstractVector{<:Integer}) =\n",
+    "    round.(promote_type(eltype(x),eltype(y)), real.(conv(float.(x), float.(y))))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "792e6fbb",
+   "metadata": {},
+   "source": [
+    "Before we do any carries, let's check that our convolution and zero-padding are correct — evaluating the resulting \"digits\" as an integer using the `bignum` function from above should still work even if we have digits $> 10$."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 23,
+   "id": "8f411509",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "bignum(y) == bignum(a) * bignum(x) = true\n",
+      "bignum(y) == bignum(a) * bignum(x) = true\n",
+      "bignum(y) == bignum(a) * bignum(x) = true\n",
+      "bignum(y) == bignum(a) * bignum(x) = true\n",
+      "bignum(y) == bignum(a) * bignum(x) = true\n",
+      "bignum(y) == bignum(a) * bignum(x) = true\n"
+     ]
+    }
+   ],
+   "source": [
+    "Na = length(a)\n",
+    "Nx = length(x)\n",
+    "\n",
+    "# any length ≥ Na + Nx - 1 should work.  let's try a few:\n",
+    "for N in (Na + Nx - 1) .+ (0:5)\n",
+    "    y = conv(pad(a, N), pad(x, N))\n",
+    "    @show bignum(y) == bignum(a) * bignum(x)\n",
+    "end"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "07051556",
+   "metadata": {},
+   "source": [
+    "Now, let's implement the carry function.  The key step, for a digit $y_k = {}$ `y[k]`, is that we want to find the quotient $q$ and the remainder $r$ after dividing by 10, which can be accomplished in Julia by:\n",
+    "```jl\n",
+    "q, r = divrem(y[k], 10)\n",
+    "```\n",
+    "Then, the remainder is the *new k-th digit* ($y_k \\to r$) and the quotient *gets carried* (added to $y_{k+1}$).   Note, however, that `y[k]` in this expression *includes any previous carries*."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 29,
+   "id": "53f03b79",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "carry (generic function with 1 method)"
+      ]
+     },
+     "execution_count": 29,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "function carry(y)\n",
+    "    c = similar(y, 0) # result array of the same type as y, of zero length\n",
+    "    push!(c, 0) # let's start off with a result of zero.\n",
+    "    for k = 1:length(y)\n",
+    "        q, r = divrem(y[k]+c[k], 10)\n",
+    "        c[k] = r # remainder = new digit (+ any previous carry)\n",
+    "        push!(c, q) # push quotient as the next digit\n",
+    "    end\n",
+    "    return c\n",
+    "end"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "0dabf98c",
+   "metadata": {},
+   "source": [
+    "Let's try it:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 32,
+   "id": "1a7587ea",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "y = [4, 5, 5, 7, 2, 9, 2, 5, 3, 3, 6, 9, 4, 5, 4, 7, 2, 0, 1, 6, 5, 4, 8, 5, 8, 1, 6, 1, 1, 0, 4, 1, 1, 6, 3, 0, 6, 9, 1, 3]\n",
+      "y == digits(bignum(a) * bignum(x)) = true\n",
+      "bignum(y) == bignum(a) * bignum(x) = true\n"
+     ]
+    },
+    {
+     "data": {
+      "text/plain": [
+       "true"
+      ]
+     },
+     "execution_count": 32,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "N = Na + Nx - 1\n",
+    "y = carry(conv(pad(a, N), pad(x, N)))\n",
+    "\n",
+    "@show y\n",
+    "@show y == digits(bignum(a) * bignum(x))\n",
+    "@show bignum(y) == bignum(a) * bignum(x)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "5d6e8273",
+   "metadata": {},
+   "source": [
+    "Hooray, it's the correct digits $319603 \\ldots 927554$ (starting with the least signifcant!) and passes all of our tests!\n",
+    "\n",
+    "Note that the carried result is one digit longer than $N$ digits:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 33,
+   "id": "d4dd3e88",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "N = 39\n",
+      "length(y) = 40\n"
+     ]
+    },
+    {
+     "data": {
+      "text/plain": [
+       "40"
+      ]
+     },
+     "execution_count": 33,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "@show N\n",
+    "@show length(y)"
+   ]
+  }
+ ],
+ "metadata": {
+  "@webio": {
+   "lastCommId": null,
+   "lastKernelId": null
+  },
+  "kernelspec": {
+   "display_name": "Julia 1.8.3",
+   "language": "julia",
+   "name": "julia-1.8"
+  },
+  "language_info": {
+   "file_extension": ".jl",
+   "mimetype": "application/julia",
+   "name": "julia",
+   "version": "1.8.3"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}