pset 6 solutions

This commit is contained in:
Steven G. Johnson 2023-05-08 11:45:16 -04:00
parent 76957f16dc
commit e1c3af4f37
2 changed files with 440 additions and 1 deletions

View File

@ -376,4 +376,4 @@ Fourier series vs. DFT: If we view the DFT as a [Riemann sum](https://en.wikiped
* IIR filter stability: poles zₚ must lie inside the unit circle |zₚ|
* IIR filter design: non-convex and complicated, lots of algorithms and approximations.
* [Kronecker products A⊗B](https://en.wikipedia.org/wiki/Kronecker_product) and ["vectorization" vec(A)](https://en.wikipedia.org/wiki/Vectorization_(mathematics)): expressing "multidimensional linear operations" ([multilinear algebra](https://en.wikipedia.org/wiki/Multilinear_algebra)) as ordinary matrixvector products.
* pset 6 solutions: coming soon
* [pset 6 solutions](psets/pset6sol.ipynb)

439
psets/pset6sol.ipynb Normal file
View File

@ -0,0 +1,439 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "a65bff2e",
"metadata": {},
"source": [
"# 18.065 Problem Set 6 Solutions"
]
},
{
"cell_type": "markdown",
"id": "46c6dd3c",
"metadata": {},
"source": [
"## Problem 1 (10+10 points)\n",
"\n",
"**(a)** In class, we showed that if $\\tilde{y} = \\bar{F} y, \\tilde{a} = \\bar{F} a, \\tilde{x} = \\bar{F} x$ where $F$ is the DFT matrix, then $y = a \\circledast x$ (circular convolution) implies $\\tilde{y} = \\tilde{a} \\, \\mbox{.*} \\, \\tilde{x}$ (element-wise product). Show that the reverse is also true: show that if $\\tilde{y} = \\tilde{a} \\circledast \\tilde{x}$, then $y = \\alpha (a \\, \\mbox{.*} \\, x) $ for some scale factor $\\alpha = ???$. \n",
"\n",
"**(b)** Differentiating through convolutions, in class, involved the \"reversal\" permutation $R$:\n",
"$$\n",
"R\\begin{pmatrix} x_0 \\\\ x_1 \\\\ x_2 \\\\ \\vdots \\\\ x_{N-1} \\end{pmatrix}\n",
"=\\begin{pmatrix} x_0 \\\\ x_{N-1} \\\\ x_{N-2} \\\\ \\vdots \\\\ x_1 \\end{pmatrix}\n",
"$$\n",
"What happens when we combine this with a DFT, e.g. to apply the convolution theorem? Show that $FRx = \\bar{F}x$.\n",
"\n",
"(This is illustrated by the code below: `fft(x)` computes $Fx$ and `bfft(x)` computes $\\bar{F}x$ using the [FFTW.jl package](https://github.com/JuliaMath/FFTW.jl).)"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "e1ce6605",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"true"
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"using FFTW\n",
"R(x) = [x[begin]; x[end:-1:begin+1]] # compute R*x\n",
"x = rand(20)\n",
"fft(R(x)) ≈ bfft(x)"
]
},
{
"cell_type": "markdown",
"id": "be151905",
"metadata": {},
"source": [
"## Solutions:\n",
"\n",
"**(a)** There are many ways to approach this problem, one approach is simply to re-trace our steps from class.\n",
"\n",
"In class, we showed that the columns of $F$ are eigenvectors of cyclic convolutions, with the corresponding eigenvalues given by $\\bar{F} a$. Hence we obtained $a \\circledast x = F (\\bar{F}a \\,\\mbox{.*}) F^{-1} x = \\frac{1}{N} F (\\bar{F}a \\, \\mbox{.*} \\, \\bar{F} x) = \\bar{F}^{-1} (\\tilde{a} \\,\\mbox{.*}\\, \\tilde{x})$.\n",
"\n",
"Now, we want to show almost the same thing but we want to transform in the opposite direction. That is, we want to replace $F$ with $F^{-1} = \\bar{F}/N$ and vice versa. If we proceed as in class, we can start by showing that the columns of $\\bar{F}$ are eigenvectors of convolutions. But our proof from class for $F$ only relied on the cyclic property $\\omega_N^N = 1$! The proof is completely unchanged if we replace $\\omega_N$ with $\\overline{\\omega_N}$ (corresponding to replacing $F$ with $\\bar{F}$)!\n",
"\n",
"Hence we can simply quote the results above and replace $F$ with $\\bar{F}$:\n",
"$$\n",
"\\tilde{y} = \\tilde{a} \\circledast \\tilde{x} = \\bar{F} (F\\tilde{a}\\, \\mbox{.*}) \\bar{F}^{-1} \\tilde{x} \n",
"= F^{-1} (F\\tilde{a} \\,\\mbox{.*}\\, F\\tilde{x})\n",
"$$\n",
"Since $\\tilde{x} = \\bar{F} x$, that means $x = \\bar{F}^{-1} \\tilde{x} = F \\tilde{x} / N$, i.e. $F\\tilde{x} = Nx$. Similarly for $a$. Substituting that in, we obtain:\n",
"$$\n",
"\\tilde{y} = N^2 F^{-1} (a \\,\\mbox{.*}\\, x) \\Longleftrightarrow \\boxed{y = N (a \\,\\mbox{.*}\\, x)}.\n",
"$$\n",
"which is the desired result with $\\boxed{\\alpha = N}$.\n",
"\n",
"*Optional*: Since it's easy to get this algebra wrong, let's do a quick check of our result in Julia:"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "d365ac92",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"true"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# brute-force circular convolution\n",
"function conv(a,x)\n",
" N = length(a); N == length(x) || throw(DimensionMismatch())\n",
" return [sum(i -> a[begin+i] * x[begin+mod(j-i, N)], 0:N-1) for j in 0:N-1]\n",
"end\n",
"\n",
"N = 10\n",
"ã = randn(ComplexF64, N)\n",
"x̃ = randn(ComplexF64, N)\n",
"ỹ = conv(ã, x̃)\n",
"a = fft(ã)/N # Fã/N\n",
"x = fft(x̃)/N # Fx̃/N\n",
"y = fft(ỹ)/N # Fỹ/N\n",
"y ≈ N * (a .* x)"
]
},
{
"cell_type": "markdown",
"id": "b61d45ed",
"metadata": {},
"source": [
"Hooray! Math works!\n",
"\n",
"**(b)**\n",
"\n",
"If $y = Rx$, then $y_n = x_{-n}$ (with indices interpreted $\\mod N$ as usual). If $z = FRx = Fy$, then we have:\n",
"$$\n",
"z_m = \\sum_{n=0}^{N-1} \\omega_N^{mn} y_n = \\sum_{n=0}^{N-1} \\omega_N^{mn} x_{-n} = \\sum_{k=0}^{N-1} \\omega_N^{-mk} x_{k} = (\\bar{F} x)_m\n",
"$$\n",
"where we simply performed a change of variables $k = -n$ in the sum. Hence $z = FRx = \\bar{F}x$ as desired."
]
},
{
"cell_type": "markdown",
"id": "fc26407d",
"metadata": {},
"source": [
"## Problem 2 (10+10+10 points)\n",
"\n",
"Think way back to grade school, when you learned to multiply two numbers $a_{N_a-1} \\cdots a_2 a_1 a_0 \\times x_{N_x-1} \\cdots x_2 x_1 x_0$, where $a_k$ and $x_k$ are the **decimal digits** of two numbers with $N_a$ and $N_x$ digits, respectively. (That is, $x_{N_x-1} \\cdots x_2 x_1 x_0 = \\sum_k {10}^k x_k$.)\n",
"\n",
"**(a)** Show that, if you ignore carries (i.e. you allow \"digits\" $> 9$) then the *result* $ y_{N_y-1} \\cdots y_2 y_1 y_0$ is in fact a **convolution** of the digits $y = a \\ast x$, but *not* a cyclic convolution — there is no \"wraparound\". (What is $N_y$?)\n",
"\n",
"**(b)** Show that you can write your previous answer as a *cyclic* convolution simply by \"padding\" $a,x,y$ with zeros. That is, if you add some zeros to each of them (in the right place) to extend them to a common length $N$ (what must $N$ be?), then $\\mbox{pad}(y,N) = \\mbox{pad}(a,N) \\circledast \\mbox{pad}(x,N)$.\n",
"\n",
"**(c)** Write Julia code to compute\n",
"$$\n",
"\\underbrace{94403147635990179114}_a \\times \\underbrace{33855185913241354461}_x \\\\\n",
"= \\underbrace{3196036114011618584561027454963352927554}_y\n",
"$$\n",
"by:\n",
"* Implement the function `pad(x,N)` to zero-pad as in part (b)\n",
"* Implement a function `conv(a,x)` to perform a circular convolution using $Fx =$ `fft(x)` and $\\bar{F}x =$ `bfft(x)`. (You may want to call `round.(Int,real(y))` on the result to discard tiny roundoff errors, since the final values should be integers).\n",
"* Implement a function `carry(y)` to perform the carries: any \"digits\" larger than $10$ should have the tens place \"carried\" to the next digit etc. (You may find the [`divrem` function](https://docs.julialang.org/en/v1/base/math/#Base.divrem) useful to divide by 10 and find a remainder.)\n",
"Check that you get the right answer: `bignum(carry(y)) == 3196036114011618584561027454963352927554`."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "b098b074",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"bignum(a) = 94403147635990179114\n",
"bignum(x) = 33855185913241354461\n",
"bignum(a) * bignum(x) = 3196036114011618584561027454963352927554\n"
]
},
{
"data": {
"text/plain": [
"3196036114011618584561027454963352927554"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"using FFTW\n",
"\n",
"a = [4, 1, 1, 9, 7, 1, 0, 9, 9, 5, 3, 6, 7, 4, 1, 3, 0, 4, 4, 9]\n",
"x = [1, 6, 4, 4, 5, 3, 1, 4, 2, 3, 1, 9, 5, 8, 1, 5, 5, 8, 3, 3]\n",
"\n",
"bignum(x) = evalpoly(BigInt(10), x)\n",
"@show bignum(a)\n",
"@show bignum(x)\n",
"@show bignum(a) * bignum(x)"
]
},
{
"cell_type": "markdown",
"id": "bb0e34b1",
"metadata": {},
"source": [
"## Solutions:\n",
"\n",
"**(a)** Think about the usual \"long multiplication algorithm, omitting carries. To get the 1's place of the output, i.e. the \"digit\" $y_0$, we simply multiply the 1's -places of the inputs: $y_0 = a_0 x_0$. To get the 10's place of the ouptut, i.e. the \"digit\" $y_1$, we multiply the 1's place of $a$ with the 10's place of $x$ and vice versa, and add them, i.e. $y_1 = a_0 x_1 + a_1 x_0$. And so forth, giving a pattern like:\n",
"$$\n",
"y_0 = a_0 x_0 \\\\\n",
"y_1 = a_0 x_1 + a_1 x_0 \\\\\n",
"y_2 = a_0 x_2 + a_1 x_1 + a_2 x_0 \\\\\n",
"\\vdots \\\\\n",
"y_m = \\sum_{n=0}^m a_n x_{m-n}\n",
"$$\n",
"which is exactly in the form of a linear convolution! To make the sums look more uniform, we can equivalently think of each number as having an *infinite* number of digits, but most of the digits are zero (e.g. $x_k = 0$ for $k < 0$ or $k \\ge N_x$, where the former correspond to \"zeros after the decimal point\" and the latter correspond to zeros before the most significant digit). With this convention, we can simply write:\n",
"$$\n",
"y_m = \\sum_{n=-\\infty}^\\infty a_n x_{m-n}\n",
"$$\n",
"\n",
"**(b)** We must pad with zeros to a length $N$ that is big enough that the circular \"wrap around\" of a cyclic convolution does not distort the result — the \"wrapped around\" digits must all be zeros. Equivalently, since a convolution is a sequence of dot product where we \"slide\" $x$ past $a$ (or vice versa), the overlap should be zero before $x$ slides past the boundary. The last possible overlap, corresponds to the most significant digit in the output $y$, which is $y_m$ for $m = N_a + N_x - 2$: the output (not including carries) has $N_a + N_x - 1$ digits.\n",
"In short, we simply need to pad to a length long enough to hold all the digits of the output, i.e. to any $\\boxed{N \\ge N_a + N_x - 1}$.\n",
"\n",
"Be careful in part (c), however! Once we include carries, this can introduce *additional* digits, so the carried result may eventually be a longer length.\n",
"\n",
"**(c)** Our implementation and tests are as follows"
]
},
{
"cell_type": "code",
"execution_count": 22,
"id": "bdcd7ae6",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"conv (generic function with 2 methods)"
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# pad x with N-length(x) zeros of the same type\n",
"pad(x, N) = [x; zeros(eltype(x), N - length(x))]\n",
" \n",
"# there are many ways to write this,\n",
"# since from problem 1 the convolution theorem takes many forms\n",
"conv(x,y) = ifft(fft(x) .* fft(y))\n",
"# or: conv(x,y) = bfft(fft(x) .* fft(y)) / length(x)\n",
"# or: conv(x,y) = fft(bfft(x) .* bfft(y)) / length(x)\n",
"\n",
"# for integer vectors, we expect an integer result, but\n",
"# FFTs will give us slightly complex/noninteger results due\n",
"# to roundoff errors. Let's correct this as suggested:\n",
"conv(x::AbstractVector{<:Integer}, y::AbstractVector{<:Integer}) =\n",
" round.(promote_type(eltype(x),eltype(y)), real.(conv(float.(x), float.(y))))"
]
},
{
"cell_type": "markdown",
"id": "792e6fbb",
"metadata": {},
"source": [
"Before we do any carries, let's check that our convolution and zero-padding are correct — evaluating the resulting \"digits\" as an integer using the `bignum` function from above should still work even if we have digits $> 10$."
]
},
{
"cell_type": "code",
"execution_count": 23,
"id": "8f411509",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"bignum(y) == bignum(a) * bignum(x) = true\n",
"bignum(y) == bignum(a) * bignum(x) = true\n",
"bignum(y) == bignum(a) * bignum(x) = true\n",
"bignum(y) == bignum(a) * bignum(x) = true\n",
"bignum(y) == bignum(a) * bignum(x) = true\n",
"bignum(y) == bignum(a) * bignum(x) = true\n"
]
}
],
"source": [
"Na = length(a)\n",
"Nx = length(x)\n",
"\n",
"# any length ≥ Na + Nx - 1 should work. let's try a few:\n",
"for N in (Na + Nx - 1) .+ (0:5)\n",
" y = conv(pad(a, N), pad(x, N))\n",
" @show bignum(y) == bignum(a) * bignum(x)\n",
"end"
]
},
{
"cell_type": "markdown",
"id": "07051556",
"metadata": {},
"source": [
"Now, let's implement the carry function. The key step, for a digit $y_k = {}$ `y[k]`, is that we want to find the quotient $q$ and the remainder $r$ after dividing by 10, which can be accomplished in Julia by:\n",
"```jl\n",
"q, r = divrem(y[k], 10)\n",
"```\n",
"Then, the remainder is the *new k-th digit* ($y_k \\to r$) and the quotient *gets carried* (added to $y_{k+1}$). Note, however, that `y[k]` in this expression *includes any previous carries*."
]
},
{
"cell_type": "code",
"execution_count": 29,
"id": "53f03b79",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"carry (generic function with 1 method)"
]
},
"execution_count": 29,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"function carry(y)\n",
" c = similar(y, 0) # result array of the same type as y, of zero length\n",
" push!(c, 0) # let's start off with a result of zero.\n",
" for k = 1:length(y)\n",
" q, r = divrem(y[k]+c[k], 10)\n",
" c[k] = r # remainder = new digit (+ any previous carry)\n",
" push!(c, q) # push quotient as the next digit\n",
" end\n",
" return c\n",
"end"
]
},
{
"cell_type": "markdown",
"id": "0dabf98c",
"metadata": {},
"source": [
"Let's try it:"
]
},
{
"cell_type": "code",
"execution_count": 32,
"id": "1a7587ea",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"y = [4, 5, 5, 7, 2, 9, 2, 5, 3, 3, 6, 9, 4, 5, 4, 7, 2, 0, 1, 6, 5, 4, 8, 5, 8, 1, 6, 1, 1, 0, 4, 1, 1, 6, 3, 0, 6, 9, 1, 3]\n",
"y == digits(bignum(a) * bignum(x)) = true\n",
"bignum(y) == bignum(a) * bignum(x) = true\n"
]
},
{
"data": {
"text/plain": [
"true"
]
},
"execution_count": 32,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"N = Na + Nx - 1\n",
"y = carry(conv(pad(a, N), pad(x, N)))\n",
"\n",
"@show y\n",
"@show y == digits(bignum(a) * bignum(x))\n",
"@show bignum(y) == bignum(a) * bignum(x)"
]
},
{
"cell_type": "markdown",
"id": "5d6e8273",
"metadata": {},
"source": [
"Hooray, it's the correct digits $319603 \\ldots 927554$ (starting with the least signifcant!) and passes all of our tests!\n",
"\n",
"Note that the carried result is one digit longer than $N$ digits:"
]
},
{
"cell_type": "code",
"execution_count": 33,
"id": "d4dd3e88",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"N = 39\n",
"length(y) = 40\n"
]
},
{
"data": {
"text/plain": [
"40"
]
},
"execution_count": 33,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"@show N\n",
"@show length(y)"
]
}
],
"metadata": {
"@webio": {
"lastCommId": null,
"lastKernelId": null
},
"kernelspec": {
"display_name": "Julia 1.8.3",
"language": "julia",
"name": "julia-1.8"
},
"language_info": {
"file_extension": ".jl",
"mimetype": "application/julia",
"name": "julia",
"version": "1.8.3"
}
},
"nbformat": 4,
"nbformat_minor": 5
}