Discrete Fourier Transform
The discrete Fourier transform (DFT) is a method for converting a sequence of \(N\) complex numbers \( x_0,x_1,\ldots,x_{N-1}\) to a new sequence of \(N\) complex numbers,
\[ X_k = \sum_{n=0}^{N-1} x_n e^{-2\pi i kn/N}, \]
for \( 0 \le k \le N-1.\)
The \(x_i\) are thought of as the values of a function, or signal, at equally spaced times \(t=0,1,\ldots,N-1.\) The output \(X_k\) is a complex number which encodes the amplitude and phase of a sinusoidal wave with frequency \(\frac kN\) cycles per time unit. \(\big(\)This comes from Euler's formula: \( e^{2\pi i kn/N} = \cos(2\pi kn/N) + i\sin(2\pi kn/N).\big)\) The effect of computing the \(X_k\) is to find the coefficients of an approximation of the signal by a linear combination of such waves. Since each wave has an integer number of cycles per \(N\) time units, the approximation will be periodic with period \(N.\) This approximation is given by the inverse Fourier transform
\[ x_n = \frac1{N} \sum_{k=0}^{N-1} X_k e^{2\pi ikn/N}. \]
The DFT is useful in many applications, including the simple signal spectral analysis outlined above. Knowing how a signal can be expressed as a combination of waves allows for manipulation of that signal and comparisons of different signals:
Digital files (jpg, mp3, etc.) can be shrunk by eliminating contributions from the least important waves in the combination.
Different sound files can be compared by comparing the coefficients \(X_k\) of the DFT.
Radio waves can be filtered to avoid "noise" and listen to the important components of the signal.
Other applications of the DFT arise because it can be computed very efficiently by the fast Fourier transform (FFT) algorithm. For example, the DFT is used in state-of-the-art algorithms for multiplying polynomials and large integers together; instead of working with polynomial multiplication directly, it turns out to be faster to compute the DFT of the polynomial functions and convert the problem of multiplying polynomials to an analogous problem involving their DFTs.
Contents
Basic Examples
Let \( x_0 = 1,\) \( x_1 = x_2 = \cdots =x_{N-1} = 0.\) Then the DFT of the \(x_n\) is \[ X_k = \sum_{n=0}^{N-1} x_n e^{-2\pi i k n/N} = 1. \] So this gives an expression of \( x_n\) as \[ x_n = \frac1{N} \sum_{k=0}^{N-1} e^{2\pi i kn /N}. \] Check: When \( n=0,\) this returns \(x_0 = \frac1{N}\cdot N = 1,\) and when \(n \ne 0\) this returns \[ \begin{align} x_n &= \frac1{N} \sum_{k=0}^{N-1} \big(e^{2\pi in/N}\big)^k \\\\ &= \frac1{N}\frac{\big(e^{2\pi in/N}\big)^N-1}{e^{2\pi in/N}-1} \\\\ &= \frac1{N}\frac{e^{2\pi i n} - 1}{e^{2\pi in/N}-1} \\\\ &= 0. \end{align} \]
So the DFT gives a breakdown of a "spike" into a sum of waves (equally weighted in this case), which all peak at \(t=0,\) but interfere with each other and cancel out perfectly at other integer time values \( < N.\)
If the spike occurs at a different time, the coefficients change:
Find the DFT of \((x_0,x_1,x_2,x_3) = (0,1,0,0).\)
In this case, \[ X_k = \sum_{n=0}^3 x_n e^{-2\pi i kn/4} = e^{-2\pi i k/4}. \] So \( X_0 = 1, X_1 = -i, X_2 = -1, X_3 = i.\) The answer is \( (1,-i,-1,i).\)
Proceeding as in the previous example, this gives the expansion \[ x_n = 1 - i e^{2\pi i n/4} - e^{4\pi i n/4} + i e^{6\pi i n/4}. \] Converting the complex coefficients to complex exponentials gives \[ \begin{align} x_n &= 1 + e^{6\pi i/4} e^{2\pi i n/4} + e^{4\pi i/4} e^{4\pi i n/4} + e^{2\pi i/4} e^{6 \pi i n/4} \\ &= 1 + e^{2\pi i(n+3)/4} + e^{2\pi i (2n+2)/4} + e^{2\pi i (3n+1)/4}. \end{align} \] This is an example of phase shifting occurring in the sum. Taking the real parts of both sides gives a sum of cosine waves: \[ x_n = 1 + \cos(2\pi n/4 + 3\pi/2) + \cos(4\pi n/4 + \pi) + \cos(6\pi n/4 + \pi/2), \] where the addition of \(3\pi/2, \pi, \pi/2\) has the effect of shifting the waves forward by \( 270^\circ, 180^\circ, 90^\circ,\) respectively. \(_\square\)
Orthogonality and the Inverse Transform
Why is the formula for the inverse transform true? Substitute the formula for \( X_k\) into the formula for \( x_n\): \[ \begin{align} \sum_{k=0}^{N-1} X_k e^{-2\pi ikn/N} &= \frac1{N} \sum_{k=0}^{N-1} \sum_{m=0}^{N-1} x_m e^{2\pi i k m/N} e^{-2\pi ikn/N} \\ &= \frac1{N} \sum_{k=0}^{N-1} \sum_{m=0}^{N-1} x_m e^{2\pi i k(m-n)/N} \\ &= \frac1{N} \sum_{m=0}^{N-1} x_m \sum_{k=0}^{N-1} e^{2\pi i k(m-n)/N}. \end{align} \] When \(m \ne n,\) the inner sum is \(0\) by the formula for a geometric series (as in the first example in the previous section). When \(m=n,\) the inner sum is \( N.\) So the entire sum is \( \frac1{N} \cdot x_n \cdot N = x_n,\) as desired.
Another way to think of this argument is that it is a consequence of orthogonality with respect to the complex dot product. Consider the complex vectors \[ v_k = \left( 1, \omega_N^k, \omega_N^{2k}, \ldots, \omega_N^{k(N-1)} \right), \] where \(\omega_N = e^{2\pi i/N}.\) It is not hard to check, by an argument similar to the one given above, that these complex vectors are orthogonal with respect to the complex dot product (or inner product): \( v_k \cdot v_\ell = N\) if \(k=\ell\), otherwise \(0.\) \(\big(\)Remember that the complex dot product of two vectors \( (x_i)\) and \( (y_i)\) is \(\sum x_i{\overline{y_i}},\) where the bar denotes complex conjugation.\(\big)\) There are \(N\) of these vectors, so they form an orthogonal basis of \( {\mathbb C}^N.\)
The DFT formula for \( X_k \) is simply that \(X_k = x \cdot v_k,\) where \(x\) is the vector \( (x_0,x_1,\ldots,x_{N-1}).\) The inverse formula recovers \(x\) in terms of \(X\), by writing \(x\) using the standard formula for expressing any vector as a linear combination of orthogonal basis vectors: \[ x = \sum_{k=0}^{N-1} \frac{x \cdot v_k}{v_k \cdot v_k} v_k = \frac1{N} \sum_{k=0}^{N-1} X_k v_k. \]
Convolution and Polynomial Multiplication
As a sample application of the DFT, consider polynomial multiplication.
Given two polynomials \( f(x) = a_nx^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0\) and \(g(x) = b_mx^m + b_{m-1}x^{m-1} + \cdots + b_1 x + b_0,\) call the coefficient vectors \( (a_0,a_1,\cdots)\) and \( (b_0,b_1,\cdots)\) \(\bf a\) and \( \bf b\), respectively. The product \( f(x)g(x)\) will have degree \(n+m\) with the coefficients given by the convolution vector \( {\bf a} * {\bf b},\) with
\[ ({\bf a} * {\bf b})_k = \sum_{c=0}^k a_c b_{k-c}. \]
It is convenient to extend the vectors \( \bf a\) and \( \bf b\) to a common space \( {\mathbb C}^N\) by padding them with extra \(0\)s: take a value of \(N\) larger than \(m+n,\) and let \( a_x = b_y = 0\) if \(x\) or \(y\) is larger than \(n\) or \(m\), respectively. (For FFT applications it is often best to let \(N\) be a power of 2.) Then the beautiful fact about convolution and the DFT is
The DFT of \( {\bf a} * {\bf b} \) is the componentwise product of the DFT of \( \bf a \) and the DFT of \( \bf b\).
The proof of this fact is straightforward and can be found in most standard references.
So multiplying \(f(x)\) and \(g(x)\) can be accomplished by padding the coefficient vectors, computing their DFTs, multiplying the DFTs, and applying the inverse DFT to the result.
Find \( (1+x)\big(1+x+x^2\big)\) using the DFT.
Pad the coordinate vectors to \( (1,1,0,0) \) and \( (1,1,1,0).\) The DFT of \((1,1,0,0)\) is \( (2,1-i, 0, 1+i) \) \(\big(\)this follows from the two examples above, since the DFT is additive and we know the DFTs of \( (1,0,0,0) \) and \( (0,1,0,0)\big),\) and the DFT of \( (1,1,1,0) \) is \( (3,-i,1,i).\) The coordinatewise product of these two is \( (6,-1-i,0,-1+i),\) and the inverse DFT of this vector is \( (1,2,2,1).\) So the product is \( 1+2x+2x^2+x^3.\) \(_\square\)
This may seem like a roundabout way to accomplish a simple polynomial multiplication, but in fact it is quite efficient due to the existence of a fast Fourier transform (FFT). The point is that a normal polynomial multiplication requires \( O(N^2)\) multiplications of integers, while the coordinatewise multiplication in this algorithm requires only \( O(N)\) multiplications. The FFT algorithm is \( O(N \log N),\) so the polynomial multiplication algorithm which uses the FFT is \( O(N \log N)\) as well.
Multiplying large integers is done in the same way: an integer like \( 1408\) can be viewed as the evaluation of the polynomial \( 8 + 4x^2+x^3\) at \(x=10,\) so multiplying integers can be reduced to multiplying polynomials (and then evaluating at 10, or whatever base is most convenient).