PDF of Eric’s handwritten notes are here.
Given two polynomials we would like to compute their product polynomial. Thus given
and , we want to compute where and for .
A natural way to represent a polynomial is by its coefficients. Hence we will assume each of the polynomials and is given by the vector of coefficients and our goal is to determine the vector of coefficients of the product polynomial . Here is the general problem formulation:
Input: Vectors and .
Output: The vector which are the coefficients for their product polynomial .
The vector is known as the convolution of and and the convolution operation is denoted as: .
We can compute each in time and thus we can compute all coefficients in time.
We can do better via Fast Fourier Transform (FFT), an elegant divide and conquer algorithm which solves this problem in time.
We first look at a few applications of the convolution algorithm and then we dive into the algorithm.
Given two binary strings: a long string and a pattern , find all occurrences of in . In other words, find all positions where: .
The strings and are binary strings so say the alphabet is . It’ll be useful to represent the strings and as vectors with components in by replacing by and by .
Thus let where if and if . Similarly, define .
The algorithm is motivated by the following observation. Take two vectors and where . Look at their dot product . If then whereas if then . Thus our idea is that to check if there is a match of with we would like to take the dot product of the corresponding strings. Consider the convolution of with . Then the entry which is the dot product of the vector with , so it is almost what we want. We instead need to first reverse the vector . Thus let denote reversed.
Let denote the convolution of and .
Suppose we have a match starting at index , that is . Then we also have that
So in the new vector setting, a match corresponds to a coefficient in the convolution equal to .
Given a vector , the goal is to ‘smooth out’ the vector, for example to reduce the noise or account for outliers. Here are some natural, common filters.
We replace the data point by the average of itself and its neighboring points within a window of size for a parameter . Thus replace by .
Therefore, the filter is the vector of size . Observe that the vector .
We could instead compute a weighted average where closer points have larger weights. A mean filter corresponds to having equal weights and we could instead use a filter with weights corresponding to a discrete Gaussian:
where is a normalizing constant so that the weights sum to 1.
Both of the above filters can be computed by taking the convolution for the appropriate .
Now let’s go back to our original problem of multiplying two polynomials.
There are two natural ways of representing a polynomial :
- Coefficient representation: A list of coefficients
- Point representation: Any distinct points
Note that the above representations are equivalent because of the following lemma.
Lemma: A degree polynomial is uniquely characterized by its values at any points.
The case d=2 is the familiar case of a line: a line can be represented by its slope and y-intercept or by any 2 points on the line.
Observe that it is easier to multiply polynomials in point representation since:
. Thus given two polynomials in point representation (with the same set of 2d points) then we can multiply the polynomials in O(d) time. So the key to multiplying polynomials will be to convert the polynomials between their coefficient representation and their point representation. This is exactly what FFT does in time for a carefully selected set of points. We now can sketch the idea for the polynomial multiplication algorithm and then we’ll delve into the design of the FFT algorithm.
High Level Idea for Polynomial Multiplication
- Convert from coefficient representation to point representation (at 2d points) in time using FFT.
- Evaluate the point representation of in time using pointwise multiplication.
- Convert from point representation to coefficient representation in time using Inverse FFT.
FFT – Preliminary Idea
Given , we will evaluate at points . For convenience, assume is a power of .
The key idea is to construct 2d points with the following property: the first d points are the opposite of the second d points. In other words, for all . Why do we want this property? Look at the d terms when we evaluate for and also for . Notice that the even terms are the same for both while the odd terms are opposites. Thus it makes sense to separately consider the even and odd terms of the polynomial A(x).
Consider the polynomials represented by even and odd indices of :
Observe the relation:
Thus for points such that we have that:
Thus, to evaluate at these points, we need to evaluate and at the points and combine them in time. If we choose points which satisfy the relation at all levels of recursion, it would lead to the recurrence which simplifies to . However at the next level of recursion we will have the points: . We want these points to have the property and thus we want . But and thus both points are positive so they cannot be opposites. Therefore to obtain the property we need to consider complex points. The points will be carefully selected complex values.
Complex Numbers Review
We can represent a complex number as a point in the complex plane or in polar coordinates where:
Euler’s Identity gives a convenient form for a complex number by noting that and thus .
A benefit of the representation in polar coordinates is that they are convenient for multiplication:
A few convenient basic facts:
Multiplication by :
We will use the nth complex roots of unity:
A complex number that is a solution to the equation is referred to as an root of unity. If we work in polar coordinates, then we can obtain a very nice classification of all roots of unity.
We are looking for such that . This implies that and for some . We denote:
and then all of the roots of unity are
- They satisfy the property when is even i.e.
To see why, observe that and
- The squares of the roots of unity are the roots of unity when is a power of . To see why, observe that
Conclusion: we can use the (2n)th roots of unity as our 2n points.
To compute the polynomial of degree at the (2n)th complex roots of unity:
we recursively compute and which are degree at the nth complex roots of unity, and then we spend combining the values of and to get at the (2n)th complex roots of unity.
We’ll recap the algorithm in detail next class.