Fast Polynomial Multiplication (from Chapter 2.6)

We all know how to multiply polynomials by multiplying all the cross terms. Given a polynomial P of degree d (the largest power is d), and a polynomial Q of degree e (the largest power is e), the product, P Q, will have degree d+e. To compute the product P Q, there will be up to (d+1) (e+1) cross terms to multiply.

In general, if two polynomials are both of degree d, then there will be (d+1)² = O(d²) cross terms.
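
As a baseline, the cross-term method can be sketched in a few lines of Python. This is only an illustration: the name multiply_naive and the choice to store a polynomial as a list of coefficients, lowest degree first, are assumptions made here and do not come from the textbook.

```python
def multiply_naive(p, q):
    """Multiply two coefficient lists (lowest degree first) term by term."""
    result = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            # the cross term (a x^i)(b x^j) adds a*b to the coefficient of x^(i+j)
            result[i + j] += a * b
    return result

# (1 + 2x)(3 + 4x) = 3 + 10x + 8x^2
print(multiply_naive([1, 2], [3, 4]))  # -> [3, 10, 8]
```

The two nested loops visit every cross term once, which is where the (d+1)² count comes from.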

But using the idea of the fast Fourier transform, one can actually reduce the number of multiplications from O(d²) to O(d log d), and so multiply the polynomials in O(d log d) time. Let A(x) and B(x) be polynomials of degree d. For example, if d=3, we might have A(x) = 4 + 3x + 5x² + 6x³.

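In outline, the idea for computing the product C(x) = A(x) B(x) is given below. The d+1 points must be enough to determine C(x), so treat d as a bound on the degree of C(x), and pad A(x) and B(x) with zero coefficients up to degree d.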

Select Points: Choose d+1 points x₀,…, x_d/2 and x_-1,…, x_-d/2, where x_-i = -x_i.
Evaluation: Compute A(x₀),…,A(x_d/2) and A(-x₁),…,A(-x_d/2). Do the same for B(x).
Multiplication: Compute C(x) = A(x) B(x) at each of the points. So C(x_i) = A(x_i) B(x_i) and C(x_-i) = A(x_-i) B(x_-i). Let y_i and y_-i be the constants such that C(x_i) = y_i and C(x_-i) = y_-i.
Interpolation: Compute the unknown coefficients of C(x) using C(x₀) = y₀, C(x₁) = y₁,…, C(x_d/2) = y_d/2 and C(x_-1) = y_-1,…, C(x_-d/2) = y_-d/2.

Step 1 (Select Points):

We choose d+1 points x₀, x₁,…,x_d at which we want to evaluate A(x), where

x_d/2+i = -x_i for 1≤i≤d/2 (this is the point written x_-i in the outline).

Step 2 (Evaluation):

Let's see how to do step 2 the fast way. The polynomial A(x) can be split into even and odd terms (for d even):
A(x) = (c₀ + c₂x² +…+ c_d x^d) + (c₁x + c₃x³ +…+ c_{d-1}x^{d-1})
We can write this as:
A(x) = A_even(x) + A_odd(x)
where

A_even(x) = c₀ + c₂x² +…+ c_d x^d
A_odd(x) = c₁x + c₃x³ +…+ c_{d-1}x^{d-1}

(Note that our A_even(x) and A_odd(x) are different from the textbook's A_e(x) and A_o(x). Specifically, A_e(x²) = A_even(x) and xA_o(x²) = A_odd(x).)

Next, we compute A(x_i) = A_even(x_i) + A_odd(x_i)
and A(-x_i) = A_even(-x_i) + A_odd(-x_i) = A_even(x_i) - A_odd(x_i)

Recall that we chose the d+1 points x₀, x₁,…,x_d at which we want to evaluate A(x), where

x_d/2+i = -x_i.

So, we do step 2 by evaluating A_even(x_i) and A_odd(x_i) for x₀, x₁,…, x_d/2. After this, we get the evaluations at x_d/2+i for free:

A_even(x_d/2+i) = A_even(-x_i) = A_even(x_i)
A_odd(x_d/2+i) = A_odd(-x_i) = -A_odd(x_i)

Then, we have d+1 evaluations of A(x), since A(x) = A_even(x) + A_odd(x).

Evaluating the d+1 terms of A() at all d+1 points would have required evaluating (d+1) (d+1) terms. By doing it this alternative way, we only need to evaluate about (d+1) (1+d/2) terms.

Do likewise for another polynomial B().
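
The even/odd trick of Step 2 can be sketched as follows. The names eval_terms and evaluate_pairs are hypothetical, and the sample polynomial and points are chosen here only for illustration. Each A_even(x_i) and A_odd(x_i) is computed once and then reused for both x_i and -x_i.

```python
def eval_terms(coeffs, x, start):
    """Sum coeffs[k] * x**k over k = start, start+2, start+4, ..."""
    return sum(coeffs[k] * x**k for k in range(start, len(coeffs), 2))

def evaluate_pairs(coeffs, xs):
    """Return {x: A(x), -x: A(-x)} for each x in xs."""
    values = {}
    for x in xs:
        even = eval_terms(coeffs, x, 0)  # A_even(x) equals A_even(-x)
        odd = eval_terms(coeffs, x, 1)   # A_odd(x) equals -A_odd(-x)
        values[x] = even + odd           # A(x)
        values[-x] = even - odd          # A(-x), with no further evaluation
    return values

a = [4, 3, 5, 6, 2]                      # 4 + 3x + 5x^2 + 6x^3 + 2x^4, so d = 4
vals = evaluate_pairs(a, [0, 1, 2])      # covers the d+1 points 0, 1, 2, -1, -2
print(vals[2], vals[-2])                 # -> 110 2
```

Only three evaluations each of A_even and A_odd were needed to obtain all five values.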



Step 3 (Multiplication):

Multiply at the d+1 points x_i.


  C(x_i) = A(x_i) B(x_i)


When we are done, we have computed constants y_i for which:

C(x_i) = y_i

This requires d+1 multiplications. Our goal is to find a fast multiplication in time O(d log d). This step is O(d) and so it's already fast enough.

Step 4 (Interpolation):

In general, interpolation requires a lot of steps. Given the unknown polynomial C(x) for which we only know:
C(x₀) = y₀,…, C(x_d) = y_d
we wish to find c₀,…, c_d such that
C(x) = c₀ + c₁x + c₂x² +…+ c_dx^d

As you would expect, one plugs in and solves. The c_i are the unknowns that we have to solve for. After plugging into C(x) = y for x₀, x₁,…, x_d, we have d+1 linear equations:

c₀ + c₁(x₀) + c₂(x₀)² +…+ c_d(x₀)^d = y₀
c₀ + c₁(x₁) + c₂(x₁)² +…+ c_d(x₁)^d = y₁
...

This means solving the linear simultaneous equations.
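
To make "plug in and solve" concrete, here is a minimal sketch that solves the d+1 equations by Gauss-Jordan elimination over exact fractions. The function names are hypothetical, and this is the slow, general-purpose method (roughly cubic in d), shown only as the baseline that the rest of these notes improves on.

```python
from fractions import Fraction

def evaluate(coeffs, x):
    """Value of the polynomial with the given coefficients (lowest degree first)."""
    return sum(c * x**k for k, c in enumerate(coeffs))

def interpolate(xs, ys):
    """Solve c_0 + c_1 x_j + ... + c_d x_j^d = y_j for the coefficients c_0..c_d."""
    n = len(xs)
    # augmented matrix: row j is [1, x_j, x_j^2, ..., x_j^d | y_j]
    rows = [[Fraction(x) ** k for k in range(n)] + [Fraction(y)]
            for x, y in zip(xs, ys)]
    for col in range(n):
        # a non-zero pivot exists whenever the points x_j are distinct
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for r in range(n):
            if r != col:
                factor = rows[r][col]
                rows[r] = [v - factor * w for v, w in zip(rows[r], rows[col])]
    return [row[n] for row in rows]

# evaluate, multiply pointwise, then interpolate: (1 + 2x)(3 + 4x)
xs = [0, 1, -1]
ys = [evaluate([1, 2], x) * evaluate([3, 4], x) for x in xs]
print([int(c) for c in interpolate(xs, ys)])  # -> [3, 10, 8]
```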

The advantage of using x₁,…, x_d/2 and x_-1,…, x_-d/2 is that we can replace the equations to solve:
C(x_i)=y_i and C(-x_i)=y_-i
by:
C(x_i) + C(-x_i) = y_i + y_-i
C(x_i) - C(-x_i) = y_i - y_-i
These last two equations are the same as:

2 C_even(x_i) = y_i + y_-i
2 C_odd(x_i) = y_i - y_-i

Now, we again plug in, but this time we have only half as many unknowns, c_i, in each equation:
2c₀ + 2c₂(x_i)² +…+ 2c_d(x_i)^d = y_i + y_-i
2c₁x_i + 2c₃(x_i)³ +…+ 2c_{d-1}(x_i)^{d-1} = y_i - y_-i

So, we still have d+1 linear equations to solve, but each linear equation now has only half as many terms.
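
A small numeric check of the sum and difference identities may help. The helper names below are hypothetical, and the polynomial C(x) = 3 + 10x + 8x² is just a sample chosen for this sketch.

```python
def evaluate(coeffs, x):
    """C(x) for a coefficient list with the lowest degree first."""
    return sum(c * x**k for k, c in enumerate(coeffs))

def split_values(y_plus, y_minus):
    """From y_i = C(x_i) and y_-i = C(-x_i), return (C_even(x_i), C_odd(x_i))."""
    return (y_plus + y_minus) / 2, (y_plus - y_minus) / 2

c = [3, 10, 8]   # C(x) = 3 + 10x + 8x^2
x = 2
even, odd = split_values(evaluate(c, x), evaluate(c, -x))
# even should match c0 + c2 x^2 = 3 + 8*4, and odd should match c1 x = 10*2
print(even, odd)
```

The even part involves only c₀ and c₂, and the odd part only c₁, which is the halving of unknowns described above.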

Generalizing to four polynomials instead of A_even() and A_odd():

We have seen that the polynomial A_even(x) has only half as many terms. So, we win by breaking up A(x) = A_even(x) + A_odd(x). How can we extend this further?

The answer is to go to complex numbers. Recall that the imaginary number i satisfies i² = -1. Some simple algebra then shows that i³ = -i and i⁴ = 1. (Note that we will also continue to use i for the subscript index. It will be clear from context whether we mean i the imaginary number, or i the subscript index.)

So, instead of combining A(x) and A(-x) into A(x) + A(-x) and A(x) - A(-x), let's use the four values P(x), P(ix), P(i²x), P(i³x) (or equivalently P(x), P(ix), P(-x), P(-ix)) of a polynomial P(). From these, we won't construct 2P_even(x) = P(x) + P(-x) and 2P_odd(x) = P(x) - P(-x), but four polynomials based on four linear combinations. The table below tries several linear combinations on the monomials 1, x, x², x³ and x⁴, so that we can guess the right ones.

	`P(x)=1`	`P(x)=x`	`P(x)=x²`	`P(x)=x³`	`P(x)=x⁴`	Short Name for Expression
`P(x) + P(ix) + P(-x) + P(-ix)`:	4	0	0	0	4x⁴	4P_{0 mod 4}(x)
`P(x) - P(ix) + P(-x) - P(-ix)`:	0	0	4x²	0	0	4P_{2 mod 4}(x)
`P(x) + P(ix) - P(-x) - P(-ix)`:	0	(2+2i)x	0	(2-2i)x³	0
`P(x) - P(ix) - P(-x) + P(-ix)`:	0	(2-2i)x	0	(2+2i)x³	0
`P(x) + iP(ix) - P(-x) - iP(-ix)`:	0	0	0	4x³	0	4P_{3 mod 4}(x)
`P(x) - iP(ix) - P(-x) + iP(-ix)`:	0	4x	0	0	0	4P_{1 mod 4}(x)

The good polynomials are:

4P_{0 mod 4}(x) = P(x) + P(ix) + P(-x) + P(-ix)
4P_{1 mod 4}(x) = P(x) - iP(ix) - P(-x) + iP(-ix)
4P_{2 mod 4}(x) = P(x) - P(ix) + P(-x) - P(-ix)
4P_{3 mod 4}(x) = P(x) + iP(ix) - P(-x) - iP(-ix)

The subscript names were chosen to indicate which terms are non-zero. For example P_{1 mod 4}() has terms identical to P() for those terms of degree 1 mod 4. The other terms of P_{1 mod 4}() are all zero. These are the polynomials for which 3/4 of the terms are zero. So, these polynomials can be evaluated in 1/4 of the time.

Note that

P(x) = P_{0 mod 4}(x) + P_{1 mod 4}(x) + P_{2 mod 4}(x) + P_{3 mod 4}(x)
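
The four "good" combinations can be checked numerically with Python's built-in complex numbers, where 1j plays the role of the imaginary number i. This is a sketch with hypothetical names (evaluate, residue_parts) and a sample polynomial chosen here.

```python
def evaluate(coeffs, x):
    """P(x) for a coefficient list with the lowest degree first."""
    return sum(c * x**k for k, c in enumerate(coeffs))

def residue_parts(coeffs, x):
    """Return [P_{0 mod 4}(x), P_{1 mod 4}(x), P_{2 mod 4}(x), P_{3 mod 4}(x)]."""
    # the four values P(x), P(ix), P(-x), P(-ix)
    p = [evaluate(coeffs, u * x) for u in (1, 1j, -1, -1j)]
    return [
        (p[0] + p[1] + p[2] + p[3]) / 4,
        (p[0] - 1j * p[1] - p[2] + 1j * p[3]) / 4,
        (p[0] - p[1] + p[2] - p[3]) / 4,
        (p[0] + 1j * p[1] - p[2] - 1j * p[3]) / 4,
    ]

coeffs = [1, 2, 3, 4, 5]   # P(x) = 1 + 2x + 3x^2 + 4x^3 + 5x^4
parts = residue_parts(coeffs, 2)
# expected: 1 + 5*16 = 81, 2*2 = 4, 3*4 = 12, 4*8 = 32 (as complex numbers)
print(parts)
```

Each entry picks out exactly the terms whose degree has the matching remainder mod 4, and the four entries add up to P(2) = 129.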

If we're going to make this work, we need an efficient way to convert between P(x), P(ix), P(-x), P(-ix) and the four polynomials above. This is the essence of the Fast Fourier Transform in the text. It allows you to convert from an ordinary polynomial P() into the four transformed polynomials above. (This is analogous to the previous case in which we transformed into A_even() and A_odd().) Using the transformed polynomials, we can evaluate (Step 2) fast. Step 3 is always fast. And we can interpolate (Step 4) fast.

Step 1 (Select Points):

We choose d+1 points x₀, x₁,…,x_d at which we want to evaluate P(x), where d is a multiple of 4 and

x_d/4+i = i x_i for 1≤i≤d/4
x_2d/4+i = - x_i for 1≤i≤d/4
x_3d/4+i = -i x_i for 1≤i≤d/4

Note that the coefficients i, -1, and -i are the powers of the imaginary number i.

Step 2 (Evaluation):

We can write P(x) as:
P(x) = P_{0 mod 4}(x) + P_{1 mod 4}(x) + P_{2 mod 4}(x) + P_{3 mod 4}(x)
where

P_{0 mod 4}(x) = c₀ + c₄x⁴ + …
P_{1 mod 4}(x) = c₁x + c₅x⁵ + …
...

So, we do step 2 by evaluating P_{0 mod 4}(x_i), P_{1 mod 4}(x_i), P_{2 mod 4}(x_i) and P_{3 mod 4}(x_i), for x₀, x₁,…, x_d/4. After this, we get the evaluations of P_{0 mod 4}(x_j) with j>d/4 for free:

P_{0 mod 4}(x_d/4+i) = P_{0 mod 4}(ix_i) = P_{0 mod 4}(x_i)
P_{0 mod 4}(x_2d/4+i) = P_{0 mod 4}(-x_i) = P_{0 mod 4}(x_i)
P_{0 mod 4}(x_3d/4+i) = P_{0 mod 4}(-ix_i) = P_{0 mod 4}(x_i)

and evaluations of P_{1 mod 4}(x_j) with j>d/4 for free:

P_{1 mod 4}(x_d/4+i) = P_{1 mod 4}(ix_i) = i P_{1 mod 4}(x_i)
P_{1 mod 4}(x_2d/4+i) = P_{1 mod 4}(-x_i) = - P_{1 mod 4}(x_i)
P_{1 mod 4}(x_3d/4+i) = P_{1 mod 4}(-ix_i) = -i P_{1 mod 4}(x_i)

and evaluations of P_{2 mod 4}(x_j) with j>d/4 for free:

P_{2 mod 4}(x_d/4+i) = P_{2 mod 4}(ix_i) = - P_{2 mod 4}(x_i)
P_{2 mod 4}(x_2d/4+i) = P_{2 mod 4}(-x_i) = P_{2 mod 4}(x_i)
P_{2 mod 4}(x_3d/4+i) = P_{2 mod 4}(-ix_i) = - P_{2 mod 4}(x_i)

and evaluations of P_{3 mod 4}(x_j) with j>d/4 for free:

P_{3 mod 4}(x_d/4+i) = P_{3 mod 4}(ix_i) = -i P_{3 mod 4}(x_i)
P_{3 mod 4}(x_2d/4+i) = P_{3 mod 4}(-x_i) = - P_{3 mod 4}(x_i)
P_{3 mod 4}(x_3d/4+i) = P_{3 mod 4}(-ix_i) = i P_{3 mod 4}(x_i)

Then P(x) is computed as:

P(x) = P_{0 mod 4}(x) + P_{1 mod 4}(x) + P_{2 mod 4}(x) + P_{3 mod 4}(x)

P_{0 mod 4}(), P_{1 mod 4}(), P_{2 mod 4}() and P_{3 mod 4}() can each be evaluated fast, since each has only 1/4 as many terms. Each of the four functions is applied to each of x₀, x₁,…, x_d/4. So, this requires evaluating about (d+1) (1+d/4) terms, compared to (d+1) (d+1) terms for direct evaluation.

Do likewise for another polynomial, Q(x).
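
The "for free" evaluations of Step 2 can be sketched as below. The names part and evaluate_quads are hypothetical, and Python's 1j stands for the imaginary number i. The four parts are evaluated once at x, and the values at ix, -x and -ix come from multiplying by powers of i.

```python
def part(coeffs, x, r):
    """P_{r mod 4}(x): the terms of P whose degree is r mod 4."""
    return sum(coeffs[k] * x**k for k in range(r, len(coeffs), 4))

def evaluate_quads(coeffs, x):
    """Return [P(x), P(ix), P(-x), P(-ix)] from one evaluation of each part."""
    parts = [part(coeffs, x, r) for r in range(4)]
    units = [1, 1j, -1, -1j]   # the powers of the imaginary number i
    # P_{r mod 4}(u x) = u^r P_{r mod 4}(x) whenever u^4 = 1
    return [sum(u**r * parts[r] for r in range(4)) for u in units]

coeffs = [1, 2, 3, 4, 5]       # P(x) = 1 + 2x + 3x^2 + 4x^3 + 5x^4
values = evaluate_quads(coeffs, 2)
# expected: P(2) = 129, P(2i) = 69 - 28i, P(-2) = 57, P(-2i) = 69 + 28i
print(values)
```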

Step 3 (Multiplication):

Multiply at the d+1 points x_i.

R(x_i) = P(x_i) Q(x_i)

When we are done, we have computed constants y_i for which:

R(x_i) = y_i

This requires d+1 multiplications. Our goal is to find a fast multiplication in time O(d log d). This step is O(d) and so it's already fast enough.

Step 4 (Interpolation):

Recall that with the interpolation phase of A_even() and A_odd(), we needed to determine the unknown coefficients c₀, c₁,…,c_d for the polynomial C(x), given that:

C(x_i) = y_i

We again need to find unknown coefficients c₀, c₁,…,c_d, but now for:

R(x_i) = y_i

We use the good polynomials found from the table at the beginning of this subsection:

4R_{0 mod 4}(x) = R(x) + R(ix) + R(-x) + R(-ix)
4R_{1 mod 4}(x) = R(x) - iR(ix) - R(-x) + iR(-ix)
4R_{2 mod 4}(x) = R(x) - R(ix) + R(-x) - R(-ix)
4R_{3 mod 4}(x) = R(x) + iR(ix) - R(-x) - iR(-ix)

We expand:

4R_{0 mod 4}(x_i) = R(x_i) + R(ix_i) + R(-x_i) + R(-ix_i) = R(x_i) + R(x_d/4+i) + R(x_2d/4+i) + R(x_3d/4+i) = y_i + y_d/4+i + y_2d/4+i + y_3d/4+i

4R_{1 mod 4}(x_i) = R(x_i) - iR(ix_i) - R(-x_i) + iR(-ix_i) = y_i - iy_d/4+i - y_2d/4+i + iy_3d/4+i

4R_{2 mod 4}(x_i) = R(x_i) - R(ix_i) + R(-x_i) - R(-ix_i) = y_i - y_d/4+i + y_2d/4+i - y_3d/4+i

4R_{3 mod 4}(x_i) = R(x_i) + iR(ix_i) - R(-x_i) - iR(-ix_i) = y_i + iy_d/4+i - y_2d/4+i - iy_3d/4+i

So, we have to find the unknowns, c_i:

4(c₀ + c₄x_i⁴ + …) = y_i + y_d/4+i + y_2d/4+i + y_3d/4+i
4(c₁x_i + c₅x_i⁵ + …) = y_i - iy_d/4+i - y_2d/4+i + iy_3d/4+i
4(c₂x_i² + c₆x_i⁶ + …) = y_i - y_d/4+i + y_2d/4+i - y_3d/4+i
4(c₃x_i³ + c₇x_i⁷ + …) = y_i + iy_d/4+i - y_2d/4+i - iy_3d/4+i

But when we do this step, each equation already has 3/4 of the coefficients set to zero, and so each of the d+1 linear equations will have only 1/4 as many terms.

Furthermore, there are savings in computing the sums of the y_i on the right hand side, since many of the partial sums are repeated. In the general case, there will be only O(d log d) sums of the y_i.
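
The repeated partial sums can be seen in a short sketch. The function name combine and its argument names are assumptions made here; y0, y1, y2, y3 stand for y_i, y_d/4+i, y_2d/4+i and y_3d/4+i.

```python
def combine(y0, y1, y2, y3):
    """Right-hand sides of the four equations, sharing partial sums."""
    s, t = y0 + y2, y0 - y2   # each is used by two of the four equations
    u, v = y1 + y3, y1 - y3
    return [
        s + u,        # y0 + y1 + y2 + y3      = 4 R_{0 mod 4}(x_i)
        t - 1j * v,   # y0 - i y1 - y2 + i y3  = 4 R_{1 mod 4}(x_i)
        s - u,        # y0 - y1 + y2 - y3      = 4 R_{2 mod 4}(x_i)
        t + 1j * v,   # y0 + i y1 - y2 - i y3  = 4 R_{3 mod 4}(x_i)
    ]

# values of R(x) = 1 + 2x + 3x^2 + 4x^3 + 5x^4 at 2, 2i, -2, -2i
sums = combine(129, 69 - 28j, 57, 69 + 28j)
# expected: 4*81, 4*4, 4*12, 4*32
print(sums)
```

This uses 8 additions and subtractions instead of the 12 needed when each right-hand side is summed separately. Applying the same sharing at every level is what leads to the O(d log d) count.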

Generalizing Further:

From the previous two subsections (case of A_even(), A_odd(), and case of P_{0 mod 4}(), P_{1 mod 4}(), P_{2 mod 4}(), P_{3 mod 4}()), we can derive the general pattern.

The textbook does this in a somewhat more abstract (and more compact) manner, by using recursion. Now read the general case there. In converting between our notation and theirs, it's important to recognize that the textbook uses A_e() and A_o(), where:

A_even(x) = A_e(x²), and
A_odd(x) = xA_o(x²).

In general, if the polynomial is of degree d = n-1 for some n = 2^k, then we let ω = e^(2πi/n) and choose x_j = ω^j (writing j for the subscript index, since i is the imaginary number here):

x₀ = 1
x₁ = ω = e^(2πi/n)
x₂ = ω² = e^(2πi·2/n)
x₃ = ω³ = e^(2πi·3/n)
...
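
Putting the pieces together, a recursive version in the spirit of the textbook's A_e() and A_o() might look like the sketch below. It is not the textbook's own code. The names fft and multiply_fft are hypothetical, the inputs are assumed to be Python lists of integer coefficients (the final rounding relies on that), and the interpolation step uses the standard fact that evaluating at the inverse powers of ω and dividing by n undoes the evaluation.

```python
import cmath

def fft(coeffs, invert=False):
    """Evaluate at the n-th roots of unity; len(coeffs) must be a power of 2."""
    n = len(coeffs)
    if n == 1:
        return list(coeffs)
    even = fft(coeffs[0::2], invert)   # A_e at the (n/2)-th roots of unity
    odd = fft(coeffs[1::2], invert)    # A_o at the (n/2)-th roots of unity
    sign = -1 if invert else 1
    out = [0] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)   # omega^k (omega^-k if inverting)
        out[k] = even[k] + w * odd[k]                 # A(w) = A_e(w^2) + w A_o(w^2)
        out[k + n // 2] = even[k] - w * odd[k]        # omega^(k + n/2) = -omega^k
    return out

def multiply_fft(p, q):
    """Multiply integer coefficient lists via evaluate, multiply, interpolate."""
    size = len(p) + len(q) - 1         # number of coefficients in the product
    n = 1
    while n < size:
        n *= 2                         # pad up to a power of 2
    fp = fft(p + [0] * (n - len(p)))
    fq = fft(q + [0] * (n - len(q)))
    products = [a * b for a, b in zip(fp, fq)]        # Step 3: pointwise products
    coeffs = fft(products, invert=True)               # Step 4: interpolation
    return [round((c / n).real) for c in coeffs[:size]]

print(multiply_fft([1, 2], [3, 4]))  # -> [3, 10, 8]
```

Each level of the recursion does O(n) work and there are log n levels, which gives the O(d log d) running time. Floating-point error grows with n, so the rounding step is only safe for moderately sized integer coefficients.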

For further reading about fast multiplication (both for integers and polynomials), try http://en.wikipedia.org/wiki/Multiplication_algorithm.
