Polynomial Regression with Linear Algebra
A recent article on artificialturk.com covers linear regression in matrix form. Let's generalize that matrix approach to polynomial regression. You should first review linear regression in its linear-algebra (matrix) form.
We'll only make slight modifications to it. We have a design matrix, X, with a column of ones on the left side and the input values on the right. This lets us compute predictions with a simple dot product.
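As a quick refresher, here is a minimal sketch of that setup (the toy numbers and variable names are my own, not from the earlier article):

import numpy as np

# design matrix: a column of ones (for the intercept) next to the inputs
x = np.arange(5).reshape(5, 1)          # toy inputs 0..4
X = np.hstack((np.ones((5, 1)), x))     # [[1, 0], [1, 1], ..., [1, 4]]

B = np.array([[1.0], [2.0]])            # intercept 1, slope 2
y_hat = np.dot(X, B)                    # predictions with a single dot product
print(y_hat.ravel())                    # [1. 3. 5. 7. 9.]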

Now, let's say we want to find the best second-degree polynomial fit. We need three parameters.
y = ax^m + bx^(m-1) + … + c

For the second-degree case this is y = ax^2 + bx + c, and the three parameters are a, b, and c.
What should the weight and input matrices be, then?
(Image: the polynomial design matrix and weight vector, from Wikipedia.)
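In case that image is hard to read, here is a sketch of the setup for the second-degree case with n data points: the design matrix gets one column per power of x, and the weight vector holds one coefficient per power, lowest power first.

y = X\beta, \qquad
X = \begin{bmatrix} 1 & x_1 & x_1^2 \\ 1 & x_2 & x_2^2 \\ \vdots & \vdots & \vdots \\ 1 & x_n & x_n^2 \end{bmatrix}, \qquad
\beta = \begin{bmatrix} c \\ b \\ a \end{bmatrix}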
Now, we’ll apply the same matrix multiplications. The idea is exactly the same.
(All images of the equations and math are from http://www.stat.purdue.edu/~boli/stat512/lectures/topic3.pdf)
Let's define the problem with a random dataset; we'll walk through the code in Python:

Briefly, this is what we want to model. We want to apply polynomial fitting here. The data can be anything where the target depends on a single feature (you can apply this to any dataset, but it won't work well for every kind of data). We want to put a polynomial through the points, and we want this polynomial to be as close as possible to the points.
One way to do this is with linear algebra, but first let's see how to create such a dataset with NumPy. Start with the imports.
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(42)  # seed the global random generator so the noise is reproducible
To create a dataset, we first define X: the integers [-10, -9, -8, …, 8, 9] (20 values, since np.arange excludes the endpoint).
X = np.arange(-10, 10)  # shape (20,): the integers -10 through 9
We'll define the true coefficients at the beginning, but we will add Gaussian noise to the result. Let's start by preparing and graphing the data without noise.
For those who are not familiar with distributions, just assume we add a small random positive or negative value to each point. Our true polynomial is y = x^4 + 2x^3 - 12x^2 - 2x + 6. For x = 1, y = 1 + 2 - 12 - 2 + 6 = -5; adding a random 0.2 makes it -4.8. For x = 2, adding -0.3 gives y = 16 + 16 - 48 - 4 + 6 - 0.3 = -14.3. We do this by adding one random term to the dataset code, shown below.
E = np.random.normal(0, 50, (20, 1))  # Gaussian noise with mean 0 and standard deviation 50
# X here must already be the (20, 5) Vandermonde matrix built with np.vander (see below);
# the weights [6, -2, -12, 2, 1] are the coefficients of y = x^4 + 2x^3 - 12x^2 - 2x + 6
y = np.dot(X, np.array([6, -2, -12, 2, 1])).reshape(20, 1) + E
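If you want to reproduce the scatter plot below, here is a minimal sketch (x_raw is my own name for the raw inputs, before any Vandermonde expansion):

x_raw = np.arange(-10, 10)     # the raw inputs
plt.scatter(x_raw, y.ravel())  # the noisy targets
plt.xlabel("x")
plt.ylabel("y")
plt.show()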
Add Randomness
Those who are familiar with distributions will recognize this as a normal distribution with μ = 0 and σ = 50. Here we pass the standard deviation directly as the second argument of np.random.normal; alternatively, we could draw from a standard normal (σ = 1) and multiply by 50, that's up to you. Now here's what we have.

It looks like real-world data now. Let's do some math first, then the code at the end.
For the optimization idea (least squares), please see the linear regression article mentioned above. I won't review it all again; the matrices are still valid and all the math operations are the same. But we'll write out the code here as well.
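For reference, the optimization we are solving is ordinary least squares, and its closed-form solution (the same formula we will code below) is:

\hat{\beta} \;=\; \arg\min_{\beta}\,\lVert y - X\beta \rVert^2 \;=\; (X^\top X)^{-1} X^\top y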
Coding Polynomial Regression
Let’s import libraries.
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(42)  # seed the global random generator so the noise is reproducible
For linear regression, we used this code to prepend a column of ones to the inputs (the 100 there is the number of data points in that example):
X = np.hstack((np.ones((100, 1)), X))  # add a bias column of ones on the left
This code produces a familiar output (only the first rows are shown).
[[ 1. 0.]
 [ 1. 1.]
 [ 1. 2.]
 [ 1. 3.]
 [ 1. 4.]
 [ 1. 5.]
 [ 1. 6.]
 [ 1. 7.]
 [ 1. 8.]]
However, for polynomial regression we'll need a Vandermonde matrix: a matrix where columns for the higher powers of x are stacked next to the column of ones.

To produce it, we won't use hstack. Although we could build such a matrix by hand with some boring coding, NumPy fortunately has a built-in function for it.
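Just to show the idea, the by-hand version would look something like this (a sketch using hstack and a list comprehension; not what we'll actually use):

x_col = np.arange(-10, 10).reshape(20, 1)
X_manual = np.hstack([x_col ** p for p in range(5)])  # columns: x^0, x^1, ..., x^4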
Let's say we want to apply fourth-degree polynomial regression.
X = np.vander(X, 5, increasing=True)  # columns: 1, x, x^2, x^3, x^4
For example, for the inputs x = 0, 1, 2, 3 it results in:
[[ 1  0  0  0  0]
 [ 1  1  1  1  1]
 [ 1  2  4  8 16]
 [ 1  3  9 27 81]]
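With our actual X running from -10 to 9, the first rows look different; you can check this yourself with a quick print (values computed from np.vander, shown here as comments):

print(np.vander(np.arange(-10, 10), 5, increasing=True)[:3])
# first three rows: [1, -10, 100, -1000, 10000],
#                   [1,  -9,  81,  -729,  6561],
#                   [1,  -8,  64,  -512,  4096]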
Now, the rest is the same as the linear regression code.
# normal equations: B = (X^T X)^(-1) X^T y
B = np.dot(np.linalg.inv(np.dot(X.T, X)), np.dot(X.T, y))
And that’s what we end up with.

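If you want to reproduce that final plot and sanity-check the coefficients, here is a sketch (x_fine and X_fine are my own names; np.polyfit solves the same least-squares problem, so its coefficients should agree with B up to ordering and tiny numerical differences):

x_raw = np.arange(-10, 10)
x_fine = np.linspace(-10, 9, 200)               # dense grid for a smooth curve
X_fine = np.vander(x_fine, 5, increasing=True)
y_fit = np.dot(X_fine, B)                       # predictions from our fitted weights

plt.scatter(x_raw, y.ravel(), label="data")
plt.plot(x_fine, y_fit, color="red", label="degree-4 fit")
plt.legend()
plt.show()

print(B.ravel())                                # our coefficients, lowest power first
print(np.polyfit(x_raw, y.ravel(), 4)[::-1])    # reversed to lowest-power-first for comparison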