Polynomial Regression with Linear Algebra

Recent article of the artificialturk.com mentions about the linear regression with matrix form. Let’s generalise it for the matrix form. But you should review the linear regression with linear algebra matrix form.

We’ll do slight modifications on it. We have design matrix, X, that has column of ones at the left side. At the right side we have values. This enables us to compute predictions with a simple dot product.

We have also randomness term here, review it on here.

Now, let’s say we want to calculate best second degree polynomial fit. We need three parameters.

y = ax^m+bx^m-1+……+c

What should be the the weight and input matrix then?

{\displaystyle {\begin{bmatrix}y_{1}\\y_{2}\\y_{3}\\\vdots \\y_{n}\end{bmatrix}}={\begin{bmatrix}1&x_{1}&x_{1}^{2}&\dots &x_{1}^{m}\\1&x_{2}&x_{2}^{2}&\dots &x_{2}^{m}\\1&x_{3}&x_{3}^{2}&\dots &x_{3}^{m}\\\vdots &\vdots &\vdots &\ddots &\vdots \\1&x_{n}&x_{n}^{2}&\dots &x_{n}^{m}\end{bmatrix}}{\begin{bmatrix}\beta _{0}\\\beta _{1}\\\beta _{2}\\\vdots \\\beta _{m}\end{bmatrix}}+{\begin{bmatrix}\varepsilon _{1}\\\varepsilon _{2}\\\varepsilon _{3}\\\vdots \\\varepsilon _{n}\end{bmatrix}},}
Matrix with x is called Vandermonde matrix.

Now, we’ll apply the same matrix multiplications. The idea is exactly the same.

(All the images of equations and math part are from : http://www.stat.purdue.edu/~boli/stat512/lectures/topic3.pdf)

Let’s define the problem with our random dataset, we will discuss the code with Python:

Bu görsel boş bir alt niteliğe sahip; dosya adı figure_1-4.png

Briefly, this is what we need to observe. We want to apply a polynomial fitting there. This data can be anything that has a linear dependent feature(You can apply this to any dataset but it won’t work for all kinds). We want to put a polynomial between points and we want this polynomial as close as possible to “the points”.

One way to do this is by linear algebra, but let’s find out how we create such a dataset with NumPy. First define libraries.

import numpy as np
import matplotlib.pyplot as plt
import numpy.linalg

To create a dataset, we first define X, it is the numbers [-10,-9,-8,…,9,10].

X = np.arange(-10,10).reshape(20,)

We will define and m and b value at the beginning. But we will add Gaussian noise to it. Let’s start by preparing and graphing without noise.

For those who are not familiar with distributions, just assume that we are adding random +/- values to each point. For x=1, y = 1+2*1-12*1-2*1+6 = -5. We will add random number of 0.2 let’s say. It’ll become 3.2 . We will add -0.3 for x=2 that gives y = 2*2+1-0.3 = 4.7. We’ll do it by adding one random function to code above.

y = np.dot(X,np.array([6,-2,-12,2,1]).T).reshape(20,1)+E
#where E is defined earlier as:
E = np.random.normal(0,50,(20,1)).reshape(20,1)

Add Randomness

For those who’re familiar with distributions know that it’s normal distribution with μ = 0 and σ = 1*3 = 3 . Instead of multiplying, we can replace 1 inside random.normal function with 3, that’s up to you. Now here’s what we have.

After adding randomness

It looks like the real world data now. Let’s do some math, then, code at the end.

For the optimization idea, please read.

I don’t want to review it again, matrices are valid and all the math operations are the same. But we’ll code there in here as well.

Coding Polynomial Regression

Let’s import libraries.

import numpy as np
import matplotlib.pyplot as plt
import numpy.linalg

For linear regression, we use this code.


This code produces a familiar output.

  [ 1.  0.]
  [ 1.  1.]
  [ 1.  2.]
  [ 1.  3.]
  [ 1.  4.]
  [ 1.  5.]
  [ 1.  6.]
  [ 1.  7.]
  [ 1.  8.]]

However, we’ll need Vandermonde matrix. It means we need a matrix with other powers stacked.

vandermonde matrix ile ilgili görsel sonucu
From some discussion at StackOverFlow

To produce it, we won’t use hstack. Although we can create such a matrix with boring coding stuff, NumPy hopefully has builtin function.

Let’s say we want to apply fourth degree polynomial regression.

X = np.vander(X,5,increasing=True)

It’ll result in:

 [     1      0      0      0]
 [     1      1      1      1]
 [     1      2      4      8]
 [     1      3      9     27]

Now, the rest is same with linear regression codes.


And that’s what we end up with.

Bir cevap yazın

E-posta hesabınız yayımlanmayacak. Gerekli alanlar * ile işaretlenmişlerdir