Linear Regression (Machine Learning Part 1)

So I have been learning Machine Learning during this lockdown, and I came across a quote attributed to Richard Feynman: if you want to learn something quickly, start teaching it. So here it goes.

The first topic I have learned in ML is Linear Regression.

But before that, if you are a beginner in machine learning, you should know that every technology exists to solve specific kinds of problems. The problems machine learning tackles typically fall into these categories:

1- Classification

2- Regression

3- Clustering

4- Rule Extraction

If you have a little math in your head, you can understand them through a few equations and examples. If you don't, please study math alongside the machine learning algorithms, because math is important for understanding the roots.

Then come the approaches to solving these problems:

1- Supervised Learning

2- Unsupervised Learning 

3- Reinforcement Learning

There are more, but we will focus on these three first. 

Supervised learning works with labeled data: every example comes with the answer we want the model to predict.

Unsupervised learning works with unlabeled data: the algorithm has to discover structure, such as clusters, on its own.

Reinforcement learning has no fixed dataset at all: an agent interacts with an environment, collects rewards and penalties, and gradually reinforces the actions that work.
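To make the supervised/unsupervised distinction concrete, here is a tiny sketch (the numbers are made up purely for illustration):

```python
# Supervised: every input comes with a known label (hours studied -> passed?)
labeled_data = [(1.0, 0), (2.5, 0), (4.0, 1), (5.5, 1)]

# Unsupervised: inputs only -- the algorithm must find structure by itself
unlabeled_data = [1.0, 2.5, 4.0, 5.5]

# Strip the labels from the supervised data and we get the unsupervised view
inputs = [x for x, label in labeled_data]
print(inputs == unlabeled_data)  # True: same inputs, labels are the difference
```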

Enough talk, let's code.

We will use Python to implement our Linear Regression.

Linear regression –

Linear regression fits a straight line through the data. In the example we will build, the blue dots are the data points and the red line is the best-fit line we compute. The line is built from the means of the x and y values and summarizes the relationship between them.

I have used matplotlib to generate the plot.

from statistics import mean
import numpy as np
import matplotlib.pyplot as mt

Statistics, NumPy, and Matplotlib are the three libraries we are going to use. If you are not familiar with them, please take a look, because we will be using them all the time.


xs=np.array([1,2,3,4,5,6],dtype=np.float64)
ys=np.array([5,4,6,5,6,7],dtype=np.float64)


mt.scatter(xs, ys)
mt.plot(xs, ys)
mt.show()

This code generates a plot with matplotlib, showing the x and y data we have stored in xs and ys.

Now we have to find the best-fit line. To do that we need the equation of a straight line, which you may remember from math class:

y = mx + b

This equation will help us build the line. We have all the x and y values of our data; to draw the line we need m (the slope) and b (the intercept).
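For reference, these are the standard least-squares formulas for the slope and intercept, expressed with means (this is exactly what the function below computes):

```latex
m = \frac{\bar{x}\,\bar{y} - \overline{xy}}{\bar{x}^{2} - \overline{x^{2}}},
\qquad
b = \bar{y} - m\,\bar{x}
```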

To get them we need a function. 

# Find the slope m and intercept b of the best-fit line
def find_m(xs, ys):
    m = ((mean(xs) * mean(ys)) - mean(xs * ys)) / ((mean(xs) ** 2) - mean(xs ** 2))
    b = mean(ys) - m * mean(xs)
    return m, b

This function takes all the x and y values, uses the mean function to compute the averages, and combines them to give us m and b.
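As a quick sanity check on our sample data, we can call the function and look at the numbers it returns (a minimal self-contained sketch repeating the definitions from above):

```python
from statistics import mean
import numpy as np

def find_m(xs, ys):
    m = ((mean(xs) * mean(ys)) - mean(xs * ys)) / ((mean(xs) ** 2) - mean(xs ** 2))
    b = mean(ys) - m * mean(xs)
    return m, b

xs = np.array([1, 2, 3, 4, 5, 6], dtype=np.float64)
ys = np.array([5, 4, 6, 5, 6, 7], dtype=np.float64)

m, b = find_m(xs, ys)
print(m, b)  # slope ~0.4286 (exactly 3/7 here), intercept ~4.0
```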

Now that we have m and b, we can build the best-fit line, and we also want a way to measure how good the fit is. The coefficient of determination (R squared) does exactly that: it compares the squared error of our line against the squared error of a flat line at the mean of y.


# Squared error between the original points and a line
def squared_error(ys_orig, ys_line):
    return sum((ys_line - ys_orig) ** 2)

def coefficient_of_determination(ys_orig, ys_line):
    ymean_line = [mean(ys_orig) for y in ys_orig]

    squared_error_regr = squared_error(ys_orig, ys_line)
    squared_error_y_mean = squared_error(ys_orig, ymean_line)

    return 1 - (squared_error_regr / squared_error_y_mean)
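For our sample data this works out to roughly 0.58, meaning the line explains a bit over half of the variation in y. A minimal sketch putting the pieces together:

```python
from statistics import mean
import numpy as np

def find_m(xs, ys):
    m = ((mean(xs) * mean(ys)) - mean(xs * ys)) / ((mean(xs) ** 2) - mean(xs ** 2))
    return m, mean(ys) - m * mean(xs)

def squared_error(ys_orig, ys_line):
    return sum((ys_line - ys_orig) ** 2)

def coefficient_of_determination(ys_orig, ys_line):
    ymean_line = [mean(ys_orig) for y in ys_orig]
    return 1 - squared_error(ys_orig, ys_line) / squared_error(ys_orig, ymean_line)

xs = np.array([1, 2, 3, 4, 5, 6], dtype=np.float64)
ys = np.array([5, 4, 6, 5, 6, 7], dtype=np.float64)

m, b = find_m(xs, ys)
line = [m * x + b for x in xs]
r_squared = coefficient_of_determination(ys, line)
print(r_squared)  # ~0.584 (exactly 45/77 for this data)
```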

Now that we have all the pieces, we just have to use them in the right order. Here is the full code:

from statistics import mean
import numpy as np
import matplotlib.pyplot as mt


# Find the slope m and intercept b of the best-fit line
def find_m(xs, ys):
    m = ((mean(xs) * mean(ys)) - mean(xs * ys)) / ((mean(xs) ** 2) - mean(xs ** 2))
    b = mean(ys) - m * mean(xs)
    return m, b


# Squared error between the original points and a line
def squared_error(ys_orig, ys_line):
    return sum((ys_line - ys_orig) ** 2)


# R squared: compare our line's error against a flat line at mean(y)
def coefficient_of_determination(ys_orig, ys_line):
    ymean_line = [mean(ys_orig) for y in ys_orig]

    squared_error_regr = squared_error(ys_orig, ys_line)
    squared_error_y_mean = squared_error(ys_orig, ymean_line)

    return 1 - (squared_error_regr / squared_error_y_mean)


xs = np.array([1, 2, 3, 4, 5, 6], dtype=np.float64)
ys = np.array([5, 4, 6, 5, 6, 7], dtype=np.float64)

m, b = find_m(xs, ys)

# Build the regression line: one predicted y for every x
regression_line = [(m * x) + b for x in xs]

# Predict y for a new x value
predict_x = 5.5
predict_y = (m * predict_x) + b

r_squared = coefficient_of_determination(ys, regression_line)
print(r_squared)

mt.scatter(xs, ys)
mt.scatter(predict_x, predict_y, color='r')
mt.plot(xs, regression_line)
mt.show()


This full program prints the R squared value and draws the scatter plot with the best-fit line, with the predicted point shown in red.
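If you want to double-check the hand-rolled math, NumPy's built-in polyfit with degree 1 performs the same least-squares line fit, so it should return the same slope and intercept as our find_m (this is just a cross-check, not part of the original walkthrough):

```python
import numpy as np

xs = np.array([1, 2, 3, 4, 5, 6], dtype=np.float64)
ys = np.array([5, 4, 6, 5, 6, 7], dtype=np.float64)

# Degree-1 polynomial fit = least-squares straight line
slope, intercept = np.polyfit(xs, ys, 1)
print(slope, intercept)  # ~0.4286 and ~4.0, matching find_m
```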

Thanks for reading. I will post more soon.
