top of page


What is Linear Regression?

  1. When the relationship between the input and output can be predicted in a linear-like function


  1. The Y or output can be calculated by the equation by y = weight*x + bias, derived by y = mx + b, from a standard straight line. 

How to change weights?

  1. The weights are the most essential part of linear regression and what is changed throughout this process to achieve the most optimal accuracy. Thus, we must look at how to adjust these.

  2. The weights can be initially any value but must be adjusted to correctly explain the significance of each input. Below describes how to change a weight in python:

    • How do I code it in python:

      1. Weights [i] = weights [i] + lr(actual-predicted) * Xrow[i]

        1. Where lr is the learning rate

          1. The learning rate determines the step size for each iteration while trying to move to the minimum of the cost function. If the learning rate is too large you could overshoot the minimum and never actually get to the correct weights (with a minimum cost) while a too small learning rate will require an extremely large number of iterations to get to the minimum. 

      2. Weights [i] = the i-th weight

      3. Xrow [i] = The i-th value of X in a given row

​​​Cost Function​

  1. Conceptual

    • The cost function can be categorized as the mean squared error and generates how far the predicted output is from the actual for each weight adjustment. 

    • Given the amount of error, the weights will be adjusted to minimize the error. This concept is known as gradient descent. Gradient descent helps compute the slope of adjusting the weights and can be seen by this. 

    • The amount of change in weights is affected by the learning rate.

    • Where there is the least amount of error, weights are most optimal. Finding the most optimal weights, or where there is the least cost, in a line is the point of a linear regression model. At the minimum of the cost function, the gradient or tangent line should be equal to 0, and therefore that is the goal of that function: to find where the tangent is equal to 0.























































































Screen Shot 2020-06-07 at 10.31.58
Screen Shot 2020-06-07 at 10.35.20
Screen Shot 2020-06-07 at 10.51.57
Screen Shot 2020-06-07 at 10.55.40
Screen Shot 2020-06-07 at 11.11.23
Screen Shot 2020-06-07 at 11.29.28

How to code a Linear Regression function



​The process (by hand)

  1. Generate random data that fits a linear regression for a given slope and offset. 
  2. Initialize the bias and weights (bias and slope) to small random numbers

  3. Compute your total cost

  4. Compute the gradient/change to apply to weights

  5. Adjust the weights of your model, based on the computed gradient and the learning rate

  6. Repeat 3-5 till the sum of costs gets to a minimum and can compute this over a fixed number of iterations



Note that this is just one STYLE of programming linear regression and you may want to program this differently. This is solely to help others understand how to program a linear regression function. 

Note: Typically, instead of initializing your own array/data in your code, you will get a dataset that you must interpret. In order to optimize understanding, we will be use a supervised model and use a “train” dataset to train the model (in terms of optimal weights) and then a “test” model to test the model and calculate its final cost, based on what the model predicts in the test dataset and its distance from the actual output. 

  • In order to do this we must create a “createDataTrain” method and a “createDataTest” method to interpret and get our X’s and Y’s (input and output). In these methods, the data sets are opened, interpreted, set to X and Y, standardized, and stacked with a bias (Y-intercept) column. For reference, a create data train and a create data test is shown below

Screen Shot 2020-06-10 at 12.36.54

2. Equation

  • Where m is the number of data points in the set:

3. Visualization of the Cost Function

Perceptron - a popular linear regression model

  1. What is perceptron

    • Perceptron is a popular example of linear regressions and uses varying inputs to give a binary (0 or 1) choice.

  2. Vocabulary

    • Decision Boundary: Line that separates two classes

  3. Sample graph:

This graph displays two different categories and the line of best fit separating/classifying it. This line of best fit is called the decision boundary. The axises can be seen as the input while whether this is a single family or town house can be seen as the binary output. The line helps classify whether based on the price and square feet, whether the output is single family or a town house (in this example). 

  1. How the perceptron model works

    • WiX1 + W2x2 + X2W3 + W0Bias = output

    • The output is then categorized into a step function (which is described in the next step), which results in a binary output of 0 or 1

    • This output, in regular, non-perceptron functions, will not be categorized in the step function

  2. Visualization of the concept

3.​ The Step function

  • The unit step activation function categorizes the output as 0 or 1 and can be shown by:

​How do I code a perceptron model

  1. ​​​Write the predicted method, which can sum each value of x in an array, using a for loop. In this same method categorize the value of X as zero or 1 (using the piecewise function listed under the step function above)
  2. Write a change weights function using the equation (in python style) listed further above, Weights [i] = weights [i] + lr(actual-predicted) * Xrow[i]

  3. Set the X (input) equal to a set of any numbers and the Y (output) equal to 0 or 1 for 4 rows

  4. Set the weights equal to a list of any number (preferably closer to 0)

  5. Set the learning rate equal to any number (closer to 0). Occasionally the learning rate should be adjusted until you get the lowest final cost. 

  6. Write a main loop that will call cost function, predicted function, and then update the weights with a certain learning rate. Store the value of the cost function into an array so you can display the cost function, per iteration, after the loop

  7. Display the data and corresponding line with the resultant weights

  8. Perceptron sample code 

    1. The below code displays a sample perceptron program, where given an array of 3 binary inputs and 1 binary output, it outputs the adjusted weights.

1. Import all necessary system

Screen Shot 2020-06-07 at 11.25.39

​2. Create a predicted function that can multiply and sum the row of each X and each column’s respective weight

Screen Shot 2020-06-07 at 11.24.37

3. Create a Calc Gradient function that can use the prediction, subtract that from y, and multiply the subtraction by the x (with a dot product) and divide that by the length of x

Screen Shot 2020-06-07 at 11.27.05

4. Import the data with a create train function and a create test function, described above with screenshots

Screen Shot 2020-06-07 at 11.30.06

5. Write a main loop that will call cost function, gradient function, and then update the weights with a certain learning rate. Store the value of the cost function into an array so you can display the cost function, per iteration, after the loop

Screen Shot 2020-06-07 at 11.30.42

6. Print any helpful data. You can use the test data and the weights from the train to determine any helpful variables, such as cost and weights.​​​​

Screen Shot 2020-06-07 at 11.30.54

Now it's time to watch an explanation of this concept and take a quiz on what you have learnt:

bottom of page