


  1. The goal is to output the probability of the target Y

  2. Based on this probability, the input can be classified into different categories

  3. The logistic regression model converts the summation of all the weights * inputs into a value between 0 and 1 using the sigmoid function
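These three points can be condensed into a short sketch (the weights and input values below are made up for illustration):

```python
import numpy as np

def sigmoid(z):
    # Squash any real number into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical weights and one input row (the leading 1.0 acts as the bias term)
w = np.array([0.5, -1.2, 0.8])
x = np.array([1.0, 2.0, 3.0])

probability = sigmoid(x @ w)            # summation of weights * inputs, then sigmoid
label = 1 if probability >= 0.5 else 0  # classify based on the probability
```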

Types of classification in logistic regression

  1. Binary (Pass, Fail)

  2. Multinomial (Pizza, Spaghetti, Ravioli)

  3. Ordinal (Low, Medium, High)

Illustration of the network

[Image: diagram of the logistic regression network]

2. Write the cost function using the cost equation listed in the Cost Function section

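The code screenshots from the original page are not reproduced here; a minimal sketch of such a cost function, assuming the data are NumPy arrays, might look like:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(X, y, weights):
    # Cross entropy (log loss): low when predictions are close to y
    predictions = sigmoid(X @ weights)
    eps = 1e-15  # clip to avoid log(0)
    predictions = np.clip(predictions, eps, 1 - eps)
    return -np.mean(y * np.log(predictions) + (1 - y) * np.log(1 - predictions))
```

A model that predicts the correct labels confidently drives this cost toward 0.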

3. Write the gradient function using the same code structure as in linear regression (as both aim to find the minimum of a cost function)

4. Clean the data as necessary (with one-hot encoding or masking), using a separate method

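One way such a cleaning method might look, sketched with plain NumPy (the category values are made-up examples, not the author's dataset):

```python
import numpy as np

def one_hot(column):
    # Give each distinct category its own 0/1 indicator column
    categories = sorted(set(column))
    return np.array([[1.0 if value == cat else 0.0 for cat in categories]
                     for value in column])

meals = ["Pizza", "Spaghetti", "Pizza", "Ravioli"]
encoded = one_hot(meals)  # shape (4, 3): one row per sample, one column per category
```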

How to code a Logistic Regression function

Sigmoid

  1. Relevancy

    • Sigmoid is the activation function for a logistic regression algorithm, and it defines the model's output

      1. An activation function is a mathematical gate between the input and output. For example, the step function from linear regression is an activation function. 

      2. Sigmoid outputs a number between 0 and 1

  2. Equation

    • The sigmoid equation is σ(z) = 1 / (1 + e^(−z)), where z is the weighted sum of the inputs

  3. Visualization

    • The sigmoid graph is an S-shaped curve whose output lies between 0 and 1 (the y axis is the sigmoid output). This output can be read as a probability, which can then be used to classify the input.
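The shape of the curve can be checked numerically with a small sketch:

```python
import numpy as np

def sigmoid(z):
    # Activation function: gates the weighted-sum input into a (0, 1) output
    return 1.0 / (1.0 + np.exp(-z))

zs = np.linspace(-6, 6, 5)  # a few sample points along the x axis
probs = sigmoid(zs)
# Large negative inputs map near 0, large positive inputs map near 1,
# and sigmoid(0) is exactly 0.5 -- the usual classification threshold.
```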

Cost Function

  1. Concept

    • The cost function is another important characteristic of logistic regression. The closer the prediction is to the correct label, the lower the cost.

    • Instead of the mean squared error used in linear regression, logistic regression uses an equation known as cross entropy, or log loss, as its cost function.

  2. Equation

    • For m training examples with predictions σ = sigmoid(Xw), the cross-entropy cost is J(w) = −(1/m) Σ [y·log(σ) + (1 − y)·log(1 − σ)]

  3. Visualization

    • When y = 1, the cost −log(σ) drops toward 0 as the prediction approaches 1 and grows without bound as it approaches 0; when y = 0 the curve is mirrored.

Gradient Descent

  1. Concept

    • Gradient descent is the algorithm used to find the weights with the least cost by repeatedly moving in the direction of steepest descent

    • The concept is quite similar to linear regression (finding where the derivative is 0), with different steps and equations.

  2. Equation

    • For gradient descent we need to calculate the partial derivative of the cost function, as in linear regression. The derivation follows the multivariable chain rule; since it is long and strenuous, we display only the result: ∂C/∂w = (1/m)·Xᵀ(σ − y), where σ = sigmoid(Xw). This gradient is then subtracted from the weights.

  3. Pseudocode

    • The steps are as follows:

      1. Calculate the gradient average

      2. Multiply by learning rate

      3. Subtract from weights
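The three steps above can be sketched as a single update function (assuming NumPy arrays and the gradient ∂C/∂w = Xᵀ(σ − y)/m):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradient_step(X, y, weights, learning_rate):
    predictions = sigmoid(X @ weights)
    gradient = X.T @ (predictions - y) / len(y)  # 1. calculate the gradient average
    return weights - learning_rate * gradient    # 2.-3. scale by the learning rate, subtract
```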

1. Import any needed libraries

5. Make a create-data method: import the data, call the clean-strings method on it, create a bias column, and stack that onto the features. Have a create-test-data method that follows a similar process, except that it predicts the output of the test data (and categorizes it if needed).


6. From the training data, create your X and Y, a learning rate, a list of costs over the iterations (for graphing), and an array of weights


7. Train the model by calling the cost function and then the calculate-gradient function, iterating and adjusting the weights until a minimum cost is reached.

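A training loop along these lines (a sketch under the same NumPy assumptions, not the author's exact code):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(X, y, w):
    # Cross-entropy loss, clipped to avoid log(0)
    p = np.clip(sigmoid(X @ w), 1e-15, 1 - 1e-15)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def train(X, y, learning_rate=0.1, iterations=1000):
    w = np.zeros(X.shape[1])
    costs = []                                    # cost per iteration, for graphing
    for _ in range(iterations):
        costs.append(cost(X, y, w))
        gradient = X.T @ (sigmoid(X @ w) - y) / len(y)
        w = w - learning_rate * gradient          # adjust weights toward lower cost
    return w, costs
```

On a simple separable dataset the recorded costs should fall steadily toward a minimum.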

8. Print any necessary variables, such as the initial cost, learning rate, final costs, weights, scaled X, scaled Y, and the first value in the gradient list.


It's time to watch a video explanation and take a quiz on what you have learnt:
