Logistic regression

In the previous tutorials, we discussed linear regression and built a model to predict continuous values. In this tutorial we will get to know logistic regression.

Logistic regression:

Logistic regression is used to solve classification problems: it predicts discrete values. We cannot use linear regression to build a model for a classification problem, because a classifier should predict either zero (false) or one (true), whereas a linear regression model can output values below zero or above one. The output of logistic regression, in contrast, always lies between zero and one.
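To see the difference, here is a minimal sketch (not the model we will build later in this series) that fits both a linear and a logistic regression model to a small made-up dataset using scikit-learn. The feature values and labels are invented purely for illustration.

import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

# Made-up data: hours studied vs. pass (1) / fail (0)
X = np.array([[1], [2], [3], [4], [5], [6], [7], [8]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

lin = LinearRegression().fit(X, y)
log_reg = LogisticRegression().fit(X, y)

X_new = np.array([[0], [4.5], [10]])
print(lin.predict(X_new))                  # can fall below 0 or go above 1
print(log_reg.predict_proba(X_new)[:, 1])  # always stays between 0 and 1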

Cutoff point:

In classification problems we need a prediction of either zero or one, but what if the model outputs a value such as 0.64? In that case we set a cutoff point (here, 0.5): if the value is above 0.5 it is considered one, otherwise zero.
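As a quick illustration, here is a small sketch of applying a 0.5 cutoff to some made-up predicted probabilities using NumPy.

import numpy as np

# Made-up predicted probabilities from some classifier
probabilities = np.array([0.12, 0.64, 0.50, 0.91, 0.33])

cutoff = 0.5
# Values above the cutoff become 1, everything else becomes 0
labels = (probabilities > cutoff).astype(int)
print(labels)  # [0 1 0 1 0]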

Sigmoid function:

f(z) = 1 / (1 + e^(-z))

This function is also known as the logistic function because its output always lies between 0 and 1: it maps any real-valued input to a value in that range.
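A quick sketch of this function in Python (plain NumPy, nothing specific to the model we build later):

import numpy as np

def sigmoid(z):
    # Maps any real-valued input to a value between 0 and 1
    return 1.0 / (1.0 + np.exp(-z))

print(sigmoid(0))   # 0.5
print(sigmoid(6))   # close to 1
print(sigmoid(-6))  # close to 0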

To evaluate our logistic regression model, we use a confusion matrix.

Confusion matrix:

                      Predicted positive    Predicted negative
Actually positive     True positive         False negative
Actually negative     False positive        True negative

There are two types of errors:

  • Type 1 error (false positive)
  • Type 2 error (false negative)
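As a small illustration, here is a sketch of computing a confusion matrix from made-up actual and predicted labels with scikit-learn's confusion_matrix function.

from sklearn.metrics import confusion_matrix

# Made-up actual and predicted labels (1 = positive, 0 = negative)
actual    = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 1, 0, 1, 0]

# scikit-learn orders the classes 0 then 1, so the printed matrix is
# [[true negatives, false positives],
#  [false negatives, true positives]]
print(confusion_matrix(actual, predicted))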

In these tutorials, we will work with the famous "Titanic" dataset from kaggle.com. This dataset lists the passengers on the Titanic and whether they survived, and we will build a model to predict whether a given passenger survived.

We will start building our machine learning model in the next tutorial.
