Why Logistic Regression is Linear

A common mistake is to classify the logistic regression algorithm as a non-linear machine learning model. In this post, we are going to explain the reasons for this misunderstanding, show its linearity on an example, and finally discuss the root cause of its linearity.

Sigmoid function (credit: Ian Goodfellow)
Vlog

You can either read this tutorial or watch the following video. They both cover the linearity of logistic regression.


Sigmoid function

This misunderstanding stems from its base function. Logistic regression is built on the sigmoid function, which has an S-shaped graph, and that shape might mislead you about the model's linearity or non-linearity.


The task of the sigmoid function in logistic regression is to transform continuous inputs into probabilities in the range [0, 1]. The z term in its equation comes straight from linear regression. So, the sigmoid function alone cannot make the model non-linear.
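To see that squashing behavior concretely, here is a minimal sketch; the sigmoid helper below is written by hand for illustration.

import numpy as np

def sigmoid(z):
    # squash any real-valued input into the open interval (0, 1)
    return 1 / (1 + np.exp(-z))

# z plays the role of the linear regression output
for z in [-10, -1, 0, 1, 10]:
    print(z, round(sigmoid(z), 5))
# close to 0 for large negative z, 0.5 at z = 0, close to 1 for large positive z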

Exclusive or

We are going to discuss the reason why it is linear, but let's demonstrate its linearity on an example first. The easiest way to see whether an algorithm is linear is to run it on a simple non-linear data set.

Herein, the exclusive-or logic gate, or shortly xor, is one of the simplest non-linear problems. Both random classifiers and linear models will get almost 50% accuracy on it, whereas non-linear models will get almost 100%.
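As a reminder, the plain xor gate outputs true only when its two inputs differ:

x1 | x2 | x1 xor x2
0 | 0 | 0
0 | 1 | 1
1 | 0 | 1
1 | 1 | 0

No single straight line in the x1-x2 plane can put the two 1s on one side and the two 0s on the other.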

Here, you can find an xor-like data set. It is randomly generated, and I applied the following procedure to create it. You can alternatively read the referenced csv file to build the data frame.

import random
import pandas as pd

x_values = []; y_values = []; targets = []

# generate 100 points per quadrant; quadrants where x and y share
# the same sign are labeled 1, the mixed-sign quadrants are labeled 0
for i in range(0, 4):

    if i == 0:
        sign_x = 1; sign_y = 1; target = 1
    elif i == 1:
        sign_x = -1; sign_y = 1; target = 0
    elif i == 2:
        sign_x = -1; sign_y = -1; target = 1
    elif i == 3:
        sign_x = 1; sign_y = -1; target = 0

    for j in range(0, 100):
        # coordinates fall in [5, 105], pushed away from the axes
        value_x = (5 + random.randint(0, 100)) * sign_x
        value_y = (5 + random.randint(0, 100)) * sign_y

        x_values.append(value_x)
        y_values.append(value_y)
        targets.append(target)

df = pd.DataFrame(x_values, columns = ["x"])
df["y"] = y_values
df["Decision"] = targets

Plotting the data set makes it easy to understand.

import matplotlib.pyplot as plt

# plot each class in its own color: blue for 1, orange for 0
for i in df['Decision'].unique():
    sub_df = df[df['Decision'] == i]

    if i == 1:
        color = 'blue'
    else:
        color = 'orange'

    plt.scatter(sub_df['x'], sub_df['y'], c = color)

plt.show()

The blue points represent the true class whereas the orange points represent the false class.

Exclusive or extended

Linear models try to separate classes with a single line, whereas non-linear models can use several lines or curves. As seen in the plot, this is not a linearly separable problem: diagonally opposite quadrants share the same class, so no single line can split them. That's why linear models will fail on the xor data set while non-linear models will succeed.

Building logistic regression model

We are going to build a logistic regression model for this data set. The x and y coordinates are the features, whereas the class highlighted in blue or orange is the target value.

from sklearn.linear_model import LogisticRegression
logres = LogisticRegression()
model = logres.fit(df[['x', 'y']].values, df['Decision'].values)

It is funny that you import logistic regression from scikit-learn's linear_model module. The library itself already tells you that logistic regression is linear. 🙂

Now, I'm going to evaluate the performance of the built logistic regression model on the training set. We expect it to get about 50% accuracy because logistic regression is a linear model.

model.score(df[['x', 'y']].values, df['Decision'].values)

Indeed, it returned roughly 50% accuracy. That's expected!
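To see how undecided the model really is, we can also look at the predicted probabilities; on this kind of data set they tend to hover around 0.5, though the exact values depend on the randomly generated points.

probabilities = model.predict_proba(df[['x', 'y']].values)
print(probabilities[0:5]) # each row holds [P(false), P(true)], both close to 0.5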

Building a decision tree model

What if we build a decision tree model instead? We know that decision trees can handle non-linear data sets, so it should get almost 100% accuracy.

from sklearn.tree import DecisionTreeClassifier
clf = DecisionTreeClassifier()
clf = clf.fit(df[['x', 'y']].values, df['Decision'].values)
clf.score(df[['x', 'y']].values, df['Decision'].values)

It got 100% accuracy!
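If you wonder how the tree pulls this off, scikit-learn can print the learned split rules. The exact thresholds will differ from run to run because the data set is random, but the tree essentially carves the plane along the axes.

from sklearn.tree import export_text
print(export_text(clf, feature_names = ['x', 'y']))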

Why is logistic regression linear?

So, we have shown that linear models reach about the same accuracy as random classifiers on simple non-linear data sets, whereas non-linear models come close to a perfect classifier. Here, logistic regression underperforms on the xor data set, and this confirms that it is a linear model. But why?

Let’s remember the equation of logistic regression.

y = 1 / (1 + e^(-z)), where z = w0 + w1x1 + w2x2 + … + wnxn

The result depends on z, a weighted sum of the inputs. The key word is sum. The model only adds weighted terms together; it never multiplies or divides one weighted input by another. If we could express the result with multiplications (w1x1 * w2x2) or divisions (w1x1 / w2x2) of the terms, the decision boundary could curve and the model would become non-linear. Since z is a pure sum, the decision boundary sits where z = 0, which is the equation of a straight line (a hyperplane in higher dimensions), and the sigmoid only rescales z monotonically without bending that boundary. That is why logistic regression is not a non-linear model. It is totally a linear model.
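We can make this tangible by pulling the learned weights out of the fitted model and drawing the line z = 0 on top of the scatter plot. The coef_ and intercept_ attributes are standard scikit-learn; the rest is just one way to plot the boundary, and it assumes the weight of the y feature is not zero.

import numpy as np
import matplotlib.pyplot as plt

w1, w2 = model.coef_[0] # weights of the x and y features
w0 = model.intercept_[0] # bias term

# the decision boundary is w0 + w1*x + w2*y = 0, i.e. y = -(w0 + w1*x) / w2
xs = np.linspace(df['x'].min(), df['x'].max(), 100)
plt.plot(xs, -(w0 + w1 * xs) / w2, c = 'red')

Wherever this single red line falls, at least one quadrant of the xor pattern ends up on the wrong side, which is exactly why the accuracy stays near 50%.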

The nested structure of neural networks, hidden layers combined with non-linear activation functions, makes them non-linear. Decision trees, on the other hand, are not smooth non-linear functions; they apply a piecewise approximation built from axis-aligned splits. That is how they can handle non-linear problems.
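As a quick sanity check on the neural network claim, a small multi-layer perceptron solves the same data set almost perfectly. This is a minimal sketch with arbitrary hyperparameters; the features are standardized first because that helps the network converge.

from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

features = StandardScaler().fit_transform(df[['x', 'y']].values)
mlp = MLPClassifier(hidden_layer_sizes = (8,), activation = 'tanh', max_iter = 5000, random_state = 0)
mlp = mlp.fit(features, df['Decision'].values)
print(mlp.score(features, df['Decision'].values)) # close to 1.0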

