Binary classification with sigmoid curve, odds ratios, confusion matrix, and Python sklearn code.
In the data below, the last column is the binary label and must be 0 or 1.
| Feature | Coefficient (β) | Odds Ratio (e^β) |
|---|---|---|
| Intercept | -5.6765 | - |
| Hours | 1.3224 | 3.7523 |
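The odds ratio column follows directly from the coefficient column; a quick check of the Hours row:

```python
import numpy as np

beta_hours = 1.3224           # fitted coefficient from the table
odds_ratio = np.exp(beta_hours)
# Each extra hour multiplies the odds of y=1 by about 3.75
print(round(odds_ratio, 4))
```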
```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# Data - each row: [Hours, label]
data = np.array([
    [1, 0],
    [2, 0],
    [3, 0],
    [4, 0],
    [5, 1],
    [6, 1],
    [7, 1],
    [8, 1],
])
X = data[:, :1]  # keep a 2-D shape (n_samples, 1) for sklearn
y = data[:, 1]

# Fit logistic regression
model = LogisticRegression()
model.fit(X, y)

# Predictions and probabilities
y_pred = model.predict(X)
y_prob = model.predict_proba(X)[:, 1]

print(f"Accuracy: {accuracy_score(y, y_pred):.4f}")
print(f"Coefficients: {dict(zip(['Hours'], model.coef_[0]))}")
print(f"Intercept: {model.intercept_[0]:.4f}")
print(f"Odds Ratios: {dict(zip(['Hours'], np.exp(model.coef_[0])))}")
print("\nClassification Report:")
print(classification_report(y, y_pred))
print("Confusion Matrix:")
print(confusion_matrix(y, y_pred))

# Plot the fitted sigmoid curve with the data points
x_range = np.linspace(X.min() - 1, X.max() + 1, 200)
z = model.intercept_[0] + model.coef_[0][0] * x_range
prob = 1 / (1 + np.exp(-z))

plt.figure(figsize=(8, 5))
plt.plot(x_range, prob, 'b-', lw=2, label='Sigmoid')
plt.scatter(X[y == 0], np.zeros(np.sum(y == 0)), c='red', s=50, label='Class 0', zorder=5)
plt.scatter(X[y == 1], np.ones(np.sum(y == 1)), c='green', s=50, label='Class 1', zorder=5)
plt.axhline(y=0.5, color='orange', linestyle='--', alpha=0.5, label='Decision boundary')
plt.xlabel("Hours")
plt.ylabel("P(y=1)")
plt.legend()
plt.title("Logistic Regression")
plt.show()
```

Logistic regression is a classification algorithm that models the probability of a binary outcome (0 or 1) using the sigmoid function. Despite its name, it is used for classification, not regression. It finds coefficients that maximize the likelihood of the observed data.
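The sigmoid and the likelihood being maximized can be written out directly. A minimal NumPy sketch, using the fitted intercept and coefficient from the table above as illustrative values:

```python
import numpy as np

def sigmoid(z):
    # Maps any real z to a probability in (0, 1)
    return 1 / (1 + np.exp(-z))

# Illustrative values: intercept and Hours coefficient from the fit above
b0, b1 = -5.6765, 1.3224
hours = np.array([1, 2, 3, 4, 5, 6, 7, 8])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

p = sigmoid(b0 + b1 * hours)  # P(y=1 | hours)

# Log-likelihood the fit maximizes: sum of log P(observed label)
log_lik = np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))
print(p.round(3), log_lik)
```

Fitting adjusts `b0` and `b1` to make `log_lik` as large (least negative) as possible.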
Each coefficient βⱼ represents the change in log-odds for a one-unit increase in xⱼ. The odds ratio e^βⱼ is more intuitive: if βⱼ = 0.5, then e^0.5 ≈ 1.65, meaning the odds of y=1 increase by 65% for each unit increase in xⱼ.
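The odds-scaling interpretation can be made concrete with a small sketch (the 2:1 starting odds are an arbitrary illustrative value):

```python
import numpy as np

beta = 0.5
odds_ratio = np.exp(beta)            # about 1.65

# If the odds of y=1 are 2:1 at some x, a one-unit increase in x
# scales the odds by e^beta, not the probability
odds_before = 2.0
odds_after = odds_before * odds_ratio
print(round(odds_ratio, 4), round(odds_after, 4))
```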
The decision boundary is where P(y=1) = 0.5, which corresponds to z = β₀ + β₁x₁ + … = 0. Points above the boundary are classified as 1, below as 0. For a single feature: x = −β₀/β₁.
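With the fitted values from the table above (taken here as illustrative constants), the single-feature boundary lands between the two classes:

```python
import numpy as np

b0, b1 = -5.6765, 1.3224   # fitted intercept and Hours coefficient
x_boundary = -b0 / b1      # where z = 0, so P(y=1) = 0.5
print(round(x_boundary, 4))  # about 4.29 hours

# Verify: the sigmoid at the boundary is exactly 0.5
p = 1 / (1 + np.exp(-(b0 + b1 * x_boundary)))
print(p)
```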
Use logistic regression when you need interpretable coefficients and odds ratios, the relationship is approximately linear in log-odds, or you have limited training data. For complex non-linear boundaries, consider decision trees, SVM, or neural networks.
Log loss measures how well predicted probabilities match actual labels. It penalizes confident wrong predictions heavily: predicting p=0.01 when the true label is 1 incurs a much larger penalty than predicting p=0.4. Lower log loss = better calibrated probabilities.
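The asymmetry described above is easy to see numerically; a sketch using the per-example log-loss formula:

```python
import numpy as np

def per_example_log_loss(y_true, p):
    # -log(probability assigned to the true label)
    return -(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

# True label is 1 in both cases
confident_wrong = per_example_log_loss(1, 0.01)  # large penalty
mildly_wrong = per_example_log_loss(1, 0.40)     # much smaller penalty
print(round(confident_wrong, 3), round(mildly_wrong, 3))
```

The confidently wrong prediction is penalized several times more heavily than the hedged one.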