Keep in mind that I'm using...
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(data, target, random_state=42)
from sklearn.preprocessing import StandardScaler
ss = StandardScaler()
ss.fit(x_train)
x_train_scaled = ss.transform(x_train)
x_test_scaled = ss.transform(x_test)
Binary Logistic Regression
Sigmoid Function
- Φ = 1 / (1 + e^(-z))
import numpy as np
from matplotlib import pyplot as plt
z = np.arange(-5, 5, 0.1)
phi = 1 / (1 + np.exp(-z))
plt.plot(z, phi)
plt.show()
Preparing data with two classes
bream_smelt_indices = (y_train == 'Bream') | (y_train == 'Smelt')
y_train.shape, bream_smelt_indices.shape
>> ((119,), (119,))
bs_x = x_train_scaled[bream_smelt_indices]
bs_y = y_train[bream_smelt_indices]
x_train_scaled.shape, bs_x.shape
>> ((119, 5), (33, 5))
sklearn.linear_model.LogisticRegression
from sklearn.linear_model import LogisticRegression
lr = LogisticRegression()
lr.fit(bs_x, bs_y)
lr.predict(bs_x[:5])
>> array(['Bream', 'Smelt', 'Bream', 'Bream', 'Bream'], dtype=object)
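For the binary case, the probabilities from predict_proba should match the sigmoid of the z values returned by decision_function. A minimal sketch, assuming a synthetic two-class dataset from make_classification as a stand-in for bs_x / bs_y (the fish data is not reproduced here); the names X_bin, y_bin, and lr_bin are hypothetical:

```python
import numpy as np
from scipy.special import expit  # expit(z) = 1 / (1 + e^(-z)), i.e. the sigmoid
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for the two-class data: 33 samples, 5 features (assumption)
X_bin, y_bin = make_classification(
    n_samples=33, n_features=5, n_informative=3, n_redundant=0, random_state=42
)

lr_bin = LogisticRegression()
lr_bin.fit(X_bin, y_bin)

# z from the model, and the same z computed by hand: coef_ . x + intercept_
z_model = lr_bin.decision_function(X_bin[:5])
z_manual = X_bin[:5] @ lr_bin.coef_[0] + lr_bin.intercept_[0]

# Column 1 of predict_proba is the probability of lr_bin.classes_[1]
proba_bin = lr_bin.predict_proba(X_bin[:5])
print(np.round(proba_bin[:, 1], 3))
print(np.round(expit(z_model), 3))
```

The two printed rows should agree, which shows that the model only learns z and the sigmoid turns it into a probability.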
Multiclass Logistic Regression
C is the regularization hyperparameter (inverse of regularization strength)
- default is 1.0
- bigger C => less regularization => higher risk of overfitting
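One way to see the effect of C is to compare the size of the learned coefficients at several values. A rough sketch, assuming synthetic data from make_classification and the default L2 penalty; the names X_c, y_c, and norms are hypothetical and the values of C are picked only for illustration:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic two-class data (assumption, not the fish dataset)
X_c, y_c = make_classification(
    n_samples=200, n_features=5, n_informative=3, n_redundant=0, random_state=0
)

norms = {}
for C in (0.01, 1, 100):
    model = LogisticRegression(C=C, max_iter=1000)
    model.fit(X_c, y_c)
    # Weaker regularization (bigger C) lets the coefficients grow larger
    norms[C] = np.linalg.norm(model.coef_)
    print(C, round(norms[C], 3), round(model.score(X_c, y_c), 3))
```

Larger coefficients mean the model follows the training data more closely, which is where the overfitting risk comes from.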
uses softmax function (instead of sigmoid function)
- e_sum = e^z1 + e^z2 + ... + e^zn
- si = e^zi / e_sum for each class i (= probability of class i)
- this makes: s1 + s2 + ... + sn == 1
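The softmax steps above can be written out in a few lines of NumPy. This is only a sketch: softmax_manual and the z values are hypothetical, and subtracting the maximum is a common numerical-stability trick that is not part of the formula in the notes.

```python
import numpy as np

def softmax_manual(z):
    # Subtracting the max leaves the result unchanged but avoids overflow in exp
    e = np.exp(z - np.max(z))
    # Divide each e^zi by e_sum
    return e / e.sum()

z_values = np.array([2.0, 1.0, 0.1])  # hypothetical z values for three classes
s = softmax_manual(z_values)
print(np.round(s, 3))
print(s.sum())
```

The largest z gets the largest probability, and the outputs sum to 1.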
...LogisticRegression
lr = LogisticRegression(C=20, max_iter=1000)
lr.fit(x_train_scaled, y_train)
lr.score(x_train_scaled, y_train), lr.score(x_test_scaled, y_test)
>> (0.9327731092436975, 0.925)
- comparing predict_proba output with probabilities computed manually from the z values (decision_function + softmax)
proba = lr.predict_proba(x_test_scaled[:5])
from scipy.special import softmax
decision = lr.decision_function(x_test_scaled[:5])
proba2 = softmax(decision, axis=1)
np.round(proba, decimals=3), np.round(proba2, decimals=3)
>> (array([[0. , 0.014, 0.841, 0. , 0.136, 0.007, 0.003],
[0. , 0.003, 0.044, 0. , 0.007, 0.946, 0. ],
[0. , 0. , 0.034, 0.935, 0.015, 0.016, 0. ],
[0.011, 0.034, 0.306, 0.007, 0.567, 0. , 0.076],
[0. , 0. , 0.904, 0.002, 0.089, 0.002, 0.001]]),
array([[0. , 0.014, 0.841, 0. , 0.136, 0.007, 0.003],
[0. , 0.003, 0.044, 0. , 0.007, 0.946, 0. ],
[0. , 0. , 0.034, 0.935, 0.015, 0.016, 0. ],
[0.011, 0.034, 0.306, 0.007, 0.567, 0. , 0.076],
[0. , 0. , 0.904, 0.002, 0.089, 0.002, 0.001]]))
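To read a probability table like this one, it helps to know that the columns follow the order of classes_, and that predict picks the class with the highest probability. A small sketch, assuming the iris dataset from sklearn.datasets as a stand-in for the fish data; the names X_iris, clf, and picked are hypothetical:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Stand-in multiclass data (assumption): 3 classes, 4 features
X_iris, y_iris = load_iris(return_X_y=True)
X_iris_scaled = StandardScaler().fit_transform(X_iris)

clf = LogisticRegression(C=20, max_iter=1000)
clf.fit(X_iris_scaled, y_iris)

proba_iris = clf.predict_proba(X_iris_scaled[:5])
# Column j of predict_proba is the probability of clf.classes_[j]
picked = clf.classes_[np.argmax(proba_iris, axis=1)]
print(clf.classes_)
print(picked, clf.predict(X_iris_scaled[:5]))
```

The class with the highest probability in each row should be the same one that predict returns.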