
[Machine Learning] Linear Model - Linear Regression | Logistic Regression | Multiclass Logistic Regression | Linear Basis Function Models

paka_corn 2023. 6. 14. 07:53

Linear Model 

- Easy to optimize; fast training and prediction

- Good interpretability

- Only suitable for linearly separable classes

 

=> The capacity of the linear model depends on the input dimensionality D.

=> VC dimension: D + 1 for logistic regression

VC dimension?

: a measure of the capacity or complexity of a hypothesis space, i.e., the largest number of points the model can shatter

 

 

Linear Regression 
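In its standard form, the model is a linear function of the inputs:

$$ \hat{y} = \mathbf{w}^\top \mathbf{x} + b $$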

 

 

- The objective is convex in the parameters

- Objective function: MSE (mean squared error)

- A closed-form solution exists

=> We can use the analytical solution without using gradient descent!

 

· Analytical Solution

 

=> Direct Solution
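With design matrix $\mathbf{X}$ and target vector $\mathbf{y}$, setting the gradient of the MSE to zero gives the standard normal-equation solution:

$$ \mathbf{w}^* = (\mathbf{X}^\top \mathbf{X})^{-1} \mathbf{X}^\top \mathbf{y} $$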

 
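A minimal NumPy sketch of this closed-form fit (illustrative code with made-up data, not from the original post):

```python
import numpy as np

def fit_linear_regression(X, y):
    """Closed-form (normal equation) solution for linear regression."""
    # Append a column of ones so the bias b is learned as an extra weight.
    X_b = np.hstack([X, np.ones((X.shape[0], 1))])
    # Solve (X^T X) w = X^T y; solving is more stable than an explicit inverse.
    return np.linalg.solve(X_b.T @ X_b, X_b.T @ y)

# Usage: recover y = 2x + 1 from noisy samples.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(100, 1))
y = 2 * X[:, 0] + 1 + 0.1 * rng.normal(size=100)
print(fit_linear_regression(X, y))  # approximately [2.0, 1.0]
```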

 

Logistic Regression 

 

- Logistic Regression = Binary Classification

 

· Why do we use a squashing function?

- To make sure the output is bounded between 0 and 1 (a probability)

=> Use Sigmoid (Logistic Function)
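The sigmoid is defined as:

$$ \sigma(x) = \frac{1}{1 + e^{-x}} $$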

 

=> Derivative of the sigmoid:
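The derivative has a convenient form in terms of the sigmoid itself, which makes gradient computations cheap:

$$ \sigma'(x) = \sigma(x)\,(1 - \sigma(x)) $$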

 

 

- Draws a linear decision boundary between the two classes

 

- The decision boundary is the set of points where there is maximum uncertainty

=> Maximum uncertainty: $p(y = 1 \mid \mathbf{x}) = 0.5$, i.e., the points where $\mathbf{w}^\top \mathbf{x} + b = 0$

 

- In logistic regression, a direct closed-form solution does not exist!

-> Train logistic regression with gradient descent:

-> Define the objective function

-> Compute the gradient of the loss function

 

- Objective function: Binary Cross-Entropy (BCE)
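For predictions $\hat{y}_i = \sigma(\mathbf{w}^\top \mathbf{x}_i + b)$ and labels $y_i \in \{0, 1\}$, the binary cross-entropy is:

$$ \mathcal{L} = -\frac{1}{N} \sum_{i=1}^{N} \left[\, y_i \log \hat{y}_i + (1 - y_i) \log(1 - \hat{y}_i) \,\right] $$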

 
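A minimal gradient-descent training sketch (illustrative code, not from the original post; it uses the fact that the gradient of the BCE with respect to the logits is simply $\hat{y} - y$):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_logistic_regression(X, y, lr=0.1, n_steps=1000):
    """Train binary logistic regression by gradient descent on BCE."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(n_steps):
        y_hat = sigmoid(X @ w + b)      # predicted probabilities
        error = y_hat - y               # dL/dz for the BCE loss
        w -= lr * (X.T @ error) / n     # gradient step for the weights
        b -= lr * error.mean()          # gradient step for the bias
    return w, b
```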

Multiclass Logistic Regression 

: a machine learning model that outputs K probabilities, one for each class.

 

- Linear Transformation + SoftMax

=> SoftMax – squashing function
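For logits $\mathbf{z} = \mathbf{W}\mathbf{x} + \mathbf{b}$, the softmax is defined as:

$$ \text{softmax}(\mathbf{z})_k = \frac{e^{z_k}}{\sum_{j=1}^{K} e^{z_j}} $$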

 

· Why do we use a squashing function?

- To make sure each output is bounded between 0 and 1 (a probability)

- The sum of all K probabilities is 1

- The softmax introduces competition among the output units: increasing one probability decreases the others

-> Argmax: select the class with the highest probability

 

 

- To train multiclass logistic regression:

1) Define an objective (loss) function

=> Categorical Cross-Entropy (CCE) = NLL (negative log-likelihood)
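With a one-hot label $\mathbf{y}$ and predicted probabilities $\hat{\mathbf{y}} = \text{softmax}(\mathbf{z})$, the categorical cross-entropy for one example is:

$$ \mathcal{L} = -\sum_{k=1}^{K} y_k \log \hat{y}_k $$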

 

2) Compute the gradient of the loss to update the parameters

=> Derivative of CCE
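For the softmax + CCE combination, the gradient with respect to the logits takes a very simple form:

$$ \frac{\partial \mathcal{L}}{\partial \mathbf{z}} = \hat{\mathbf{y}} - \mathbf{y} $$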

 
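A minimal softmax-regression training sketch (illustrative code, not from the original post; it applies the $\hat{\mathbf{y}} - \mathbf{y}$ gradient above):

```python
import numpy as np

def softmax(Z):
    Z = Z - Z.max(axis=1, keepdims=True)  # subtract row max for stability
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)

def train_softmax_regression(X, Y, lr=0.1, n_steps=1000):
    """Multiclass logistic regression; X: (n, d), Y: (n, K) one-hot."""
    n, d = X.shape
    K = Y.shape[1]
    W, b = np.zeros((d, K)), np.zeros(K)
    for _ in range(n_steps):
        Y_hat = softmax(X @ W + b)
        G = (Y_hat - Y) / n               # dL/dz for the CCE loss
        W -= lr * (X.T @ G)
        b -= lr * G.sum(axis=0)
    return W, b
```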

 

 

- Optimization space for logistic regression & multiclass logistic regression

=> The objective is convex in the parameters, so gradient descent can reach the global optimum

 

 

 

Linear Basis Function Models

: transform the input features to make the classes linearly separable
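In its standard form, the model applies the usual linear model on top of a fixed feature transformation $\boldsymbol{\phi}$:

$$ \hat{y} = \mathbf{w}^\top \boldsymbol{\phi}(\mathbf{x}) $$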

 
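A tiny hand-engineered example (illustrative, not from the original post): points inside vs. outside a circle are not linearly separable in the raw features, but adding squared features makes the circular boundary a hyperplane in $\boldsymbol{\phi}$-space.

```python
import numpy as np

def phi(X):
    """Hand-engineered basis functions: append squared features."""
    return np.hstack([X, X**2])

# Labels: class 1 iff x1^2 + x2^2 < 1 (a circle, not linearly separable
# in raw space). In phi-space, that boundary is a linear hyperplane.
rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(200, 2))
y = (X**2).sum(axis=1) < 1.0
print(phi(X).shape)  # (200, 4)
```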

BUT, engineering this transformation by hand is NOT an easy task!

=> a challenging task that requires a lot of domain-specific human knowledge

 

 

--> Why not learn this transformation from the data?

==> Neural networks (fully data-driven)
