Evaluation | Classification Report - Accuracy, Precision, Recall, F1 Score


Code&Data Insights

Artificial Intelligence/Machine Learning

paka_corn 2023. 7. 19. 10:32

When predicting with a classifier, the results can be laid out in a confusion matrix made of TN, FN, TP, and FP.

TN - actual label Negative (0), predicted Negative (0) => True (correct)!!!

TP - actual label Positive (1), predicted Positive (1) => True (correct)!!!

FP - actual label Negative (0), predicted Positive (1) => False (wrong)

FN - actual label Positive (1), predicted Negative (0) => False (wrong)
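The four cases above can be counted by hand in a few lines. This is only a minimal sketch: `confusion_counts` is a hypothetical helper and the labels are made-up toy data, not results from a real model. In scikit-learn the same counts should come from `sklearn.metrics.confusion_matrix`.

```python
def confusion_counts(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary labels where 1 = Positive, 0 = Negative."""
    tn = fp = fn = tp = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual == 1 and predicted == 1:
            tp += 1  # Positive predicted as Positive
        elif actual == 0 and predicted == 0:
            tn += 1  # Negative predicted as Negative
        elif actual == 0 and predicted == 1:
            fp += 1  # Negative wrongly predicted as Positive
        else:
            fn += 1  # Positive wrongly predicted as Negative
    return tn, fp, fn, tp


# Toy labels (assumed for illustration only)
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 0, 0]

tn, fp, fn, tp = confusion_counts(y_true, y_pred)
print(f"TN={tn} FP={fp} FN={fn} TP={tp}")
```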

Accuracy - the fraction of all samples where the prediction matches the actual value

Formula: (TN + TP) / (TN + FN + TP + FP)
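The accuracy formula as code, as a rough sketch. `accuracy` is a hypothetical helper and the counts are toy numbers chosen for illustration.

```python
def accuracy(tn, fp, fn, tp):
    """Accuracy = (TN + TP) / (TN + FN + TP + FP); 0.0 if there are no samples."""
    total = tn + fp + fn + tp
    return (tn + tp) / total if total else 0.0


# Toy counts (assumed): 5 TN, 1 FP, 1 FN, 3 TP
print(accuracy(tn=5, fp=1, fn=1, tp=3))
```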

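Precision - of the samples predicted as Positive, how many are actually Positive?

- The error it penalizes: data that is actually Negative but is predicted as Positive (FP)

Formula: TP / (TP + FP), where FP = predicted Positive, actually Negative

- More important when: spam mail filtering (a normal mail flagged as spam is the costly mistake)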
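A small sketch of the precision formula using the spam case. The helper name `precision` and the mail counts are assumptions made up for this example.

```python
def precision(tp, fp):
    """Precision = TP / (TP + FP); 0.0 when nothing was predicted Positive."""
    predicted_positive = tp + fp
    return tp / predicted_positive if predicted_positive else 0.0


# Hypothetical spam filter: 40 mails flagged as spam, 30 of them really spam.
tp, fp = 30, 10
print(precision(tp, fp))
```

Every FP here is a normal mail that landed in the spam folder, which is why precision is the number to watch for this kind of task.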

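Recall - of the samples that are actually Positive, how many are predicted as Positive?

- The error it penalizes: data that is actually Positive but is predicted as Negative (FN)

Formula: TP / (TP + FN), where FN = actually Positive, predicted Negative

- More important when: cancer diagnosis, financial fraud detection (missing a real Positive is the costly mistake)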
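The recall formula, sketched the same way. `recall` is a hypothetical helper and the patient counts are invented for illustration.

```python
def recall(tp, fn):
    """Recall = TP / (TP + FN); 0.0 when there are no actual Positives."""
    actual_positive = tp + fn
    return tp / actual_positive if actual_positive else 0.0


# Hypothetical screening: 50 patients actually Positive, the model catches 45.
tp, fn = 45, 5
print(recall(tp, fn))
```

Every FN here is a real case the model missed, so for diagnosis or fraud detection recall is usually the number to protect.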

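Precision and recall are in a trade-off: pushing one up usually pulls the other down, so both cannot be maximized at the same time!!!

Depending on the task, decide which of precision or recall should be emphasized and tune for it => (adjust the threshold)

The lower the threshold, the more samples get predicted as Positive -> recall goes up (and precision usually goes down)

=> scikit-learn Estimator predict_proba(): returns the predicted probability of each class, which can then be compared against the threshold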
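To see the threshold effect, here is a sketch that applies different thresholds to Positive-class probabilities. The probabilities are hard-coded toy values standing in for what something like `model.predict_proba(X)[:, 1]` would return from a fitted scikit-learn classifier. No real model is trained here, and `precision_recall_at` is a hypothetical helper.

```python
# Toy data (assumed): actual labels and Positive-class probabilities
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
proba_pos = [0.95, 0.80, 0.55, 0.30, 0.60, 0.40, 0.35, 0.20, 0.10, 0.05]


def precision_recall_at(threshold, y_true, proba_pos):
    """Predict Positive when probability >= threshold, then score the result."""
    y_pred = [1 if p >= threshold else 0 for p in proba_pos]
    tp = sum(1 for a, p in zip(y_true, y_pred) if a == 1 and p == 1)
    fp = sum(1 for a, p in zip(y_true, y_pred) if a == 0 and p == 1)
    fn = sum(1 for a, p in zip(y_true, y_pred) if a == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Lowering the threshold: recall rises while precision falls on this toy data
for threshold in (0.7, 0.5, 0.25):
    p, r = precision_recall_at(threshold, y_true, proba_pos)
    print(f"threshold={threshold:.2f} precision={p:.2f} recall={r:.2f}")
```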

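That is why the F1 score exists!! It is the harmonic mean of precision and recall, 2 * (Precision * Recall) / (Precision + Recall), so it is only high when both are high.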
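A quick sketch of F1 as the harmonic mean of precision and recall, 2 * (Precision * Recall) / (Precision + Recall). `f1` is a hypothetical helper and the input values are toy numbers. It shows that a lopsided pair scores lower than a balanced pair with the same arithmetic average.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


balanced = f1(0.75, 0.75)  # both metrics equal
lopsided = f1(1.0, 0.5)    # same arithmetic mean (0.75), but unbalanced
print(f"balanced={balanced:.3f} lopsided={lopsided:.3f}")
```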

ROC Curve

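As the False Positive Rate (FPR = FP / (FP + TN)) changes, how does the True Positive Rate (TPR = TP / (TP + FN), i.e. recall) change? Each threshold gives one point on the curve.

-> The area under the curve (AUC): the closer to 1, the better!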
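A hand-rolled sketch of the ROC curve and its area (AUC), to make the idea concrete. `roc_points` and `auc` are hypothetical helpers and the scores are toy values. It assumes both classes are present in `y_true`. In practice scikit-learn's `roc_curve` and `roc_auc_score` do this job.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) points, one per distinct threshold, starting at (0, 0)."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    points = [(0.0, 0.0)]
    # Sweep thresholds from strict to loose; each one adds a point to the curve
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for a, s in zip(y_true, scores) if s >= t and a == 1)
        fp = sum(1 for a, s in zip(y_true, scores) if s >= t and a == 0)
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Toy data (assumed): actual labels and Positive-class scores
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.95, 0.80, 0.55, 0.30, 0.60, 0.40, 0.35, 0.20, 0.10, 0.05]

points = roc_points(y_true, scores)
print(f"AUC={auc(points):.3f}")
```

An AUC of 0.5 corresponds to random guessing and 1.0 to a perfect ranking of Positives above Negatives.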
