[Machine Learning] Hyperparameters | Feature Importance | Gini Impurity | Mean Decrease in Impurity
paka_corn · 2023. 7. 21. 07:06

Hyperparameters
: configuration values set before training to control and tune the behavior of a machine learning algorithm; unlike model parameters, they are not learned from the data. They have a significant impact on the model's training process and performance.
=> Properly setting hyperparameters can optimize the model's performance and prevent overfitting.
=> Benefits: model performance optimization, preventing overfitting, saving training time and resources, understanding and interpreting algorithms, and enhancing model generalization.
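As a minimal sketch of the idea (not from the original post), here is a tiny pure-Python k-nearest-neighbors predictor. The number of neighbors `k` is a hyperparameter: it is chosen before prediction rather than learned from the data, and changing it changes the model's behavior. The function and data names are illustrative assumptions.

```python
def knn_predict(train, query_x, k):
    """Predict a label by majority vote among the k nearest training points.
    k is a hyperparameter: chosen up front, it controls the model's behavior
    and is not learned from the data."""
    neighbors = sorted(train, key=lambda xy: abs(xy[0] - query_x))[:k]
    labels = [y for _, y in neighbors]
    return max(set(labels), key=labels.count)

# Toy 1-D dataset: two "A" points near 1.0, three "B" points near 3.0.
train = [(1.0, "A"), (1.2, "A"), (3.0, "B"), (2.9, "B"), (3.1, "B")]

# A small k fits local structure; a large k smooths toward the majority class.
print(knn_predict(train, 1.9, 1))  # "A" (single nearest point is 1.2 -> A)
print(knn_predict(train, 1.9, 5))  # "B" (majority of all five points is B)
```

The same query point gets different predictions depending on `k` alone, which is why tuning such values (e.g. via a validation set) matters for both performance and overfitting.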
Feature Importance
: a technique for quantifying how much each feature (input variable) contributes to a model's decisions and predictions.
=> It tells us which features have the most significant impact on the model's predictions. Identifying important features helps in several ways, including feature selection, feature engineering, and model explanation.
Methods to compute feature importance
1) Gini Impurity (for classification)
Gini Impurity?
: a metric used during decision-tree construction in classification problems to measure the impurity of a node: Gini = 1 - Σ p_k², where p_k is the proportion of class k in the node. Impurity refers to how mixed the class labels in a node are; a lower Gini impurity means a purer node. At each node, the decision tree algorithm chooses the split that most reduces the impurity.
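The definition above can be sketched directly in a few lines of pure Python (the function name is my own, not from the post):

```python
from collections import Counter

def gini_impurity(labels):
    """Gini impurity of a node: 1 - sum(p_k^2) over the classes k,
    where p_k is the fraction of samples in the node with class k."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

# A pure node has impurity 0; a 50/50 two-class node has impurity 0.5,
# the maximum for two classes.
print(gini_impurity(["A", "A", "A", "A"]))  # 0.0
print(gini_impurity(["A", "A", "B", "B"]))  # 0.5
```

A split is evaluated by comparing the parent's impurity against the size-weighted impurities of the children; the split with the largest decrease wins.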
2) Mean Decrease in Impurity (for regression)
Mean Decrease in Impurity?
: measures a feature's importance as the average reduction in impurity (for regression trees, the mean squared error, MSE) across all the nodes where that feature is used for splitting. A higher Mean Decrease in Impurity score indicates that the feature has a larger impact on the model's predictive performance.
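To make the regression case concrete, here is a small pure-Python sketch (names are my own) of the quantity that gets averaged: the weighted MSE reduction produced by one split. MDI then averages these decreases over every node (and, in a forest, every tree) where a given feature is used.

```python
def mse(ys):
    """MSE of a node's targets around the node mean (i.e. their variance),
    the standard impurity measure for regression trees."""
    m = sum(ys) / len(ys)
    return sum((y - m) ** 2 for y in ys) / len(ys)

def impurity_decrease(parent, left, right):
    """Weighted decrease in MSE achieved by splitting a node's targets
    into left/right children. MDI is the average of such decreases over
    all nodes where a feature is used for splitting."""
    n = len(parent)
    return mse(parent) - (len(left) / n) * mse(left) - (len(right) / n) * mse(right)

parent = [1.0, 1.0, 5.0, 5.0]
# A split that perfectly separates the target values removes all impurity:
print(impurity_decrease(parent, [1.0, 1.0], [5.0, 5.0]))  # 4.0
```

A feature whose splits consistently yield large decreases like this accumulates a high MDI score; a feature that rarely reduces the MSE scores near zero.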