All posts (134)
Code&Data Insights
Vanishing Gradients - usually occur because some activation functions squash their input into a small range of values, which yields small gradients and, in turn, negligible updates to the weights of the model - Or sometimes the input values are small to begin with : When backpropagating the gradient through long chains of computations, the gradient gets smaller and smaller - causes the gradient of..
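The shrinking effect described above can be sketched numerically. This is a minimal illustration, not the post's own code: it assumes a chain of sigmoid units with every weight fixed at 1 (an assumption made only to isolate the activation's contribution), and the names `sigmoid_grad` and `chained_gradient` are hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x: float) -> float:
    # Derivative of the sigmoid; its maximum is 0.25, reached at x = 0.
    s = sigmoid(x)
    return s * (1.0 - s)


def chained_gradient(pre_activations: list[float]) -> float:
    # Chain rule through a stack of sigmoid units, assuming all weights are 1:
    # the gradient is the product of the local derivatives.
    grad = 1.0
    for z in pre_activations:
        grad *= sigmoid_grad(z)
    return grad


shallow = chained_gradient([0.0] * 2)   # 2 layers
deep = chained_gradient([0.0] * 20)     # 20 layers
print(f"2 layers:  {shallow:.4f}")
print(f"20 layers: {deep:.3e}")
```

Even in the best case for the sigmoid (each local derivative at its maximum of 0.25), the product over 20 layers is many orders of magnitude smaller than over 2 layers.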
Regularization : prevents overfitting and improves the generalization of a model - It introduces additional constraints or penalties into the model training process to discourage the model from becoming too complex. - It aims to strike a balance between fitting the training data well and keeping the model simple Early Stopping : monitor the performance after each epoch on the validation..
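One common way to turn per-epoch validation monitoring into a stopping rule is a patience counter. The sketch below is an assumption about how that could look, since the excerpt is cut off before it states a criterion; `early_stopping_epoch`, `patience`, and the loss values are all hypothetical.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return (best_epoch, best_loss), stopping once the validation loss
    has failed to improve for `patience` consecutive epochs."""
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # stop training; later epochs are never run
    return best_epoch, best_loss


# Hypothetical validation losses, one per epoch.
history = [0.90, 0.70, 0.55, 0.60, 0.58, 0.65, 0.40]
best_epoch, best_loss = early_stopping_epoch(history, patience=2)
print(best_epoch, best_loss)
```

With `patience=2`, training stops after epoch 4, so the lower loss at epoch 6 is never reached. That is the trade-off the patience value controls.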
Benefits of Advanced Optimization Methods - Faster Convergence - Improved Stability - Avoiding Local Minima - Better Generalization Momentum : accumulates an exponentially decaying moving average of the past gradients - The update depends NOT ONLY on the learning rate, but ALSO on past gradients (SGD with Batch) If the previous update v_t is very different from the current gradient => little update If previous up..
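The momentum update can be sketched on a one-dimensional toy problem. This is a minimal sketch, assuming the classical form v ← βv − η·∇f(x), x ← x + v and the objective f(x) = x²; the values of `lr` and `beta` and the function name `momentum_descent` are illustrative assumptions, not taken from the post.

```python
def grad(x: float) -> float:
    # Gradient of the toy objective f(x) = x**2.
    return 2.0 * x


def momentum_descent(x0: float, lr: float = 0.1, beta: float = 0.9, steps: int = 300) -> float:
    x = x0
    v = 0.0  # velocity: exponentially decaying sum of past gradient steps
    for _ in range(steps):
        # The new update mixes the previous update (beta * v)
        # with the current gradient step (-lr * grad).
        v = beta * v - lr * grad(x)
        x = x + v
    return x


x_final = momentum_descent(5.0)
print(x_final)
```

When the previous update and the current gradient step point in opposite directions, the two terms partly cancel and the step is small; when they agree, the step grows.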
Optimization - Training a machine learning model often requires solving an optimization problem => we have to find the parameters of the function f that minimize the loss function using the training data. Problem with Optimization in Multi-dimensional Spaces - TOO MANY CRITICAL POINTS! (critical points: where f'(x) = 0) => local minima, maxima, and saddle points How to Solve the Optimization Problem? Solutio..
Generalization : the ability of a machine learning algorithm (model) to perform well on unseen data Training Loss : loss function, computed on the training set Test Loss : loss function, computed on the test set => More training data leads to better generalization! Capacity : Underfitting and Overfitting are connected to the capacity of the model. capacity (= representational capacity) : attempts to quantify ho..
: the process of identifying and connecting records or data entries that correspond to the same real-world entity or individual in one or more data sources. - Improves data quality and integrity - Fosters re-use of existing data sources - Optimizes space [ Atomic String Similarity ] Why is Atomic String Similarity important? - Information Retrieval : similarity of strings - Da..
[ Adversarial Search ] : a strategy used in artificial intelligence for decision-making in competitive scenarios, such as games (ex: chess, checkers) and various real-world domains like business strategy, auctions, and negotiations. It involves representing the problem as a game tree and using the Minimax algorithm, evaluation functions, and techniques like Alpha-Beta Pruning to f..
: we try to identify which nodes seem most promising, and explore these first - Heuristics (useful rules or empirical knowledge) or additional information are used to find the most efficient path. - A way to visit the most promising nodes first [ Heuristic ] - A technique that improves the efficiency of search - Focus on nodes that seem most optimistic acc..
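Visiting the most promising nodes first can be sketched with a priority queue ordered by the heuristic value. This is a minimal greedy best-first sketch, not the post's own code; the graph, the heuristic table, and the name `greedy_best_first` are made-up assumptions for illustration.

```python
import heapq


def greedy_best_first(graph, heuristic, start, goal):
    # Frontier entries are (heuristic value, node, path so far);
    # the heap always pops the node that currently looks closest to the goal.
    frontier = [(heuristic[start], start, [start])]
    visited = set()
    while frontier:
        _, node, path = heapq.heappop(frontier)
        if node == goal:
            return path
        if node in visited:
            continue
        visited.add(node)
        for neighbor in graph.get(node, []):
            if neighbor not in visited:
                heapq.heappush(
                    frontier, (heuristic[neighbor], neighbor, path + [neighbor])
                )
    return None  # goal not reachable from start


# Hypothetical graph and heuristic estimates of the distance to G.
graph = {"S": ["A", "B"], "A": ["C"], "B": ["G"], "C": ["G"], "G": []}
heuristic = {"S": 5, "A": 4, "B": 2, "C": 3, "G": 0}
path = greedy_best_first(graph, heuristic, "S", "G")
print(path)
```

The search expands B before A because B has the lower heuristic value, so A and C are never expanded in this example.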