All Contents (170)
Code&Data Insights
Optimization - Training a machine learning model often requires solving an optimization problem: we have to find the parameters of the function f that minimize the loss function on the training data. Problem with optimization in multi-dimensional spaces - too many critical points! (Critical points are points where f'(x) = 0, i.e. where the gradient vanishes) => local minima, local maxima, and saddle points. How to solve the optimization problem? Solutio..
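To make the three kinds of critical point concrete, here is a minimal sketch for a function of two variables. It assumes the second derivatives at the critical point are already known; the helper name `classify_critical_point` and the example functions are illustrative and not taken from the post. It classifies the point from the signs of the Hessian eigenvalues.

```python
import math

def classify_critical_point(fxx, fxy, fyy, tol=1e-12):
    """Classify a critical point of f(x, y) from its second derivatives there."""
    # Eigenvalues of the symmetric 2x2 Hessian [[fxx, fxy], [fxy, fyy]].
    mean = (fxx + fyy) / 2.0
    radius = math.hypot((fxx - fyy) / 2.0, fxy)
    lo, hi = mean - radius, mean + radius
    if lo > tol:
        return "local minimum"   # curves upward in every direction
    if hi < -tol:
        return "local maximum"   # curves downward in every direction
    if lo < -tol and hi > tol:
        return "saddle point"    # upward in one direction, downward in another
    return "inconclusive"        # a zero eigenvalue: second-order test fails

# All three example functions have zero gradient at (0, 0).
print(classify_critical_point(2, 0, 2))    # f = x^2 + y^2
print(classify_critical_point(-2, 0, -2))  # f = -x^2 - y^2
print(classify_critical_point(2, 0, -2))   # f = x^2 - y^2
```

A saddle needs only one pair of eigenvalues with opposite signs, which becomes more likely as the number of dimensions grows.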
Generalization: the ability of a machine learning algorithm (model) to perform well on unseen data. Training loss: the loss function computed on the training set. Test loss: the loss function computed on the test set. => A larger training set generally leads to better generalization! Capacity: underfitting and overfitting are connected to the capacity of the model. Capacity (= representational capacity): attempts to quantify ho..
: the process of identifying and connecting records or data entries that correspond to the same real-world entity or individual in one or more data sources. - Improves data quality and integrity - Fosters reuse of existing data sources - Optimizes storage space [ Atomic String Similarity ] Why is atomic string similarity important? - Information retrieval: similarity of strings - Da..
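The excerpt is cut off before it names a specific similarity measure, so the following is only an assumed example: edit (Levenshtein) distance is one common way to score how close two strings are. The function names and the normalization to a 0..1 score are illustrative choices, not the post's.

```python
def edit_distance(a: str, b: str) -> int:
    """Minimum number of single-character inserts, deletes, or substitutions."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]

def similarity(a: str, b: str) -> float:
    """Assumed normalization: 1.0 for identical strings, 0.0 for no overlap."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest

print(edit_distance("kitten", "sitting"))
print(similarity("Jon Smith", "John Smith"))
```

In a record-matching setting, two entries could be flagged as candidate matches when the score exceeds a chosen threshold.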
[ Adversarial Search ] : a strategy used in artificial intelligence for decision-making in competitive scenarios, such as games (e.g. chess, checkers) and various real-world domains like business strategy, auctions, and negotiations. It involves representing the problem as a game tree and using the Minimax algorithm, evaluation functions, and techniques like Alpha-Beta Pruning to f..
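As a rough illustration of Minimax with Alpha-Beta Pruning, the sketch below searches a tiny hand-made game tree. The tree encoding (nested lists, with numbers as leaf evaluation scores) is an assumption made for this example only.

```python
import math

def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf):
    """Minimax value of `node`, skipping branches that cannot change the result."""
    # Leaves are numbers (evaluation scores); inner nodes are lists of children.
    if not isinstance(node, list):
        return node
    if maximizing:
        best = -math.inf
        for child in node:
            best = max(best, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, best)
            if alpha >= beta:  # the minimizer above will never allow this branch
                break
        return best
    best = math.inf
    for child in node:
        best = min(best, alphabeta(child, True, alpha, beta))
        beta = min(beta, best)
        if alpha >= beta:      # the maximizer above already has a better option
            break
    return best

# Hypothetical two-ply game: MAX moves first, then MIN picks a leaf.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(alphabeta(tree, maximizing=True))
```

In the second subtree the search stops after seeing the leaf 2, because MAX already has a guaranteed 3 from the first subtree.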
: we try to identify which nodes seem most promising, and explore these first - Heuristics (useful rules or empirical knowledge) or additional information are used to find the most efficient path. - A way to visit the most promising nodes first [ Heuristic ] - A technique that improves the efficiency of search - Focuses on nodes that seem most promising acc..
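One simple way to "visit the most promising nodes first" is greedy best-first search, sketched below under assumptions: the graph, the heuristic table `h`, and the function name are all made up for illustration, and the excerpt does not say which heuristic algorithm the post goes on to describe.

```python
import heapq

def greedy_best_first(graph, h, start, goal):
    """Always expand the frontier node with the smallest heuristic value."""
    frontier = [(h[start], start, [start])]  # (heuristic, node, path so far)
    visited = set()
    while frontier:
        _, node, path = heapq.heappop(frontier)
        if node == goal:
            return path
        if node in visited:
            continue
        visited.add(node)
        for nxt in graph.get(node, []):
            if nxt not in visited:
                heapq.heappush(frontier, (h[nxt], nxt, path + [nxt]))
    return None  # goal not reachable

# Hypothetical graph and heuristic estimates of distance to the goal G.
graph = {"S": ["A", "B"], "A": ["C"], "B": ["G"], "C": ["G"]}
h = {"S": 5, "A": 2, "B": 4, "C": 1, "G": 0}
print(greedy_best_first(graph, h, "S", "G"))
```

Here the search follows S, A, C, G because A looks better than B, even though S, B, G has fewer steps: a heuristic speeds up the search but does not by itself guarantee the shortest path.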
: all nodes are equally promising, so we explore them systematically => A simple search method that, without heuristics or extra information, explores all possible paths. · Data Structure - Open List (= frontier) - Closed List (= explored set) · Generic Search Algorithm 1) Initialize OPEN with the initial node n0 and its parent 2) Initialize CLOSED to empty 3) Repeat A) If OPEN is empty, t..
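The excerpt stops partway through step 3, so the sketch below fills in the remaining steps with one standard formulation, which is an assumption rather than the post's exact wording: pop a node from OPEN, move it to CLOSED, test for the goal, and push its successors. Using a FIFO queue for OPEN makes this breadth-first search; the function and parameter names are illustrative.

```python
from collections import deque

def generic_search(successors, n0, is_goal):
    """Uninformed search using an OPEN list (frontier) and a CLOSED set."""
    open_list = deque([(n0, None)])  # 1) OPEN holds (node, parent) pairs
    closed = {}                      # 2) CLOSED maps explored node -> parent
    while open_list:                 # 3) repeat; an empty OPEN means failure
        node, parent = open_list.popleft()  # FIFO order gives breadth-first
        if node in closed:
            continue
        closed[node] = parent
        if is_goal(node):
            path = []
            while node is not None:  # follow parent links back to n0
                path.append(node)
                node = closed[node]
            return path[::-1]
        for child in successors(node):
            if child not in closed:
                open_list.append((child, node))
    return None

# Hypothetical state graph.
graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
print(generic_search(lambda n: graph[n], "A", lambda n: n == "D"))
```

Swapping `popleft()` for `pop()` turns OPEN into a stack and gives depth-first search with the same skeleton.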
: Finding a path from an initial state to a goal state [ State Space ] · A problem is represented by 1) Initial state - the starting state 2) Set of operators - actions for transitions between states 3) Goal test function - determines whether a state matches a goal state 4) Path cost function - assigns a cost to a path, so we can tell which path is best among others · State space ..
[ Cohort Analysis ] Cohort analysis breaks the data in a data set into related groups before analysis. - A kind of behavioral analytics - A cohort is a group of subjects that share a defining feature; we observe the behaviour of the group (cohort) over time and compare it to other cohorts. * Main Stages for Cohort Analysis 1) Determine what question you want to answer to improve business, product, user expe..
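As a small illustration of observing cohorts over time, the sketch below groups users by signup month and computes what fraction of each cohort is active in later months. The data layout, names, and the retention metric are assumptions for this example; the excerpt is truncated before it lists the later stages.

```python
from collections import defaultdict

def cohort_retention(signup_month, activity):
    """signup_month: user -> month index; activity: iterable of (user, month index).

    Returns {(cohort month, months since signup): fraction of the cohort active}.
    """
    cohort_users = defaultdict(set)
    for user, month in signup_month.items():
        cohort_users[month].add(user)
    active = defaultdict(set)  # (cohort, offset) -> users seen in that period
    for user, month in activity:
        cohort = signup_month[user]
        active[(cohort, month - cohort)].add(user)
    return {key: len(users) / len(cohort_users[key[0]])
            for key, users in active.items()}

# Hypothetical data: two users sign up in month 0, one in month 1.
signup = {"u1": 0, "u2": 0, "u3": 1}
events = [("u1", 0), ("u2", 0), ("u1", 1), ("u3", 1), ("u3", 2)]
for key, rate in sorted(cohort_retention(signup, events).items()):
    print(key, rate)
```

Reading the result row by row (one row per cohort) is what allows one cohort's behaviour over time to be compared with another's.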