List: Data Science (47)
Code&Data Insights
![](http://i1.daumcdn.net/thumb/C150x150/?fname=https://blog.kakaocdn.net/dn/OgBAX/btszOLVASAW/5YFRBsZKkhV3p5qPIkJk2K/img.png)
Generalization: the ability of a machine learning algorithm (model) to perform well on unseen data. Training loss: the loss function computed on the training set. Test loss: the loss function computed on the test set. => A larger training set generally leads to better generalization! Capacity: underfitting and overfitting are connected to the capacity of the model. Capacity (= representational capacity): attempts to quantify ho..
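The distinction between training loss and test loss can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the synthetic data (`y = 2x + 1` plus Gaussian noise), the choice of mean squared error as the loss, and the helper names `fit_line`, `mse`, and `make_data` are all hypothetical and do not come from the original post.

```python
import random


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mse(xs, ys, slope, intercept):
    """Mean squared error of the fitted line on the given data set."""
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


rng = random.Random(0)  # fixed seed so the run is repeatable


def make_data(n):
    """Assumed data-generating process: y = 2x + 1 with Gaussian noise."""
    xs = [rng.uniform(0.0, 1.0) for _ in range(n)]
    ys = [2.0 * x + 1.0 + rng.gauss(0.0, 0.1) for x in xs]
    return xs, ys


train_x, train_y = make_data(50)
test_x, test_y = make_data(50)  # unseen data from the same process

slope, intercept = fit_line(train_x, train_y)
training_loss = mse(train_x, train_y, slope, intercept)  # computed on the training set
test_loss = mse(test_x, test_y, slope, intercept)        # computed on the test set
generalization_gap = test_loss - training_loss

print(f"training loss={training_loss:.4f}  test loss={test_loss:.4f}")
```

A small gap between the two losses suggests the model generalizes well to data it was not fitted on.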
![](http://i1.daumcdn.net/thumb/C150x150/?fname=https://blog.kakaocdn.net/dn/TkvUt/btsy6g8wSUP/LekdRM6irxrJHRKCezaNn1/img.png)
: the process of identifying and connecting records or data entries that correspond to the same real-world entity or individual in one or more data sources. - Improves data quality and integrity - Fosters re-use of existing data sources - Optimizes space [ Atomic String Similarity ] Why is atomic string similarity important? - Information Retrieval: similarity of strings - Da..
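As a rough sketch of what a string similarity measure can look like, the example below uses Levenshtein edit distance normalized to the range [0, 1]. The excerpt does not say which measures the post covers, so the choice of edit distance and the function names `levenshtein` and `similarity` are assumptions made for illustration.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Normalized similarity: 1.0 for identical strings, 0.0 for maximally different."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


# Two spellings that may refer to the same real-world entity.
print(similarity("Jon Smith", "John Smith"))
```

In a record-linkage setting, pairs whose similarity exceeds some chosen threshold would be treated as candidate matches. The threshold itself is application-specific.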
![](http://i1.daumcdn.net/thumb/C150x150/?fname=https://blog.kakaocdn.net/dn/bMyv6w/btstr1LjOZc/AW7D9i9BhdVKiDfzkb7u91/img.png)
[ Cohort Analysis ] Cohort analysis breaks the data in a data set into related groups before analysis. - A kind of behavioral analytics - A cohort is a group of subjects that share a defining feature; the analysis observes the behaviour of the group (cohort) over time and compares it to other cohorts. * Main Stages for Cohort Analysis 1) Determine what question you want to answer to improve business, product, user expe..
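The idea of grouping subjects by a defining feature and following each group over time can be sketched as a small retention table. The example assumes the defining feature is the signup month. The `events` data and the `cohort_retention` helper are hypothetical and only illustrate the grouping step.

```python
from collections import defaultdict

# Hypothetical activity log: (user_id, signup_month, active_month).
# Months are plain integers to keep the sketch short.
events = [
    ("u1", 1, 1), ("u1", 1, 2), ("u1", 1, 3),
    ("u2", 1, 1), ("u2", 1, 2),
    ("u3", 2, 2), ("u3", 2, 3),
    ("u4", 2, 2),
]


def cohort_retention(events):
    """Return {cohort: {months_since_signup: share of the cohort still active}}."""
    cohort_users = defaultdict(set)  # cohort -> all users in that cohort
    active = defaultdict(set)        # (cohort, offset) -> users active at that offset
    for user, signup, month in events:
        cohort_users[signup].add(user)
        active[(signup, month - signup)].add(user)

    table = {}
    for (cohort, offset), users in active.items():
        table.setdefault(cohort, {})[offset] = len(users) / len(cohort_users[cohort])
    return table


retention = cohort_retention(events)
for cohort in sorted(retention):
    print(cohort, dict(sorted(retention[cohort].items())))
```

Each row of the resulting table is one cohort, and each column is the number of months since signup, which makes it straightforward to compare cohorts at the same point in their lifetime.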
![](http://i1.daumcdn.net/thumb/C150x150/?fname=https://blog.kakaocdn.net/dn/oRHEb/btss73Ja1bp/bQ8jZIDeOc5rpEbrBHQnE1/img.png)
[ Connecting to & Preparing Data ] Q. How to reduce the size of an extract? A1) Aggregate the data to visible dimensions A2) Hide all unused fields => When we create an extract, we get several options that affect its size: we can include only a sample of the data, aggregate to visible dimensions, use extract filters, or choose the physical table option instead of the logical table option. Q. The best reason to use a ..
![](http://i1.daumcdn.net/thumb/C150x150/?fname=https://blog.kakaocdn.net/dn/bogonL/btssvUZuPZf/1evhyAnvKAJ8FRFkv8Z22k/img.png)
Domain 4: Understanding Tableau Concepts 4.1 Understand dimensions and measures 1) Explain what kind of information dimensions usually contain - Dimensions contain qualitative values (such as names, dates, or geographical data). You can use dimensions to categorize, segment, and reveal the details in your data. - Dimensions affect the level of detail in the view. => level of detail in a view : how gran..
![](http://i1.daumcdn.net/thumb/C150x150/?fname=https://blog.kakaocdn.net/dn/dklIZL/btsswh8lYNt/aKTbI9td96xm4zP6kedm7K/img.png)
[ Domain 3: Sharing Insights ] 3.1 Format view for presentation 1) Use color from the Marks card - To assign a color to marks in the view: => From the Data pane, drag a field to Color on the Marks card - If you drop a discrete field (a blue field), such as Category, on Color, the marks in the view are broken out by category, and each category is assigned a color - If you drop a continuous field,..
![](http://i1.daumcdn.net/thumb/C150x150/?fname=https://blog.kakaocdn.net/dn/W1xUc/btssfRCqV49/n0JyqjbnRJ3KJitvStJ9sk/img.png)
[ Domain 2: Exploring & Analyzing Data ] 2.1 Create basic charts 1) Create a bar chart Types of bar charts - horizontal / stacked / side-by-side bars headers : Sub-Category sales : axis A stacked bar - adding a second dimension ('Segment') to the view creates a stacked bar A side-by-side bar 2) Create a line chart ** Line charts always involve a date dimension. Types of line charts - discrete / continuous lines..
![](http://i1.daumcdn.net/thumb/C150x150/?fname=https://blog.kakaocdn.net/dn/pdmX4/btssgozzPv8/DNbXu4SLvIO0hHSG076VE1/img.png)
[ Domain 1: Connecting to & Preparing Data ] 1.1 Create live connections and extracts 1) Create a live connection to a data source Live connection : connecting to the data source directly rather than connecting to a copy (default in Tableau Desktop). Extract : a subset of data (that we can use to improve performance or to take advantage of Tableau functionality not available or supported i..