Code&Data Insights

[Pandas] Pandas DataFrame | Series | Index | Basic APIs


paka_corn 2023. 6. 2. 08:25

Pandas : a Python library for working with data sets.

-> Pandas provides functions for analyzing, cleaning, exploring, and manipulating data.

 

 

 

 

[ DataFrame ] 

DataFrame : a two-dimensional, labeled data structure, like a 2-D array or a table with rows and columns in a relational database (RDB, queried with SQL).
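
A minimal sketch of building a DataFrame from a dict, to make the "table with rows and columns" picture concrete. The column names and values are made-up sample data, not from the original post.

```python
import pandas as pd

# Hypothetical sample data: each dict key becomes a column label,
# and each list holds that column's values, one per row.
df = pd.DataFrame({
    "name": ["Ann", "Bob", "Cho"],
    "age": [28, 35, 41],
})

print(df)
print(df.shape)    # (number of rows, number of columns)
print(df.columns)  # the column labels
```
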

 

 

 

 

[ Series ]

Series : a one-dimensional labeled array holding data of any type, like a single column in a table.
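
A short sketch of the "column in a table" idea: a Series can be created directly, and selecting a single column from a DataFrame also gives a Series. The names `score` and `passed` are illustrative assumptions.

```python
import pandas as pd

# A Series built directly from a list (hypothetical values).
s = pd.Series([10, 20, 30], name="score")

# Selecting one column of a DataFrame returns a Series as well.
df = pd.DataFrame({"score": [10, 20, 30], "passed": [False, True, True]})
col = df["score"]

print(type(col))
print(col)
```
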

 

 

 

 

 

[ Index ]

Index : the set of labels that identifies the rows of a DataFrame or the elements of a Series.

- Pandas creates an index by default (a RangeIndex starting from 0)

How to get the index? -> DataFrame.index | Series.index
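
A small sketch of the default index and a custom one, using made-up labels. It reads the index through the `.index` attribute on both a DataFrame and a Series.

```python
import pandas as pd

# Without an explicit index, pandas assigns 0, 1, 2, ...
df = pd.DataFrame({"age": [28, 35, 41]})
print(df.index)

# Hypothetical row labels passed through the index argument.
labeled = pd.DataFrame({"age": [28, 35, 41]}, index=["ann", "bob", "cho"])
print(labeled.index)

# A Series taken from the frame carries the same index.
print(labeled["age"].index)
```
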

 

 

 

 

[ Basic APIs ]

read_csv() : Load the CSV into a DataFrame

 

-----------------------------------------------------------------

import pandas as pd

df = pd.read_csv('data.csv')

-----------------------------------------------------------------
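
The snippet above needs a file named data.csv on disk. As a self-contained sketch, the same call can read from an in-memory buffer; the CSV text here is invented sample data standing in for the file.

```python
import io

import pandas as pd

# Hypothetical CSV content; io.StringIO makes it behave like an open file.
csv_text = "name,age\nAnn,28\nBob,35\n"

df = pd.read_csv(io.StringIO(csv_text))

print(df.head())   # first rows of the loaded table
print(df.dtypes)   # column types inferred by pandas
```
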

 

 

Convert DataFrame to ndarray

 

-----------------------------------------------------------------

import pandas as pd

df.to_numpy()  # or df.values; values is an attribute, so calling df.values() raises a TypeError

-----------------------------------------------------------------
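
A runnable sketch of the conversion with a tiny made-up frame. It uses `to_numpy()`, and also shows that the `values` attribute (accessed without parentheses) holds the same data.

```python
import numpy as np
import pandas as pd

# Hypothetical numeric data.
df = pd.DataFrame({"x": [1, 2], "y": [3, 4]})

arr = df.to_numpy()      # 2-D ndarray, one row per DataFrame row
same = df.values         # attribute access, no call parentheses

print(type(arr))
print(arr)
```
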

 

 

 

drop() 

- row : axis = 0 | column : axis = 1 

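- inplace = False (default) => the original DataFrame is kept unchanged and drop() returns a new DataFrame, which you assign to a variable; inplace = True modifies the original and returns None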
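
A brief sketch of the two axes and of what inplace changes, on made-up data. The variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# axis=0 drops rows by index label; axis=1 drops columns by name.
no_first_row = df.drop(0, axis=0)
no_b = df.drop("b", axis=1)

# With the default inplace=False, df itself is untouched.
print(df.shape)

# With inplace=True the object is modified and the call returns None.
work = df.copy()
result = work.drop("b", axis=1, inplace=True)
print(result, list(work.columns))
```
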

 

 

loc[] - indexing by label (row label, column label)

 

-----------------------------------------------------------------

import pandas as pd

df.loc[row, column]

-----------------------------------------------------------------
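
A sketch of label-based selection on a frame with hypothetical row labels. It covers a single cell, a label slice (both endpoints are included), and a boolean condition.

```python
import pandas as pd

# Made-up data with string row labels.
df = pd.DataFrame(
    {"age": [28, 35, 41], "city": ["Seoul", "Busan", "Daegu"]},
    index=["ann", "bob", "cho"],
)

one = df.loc["bob", "age"]            # single value by row and column label
sub = df.loc["ann":"bob", ["age"]]    # label slices include the end label
over_30 = df.loc[df["age"] > 30, "city"]  # boolean condition on rows

print(one)
print(sub)
print(over_30)
```
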

 

 

iloc[] - indexing by integer position

 

-----------------------------------------------------------------

import pandas as pd

df.iloc[row, column]

-----------------------------------------------------------------

 

* iloc does not accept a boolean Series as a mask! Pass a plain boolean array (e.g. mask.to_numpy()) or use loc for boolean indexing.
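
A sketch of positional selection on made-up data. On the boolean point: to my understanding, iloc rejects a boolean Series but does accept a plain boolean array, so the example converts the mask with `to_numpy()` and shows the loc equivalent next to it.

```python
import pandas as pd

df = pd.DataFrame(
    {"age": [28, 35, 41], "city": ["Seoul", "Busan", "Daegu"]},
    index=["ann", "bob", "cho"],
)

first = df.iloc[0, 0]        # row 0, column 0
sub = df.iloc[0:2, 0:1]      # positional slices exclude the end position

mask = df["age"] > 30        # boolean Series
by_array = df.iloc[mask.to_numpy()]  # boolean array works with iloc
by_loc = df.loc[mask]                # loc takes the Series directly

print(first)
print(sub)
print(by_array)
```
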

 

 

groupby()

: the groupby method returns a DataFrameGroupBy object

-------------------------------------------------------------------------------------------------

import pandas as pd

dataframe.groupby(by, axis, level, as_index, sort, group_keys, observed, dropna)

-------------------------------------------------------------------------------------------------

 

-> To get a DataFrame back, apply an aggregation method (sum, mean, max, min, ...) to the grouped object

==> DataFrameGroupBy -> DataFrame
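
A small sketch of the DataFrameGroupBy -> DataFrame step on invented data. The `team` and `score` columns are assumptions for illustration.

```python
import pandas as pd

df = pd.DataFrame({"team": ["a", "a", "b"], "score": [1, 2, 10]})

# groupby alone only describes the grouping; nothing is computed yet.
grouped = df.groupby("team")
print(type(grouped).__name__)

# An aggregation turns the grouped object back into a DataFrame.
totals = grouped.sum()
print(totals)

# as_index=False keeps the group key as a regular column.
means = df.groupby("team", as_index=False)["score"].mean()
print(means)
```
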

 

 

 

 

[ Processing for Missing Data ] 

 

isna()

It returns a DataFrame of the same shape in which every value is replaced with a Boolean: True for NA (missing) values such as NaN or None, and False otherwise.

- NaN : returns True

 

-----------------------------------------------------------------

import pandas as pd

df.isna()

-----------------------------------------------------------------

 

 

=> To count how many NaN values each column of a DataFrame has

DataFrame.isna().sum()
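
A sketch of counting missing values on made-up data. `isna().sum()` counts per column; chaining a second `sum()` gives one total for the whole frame.

```python
import numpy as np
import pandas as pd

# Hypothetical data with missing entries (NaN and None).
df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": ["x", None, None],
})

mask = df.isna()                      # True where a value is missing
per_column = df.isna().sum()          # missing count for each column
total = int(df.isna().sum().sum())    # missing count for the whole frame

print(mask)
print(per_column)
print(total)
```
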

 

 

 

 

 

fillna()

It replaces NA (missing) values with a specified value.

 

-----------------------------------------------------------------

import pandas as pd

df.fillna(value, method, axis, inplace, limit, downcast)

-----------------------------------------------------------------
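
A sketch of common fill patterns on invented data: one value for everything, a per-column dict, and forward fill. As far as I am aware, recent pandas releases deprecate the `method` and `downcast` arguments listed above, so the forward fill here uses `ffill()` instead of `fillna(method="ffill")`.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": [np.nan, 5.0, np.nan],
})

# Same replacement value everywhere.
zeros = df.fillna(0)

# Different value per column: the mean of "a", a sentinel for "b".
per_col = df.fillna({"a": df["a"].mean(), "b": -1})

# Forward fill: carry the last seen value downward; a leading NaN stays.
forward = df.ffill()

print(zeros)
print(per_col)
print(forward)
```
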

 

 

 

 

 

 

 

https://www.w3schools.com/python/pandas/pandas_ref_dataframe.asp

 
