
Short Notes On Coding

1. The document discusses various pandas functions and methods for data analysis, including converting data to a DataFrame, reading in CSV files, handling missing values, data visualization using matplotlib and seaborn, descriptive statistics using numpy, random number generation, and performing groupby operations.
2. Key pandas functions covered include pd.DataFrame(), pd.read_csv(), .isnull(), .fillna(), .value_counts(), and .info(). Plotting methods like plt.plot, plt.scatter, plt.bar, and plt.hist are also discussed.
3. The document also reviews numpy functions such as np.mean(), np.median(), np.var(), np.std(), and random number generation with np.random.uniform.

Uploaded by

Pragati jain

Pandas:

1. To convert data into a DataFrame -> pd.DataFrame(data)

2. To read a CSV file -> pd.read_csv("file.csv")

3. train -> a DataFrame
train["column_name"].isnull().sum() -> count of null values in the column
To fill null values with the column mean ->
train["column_name"] = train["column_name"].fillna(train["column_name"].mean())

train.isnull().sum() -> null counts for every column

train.select_dtypes(include=[object, np.float64, np.int64]) -> select columns by dtype (np.object is deprecated; use object)

train["column_name"].value_counts() -> gives the count of each distinct value in the column

train["column_name"].replace(["","",""],["","",""],inplace=True)

train.info()
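The pandas calls above can be combined into a minimal sketch; the DataFrame and the column names `age` and `city` are invented here purely for illustration:

```python
import pandas as pd
import numpy as np

# A tiny DataFrame standing in for "train"; columns are made up for illustration.
train = pd.DataFrame({"age": [25.0, np.nan, 40.0, np.nan],
                      "city": ["Delhi", "Mumbai", "Delhi", "Mumbai"]})

# Count nulls per column
print(train.isnull().sum())

# Fill the nulls in a numeric column with that column's mean
train["age"] = train["age"].fillna(train["age"].mean())

# Count of each distinct value in a column
print(train["city"].value_counts())
```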

---------------------------------------
import matplotlib.pyplot as plt
import seaborn as sns

plt.subplot(1,2,1) -> select the 1st plot in a 1x2 grid
plt.xlabel('name on x-axis')
plt.ylabel('name on y-axis')
sns.countplot(x="columnname", data=train, palette='ocean') -> other palettes: 'spring', 'summer'
df.plot(kind='scatter', x='column as x', y='column as y') -> kind can also be 'hist', 'bar', etc.

plt.bar(x,y)
plt.hist(x)
plt.scatter(x,y)

y = np.array([12,34,56,90])
plt.pie(y)
plt.plot(x, y) -> line plot of y against x
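A minimal runnable sketch of the plotting calls above; the data values and the output filename `demo.png` are invented, and the Agg backend is used so the script runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen, no display needed
import matplotlib.pyplot as plt
import numpy as np

x = np.array([1, 2, 3, 4])
y = np.array([12, 34, 56, 90])

fig = plt.figure()

plt.subplot(1, 2, 1)          # left plot of a 1x2 grid
plt.bar(x, y)
plt.xlabel("name on x-axis")
plt.ylabel("name on y-axis")

plt.subplot(1, 2, 2)          # right plot of the grid
plt.scatter(x, y)

fig.savefig("demo.png")       # hypothetical output filename
```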

----------------------------------------------------
import numpy as np
from scipy import stats

np.mean(list)
np.median(list)
stats.mode(list)

np.var(list) -> variance = (sum over all i of (xi - mean)^2) / no. of points
np.std(list) -> sqrt(variance)

np.percentile(list, 75)

75th percentile meaning: if the 75th percentile = 43, then 75% of the population has values
lower than 43.
max_no = m
min_no = n
Rough estimate assuming evenly spread values: 25th percentile ≈ min_no + 0.25*(max_no - min_no)
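A worked example of the statistics functions above (the sample list is made up):

```python
import numpy as np

data = [10, 20, 30, 40, 50]

print(np.mean(data))            # average of the values
print(np.median(data))          # middle value
print(np.var(data))             # mean of squared deviations from the mean
print(np.std(data))             # square root of the variance
print(np.percentile(data, 75))  # 75% of values lie at or below this point
```

For this list the deviations from the mean (30) are -20, -10, 0, 10, 20, so the variance is (400+100+0+100+400)/5 = 200 and the standard deviation is sqrt(200).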
---------------------------------------------------

Distributions:

np.random.uniform(low, high, size)
np.random.normal(mean, std, size) -> parameters are the mean and standard deviation, not a range
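A small sketch of the two distributions; the bounds, mean, standard deviation, and seed are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded generator for reproducibility

u = rng.uniform(0, 10, size=1000)   # low, high, size -> values in [0, 10)
n = rng.normal(50, 5, size=1000)    # mean, standard deviation, size

print(u.min(), u.max())  # stays within [0, 10)
print(n.mean())          # close to 50 for a large sample
```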

------------------------------------------------------

how to use map

We use map when we want to perform an operation over all the elements of a list (or any iterable).

list(map(myfunc,iterable))
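A minimal example of the pattern above; the helper name `myfunc` matches the note, and squaring is just an illustrative operation:

```python
# Apply myfunc to every element of the list with map
def myfunc(x):
    return x * x

nums = [1, 2, 3, 4]
squares = list(map(myfunc, nums))
print(squares)  # [1, 4, 9, 16]
```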

-----------------------------------------------------------------

some more dataframe

performing groupby operation on dataframe using pandas

Find the name of the district with the maximum mean model_price

df= dataframe

di = dict(df.groupby('district')['model_price'].mean())

I want the district, which is the key in the dict with the maximum value

keymax= max(di, key=di.get)


print(keymax)

data[data['state'] == 'Telangana']['commodity'].value_counts()

To get the count of unique commodities each state has:

data.groupby('state')['commodity'].nunique()

data.sort_values(by=['column_name'],inplace=True)
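The groupby steps above can be sketched end to end; the DataFrame and its values are toy data invented for illustration:

```python
import pandas as pd

# Toy data standing in for the notes' DataFrame; all values are made up.
df = pd.DataFrame({
    "district": ["A", "A", "B", "B"],
    "model_price": [10, 20, 40, 30],
    "state": ["S1", "S1", "S2", "S2"],
    "commodity": ["rice", "wheat", "rice", "rice"],
})

# District with the maximum mean model_price
di = dict(df.groupby("district")["model_price"].mean())
keymax = max(di, key=di.get)
print(keymax)  # "B": mean 35 vs mean 15 for "A"

# Count of unique commodities per state
unique_counts = df.groupby("state")["commodity"].nunique()
print(unique_counts)

# Sort rows by a column in place
df.sort_values(by=["model_price"], inplace=True)
```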
