session-1 DataFrame
Exploratory Data Analysis (EDA) is a crucial step in the data analysis process.
Out[7]: pandas.core.frame.DataFrame
Out[9]:
# we created a DataFrame
# But no data (no rows and no columns)
# we saved our DataFrame with a name 'data'
Out[13]:
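The code for the cells above did not survive the export. A minimal sketch consistent with the comments (the exact original cells are an assumption; only the name 'data' comes from the notes):

```python
import pandas as pd

# Calling the constructor with no arguments gives an empty DataFrame.
data = pd.DataFrame()

print(type(data))   # the DataFrame class
print(data.shape)   # (0, 0): no rows and no columns
print(data.empty)   # True
```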
In [16]: name=['Navya','Sneha','Yamu']
pd.DataFrame()
Out[16]:
In [18]: name=['Navya','Sneha','Yamu']
pd.DataFrame(name)
Out[18]: 0
0 Navya
1 Sneha
2 Yamu
In [20]: name=['Navya','Sneha','Yamu']
age=[20,21,22]
pd.DataFrame(zip(name,age))
Out[20]: 0 1
0 Navya 20
1 Sneha 21
2 Yamu 22
In [22]: name=['Navya','Sneha','Yamu']
age=[20,21,22]
city=['Hyd','Delhi','Pune']
pd.DataFrame(zip(name,age,city))
Out[22]: 0 1 2
0 Navya 20 Hyd
1 Sneha 21 Delhi
2 Yamu 22 Pune
In [24]: name=['Navya','Sneha','Yamu']
age=[20,21,22]
city=['Hyd','Delhi','Pune']
data=[name,age,city]
pd.DataFrame(data)
Out[24]: 0 1 2
0 Navya Sneha Yamu
1 20 21 22
2 Hyd Delhi Pune
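Passing a list of lists makes each list a row, while zip makes each list a column. A small sketch comparing the two, assuming the same three lists (the names by_rows, by_cols and same_values are only for illustration):

```python
import pandas as pd

name = ['Navya', 'Sneha', 'Yamu']
age = [20, 21, 22]
city = ['Hyd', 'Delhi', 'Pune']

by_rows = pd.DataFrame([name, age, city])      # each list becomes one row
by_cols = pd.DataFrame(zip(name, age, city))   # each list becomes one column

print(by_rows)
print(by_cols)

# Transposing the row-wise frame gives the same values as the zip version.
same_values = by_rows.T.values.tolist() == by_cols.values.tolist()
print(same_values)
```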
In [26]: name=['Navya','Sneha','Yamu']
age=[20,21,22]
city=['Hyd','Delhi','Pune']
df=pd.DataFrame(zip(name,age,city))
df
Out[26]: 0 1 2
0 Navya 20 Hyd
1 Sneha 21 Delhi
2 Yamu 22 Pune
In [30]: name=['Navya','Sneha','Yamu']
age=[20,21,22]
city=['Hyd','Delhi','Pune']
cols=['Names','Age','City']
df=pd.DataFrame(zip(name,age,city),columns=cols)
df
Out[30]: Names Age City
0 Navya 20 Hyd
1 Sneha 21 Delhi
2 Yamu 22 Pune
In [33]: name=['Navya','Sneha','Yamu']
age=[20,21,22]
city=['Hyd','Delhi','Pune']
cols=['Names','Age','City']
id=[1,2,3]
df=pd.DataFrame(zip(name,age,city),index=id,columns=cols)
df
Out[33]: Names Age City
1 Navya 20 Hyd
2 Sneha 21 Delhi
3 Yamu 22 Pune
In [35]: name=['Navya','Sneha','Yamu']
age=[20,21,22]
city=['Hyd','Delhi','Pune']
cols=['Names','Age','City']
id=['A','B','C']
df=pd.DataFrame(zip(name,age,city),index=id,columns=cols)
df
Out[35]: Names Age City
A Navya 20 Hyd
B Sneha 21 Delhi
C Yamu 22 Pune
It has 3 columns
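The number of columns does not have to be counted by eye. A short sketch, assuming the same df as above, that reads the size and the labels from the DataFrame itself:

```python
import pandas as pd

name = ['Navya', 'Sneha', 'Yamu']
age = [20, 21, 22]
city = ['Hyd', 'Delhi', 'Pune']
df = pd.DataFrame(zip(name, age, city),
                  index=['A', 'B', 'C'],
                  columns=['Names', 'Age', 'City'])

print(df.shape)          # (3, 3): (rows, columns)
print(list(df.columns))  # ['Names', 'Age', 'City']
print(list(df.index))    # ['A', 'B', 'C']
```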
In [38]: marks=[100,200,300]
df['Marks']=marks
df
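The output of this cell is missing from the export. A sketch of the same step, assuming df is the three-row frame built above; the 'Bonus' column is hypothetical and only shows what happens when the list length does not match the number of rows:

```python
import pandas as pd

df = pd.DataFrame({'Names': ['Navya', 'Sneha', 'Yamu'],
                   'Age': [20, 21, 22],
                   'City': ['Hyd', 'Delhi', 'Pune']},
                  index=['A', 'B', 'C'])

df['Marks'] = [100, 200, 300]   # one value per row
print(df)

# A list of the wrong length is rejected with a ValueError.
try:
    df['Bonus'] = [1, 2]        # hypothetical column, only 2 values for 3 rows
    length_error = None
except ValueError as exc:
    length_error = exc
print(length_error)
```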
In [41]: df1=pd.DataFrame()
df1
Out[41]:
In [43]: name=['Navya','Sneha','Yamu']
age=[20,21,22]
city=['Hyd','Delhi','Pune']
df1['Name']=name
df1['Age']=age
df1['City']=city
df1
Out[43]: Name Age City
0 Navya 20 Hyd
1 Sneha 21 Delhi
2 Yamu 22 Pune
In [50]: dict1={'Names':['Navya','Sneha','Yamu'],'Age':[20,21,22],'City':['Hyd','Delhi','Pune']}
dict1
In [52]: df2=pd.DataFrame(dict1)
df2
Out[52]: Names Age City
0 Navya 20 Hyd
1 Sneha 21 Delhi
2 Yamu 22 Pune
In [54]: df2=pd.DataFrame(dict1,index=['A','B','C'])
df2
Out[54]: Names Age City
A Navya 20 Hyd
B Sneha 21 Delhi
C Yamu 22 Pune
In [57]: dict2={'Name':'Navya','Age':20,'City':'Hyd'}
dict2
In [61]: df3=pd.DataFrame(dict2)
df3
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[61], line 1
----> 1 df3=pd.DataFrame(dict2)
2 df3
ValueError: If using all scalar values, you must pass an index
In [63]: dict2={'Name':'Navya','Age':20,'City':'Hyd'}
pd.DataFrame(dict2,index=[1])
Out[63]: Name Age City
1 Navya 20 Hyd
In [65]: dict2={'Name':'Navya','Age':20,'City':'Hyd'}
pd.DataFrame(dict2,index=[1,2])
Out[65]: Name Age City
1 Navya 20 Hyd
2 Navya 20 Hyd
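A dictionary of single values gives pandas no way to know how many rows to build, which is why the first attempt fails and passing an index fixes it. A sketch of the options, using the same dictionary (the variable names are only for illustration):

```python
import pandas as pd

record = {'Name': 'Navya', 'Age': 20, 'City': 'Hyd'}

# Scalars alone: pandas cannot infer the number of rows, so this raises.
try:
    pd.DataFrame(record)
    failed = False
except ValueError:
    failed = True

one_row = pd.DataFrame(record, index=[1])       # index supplies the row label
also_one_row = pd.DataFrame([record])           # a list holding one dict = one row
two_rows = pd.DataFrame(record, index=[1, 2])   # scalars are repeated per label

print(failed)
print(one_row)
print(also_one_row)
print(two_rows)
```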
tensor: Tensorflow
In [68]: l1=[1,2,3]
import numpy as np
np.array(l1)
In [70]: l1=[1,2,3]
l2=[11,12,13]
l1+l2
In [74]: l1=[1,2,3]
a=np.array(l1)
l2=[11,12,13]
b=np.array(l2)
a+b
In [76]: l1=[1,2,3]
a=np.array(l1)
l2=[11,12,13]
b=np.array(l2)
a*b
In [78]: l1=[1,2,3]
a=np.array(l1)
l2=[11,12,13]
b=np.array(l2)
a+b,a*b
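The outputs of the cells above were not exported. A sketch that puts the list and array behaviour side by side, using the same values:

```python
import numpy as np

l1 = [1, 2, 3]
l2 = [11, 12, 13]

joined = l1 + l2                     # list + list joins the two lists
a, b = np.array(l1), np.array(l2)
added = a + b                        # array + array adds element by element
multiplied = a * b                   # array * array multiplies element by element

print(joined)       # [1, 2, 3, 11, 12, 13]
print(added)        # [12 14 16]
print(multiplied)   # [11 24 39]
```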
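DataFrame methods are called on the DataFrame name, the same way string methods are called on a string name.
The drop method is used below with three arguments:
1. Column name (or row label)
2. axis
3. inplace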
In [81]: df4=pd.DataFrame()
df4
Out[81]:
In [ ]: df4.drop()
In [87]: name=['Navya','Sneha','Yamu']
age=[20,21,22]
city=['Hyd','Delhi','Pune']
cols=['Names','Age','City']
id=['A','B','C']
df4=pd.DataFrame(zip(name,age,city),index=id,columns=cols)
df4
Out[87]: Names Age City
A Navya 20 Hyd
B Sneha 21 Delhi
C Yamu 22 Pune
In [97]: name=['Navya','Sneha','Yamu']
age=[20,21,22]
city=['Hyd','Delhi','Pune']
cols=['Names','Age','City']
id=['A','B','C']
df4=pd.DataFrame(zip(name,age,city),index=id,columns=cols)
df4.drop('City',axis=1)
Out[97]: Names Age
A Navya 20
B Sneha 21
C Yamu 22
In [103… name=['Navya','Sneha','Yamu']
age=[20,21,22]
city=['Hyd','Delhi','Pune']
cols=['Names','Age','City']
id=['A','B','C']
df4=pd.DataFrame(zip(name,age,city),index=id,columns=cols)
df4.drop('A',axis=0)
B Sneha 21 Delhi
C Yamu 22 Pune
In [107… name=['Navya','Sneha','Yamu']
age=[20,21,22]
city=['Hyd','Delhi','Pune']
cols=['Names','Age','City']
id=['A','B','C']
df4=pd.DataFrame(zip(name,age,city),index=id,columns=cols)
df4.drop('A',axis=0,inplace=True)
In [109… df4
B Sneha 21 Delhi
C Yamu 22 Pune
In [4]: name=['Navya','Sneha','Yamu']
age=[20,21,22]
city=['Hyd','Delhi','Pune']
cols=['Names','Age','City']
id=['A','B','C']
df4=pd.DataFrame(zip(name,age,city),index=id,columns=cols)
df4.drop('A',axis=0,inplace=False)
Out[4]: Names Age City
B Sneha 21 Delhi
C Yamu 22 Pune
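The drop cells above differ only in axis and inplace. A compact sketch of the same behaviour, assuming the same df4; make_df is a hypothetical helper that rebuilds it, and the columns= and index= keywords are an alternative to axis that pandas also accepts:

```python
import pandas as pd

def make_df():
    # Hypothetical helper that rebuilds the example frame used above.
    return pd.DataFrame({'Names': ['Navya', 'Sneha', 'Yamu'],
                         'Age': [20, 21, 22],
                         'City': ['Hyd', 'Delhi', 'Pune']},
                        index=['A', 'B', 'C'])

df4 = make_df()
without_city = df4.drop(columns='City')  # same result as drop('City', axis=1)
without_a = df4.drop(index='A')          # same result as drop('A', axis=0)
print(df4.shape)                         # (3, 3): the original is untouched

returned = df4.drop('A', axis=0, inplace=True)
print(returned)                          # None: the change happened in df4 itself
print(df4.shape)                         # (2, 3)
```

With inplace=False (the default) the result has to be stored in a variable, otherwise it is only displayed and then lost.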
append
concat
join
In [34]: dict1={'Name':'Ramesh','Age':20,'City':'Hyd'}
df5=pd.DataFrame(dict1,index=[1])
dict2={'Name':'Suresh','Age':21,'City':'Blr'}
df6=pd.DataFrame(dict2,index=[2])
result=pd.concat([df5,df6],ignore_index=True)
print(result)
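The printed result is not shown in the export. A sketch of the same concat call with and without ignore_index, using the same two one-row frames (kept and renumbered are illustrative names). Note that DataFrame.append was removed in pandas 2.0, so pd.concat is the option that works across versions:

```python
import pandas as pd

df5 = pd.DataFrame({'Name': 'Ramesh', 'Age': 20, 'City': 'Hyd'}, index=[1])
df6 = pd.DataFrame({'Name': 'Suresh', 'Age': 21, 'City': 'Blr'}, index=[2])

kept = pd.concat([df5, df6])                            # keeps row labels 1 and 2
renumbered = pd.concat([df5, df6], ignore_index=True)   # new row labels 0 and 1

print(kept)
print(renumbered)
```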
Now we want to replace all the values of a specific column with new values.
To do that, assign the new values to the column, the same way a new column is created.
In [8]: df4['Age']=[33,44,34]
df4
Out[8]: Names Age City
A Navya 33 Hyd
B Sneha 44 Delhi
C Yamu 34 Pune
In [48]: df4['Names']=['anshu','chinni','adya']
df4
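Besides a list with one value per row, a column can also be overwritten with a single value or with values computed from the old ones. A sketch assuming the same three-row df4 (the replacement values for City and the +1 step are made up for illustration):

```python
import pandas as pd

df4 = pd.DataFrame({'Names': ['Navya', 'Sneha', 'Yamu'],
                    'Age': [20, 21, 22],
                    'City': ['Hyd', 'Delhi', 'Pune']},
                   index=['A', 'B', 'C'])

df4['Age'] = [33, 44, 34]      # a list: one new value per row
df4['City'] = 'Hyd'            # a single value is repeated for every row
df4['Age'] = df4['Age'] + 1    # new values computed from the old column

print(df4)
```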
excel
name=['Navya','Sneha','Yamu']
age=[20,21,22]
city=['Hyd','Delhi','Pune']
cols=['Names','Age','City']
id=['A','B','C']
df=pd.DataFrame(zip(name,age,city),index=id,columns=cols)
df
A Navya 20 Hyd
B Sneha 21 Delhi
C Yamu 22 Pune
Csv Format
In [56]: # DataFramename.methodname
# where you want to save
# in what name you want to save
df.to_csv('data12.csv')
Excel sheet
In [61]: df.to_excel('data13.xlsx')
read_csv
read_excel
In [65]: pd.read_csv('data12.csv')
Out[65]: Unnamed: 0 Names Age City
0 A Navya 20 Hyd
1 B Sneha 21 Delhi
2 C Yamu 22 Pune
In [69]: pd.read_excel('data13.xlsx')
Out[69]: Unnamed: 0 Names Age City
0 A Navya 20 Hyd
1 B Sneha 21 Delhi
2 C Yamu 22 Pune
Pass index=False so that the index is not saved as an extra column
df.to_csv('data21.csv',index=False)
pd.read_csv('data21.csv')
0 Navya 20 Hyd
1 Sneha 21 Delhi
2 Yamu 22 Pune
In [76]: df.to_excel('data31.xlsx',index=False)
pd.read_excel('data31.xlsx')
Out[76]: Names Age City
0 Navya 20 Hyd
1 Sneha 21 Delhi
2 Yamu 22 Pune
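When the row labels matter (here 'A', 'B', 'C'), another option is to keep them in the file and tell read_csv which column holds them. A sketch using an in-memory buffer as a stand-in for a file such as 'data12.csv' (the buffer and the name restored are assumptions for illustration):

```python
import io
import pandas as pd

df = pd.DataFrame({'Names': ['Navya', 'Sneha', 'Yamu'],
                   'Age': [20, 21, 22],
                   'City': ['Hyd', 'Delhi', 'Pune']},
                  index=['A', 'B', 'C'])

buffer = io.StringIO()      # stands in for a file on disk
df.to_csv(buffer)           # the index is written as the first, unnamed column
buffer.seek(0)              # rewind so the text can be read back

# index_col=0 turns the first column back into the index
# instead of showing it as 'Unnamed: 0'.
restored = pd.read_csv(buffer, index_col=0)
print(restored)
```

read_excel accepts the same index_col argument for files written with to_excel.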