Working With Pandas
No : 05
Date:
WORKING WITH PANDAS DATA FRAMES
Aim
To gain knowledge of the various features of the pandas DataFrame.
Pandas DataFrame
A pandas DataFrame is a structure that contains two-dimensional data and its
corresponding labels. DataFrames are widely used in data science, machine learning,
scientific computing, and many other data-intensive fields.
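As a minimal sketch of that definition, the snippet below builds a tiny frame and separates its three parts: the two-dimensional values, the row labels, and the column labels. The name `demo`, the labels `r1` and `r2`, and the sample values are illustrative assumptions, not part of this exercise's data.

```python
import pandas as pd

# Hypothetical two-row frame used only for illustration.
demo = pd.DataFrame({'Name': ['A', 'B'], 'Age': [16, 17]}, index=['r1', 'r2'])

values = demo.to_numpy()            # the two-dimensional data
row_labels = list(demo.index)       # labels along axis 0 (rows)
column_labels = list(demo.columns)  # labels along axis 1 (columns)

print(values.shape)
print(row_labels)
print(column_labels)
```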
Importing pandas
>>> import pandas as pd
Code
import pandas as pd
data = {
    'Name':['A','B','C','D','E','F'],
    'Reg No':[7111,7112,7113,7114,7115,7116],
    'Age':[16,17,17,16,17,16],
    'Mail Id': ['[email protected]','[email protected]','[email protected]','[email protected]',
                '[email protected]','F@gmail.com'],
    'State':['AP','TN','KA','AP','TS','KL'],
    'Mother Tongue':['Telugu','Tamil','Kannada','Telugu','Telugu','Malayalam']
}
df = pd.DataFrame(data = data)
Code
>>> df
Output
Code
For first 3 rows
>>> df.head(n =3)
Output
Code
For last 2 rows
>>> df.tail(n = 2)
Output
4. Displaying any one column
Code
>>> State = df['State']
>>> df.State
Output
Code
>>> df.loc[4]
Output
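The column and row selections above can be sketched on a smaller frame. This is an illustration under assumptions: `demo` and its values are made up, and the default integer row labels are used. Attribute access such as `demo.State` only works when the column name is a valid Python identifier, so a column like 'Reg No' needs the bracket form.

```python
import pandas as pd

# Hypothetical frame with default row labels 0, 1, 2.
demo = pd.DataFrame({'Name': ['A', 'B', 'C'], 'State': ['AP', 'TN', 'KA']})

state_by_key = demo['State']   # bracket form: works for any column name
state_by_attr = demo.State     # attribute form: identifier-like names only
row = demo.loc[1]              # the row labelled 1, returned as a Series

print(state_by_key.equals(state_by_attr))
print(row['Name'], row['State'])
```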
Code
>>> import numpy as np
>>> data = {'Reg No':[7111,7112,7113],'Age': np.array([15,16,15]),'State':"AP"}
>>> pd.DataFrame(data)
Output
Code
>>> data = [{'x':1,'y':2,'z':100},
{'x':2,'y':3,'z':100},
{'x':3,'y':4,'z':100}]
>>> pd.DataFrame(data)
Output
Code
>>> c = np.array([[1,2,100],
[2,3,100],
[3,4,100]])
>>> pd.DataFrame(c)
Output
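The three construction routes used so far (a dictionary, a list of dictionaries, and a NumPy array) can be compared side by side. This is a sketch with shortened data; the variable names and the explicit `columns` argument for the array case are assumptions added for illustration.

```python
import numpy as np
import pandas as pd

# Dictionary: keys become column labels; the scalar 'AP' is repeated for every row.
from_dict = pd.DataFrame({'Reg No': [7111, 7112, 7113],
                          'Age': np.array([15, 16, 15]),
                          'State': 'AP'})

# List of dictionaries: each dictionary becomes one row.
from_records = pd.DataFrame([{'x': 1, 'y': 2, 'z': 100},
                             {'x': 2, 'y': 3, 'z': 100}])

# Two-dimensional array: column labels default to 0, 1, 2 unless given explicitly.
from_array = pd.DataFrame(np.array([[1, 2, 100], [2, 3, 100]]),
                          columns=['x', 'y', 'z'])

print(from_dict)
print(from_records)
print(from_array)
```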
Code
>>> c = pd.read_csv('student.csv')
>>> c
Output
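The contents of 'student.csv' are not shown in this record, so the sketch below reads a made-up CSV from an in-memory string instead of a file. `csv_text` and its rows are assumptions; `pd.read_csv` accepts a file path in the same position.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for the real file.
csv_text = "Name,Reg No,Age\nA,7111,16\nB,7112,17\n"

# The first line of the CSV supplies the column labels.
students = pd.read_csv(io.StringIO(csv_text))

# Writing back with index=False leaves the row labels out of the output.
round_trip = students.to_csv(index=False)

print(students)
```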
10. Displaying data frame row labels and column labels
Output
Output
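The code for this step is missing from the record. A minimal sketch, assuming the `.index` and `.columns` attributes were what was displayed. Later steps address rows by labels such as 10 and 16, so the sketch also reassigns the row labels with `np.arange(10, 16)`; that reassignment is inferred from those steps and is not stated in the record.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with the same number of rows as the student data.
demo = pd.DataFrame({'Name': ['A', 'B', 'C', 'D', 'E', 'F'],
                     'Age': [16, 17, 17, 16, 17, 16]})

print(demo.index)     # default row labels 0 to 5
print(demo.columns)   # column labels 'Name' and 'Age'

# Assumed step: replace the default row labels with 10 to 15.
demo.index = np.arange(10, 16)
print(demo.loc[10, 'Name'])
```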
Code
Output
Code
>>> df.to_numpy()
Output
13. Displaying datatypes for each column to a pandas data frame
Code
>>> df.dtypes
Output
14. Displaying the number of dimensions, number of data values across each
dimension and total number of data values in Dataframe.
Code:
>>> df.ndim
Output: 2
Code
>>> df.shape
Output: (6,6)
Code
>>> df.size
Output: 36
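The three attributes are related: `shape` gives the length of each dimension, `ndim` is the number of entries in `shape`, and `size` is the product of those lengths. A small sketch on a made-up frame (`demo` is an assumption for illustration):

```python
import pandas as pd

# Hypothetical frame with 3 rows and 2 columns.
demo = pd.DataFrame({'Name': ['A', 'B', 'C'], 'Age': [16, 17, 17]})

rows, cols = demo.shape   # number of values along each dimension

print(demo.ndim)          # number of dimensions
print(demo.shape)
print(demo.size == rows * cols)
```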
Code
>>> df.memory_usage()
Output
16. Displaying the elements of a particular column label
Code
>>> df['Name']
Output
Code
>>> df.loc[10]
Output
18. Retrieving a single value. For this, pandas recommends the specialised
accessors .at[] (by row and column label) and .iat[] (by integer position).
Code
>>> df.at[10,'Name']
Output: 'A'
Code
>>> df.iat[0,1]
Output: 7111
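The difference between the two accessors is easiest to see when the row labels do not match the positions. A sketch on a made-up frame whose rows are labelled 10 to 12 (`demo` and its values are assumptions):

```python
import pandas as pd

# Hypothetical frame: row labels 10, 11, 12 sit at positions 0, 1, 2.
demo = pd.DataFrame({'Name': ['A', 'B', 'C'],
                     'Reg No': [7111, 7112, 7113]},
                    index=[10, 11, 12])

by_label = demo.at[10, 'Name']   # row label 10, column label 'Name'
by_position = demo.iat[0, 1]     # first row, second column

print(by_label, by_position)
```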
19. Changing the age of three students. Accessors can be used to modify part of a pandas
data frame by passing a Python sequence, a NumPy array, or a single value.
Code
>>> df.loc[:12,'Age']=[24,26,28]
>>> df['Age']
Output
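A label slice with `.loc` includes its end label, which is why `:12` covers three rows when the labels start at 10. The sketch below contrasts this with `.iloc`, whose position slice stops before the end. The frame `demo` and the value 30 are assumptions for illustration.

```python
import pandas as pd

# Hypothetical frame with row labels 10 to 13.
demo = pd.DataFrame({'Age': [16, 17, 17, 16]}, index=[10, 11, 12, 13])

# Label-based slice: rows 10, 11 and 12 are all included.
demo.loc[:12, 'Age'] = [24, 26, 28]

# Position-based slice: stops before position 2, so only two rows change.
# A single value is broadcast to every selected cell.
demo.iloc[:2, 0] = 30

print(demo)
```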
20. Adding the details of a new student to the existing pandas DataFrame
Code
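>>> G = pd.Series(data=['G',7117,17,'[email protected]','KA','Kannada'],
index = df.columns,name = 16)
>>> # DataFrame.append was removed in pandas 2.0; pd.concat replaces it
>>> df = pd.concat([df, G.to_frame().T.infer_objects()])
>>> df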
Output
Code
>>> df = df.drop(labels =[14])
>>> df
Output
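`DataFrame.append` is not available from pandas 2.0 onward. The sketch below shows one way to add a row with `pd.concat` and then remove a row by label with `drop`. The frame `demo`, the names `extended` and `trimmed`, and the sample values are assumptions.

```python
import pandas as pd

# Hypothetical frame with row labels 10 to 12.
demo = pd.DataFrame({'Name': ['A', 'B', 'C'], 'Age': [16, 17, 17]},
                    index=[10, 11, 12])

# The Series name becomes the row label of the new row.
new_row = pd.Series(['G', 17], index=demo.columns, name=16)

# to_frame().T turns the Series into a one-row frame;
# infer_objects() restores the numeric dtype of the 'Age' value.
extended = pd.concat([demo, new_row.to_frame().T.infer_objects()])

# drop returns a new frame without the listed row labels.
trimmed = extended.drop(labels=[11])

print(trimmed)
```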
22. Adding a CGPA column at the end of the data frame
Code
>>> df['CGPA']= np.array([6.5,8.3,7.9,9.6,8.1,9.3])
>>> df
Output
Code
>>> df.insert(loc=3,column = 'DOB',value = np.array(['5/10/2004',
'3/10/2005','10/10/2005','3/1/2005','14/10/2005','20/06/2004']))
>>> df
Output
Code
>>> del df['DOB']
>>> df
Output
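The three column operations above differ in where the column lands: plain assignment appends at the end, `insert` places it at a given position, and `del` removes it in place. A sketch on a made-up two-row frame (`demo` and its values are assumptions):

```python
import pandas as pd

# Hypothetical two-row frame.
demo = pd.DataFrame({'Name': ['A', 'B'], 'Age': [16, 17]})

demo['CGPA'] = [6.5, 8.3]   # assignment adds the column at the end

# insert places the new column at position 1 and modifies the frame in place.
demo.insert(loc=1, column='DOB', value=['5/10/2004', '3/10/2005'])
cols_with_dob = list(demo.columns)

del demo['DOB']             # removes the column in place
cols_after = list(demo.columns)

print(cols_with_dob)
print(cols_after)
```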
Code
>>> df['CGPA']*10
Output
26. Sorting the data frame in ascending order based on CGPA
Code
>>> df.sort_values(by='CGPA',ascending = True)
Output
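`sort_values` returns a new, sorted frame and leaves the original untouched unless the result is assigned back. A sketch with made-up values (`demo`, `ascending`, and `descending` are illustrative names):

```python
import pandas as pd

# Hypothetical frame with unsorted CGPA values.
demo = pd.DataFrame({'Name': ['A', 'B', 'C'], 'CGPA': [8.3, 6.5, 9.6]})

ascending = demo.sort_values(by='CGPA', ascending=True)
descending = demo.sort_values(by='CGPA', ascending=False)

print(ascending)
print(descending)
print(demo)   # the original order is unchanged
```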
27. Filtering the students whose CGPA is greater than 8. Powerful and sophisticated
filter expressions can be built by combining logical conditions with the
following operators:
• NOT(~)
• AND(&)
• OR( | )
• XOR (^)
Code
>>> df[(df['CGPA']>8)]
Output
28. Filtering the students whose CGPA is at least 8 and whose age is less than 19
Code
>>> df[(df['CGPA']>=8)&(df['Age']<19)]
Output
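The four operators listed in step 27 can be tried on two boolean masks. In this sketch `demo`, `high`, and `young` are assumed names with made-up values. When comparisons are written inline, each one needs its own parentheses, because `&` and `|` bind more tightly than `>` and `<`.

```python
import pandas as pd

# Hypothetical frame with default row labels 0, 1, 2.
demo = pd.DataFrame({'CGPA': [6.5, 8.3, 9.6], 'Age': [18, 19, 17]})

high = demo['CGPA'] > 8    # False, True, True
young = demo['Age'] < 19   # True, False, True

both = demo[high & young]          # AND: both conditions hold
either = demo[high | young]        # OR: at least one condition holds
exactly_one = demo[high ^ young]   # XOR: exactly one condition holds
not_high = demo[~high]             # NOT: the condition does not hold

print(both)
```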
29. Describing the data, so that it displays the count, mean, standard deviation,
minimum, maximum and quartiles of each numeric column.
Code
>>> df.describe()
Output
Code
>>> df.mean(numeric_only=True)
Output
Code
>>> df['CGPA'].mean()
Output
8.283333333333333
Code
>>> df.std(numeric_only=True)
Output
Code
>>> df['CGPA'].std()
Output
1.1070983093956321
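The two values above can be reproduced from the CGPA column alone. The sketch below reuses the six CGPA values from step 22; the frame `demo` and the variable names are assumptions. By default pandas computes the sample standard deviation (`ddof=1`), so the population value (`ddof=0`) is slightly smaller. In pandas 2.0 and later, `numeric_only=True` is needed for a frame-wide mean when text columns are present.

```python
import pandas as pd

# CGPA values from step 22 alongside a text column.
demo = pd.DataFrame({'Name': ['A', 'B', 'C', 'D', 'E', 'F'],
                     'CGPA': [6.5, 8.3, 7.9, 9.6, 8.1, 9.3]})

mean_cgpa = demo['CGPA'].mean()
sample_std = demo['CGPA'].std()            # default ddof=1 (sample)
population_std = demo['CGPA'].std(ddof=0)  # divide by n instead of n - 1

# Restrict the frame-wide mean to numeric columns so 'Name' is skipped.
column_means = demo.mean(numeric_only=True)

print(mean_cgpa, sample_std, population_std)
```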
30. Filtering the students whose CGPA is at least 7 and printing the DataFrame
Code
>>> filter_ = df['CGPA'] >= 7
>>> df[filter_]
Output
Conclusion
Thus, the various features of pandas DataFrames were learned and practised in Python successfully.