Dataframes-I (Create & Selection)
A dataframe is a two-dimensional pandas object that represents data in rows and columns,
much like a MySQL table. Once data is stored in this format, we can perform various operations
that help in analyzing and understanding it. A dataframe can contain heterogeneous data,
i.e. different columns can hold different data types.
Both the size and the data of a dataframe are mutable, i.e. they can change.
Note: A dataframe can also contain just one column of data.
Admno  Name      Class  Section  Marks    <- column names
100    Sushmita  12     A        78
101    Sarika    12     A        84
102    Aman      12     B        90
103    Kartavya  12     C        70       <- a row
(Each vertical field, such as Name, is a column.)
A dataframe has both a row index and a column index.
A dataframe can be created using any of the following:
1. Lists
2. Dictionary
3. NumPy 2D array
4. Series
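As a minimal sketch of the last two options in this list, the snippet below builds one dataframe from a NumPy 2D array and another from a Series. The variable names and sample values are illustrative assumptions, not taken from these notes.

```python
import numpy as np
import pandas as pd

# From a NumPy 2D array: each inner row becomes one dataframe row.
arr = np.array([[12, 78], [12, 84], [12, 90]])
df_arr = pd.DataFrame(arr, columns=['Class', 'Marks'])
print(df_arr)

# From a Series: the Series becomes a single column,
# and its index becomes the row index of the dataframe.
s = pd.Series([78, 84, 90], index=[100, 101, 102], name='Marks')
df_s = pd.DataFrame(s)
print(df_s)
```

The second dataframe has only one column, which is the single-column case mentioned in the note earlier.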
CREATION OF DATAFRAMES
Ways to create a dataframe:
a) Creating an empty dataframe
import pandas as pd
df=pd.DataFrame()
print(df)
Output:
Empty DataFrame
Columns: []
Index: []
b) Creating a dataframe using list
Example 1:
import pandas as pd
d=[12,13,14,15,16]
df=pd.DataFrame(d)
print(df)
Output:
    0      <- default column name
0  12
1  13
2  14
3  15
4  16
Output:
     name  age
0    aman   45
1  vishal   56
2  soniya   67
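A table like the one above can be produced from a list of lists, where each inner list supplies one row. This is only a minimal sketch: the variable name d and the use of the columns argument are assumptions.

```python
import pandas as pd

# Each inner list is one row: [name, age].
d = [['aman', 45], ['vishal', 56], ['soniya', 67]]

# The columns argument supplies the column names;
# without it pandas would label the columns 0 and 1.
df = pd.DataFrame(d, columns=['name', 'age'])
print(df)
```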
Output:
     name   age
0    aman  45.0
1  vishal  56.0
2  soniya  67.0
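Values such as 45.0 appear when the age column is stored as floating point. One way to get this, sketched here on the assumption that the same list of lists is used, is to convert that column with astype(float).

```python
import pandas as pd

d = [['aman', 45], ['vishal', 56], ['soniya', 67]]
df = pd.DataFrame(d, columns=['name', 'age'])

# Convert only the numeric column; the name column stays as text.
df['age'] = df['age'].astype(float)
print(df)
```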
Output:
       names  marks  s_class  sec
100     aman     45       11    a
101   vishal     56       12    a
102   soniya     67       12    e
103    parth     78       12    d
104  sushant     80       10    c
105    Umang     89       10    d
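A dataframe with these column names and row labels can be built from a dictionary of lists together with an explicit index. The sketch below is one such construction; the variable name data is an assumption, and the values are copied from the output above.

```python
import pandas as pd

# Each dictionary key becomes a column name,
# and each list holds that column's values.
data = {
    'names': ['aman', 'vishal', 'soniya', 'parth', 'sushant', 'Umang'],
    'marks': [45, 56, 67, 78, 80, 89],
    's_class': [11, 12, 12, 12, 10, 10],
    'sec': ['a', 'a', 'e', 'd', 'c', 'd'],
}

# The index argument replaces the default 0, 1, 2, ... row labels.
df = pd.DataFrame(data, index=[100, 101, 102, 103, 104, 105])
print(df)
```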
SELECTION OF DATA FROM A DATAFRAME
I. ROWS SELECTION
a) Selection by label: Rows can be selected by passing a row label to
the .loc indexer in square brackets.
Example 1: selecting single row label
>>> print(df.loc[101])
Output:
names      vishal
marks          56
s_class        12
sec             a
Name: 101, dtype: object
Output:
       names  marks  s_class  sec
103    parth     78       12    d
104  sushant     80       10    c
105    Umang     89       10    d
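Rows 103 to 105 can be picked out either by label or by position. The sketch below rebuilds df from a dictionary (an assumption about how it was created) and shows both approaches. Note that a .loc label slice includes its end label, while an .iloc position slice excludes its end position.

```python
import pandas as pd

df = pd.DataFrame(
    {
        'names': ['aman', 'vishal', 'soniya', 'parth', 'sushant', 'Umang'],
        'marks': [45, 56, 67, 78, 80, 89],
        's_class': [11, 12, 12, 12, 10, 10],
        'sec': ['a', 'a', 'e', 'd', 'c', 'd'],
    },
    index=[100, 101, 102, 103, 104, 105],
)

by_label = df.loc[103:105]     # labels 103, 104 and 105 (end label included)
by_position = df.iloc[3:6]     # positions 3, 4 and 5 (end position excluded)
by_list = df.loc[[103, 105]]   # an explicit list of row labels

print(by_label)
print(by_list)
```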
c) Slice Rows
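Multiple rows can be selected with the slice operator ':'. The slice works on row
positions and excludes the end position, so df[2:4] returns the rows at positions 2 and 3.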
Example 1:
>>> print(df[2:4])
Output:
      names  marks  s_class  sec
102  soniya     67       12    e
103   parth     78       12    d
tail() returns the last n rows (observe the index values). By default it
returns five rows, but you may pass a custom number.
>>> print(df.tail(4))
Output:
       names  marks  s_class  sec
102   soniya     67       12    e
103    parth     78       12    d
104  sushant     80       10    c
105    Umang     89       10    d
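The calls below sketch tail() with and without an argument, together with its counterpart head(), which returns the first n rows. The dataframe is rebuilt here so that the snippet runs on its own; how the original df was constructed is an assumption.

```python
import pandas as pd

df = pd.DataFrame(
    {
        'names': ['aman', 'vishal', 'soniya', 'parth', 'sushant', 'Umang'],
        'marks': [45, 56, 67, 78, 80, 89],
        's_class': [11, 12, 12, 12, 10, 10],
        'sec': ['a', 'a', 'e', 'd', 'c', 'd'],
    },
    index=[100, 101, 102, 103, 104, 105],
)

last_four = df.tail(4)   # rows labelled 102 to 105
last_five = df.tail()    # no argument: the last five rows
first_two = df.head(2)   # rows labelled 100 and 101

print(last_four)
print(first_two)
```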
II. COLUMN SELECTION:
a) To display the contents of a particular column of the
DataFrame, we write:
df['col name']
OR
df.colname
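Both forms return the same Series, but the dot form only works when the column name is a valid Python identifier that does not clash with an existing DataFrame attribute. A small sketch, using a hypothetical column called 'col name' (with a space) purely for illustration:

```python
import pandas as pd

# 'col name' is a made-up column used only to show the limitation.
df = pd.DataFrame({'sec': ['a', 'e'], 'col name': [1, 2]}, index=[100, 101])

# Bracket form and dot form give the same Series for a simple name.
same = df['sec'].equals(df.sec)
print(same)

# A name containing a space can only be reached with brackets.
print(df['col name'])
```

For this reason the bracket form is the safer default.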
Example 1:
print(df['names'])
Output:
100       aman
101     vishal
102     soniya
103      parth
104    sushant
105      Umang
Name: names, dtype: object
Example 2:
print(df.sec)
Output:
100    a
101    a
102    e
103    d
104    c
105    d
Name: sec, dtype: object
b) To access multiple columns, pass a list of column names:
df[['col1', 'col2', ...]]
Example 2:
print(df[['marks','sec']])
Output:
     marks  sec
100     45    a
101     56    a
102     67    e
103     78    d
104     80    c
105     89    d
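One point worth checking is the type of the result: a single column name gives back a Series, whereas a list of names gives back a DataFrame, even when the list holds only one name. A minimal sketch, with df rebuilt from a dictionary as an assumption:

```python
import pandas as pd

df = pd.DataFrame(
    {
        'names': ['aman', 'vishal', 'soniya', 'parth', 'sushant', 'Umang'],
        'marks': [45, 56, 67, 78, 80, 89],
        's_class': [11, 12, 12, 12, 10, 10],
        'sec': ['a', 'a', 'e', 'd', 'c', 'd'],
    },
    index=[100, 101, 102, 103, 104, 105],
)

one = df['marks']                # a Series
many = df[['marks', 'sec']]      # a DataFrame with two columns
one_as_frame = df[['marks']]     # a DataFrame with a single column

print(type(one).__name__)
print(type(many).__name__)
print(type(one_as_frame).__name__)
```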