Lab 9
Lab 9
1 data
data takes various forms like ndarray, series, map, lists, dict, constants and also
another DataFrame.
2 index
For the row labels, the Index to be used for the resulting frame is Optional Default
np.arange(n) if no index is passed.
3 columns
For column labels, the optional default syntax is - np.arange(n). This is only true if
no index is passed.
4 dtype
Data type of each column.
5 copy
This command (or whatever it is) is used for copying of data, if the default is False.
Create DataFrame
A pandas DataFrame can be created using various inputs like −
Lists
dict
Series
Numpy ndarrays
Another DataFrame
In the subsequent sections of this chapter, we will see how to create a DataFrame
using these inputs.
Example 2
import pandas as pd
data = [['Alex',10],['Bob',12],['Clarke',13]]
df = pd.DataFrame(data,columns=['Name','Age'])
print (df)
Example 2
The following example shows how to create a DataFrame by passing a list of
dictionaries and the row indices.
import pandas as pd
data = [{'a': 1, 'b': 2},{'a': 5, 'b': 10, 'c': 20}]
df = pd.DataFrame(data, index=['first', 'second'])
print (df)
#With two column indices with one index with other name
df2 = pd.DataFrame(data, index=['first', 'second'], columns=['a', 'b1'])
print (df1)
print (df2)
Column Addition
We will understand this by adding a new column to an existing data frame.
Example
import pandas as pd
df = pd.DataFrame(d)
# Adding a new column to an existing DataFrame object with column label by passing
new series
print (df)
Column Deletion
Columns can be deleted or popped; let us take an example to understand how.
Example
# Using the previous DataFrame, we will delete a column
df = pd.DataFrame(d)
print ("Our dataframe is:")
print (df)
# importing numpy as np
import numpy as np
# dictionary of lists
dict = {'First Score':[100, 90, np.nan, 95],
'Second Score': [30, 45, 56, np.nan],
'Third Score':[np.nan, 40, 80, 98]}
# importing numpy as np
import numpy as np
# dictionary of lists
dict = {'First Score':[100, 90, np.nan, 95],
'Second Score': [30, 45, 56, np.nan],
'Third Score':[np.nan, 40, 80, 98]}
# importing numpy as np
import numpy as np
# dictionary of lists
print(df)
#Now we drop rows with at least one Nan value (Null value)
# importing pandas as pd
import pandas as pd
# importing numpy as np
import numpy as np
# dictionary of lists
dict = {'First Score':[100, 90, np.nan, 95],
'Second Score': [30, np.nan, 45, 56],
'Third Score':[52, 40, 80, 98],
'Fourth Score':[np.nan, np.nan, np.nan, 65]}
print(df)
# dictionary of lists
dict = {'name':["aparna", "pankaj", "sudhir", "Geeku"],
'degree': ["MBA", "BCA", "M.Tech", "MBA"],
'score':[90, 40, 80, 98]}
# dictionary of lists
dict = {'name':["aparna", "pankaj", "sudhir", "Geeku"],
'degree': ["MBA", "BCA", "M.Tech", "MBA"],
'score':[90, 40, 80, 98]}
print(df)
for i in columns: