Exp 3
(EXPERIMENTS-3)
a) Merging Dataframes
Creating First Dataframe to Perform Merge Operation
# import module
import pandas as pd
# creating DataFrame for Student Details
details = pd.DataFrame({
    'ID': [101, 102, 103, 104, 105, 106, 107, 108, 109, 110],
    'NAME': ['Jagroop', 'Praveen', 'Harjot', 'Pooja', 'Rahul',
             'Nikita', 'Saurabh', 'Ayush', 'Dolly', 'Mohit'],
    'BRANCH': ['CSE', 'CSE', 'CSE', 'CSE', 'CSE', 'CSE', 'CSE',
               'CSE', 'CSE', 'CSE']})
# printing details
print(details)
Merge Operation
# Import module
import pandas as pd
# Creating DataFrame for Student Details
details = pd.DataFrame({
    'ID': [101, 102, 103, 104, 105,
           106, 107, 108, 109, 110],
    'NAME': ['Jagroop', 'Praveen', 'Harjot',
             'Pooja', 'Rahul', 'Nikita',
             'Saurabh', 'Ayush', 'Dolly', 'Mohit'],
    'BRANCH': ['CSE', 'CSE', 'CSE', 'CSE', 'CSE',
               'CSE', 'CSE', 'CSE', 'CSE', 'CSE']})
# Creating DataFrame for Fees Status
fees_status = pd.DataFrame({
    'ID': [101, 102, 103, 104, 105,
           106, 107, 108, 109, 110],
    'PENDING': ['5000', '250', 'NIL',
                '9000', '15000', 'NIL',
                '4500', '1800', '250', 'NIL']})
# Merging the two DataFrames on their shared 'ID' column
print(pd.merge(details, fees_status, on='ID'))
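The merge step can be varied with the how and indicator parameters of pd.merge. The following is a minimal sketch, assuming shortened, hypothetical versions of the two frames in which the IDs only partly overlap; it is meant to illustrate the join types, not to reproduce the experiment's output.

```python
import pandas as pd

# Hypothetical, shortened versions of the two frames (IDs only partly overlap)
details = pd.DataFrame({
    'ID': [101, 102, 103],
    'NAME': ['Jagroop', 'Praveen', 'Harjot'],
})
fees_status = pd.DataFrame({
    'ID': [101, 102, 104],
    'PENDING': ['5000', '250', '9000'],
})

# Inner join (the default): keeps only the IDs present in both frames
inner = pd.merge(details, fees_status, on='ID')

# Outer join: keeps every ID; cells with no match become NaN.
# indicator=True adds a '_merge' column naming the source of each row.
outer = pd.merge(details, fees_status, on='ID', how='outer', indicator=True)

print(inner)
print(outer)
```

An inner join is enough when both frames list the same IDs, as in the experiment; the outer join with indicator is mainly useful for spotting records missing from one side.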
b) Hierarchical Indexing
MultiIndex([('a', 1),
            ('a', 2),
            ('a', 3),
            ('b', 1),
            ('b', 3),
            ('c', 1),
            ('c', 2),
            ('d', 2),
            ('d', 3)],
           )
data['b']
1    1.059833
3   -1.104780
dtype: float64
data['b':'c']
b  1    1.059833
   3   -1.104780
c  1    0.210634
   2    1.423999
dtype: float64
data.loc[['b','d']]
b  1    1.059833
   3   -1.104780
d  2   -1.256163
   3   -1.129026
dtype: float64
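The selections above operate on a Series named data whose construction is not shown. A minimal sketch of how such a Series might be built follows; the random values and the seed are assumptions, so the numbers will differ from the listing above.

```python
import numpy as np
import pandas as pd

# Assumption: 'data' holds random values under a two-level MultiIndex
index = pd.MultiIndex.from_tuples([
    ('a', 1), ('a', 2), ('a', 3),
    ('b', 1), ('b', 3),
    ('c', 1), ('c', 2),
    ('d', 2), ('d', 3),
])
rng = np.random.default_rng(0)  # arbitrary seed, chosen for repeatability
data = pd.Series(rng.standard_normal(len(index)), index=index)

print(data.index)             # the MultiIndex itself
print(data['b'])              # one outer label: result is indexed by the inner level
print(data['b':'c'])          # slice of outer labels (the index must be sorted)
print(data.loc[['b', 'd']])   # list of outer labels
print(data.loc[:, 2])         # select on the inner level across all outer labels
```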
c) Data Deduplication
Removing Duplicate Data from the Dataset
# import module
import pandas as pd
# initializing Data
student_data = {'Name': ['Amit', 'Praveen', 'Jagroop',
                         'Rahul', 'Vishal', 'Suraj',
                         'Rishab', 'Satyapal', 'Amit',
                         'Rahul', 'Praveen', 'Amit'],
                'Roll_no': [23, 54, 29, 36, 59, 38,
                            12, 45, 34, 36, 54, 23],
                'Email': ['[email protected]', '[email protected]',
                          '[email protected]', '[email protected]',
                          '[email protected]', '[email protected]',
                          '[email protected]', '[email protected]',
                          '[email protected]', '[email protected]',
                          '[email protected]',
                          '[email protected]']}
# creating dataframe
df = pd.DataFrame(student_data)
# df.duplicated('Roll_no') marks every row whose Roll_no repeats an earlier row.
# The ~ (NOT) operator inverts that mask, so only the first occurrence of each Roll_no is kept.
non_duplicate = df[~df.duplicated('Roll_no')]
# printing non-duplicate values
print(non_duplicate)
OUTPUT:
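The same filtering can also be written with drop_duplicates. The sketch below uses a shortened, hypothetical dataset (without the Email column) to compare the mask-based approach with drop_duplicates and to show the effect of the keep parameter.

```python
import pandas as pd

# Hypothetical, shortened dataset with one repeated roll number
df = pd.DataFrame({
    'Name': ['Amit', 'Praveen', 'Rahul', 'Amit'],
    'Roll_no': [23, 54, 36, 23],
})

# Boolean mask: True for every repeat after the first occurrence
mask = df.duplicated('Roll_no')

# Two equivalent ways to keep the first row for each roll number
via_mask = df[~mask]
via_drop = df.drop_duplicates(subset='Roll_no', keep='first')

# keep=False discards every row whose roll number occurs more than once
unique_only = df.drop_duplicates(subset='Roll_no', keep=False)

print(via_drop)
print(unique_only)
```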
d) Replacing Values
import pandas as pd
df = {"Array_1": [49.50, 70], "Array_2": [65.1, 49.50]}
data = pd.DataFrame(df)
# Replace every occurrence of 49.50 with 60
print(data.replace(49.50, 60))
You can replace specific values in a DataFrame using the replace() method. Here is a basic example:
import pandas as pd
df = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [4, 5, 6]
})
# Replace all occurrences of 1 with 100
df.replace(1, 100, inplace=True)
print(df)
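replace() also accepts mappings and lists, which helps when several values change at once or when only one column should be affected. A minimal sketch on the same small frame; the replacement values are arbitrary and chosen only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# Replace several values at once with a mapping {old: new}
mapped = df.replace({1: 100, 5: 500})

# Restrict the replacement to column 'A' with a nested mapping
only_a = df.replace({'A': {2: 200}})

# Replace a list of values with a single value
many = df.replace([1, 2], 0)

print(mapped)
print(only_a)
print(many)
```

Without inplace=True, each call returns a new DataFrame and leaves df unchanged, which makes it easier to compare results side by side.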
Replace Values in Pandas Dataframe
# importing pandas as pd
import pandas as pd
# Making data frame from the csv file
df = pd.read_csv("nba.csv")
# Printing the first 10 rows of the data frame for visualization
print(df[:10])