EXPERIMENT 2 - Colab

The document describes cleaning a Titanic dataset by handling missing values, removing duplicates, handling outliers, and displaying the cleaned dataset.

Uploaded by prothemesh
 EXPERIMENT 2

# Importing necessary libraries


import pandas as pd

# Importing Titanic dataset


titanic_data = pd.read_csv("/content/titanic.csv")

# Displaying the first few rows of the dataset


print("Titanic Dataset:")
print(titanic_data.head())
Titanic Dataset:
   PassengerId  Survived  Pclass  \
0            1         0       3
1            2         1       1
2            3         1       3
3            4         1       1
4            5         0       3

                                                Name     Sex   Age  SibSp  \
0                            Braund, Mr. Owen Harris    male  22.0      1
1  Cumings, Mrs. John Bradley (Florence Briggs Th...  female  38.0      1
2                             Heikkinen, Miss. Laina  female  26.0      0
3       Futrelle, Mrs. Jacques Heath (Lily May Peel)  female  35.0      1
4                           Allen, Mr. William Henry    male  35.0      0

   Parch            Ticket     Fare Cabin Embarked
0      0         A/5 21171   7.2500   NaN        S
1      0          PC 17599  71.2833   C85        C
2      0  STON/O2. 3101282   7.9250   NaN        S
3      0            113803  53.1000  C123        S
4      0            373450   8.0500   NaN        S

# Data Cleaning
# Handling missing values: drop every row that has a missing value in any column
titanic_data.dropna(inplace=True)
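
Dropping every incomplete row can discard a lot of data when a single column is sparsely filled, as the NaN entries under Cabin in the preview suggest. A minimal sketch of a gentler alternative follows. The toy frame, its values, and the choice of median and mode imputation are assumptions for illustration, not the notebook's method.

```python
import pandas as pd
import numpy as np

# Hypothetical toy data standing in for a few Titanic columns
toy = pd.DataFrame({
    "Age": [22.0, np.nan, 26.0, 35.0],
    "Cabin": [np.nan, "C85", np.nan, "C123"],
    "Embarked": ["S", "C", None, "S"],
})

# Count missing values per column before deciding what to drop
missing_counts = toy.isna().sum()
print(missing_counts)

# Assumption: fill numeric gaps with the median, categorical gaps with the mode
filled = toy.copy()
filled["Age"] = filled["Age"].fillna(filled["Age"].median())
filled["Embarked"] = filled["Embarked"].fillna(filled["Embarked"].mode()[0])

# Drop the sparsely filled column instead of dropping the rows
filled = filled.drop(columns=["Cabin"])
print(filled)
```

Whether imputing or dropping is appropriate depends on how the cleaned data will be used; the sketch only shows that all four rows survive this way.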

# Removing duplicates
titanic_data.drop_duplicates(inplace=True)
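
Before removing duplicates it can help to check how many there are. The sketch below uses a made-up frame (the column values are hypothetical) to show `duplicated()` next to `drop_duplicates()`.

```python
import pandas as pd

# Hypothetical toy data with one repeated row
toy = pd.DataFrame({
    "PassengerId": [1, 2, 2, 3],
    "Name": ["A", "B", "B", "C"],
})

# duplicated() marks every repeat of an earlier row as True
n_duplicates = int(toy.duplicated().sum())
print(f"Duplicate rows: {n_duplicates}")

# drop_duplicates() keeps the first occurrence by default
deduped = toy.drop_duplicates()
print(deduped)
```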

# Handling outliers (if any)


# For example, keeping only rows whose 'Age' is below 100
titanic_data = titanic_data[titanic_data['Age'] < 100]
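
A fixed cutoff such as 100 is one way to flag outliers. Another common option is the interquartile range (IQR) rule, sketched here on a made-up series. The sample ages and the 1.5 multiplier are assumptions for illustration, and this is a swapped-in technique rather than what the notebook does.

```python
import pandas as pd

# Hypothetical ages, including one implausible value
ages = pd.Series([4.0, 35.0, 38.0, 54.0, 58.0, 250.0], name="Age")

# Interquartile range and the conventional 1.5 * IQR fences
q1 = ages.quantile(0.25)
q3 = ages.quantile(0.75)
iqr = q3 - q1
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr

# Keep only values inside the fences (inclusive)
kept = ages[ages.between(lower, upper)]
print(kept)
```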

# Displaying the cleaned data


print("\nCleaned Titanic Dataset:")
print(titanic_data.head())

Cleaned Titanic Dataset:
    PassengerId  Survived  Pclass  \
1             2         1       1
3             4         1       1
6             7         0       1
10           11         1       3
11           12         1       1

                                                 Name     Sex   Age  SibSp  \
1   Cumings, Mrs. John Bradley (Florence Briggs Th...  female  38.0      1
3        Futrelle, Mrs. Jacques Heath (Lily May Peel)  female  35.0      1
6                             McCarthy, Mr. Timothy J    male  54.0      0
10                    Sandstrom, Miss. Marguerite Rut  female   4.0      1
11                           Bonnell, Miss. Elizabeth  female  58.0      0

    Parch    Ticket     Fare Cabin Embarked
1       0  PC 17599  71.2833   C85        C
3       0    113803  53.1000  C123        S
6       0     17463  51.8625   E46        S
10      1   PP 9549  16.7000    G6        S
11      0    113783  26.5500  C103        S
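
The index jumps in the cleaned output (1, 3, 6, 10, 11) show that rows were removed along the way. A rough sketch of how to see how many rows each step removes is given below. The helper `clean_with_report`, its `age_limit` parameter, and the toy data are hypothetical names and values introduced only for illustration.

```python
import pandas as pd
import numpy as np

def clean_with_report(df, age_limit=100):
    """Apply the three cleaning steps and record the row count after each."""
    report = {"start": len(df)}
    df = df.dropna()                 # drop rows with any missing value
    report["after_dropna"] = len(df)
    df = df.drop_duplicates()        # drop exact repeats of earlier rows
    report["after_drop_duplicates"] = len(df)
    df = df[df["Age"] < age_limit]   # keep plausible ages only
    report["after_age_filter"] = len(df)
    return df, report

# Hypothetical toy data: one missing Cabin, one duplicate, one missing Age, one age of 120
toy = pd.DataFrame({
    "Age": [22.0, 38.0, 38.0, np.nan, 120.0],
    "Cabin": [np.nan, "C85", "C85", "E46", "G6"],
})
cleaned, report = clean_with_report(toy)
print(report)
```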
