Assignment 12

Uploaded by

riteshpc13
Assignment no. 12

Write a Python program that creates a DataFrame from a CSV file and applies one-hot encoding to its categorical columns.

import csv

# Define the header and rows
header = ['Name', 'Color', 'Size']
rows = [
    ['Alice', 'Red', 'Small'],
    ['Bob', 'Blue', 'Medium'],
    ['Charlie', 'Red', 'Large'],
    ['David', 'Green', 'Medium'],
    ['Eva', 'Blue', 'Small'],
]

# Specify the file name
filename = 'data.csv'

# Write to the CSV file (newline='' prevents extra blank lines on Windows)
with open(filename, 'w', newline='') as file:
    writer = csv.writer(file)
    writer.writerow(header)  # Write the header
    writer.writerows(rows)   # Write the rows

print(f"CSV file '{filename}' created successfully.")

import pandas as pd

# Load the CSV file into a DataFrame
df = pd.read_csv('data.csv')

# Display the original DataFrame
print("Original DataFrame:")
print(df)

# Perform one-hot encoding on the categorical columns.
# dtype=int gives 0/1 values; pandas 2.0 and later default to True/False.
df_encoded = pd.get_dummies(df, columns=['Color', 'Size'], dtype=int)

# Display the DataFrame with one-hot encoding
print("\nDataFrame with One-Hot Encoding:")
print(df_encoded)

# Optionally, save the encoded DataFrame to a new CSV file
df_encoded.to_csv('data_encoded.csv', index=False)

Output:

Original DataFrame:
      Name  Color    Size
0    Alice    Red   Small
1      Bob   Blue  Medium
2  Charlie    Red   Large
3    David  Green  Medium
4      Eva   Blue   Small

DataFrame with One-Hot Encoding:
      Name  Color_Blue  Color_Green  ...  Size_Large  Size_Medium  Size_Small
0    Alice           0            0  ...           0            0           1
1      Bob           1            0  ...           0            1           0
2  Charlie           0            0  ...           1            0           0
3    David           0            1  ...           0            1           0
4      Eva           1            0  ...           0            0           1

[5 rows x 7 columns]
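
The "..." in the printed output is pandas abbreviating the middle columns to fit the console width; the column hidden here is Color_Red. The following is a minimal sketch of one way to print every column. It assumes the same five rows as above, rebuilt in memory instead of being read from data.csv, and uses dtype=int so the values print as 0/1.

```python
import pandas as pd

# Same five rows as data.csv, rebuilt in memory (assumption: this stands in
# for pd.read_csv('data.csv') so the sketch runs without the file).
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eva'],
    'Color': ['Red', 'Blue', 'Red', 'Green', 'Blue'],
    'Size': ['Small', 'Medium', 'Large', 'Medium', 'Small'],
})

df_encoded = pd.get_dummies(df, columns=['Color', 'Size'], dtype=int)

# to_string() renders the whole frame, ignoring the console width limit.
full_text = df_encoded.to_string()
print(full_text)
```

With the full width printed, all seven columns appear: Name, three Color_* columns and three Size_* columns.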

To perform one-hot encoding on categorical data from a CSV file using pandas, follow these steps:

1. Load the CSV file: Use pandas to read the data from the CSV file into a DataFrame.

2. Perform one-hot encoding: Use the pandas get_dummies() function to encode the categorical columns as one-hot indicator columns.

3. Display or save the result: Show the DataFrame with the one-hot encoded features, and optionally save it to a new CSV file.
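
The three steps can also be wrapped in a small reusable function. This is only a sketch: the helper name one_hot_from_csv and the two-row CSV text are assumptions made for illustration, and io.StringIO stands in for a real file so the example runs without touching the disk.

```python
import io
import pandas as pd

def one_hot_from_csv(source, columns):
    """Hypothetical helper: read a CSV and one-hot encode the given columns."""
    df = pd.read_csv(source)                               # step 1: load
    return pd.get_dummies(df, columns=columns, dtype=int)  # step 2: encode

# In-memory stand-in for a CSV file (assumed contents, for illustration only).
csv_text = "Name,Color,Size\nAlice,Red,Small\nBob,Blue,Medium\n"

encoded = one_hot_from_csv(io.StringIO(csv_text), ['Color', 'Size'])
print(encoded)                                             # step 3: display

# Step 3 (optional): save, here to an in-memory buffer instead of a file.
out = io.StringIO()
encoded.to_csv(out, index=False)
```

Passing a file path such as 'data.csv' instead of the StringIO object works the same way, since read_csv accepts both.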

Explanation

1. Load CSV File:

   - pd.read_csv('data.csv') reads the CSV file into a DataFrame.

2. One-Hot Encoding:

   - pd.get_dummies(df, columns=['Color', 'Size']) performs one-hot encoding on the specified categorical columns. The columns parameter specifies which columns to encode.

   - Each unique value in a categorical column becomes its own indicator column, named <column>_<value> (for example, Color_Red), which marks whether a row has that value.

3. Display the Result:

   - Print the original and the one-hot encoded DataFrame to compare them.

4. Save to CSV (Optional):

   - df_encoded.to_csv('data_encoded.csv', index=False) saves the one-hot encoded DataFrame to a new CSV file. index=False leaves the row index out of the file.
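
To make the "one indicator column per unique value" idea concrete, the sketch below checks that each row carries exactly one 1 among the Color_* columns and then recovers the original value from them. It assumes the assignment's Name and Color data rebuilt in memory; the variable names are chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eva'],
    'Color': ['Red', 'Blue', 'Red', 'Green', 'Blue'],
})
encoded = pd.get_dummies(df, columns=['Color'], dtype=int)

# Collect the indicator columns that get_dummies created for Color.
color_cols = [c for c in encoded.columns if c.startswith('Color_')]

# Each row has exactly one 1 among the Color_* columns.
ones_per_row = encoded[color_cols].sum(axis=1)
print(ones_per_row.tolist())   # [1, 1, 1, 1, 1]

# idxmax picks the column holding the 1; dropping the prefix recovers the value.
recovered = encoded[color_cols].idxmax(axis=1).str[len('Color_'):]
print(recovered.tolist())      # ['Red', 'Blue', 'Red', 'Green', 'Blue']
```

The recovered list matches the original Color column, which shows that the encoding loses no information.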
