Introduction to Pandas Programming 2

The document provides an introduction to Pandas programming, focusing on data cleaning, manipulation, and analysis techniques. It covers handling missing data, adding/removing columns, grouping and aggregation, sorting, merging, and exporting data. Additionally, it includes a simple data analysis example demonstrating total sales by product and sales summary by region.

Uploaded by

Marou fan

Introduction to Pandas Programming

4. Data Cleaning

Handling Missing Data:

• Check for missing values:

python
print(df.isnull().sum())  # number of missing values in each column

• Fill missing values:

python
# Assign the result back; inplace=True on a selected column may not update df in recent pandas versions
df["Age"] = df["Age"].fillna(df["Age"].mean())  # Fill with the column mean

• Drop rows with missing values:

python
df.dropna(inplace=True)  # removes every row that has at least one missing value
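The three steps above can be tried end to end on a small frame. The following is a minimal sketch, assuming a hypothetical DataFrame with "Name", "Age" and "Salary" columns; the values are invented for illustration and are not part of the original dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, invented for illustration only
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Cara", "Dan"],
    "Age": [30.0, np.nan, 40.0, 50.0],
    "Salary": [50000.0, 60000.0, np.nan, 80000.0],
})

# Count the missing values in each column
missing = df.isnull().sum()
print(missing)

# Fill the missing age with the mean of the known ages (30, 40, 50 -> 40)
df["Age"] = df["Age"].fillna(df["Age"].mean())

# Drop rows that still contain a missing value (Cara has no salary)
cleaned = df.dropna()
print(cleaned)
```

Filling before dropping keeps Bob's row, since only his age was missing; dropping first would have removed both Bob and Cara.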

5. Adding and Removing Columns

Add a New Column:

python
df["Bonus"] = df["Salary"] * 0.1  # bonus is 10% of salary
print(df)

Remove a Column:

python
df.drop("Bonus", axis=1, inplace=True)  # axis=1 drops a column rather than a row
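As a sketch of both operations together, the example below assumes a hypothetical two-row frame (invented data). It drops the column without `inplace=True`, which returns a new DataFrame and leaves the original unchanged.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration only
df = pd.DataFrame({"Name": ["Alice", "Bob"], "Salary": [50000, 60000]})

# Add a column computed from an existing one (10% of salary)
df["Bonus"] = df["Salary"] * 0.1

# Without inplace=True, drop() returns a new DataFrame; df keeps its Bonus column
without_bonus = df.drop(columns="Bonus")

print(df.columns.tolist())
print(without_bonus.columns.tolist())
```

Whether to modify the frame in place or keep both versions depends on whether the original is still needed later in the analysis.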

6. Grouping and Aggregation

• Group by a column and calculate summary statistics:

python
grouped = df.groupby("Age")["Salary"].mean()  # mean salary for each age
print(grouped)

• Aggregate multiple functions:

python
agg = df.groupby("Age").agg({"Salary": ["mean", "sum"]})  # mean and total salary per age
print(agg)
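To see what the two calls return, here is a small sketch on a hypothetical frame in which each age appears twice (invented data). Note that passing a dictionary of lists to `agg` produces two-level column labels.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration only
df = pd.DataFrame({
    "Age": [30, 30, 40, 40],
    "Salary": [50000, 70000, 60000, 80000],
})

# One value per group: the mean salary for each age
mean_by_age = df.groupby("Age")["Salary"].mean()
print(mean_by_age)

# Several statistics per group; columns become ("Salary", "mean") and ("Salary", "sum")
stats_by_age = df.groupby("Age").agg({"Salary": ["mean", "sum"]})
print(stats_by_age)

# A single cell is addressed by the group key and the two-level column label
total_30 = stats_by_age.loc[30, ("Salary", "sum")]
print(total_30)
```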

7. Sorting Data

• Sort by a single column:

python
df.sort_values("Salary", ascending=False, inplace=True)  # highest salary first
print(df)

• Sort by multiple columns:

python
# Age ascending; rows with the same age are ordered by Salary descending
df.sort_values(["Age", "Salary"], ascending=[True, False], inplace=True)
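The effect of a per-column `ascending` list is easiest to see on data with ties. The sketch below assumes a hypothetical four-row frame (invented data) and resets the index afterwards so the row labels follow the new order.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration only
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Cara", "Dan"],
    "Age": [40, 30, 30, 40],
    "Salary": [60000, 50000, 70000, 80000],
})

# Age ascending; within the same age, Salary descending
ordered = df.sort_values(["Age", "Salary"], ascending=[True, False])

# sort_values keeps the original row labels; reset them to 0..n-1
ordered = ordered.reset_index(drop=True)
print(ordered)
```

Here the two 30-year-olds come first with Cara (70000) ahead of Bob (50000), followed by Dan (80000) and Alice (60000).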

8. Merging and Joining DataFrames

Merging:

python
import pandas as pd

df1 = pd.DataFrame({"ID": [1, 2], "Name": ["Alice", "Bob"]})
df2 = pd.DataFrame({"ID": [1, 2], "Salary": [50000, 60000]})

# Combine the rows of df1 and df2 that share the same ID value
merged = pd.merge(df1, df2, on="ID")
print(merged)

Joining:

python
# join() matches rows on the index, so make ID the index of both frames first
df1 = df1.set_index("ID")
df2 = df2.set_index("ID")

joined = df1.join(df2)
print(joined)
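The two approaches produce the same data and differ mainly in where the key lives: `merge` matches on a column, `join` matches on the index. A minimal sketch comparing them, reusing the two frames from the examples above:

```python
import pandas as pd

df1 = pd.DataFrame({"ID": [1, 2], "Name": ["Alice", "Bob"]})
df2 = pd.DataFrame({"ID": [1, 2], "Salary": [50000, 60000]})

# merge: ID stays a regular column in the result
merged = pd.merge(df1, df2, on="ID")

# join: ID is the index of both inputs and of the result
joined = df1.set_index("ID").join(df2.set_index("ID"))

# Moving the index back into a column gives the same layout as the merge
joined_flat = joined.reset_index()

print(merged)
print(joined_flat)
```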

9. Exporting Data

To a CSV File:

python
df.to_csv("output.csv", index=False)  # index=False leaves the row labels out of the file

To an Excel File:

python
df.to_excel("output.xlsx", index=False)  # needs an Excel writer engine such as openpyxl installed
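One way to check an export is to read it back. The sketch below writes CSV to an in-memory buffer instead of a file so that it runs without touching the disk; the frame is hypothetical (invented data), and the same `to_csv` call accepts a file name such as "output.csv".

```python
import io

import pandas as pd

# Hypothetical sample data, invented for illustration only
df = pd.DataFrame({"Name": ["Alice", "Bob"], "Salary": [50000, 60000]})

# Write CSV text into an in-memory buffer rather than a file
buffer = io.StringIO()
df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)

# Read the text back to confirm the round trip preserves the data
restored = pd.read_csv(io.StringIO(csv_text))
print(restored)
```

With `index=False` the first line of the output is the header `Name,Salary`; without it, an extra unnamed column holding the row labels would be written.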

10. Example: Simple Data Analysis

Dataset Example:

python
import pandas as pd

# Six sales records: two per product, three per region
data = {
    "Product": ["A", "B", "C", "A", "B", "C"],
    "Sales": [100, 200, 300, 400, 500, 600],
    "Region": ["East", "West", "East", "West", "East", "West"],
}
df = pd.DataFrame(data)

Analysis:

• Total sales by product:

python
print(df.groupby("Product")["Sales"].sum())  # one total per product

• Sales summary by region:

python
print(df.groupby("Region")["Sales"].agg(["mean", "sum"]))  # average and total per region
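Putting the dataset and both analyses together, the results can be checked by hand: product A totals 100 + 400 = 500, and the East region covers the sales 100, 300 and 500. The sketch below reuses the dataset from this section; the final cross-check is an addition for illustration.

```python
import pandas as pd

data = {
    "Product": ["A", "B", "C", "A", "B", "C"],
    "Sales": [100, 200, 300, 400, 500, 600],
    "Region": ["East", "West", "East", "West", "East", "West"],
}
df = pd.DataFrame(data)

# Total sales by product: A = 500, B = 700, C = 900
by_product = df.groupby("Product")["Sales"].sum()
print(by_product)

# Sales summary by region: East mean 300, sum 900; West mean 400, sum 1200
by_region = df.groupby("Region")["Sales"].agg(["mean", "sum"])
print(by_region)

# Sanity check: both groupings must add up to the same grand total (2100)
totals_match = by_product.sum() == by_region["sum"].sum() == df["Sales"].sum()
print(totals_match)
```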
