Introduction to Pandas Programming 2
4. Data Cleaning
Check for Missing Values:
```python
print(df.isnull().sum())  # number of missing values in each column
```
Fill Missing Values:
```python
# Fill with mean; assign the result back instead of using inplace=True on a selected column
df["Age"] = df["Age"].fillna(df["Age"].mean())
```
Drop Rows with Missing Values:
```python
df.dropna(inplace=True)  # removes every row that contains at least one missing value
```
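The cleaning steps above can be tried end to end on a small frame. This is a minimal sketch: the sample rows are hypothetical, and only the column names "Age" and "Salary" come from this section.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; one age and one name are missing on purpose.
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Carol", None],
    "Age": [25.0, np.nan, 35.0, 40.0],
    "Salary": [50000, 60000, 70000, 80000],
})

# Count the missing values in each column.
missing = df.isnull().sum()
print(missing)

# Fill the missing age with the mean of the known ages, assigning the result back.
df["Age"] = df["Age"].fillna(df["Age"].mean())

# Drop rows that still contain a missing value (here, the row without a name).
cleaned = df.dropna()
print(cleaned)
```

Using `cleaned = df.dropna()` instead of `inplace=True` keeps the original frame available for comparison; either style works for this step.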
5. Adding and Removing Columns
Add a Column:
```python
df["Bonus"] = df["Salary"] * 0.1  # bonus is 10% of salary
print(df)
```
Remove a Column:
```python
df.drop("Bonus", axis=1, inplace=True)  # axis=1 means drop a column, not a row
```
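Adding and removing a column can also be done without modifying the original frame. The sketch below uses `assign` and `drop(columns=...)`, both of which return a new DataFrame; the sample data and the variable names `with_bonus` and `without_bonus` are illustrative assumptions.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"Name": ["Alice", "Bob"], "Salary": [50000, 60000]})

# assign() returns a new DataFrame with the extra column; df itself is unchanged.
with_bonus = df.assign(Bonus=df["Salary"] * 0.1)

# drop(columns=...) is equivalent to drop(..., axis=1) and also returns a new DataFrame.
without_bonus = with_bonus.drop(columns="Bonus")

print(with_bonus)
print(without_bonus)
```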
6. Grouping and Aggregation
Group by a Column:
```python
grouped = df.groupby("Age")["Salary"].mean()  # average salary for each age
print(grouped)
```
Aggregate multiple functions:
```python
agg = df.groupby("Age").agg({"Salary": ["mean", "sum"]})  # mean and total salary per age
print(agg)
```
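Passing a dictionary of lists to `agg` produces two-level column labels such as `("Salary", "mean")`. When flat column names are easier to work with, named aggregation is one alternative. The following is a sketch with hypothetical data; the output names `mean_salary` and `total_salary` are chosen only for illustration.

```python
import pandas as pd

# Hypothetical sample data with two people in each age group.
df = pd.DataFrame({
    "Age": [25, 25, 30, 30],
    "Salary": [50000, 54000, 60000, 70000],
})

# Each keyword becomes an output column: name=(source column, function).
summary = df.groupby("Age").agg(
    mean_salary=("Salary", "mean"),
    total_salary=("Salary", "sum"),
)
print(summary)
```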
7. Sorting Data
Sort by One Column:
```python
df.sort_values("Salary", ascending=False, inplace=True)  # highest salary first
print(df)
```
Sort by Multiple Columns:
```python
# Age ascending, then Salary descending within each age
df.sort_values(["Age", "Salary"], ascending=[True, False], inplace=True)
```
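After sorting, the rows keep their old index labels, which can be confusing when printing. A small sketch of the multi-column sort with a fresh 0..n-1 index follows; the sample rows and the name `ranked` are assumptions for illustration.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Carol", "Dan"],
    "Age": [30, 25, 30, 25],
    "Salary": [60000, 50000, 70000, 55000],
})

# Sort by Age (ascending), then by Salary (descending) within each age.
# ignore_index=True renumbers the rows of the result from 0.
ranked = df.sort_values(["Age", "Salary"], ascending=[True, False], ignore_index=True)
print(ranked)
```

Here the result is returned as a new frame rather than sorted in place, so `df` keeps its original order.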
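8. Merging and Joining
Merging:
```python
df1 = pd.DataFrame({"ID": [1, 2], "Name": ["Alice", "Bob"]})
df2 = pd.DataFrame({"ID": [1, 2], "Salary": [50000, 60000]})
merged = pd.merge(df1, df2, on="ID")  # combine rows that share the same ID
print(merged)
```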
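The frames df1 and df2 in this section share every ID, so the type of merge makes no visible difference. The sketch below uses hypothetical frames whose IDs only partly overlap to show how the `how` parameter of `pd.merge` changes the result.

```python
import pandas as pd

# Hypothetical frames: ID 3 appears only on the left, ID 4 only on the right.
df1 = pd.DataFrame({"ID": [1, 2, 3], "Name": ["Alice", "Bob", "Carol"]})
df2 = pd.DataFrame({"ID": [1, 2, 4], "Salary": [50000, 60000, 65000]})

# Default how="inner": keep only IDs present in both frames.
inner = pd.merge(df1, df2, on="ID")

# how="left": keep every row of df1; Salary is NaN where df2 has no match.
left = pd.merge(df1, df2, on="ID", how="left")

# how="outer": keep IDs from either frame; indicator=True adds a "_merge"
# column saying where each row came from.
outer = pd.merge(df1, df2, on="ID", how="outer", indicator=True)

print(inner)
print(left)
print(outer)
```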
Joining:
```python
# join() matches rows on the index, so make ID the index of both frames first
df1 = df1.set_index("ID")
df2 = df2.set_index("ID")
joined = df1.join(df2)
print(joined)
```
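Because `join` aligns on the index, it should give the same result as an index-based `pd.merge`. The sketch below rebuilds the two frames so that it runs on its own and compares both approaches; the name `via_merge` is an illustrative assumption.

```python
import pandas as pd

# Same data as in this section, indexed by ID.
df1 = pd.DataFrame({"ID": [1, 2], "Name": ["Alice", "Bob"]}).set_index("ID")
df2 = pd.DataFrame({"ID": [1, 2], "Salary": [50000, 60000]}).set_index("ID")

# join() performs a left join on the index by default.
joined = df1.join(df2)

# The same operation written with merge(), matching on both indexes.
via_merge = pd.merge(df1, df2, left_index=True, right_index=True, how="left")

print(joined)
print(joined.equals(via_merge))
```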
9. Exporting Data
To a CSV File:
```python
df.to_csv("output.csv", index=False)  # index=False leaves the row index out of the file
```
To an Excel File:
```python
# Writing .xlsx files requires an Excel engine such as openpyxl to be installed
df.to_excel("output.xlsx", index=False)
```
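One way to check what an export produces is to write the CSV to an in-memory buffer and read it back. This is a sketch with hypothetical data; it uses `io.StringIO` in place of a file path so that nothing is written to disk.

```python
import io

import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"Name": ["Alice", "Bob"], "Salary": [50000, 60000]})

# to_csv accepts a file-like object as well as a file name.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)

# Read the text back to confirm the columns and values survived the round trip.
restored = pd.read_csv(io.StringIO(csv_text))
print(restored)
```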
10. Example: Sales Data Analysis
Dataset Example:
```python
import pandas as pd

data = {
    "Product": ["A", "B", "C", "A", "B", "C"],
    "Sales": [100, 200, 300, 400, 500, 600],
    "Region": ["East", "West", "East", "West", "East", "West"],
}
df = pd.DataFrame(data)
```
Analysis:
```python
# Total sales for each product
print(df.groupby("Product")["Sales"].sum())
```
```python
# Average and total sales for each region
print(df.groupby("Region")["Sales"].agg(["mean", "sum"]))
```
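The two groupings above can be combined into a single table with one row per product and one column per region. The sketch below uses `pivot_table` on the same dataset; the variable names `pivot`, `by_product` and `by_region` are illustrative assumptions.

```python
import pandas as pd

# Same dataset as in the example above.
data = {
    "Product": ["A", "B", "C", "A", "B", "C"],
    "Sales": [100, 200, 300, 400, 500, 600],
    "Region": ["East", "West", "East", "West", "East", "West"],
}
df = pd.DataFrame(data)

# Rows are products, columns are regions, cells are summed sales.
pivot = df.pivot_table(index="Product", columns="Region", values="Sales", aggfunc="sum")
print(pivot)

# Row and column sums of the pivot reproduce the per-product and per-region totals.
by_product = pivot.sum(axis=1)
by_region = pivot.sum(axis=0)
print(by_product)
print(by_region)
```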