Introduction to Pandas Programming
Pandas is a powerful Python library for data manipulation and analysis. It is widely used in data
science for working with structured data like spreadsheets and SQL tables. Below are essential
concepts and operations in Pandas:
1. Importing Pandas
To use Pandas, first install it if it is not already installed:
bash
pip install pandas
Then, import it:
python
import pandas as pd
2. Creating DataFrames
From a Dictionary:
python
data = {
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "Salary": [50000, 60000, 70000],
}
df = pd.DataFrame(data)
print(df)
From a CSV File:
python
df = pd.read_csv("data.csv")
print(df.head()) # Displays the first 5 rows
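To try read_csv without a file on disk, the sketch below feeds it an in-memory string instead. The CSV text is a hypothetical stand-in for data.csv, reusing the sample values from the dictionary example; a real file may need extra arguments such as a different separator.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for data.csv (assumed for illustration)
csv_text = """Name,Age,Salary
Alice,25,50000
Bob,30,60000
Charlie,35,70000
"""

# read_csv accepts a file path or any file-like object
df = pd.read_csv(io.StringIO(csv_text))

print(df.head(2))  # head() takes an optional row count; the default is 5
print(df.dtypes)   # Column types inferred from the text
```

Numeric columns such as Age and Salary are parsed as integers here because every value in them is a whole number.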
From an Excel File (reading .xlsx files requires an engine such as openpyxl to be installed):
python
df = pd.read_excel("data.xlsx")
df.info() # Summary of the dataset (info() prints directly and returns None)
3. Basic DataFrame Operations
Viewing Data:
python
print(df.head()) # First 5 rows
print(df.tail()) # Last 5 rows
print(df.shape) # Tuple of (number of rows, number of columns)
print(df.columns) # Index object holding the column names
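As a minimal, self-contained sketch of these viewing calls, the block below rebuilds the three-row sample DataFrame from the dictionary example and unpacks its shape. The variable names n_rows and n_cols are chosen here for illustration.

```python
import pandas as pd

# Same sample data as the dictionary example above
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "Salary": [50000, 60000, 70000],
})

n_rows, n_cols = df.shape   # shape is a (rows, columns) tuple
print(n_rows, n_cols)       # 3 3
print(df.columns.tolist())  # Column names as a plain Python list
print(df.head(2))           # First 2 rows
print(df.tail(1))           # Last row only
```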
Selecting Data:
Select a single column:
python
print(df["Name"])
Select multiple columns:
python
print(df[["Name", "Salary"]])
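The two selection forms above return different types, which the following sketch makes visible: single brackets give a Series, double brackets give a DataFrame even for one column. The sample data is the same as in the dictionary example; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "Salary": [50000, 60000, 70000],
})

names = df["Name"]               # Single brackets: a Series
names_frame = df[["Name"]]       # Double brackets: a one-column DataFrame
subset = df[["Name", "Salary"]]  # Two columns, in the order listed

print(type(names).__name__)        # Series
print(type(names_frame).__name__)  # DataFrame
print(subset.shape)                # (3, 2)
```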
Select rows by integer position:
python
print(df.iloc[0]) # First row
print(df.iloc[1:3]) # Second and third rows (positions 1 and 2; the end position is excluded)
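Because iloc works on integer positions rather than index labels, it can help to compare it with label-based loc. The sketch below assumes a hypothetical string index ("a", "b", "c") that is not part of the original example, so that positions and labels differ.

```python
import pandas as pd

# The string index labels are an assumption made for illustration
df = pd.DataFrame(
    {
        "Name": ["Alice", "Bob", "Charlie"],
        "Age": [25, 30, 35],
        "Salary": [50000, 60000, 70000],
    },
    index=["a", "b", "c"],
)

first = df.iloc[0]          # First row, returned as a Series
by_pos = df.iloc[1:3]       # Positions 1 and 2; the end position is excluded
by_label = df.loc["b":"c"]  # Labels "b" through "c"; the end label is included

print(first["Name"])            # Alice
print(by_pos.equals(by_label))  # True
```

With the default integer index the two look similar, but a loc slice still includes its end label while an iloc slice excludes its end position.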
Select rows based on condition:
python
print(df[df["Age"] > 30]) # Rows where Age > 30
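Conditions can also be combined. The sketch below, using the same sample data, joins boolean masks with & (and) and | (or); each condition needs its own parentheses because these operators bind more tightly than comparisons. The variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "Salary": [50000, 60000, 70000],
})

older = df[df["Age"] > 30]  # Single condition, as above
mid = df[(df["Age"] >= 30) & (df["Salary"] < 70000)]    # Both must hold
either = df[(df["Age"] < 30) | (df["Salary"] >= 70000)]  # At least one holds

print(older["Name"].tolist())   # ['Charlie']
print(mid["Name"].tolist())     # ['Bob']
print(either["Name"].tolist())  # ['Alice', 'Charlie']
```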