IP 12th Chapter 3

Chapter 2 discusses data handling using Pandas, focusing on the DataFrame structure, which organizes data in rows and columns. It covers how to create DataFrames from various data sources, indexing methods, operations on rows and columns, and methods for importing and exporting data to and from CSV files. The chapter provides practical examples and syntax for performing these operations.

Uploaded by

LAKSHYA GOSWAMI
Chapter 2 : Data Handling using Pandas 2

Class 12th Informatics Practices


DataFrame
● A data frame is a two-dimensional tabular data structure in which data is organized
in rows and columns.
● It is a popular data structure used in data analysis and manipulation in various
programming languages, including Python, R, and Julia.
● Basic Features of DataFrame
1. Columns may be of different types
2. Size can be changed (mutable)
3. Labelled axes (rows / columns)
4. Arithmetic operations on rows and columns
● Ways to create a DataFrame
1. Lists
2. Dictionary
3. Series
4. Numpy ndarrays
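The four basic features listed above can be seen in a small sketch. The column names, row labels, and values below are hypothetical sample data, not taken from the chapter.

```python
import pandas as pd

# Hypothetical sample data: one text column and one numeric column
df = pd.DataFrame(
    {'Name': ['Freya', 'Mohak'], 'Marks': [80, 90]},
    index=['r1', 'r2'],  # labelled rows
)

# 1. Columns may be of different types
print(df.dtypes)

# 2. Size is mutable: adding a column changes the shape
df['Bonus'] = [5, 10]
print(df.shape)  # (2, 3)

# 3. Labelled axes: row labels and column labels
print(list(df.index), list(df.columns))

# 4. Arithmetic operations on columns
df['Total'] = df['Marks'] + df['Bonus']
print(df)
```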

Creation of Dataframe
1. Empty Dataframe
import pandas as pd

df = pd.DataFrame()
print(df)

2. Create a DataFrame from Lists


import pandas as pd

# Each inner list is one row; columns= supplies the column labels
data = [['Freya', 10], ['Mohak', 12], ['Dwivedi', 13]]
df = pd.DataFrame(data, columns=['Name', 'Age'])
print(df)

3. Create a DataFrame from List of Dicts


import pandas as pd

# Create a list of dictionaries
data = [
    {'Name': 'John', 'Age': 25, 'City': 'New York'},
    {'Name': 'Emma', 'Age': 30, 'City': 'London'},
    {'Name': 'Tom', 'Age': 35, 'City': 'Paris'}
]

# Create the DataFrame
df = pd.DataFrame(data)

# Print the DataFrame
print(df)

4. Create a DataFrame from Dict of ndarrays / Lists


import pandas as pd
import numpy as np

# Create a dictionary of ndarrays or lists
data = {
    'Name': ['John', 'Emma', 'Tom'],
    'Age': np.array([25, 30, 35]),
    'City': ['New York', 'London', 'Paris']
}

# Create the DataFrame
df = pd.DataFrame(data)

# Print the DataFrame
print(df)

5. Create a DataFrame from Dict of Series
import pandas as pd

# Define the dictionary of Series
data = {'Name': pd.Series(['Alice', 'Bob', 'Charlie']),
        'Age': pd.Series([25, 30, 35]),
        'City': pd.Series(['New York', 'London', 'Paris'])}

# Create the DataFrame
df = pd.DataFrame(data)

# Print the DataFrame
print(df)

6. Create a DataFrame from csv file


import pandas as pd

# Read the CSV file into a DataFrame
df = pd.read_csv('data.csv')

# Print the DataFrame
print(df)

DataFrame Indexing
● DataFrame indexing refers to the process of selecting and retrieving specific data
elements from a DataFrame.
● It allows you to access and manipulate the data within a DataFrame using various
methods and techniques.
● Indexing in a DataFrame can be performed in multiple ways:
1. Label-based Indexing
● The loc[] accessor performs label-based indexing, allowing you to
access data based on row and column labels.
● You can use it to retrieve specific rows or columns by providing the labels
as arguments.
● Syntax : df.loc[row_label, column_label]
2. Boolean Indexing
● Boolean indexing involves using a Boolean condition to filter the
DataFrame.
● It allows you to select rows or columns based on conditions built with
comparison operators such as ==, >, <, >=, <=, !=, and combined with
logical operators like & (and) and | (or). Wrap each condition in
parentheses when combining them.
● Syntax : df[df['column_name'] > 10]
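Both indexing styles can be tried on a small DataFrame. This is a minimal sketch; the row labels 'a', 'b', 'c' and the sample values are assumptions chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {'Name': ['John', 'Emma', 'Tom'], 'Age': [25, 30, 35]},
    index=['a', 'b', 'c'],  # hypothetical row labels
)

# Label-based indexing: row label 'b', column label 'Age'
age_b = df.loc['b', 'Age']
print(age_b)  # 30

# Boolean indexing: keep only the rows where Age is greater than 26
older = df[df['Age'] > 26]
print(older)

# Combining conditions: each condition goes in its own parentheses
mid = df[(df['Age'] > 26) & (df['Age'] < 35)]
print(mid)
```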

Traversing of Dataframe
● The iterrows() method allows you to iterate over the rows of a DataFrame, returning
each row as a tuple containing the index label and the row data (as a Series).
● Syntax
for index, row in df.iterrows():
    # Access row data
    print(index, row['Column1'], row['Column2'])

● Example :
import pandas as pd

# Define the dictionary of Series
data = {'Name': pd.Series(['Alice', 'Bob', 'Charlie']),
        'Age': pd.Series([25, 30, 35]),
        'City': pd.Series(['New York', 'London', 'Paris'])}

# Create the DataFrame
df = pd.DataFrame(data)

for index, row in df.iterrows():
    # Access row data
    print(index, row['Name'], row['Age'], row['City'])

Operations on Rows and Columns
● Dataframes provide various operations on rows and columns to manipulate and
analyse data.
● Here are some common operations on rows and columns in a dataframe:
1. Operations on Columns
● Column Selection : To select a specific column or multiple columns from a
DataFrame, you can use the indexing operator [] or the loc[] or iloc[]
accessor.
# Select a single column
single_column = df['ColumnName']

# Select multiple columns
multiple_columns = df[['Column1', 'Column2']]

● Column Addition : To add a new column to a DataFrame, you can assign
values to a new column name.
# Add a new column with a constant value
df['NewColumn'] = 'Value'

# Add a new column based on existing columns
df['SumColumn'] = df['Column1'] + df['Column2']

● Column Deletion : To delete a column from a DataFrame, you can use the
drop() method.
# Delete a single column
df = df.drop('ColumnName', axis=1)

# Delete multiple columns
df = df.drop(['Column1', 'Column2'], axis=1)

● Column Rename : To rename a column in a DataFrame, you can use the
rename() method.
# Rename a single column
df = df.rename(columns={'OldName': 'NewName'})

# Rename multiple columns
df = df.rename(columns={'OldName1': 'NewName1', 'OldName2': 'NewName2'})
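The column operations above use placeholder names. As a rough illustration on concrete data, the sketch below assumes two hypothetical columns, Maths and Science, and applies addition, renaming, and deletion in turn.

```python
import pandas as pd

# Hypothetical marks for two students
df = pd.DataFrame({'Maths': [70, 85], 'Science': [65, 90]})

# Column addition: a new column computed from existing columns
df['Total'] = df['Maths'] + df['Science']

# Column rename: rename() returns a new DataFrame, so reassign it
df = df.rename(columns={'Maths': 'Math'})

# Column deletion: axis=1 tells drop() to look among the columns
df = df.drop('Science', axis=1)

print(list(df.columns))  # ['Math', 'Total']
print(df)
```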

2. Operations on Rows
● Row Selection : To select a specific row or multiple rows from a
DataFrame, you can use the loc[] or iloc[] accessor.
# Select a single row by index label
single_row = df.loc[3]

# Select multiple rows by index labels (the end label 3 is included)
multiple_rows = df.loc[1:3]

# Select a single row by index position
single_row = df.iloc[0]

# Select multiple rows by index positions (the end position 3 is excluded)
multiple_rows = df.iloc[1:3]

● Row Addition : To add a new row to a DataFrame, you can use the loc[]
accessor along with the assignment operator.
# Add a new row with values
df.loc[len(df)] = ['Value1', 'Value2', 'Value3']

# Add a new row with values from another row
df.loc[len(df)] = df.loc[0]

● Row Deletion : To delete a row from a DataFrame, you can use the drop()
method.
# Delete a single row by index label
df = df.drop(3)

# Delete multiple rows by index labels
df = df.drop([1, 2])

# Delete a single row by index position
df = df.drop(df.index[0])

# Delete multiple rows by index positions
df = df.drop(df.index[1:3])

● Row Rename : Rows in a DataFrame are identified by their index labels, which
can be changed using the rename() method with the index argument. (The
set_index() method does something different: it makes an existing column the index.)
# Rename a single row by index label
df = df.rename(index={'OldIndex': 'NewIndex'})
# Rename multiple rows by index labels
df = df.rename(index={'OldIndex1': 'NewIndex1', 'OldIndex2': 'NewIndex2'})
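The row operations can be followed end to end on a small DataFrame. This is only a sketch with hypothetical names and values; it relies on the default integer index 0, 1, 2 that pandas assigns when none is given.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]})

# Row selection: loc slices by label (end included),
# iloc slices by position (end excluded)
by_label = df.loc[0:1]   # rows labelled 0 and 1
by_pos = df.iloc[0:1]    # only the row at position 0

# Row addition: len(df) is 3, so this creates a row labelled 3
df.loc[len(df)] = ['Dev', 40]

# Row deletion: drop the row whose index label is 1
df = df.drop(1)

# Row rename: change index label 0 to 'first'
df = df.rename(index={0: 'first'})

print(df)
```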

Methods of Dataframe
1. head( )
● The head() method is used to retrieve the first n rows of a DataFrame.
● By default, if no argument is provided, it returns the first 5 rows.
● Example
import pandas as pd

# Define the dictionary of Series
data = {'Name': pd.Series(['Alice', 'Bob', 'Charlie']),
        'Age': pd.Series([25, 30, 35]),
        'City': pd.Series(['New York', 'London', 'Paris'])}

# Create the DataFrame
df = pd.DataFrame(data)

# Retrieve the first 2 rows of the DataFrame
print(df.head(2))

2. tail( )
● The tail() method is the counterpart of head(). It is used to retrieve the last n
rows of a DataFrame.
● By default, if no argument is provided, it returns the last 5 rows.
● Example
import pandas as pd

# Define the dictionary of Series
data = {'Name': pd.Series(['Alice', 'Bob', 'Charlie']),
        'Age': pd.Series([25, 30, 35]),
        'City': pd.Series(['New York', 'London', 'Paris'])}

# Create the DataFrame
df = pd.DataFrame(data)

# Retrieve the last 2 rows of the DataFrame
print(df.tail(2))
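The three-row examples above cannot show the default of 5 rows, because the whole DataFrame is shorter than that. A minimal sketch with a hypothetical seven-row DataFrame makes the default visible:

```python
import pandas as pd

# Hypothetical DataFrame with 7 rows holding the numbers 1 to 7
df = pd.DataFrame({'N': range(1, 8)})

first_default = df.head()   # no argument: first 5 rows
last_default = df.tail()    # no argument: last 5 rows
last_two = df.tail(2)       # explicit n: last 2 rows

print(list(first_default['N']))  # [1, 2, 3, 4, 5]
print(list(last_default['N']))   # [3, 4, 5, 6, 7]
print(list(last_two['N']))       # [6, 7]
```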

Importing/Exporting Data between CSV files and DataFrames
1. Importing data from a CSV file into a DataFrame
● To import data from a CSV file into a DataFrame, you can use the pd.read_csv()
function. Specify the path to the CSV file as the argument to the function.
● Example :
import pandas as pd

# Import data from a CSV file into a DataFrame
df = pd.read_csv('data.csv')

print(df)

2. Exporting data from a DataFrame to a CSV file
● To export a DataFrame to a CSV file, you can use the to_csv() method of the
DataFrame.
● The index=False parameter is used to exclude the index column from being
saved in the CSV file. If you want to include the index column, you can omit this
parameter or set it to True.
● Example
import pandas as pd

# Define the dictionary of Series
data = {'Name': pd.Series(['Alice', 'Bob', 'Charlie']),
        'Age': pd.Series([25, 30, 35]),
        'City': pd.Series(['New York', 'London', 'Paris'])}

# Create the DataFrame
df = pd.DataFrame(data)

# Export a DataFrame to a CSV file
df.to_csv('file.csv', index=False)
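To see what index=False changes without creating a file on disk, the sketch below calls to_csv() with no path, in which case it returns the CSV text as a string. Reading that text back through io.StringIO is a convenience assumed here for illustration; the chapter itself works with files such as 'file.csv'.

```python
import io
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30]})

# With no file path, to_csv() returns the CSV text instead of writing a file
without_index = df.to_csv(index=False)
with_index = df.to_csv()  # index=True is the default

# The header line shows the difference: with the index included,
# an extra unnamed first column appears
print(without_index.splitlines()[0])  # Name,Age
print(with_index.splitlines()[0])     # ,Name,Age

# Import the exported text back; StringIO lets read_csv treat a string like a file
restored = pd.read_csv(io.StringIO(without_index))
print(restored)
```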
