
Acknowledgement

I, Debangshu Karmakar of class XII-Sci, would like to extend
my heartfelt gratitude to everyone who guided and supported
me in completing this project on the topic "DataFrame."

I am grateful to my teacher, Mr. Neeraj Sharsar, for providing
valuable insights and fostering a learning environment that
encourages creativity and exploration. I also want to thank my
parents for their unwavering encouragement, patience, and
understanding, which have been the foundations of my
achievements.

Lastly, I appreciate my friends for their valuable ideas and
perspectives that contributed to the project's enrichment.
Thank you all for making this project a rewarding learning
experience.

Debangshu Karmakar
“XII” – Science

Index

Sl No.  Topic

1       Introduction
2       Programs
3       Conclusion
4       Bibliography

Introduction

What is a DataFrame?

A DataFrame is a structured data format that organizes information
into a two-dimensional table of rows and columns, resembling a
spreadsheet. This structure is one of the most widely used in modern
data analytics, as it provides a flexible and intuitive way to handle and
analyze data.

Each DataFrame has a "schema," which acts as a blueprint that outlines
the name and data type of each column. DataFrames in Spark, for
example, can contain generic data types like StringType and
IntegerType, as well as Spark-specific types, such as StructType.
Missing or incomplete values are typically stored as null values in a
DataFrame.
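
In pandas, the library used for the programs in this project, the
closest counterpart of a schema is the dtypes attribute, and missing
values appear as NaN rather than as Spark's null. The following is a
minimal sketch of that idea; the column names and values are made up
purely for illustration.

```python
import pandas as pd

# Hypothetical sample data; None marks a missing value
df = pd.DataFrame({
    'name': ['Asha', 'Ravi', 'Meera'],
    'age': [17, 18, 17],
    'marks': [91.5, None, 88.0],
})

# dtypes plays the role of the schema: one data type per column
print(df.dtypes)

# isna() flags missing cells; sum() counts them per column
missing_per_column = df.isna().sum()
print(missing_per_column)
```

Here the missing entry in marks is stored as NaN, which is why that
column is reported with a floating-point type.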

A useful analogy is to think of a DataFrame as a digital spreadsheet
with labeled columns. Unlike traditional spreadsheets, though,
DataFrames in distributed frameworks such as Spark can span thousands
of computers, allowing large-scale data analytics and computation
across distributed clusters. A pandas DataFrame, by contrast, is held
in the memory of a single machine.

Data is distributed across multiple computers mainly for one of two
reasons: either the data volume is too large to fit on a single machine,
or performing the calculations on a single machine would take too
long.
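
The idea behind distribution can be imitated on one machine: split the
table into blocks of rows, compute a partial result for each block, and
combine the partial results. The sketch below does this with plain
pandas. It is only an illustration of the principle, not how Spark is
implemented, and the partition helper and the sample data are
hypothetical.

```python
import pandas as pd

# Hypothetical data standing in for a table too large for one machine
df = pd.DataFrame({
    'store': ['S1', 'S2', 'S3', 'S4', 'S5', 'S6'],
    'sales': [10, 20, 30, 40, 50, 60],
})

def partition(frame, n_parts):
    """Split a DataFrame into at most n_parts consecutive row blocks."""
    size = -(-len(frame) // n_parts)  # ceiling division
    return [frame.iloc[start:start + size]
            for start in range(0, len(frame), size)]

# Each block could, in principle, be handled by a different computer
parts = partition(df, 3)
partial_sums = [int(part['sales'].sum()) for part in parts]

# Combining the partial results gives the same answer as one big sum
total = sum(partial_sums)
print(partial_sums, total)
```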

Programs

import pandas as pd

# Each dictionary key becomes a column; each list holds that column's values
dt = {'City': ['Delhi', 'Mumbai', 'Kolkata', 'Chennai'],
      'Hospitals': [189, 208, 149, 157],
      'schools': [7916, 8508, 7226, 7617]}

dtf = pd.DataFrame(dt)
print(dtf)

import pandas as pd
# Values for two years, with the four quarters as row labels
dt = {'Yr1': [34500, 56000, 47000, 49000],
      'Yr2': [44900, 46100, 57000, 59000]}
dtf = pd.DataFrame(dt, index=['Qtr1', 'Qtr2', 'Qtr3', 'Qtr4'])
print(dtf)

import pandas as pd
# Student records: roll number, name and marks
dt = {'Rollno': [115, 236, 307, 422],
      'Name': ['Pavni', 'Rishi', 'Preet', 'Parul'],
      'Marks': [97.5, 98.0, 98.5, 98.0]}
dtf = pd.DataFrame(dt)
print(dtf)

import pandas as pd
# One column per zone; the index labels the Target and Sales rows
dt = {'Zone1': [56000, 58000], 'Zone2': [70000, 68000],
      'Zone3': [75000, 78000], 'Zone4': [60000, 61000]}
dtf = pd.DataFrame(dt, index=['Target', 'Sales'])
print(dtf)

import pandas as pd
# Each inner list becomes one row; columns get the default labels 0, 1, 2
r1 = [101, 113, 124]
r2 = [130, 140, 200]
r3 = [115, 216, 217]
combine = [r1, r2, r3]
df = pd.DataFrame(combine)
print(df)

import pandas as pd
# Temperature and rainfall readings for four cities
df = {'city': ['Delhi', 'Bengaluru', 'Chennai', 'Mumbai'],
      'Maxtemp': [40, 31, 35, 29],
      'Mintemp': [32, 25, 27, 21],
      'Rainfall': [24.1, 36.2, 40.8, 35.2]}
temp = pd.DataFrame(df)
print(temp)

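import pandas as pd
data = {
    'A': [50, 110],
    'B': [80, 120],
    'C': [120, 130],
    'D': [180, 140],
}
df = pd.DataFrame(data)
print(df)
# Add a new column by assigning a list of the same length as the DataFrame
df['E'] = [14, 220]
print("DataFrame after adding column E:")
print(df)
# Add a new row with pd.concat (DataFrame.append was removed in pandas 2.0,
# and _append is a private method that should not be called directly)
new_row = {'A': 2, 'B': 130, 'C': 140, 'D': 150, 'E': 300}
df = pd.concat([df, pd.DataFrame([new_row])], ignore_index=True)
print("DataFrame after adding a new row:")
print(df)
# Remove columns by name
df = df.drop(columns=['A', 'C'])
print("DataFrame after removing columns A and C:")
print(df)
# Remove rows by index label
df = df.drop([0, 1])
print("DataFrame after removing the first and second rows:")
print(df)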

import pandas as pd

data = {
    'Product': ['cpu', 'mouse', 'keyboard', 'printer', 'hdd', 'cd', 'scanner', 'speaker'],
    'Company': ['compaq', 'compaq', 'dell', 'hp', 'sony', 'sony', 'hp', 'dell'],
    'qty': [40, 20, 10, 2, 500, 1000, 4, 6],
    'price': [9000, 400, 700, 20000, 450, 25, 5500, 900]
}
index = [101, 102, 103, 104, 105, 106, 107, 108]
df1 = pd.DataFrame(data, index=index)
print(df1)

# loc selects rows (and optionally columns) by label
print("Details of records 102, 104, and 106:")
print(df1.loc[[102, 104, 106]])
print("Product and Company details of records 101 and 104:")
print(df1.loc[[101, 104], ['Product', 'Company']])

# iloc selects rows by position, starting from 0
print("First and third records of df1:")
print(df1.iloc[[0, 2]])
print("Quantity and Company details of all records:")
print(df1[['qty', 'Company']])

# at updates a single cell identified by row label and column name
df1.at[104, 'price'] = 50000
print("Updated DataFrame with modified price for record 104:")
print(df1)
print("Details of record 104:")
print(df1.loc[104])

# loc can update several rows of one column at once
df1.loc[[101, 102], 'Company'] = 'acer'
df1.loc[[101, 102], 'qty'] = 400
print("Updated DataFrame for company name and quantity of records 101 and 102:")
print(df1)

# Label 108 already exists, so assigning to it would overwrite the speaker
# record; a new label (109) is used here so that a record is actually added
df1.loc[109] = ["mic", "dell", 100, 450]
print("DataFrame after adding new record:")
print(df1)

import pandas as pd
data2 = {
    'Bno': [1, 2, 3, 4],
    'name': ['Sunil Grover', 'sourav ganguli', 'virat kohli', 'rahul dravid'],
    'score1': [60, 65, 70, 80],
    'score2': [70, 45, 90, 70]
}
batsman = pd.DataFrame(data2)
print(batsman)
# Column arithmetic works element by element
batsman['total'] = batsman['score1'] + batsman['score2']
print("DataFrame after adding total column:")
print(batsman)
print("Lowest score of score1:", batsman['score1'].min())
print("Highest score of score2:", batsman['score2'].max())
# Replace the default 0..3 index with custom labels
batsman.index = ['player1', 'player2', 'player3', 'player4']
print("DataFrame with new index:")
print(batsman)
# A boolean condition selects only the rows where it is True
print("Details of batsmen with score1 < 75:")
print(batsman[batsman['score1'] < 75])
print("Names of batsmen with score1 < 75:")
print(batsman.loc[batsman['score1'] < 75, 'name'])
print("Name and score1 of batsmen with score1 < 75:")
print(batsman.loc[batsman['score1'] < 75, ['name', 'score1']])
# sort_values returns a sorted copy; batsman itself is unchanged
batsman_sorted = batsman.sort_values(by='score2', ascending=False)
print("DataFrame in descending order of score2:")
print(batsman_sorted)
# Rename all five columns in order
batsman.columns = ['batsmanno', 'bname', 's1', 's2', 'sum']
print("DataFrame after renaming columns:")
print(batsman)
# Conditional update: add 5 to s1 only in rows where s2 > 75
batsman.loc[batsman['s2'] > 75, 's1'] += 5
print("DataFrame after adding 5 to s1 where s2 > 75:")
print(batsman)

import pandas as pd

data_df1 = {'mark1': [10, 40, 15, 40, 10], 'mark2': [15, 45, 30, 70, 50]}
data_df2 = {'mark1': [30, 20, 20, 40, 50], 'mark2': [20, 25, 30, 10, 30]}
# The two frames deliberately use different row labels
df1 = pd.DataFrame(data_df1, index=[0, 1, 2, 3, 5])
df2 = pd.DataFrame(data_df2, index=[0, 1, 2, 4, 3])
print('df1')
print(df1)
print('df2')
print(df2)

# Addition aligns on row labels, not positions; labels found in only
# one frame (4 and 5) produce NaN in the result
df_sum = df1 + df2
print("Result of adding df1 and df2:")
print(df_sum)

# A scalar is added to every value in the DataFrame
df1 += 10
print("DataFrame df1 after adding 10 to all values:")
print(df1)

df1['mark1'] += 5
print("DataFrame df1 after adding 5 to mark1 column:")
print(df1)

# fill_value=0 treats a missing row as 0 instead of producing NaN
d2 = df1.add(df2, fill_value=0)
print("Result of adding df1 into df2:")
print(d2)

import matplotlib.pyplot as plt

# Overs bowled and the corresponding runs
overs = [5, 10, 15, 20]
runs = [45, 79, 145, 234]

# Line chart with a circular marker at each data point
plt.figure(figsize=(8, 5))
plt.plot(overs, runs, marker='o', linestyle='-', color='b', label='Runs')
plt.xlabel('Overs')
plt.ylabel('Runs')
plt.title('Run Rate of T20 Match')
plt.legend()
plt.grid(True)
plt.show()

Conclusion

The DataFrame is a highly versatile and widely adopted data structure that serves
as a cornerstone in data manipulation and analysis across various programming
languages and frameworks. In Python, it is a central component of the pandas
library, which is one of the most popular tools for data analysis and manipulation
in the data science ecosystem.

A DataFrame can be thought of as a table-like structure, similar to a spreadsheet
or a SQL table, where data is organized into rows and columns. It allows users to
store and manipulate large datasets efficiently while providing powerful
functionality for tasks such as data cleaning, filtering, aggregation, and
visualization. The ease of use and flexibility of DataFrames make them an essential
tool for data scientists, analysts, and engineers.
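
Three of the tasks named above (cleaning, filtering, and aggregation) can
be sketched in a few lines of pandas. The records below are hypothetical
and exist only to make the example self-contained.

```python
import pandas as pd

# Hypothetical sales records; one quantity is missing
sales = pd.DataFrame({
    'company': ['dell', 'hp', 'dell', 'sony', 'hp'],
    'qty': [10, None, 6, 500, 4],
    'price': [700, 20000, 900, 450, 5500],
})

# Cleaning: replace the missing quantity with 0
sales['qty'] = sales['qty'].fillna(0)

# Filtering: keep only the rows whose price is above 500
expensive = sales[sales['price'] > 500]

# Aggregation: total quantity per company among the filtered rows
qty_per_company = expensive.groupby('company')['qty'].sum()
print(qty_per_company)
```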

Beyond Python, the concept of DataFrames is also implemented in other
programming environments. For example, the R programming language offers the
data.frame structure, which has been a fundamental part of statistical computing for
decades. Similarly, the Apache Spark framework provides a DataFrame API in Scala
(and other languages) designed for large-scale data processing and distributed
computing. These implementations share the common goal of enabling users to
handle structured data intuitively while maintaining high performance.

The universality of DataFrames across different programming ecosystems
underscores their importance in the field of data analysis, providing a consistent
and powerful toolset regardless of the specific language or framework being used.
This consistency simplifies the learning curve for users transitioning between tools
while fostering collaboration among data professionals working in diverse
environments.

Bibliography

1. Databricks - DataFrame Documentation
2. Pandas - Python Data Analysis Library
3. DataCamp - Online Data Science Learning Platform
