Rajni Ip File Final

The document certifies that Ms. Rajni and Ms. Manjot completed practical research on 'Data Handling using Python & SQL' under Ms. Harsha's guidance. It provides an overview of data handling techniques using Python libraries like Pandas and Matplotlib, as well as SQL for managing relational databases. The document includes detailed explanations of data structures, operations, and various SQL queries for data manipulation and analysis.

Uploaded by

kaurmehman

ARYA GIRLS PUBLIC SCHOOL

NAME : Rajni / Manjot
CLASS : 12 "A"
ROLL NO. : 25 / 17711593
ADMISSION NO. : 3869 / 4425
SUBMITTED TO : MS. HARSHA
CERTIFICATE
This is to certify that Ms. Rajni and Ms. Manjot, students of
class 12th, have successfully completed the practical research
on the topic 'Data Handling using Python & SQL' under the
guidance of Ms. Harsha during the year 2024-2025.

Internal Examiner External Examiner


DATA HANDLING USING PYTHON
Three well-established Python libraries for scientific and
analytical work are:
• NumPy
• Pandas
• Matplotlib

PANDAS
Introduction to Pandas
Pandas (the name derives from PANel DAta) is a high-level data
manipulation library used for analysing data. It provides a rich
set of functions, which makes importing and exporting data easy.

Data Structure in Pandas
A data structure is a collection of data values and the operations
that can be applied to that data. Pandas provides two main data
structures:
SERIES: A one-dimensional labeled array, similar to a list or
a column in a spreadsheet.
DATAFRAME: A two-dimensional labeled data structure, like
a table with rows and columns.

SERIES
A one-dimensional labeled array in Pandas, capable of
holding data of any type (e.g., integers, strings, floats). It's
similar to a column in a spreadsheet or a single Python list
but with labels (indices) for each element.

CREATION OF SERIES
• FROM SCALAR VALUES
• FROM NUMPY ARRAY
• FROM DICTIONARY
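A minimal sketch of the three ways of creating a Series listed above. The variable names and sample values are illustrative assumptions, not taken from the original programs.

```python
import numpy as np
import pandas as pd

# From scalar values: a list of scalars with user-defined labels
s_scalar = pd.Series([10, 20, 30], index=["a", "b", "c"])

# From a NumPy array: the default index is 0, 1, 2, ...
s_array = pd.Series(np.array([1.5, 2.5, 3.5]))

# From a dictionary: keys become the index, values become the data
s_dict = pd.Series({"India": "New Delhi", "Japan": "Tokyo"})

print(s_scalar)
print(s_array)
print(s_dict)
```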

ACCESSING ELEMENTS OF SERIES
• INDEXING
o By using defined index
o By using positional index
• SLICING
o By using defined index
o By using positional index
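The indexing and slicing options above can be sketched as follows. The Series `marks` and its labels are made-up sample data; `.loc` and `.iloc` are used here to make the label/position distinction explicit.

```python
import pandas as pd

marks = pd.Series([85, 92, 78, 64], index=["Eng", "Maths", "Sci", "Hindi"])

# Indexing by defined (label) index
by_label = marks["Maths"]
# Indexing by positional index
by_position = marks.iloc[2]

# Slicing by label includes BOTH end labels
label_slice = marks.loc["Eng":"Sci"]
# Slicing by position excludes the stop position
pos_slice = marks.iloc[1:3]

print(by_label, by_position)
print(label_slice)
print(pos_slice)
```

Note the difference between the two slices: the label-based slice returns three elements, while the positional slice `1:3` returns only two.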
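ATTRIBUTES OF SERIES
● .name : the name assigned to the Series
● .index.name : the name assigned to the index of the Series
● .values : the data of the Series as a NumPy array
● .size : the number of elements in the Series
● .empty : True if the Series has no elements, otherwise False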
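A short sketch that reads each of these attributes. The Series and the names "Scores" and "Label" are assumptions chosen only for illustration.

```python
import pandas as pd

s = pd.Series([10, 20, 30], index=["a", "b", "c"])

s.name = "Scores"        # name of the Series
s.index.name = "Label"   # name of the index

print(s.name)
print(s.index.name)
print(s.values)   # underlying data as a NumPy array
print(s.size)     # number of elements
print(s.empty)    # False, because the Series has elements
```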

MATHEMATICAL OPERATION ON SERIES
• ADDITION
o By using + operator
o By using add() function
• SUBTRACTION
o By using - operator
o By using sub() function
• MULTIPLICATION
o By using * operator
o By using mul() function
• DIVISION
o By using / operator
o By using div() function
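A sketch of the four operations on two hypothetical Series whose indexes only partly match. With an operator, a label missing from one Series gives NaN; the method form accepts `fill_value` to substitute a default for the missing element.

```python
import pandas as pd

a = pd.Series([10, 20, 30], index=["x", "y", "z"])
b = pd.Series([1, 2, 3, 4], index=["x", "y", "z", "w"])

total = a + b                    # "w" has no match in a, so it becomes NaN
added = a.add(b, fill_value=0)   # missing "w" in a is treated as 0

diff = a.sub(b, fill_value=0)    # method form of a - b
prod = a.mul(b)                  # same result as a * b
quot = a.div(b)                  # same result as a / b

print(total)
print(added)
print(diff)
print(prod)
print(quot)
```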

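METHODS OF SERIES
• head(n) : returns the first n elements (5 if n is omitted)
• count() : returns the number of non-NaN elements
• tail(n) : returns the last n elements (5 if n is omitted)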
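A minimal sketch of these three methods on a made-up Series that contains one missing value, so that the effect of count() is visible.

```python
import numpy as np
import pandas as pd

s = pd.Series([5, np.nan, 15, 20, 25, 30, 35])

first_three = s.head(3)   # first 3 elements, including the NaN
non_missing = s.count()   # NaN is not counted
last_two = s.tail(2)      # last 2 elements

print(first_three)
print(non_missing)
print(last_two)
```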
DATAFRAME
A two-dimensional labeled data structure in Pandas, similar
to a table in a database or a spreadsheet. It consists of rows
and columns, where each column is a Series, and it supports
various data types and operations like filtering, grouping,
and statistical analysis.

CREATION OF DATAFRAME
• EMPTY DATAFRAME
• FROM NUMPY NDARRAYS
• FROM LIST OF DICTIONARIES
• FROM DICTIONARY OF SERIES
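The four creation routes above, sketched with made-up data (all names and values are assumptions). Where a dictionary or Series lacks a value for some row or column, pandas fills the gap with NaN.

```python
import numpy as np
import pandas as pd

# Empty DataFrame
df_empty = pd.DataFrame()

# From a NumPy ndarray: each inner row becomes a DataFrame row
df_array = pd.DataFrame(np.array([[1, 2, 3], [4, 5, 6]]),
                        columns=["A", "B", "C"])

# From a list of dictionaries: keys become column labels
df_dicts = pd.DataFrame([{"Name": "Asha", "Marks": 90},
                         {"Name": "Ravi", "Marks": 85, "Grade": "A"}])

# From a dictionary of Series: indexes are aligned by label
df_series = pd.DataFrame({
    "Maths": pd.Series([90, 80], index=["Asha", "Ravi"]),
    "Sci": pd.Series([70, 60, 50], index=["Asha", "Ravi", "Meena"]),
})

print(df_empty)
print(df_array)
print(df_dicts)
print(df_series)
```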

OPERATIONS ON ROWS AND COLUMNS IN DATAFRAME
• ADDING A NEW COLUMN
• ADDING A NEW ROW
• DELETING A ROW
• DELETING A COLUMN
• RENAMING A ROW
• RENAMING A COLUMN
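One possible sketch of the six operations above, applied in order to a small hypothetical DataFrame. The labels and values are assumptions made for illustration.

```python
import pandas as pd

df = pd.DataFrame({"Maths": [90, 80], "Sci": [70, 60]},
                  index=["Asha", "Ravi"])

df["Eng"] = [88, 77]                          # add a new column
df.loc["Meena"] = [95, 85, 75]                # add a new row
df = df.drop("Ravi", axis=0)                  # delete a row
df = df.drop("Sci", axis=1)                   # delete a column
df = df.rename(index={"Asha": "Student1"})    # rename a row label
df = df.rename(columns={"Eng": "English"})    # rename a column label

print(df)
```

drop() and rename() return a new DataFrame by default, which is why the result is assigned back to `df` each time.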
ACCESSING DATAFRAME ELEMENT THROUGH INDEXING
• LABEL BASED INDEXING
• BOOLEAN INDEXING
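A sketch of both kinds of indexing on a made-up marks table (names and values are assumptions). Label based indexing uses `.loc`, and boolean indexing filters rows with a True/False mask.

```python
import pandas as pd

df = pd.DataFrame({"Maths": [90, 80, 95], "Sci": [70, 60, 85]},
                  index=["Asha", "Ravi", "Meena"])

# Label based indexing
row = df.loc["Ravi"]            # one full row, returned as a Series
cell = df.loc["Asha", "Sci"]    # a single value

# Boolean indexing: keep only rows where the condition is True
mask = df["Maths"] > 85
toppers = df[mask]

print(row)
print(cell)
print(toppers)
```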
JOINING OF DATAFRAMES
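The exact joining method is not stated above, so the following is only a sketch under assumptions: it shows `pd.concat()` for stacking rows and `pd.merge()` for joining on a common column. Older material often uses `DataFrame.append()`, which was removed in pandas 2.0. All table contents are made up.

```python
import pandas as pd

df1 = pd.DataFrame({"RollNo": [1, 2], "Name": ["Asha", "Ravi"]})
df2 = pd.DataFrame({"RollNo": [3], "Name": ["Meena"]})

# Row-wise joining: rows of df2 are placed below the rows of df1
stacked = pd.concat([df1, df2], ignore_index=True)

# Column-wise joining on a common key column
marks = pd.DataFrame({"RollNo": [1, 2, 3], "Marks": [90, 80, 95]})
merged = pd.merge(stacked, marks, on="RollNo")

print(stacked)
print(merged)
```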

ATTRIBUTES OF DATAFRAME
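The list of attributes is not preserved above, so the selection below is an assumption: a sketch that prints some commonly used DataFrame attributes for a made-up table.

```python
import pandas as pd

df = pd.DataFrame({"Maths": [90, 80, 95], "Sci": [70, 60, 85]},
                  index=["Asha", "Ravi", "Meena"])

print(df.index)     # row labels
print(df.columns)   # column labels
print(df.dtypes)    # data type of each column
print(df.values)    # data as a NumPy array
print(df.shape)     # (number of rows, number of columns)
print(df.size)      # total number of values
print(df.T)         # transpose: rows and columns swapped
print(df.empty)     # False, because the DataFrame has data
```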
CSV FILE
A COMMA SEPARATED VALUES (CSV) file is a text file that
uses commas to separate values and newlines to separate
records. A CSV file stores tabular data in plain text,
where each line of the file typically represents one record.

IMPORTING AND EXPORTING DATA BETWEEN CSV FILES AND DATAFRAMES
We can import data from a .csv file into a DataFrame, and export
the data in a DataFrame to a .csv file, where values are
separated by commas.

• Importing a CSV file into a DataFrame
STEP 1: CREATE A TABLE IN A SPREADSHEET.
STEP 2: SAVE IT AS .CSV OR CSV (COMMA DELIMITED) ON THE DESKTOP.
STEP 3: RIGHT CLICK ON THE FILE AND COPY ITS LOCATION FROM PROPERTIES.
STEP 4: WRITE A PROGRAM TO IMPORT THE CSV FILE.

• Exporting a DataFrame to a CSV file
STEP 1: MAKE A DATAFRAME AND EXPORT IT TO A CSV FILE.
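A minimal round-trip sketch of exporting and then importing. To keep it runnable anywhere it writes to a temporary folder; the file name `marks.csv` and the data are assumptions, and in practice the path copied in STEP 3 would be used instead.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"Name": ["Asha", "Ravi"], "Marks": [90, 80]})

# Hypothetical location; replace with the path of your own file
csv_path = os.path.join(tempfile.mkdtemp(), "marks.csv")

# Export: index=False keeps the row numbers out of the file
df.to_csv(csv_path, index=False)

# Import: the first line of the file is read as the column names
df_in = pd.read_csv(csv_path)
print(df_in)
```

On Windows, a copied path such as `C:\Users\...` is best written as a raw string (prefix `r`) so that the backslashes are not treated as escape characters.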
MATPLOTLIB
Matplotlib is a Python library used for plotting graphs and
visualising data. With just a few lines of code we can generate
publication-quality plots such as histograms, bar charts and
scatter plots.

PLOTTING WITH MATPLOTLIB
COMPONENTS OF PLOT
• A Figure is the overall window where the outputs of
pyplot functions are plotted.
• A Figure contains a plotting area, legend, axis labels,
ticks, titles etc.
PROGRAM ON HOW TO PLOT A GRAPH:
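A minimal sketch of such a program, assuming made-up data. It saves the figure to a file named `basic_plot.png` (a hypothetical name); calling `plt.show()` instead would open the figure in a window.

```python
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]

plt.figure()                    # create a new Figure
(line,) = plt.plot(x, y)        # draw y against x as a line
plt.savefig("basic_plot.png")   # or plt.show() to display it
```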
CUSTOMISATION OF PLOTS
The pyplot module gives us numerous functions which can be
used to customise charts, such as adding titles or legends.
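A sketch of a few common customisations, using made-up sales figures. The colour, marker and file name are illustrative choices, not the ones used in the original program.

```python
import matplotlib.pyplot as plt

months = ["Jan", "Feb", "Mar", "Apr"]
sales = [120, 150, 90, 180]

plt.figure()
plt.plot(months, sales, color="green", marker="o",
         linestyle="--", label="Sales")
plt.title("Monthly Sales")     # chart title
plt.xlabel("Month")            # x-axis label
plt.ylabel("Units sold")       # y-axis label
plt.legend()                   # uses the label given in plot()
plt.grid(True)                 # show grid lines
ax = plt.gca()                 # the Axes that was just drawn on
plt.savefig("custom_plot.png")
```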
PANDAS PLOT FUNCTIONS
We can call the plot method on a Series s or a DataFrame df by
writing:
s.plot() or df.plot()
We will learn to use the plot() function to create various types
of charts. They are:
• LINE CHART
• BAR CHART
• HISTOGRAM

LINE CHART
A LINE CHART displays the evolution of one or several
numeric variables.

BAR CHART
BAR plots are a type of data visualization used to represent
data in the form of rectangular bars.

HISTOGRAM
A HISTOGRAM represents the distribution of a continuous dataset.
PLOTTING LINE CHART
A LINE chart is a graph that joins successive data points with
straight line segments, showing how a value changes along the
x-axis. It is the default chart drawn by plot().
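A minimal sketch of a line chart drawn from a DataFrame. The column names and values are made up, and the output file name is hypothetical.

```python
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({"Week": [1, 2, 3, 4],
                   "Apples": [30, 45, 40, 55],
                   "Mangoes": [20, 25, 35, 30]})

# One line is drawn for each column listed in y
ax = df.plot(kind="line", x="Week", y=["Apples", "Mangoes"])
plt.savefig("line_chart.png")   # or plt.show()
```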
CUSTOMISING LINE CHART
We can replace the ticks on the x-axis with a list of values by
using plt.xticks(ticks, labels), where ticks is a list of locations
on the x-axis at which ticks should be placed and labels is an
optional list of text labels for those ticks.
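A sketch of a customised line chart with replaced x-axis ticks. The data, colour, marker and labels "W1" to "W4" are assumptions for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({"Week": [1, 2, 3, 4], "Apples": [30, 45, 40, 55]})

ax = df.plot(kind="line", x="Week", y="Apples", color="purple",
             marker="*", title="Apples sold per week")

# Place ticks at 1, 2, 3, 4 and show them as W1, W2, W3, W4
plt.xticks([1, 2, 3, 4], ["W1", "W2", "W3", "W4"])
plt.ylabel("Quantity")
plt.savefig("custom_line_chart.png")   # or plt.show()
```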
PLOTTING BAR CHART
To plot a BAR chart, we specify kind="bar". We can also
specify the DATAFRAME columns to be used as the X and Y axes.
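A minimal sketch of a bar chart, assuming a made-up table of sales per city and a hypothetical output file name.

```python
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({"City": ["Delhi", "Mumbai", "Pune"],
                   "Sales": [250, 180, 320]})

# x gives the category labels, y gives the bar heights
ax = df.plot(kind="bar", x="City", y="Sales")
plt.savefig("bar_chart.png")   # or plt.show()
```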
CUSTOMISING BAR CHART
We can customize the bar chart by adding certain
parameters to the plot function. We can control the edge
color, line style and line width of the bars.
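A sketch of the three parameters mentioned above, applied to made-up data. The specific colours and widths are illustrative choices.

```python
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({"City": ["Delhi", "Mumbai", "Pune"],
                   "Sales": [250, 180, 320]})

ax = df.plot(kind="bar", x="City", y="Sales",
             color="skyblue",     # fill colour of the bars
             edgecolor="black",   # colour of the bar outline
             linestyle="--",      # dashed outline
             linewidth=2,         # thickness of the outline
             title="Sales by city")
plt.savefig("custom_bar_chart.png")   # or plt.show()
```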
PLOTTING HISTOGRAM CHART
HISTOGRAMS are column charts where each column
represents a range of values and the height of the column
corresponds to how many values are in that range.
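A minimal sketch of a histogram of made-up marks, divided into 5 equal-width ranges (bins). The data and file name are assumptions.

```python
import matplotlib.pyplot as plt
import pandas as pd

marks = pd.Series([45, 52, 58, 61, 63, 67, 70, 72,
                   75, 78, 81, 85, 88, 91, 96])

# bins=5 splits the range of the data into 5 equal intervals
ax = marks.plot(kind="hist", bins=5)
plt.savefig("histogram.png")   # or plt.show()
```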
CUSTOMISING HISTOGRAM CHART
We can customise a histogram drawn with Pandas by passing
options such as the bin edges, fill colour and edge colour to
the plot function, which makes the chart easier to read.
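A sketch of a customised histogram with explicit bin edges, using made-up marks. With the edges 40, 60, 80, 100 the three columns count the values in 40-60, 60-80 and 80-100.

```python
import matplotlib.pyplot as plt
import pandas as pd

marks = pd.Series([45, 52, 58, 61, 63, 67, 70, 72,
                   75, 78, 81, 85, 88, 91, 96])

ax = marks.plot(kind="hist",
                bins=[40, 60, 80, 100],   # explicit bin edges
                color="orange",
                edgecolor="black",
                title="Distribution of marks")
plt.xlabel("Marks")
plt.savefig("custom_histogram.png")   # or plt.show()
```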
DATA HANDLING USING SQL
Data handling using SQL involves managing and analyzing
data in relational databases. It includes storing, retrieving,
modifying, filtering, and combining data efficiently, ensuring
integrity and enabling insightful analysis for various
applications.

SQL
SQL (Structured Query Language) is a powerful tool for
managing and manipulating data in relational databases. It
includes operations like:
• Defining database structure (DDL)
• Querying and retrieving data (DQL)
• Modifying data (DML)
It also manages user access and permissions through DCL,
making it essential for database management and analysis.
Database query using SQL (Mathematical, string, Date and time functions in SQL)
Table:
Consider the table SALESMAN with the following data:
SNO  SNAME          SALARY  BONUS  DATEOFJOIN
A01  Beena Mehta    30000   45.23  2019-10-29
A02  K. L. Sahay    50000   25.34  2018-03-13
B03  Nisha Thakkar  30000   35.00  2017-03-18
B04  Leela Yadav    80000   NULL   2018-12-31
C05  Gautam Gola    20000   NULL   1989-01-23
C06  Trapti Garg    70000   12.37  1987-06-15
D07  Neena Sharma   50000   27.89  1999-03-18

Queries:
• Display Salesman name and bonus after rounding
off to zero decimal places.
Select SNAME, round(BONUS,0) from SALESMAN;

• Display name and total salary of all salesmen after
adding salary and bonus, truncated to 1 decimal place.
Select sname, truncate((SALARY+BONUS),1) from
SALESMAN;

• Display remainder of salary divided by bonus for
salesmen whose SNO starts with 'A'.
Select MOD(SALARY,BONUS) from SALESMAN where
SNO like 'A%';

• Display position of occurrence of string 'ta' in
salesman name.
Select sname, instr(sname,'ta') from
SALESMAN;

• Display four characters from salesman name
starting from second character.
Select sname, substr(sname,2,4) from
SALESMAN;

• Display last 5 characters of name of SALESMAN.
Select sname, right(sname,5) from SALESMAN;

• Display details of salesmen whose name
contains 10 characters.
Select * from salesman where
length(sname)=10;

• Display month name for the date of join of
salesman.
Select DATEOFJOIN, monthname(DATEOFJOIN)
from SALESMAN;

• Display current date and day of the year of
current date.
Select date(now()), dayofyear(date(now()))
from dual;

• Display name of the weekday for the
DATEOFJOIN of SALESMAN.
Select DATEOFJOIN, dayname(DATEOFJOIN) from
SALESMAN;

• Display SNO and name of the youngest SALESMAN
(taken here as the one who joined most recently).
Select sno, sname, dateofjoin from salesman
where dateofjoin=(select max(DATEOFJOIN)
from SALESMAN);

• Display name and salary of the oldest SALESMAN
(taken here as the one who joined earliest).
Select sname, salary, dateofjoin from
salesman where dateofjoin=(select
min(dateofjoin) from salesman);

Database query using SQL (Aggregate functions, Group by, order by query in SQL)
Table:
Consider the table VEHICLE with the following data:
V_no    Type      Company     Price    Qty
AW125   Wagon     Maruti      250000   25
J00083  Jeep      Mahindra    4000000  15
S9090   SUV       Mitsubishi  2500000  18
M0892   Mini Van  Datsun      1500000  26
W9760   SUV       Maruti      2500000  18
R2409   Mini Van  Mahindra    350000   15

Queries:
• Display the average price of each type of vehicle
having quantity more than 20.
Select Type, avg(price) from vehicle where
qty>20 group by Type;

• Count the types of vehicles manufactured by each
company.
Select Company, count(distinct Type) from
Vehicle group by Company;

• Display total price of all types of vehicle.
Select Type, sum(Price*Qty) from Vehicle
group by Type;

• Display the details of the vehicle having
maximum price.
Select * from vehicle where price=(select
max(price) from vehicle);

• Display total vehicles of Maruti company.
Select company, sum(qty) from vehicle group
by company having company='Maruti';

• Display average price of all types of vehicles.
Select type, avg(price) from vehicle group by
type;


• Display type and minimum price of each vehicle
company.
Select company, type, price from vehicle v
where price=(select min(price) from vehicle
where company=v.company);

• Display minimum, maximum, total and average
price of Mahindra company vehicles.
Select company,
min(price), max(price), sum(price), avg(price)
from vehicle where company='Mahindra';

• Display details of all vehicles in ascending order
of their price.
Select * from vehicle order by price asc;

• Display details of all vehicles in ascending order
of type and descending order of vehicle number.
Select * from vehicle order by type asc,
v_no desc;
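Some of the aggregate queries above can be tried without a MySQL server. The sketch below is an assumption-laden stand-in: it loads the VEHICLE rows into an in-memory SQLite database through Python's built-in sqlite3 module, which supports the same AVG, SUM, MIN, MAX, GROUP BY and ORDER BY syntax used here. It is not the MySQL setup the queries were written for.

```python
import sqlite3

rows = [
    ("AW125", "Wagon", "Maruti", 250000, 25),
    ("J00083", "Jeep", "Mahindra", 4000000, 15),
    ("S9090", "SUV", "Mitsubishi", 2500000, 18),
    ("M0892", "Mini Van", "Datsun", 1500000, 26),
    ("W9760", "SUV", "Maruti", 2500000, 18),
    ("R2409", "Mini Van", "Mahindra", 350000, 15),
]

con = sqlite3.connect(":memory:")   # temporary database in memory
con.execute("CREATE TABLE vehicle (v_no TEXT, type TEXT, "
            "company TEXT, price INTEGER, qty INTEGER)")
con.executemany("INSERT INTO vehicle VALUES (?, ?, ?, ?, ?)", rows)

# Average price of each type having quantity more than 20
avg_price = con.execute(
    "SELECT type, AVG(price) FROM vehicle "
    "WHERE qty > 20 GROUP BY type ORDER BY type").fetchall()

# Total vehicles of Maruti company
maruti_total = con.execute(
    "SELECT SUM(qty) FROM vehicle WHERE company = 'Maruti'").fetchone()[0]

# Minimum, maximum, total and average price of Mahindra vehicles
mahindra_stats = con.execute(
    "SELECT MIN(price), MAX(price), SUM(price), AVG(price) "
    "FROM vehicle WHERE company = 'Mahindra'").fetchone()

print(avg_price)
print(maruti_total)
print(mahindra_stats)
con.close()
```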
