
IP Practical


Informatics Practices Practical

Class 12

DAV PUBLIC SCHOOL


Malighat, Muzaffarpur
2024-25

Submitted by:
Name: Palak Saha
Class: 12 A
Class Roll: 05
Board Roll:
Admission no: 8878
Submitted to: Mr. Servesh Kumar (IP teacher)

Date ____________________

CERTIFICATE

This is to certify that PALAK SAHAJ, student of Class XII, DAV PUBLIC
SCHOOL, Malighat, has completed this practical file during the academic year 2024-2025
towards partial fulfillment of credit for the Informatics Practices project
evaluation of CBSE and has submitted a satisfactory report, as compiled in the
following pages, under supervision.

_______________________________ ________________________________
Internal Examiner Signature Principal seal and signature

_______________________________ ________________________________
External Examiner Signature Head of Department Signature

ACKNOWLEDGEMENT

I wish to express my deep sense of gratitude and indebtedness to my learned
teacher SARVESH KUMAR, DAV PUBLIC SCHOOL, for his valuable help, advice
and guidance in the preparation of this project.

I am also greatly indebted to our principal DR. HIMANSHU KUMAR
PANDEY and the school authorities for providing me with the facilities for
making this practical file.

I also extend my thanks to my teachers, classmates and friends who helped
me complete this project successfully.

INDEX
1. Data Handling

1.1. Create a Pandas Series from a Dictionary of Values and a Numpy ndarray

1.2. Extract Elements Above the 70th Percentile

1.3. Create a DataFrame for Sales Data

1.4. Create a DataFrame for Student Results

1.5. Remove Duplicate Entries from Data

1.6. Importing and Exporting Data Between Pandas and CSV File

2. Visualization
2.1. Analyze Employee Salary Distribution

2.2. Visualize Total Sales by Category

2.3. Histogram of Ages from a Survey of 100 Participants

3. Data Management with MySQL


3.1. Create a Products Table in MySQL

3.2. Insert New Product Records

3.3. Delete a Product Record

3.4. Select Products with Price Greater Than 500

3.5. Calculate Total Price of Products

3.6. Count Products by Price Range

3.7. Order Products by Price

Data Handling and Visualization Project for Class 12 CBSE
In this project, we will explore various data handling techniques using the pandas
library, perform data visualization using matplotlib, and manage data using SQL
queries in a MySQL database. Each section includes code snippets, detailed
explanations, related content, and expected outputs.

1. Data Handling
1.1. Create a Pandas Series from a Dictionary of Values and a Numpy ndarray
Overview: A Pandas Series allows for easy manipulation and analysis of one-
dimensional data. It can be thought of as a more powerful version of a list or array, with
added functionalities such as labels (indices) and built-in methods for operations.
Technical Details: - The pandas library is essential for data manipulation and analysis.
- A dictionary can be transformed into a Series where keys become indices, and values
become data points.
Code:
import pandas as pd
import numpy as np

# Define a dictionary with sample data
data_dict = {'Math': 85, 'English': 90, 'Science': 75, 'History': 80}

# Create a Pandas Series from the dictionary
subject_series = pd.Series(data_dict)

# Create a Numpy ndarray to represent scores
scores_array = np.array([88, 92, 76, 81])

# Create a Pandas Series from the Numpy array
scores_series = pd.Series(scores_array)

# Print the Series created from the dictionary
print("Subject Scores Series:")
print(subject_series)

# Print the Series created from the Numpy ndarray
print("\nScores from Numpy ndarray:")
print(scores_series)

Expected Output:
Subject Scores Series:
Math       85
English    90
Science    75
History    80
dtype: int64

Scores from Numpy ndarray:
0    88
1    92
2    76
3    81
dtype: int64
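
The overview mentions labels and built-in methods; the short sketch below illustrates both on the same subject scores. The variable names math_score, average_score and top_subject are illustrative only and are not part of the original program.

```python
import pandas as pd

# Same subject scores as in the program above
subject_series = pd.Series({'Math': 85, 'English': 90, 'Science': 75, 'History': 80})

# Label-based access: look up a value by its index label
math_score = subject_series['Math']

# Built-in methods work directly on the stored values
average_score = subject_series.mean()   # arithmetic mean of the four scores
top_subject = subject_series.idxmax()   # label of the highest score

print("Math score:", math_score)
print("Average score:", average_score)
print("Highest-scoring subject:", top_subject)
```

With these values the average is 82.5 and the highest-scoring subject is English.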

1.2. Extract Elements Above the 70th Percentile


Overview: Percentiles are critical in data analysis as they help describe the
distribution of data points. The 70th percentile is the value below which roughly 70% of
the data falls, so filtering above it picks out the scores in approximately the top 30%.
Technical Details: - The quantile() method computes the specified percentile; by default
it interpolates linearly between neighbouring data points. -
Boolean indexing allows us to filter the Series based on the calculated percentile.
Code:
import pandas as pd

# Sample data representing scores
scores = pd.Series([65, 70, 75, 80, 85, 90, 95])

# Calculate the 70th percentile
percentile_70 = scores.quantile(0.70)

# Extract scores above the 70th percentile
above_70 = scores[scores > percentile_70]

print("Scores above the 70th percentile:")
print(above_70)

Expected Output:
Scores above the 70th percentile:
5    90
6    95
dtype: int64
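
To see why 85 is not in the result, it may help to work out the percentile by hand. The sketch below assumes the default linear interpolation of quantile(), where the percentile sits at position q * (n - 1) in the sorted data. The helper names (position, lower, fraction, manual_percentile) are illustrative.

```python
import pandas as pd

scores = pd.Series([65, 70, 75, 80, 85, 90, 95])

q = 0.70
sorted_scores = sorted(scores.tolist())

# Fractional position of the percentile in the sorted data: 0.70 * 6 = 4.2
position = q * (len(sorted_scores) - 1)
lower = int(position)            # index 4, which holds the value 85
fraction = position - lower      # about 0.2 of the way towards index 5 (value 90)

# Interpolate between the two neighbouring values: 85 + 0.2 * (90 - 85)
manual_percentile = sorted_scores[lower] + fraction * (
    sorted_scores[lower + 1] - sorted_scores[lower]
)

print("Manual 70th percentile:", round(manual_percentile, 2))
print("pandas 70th percentile:", round(scores.quantile(q), 2))
```

Both values come out at 86.0, so only 90 and 95 are strictly above the threshold.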

1.3. Create a DataFrame for Sales Data


Overview: A DataFrame is a two-dimensional labeled data structure that can hold
various data types. It is widely used for data analysis tasks. In this example, we create a
DataFrame to represent sales data for different products and their corresponding
expenditures.
Technical Details: - DataFrames can be created from dictionaries or lists; with a dictionary, keys
become column headers and each list of values becomes that column's data. - The groupby() method allows for
aggregation, which is essential for summarizing data.
Code:
import pandas as pd

# Create a DataFrame with sales data
sales_data = {
    'Product': ['Laptop', 'Smartphone', 'Tablet', 'Smartwatch', 'Headphones'],
    'Category': ['Electronics', 'Electronics', 'Electronics', 'Wearables', 'Accessories'],
    'Expenditure': [1500, 800, 300, 200, 100]
}
sales_df = pd.DataFrame(sales_data)

# Group by Category and sum the expenditures
total_sales = sales_df.groupby('Category')['Expenditure'].sum()

print("Total sales expenditure per category:")
print(total_sales)

Expected Output:
Total sales expenditure per category:
Category
Accessories     100
Electronics    2600
Wearables       200
Name: Expenditure, dtype: int64
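
groupby() is not limited to sums. As a small extension (not part of the original program), the sketch below uses agg() to compute several summaries per category in one step; the name summary is illustrative.

```python
import pandas as pd

sales_df = pd.DataFrame({
    'Product': ['Laptop', 'Smartphone', 'Tablet', 'Smartwatch', 'Headphones'],
    'Category': ['Electronics', 'Electronics', 'Electronics', 'Wearables', 'Accessories'],
    'Expenditure': [1500, 800, 300, 200, 100]
})

# One row per category, one column per aggregate
summary = sales_df.groupby('Category')['Expenditure'].agg(['sum', 'mean', 'count'])

print(summary)
```

For Electronics this gives a sum of 2600 over 3 products; the other two categories contain a single product each.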

1.4. Create a DataFrame for Student Results
Overview: Managing student results is a common application of DataFrames. In this
section, we create a DataFrame to store student results, including attributes like student
ID, name, and grades. This showcases how to organize and analyze student performance
data.
Technical Details: - DataFrames can be easily manipulated using indexing and filtering.
- Attributes such as dtypes and shape provide insights into the structure of the
DataFrame.

Code:
import pandas as pd

# Create a DataFrame for student results
student_data = {
    'Student ID': [101, 102, 103],
    'Name': ['John', 'Doe', 'Smith'],
    'Grade': ['A', 'B', 'A']
}
students_df = pd.DataFrame(student_data)

# Display row labels, column labels, data types, and dimensions
print("Row Labels:", students_df.index.tolist())
print("Column Labels:", students_df.columns.tolist())
print("Data Types:\n", students_df.dtypes)
print("Dimensions:", students_df.shape)

Expected Output:
Row Labels: [0, 1, 2]
Column Labels: ['Student ID', 'Name', 'Grade']
Data Types:
Student ID int64
Name object
Grade object
dtype: object
Dimensions: (3, 3)
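
The technical details mention filtering, which the program above does not show. A minimal sketch, using the same student data, of selecting rows by a condition (the names a_students and a_names are illustrative):

```python
import pandas as pd

students_df = pd.DataFrame({
    'Student ID': [101, 102, 103],
    'Name': ['John', 'Doe', 'Smith'],
    'Grade': ['A', 'B', 'A']
})

# Boolean filtering: keep only the rows where Grade equals 'A'
a_students = students_df[students_df['Grade'] == 'A']

# Column indexing: pull out a single column from the filtered rows
a_names = a_students['Name'].tolist()

print(a_students)
print("Students with grade A:", a_names)
```

The filtered DataFrame keeps the original row labels 0 and 2, and a_names is ['John', 'Smith'].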

1.5. Remove Duplicate Entries from Data


Overview: Data cleaning is a vital step in ensuring the accuracy of data analysis.
Duplicate entries can lead to misleading results. This section demonstrates how to
identify and remove duplicate rows from a DataFrame.
Technical Details: - The drop_duplicates() method is employed to eliminate
duplicate records, ensuring a clean dataset for analysis.

Code:
import pandas as pd

# Sample DataFrame with duplicate entries
data_with_duplicates = {
    'Name': ['Alice', 'Bob', 'Alice', 'Charlie', 'Bob'],
    'Score': [95, 85, 95, 90, 85]
}
duplicates_df = pd.DataFrame(data_with_duplicates)

# Remove duplicate rows
cleaned_df = duplicates_df.drop_duplicates()

print("DataFrame after removing duplicates:")
print(cleaned_df)

Expected Output:
DataFrame after removing duplicates:
      Name  Score
0    Alice     95
1      Bob     85
3  Charlie     90
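
The overview speaks of identifying duplicates as well as removing them. One way to do the identifying step is the duplicated() method, sketched below on the same data. The keep='last' variant is shown only as an illustration of an optional parameter; it is not used in the original program.

```python
import pandas as pd

duplicates_df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Alice', 'Charlie', 'Bob'],
    'Score': [95, 85, 95, 90, 85]
})

# True for every row that repeats an earlier row exactly
is_duplicate = duplicates_df.duplicated()
print(is_duplicate.tolist())

# Show only the repeated rows
print(duplicates_df[is_duplicate])

# Optional: keep the last occurrence of each name instead of the first
last_kept = duplicates_df.drop_duplicates(subset='Name', keep='last')
print(last_kept)
```

Rows 2 and 4 are flagged as duplicates, which is why the cleaned DataFrame keeps rows 0, 1 and 3.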

1.6. Importing and Exporting Data Between Pandas and CSV File
Overview: Data import and export are essential for data manipulation, allowing for
seamless data transfer between formats. This section illustrates how to export a
DataFrame to a CSV file and subsequently import it back into a DataFrame.
Technical Details: - The to_csv() method is used to write the DataFrame to a CSV file.
- The read_csv() method allows for reading CSV files into DataFrames, supporting
various options for data parsing.
Code:
import pandas as pd

# Create a DataFrame for demonstration
data = {
    'Employee ID': [1, 2, 3],
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Salary': [70000, 80000, 60000]
}
employees_df = pd.DataFrame(data)

# Export DataFrame to CSV
employees_df.to_csv('employees.csv', index=False)

# Import DataFrame from CSV
imported_employees_df = pd.read_csv('employees.csv')

print("Imported Employees DataFrame:")
print(imported_employees_df)

Expected Output:
Imported Employees DataFrame:
   Employee ID     Name  Salary
0            1    Alice   70000
1            2      Bob   80000
2            3  Charlie   60000
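
The index=False argument matters more than it may appear. The sketch below compares a round trip with and without it. To avoid creating files it writes to an in-memory text buffer (io.StringIO) instead of employees.csv, which is a substitution made only for this illustration.

```python
import io
import pandas as pd

employees_df = pd.DataFrame({
    'Employee ID': [1, 2, 3],
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Salary': [70000, 80000, 60000]
})

# to_csv() returns the CSV text when no file path is given
csv_with_index = employees_df.to_csv()
csv_without_index = employees_df.to_csv(index=False)

# Read both versions back
reloaded_with_index = pd.read_csv(io.StringIO(csv_with_index))
reloaded_without_index = pd.read_csv(io.StringIO(csv_without_index))

print(reloaded_with_index.columns.tolist())
print(reloaded_without_index.columns.tolist())
```

Without index=False the row labels are written as an extra unnamed column, which comes back as 'Unnamed: 0' on import; with index=False the columns match the original DataFrame.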

2. Visualization
2.1. Analyze Employee Salary Distribution
Overview: Data visualization is crucial for interpreting and presenting data insights
effectively. This section analyzes employee salaries by creating a histogram to visualize
the distribution of salaries among employees.
Technical Details: - A histogram provides a visual representation of the distribution of
numerical data. - The hist() function in Matplotlib is utilized for creating the
histogram.
Code:
import pandas as pd
import matplotlib.pyplot as plt

# Create a DataFrame for employee salaries
salary_data = {
    'Employee ID': [1, 2, 3],
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Salary': [70000, 80000, 60000]
}
salary_df = pd.DataFrame(salary_data)

# Plotting the salary distribution
plt.hist(salary_df['Salary'], bins=3, color='skyblue', edgecolor='black')
plt.title('Employee Salary Distribution')
plt.xlabel('Salary')
plt.ylabel('Number of Employees')
plt.show()
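
Since the plot itself cannot be shown in text, the sketch below uses NumPy's histogram() function to list the bin edges and counts that a 3-bin histogram of these salaries would contain. This is a numeric illustration only, not a replacement for the plotting code.

```python
import numpy as np

salaries = [70000, 80000, 60000]

# Split the range 60000 to 80000 into 3 equal-width bins and count values in each
counts, bin_edges = np.histogram(salaries, bins=3)

for i, count in enumerate(counts):
    print(f"{bin_edges[i]:.2f} to {bin_edges[i + 1]:.2f}: {count} employee(s)")
```

Each of the three bins contains exactly one employee, so the histogram shows three bars of equal height.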

2.2. Visualize Total Sales by Category


Overview: This section visualizes total sales by product category using a bar chart. This
type of chart allows stakeholders to quickly assess which categories are performing
well, facilitating better decision-making for future sales strategies.
Technical Details: - The plot() method is used with the kind parameter set to 'bar'
to create a bar chart representing total expenditures by category.

Code:
import pandas as pd
import matplotlib.pyplot as plt

# Create a DataFrame for sales data (same products and categories as in 1.3)
sales_data = {
    'Product': ['Laptop', 'Smartphone', 'Tablet', 'Smartwatch', 'Headphones'],
    'Category': ['Electronics', 'Electronics', 'Electronics', 'Wearables', 'Accessories'],
    'Expenditure': [1500, 800, 300, 200, 100]
}
sales_df = pd.DataFrame(sales_data)

# Group by Category and sum the expenditures
total_sales = sales_df.groupby('Category')['Expenditure'].sum()

# Plotting total sales expenditure by category
total_sales.plot(kind='bar', color='orange')
plt.title('Total Expenditure by Category')
plt.xlabel('Category')
plt.ylabel('Expenditure')
plt.show()

2.3. Histogram of Ages from a Survey of 100 Participants


Overview: This section shows how to visualize the distribution of ages among 100 survey
participants using a histogram.
Technical Details: - The plt.hist() function creates a histogram; the bins parameter sets
the number of equal-width intervals.
Code:
import matplotlib.pyplot as plt

# Create a list of ages (100 participants)
ages = [
    1, 1, 2, 3, 5, 7, 8, 9, 10, 10, 11, 13, 13, 15, 16, 17, 18, 19, 20, 21,
    21, 23, 24, 24, 24, 25, 25, 25, 25, 26, 26, 26, 27, 27, 27, 27, 27, 29, 30, 30,
    30, 30, 31, 33, 34, 34, 34, 35, 36, 36, 37, 37, 37, 38, 38, 39, 40, 40, 41, 41,
    42, 43, 45, 45, 46, 46, 46, 47, 48, 48, 49, 50, 51, 51, 52, 52, 53, 54, 55, 56,
    57, 58, 60, 61, 63, 65, 66, 68, 70, 72, 74, 75, 77, 81, 83, 84, 87, 89, 90, 91,
]

# Plotting the histogram
plt.hist(ages, bins=20)
plt.title("Participants' Ages Histogram")
plt.show()
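
As a cross-check on the data, the sketch below counts the participants in each ten-year age group using NumPy. The ten-year bin edges are an assumption chosen for readability; the program above uses 20 equal-width bins instead.

```python
import numpy as np

ages = [
    1, 1, 2, 3, 5, 7, 8, 9, 10, 10, 11, 13, 13, 15, 16, 17, 18, 19, 20, 21,
    21, 23, 24, 24, 24, 25, 25, 25, 25, 26, 26, 26, 27, 27, 27, 27, 27, 29, 30, 30,
    30, 30, 31, 33, 34, 34, 34, 35, 36, 36, 37, 37, 37, 38, 38, 39, 40, 40, 41, 41,
    42, 43, 45, 45, 46, 46, 46, 47, 48, 48, 49, 50, 51, 51, 52, 52, 53, 54, 55, 56,
    57, 58, 60, 61, 63, 65, 66, 68, 70, 72, 74, 75, 77, 81, 83, 84, 87, 89, 90, 91,
]

# Bin edges at 0, 10, 20, ..., 100 give one bin per decade of age
decade_edges = np.arange(0, 101, 10)
decade_counts, _ = np.histogram(ages, bins=decade_edges)

for start, count in zip(decade_edges[:-1], decade_counts):
    print(f"{start:2d} to {start + 9:2d}: {count}")

print("Total participants:", decade_counts.sum())
```

The counts add up to 100, and the 20 to 29 group is the largest, which matches the tallest region of the histogram.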

3. Data Management with MySQL
3.1. Create a Products Table in MySQL
Overview: In relational databases, tables are fundamental structures used to store data.
This section outlines how to create a Products table in a MySQL database to manage
product information.
Technical Details: - The CREATE TABLE statement is used to define the structure of the
table, including the data types for each column.
Code:
CREATE TABLE IF NOT EXISTS Products (
ProductID INT PRIMARY KEY,
ProductName VARCHAR(100),
Price DECIMAL(10, 2));

Expected Output: Table created successfully (no direct output from SQL command).

3.2. Insert New Product Records


Overview: Adding records to a database table is a fundamental operation in database
management. This section demonstrates how to insert new product records into the
Products table.

Technical Details: - The INSERT INTO statement adds new rows to the table,
specifying values for each column.
Code:
INSERT INTO Products (ProductID, ProductName, Price) VALUES (1,
'Laptop', 1500.00);
INSERT INTO Products (ProductID, ProductName, Price) VALUES (2,
'Smartphone', 800.00);
INSERT INTO Products (ProductID, ProductName, Price) VALUES (3,
'Tablet', 300.00);

Expected Output: New records inserted successfully (no direct output from SQL
command).

3.3. Delete a Product Record


Overview: Managing data also involves the ability to remove records. This section
shows how to delete a specific product record from the Products table.
Technical Details: - The DELETE FROM statement specifies the condition under which
records should be removed.
Code:
DELETE FROM Products WHERE ProductID = 2;

Expected Output: Product deleted successfully (no direct output from SQL command).

3.4. Select Products with Price Greater Than 500


Overview: Retrieving specific data based on conditions is crucial for data analysis. This
section demonstrates how to select products from the Products table that are priced
above a certain threshold.
Technical Details: - The SELECT statement retrieves specific columns, and the WHERE
clause filters the results based on conditions.
Code:
SELECT * FROM Products WHERE Price > 500;

Expected Output:
+-----------+-------------+---------+
| ProductID | ProductName | Price   |
+-----------+-------------+---------+
|         1 | Laptop      | 1500.00 |
+-----------+-------------+---------+

3.5. Calculate Total Price of Products


Overview: Performing calculations on data is a key function in database management.
This section demonstrates how to calculate the total price of all products in the
Products table.

Technical Details: - The SUM() aggregate function computes the total of a specified
column.
Code:
SELECT SUM(Price) AS TotalPrice FROM Products;

Expected Output:
+-----------+
| TotalPrice|
+-----------+
| 1800.00 |
+-----------+

3.6. Count Products by Price Range


Overview: Understanding the distribution of products by price range can provide
valuable insights for inventory and sales strategies. This section demonstrates how to
count the number of products within specified price ranges.

Technical Details: - The CASE statement in SQL can be used to categorize data for
aggregation.
Code:
SELECT
CASE
WHEN Price < 500 THEN 'Under 500'
WHEN Price >= 500 AND Price < 1000 THEN '500 to 999'
ELSE '1000 and above'
END AS PriceRange,
COUNT(*) AS ProductCount

FROM Products
GROUP BY PriceRange;

Expected Output:
+----------------+--------------+
| PriceRange     | ProductCount |
+----------------+--------------+
| 1000 and above |            1 |
| Under 500      |            1 |
+----------------+--------------+
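
The query can also be tried from Python. The sketch below is an approximation: it uses Python's built-in sqlite3 module with an in-memory database as a stand-in for MySQL (an assumption made so that it runs without a server), replays the statements from 3.1 to 3.3, and then runs the price-range query. An ORDER BY is added only to make the row order predictable.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for MySQL
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# 3.1 Create the table
cur.execute("""
    CREATE TABLE IF NOT EXISTS Products (
        ProductID INT PRIMARY KEY,
        ProductName VARCHAR(100),
        Price DECIMAL(10, 2))
""")

# 3.2 Insert the three products
cur.executemany(
    "INSERT INTO Products (ProductID, ProductName, Price) VALUES (?, ?, ?)",
    [(1, 'Laptop', 1500.00), (2, 'Smartphone', 800.00), (3, 'Tablet', 300.00)],
)

# 3.3 Delete the Smartphone record
cur.execute("DELETE FROM Products WHERE ProductID = 2")

# 3.6 Count the remaining products in each price range
cur.execute("""
    SELECT
        CASE
            WHEN Price < 500 THEN 'Under 500'
            WHEN Price >= 500 AND Price < 1000 THEN '500 to 999'
            ELSE '1000 and above'
        END AS PriceRange,
        COUNT(*) AS ProductCount
    FROM Products
    GROUP BY PriceRange
    ORDER BY PriceRange
""")
price_ranges = cur.fetchall()
print(price_ranges)

conn.close()
```

Because the Smartphone was deleted in 3.3, no product falls in the '500 to 999' range, so that range does not appear in the result at all.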

3.7. Order Products by Price


Overview: Sorting data is essential for analysis and reporting. This section shows how
to retrieve products from the Products table ordered by their price. This is useful for
identifying high and low-priced products.
Technical Details: - The ORDER BY clause sorts the retrieved records based on one or
more columns, with ASC for ascending and DESC for descending order.
Code:
SELECT ProductID, ProductName, Price FROM Products ORDER BY Price
DESC;

Expected Output:
+-----------+-------------+---------+
| ProductID | ProductName | Price   |
+-----------+-------------+---------+
|         1 | Laptop      | 1500.00 |
|         3 | Tablet      |  300.00 |
+-----------+-------------+---------+

CONCLUSION
This project provides a comprehensive overview of data handling, visualization, and
management using Python, pandas, Matplotlib, and MySQL. Each section is designed to be
self-contained, with the necessary imports included in every Python program. By
following this project, students gain practical experience in analyzing, manipulating, and
visualizing data, which are essential skills in today’s data-driven world. This project
promotes critical thinking and problem-solving abilities, and fosters a deeper
understanding of how to work with data effectively.
