Python Record

Uploaded by

Charan Aharon
Copyright
© All Rights Reserved
P.B. SIDDHARTHA COLLEGE OF ARTS & SCIENCE


An Autonomous College in the Jurisdiction of Krishna University
Accredited with A+ Grade by NAAC (cycle 2)
ISO 9001:2015 Certified
A COLLEGE WITH POTENTIAL FOR EXCELLENCE (ACCREDITED BY UGC)

LAB MANUAL
PYTHON FOR DATA SCIENCE
(22ANAL47)

Submitted by

KATAKAM D V M PAVAN KUMAR


ROLL NO: - 222208P

DEPARTMENT OF BUSINESS ANALYTICS


SIDDHARTHA NAGAR
VIJAYAWADA-520010
2023 - 2024
P.B. SIDDHARTHA COLLEGE OF ARTS & SCIENCE

BONAFIDE CERTIFICATE

This is to certify that this is the bona fide lab record of practical work done in the
Business Analytics Lab of P.B. Siddhartha College of Arts and Science during the
Academic Year 2023-2024.

Internal Examiner External Examiner


B. Sai Ram
Assistant Professor
Department of Business Administration

Signature of HOD
D. Vasu
Department of Business Analytics
DECLARATION

I, KATAKAM D V M PAVAN KUMAR (Roll No: 222208P), a student
in the Bachelor of Business Administration (Business Analytics), declare
that the data and information created and presented in the manual are original to the
best of my knowledge.

Signature of Student
CONTENTS

SI.NO    LIST OF EXPERIMENTS    SIGNATURE OF FACULTY

1    Write a list of Operators in Python

2    Define a Program for LISTS
     • Length of a list
     • List Consisting of
     • Joining of Two Lists
     • Other List operators

3    Define a Program for Sets and Dictionaries and perform various operators for it

4    How to Write a Function in Python; prepare a function in writing all arithmetic operators

5    Write a program for Tuple, Assignment operators and comparison operators and execute with the examples

6    Frame steps involving handling a DataFrame, Handling Missing Data - dropna, fillna, grouping data, Read, write .csv, .html, excel file

7    Write a program for plotting various graphs using Matplotlib

8    Write a program on Categorical Data, Splitting Data, Testing Set, Normalize Data

9    Write a program for application of Statistical Techniques using Python

10   Write a program for application of Statistical techniques using Python.
LAB – 1
List of operators in Python
Numeric operators: - Python provides a wide range of numerical operators to perform
arithmetic operations on numerical values. These operators allow you to manipulate
numeric data types such as integers and floating-point numbers, enabling you to perform
addition, subtraction, multiplication, division, and more.
Addition (+)
The addition operator (+) is used to add two operands together. Syntax:
result = operand1 + operand2

Subtraction (-)
The subtraction operator (-) subtracts the second operand from the first. Syntax:
result = operand1 - operand2

Multiplication (*)
The multiplication operator (*) multiplies two operands. Syntax:
result = operand1 * operand2

Division (/)
The division operator (/) divides the first operand by the second. Syntax:
result = operand1 / operand2
Floor Division (//)
The floor division operator (//) divides the first operand by the second and
returns the quotient rounded down to the nearest whole number (towards negative
infinity, so -7 // 2 is -4).
Syntax:
result = operand1 // operand2

Modulus (%)
The modulus operator (%) returns the remainder when the first operand is
divided by the second.
Syntax:
result = operand1 % operand2

Exponentiation (**)
The exponentiation operator (**) raises the first operand to the power of the
second.
Syntax:
result = operand1 ** operand2
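The numeric operators above can be tried together in a short sketch. The operand values 17 and 5 are hypothetical and chosen only for illustration.

```python
# Hypothetical operand values chosen for illustration
a = 17
b = 5

print("a + b  =", a + b)    # 22
print("a - b  =", a - b)    # 12
print("a * b  =", a * b)    # 85
print("a / b  =", a / b)    # 3.4 (true division returns a float)
print("a // b =", a // b)   # 3
print("a % b  =", a % b)    # 2
print("a ** b =", a ** b)   # 1419857

# Floor division rounds towards negative infinity, not towards zero
print("-17 // 5 =", -17 // 5)   # -4
print("-17 % 5  =", -17 % 5)    # 3
```

The last two lines show why "integer part" is only an accurate description of // for positive operands.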
Assignment operators: - Assignment operators are used to assign values to
variables. Python provides various assignment operators to perform assignments
efficiently and concisely. These operators not only assign values but also
perform operations simultaneously, making code more readable and compact.
Assignment (=)
The assignment operator (=) assigns the value on the right side to the variable
on the left side.
Syntax:
variable = value

Increment (+=)
The increment assignment operator (+=) adds the value on the right side to the
variable on the left side and assigns the result to the variable.
Syntax:
variable += value

Decrement (-=)
The decrement assignment operator (-=) subtracts the value on the right side
from the variable on the left side and assigns the result to the variable.
Syntax:
variable -= value

Multiplication Assignment (*=)


The multiplication assignment operator (*=) multiplies the variable on the left
side by the value on the right side and assigns the result to the variable.
Syntax:
variable *= value
Division Assignment (/=)
The division assignment operator (/=) divides the variable on the left side by the
value on the right side and assigns the result to the variable.
Syntax:
variable /= value

Modulus Assignment (%=)


The modulus assignment operator (%=) computes the modulus of the variable
on the left side with the value on the right side and assigns the result to the
variable.
Syntax:
variable %= value

Exponentiation Assignment (**=)

The exponentiation assignment operator (**=) raises the variable on the left side to
the power of the value on the right side and assigns the result to the variable.
Syntax:
variable **= value

Floor Division Assignment (//=)


The floor division assignment operator (//=) divides the variable on the left side
by the value on the right side and assigns the quotient, rounded down to the
nearest whole number, to the variable.
Syntax:
variable //= value
Bitwise AND Assignment (&=)
The bitwise AND assignment operator (&=) performs a bitwise AND operation
between the variable on the left side and the value on the right side and assigns
the result to the variable.
Syntax:
variable &= value

Bitwise OR Assignment (|=)


The bitwise OR assignment operator (|=) performs a bitwise OR operation
between the variable on the left side and the value on the right side and assigns
the result to the variable.
Syntax:
variable |= value

Bitwise XOR Assignment (^=)


The bitwise XOR assignment operator (^=) performs a bitwise XOR (exclusive
OR) operation between the variable on the left side and the value on the right
side and assigns the result to the variable.
Syntax:
variable ^= value

Bitwise Right Shift Assignment (>>=)


The bitwise right shift assignment operator (>>=) shifts the bits of the variable
on the left side to the right by the number of positions specified by the value on
the right side and assigns the result to the variable.
Syntax:
variable >>= value

Bitwise Left Shift Assignment (<<=)


The bitwise left shift assignment operator (<<=) shifts the bits of the variable on
the left side to the left by the number of positions specified by the value on the
right side and assigns the result to the variable.
Syntax:
variable <<= value
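As a minimal sketch, the assignment operators above can be applied one after another to a single variable. The names x and bits and their starting values are hypothetical.

```python
x = 10          # plain assignment
x += 5          # x is now 15
x -= 3          # 12
x *= 2          # 24
x /= 4          # 6.0 (true division produces a float)
x **= 2         # 36.0
x //= 5         # 7.0
x %= 4          # 3.0
print("x =", x)

bits = 0b1100   # 12
bits &= 0b1010  # 8  (0b1000)
bits |= 0b0001  # 9  (0b1001)
bits ^= 0b0011  # 10 (0b1010)
bits <<= 2      # 40
bits >>= 1      # 20
print("bits =", bits)
```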
Comparison Operators in Python
Comparison operators are used to compare values and return a Boolean result
(True or False). Python provides several comparison operators to compare
different types of values, such as numbers, strings, and sequences.

Equal to (==)
The equal to operator (==) compares two operands and returns True if they are
equal, otherwise False.
Syntax:
result = operand1 == operand2

Not equal to (!=)


The not equal to operator (!=) compares two operands and returns True if they are
not equal, otherwise False.
Syntax:
result = operand1 != operand2

Greater than (>)


The greater than operator (>) compares two operands and returns True if the first
operand is greater than the second operand, otherwise False.
Syntax:
result = operand1 > operand2

Less than (<)


The less than operator (<) compares two operands and returns True if the first
operand is less than the second operand, otherwise False.
Syntax:
result = operand1 < operand2

Greater than or equal to (>=)


The greater than or equal to operator (>=) compares two operands and returns
True if the first operand is greater than or equal to the second operand, otherwise
False.
Syntax:
result = operand1 >= operand2

Less than or equal to (<=)


The less than or equal to operator (<=) compares two operands and returns True if
the first operand is less than or equal to the second operand, otherwise False.
Syntax:
result = operand1 <= operand2
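A short sketch of the comparison operators on numbers, strings, and sequences follows. The variable names and values are hypothetical.

```python
int_equals_float = 3 == 3.0              # True: numbers compare by value
strings_ordered = "apple" < "banana"     # True: strings compare character by character
lists_ordered = [1, 2, 3] < [1, 2, 4]    # True: sequences compare element by element

age = 20
in_range = 18 <= age < 60                # True: comparisons can be chained

print(int_equals_float, strings_ordered, lists_ordered, in_range)
```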
Logical Operators in Python
Logical operators are used to combine multiple conditions or expressions.
Python provides three main logical operators: and, or, and not. With Boolean
operands they return True or False; with other operands, and and or return one of
the operands themselves, and they stop evaluating as soon as the result is known.

Logical AND (and)


The logical AND operator (and) returns True if both operands are True, otherwise
it returns False.
Syntax:
result = operand1 and operand2

Logical OR (or)
The logical OR operator (or) returns True if at least one of the operands is True,
otherwise it returns False.
Syntax:
result = operand1 or operand2

Logical NOT (not)


The logical NOT operator (not) returns True if the operand is False, and False if the
operand is True.
Syntax:
result = not operand
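The three logical operators can be sketched as below. The names age and has_id are hypothetical and used only for illustration.

```python
age = 22
has_id = True

can_enter = age >= 18 and has_id       # True: both conditions hold
is_blocked = age < 18 or not has_id    # False: neither condition holds
print(can_enter, is_blocked)

# With non-Boolean operands, `or` and `and` return one of the operands
fallback = 0 or "default"              # "default": 0 is falsy, so the right side is returned
skipped = "" and "unused"              # "": the empty string is falsy, so evaluation stops there
print(repr(fallback), repr(skipped))
```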
Identity Operators in Python
Identity operators are used to compare the memory locations of two objects and
determine if they refer to the same object. In Python, there are two identity
operators: is and is not.

Identity Operator (is)


The is operator returns True if two variables refer to the same object, otherwise it
returns False.

Syntax:
result = variable1 is variable2

Identity Operator (is not)


The is not operator returns True if two variables do not refer to the same object,
otherwise it returns False.

Syntax:
result = variable1 is not variable2
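The difference between identity (is) and equality (==) is easiest to see with two lists that hold the same values. This is a minimal sketch with hypothetical variable names.

```python
first = [1, 2, 3]
second = [1, 2, 3]   # a separate list with equal contents
alias = first        # a second name for the same list object

print(first == second)       # True: the contents are equal
print(first is second)       # False: they are two different objects
print(first is alias)        # True: both names refer to one object
print(first is not second)   # True

alias.append(4)              # changing the object through one name...
print(first)                 # ...is visible through the other: [1, 2, 3, 4]

value = None
print(value is None)         # True: `is` is the usual way to test for None
```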

Membership Operators in Python


Membership operators are used to test whether a value is found within
a container such as a string, list, tuple, set, or dictionary (for a dictionary, the
keys are tested). Python provides two membership operators: in and not in.

Membership Operator (in)


The in operator returns True if a specified value is found in the sequence,
otherwise it returns False.
Syntax:
result = value in sequence

Membership Operator (not in)


The not in operator returns True if a specified value is not found in the sequence,
otherwise it returns False.
Syntax:
result = value not in sequence
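A brief sketch of both membership operators on a list, a string, and a dictionary. The data is hypothetical.

```python
fruits = ["apple", "banana", "cherry"]
prices = {"apple": 30, "banana": 10}

has_apple = "apple" in fruits              # True
no_mango = "mango" not in fruits           # True
has_substring = "nan" in "banana"          # True: strings test for substrings
key_found = "apple" in prices              # True: dictionaries test their keys
value_as_key = 30 in prices                # False: 30 is a value, not a key
value_found = 30 in prices.values()        # True

print(has_apple, no_mango, has_substring, key_found, value_as_key, value_found)
```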
Bitwise Operators in Python
Bitwise operators are used to perform operations on individual bits of integer
values. These operators treat numbers as sequences of binary digits and perform
operations bit by bit.

Bitwise AND (&):


Syntax: result = operand1 & operand2
Performs a bitwise AND operation between corresponding bits of two operands.
Sets each bit to 1 if both bits are 1, otherwise sets it to 0.

Bitwise OR (|):
Syntax: result = operand1 | operand2
Performs a bitwise OR operation between corresponding bits of two operands.
Sets each bit to 1 if at least one of the bits is 1.

Bitwise XOR (^):


Syntax: result = operand1 ^ operand2
Performs a bitwise XOR (exclusive OR) operation between corresponding bits of
two operands.
Sets each bit to 1 if only one of the bits is 1, but not both.

Bitwise NOT (~):


Syntax: result = ~operand
Inverts all bits of the operand.
Changes every 0 to 1 and every 1 to 0; for a Python integer x, the result equals -x - 1.

Bitwise Left Shift (<<):


Syntax: result = operand << num_bits
Shifts all bits of the operand to the left by a specified number of positions.
Equivalent to multiplying the operand by 2 raised to the power of the shift
amount.
Bitwise Right Shift (>>):
Syntax: result = operand >> num_bits
Shifts all bits of the operand to the right by a specified number of positions.
Equivalent to floor-dividing the operand by 2 raised to the power of the shift amount.
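The bitwise operators can be checked on two small integers. The values 12 and 10 are hypothetical, and bin() is used only to display the binary form.

```python
a = 12   # 0b1100
b = 10   # 0b1010

print(bin(a & b), a & b)   # 0b1000 8
print(bin(a | b), a | b)   # 0b1110 14
print(bin(a ^ b), a ^ b)   # 0b110 6
print(~a)                  # -13, the same as -a - 1
print(a << 2)              # 48, the same as a * 2**2
print(a >> 2)              # 3, the same as a // 2**2
print(-5 >> 1)             # -3: right shift rounds down, like floor division
```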
LAB – 2
Programme for LISTS

Introduction:
Lists are an essential data structure in Python, serving as dynamic containers to
store collections of elements. A list can hold numbers, strings, or even complex
objects, and elements of different types can be mixed in the same list. Lists are
ordered and mutable, which allows for easy modification, iteration, and
manipulation.
i. Length of list:
The length of a list refers to the number of elements it contains. In Python, you
can determine the length of a list using the built-in function len().
For example:
my_list = [1, 2, 3, 4, 5]
length = len(my_list)
print("Length of the list:", length)

ii. List consisting of:


This typically refers to creating a list containing specific elements. In Python,
you can create a list by enclosing the elements within square brackets [],
separated by commas.
For example:
my_list = [1, 2, 3, 4, 5]
iii. Joining of two lists:
Joining two lists means combining their elements to form a single list. In
Python, you can achieve this using the + operator or the extend() method.
For example:
list1 = [1, 2, 3]
list2 = [4, 5, 6]

# Using the + operator


combined_list = list1 + list2
print("Combined list:", combined_list)

# Using the extend() method


list1.extend(list2)
print("Combined list:", list1)
Both approaches give a list containing the elements of list1 followed by those of
list2. The + operator creates a new list and leaves list1 unchanged, whereas
extend() modifies list1 in place.

iv. Other list operators:


In addition to joining lists, there are several other operators that you can use
with lists in Python, such as:
*: Repetition operator, used to repeat a list a certain number of times.

in: Membership operator, used to check if an element is present in a list.

not in: Negated membership operator, used to check if an element is not present
in a list.

[]: Indexing operator, used to access individual elements or slices of a list.

[:]: Slice operator, used to extract a portion of a list.

max(): Function to find the maximum value in a list.

min(): Function to find the minimum value in a list.

sum(): Function to find the sum of all elements in a list.

These operators and functions provide powerful tools for working with lists in
Python.
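The operators and functions listed above can be sketched on one small list. The list contents are hypothetical.

```python
numbers = [4, 9, 1, 7]

repeated = numbers * 2         # [4, 9, 1, 7, 4, 9, 1, 7]
has_nine = 9 in numbers        # True
lacks_five = 5 not in numbers  # True
first_item = numbers[0]        # 4
last_item = numbers[-1]        # 7
middle = numbers[1:3]          # [9, 1]: the slice excludes index 3

print(repeated, has_nine, lacks_five)
print(first_item, last_item, middle)
print(max(numbers), min(numbers), sum(numbers))   # 9 1 21
```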
LAB – 3
Programme for Sets & Dictionaries

1. Sets:
Sets in Python are unordered collections of unique elements. They are defined
by enclosing elements within curly braces {} or by using the set() constructor.
Sets are useful for tasks where you need to eliminate duplicates from a
collection or perform operations like intersection, union, and difference
efficiently.
Here are some of the common set operators and methods in Python:
i. Creating Sets:
Sets can be created using curly braces {} with elements inside, or the set() constructor.
Empty braces {} create an empty dictionary, so use set() for an empty set.
set1 = {1, 2, 3, 4, 5}
set2 = set([4, 5, 6, 7, 8])

ii. Union (|):


Combines elements from two sets, eliminating duplicates.
union_set1 = set1 | set2
union_set2 = set1.union(set2)
iii. Intersection (&):
Finds common elements between two sets.
intersection_set1 = set1 & set2
intersection_set2 = set1.intersection(set2)

iv. Difference (-):


Finds elements in the first set that are not in the second set.
difference_set1 = set1 - set2
difference_set2 = set1.difference(set2)
v. Symmetric Difference (^):
Finds elements that are in either of the sets, but not in both.
symmetric_difference_set1 = set1 ^ set2
symmetric_difference_set2 = set1.symmetric_difference(set2)

vi. Subset (<=) and Superset (>=):


Checks if one set is a subset or superset of another.
is_subset = set1 <= set2
is_superset = set1 >= set2

vii. Add and Remove Elements:


You can add elements to a set using the add() method and remove elements
using the remove() or discard() methods.
set1.add(6)
set1.remove(3)
viii. Clearing a Set:
Removes all elements from a set.
set1.clear()
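Two details of the set operations above are worth a short sketch: how remove() and discard() differ when the element is missing, and how an empty set is created. The set contents are hypothetical.

```python
colors = {"red", "green", "blue"}

colors.discard("yellow")        # no error if the element is absent
try:
    colors.remove("yellow")     # remove() raises KeyError if the element is absent
    remove_failed = False
except KeyError:
    remove_failed = True
print("remove raised KeyError:", remove_failed)

empty_set = set()               # an empty set
empty_braces = {}               # empty braces give a dictionary, not a set
print(type(empty_set).__name__, type(empty_braces).__name__)

unique = set([1, 2, 2, 3, 3, 3])   # duplicates are dropped
print(sorted(unique))
```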

2. Dictionary:
A dictionary in Python is a collection of key-value pairs. Since Python 3.7,
dictionaries preserve the order in which keys were inserted. Each key
is unique and associated with a value, similar to a real-world dictionary where
words (keys) have corresponding definitions (values). Dictionaries are mutable,
meaning their contents can be modified after creation.
i. Creation:
You can create a dictionary by enclosing comma-separated key-value pairs
within curly braces {}.
For example:
my_dict = {'apple': 3, 'banana': 5, 'orange': 2}
ii. Accessing values:
You can access the value associated with a key using square brackets [] and the
key itself.
For example:
print(my_dict['apple'])

iii. Adding or updating elements:


You can add a new key-value pair or update an existing one by assigning a
value to a key:
my_dict['grapes'] = 4 # Adding a new key-value pair
my_dict['apple'] = 6 # Updating the value for an existing key

iv. Deleting elements:


You can delete a key-value pair from the dictionary using the del keyword or
the pop() method:
del my_dict['banana'] # Deleting a key-value pair
value = my_dict.pop('orange') # Deleting a key-value pair and retrieving the value
v. Checking membership:
You can check if a key exists in a dictionary using the in and not in operators:
if 'apple' in my_dict:
    print('Apple is present in the dictionary.')

vi. Length:
You can determine the number of key-value pairs in a dictionary using the len()
function:
print(len(my_dict))
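Beyond the operations above, two common patterns are looking up a key that may be missing and looping over the pairs. The sketch below uses a hypothetical dictionary named stock.

```python
stock = {'apple': 3, 'banana': 5, 'orange': 2}

# get() returns a default instead of raising KeyError for a missing key
mango_count = stock.get('mango', 0)
print("mango:", mango_count)

# items() yields (key, value) pairs in insertion order
for fruit, quantity in stock.items():
    print(fruit, quantity)

total = sum(stock.values())
print("keys:", list(stock.keys()), "total:", total)
```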
LAB – 4
Function in writing all Arithmetic operators
In Python, functions serve as reusable blocks of code that perform specific
tasks. They encapsulate a set of instructions, allowing you to execute them
multiple times without having to rewrite the code. Functions enhance code
readability, maintainability, and reusability, making them a fundamental
concept in Python programming.
The code defines a simple Python program that performs arithmetic operations
based on user input. Let's break down how it works:

Function Definitions:
Four functions are defined: add(), subtract(), multiply(), and divide(). Each
function takes two parameters (P and Q) representing the numbers on which the
respective operation will be performed.
Each function returns the result of the corresponding arithmetic operation:
addition, subtraction, multiplication, or division.

User Interface:
The program prompts the user to select an operation by displaying options for
addition, subtraction, multiplication, and division.
The user is asked to input their choice (a, b, c, or d) corresponding to the desired
operation.
The user is prompted to enter two numbers (num_1 and num_2) on which the
selected operation will be performed.

Conditional Statements:
The program uses conditional statements (if, elif, else) to determine which
operation to perform based on the user's choice.
Depending on the user's choice, the program calls the corresponding function
(add(), subtract(), multiply(), or divide()) with the provided numbers as
arguments.
The result of the operation is printed to the console.

Input Handling:
The program converts the numbers entered by the user to the appropriate data
type (‘str’ to ‘int’) before performing any operations; the choice itself
remains a string.

Output:
After performing the selected arithmetic operation, the program prints the
expression along with the
result to the console.
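The program described above is not reproduced in this record, so the following is only a sketch of how it might look. The four function names, the parameters P and Q, the choices a to d, and num_1 and num_2 come from the description; the helper calculate(), the zero check, and the fixed example values are assumptions made so that the sketch runs without keyboard input.

```python
def add(P, Q):
    return P + Q

def subtract(P, Q):
    return P - Q

def multiply(P, Q):
    return P * Q

def divide(P, Q):
    return P / Q

def calculate(choice, num_1, num_2):
    """Hypothetical helper: build the output line for choice a, b, c, or d."""
    if choice == "a":
        return f"{num_1} + {num_2} = {add(num_1, num_2)}"
    elif choice == "b":
        return f"{num_1} - {num_2} = {subtract(num_1, num_2)}"
    elif choice == "c":
        return f"{num_1} * {num_2} = {multiply(num_1, num_2)}"
    elif choice == "d":
        if num_2 == 0:                      # assumption: guard against division by zero
            return "Cannot divide by zero"
        return f"{num_1} / {num_2} = {divide(num_1, num_2)}"
    else:
        return "Invalid choice"

# Fixed values stand in for user input. Interactively one would use, for example:
#   choice = input("Enter choice (a/b/c/d): ")
#   num_1 = int(input("Enter first number: "))
#   num_2 = int(input("Enter second number: "))
for choice in ("a", "b", "c", "d"):
    print(calculate(choice, 12, 4))
```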
LAB – 5
Python Programming: Tuples, Assignment Operators, and
Comparison Operators
1. Introduction
This document provides an overview of tuples, assignment operators, and
comparison operators in Python, along with examples to demonstrate their
usage.

2. Tuples
Definition
A tuple is an immutable sequence type in Python. It can hold a collection of
items of any data type.

Example
# Creating a tuple
tuple1 = (1, 2, 3, 4, 5)
tuple2 = ("apple", "banana", "cherry")

print("Tuple1:", tuple1)
print("Tuple2:", tuple2)

# Accessing elements in a tuple


print("First element of Tuple1:", tuple1[0])
print("Last element of Tuple2:", tuple2[-1])

# Tuples can be nested


nested_tuple = (tuple1, tuple2)
print("Nested Tuple:", nested_tuple)
# Tuples are immutable, so the following line would raise an error if uncommented
# tuple1[0] = 10
Explanation
• Tuples are created using parentheses ().
• Elements in a tuple are accessed using indexing.
• Tuples can be nested, meaning a tuple can contain other tuples.
• Tuples are immutable; once created, their elements cannot be changed.

3. Assignment Operators
Definition
Assignment operators are used to assign values to variables. Python provides
several assignment operators to perform arithmetic and other operations in a
concise manner.
Example
a = 5
b = 10
# Using assignment operators
a += b # equivalent to a = a + b
print("a += b:", a)

a -= b # equivalent to a = a - b
print("a -= b:", a)

a *= b # equivalent to a = a * b
print("a *= b:", a)

a /= b # equivalent to a = a / b
print("a /= b:", a)

a %= b # equivalent to a = a % b
print("a %= b:", a)

a **= 2 # equivalent to a = a ** 2
print("a **= 2:", a)

a //= b # equivalent to a = a // b
print("a //= b:", a)
Explanation
+=: Adds right operand to the left operand and assigns the result to the left
operand.
-=: Subtracts right operand from the left operand and assigns the result to the
left operand.
*=: Multiplies left operand with the right operand and assigns the result to the
left operand.
/=: Divides left operand by the right operand and assigns the result to the left
operand.
%=: Takes the modulus using left and right operands and assigns the result to
the left operand.
**=: Performs exponential (power) calculation on the operands and assigns the
result to the left operand.
//=: Performs floor division on the operands and assigns the result to the left
operand.

4. Comparison Operators
Definition
Comparison operators are used to compare two values. They return a Boolean
value (True or False) based on the comparison.
Example
x = 10
y = 20
# Using comparison operators
print("x == y:", x == y) # Equal to
print("x != y:", x != y) # Not equal to
print("x > y:", x > y) # Greater than
print("x < y:", x < y) # Less than
print("x >= y:", x >= y) # Greater than or equal to
print("x <= y:", x <= y) # Less than or equal to
# Tuple comparison
tuple3 = (1, 2, 3)
tuple4 = (1, 2, 4)
print("tuple3 == tuple4:", tuple3 == tuple4)
print("tuple3 != tuple4:", tuple3 != tuple4)
print("tuple3 < tuple4:", tuple3 < tuple4)
print("tuple3 > tuple4:", tuple3 > tuple4)

Explanation
==: Checks if two operands are equal.
!=: Checks if two operands are not equal.
>: Checks if the left operand is greater than the right operand.
<: Checks if the left operand is less than the right operand.
>=: Checks if the left operand is greater than or equal to the right operand.
<=: Checks if the left operand is less than or equal to the right operand.
Tuple comparison is done element-wise until a difference is found.
5. Conclusion
This document has provided a brief overview of tuples, assignment operators,
and comparison operators in Python, including examples to demonstrate their
functionality. Tuples are useful for storing immutable sequences of items,
assignment operators simplify arithmetic operations, and comparison
operators help in making decisions based on comparisons.
LAB – 6
Handling DataFrames, Missing Data, Grouping, and File
Operations
Introduction to DataFrames
DataFrames are a fundamental data structure in Python's pandas library, used
for data manipulation and analysis. They are a two-dimensional labeled data
structure with columns of potentially different types. DataFrames are similar
to SQL tables or spreadsheets, but with more powerful and flexible data
manipulation capabilities.
Step 1: Load Data
To begin, we need to load the data into a DataFrame. This can be done by
reading data from various sources, such as CSV files, Excel files, HTML
tables, or SQL databases. Here's an example of how to read a CSV file into a
DataFrame:
import pandas as pd
df = pd.read_csv(r"C:\Users\hp\OneDrive\Documents\students.csv")
Step 2 : Manipulate Dataframe
Filtering Rows:
• Filter rows where the CGPA is greater than 8.0:
high_cgpa_students = df[df['CGPA'] > 8.0]
print(high_cgpa_students)

• Filter rows where the age is 22 and the weight is less than 60:
young_light_students = df[(df['Age'] == 22) & (df['Weight'] < 60)]
print(young_light_students)

Selecting Columns:
• Select the 'Student', 'CGPA', and 'Height' columns:
selected_columns = df[['Student', 'CGPA', 'Height']]
print(selected_columns)
• Select rows up to index label 10 (inclusive) and columns by label:
selected_by_label = df.loc[:10, ['Student', 'CGPA', 'Height']]
print(selected_by_label)

• Select columns by integer position:
selected_by_position = df.iloc[:, [0, 3, 4]]
print(selected_by_position)
Renaming Columns:
• Rename the 'CGPA' column to 'GPA':
# rename() returns a new DataFrame, so df keeps the 'CGPA' column used in later steps
df_renamed = df.rename(columns={'CGPA': 'GPA'})
print(df_renamed.head())

Handling Duplicates:
• Remove duplicate rows:
df_unique = df.drop_duplicates()
print(df_unique)
• Remove duplicates based on the 'Student' and 'Roll No' columns:
df_unique_students = df.drop_duplicates(subset=['Student', 'Roll No'])
print(df_unique_students)
Step 3 : Handle Missing Data:

• Drop rows with missing values:


df_dropped = df.dropna()
print(df_dropped)

• Drop columns with missing values:


df_dropped_cols = df.dropna(axis=1)
print(df_dropped_cols)
• Drop rows with fewer than 6 non-missing values:
df_dropped_thresh = df.dropna(thresh=6)
print(df_dropped_thresh)

• Fill missing values with a specific value (0 in this case):


df_filled = df.fillna(0)
print(df_filled)
• Fill missing values with the mean of each numeric column:
df_filled_mean = df.fillna(df.mean(numeric_only=True))
print(df_filled_mean)

• Fill missing values with the median of each numeric column:
df_filled_median = df.fillna(df.median(numeric_only=True))
print(df_filled_median)
Step 4 : Group Data:
• Group by the 'Age' column and calculate the mean CGPA:
grouped_age = df.groupby('Age')['CGPA'].mean()
print(grouped_age)

• Group by the 'Age' and 'Height' columns and calculate the sum of the
'Weight' column:
grouped_age_height = df.groupby(['Age', 'Height'])['Weight'].sum()
print(grouped_age_height)

These examples demonstrate how to handle missing data in a DataFrame by
dropping rows or columns with missing values, and filling in missing values
with a specific value, the mean, or the median. Additionally, we've shown
how to group the data by one or more columns and perform various operations
on the groups, such as calculating the mean or sum.
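Because students.csv is not included in this record, the missing-data and grouping steps can be tried on a small in-memory DataFrame instead. The sketch below is an illustration only: the rows are hypothetical, and the column names simply follow the examples above.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for students.csv
df = pd.DataFrame({
    "Student": ["Asha", "Bala", "Chitra", "Dev", "Esha"],
    "Age": [21, 22, 21, 22, 21],
    "CGPA": [8.4, np.nan, 7.9, 9.1, np.nan],
    "Weight": [55.0, 62.0, np.nan, 58.0, 60.0],
})

# Count the missing values in each column
print(df.isna().sum())

# Drop every row that has at least one missing value
df_dropped = df.dropna()
print(df_dropped)

# Fill numeric gaps with the mean of their own column
df_filled = df.fillna(df.mean(numeric_only=True))
print(df_filled)

# Mean CGPA per age group, computed on the filled data
grouped = df_filled.groupby("Age")["CGPA"].mean()
print(grouped)
```

Filling with the column mean keeps every row, whereas dropna() here keeps only the two complete rows; which is preferable depends on how much data can be spared.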
Step 5: Reading and Writing various Files
1. Reading a CSV file:
• Use pd.read_csv('file.csv') to read a CSV file into a DataFrame.
• You can specify additional parameters like index_col, parse_dates, etc.
Writing a CSV file:
• Use df.to_csv('file.csv') to write a DataFrame to a CSV file.
• You can also specify parameters like index, header, etc.

2. Reading an HTML file:


• Use pd.read_html('file.html') to read the HTML tables in a file; it returns a list of DataFrames, one per table.
• You'll need to install an HTML parser library like lxml or html5lib.
Writing an HTML file:
• Use df.to_html('file.html') to write a DataFrame to an HTML file.
• You can also specify parameters like index, header, etc.

3. Reading an Excel file:


• Use pd.read_excel('file.xlsx') to read an Excel file into a DataFrame.
• You can specify the sheet_name parameter to read a specific worksheet.
Writing an Excel file:
• Use df.to_excel('file.xlsx') to write a DataFrame to an Excel file.
• You can specify the sheet_name parameter to write to a specific
worksheet.
You'll need to install additional packages like openpyxl or xlsxwriter to write
Excel files.
The key points for these file operations are:
Use the appropriate pd.read_*() and df.to_*() functions to read and write files.
Specify file paths, sheet names, and other parameters as needed.
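As a minimal sketch of the read and write pattern, the CSV round trip below uses an in-memory text buffer (io.StringIO) in place of a file on disk, so it runs without any particular folder or file. The DataFrame contents are hypothetical.

```python
import io

import pandas as pd

df = pd.DataFrame({"Student": ["Asha", "Bala"], "CGPA": [8.4, 9.1]})

# Write the DataFrame as CSV text; index=False leaves out the row labels
buffer = io.StringIO()
df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)

# Read the same text back into a new DataFrame
df_back = pd.read_csv(io.StringIO(csv_text))
print(df_back)
```

With a real file, the buffer would be replaced by a path such as 'file.csv' in both calls.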
LAB – 7
Plotting various graphs using Matplotlib

1. Scatter and Line Chart


Scatter Plot
• Explanation: A scatter plot displays points based on two numerical
variables. It's useful for observing the relationship between these
variables.
• Code Explanation:
✓ plt.scatter(x, y1, color='blue', label='Scatter Data'): Plots x vs. y1 with
blue markers.
✓ plt.xlabel('X-axis') and plt.ylabel('Y-axis'): Label the x and y axes.
✓ plt.legend(): Adds a legend to distinguish between multiple plots.
Line Chart
• Explanation: A line chart connects individual data points with a
continuous line. It's used to show trends over time or other ordered
categories.
• Code Explanation:
➢ plt.plot(x, y2, color='green', label='Line Data'): Plots x vs. y2 with a
green line.
➢ plt.title('Scatter and Line Chart'): Adds a title to the chart.
➢ plt.show(): Displays the chart.

Combined Code:
import matplotlib.pyplot as plt
# Data for scatter and line plot
x = [1, 2, 3, 4, 5]
y1 = [2, 3, 5, 7, 11] # Data for scatter plot
y2 = [1, 4, 6, 8, 10] # Data for line plot
# Scatter plot
plt.scatter(x, y1, color='blue', label='Scatter Data')
# Line plot
plt.plot(x, y2, color='green', label='Line Data')
plt.title('Scatter and Line Chart')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.legend()
plt.show()
2. Bubble Chart
• Explanation: A bubble chart is a type of scatter plot where a third
dimension is represented by the size of the bubbles. It's useful for
comparing three variables.
• Code Explanation:
✓ sizes: Defines the size of the bubbles; it is passed to the s parameter.
✓ plt.scatter(x, y, s=sizes, alpha=0.5, c='red', label='Bubble Data'): Plots x vs.
y with varying bubble sizes.

Code:
import matplotlib.pyplot as plt

# Data for bubble chart
x = [1, 2, 3, 4, 5]
y = [10, 15, 20, 25, 30]
sizes = [100, 200, 300, 400, 500] # Bubble sizes

# Bubble plot
plt.scatter(x, y, s=sizes, alpha=0.5, c='red', label='Bubble Data')

# Adding titles and labels
plt.title('Bubble Chart')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.legend()

# Show plot
plt.show()
3. Histogram
• Explanation: A histogram displays the distribution of a dataset by grouping
data into bins and counting the number of observations in each bin. It's
useful for understanding the underlying frequency distribution of a dataset.
• Code Explanation:
✓ plt.hist(data, bins=bins, color='blue', edgecolor='black'): Plots a
histogram of data with specified bins.

Code:
import matplotlib.pyplot as plt
# Data for histogram
data = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5, 6, 7, 8, 9, 10]
bins = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Histogram
plt.hist(data, bins=bins, color='blue', edgecolor='black')
# Adding titles and labels
plt.title('Histogram')
plt.xlabel('Value')
plt.ylabel('Frequency')
# Show plot
plt.show()

4. Trend Line
• Explanation: A trend line is used to highlight the general direction in
which a dataset is moving. It's often used in time series data to show
trends over time.
• Code Explanation:
✓ plt.plot(x, y, 'o'): Plots the original data points.
✓ plt.plot(x, m*x + b, '-'): Plots the trend line using the linear equation y =
mx + b.
Code:
import matplotlib.pyplot as plt
import numpy as np
x = np.array([0, 1, 2, 3, 4, 5])
y = np.array([1, 3, 2, 5, 7, 8])
m, b = np.polyfit(x, y, 1) # Slope (m) and intercept (b)
# Plotting data points
plt.plot(x, y, 'o')
# Plotting trend line
plt.plot(x, m*x + b, '-')
# Adding titles and labels
plt.title('Trend Line')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
# Show plot
plt.show()
These examples demonstrate how to create different types of charts using Matplotlib. Each
type of chart provides a unique way to visualize and interpret data, making it easier to
understand patterns, relationships, and distributions within a dataset.
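For the trend line, np.polyfit(x, y, 1) fits a straight line by least squares. As a sketch of what that means, the slope and intercept can also be computed by hand with the closed-form least-squares formulas, using the same data as the trend line example and plain Python only.

```python
x = [0, 1, 2, 3, 4, 5]
y = [1, 3, 2, 5, 7, 8]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Slope = sum of cross-deviations divided by sum of squared x-deviations
s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
s_xx = sum((xi - mean_x) ** 2 for xi in x)

m = s_xy / s_xx            # slope
b = mean_y - m * mean_x    # intercept: the line passes through (mean_x, mean_y)

print(f"slope m = {m:.4f}, intercept b = {b:.4f}")
```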
LAB – 8
PROGRAM USING CATEGORICAL DATA, SPLITTING
DATA, TESTING SET, NORMALIZE DATA
1. Importing the necessary libraries:

2. Load Data: Here, we'll create a sample Data Frame for demonstration.

3. Split Data into Features and Target:

4. Handle Categorical Data and Normalize Data:

• OneHotEncode categorical features.


• Standardize numerical features.
5. Split Data into Training and Testing Sets:
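The code for these five steps is not reproduced in this record. The sketch below shows one way they might look with pandas and scikit-learn; the sample data, the column names, the 25% test size, and the random seed are all assumptions made for illustration. It follows the order of the steps listed above; in practice the scaler is often fitted on the training set only.

```python
# Step 1: import the necessary libraries
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Step 2: hypothetical sample DataFrame
df = pd.DataFrame({
    "City": ["Vijayawada", "Guntur", "Vijayawada", "Hyderabad",
             "Guntur", "Hyderabad", "Vijayawada", "Guntur"],
    "Age": [21, 22, 20, 23, 21, 22, 24, 20],
    "Income": [30000, 42000, 28000, 55000, 39000, 61000, 45000, 33000],
    "Purchased": [0, 1, 0, 1, 0, 1, 1, 0],
})

# Step 3: split into features (X) and target (y)
X = df.drop(columns="Purchased")
y = df["Purchased"]

# Step 4a: one-hot encode the categorical column (one 0/1 column per city)
encoder = OneHotEncoder()
city_encoded = encoder.fit_transform(X[["City"]]).toarray()

# Step 4b: standardize the numerical columns to mean 0 and unit variance
scaler = StandardScaler()
numeric_scaled = scaler.fit_transform(X[["Age", "Income"]])

X_prepared = np.hstack([city_encoded, numeric_scaled])

# Step 5: hold out 25% of the rows as the testing set
X_train, X_test, y_train, y_test = train_test_split(
    X_prepared, y, test_size=0.25, random_state=42
)
print("Training set:", X_train.shape, "Testing set:", X_test.shape)
```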
LAB – 9
APPLICATION OF STATISTICAL TECHNIQUES
USING PYTHON
Step 1: Import Required Libraries: First, import the necessary libraries for data manipulation, visualization, statistical
analysis, and machine learning.

Step 2: Create or Load a Sample Dataset: Create a sample dataset for demonstration purposes. This dataset
includes information about customer demographics and spending behavior.

Step 3: Descriptive Statistics: Compute and display summary statistics for numerical
variables in the dataset using describe() and check data distribution using value_counts().
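As a sketch of Step 3, describe() and value_counts() can be applied to a small customer table. The dataset used in this lab is not reproduced here, so the rows and column names below are hypothetical.

```python
import pandas as pd

# Hypothetical customer data for illustration
customers = pd.DataFrame({
    "Age": [25, 32, 47, 51, 38, 29],
    "Gender": ["F", "M", "F", "M", "F", "F"],
    "Spending": [220.0, 150.0, 310.0, 180.0, 260.0, 240.0],
})

# describe() summarizes the numerical columns: count, mean, std, min, quartiles, max
summary = customers.describe()
print(summary)

# value_counts() shows how often each category occurs
gender_counts = customers["Gender"].value_counts()
print(gender_counts)
```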

Step 4: Data Visualization: Visualize the dataset using histograms, boxplots, and scatter
plots to understand the distribution and relationships between variables.
Histograms: A histogram is a graphical representation that organizes a group of data points into specified
ranges or bins. It shows the frequency distribution of a continuous variable.
Components
• Bins: Intervals that divide the entire range of data into smaller segments. The width of each bin
represents a range of values, and the height represents the frequency or count of values within that
range.
• Frequency: The number of data points that fall within each bin.

Purpose
• To visualize the distribution of a continuous variable.
• To identify patterns such as skewness, modality, and the presence of outliers.
• To understand the spread and central tendency of the data.
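To make bins and frequencies concrete, the sketch below counts hypothetical ages in five bins of width 10 and then draws the same histogram with Matplotlib. The bin edges and the output file name are illustrative assumptions.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical ages.
ages = np.array([22, 25, 31, 35, 41, 46, 52, 58, 63, 29])

# Five equal-width bins: 20-30, 30-40, 40-50, 50-60, 60-70.
bin_edges = [20, 30, 40, 50, 60, 70]
counts, edges = np.histogram(ages, bins=bin_edges)
print(counts)  # frequency per bin: [3 2 2 2 1]

# Draw the histogram with the same bins.
fig, ax = plt.subplots()
ax.hist(ages, bins=bin_edges, edgecolor="black")
ax.set_title("Distribution of Age")
ax.set_xlabel("Age")
ax.set_ylabel("Frequency")
fig.savefig("age_histogram.png")  # or plt.show() in an interactive session
```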

Boxplot: A box plot (or box-and-whisker plot) is a standardized way of displaying the distribution of data based on a
five-number summary: minimum, first quartile (Q1), median, third quartile (Q3), and maximum.

Components
• Box: Represents the interquartile range (IQR), which is the distance between the first quartile (Q1) and
the third quartile (Q3).
• Median: The line inside the box, indicating the median or 50th percentile.
• Whiskers: Lines extending from the box to the minimum and maximum values within 1.5 times the
IQR from Q1 and Q3.
• Outliers: Points outside the whiskers, representing unusually high or low values.

Purpose
• To provide a visual summary of data dispersion and skewness.
• To identify outliers.
• To compare distributions across different groups or categories.
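The box plot components can be computed by hand before plotting. The sketch below uses hypothetical spending scores in which 150 is a deliberately extreme value, and applies the 1.5 × IQR rule described above.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical scores; 150 is included on purpose as an extreme value.
scores = np.array([35, 42, 48, 55, 60, 66, 70, 78, 85, 150])

# Quartiles and interquartile range.
q1, median, q3 = np.percentile(scores, [25, 50, 75])
iqr = q3 - q1

# Points beyond these fences are drawn as outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = scores[(scores < lower_fence) | (scores > upper_fence)]

print(q1, median, q3, iqr)  # 49.75 63.0 76.0 26.25
print(outliers)             # [150]

# Matplotlib applies the same 1.5 * IQR rule by default.
fig, ax = plt.subplots()
ax.boxplot(scores)
ax.set_title("Spending Score")
fig.savefig("spending_boxplot.png")  # or plt.show() in an interactive session
```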
Scatter plot: A scatter plot is a type of plot that shows the relationship between two continuous variables by
displaying data points on a two-dimensional plane.

Components
• Data Points: Each point represents an observation in the dataset, with its position determined by its
values on the x and y axes.
• Axes: The horizontal (x) axis and vertical (y) axis represent the two variables being compared.

Purpose
• To visualize the relationship between two continuous variables.
• To identify trends
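A scatter plot of two continuous variables might be drawn as in this sketch. The income and score values are hypothetical.

```python
import matplotlib.pyplot as plt

# Hypothetical values: one (income, score) pair per customer.
annual_income = [20, 28, 35, 42, 50, 58, 65, 72, 80, 33]
spending_score = [35, 42, 48, 55, 60, 66, 70, 78, 85, 45]

fig, ax = plt.subplots()
ax.scatter(annual_income, spending_score)  # one point per observation
ax.set_title("Annual Income vs Spending Score")
ax.set_xlabel("Annual Income (thousands)")
ax.set_ylabel("Spending Score")
fig.savefig("income_vs_spending.png")  # or plt.show() in an interactive session
```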

Step 5: Hypothesis Testing: Perform a t-test to compare the means of two variables (Age and
Spending Score).
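A sketch of the test, assuming an independent two-sample t-test from SciPy (scipy.stats.ttest_ind) and a 5% significance level. The manual does not say which variant of the t-test it uses, so both choices are assumptions, and the data is hypothetical.

```python
from scipy import stats

# Hypothetical values for the two variables being compared.
age = [22, 25, 31, 35, 41, 46, 52, 58, 63, 29]
spending_score = [35, 42, 48, 55, 60, 66, 70, 78, 85, 45]

# Null hypothesis: both variables have the same mean.
t_stat, p_value = stats.ttest_ind(age, spending_score)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")

alpha = 0.05  # assumed significance level
if p_value < alpha:
    print("Reject the null hypothesis: the means differ significantly.")
else:
    print("Fail to reject the null hypothesis.")
```

A negative t statistic here means the first sample (Age) has the smaller mean.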
Step 6: Correlation Analysis: Compute the correlation matrix and visualize it using a
heatmap to understand the relationships between variables.
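The sketch below computes a Pearson correlation matrix with pandas and draws it as a heatmap using plain Matplotlib. Seaborn's heatmap function is a common alternative, but it is not assumed to be installed here. The data is hypothetical.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical numerical columns.
customers = pd.DataFrame({
    "Age": [22, 25, 31, 35, 41, 46, 52, 58, 63, 29],
    "Annual_Income": [20, 28, 35, 42, 50, 58, 65, 72, 80, 33],
    "Spending_Score": [35, 42, 48, 55, 60, 66, 70, 78, 85, 45],
})

corr = customers.corr()  # Pearson correlation by default
print(corr)

# Draw the matrix as a colored grid with the coefficient in each cell.
fig, ax = plt.subplots()
image = ax.imshow(corr.to_numpy(), cmap="coolwarm", vmin=-1, vmax=1)
ax.set_xticks(range(len(corr.columns)))
ax.set_xticklabels(corr.columns, rotation=45, ha="right")
ax.set_yticks(range(len(corr.columns)))
ax.set_yticklabels(corr.columns)
for i in range(corr.shape[0]):
    for j in range(corr.shape[1]):
        ax.text(j, i, f"{corr.iloc[i, j]:.2f}", ha="center", va="center")
fig.colorbar(image, ax=ax)
fig.tight_layout()
fig.savefig("correlation_heatmap.png")  # or plt.show() interactively
```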

Step 7: Regression Analysis: Perform linear regression analysis to predict Spending Score
based on Annual Income. Visualize the regression line.
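A minimal sketch of the regression, assuming a least-squares straight-line fit with np.polyfit (the same call used in the trend line example). scikit-learn's LinearRegression would be an alternative. The data is hypothetical.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical data: predictor (income) and response (spending score).
income = np.array([20, 28, 35, 42, 50, 58, 65, 72, 80, 33], dtype=float)
score = np.array([35, 42, 48, 55, 60, 66, 70, 78, 85, 45], dtype=float)

# Fit score = slope * income + intercept by least squares.
slope, intercept = np.polyfit(income, score, 1)
predicted = slope * income + intercept

# R squared: share of the variance in score explained by the line.
ss_res = np.sum((score - predicted) ** 2)
ss_tot = np.sum((score - score.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot
print(f"slope={slope:.3f}, intercept={intercept:.3f}, R^2={r_squared:.3f}")

# Plot the observations and the fitted regression line.
order = np.argsort(income)
fig, ax = plt.subplots()
ax.scatter(income, score, label="Observed")
ax.plot(income[order], predicted[order], color="red", label="Regression line")
ax.set_xlabel("Annual Income (thousands)")
ax.set_ylabel("Spending Score")
ax.legend()
fig.savefig("regression_line.png")  # or plt.show() in an interactive session
```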
LAB – 10
APPLICATION OF EXPLORATORY DATA ANALYSIS

Exploratory Data Analysis (EDA) is a crucial step in the data analysis process. It involves summarizing the
main characteristics of the data, often using visual methods. Here, we’ll go through a step-by-step guide to
performing EDA in Python. We'll cover:

1. Loading the dataset
2. Data overview
3. Descriptive statistics
4. Handling missing data
5. Data visualization
6. Correlation analysis
7. Insights and conclusion

Step 1: Import Required Libraries: Start by importing the necessary libraries for data manipulation and
visualization.

Step 2: Load the Dataset: Load your dataset. For demonstration, we'll create a sample
dataset.
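A sketch of steps 1 and 2, assuming pandas and NumPy. The employee columns and values are hypothetical, and a few entries are left missing on purpose so that the missing-data step has something to work on. With a real file, pd.read_csv would take the place of the DataFrame constructor.

```python
import numpy as np
import pandas as pd

# Hypothetical dataset with a few deliberately missing entries.
employees = pd.DataFrame({
    "Age": [25, 30, np.nan, 45, 38, 29, 52, 41],
    "Salary": [30000, 42000, 50000, np.nan, 61000, 39000, 88000, 67000],
    "Experience": [1, 5, 8, 20, 12, 3, 28, 15],
    "Department": ["Sales", "HR", "IT", "IT", None, "Sales", "HR", "IT"],
})

print(employees)
```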

Step 3: Data Overview: Examine the first few rows, column data types, and summary
statistics.
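The overview could be produced with head(), dtypes, and describe(), as in this sketch on hypothetical data:

```python
import numpy as np
import pandas as pd

# Hypothetical dataset.
df = pd.DataFrame({
    "Age": [25, 30, np.nan, 45, 38, 29, 52, 41],
    "Salary": [30000, 42000, 50000, np.nan, 61000, 39000, 88000, 67000],
    "Department": ["Sales", "HR", "IT", "IT", None, "Sales", "HR", "IT"],
})

first_rows = df.head(3)    # first three rows
column_types = df.dtypes   # data type of each column
summary = df.describe()    # summary statistics for numerical columns

print(first_rows)
print(column_types)
print(summary)
```

The count row of describe() reports non-missing values only, so it already hints at gaps in the data.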
Step 4: Handle Missing Data: Identify and handle missing values in the dataset.
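One possible way to identify and fill missing values is sketched below. Filling numerical gaps with the median or mean and categorical gaps with the most frequent value is an assumed strategy. Dropping incomplete rows with dropna() is another option.

```python
import numpy as np
import pandas as pd

# Hypothetical dataset with missing entries.
df = pd.DataFrame({
    "Age": [25, 30, np.nan, 45, 38, 29, 52, 41],
    "Salary": [30000, 42000, 50000, np.nan, 61000, 39000, 88000, 67000],
    "Department": ["Sales", "HR", "IT", "IT", None, "Sales", "HR", "IT"],
})

# Identify: number of missing values per column.
missing_counts = df.isnull().sum()
print(missing_counts)

# Handle: fill each gap with a typical value of its column.
df_filled = df.copy()
df_filled["Age"] = df_filled["Age"].fillna(df_filled["Age"].median())
df_filled["Salary"] = df_filled["Salary"].fillna(df_filled["Salary"].mean())
df_filled["Department"] = df_filled["Department"].fillna(
    df_filled["Department"].mode()[0]  # most frequent category
)
print(df_filled)
```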
Step 5: Data Visualization: Use various plots to visualize the distribution of variables and
relationships between them.

Histogram: Histograms are powerful tools for visualizing the distribution and variability of data. By
understanding the components and interpretation of histograms, you can effectively analyze and communicate
data trends and patterns.

Box plot: Boxplots are valuable tools for visualizing data distribution, identifying outliers, and comparing
datasets. Their simplicity and effectiveness make them a staple in data analysis and statistics.
Pair plot:
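A pair plot draws a scatter plot for every pair of numerical variables and shows each variable's own distribution on the diagonal. The sketch below uses pandas' scatter_matrix as a stand-in. Seaborn's pairplot is a commonly used alternative, and which one the manual intends is not stated. The data is hypothetical.

```python
import pandas as pd
from pandas.plotting import scatter_matrix

# Hypothetical numerical columns with no missing values.
df = pd.DataFrame({
    "Age": [25, 30, 34, 45, 38, 29, 52, 41],
    "Salary": [30000, 42000, 50000, 70000, 61000, 39000, 88000, 67000],
    "Experience": [1, 5, 8, 20, 12, 3, 28, 15],
})

# One scatter plot per pair of columns, histograms on the diagonal.
axes = scatter_matrix(df, figsize=(6, 6), diagonal="hist")

fig = axes[0, 0].figure
fig.savefig("pair_plot.png")  # or plt.show() in an interactive session
```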

Heatmap of correlation matrix: Heatmaps are a powerful tool for visualizing data distributions, correlations,
and patterns. They offer a compact and intuitive way to present complex data and are widely used in various fields for
data analysis and visualization.
Step 6: Correlation Analysis: Analyze correlations between numerical variables to
understand their relationships.
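Beyond looking at the full matrix, the variable pairs can be ranked by the strength of their correlation. The sketch below is one way to do that on hypothetical data. The upper-triangle mask keeps each pair once and drops the self-correlations.

```python
import numpy as np
import pandas as pd

# Hypothetical numerical columns.
df = pd.DataFrame({
    "Age": [25, 30, 34, 45, 38, 29, 52, 41],
    "Salary": [30000, 42000, 50000, 70000, 61000, 39000, 88000, 67000],
    "Experience": [1, 5, 8, 20, 12, 3, 28, 15],
})

corr = df.corr()  # Pearson correlation matrix

# Keep only the cells above the diagonal.
upper = np.triu(np.ones(corr.shape, dtype=bool), k=1)

# Flatten to one row per pair, strongest correlation first.
pairs = (
    corr.where(upper)
    .stack()
    .dropna()
    .sort_values(key=abs, ascending=False)
)
print(pairs)
```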

Step 7: Insights and Conclusion: Summarize key findings and insights from the analysis.
