1
CODE:
import pandas as pd
student_percentage={"srishti":100,"shiksha":80,"mandvi":30,"taruna":90,"josh":60}
s=pd.Series(student_percentage)
# keep only the students who scored more than 75
print(s[s>75])
OUTPUT:
srishti 100
shiksha 80
taruna 90
dtype: int64
2)AIM:WAP to create a Series using a NumPy array
CODE:
import pandas as pd
import numpy as np
i=[1,2,3,4]
arr=np.array(i)
# build the Series from the NumPy array
s=pd.Series(arr)
print(s)
OUTPUT:
0 1
1 2
2 3
3 4
dtype: int64
3)AIM: write a program to sort the values of Series
object1 in ascending order and store the result in
object2
CODE:
import pandas as pd
s1 = pd.Series([82, 67, 90, 75, 80])
s2 = s1.sort_values(ascending=True)
print(s2)
OUTPUT:
1 67
3 75
4 80
0 82
2 90
dtype: int64
4)AIM:WAP to perform mathematical operations on
two Series objects
CODE:
import pandas as pd
series1 = pd.Series([10, 20, 30, 40, 50])
series2 = pd.Series([5, 10, 15, 20, 25])
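addition_result = series1 + series2
print("\nAddition (series1 + series2):")
print(addition_result)
multiplication_result = series1 * series2
print("\nMultiplication (series1 * series2):")
print(multiplication_result)
OUTPUT:
Addition (series1 + series2):
0 15
1 30
2 45
3 60
4 75
dtype: int64
Multiplication (series1 * series2):
0 50
1 200
2 450
3 800
4 1250
dtype: int64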
5)AIM:WAP for calculating cube of series values
CODE:
import pandas as pd
series = pd.Series([1, 2, 3, 4, 5])
cube_series = series ** 3
print(cube_series)
OUTPUT:
0 1
1 8
2 27
3 64
4 125
dtype: int64
6)AIM:WAP to display attributes of a series
THEORY:
Creating the Series:
We create a pandas Series named series with the values
[10, 20, 30, 40, 50].
series = pd.Series([10, 20, 30, 40, 50])
Accessing Attributes:
• index: This gives the index of the Series (by default, it
is a range of integers starting from 0).
series.index
• dtype: This returns the data type of the elements
in the Series (e.g., int64, float64).
series.dtype
• size: This gives the total number of elements in the
Series.
series.size
• shape: This returns a tuple representing the shape
of the Series (it will have one value, the number of
elements).
series.shape
• values: This returns the actual data as a NumPy array.
series.values
• name: If the Series has a name, this will return it. If
not, it returns None.
series.name
• empty: This checks if the Series is empty (i.e., has
no elements).
series.empty
CODE:
import pandas as pd
series = pd.Series([10, 20, 30, 40, 50])
print("Series Values:")
print(series)
print("\nAttributes of the Series:")
print("Index:", series.index)
print("Data Type:", series.dtype)
print("Size:", series.size)
print("Shape:", series.shape)
print("Values:", series.values)
print("Name of the Series:", series.name)
print("Is Empty:", series.empty)
print("hasnans",series.hasnans)
print("nbytes",series.nbytes)
print("ndim",series.ndim)
print("values.itemsize",series.values.itemsize)
OUTPUT:
Series Values:
0 10
1 20
2 30
3 40
4 50
dtype: int64
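Attributes of the Series:
Index: RangeIndex(start=0, stop=5, step=1)
Data Type: int64
Size: 5
Shape: (5,)
Values: [10 20 30 40 50]
Name of the Series: None
Is Empty: False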
hasnans False
nbytes 40
ndim 1
values.itemsize 8
7)AIM:WAP to display the 3 largest and 3 smallest
numbers in a series
CODE:
n1=int(input("enter the number 1:"))
n2=int(input("enter the number 2:"))
n3=int(input("enter the number 3:"))
# find the largest of the three numbers (>= also handles ties)
if n1>=n2 and n1>=n3:
    largest=n1
elif n2>=n1 and n2>=n3:
    largest=n2
else:
    largest=n3
print("largest among {}, {} and {} is {}".format(n1,n2,n3,largest))
# find the smallest of the three numbers (<= also handles ties)
if n1<=n2 and n1<=n3:
    smallest=n1
elif n2<=n1 and n2<=n3:
    smallest=n2
else:
    smallest=n3
print("smallest among {}, {} and {} is {}".format(n1,n2,n3,smallest))
OUTPUT:
enter the number 1:12
enter the number 2:43
enter the number 3:54
largest among 12, 43 and 54 is 54
smallest among 12, 43 and 54 is 12
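The aim as worded refers to a Series rather than to three typed-in numbers. A minimal sketch of that reading, assuming the pandas methods Series.nlargest() and Series.nsmallest(); the values are made up for illustration:

```python
import pandas as pd

# Illustrative values only; they are not taken from the program above.
s = pd.Series([12, 43, 54, 7, 89, 31])

largest_three = s.nlargest(3)    # three largest values, largest first
smallest_three = s.nsmallest(3)  # three smallest values, smallest first

print("3 largest values:")
print(largest_three)
print("3 smallest values:")
print(smallest_three)
```

Both methods keep the original index labels, so the printed result also shows where each value sat in the Series.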
8)AIM:WAP to create a DataFrame using a nested list
THEORY:
The nested_list contains sublists, where each sublist
represents a row in the DataFrame.
Each row contains multiple values (ID, Name, and Age).
The pd.DataFrame() function is used to create a
DataFrame from the nested list.
It takes two arguments:
• The nested list (nested_list) representing the data.
• The columns list representing the names of the
columns.
CODE:
import pandas as pd
nested_list = [[1, 'jaya', 24],[2, 'sushma', 27],[3, 'rekha',
22],[4, 'amitabh', 30]]
columns = ['ID', 'Name', 'Age']
df = pd.DataFrame(nested_list, columns=columns)
print(df)
OUTPUT:
ID Name Age
0 1 jaya 24
1 2 sushma 27
2 3 rekha 22
3 4 amitabh 30
9)AIM:WAP to create a DataFrame using a dictionary of lists
THEORY:
The data dictionary contains keys representing column
names ('ID', 'Name', and 'Age'), and each key is
associated with a list of values. Each list represents the
values for that particular column.
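The code listing for this program is missing here. A minimal sketch consistent with the theory, assuming the same data dictionary that program 10 uses:

```python
import pandas as pd

# Each key is a column name; each list holds that column's values.
data = {
    'ID': [1, 2, 3, 4],
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [24, 27, 22, 30],
}
df = pd.DataFrame(data)
print(df)
```

All lists must have the same length, because each position across the lists forms one row.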
OUTPUT:
ID Name Age
0 1 Alice 24
1 2 Bob 27
2 3 Charlie 22
3 4 David 30
10)AIM:WAP to display number of rows and columns in
dataframe
CODE:
import pandas as pd
data = {'ID': [1, 2, 3, 4],'Name': ['Alice', 'Bob', 'Charlie',
'David'],'Age': [24, 27, 22, 30]}
df = pd.DataFrame(data)
print(df.shape)
OUTPUT:
(4, 3)
11)AIM:WAP to perform operations on a
dataframe(rename,count,update,replace)
THEORY:
Rename Columns:
• The rename() method is used to change column
names. The inplace=True ensures the DataFrame is
modified directly.
Count Unique Values:
• The value_counts() function counts the
occurrences of each unique value in a column (Location
in this case). In the code it is called before replace(),
so the counts still list 'Houston'.
Update Values:
• Here, we simply add 1 to each value in the Years
column using vectorized operations.
Replace Values:
• The replace() function replaces specified values in
the column ('Houston' with 'Austin' in the Location
column).
CODE:
import pandas as pd
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David',
'Edward'],'Age': [25, 30, 35, 40, 45],'City': ['New York',
'Los Angeles', 'Chicago', 'Houston', 'Phoenix']}
df = pd.DataFrame(data)
df.rename(columns={'City': 'Location', 'Age': 'Years'},
inplace=True)
location_count = df['Location'].value_counts()
df['Years'] = df['Years'] + 1
df['Location'] = df['Location'].replace('Houston',
'Austin')
print("Updated DataFrame:")
print(df)
print("\nCount of Locations:")
print(location_count)
OUTPUT:
Updated DataFrame:
Name Years Location
0 Alice 26 New York
1 Bob 31 Los Angeles
2 Charlie 36 Chicago
3 David 41 Austin
4 Edward 46 Phoenix
Count of Locations:
Location
New York 1
Los Angeles 1
Chicago 1
Houston 1
Phoenix 1
Name: count, dtype: int64
12)AIM:WAP to filter the data of a dataframe
THEORY:
df[df['Age'] > 30] filters rows where the 'Age' column is
greater than 30. This method works by creating a
boolean mask where the condition is True or False.
Combining Conditions:
• (df['Salary'] >= 55000) & (df['Salary'] <= 65000)
combines two conditions using the & operator to
filter rows where 'Salary' falls within the specified
range.
.loc[]:
• df.loc[df['Age'] < 40] is another way to filter rows
based on a condition.
CODE:
import pandas as pd
data = { 'Name': ['Alice', 'Bob', 'Charlie', 'David',
'Edward'], 'Age': [25, 30, 35, 40, 45], 'City': ['New York',
'Los Angeles', 'Chicago', 'Houston', 'Phoenix'], 'Salary':
[50000, 60000, 55000, 70000, 65000]}
df = pd.DataFrame(data)
filtered_age = df[df['Age'] > 30]
print("Filtered by Age > 30:")
print(filtered_age)
OUTPUT:
Filtered by Age > 30:
Name Age City Salary
2 Charlie 35 Chicago 55000
3 David 40 Houston 70000
4 Edward 45 Phoenix 65000
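The theory also describes combining conditions and filtering with .loc[], which the listing above does not exercise. A minimal sketch using the same Name, Age and Salary data (the City column is left out for brevity):

```python
import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Edward'],
        'Age': [25, 30, 35, 40, 45],
        'Salary': [50000, 60000, 55000, 70000, 65000]}
df = pd.DataFrame(data)

# Combine two conditions with &; each condition needs its own parentheses.
salary_range = df[(df['Salary'] >= 55000) & (df['Salary'] <= 65000)]

# .loc[] accepts the same kind of boolean mask.
under_40 = df.loc[df['Age'] < 40]

print("Salary between 55000 and 65000:")
print(salary_range)
print("\nAge < 40:")
print(under_40)
```

The parentheses matter because & binds more tightly than the comparison operators in Python.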
13)AIM: WAP to display attributes of a dataframe
THEORY:
shape: Returns a tuple (rows, columns) representing
the number of rows and columns in the DataFrame.
columns: Returns the list of column names.
index: Returns the index (row labels) of the
DataFrame.
dtypes: Returns the data types of each column.
describe(): Provides summary statistics (e.g., count,
mean, min, max, etc.) for numeric columns in the
DataFrame.
head(): Displays the first 5 rows of the DataFrame (you
can specify the number of rows as an argument).
tail(): Displays the last 5 rows of the DataFrame (you
can specify the number of rows as an argument).
count(): Returns the count of non-null entries for each
column.
size: Returns the total number of elements (rows ×
columns) in the DataFrame.
shape[0]: Returns the total number of rows.
shape[1]: Returns the total number of columns.
CODE:
import pandas as pd
data = {'Name': ['mercury', 'venus', 'earth', 'mars',
'jupiter'],'Age': [25, 30, 35, 40, 45],'City': ['New York',
'Los Angeles', 'Chicago', 'Houston', 'Phoenix'],'Salary':
[50000, 60000, 55000, 70000, 65000]}
df = pd.DataFrame(data)
print("Shape of DataFrame (rows, columns):", df.shape)
print("\nColumn Names:", df.columns)
print("\nIndex of DataFrame:", df.index)
print("\nData Types of Each Column:")
print(df.dtypes)
print("\nSummary Statistics (Numeric Data):")
print(df.describe())
print("\nFirst 5 rows of the DataFrame:")
print(df.head())
print("\nLast 5 rows of the DataFrame:")
print(df.tail())
print("\nNon-null count in each column:")
print(df.count())
print("\nTotal number of elements in the DataFrame:",
df.size)
print("\nTotal number of rows in the DataFrame:",
df.shape[0])
print("\nTotal number of columns in the DataFrame:",
df.shape[1])
OUTPUT:
Shape of DataFrame (rows, columns): (5, 4)
Column Names: Index(['Name', 'Age', 'City', 'Salary'], dtype='object')
Index of DataFrame: RangeIndex(start=0, stop=5, step=1)
Data Types of Each Column:
Name object
Age int64
City object
Salary int64
dtype: object
Summary Statistics (Numeric Data):
Age Salary
count 5.000000 5.00000
mean 35.000000 60000.00000
std 7.905694 7905.69415
min 25.000000 50000.00000
25% 30.000000 55000.00000
50% 35.000000 60000.00000
75% 40.000000 65000.00000
max 45.000000 70000.00000
Non-null count in each column:
Name 5
Age 5
City 5
Salary 5
dtype: int64
14) AIM: given the DataFrame df below, WAP to display only
name, age and position for all rows:
   name    gender  position    city       age  project  budget
0  rabina  F       manager     bangalore  30   13       48
1  evan    M       programmer  new delhi  27   17       13
2  jia     F       manager     chennai    32   16       32
3  lalit   M       manager     mumbai     40   20       21
THEORY:
Passing a list of column names inside df[...] selects those
columns. In this case, we select ['name', 'age', 'position'].
The result is a DataFrame containing only those three
columns.
CODE:
import pandas as pd
data = {'name': ['rabina', 'evan', 'jia', 'lalit'], 'gender':
['F', 'M', 'F', 'M'], 'position': ['manager', 'programmer',
'manager', 'manager'], 'city': ['bangalore', 'new delhi',
'chennai', 'mumbai'],'age': [30, 27, 32, 40], 'project':
[13, 17, 16, 20],'budget': [48, 13, 32, 21]}
df = pd.DataFrame(data)
filtered_df = df[['name', 'age', 'position']]
print(filtered_df)
OUTPUT:
name age position
0 rabina 30 manager
1 evan 27 programmer
2 jia 32 manager
3 lalit 40 manager
15) AIM: WAP to perform writing and reading
operations in a csv file
THEORY:
Writing to CSV (to_csv):
• The to_csv() method is used to save the DataFrame
to a CSV file.
• index=False is used to prevent pandas from writing
row indices (row numbers) into the CSV file.
• The file will be saved as 'employee_data.csv' in the
current working directory.
Reading from CSV (read_csv):
• The read_csv() method is used to read a CSV file
and load it into a pandas DataFrame.
• Here, we read the 'employee_data.csv' file back
into df_read.
CODE:
import pandas as pd
data = {'name': ['rabina', 'evan', 'jia', 'lalit'],'gender': ['F',
'M', 'F', 'M'],'position': ['manager', 'programmer',
'manager', 'manager'],'age': [30, 27, 32, 40]}
df = pd.DataFrame(data)
df.to_csv('employee_data.csv', index=False)
print("Data written to 'employee_data.csv'.")
df_read = pd.read_csv('employee_data.csv')
print("\nData read from 'employee_data.csv':")
print(df_read)
OUTPUT:
Data written to 'employee_data.csv'.
16) AIM:WAP for plotting a line chart
THEORY:
Data Creation: We create a simple dataset with Year
and Sales columns in a pandas DataFrame.
Plotting:
• plt.plot(df['Year'], df['Sales']) plots a line chart with
Year on the x-axis and Sales on the y-axis.
• The marker='o' adds circular markers at each data
point.
• linestyle='-' specifies a solid line between points.
• color='b' sets the color of the line to blue.
• label='Sales' provides a label for the line (used in
the legend).
Title and Labels: The plt.title(), plt.xlabel(), and
plt.ylabel() functions set the title of the chart and the
labels for the axes.
Legend: plt.legend() adds the legend to the chart.
Display: Finally, plt.show() displays the plot.
CODE:
import matplotlib.pyplot as plt
import pandas as pd
data = {'Year': [2017, 2018, 2019, 2020, 2021],'Sales':
[200, 250, 300, 350, 400]}
df = pd.DataFrame(data)
plt.plot(df['Year'], df['Sales'], marker='o', linestyle='-',
color='b', label='Sales')
plt.title('Sales Over Years')
plt.xlabel('Year')
plt.ylabel('Sales')
plt.legend()
plt.show()
OUTPUT: [line chart titled 'Sales Over Years', with Year
on the x-axis and Sales on the y-axis]
17)AIM:WAP for plotting a bar chart from a csv file
THEORY:
Matplotlib: a Python library for creating various types
of charts and graphs.
Bar chart: a chart that uses rectangular bars to represent
data.
bar() function: used to create a bar chart by specifying the
categories and their corresponding values, here read from
a csv file.
CODE:
import pandas as pd
import matplotlib.pyplot as plt
#sample data(create a csv file)
data={
"product":["a","b","c","d","e"],
"sales":[100,150,200,250,300]}
df=pd.DataFrame(data)
df.to_csv("sales.csv",index=False)
df=pd.read_csv("sales.csv")
plt.bar(df["product"],df["sales"],color="blue",width=0.3,
label="product sales")
plt.title("product sales bar chart")
plt.xlabel("products")
plt.ylabel("sales")
plt.legend()
plt.show()
OUTPUT: [bar chart titled 'product sales bar chart', with
products on the x-axis and sales on the y-axis]
18)AIM:write a python program for plotting a
horizontal bar chart from a csv file
THEORY:
Matplotlib: a Python library for creating charts and graphs.
Horizontal bar chart: a chart in which data is
represented with horizontal bars.
barh(): creates a horizontal bar chart by specifying
categories and their corresponding values, here read from
a csv file.
CODE:
import pandas as pd
import matplotlib.pyplot as plt
data={ "product":["a","b","c","d","e"],
"sales":[100,150,200,250,300]}
df=pd.DataFrame(data)
df.to_csv("sales.csv",index=False)
df=pd.read_csv("sales.csv")
plt.barh(df["product"],df["sales"],color="green",
label="product sales")
plt.title("product sales horizontal bar chart")
# in a horizontal bar chart the values run along the x-axis
plt.xlabel("sales")
plt.ylabel("products")
plt.legend()
plt.show()
OUTPUT: [horizontal bar chart titled 'product sales
horizontal bar chart', one bar per product]
19)AIM:write a python program for plotting histogram
THEORY:
import matplotlib.pyplot as plt: This imports the
matplotlib.pyplot module, which provides functions to
plot the histogram.
import numpy as np: This imports the numpy library,
which helps generate random numbers for creating the
data.
data = np.random.randint(1, 101, 100): This
generates an array of 100 random integers from 1 to
100 (the upper bound 101 is excluded), representing our
dataset.
plt.hist(data, bins=10, edgecolor='black'): This plots
the histogram using the data array, dividing the data
into 10 bins (or intervals). The edgecolor='black' makes
the borders of the bars black, which improves visibility.
plt.title('Histogram of Random Data'): This adds a
title to the histogram.
plt.xlabel('Value') and plt.ylabel('Frequency'): These
add labels to the x-axis (representing the data values)
and the y-axis (representing the frequency of values in
each bin).
plt.show(): This function displays the histogram.
CODE:
import matplotlib.pyplot as plt
import numpy as np
data = np.random.randint(1, 101, 100)
plt.hist(data, bins=10, edgecolor='black')
plt.title('Histogram of Random Data')
plt.xlabel('Value')
plt.ylabel('Frequency')
plt.show()
OUTPUT: [histogram titled 'Histogram of Random Data', with
Value on the x-axis and Frequency on the y-axis]
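The counting behind a histogram can be inspected without drawing anything. A minimal sketch, assuming NumPy's np.histogram() and a seeded generator from np.random.default_rng() (neither appears in the program above; the seed is only there to make the run repeatable):

```python
import numpy as np

rng = np.random.default_rng(0)          # fixed seed so the run is repeatable
data = rng.integers(1, 101, size=100)   # 100 integers from 1 to 100

# np.histogram returns the count in each bin and the bin edges.
counts, edges = np.histogram(data, bins=10)

for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"{left:6.1f} to {right:6.1f}: {count}")
```

With bins=10 there are 11 edges, and the counts add up to the number of data points.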
20)AIM:create a table student with student id, name,
marks and country as attributes, where the student id
is the primary key. Create a new table named passion
with the attributes hobby id (primary key) and name of
hobby. Add a new attribute named hobby id with default
value 1 to the student table, which refers to the hobby id
attribute of the passion table.
THEORY:
passion table:
• hobby_id: The primary key of this table, uniquely
identifying each hobby.
• name_of_hobby: A text field that stores the name
of the hobby.
student table:
• student_id: The primary key of this table, uniquely
identifying each student.
• name: Name of the student.
• marks: Marks scored by the student.
• country: Country of the student.
• hobby_id: A foreign key referencing the hobby_id
from the passion table, with a default value of 1.
The foreign key constraint ensures referential integrity
by ensuring that any hobby_id in the student table
must exist in the passion table.
CODE:
CREATE TABLE passion(hobby_id INT PRIMARY
KEY,name_of_hobby VARCHAR (255) NOT NULL);
OUTPUT:
hobby_id | name_of_hobby
1 | Reading
2 | Painting
3 | Cycling
4 | Photography
5 | Cooking
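The listing above shows only the passion table. A minimal sketch of both tables and the foreign key, run through Python's built-in sqlite3 module rather than MySQL (an assumption made so that the example is self-contained); the student rows are made up for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("""CREATE TABLE passion (
    hobby_id INTEGER PRIMARY KEY,
    name_of_hobby VARCHAR(255) NOT NULL)""")
conn.execute("""CREATE TABLE student (
    student_id INTEGER PRIMARY KEY,
    name VARCHAR(255),
    marks INTEGER,
    country VARCHAR(255),
    hobby_id INTEGER DEFAULT 1,
    FOREIGN KEY (hobby_id) REFERENCES passion(hobby_id))""")

conn.execute("INSERT INTO passion VALUES (1, 'Reading'), (2, 'Painting')")

# hobby_id is omitted, so the default value 1 is stored.
conn.execute("INSERT INTO student (student_id, name, marks, country) "
             "VALUES (1, 'John Doe', 85, 'USA')")
row = conn.execute("SELECT name, hobby_id FROM student").fetchone()
print(row)

# A hobby_id that is absent from passion violates referential integrity.
rejected = False
try:
    conn.execute("INSERT INTO student VALUES (2, 'Jane Smith', 92, 'Canada', 99)")
except sqlite3.IntegrityError as err:
    rejected = True
    print("rejected:", err)
```

The second insert fails because no row in passion has hobby_id 99, which is the referential integrity check described in the theory.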
21) AIM:find the min, max, sum and average of the marks
in the student table
THEORY:
• MIN(marks): Returns the minimum value of marks.
• MAX(marks): Returns the maximum value of
marks.
• SUM(marks): Returns the sum of all marks.
• AVG(marks): Returns the average of all marks.
CODE:
SELECT MIN(marks) AS min_marks,MAX(marks) AS
max_marks, SUM(marks) AS total_marks, AVG(marks)
AS average_marks FROM student;
OUTPUT:
min_marks | max_marks | total_marks | average_marks
78 | 95 | 438 | 87.6000
22)AIM:find the total number of students from each
country in the table using group by
THEORY:
Create Table: The fellows table has two columns:
country (a VARCHAR to store the country name) and
total_students (an INT to store the count of students).
Insert Data: The INSERT INTO statement adds rows of
country and student count to the fellows table.
GROUP BY: The SELECT statement groups the rows of the
student table by country and counts the students in each
group.
CODE:
CREATE TABLE fellows (country VARCHAR(255),
total_students INT);
INSERT INTO fellows (country, total_students) VALUES
('USA', 50), ('Canada', 30), ('India', 40), ('UK', 20),
('canada', 10), ('india', 45);
SELECT country, COUNT(*) AS total_students FROM
student GROUP BY country;
OUTPUT:
country | total_students
USA | 1
Canada | 1
UK | 1
Australia | 1
India | 1
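Grouped counts can also be stored in another table with INSERT ... SELECT. A minimal sketch, run through Python's built-in sqlite3 module rather than MySQL (an assumption made so that the example is self-contained); the student rows are made up so that some countries occur more than once:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student ("
             "student_id INTEGER PRIMARY KEY, name TEXT, country TEXT)")
# Made-up rows for illustration only.
conn.executemany(
    "INSERT INTO student (name, country) VALUES (?, ?)",
    [("A", "India"), ("B", "India"), ("C", "USA"),
     ("D", "UK"), ("E", "USA"), ("F", "India")],
)

conn.execute("CREATE TABLE fellows (country VARCHAR(255), total_students INT)")
# GROUP BY makes one group per country; COUNT(*) counts the rows in each group.
conn.execute("INSERT INTO fellows (country, total_students) "
             "SELECT country, COUNT(*) FROM student GROUP BY country")

rows = conn.execute(
    "SELECT country, total_students FROM fellows ORDER BY country").fetchall()
print(rows)
```

Each country appears once in fellows, with the number of students that belong to it.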
23)AIM:write a sql query to display the marks without
decimal places
THEORY:
ROUND(marks, 0): This rounds the marks to 0 decimal
places.
AS marks_without_decimal: This gives the resulting
column a more descriptive name.
CODE:
SELECT ROUND(marks, 0) AS marks_without_decimal
FROM student;
OUTPUT:
| marks_without_decimal |
85
92
78
88
95
24)AIM:write a sql query to display the remainder after
dividing marks by 3
THEORY:
• marks % 3: This calculates the remainder when
marks is divided by 3.
• AS remainder: This assigns the alias remainder to
the result.
CODE:
SELECT marks % 3 AS remainder FROM student;
OUTPUT:
remainder
1
2
0
1
2
25)AIM:write a sql query to display the square of marks.
THEORY:
marks * marks: This multiplies the marks column by
itself, effectively calculating the square.
AS square_of_marks: This gives the resulting column a
name (square_of_marks)
CODE:
SELECT marks * marks AS square_of_marks from
student;
OUTPUT:
square_of_marks
7225
8464
6084
7744
9025
26)AIM: write a sql query to display names in
uppercase and lowercase
THEORY:
UPPER(name): Converts the name column to
uppercase.
LOWER(name): Converts the name column to
lowercase.
AS name_in_uppercase and AS name_in_lowercase:
These are aliases that will be used to name the
resulting columns.
CODE:
SELECT UPPER(name) AS
name_in_uppercase,LOWER(name) AS
name_in_lowercase FROM student;
OUTPUT:
name_in_uppercase | name_in_lowercase |
27) AIM:write a sql query to display first 3 letters of
name
THEORY:
LEFT(name, 3): This function extracts the first 3
characters from the name column.
AS first_three_letters: This gives the resulting column
a name (first_three_letters).
CODE:
SELECT LEFT(name, 3) AS first_three_letters FROM
STUDENT;
OUTPUT:
first_three_letters
Joh
Jan
Emm
Mic
Oli
28)AIM: write a sql query to display last 3 letters of
name
THEORY:
RIGHT(name, 3): This function extracts the last 3
characters from the name column.
AS last_three_letters: This gives the resulting column
a name (last_three_letters).
CODE:
SELECT RIGHT(name, 3) AS last_three_letters FROM
student;
OUTPUT:
last_three_letters
+--------------------+
Doe
ith
own
son
ams
29)AIM:write a sql query to display the position of the
letter a in name
THEORY:
INSTR(name, 'a'): This function returns the position of
the first occurrence of 'a' in the name column.
AS position_of_a: This gives the resulting column a
name (position_of_a).
CODE:
SELECT INSTR(name, 'a') AS position_of_a FROM
STUDENT;
OUTPUT:
position_of_a
+---------------+
0
2
4
5
6
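The 1-based position that INSTR() reports, with 0 meaning "not found", can be reproduced in Python. A minimal sketch, assuming a hypothetical helper position_of() and the three names visible in the output of program 33; lowercasing is used here on the assumption that the comparison is case-insensitive, as it is for non-binary strings in MySQL:

```python
# Names taken from the join output of program 33.
names = ["John Doe", "Jane Smith", "Emma brown"]

def position_of(letter, text):
    # str.find() is 0-based and returns -1 when the letter is absent,
    # so adding 1 gives a 1-based position with 0 meaning "not found".
    return text.lower().find(letter.lower()) + 1

positions = [position_of("a", name) for name in names]
for name, pos in zip(names, positions):
    print(name, pos)
```

The three positions agree with the first three rows of the output above: 0, 2 and 4.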
30)AIM:write sql query to display day name,month
name for today’s date
THEORY:
CURDATE(): Returns the current date.
%W: Returns the full name of the day (e.g., "Monday").
%M: Returns the full name of the month (e.g.,
"December")
CODE:
SELECT DATE_FORMAT(CURDATE(), '%W') AS
day_name,DATE_FORMAT(CURDATE(), '%M') AS
month_name;
OUTPUT:
| day_name  | month_name |
| Wednesday | December   |
31) AIM:write sql query to display the day of the month
and the day name for today's date
THEORY:
NOW(): Returns the current date and time.
DAY(): Returns the day of the month.
DAYNAME(): Returns the name of the day of the week.
CODE:
SELECT day (now()) AS
day_of_the_month,dayname(now()) AS
dayname_of_week;
OUTPUT:
day_of_the_month | dayname_of_week |
+------------------+-----------------+
| 25 | Wednesday |
32)AIM:write a sql query to display the day of the year
for today's date
THEORY:
CURDATE(): Returns the current date.
DAYOFYEAR(CURDATE()): Returns the day of the year
(e.g., for January 25th, it would return 25).
CODE:
SELECT DAYOFYEAR(CURDATE()) AS day_of_year;
OUTPUT:
+-------------+
| day_of_year |
+-------------+
| 360 |
33) AIM:write sql query to display student
id,name,hobbyname from the tables where hobbyid of
student table=hobbyid of passion table and marks>60
THEORY:
JOIN passion p ON s.hobbyid = p.hobbyid: This
performs an inner join between the student table
(aliased as s) and the passion table (aliased as p),
matching rows where the hobbyid in both tables are
the same.
WHERE s.marks > 60: This condition filters the results
to include only students who have marks greater than
60.
SELECT s.student_id, s.name, p.hobby_name: This
selects the columns student_id and name from the
student table and hobby_name from the passion table.
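CODE:
SELECT s.student_id, s.name, p.hobby_name FROM
student s JOIN passion p ON s.hobbyid = p.hobbyid
WHERE s.marks > 60;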
OUTPUT:
+------------+------------+------------+
| STUDENT_ID | NAME       | HOBBY_NAME |
+------------+------------+------------+
| 1          | John Doe   | Reading    |
| 2          | Jane Smith | Painting   |
| 3          | Emma brown | Cycling    |
+------------+------------+------------+
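A minimal sketch of the join described in the theory, run through Python's built-in sqlite3 module rather than MySQL (an assumption made so that the example is self-contained). The marks and hobby ids assigned to each student are assumed, and the fourth student is made up to show the marks filter at work:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE passion (hobbyid INTEGER PRIMARY KEY, hobby_name TEXT)")
conn.execute("CREATE TABLE student ("
             "student_id INTEGER PRIMARY KEY, name TEXT, "
             "marks INTEGER, hobbyid INTEGER)")

conn.executemany("INSERT INTO passion VALUES (?, ?)",
                 [(1, "Reading"), (2, "Painting"), (3, "Cycling")])
# Marks and hobby ids are assumed; 'Sample Student' is a made-up row.
conn.executemany("INSERT INTO student VALUES (?, ?, ?, ?)",
                 [(1, "John Doe", 85, 1), (2, "Jane Smith", 92, 2),
                  (3, "Emma brown", 78, 3), (4, "Sample Student", 55, 1)])

query = """
SELECT s.student_id, s.name, p.hobby_name
FROM student s
JOIN passion p ON s.hobbyid = p.hobbyid
WHERE s.marks > 60
ORDER BY s.student_id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The made-up fourth student is matched by the join but dropped by the WHERE clause, because the marks are not greater than 60.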