
Experiment 8

The document consists of various NumPy and Pandas lab exercises aimed at teaching array operations, matrix multiplication, DataFrame creation, and data analysis. It includes tasks such as creating and manipulating NumPy arrays and DataFrames, performing mathematical operations, filtering, grouping, and analyzing sales data. Each section provides sample code and expected output to illustrate the concepts.

Uploaded by

sanoopsamson77
Copyright
© All Rights Reserved

NumPy Lab Questions

1.​ NumPy Array Operations

Objective: Understand NumPy array creation, indexing, and mathematical operations.

Tasks:

1.​ Create a 1D NumPy array with five random integers between 10 and 50.
2.​ Create a 2D NumPy array (3×3) filled with random values.
3. Perform the following operations on the 1D array:
   o Compute the sum, mean, maximum, and minimum values.
   o Replace all even numbers with -1.
4.​ Slice the 2D array to extract the first two rows and last two columns.

Program:
import numpy as np

# Task 1: five random integers starting at 10.
# randint excludes its upper bound, so this draws from 10..49; pass 51 to include 50.
array_1d = np.random.randint(10, 50, size=5)
print("1D Numpy Array=", array_1d)

# Task 2: 3x3 array of random floats in [0, 1)
array_2d = np.random.random((3, 3))
print("2D Numpy Array=", array_2d)

# Task 3: summary statistics of the 1D array
print("Sum of the 1d array=", array_1d.sum())
print("Mean of the 1d array", array_1d.mean())
print("Max of the 1d array", array_1d.max())
print("Min of the 1d array", array_1d.min())

# Task 3: replace every even number with -1 using a boolean mask (in place)
array_1d[array_1d % 2 == 0] = -1
print("Updated 1d Array(Replaced even with -1)=", array_1d)

# Task 4: first two rows and last two columns of the 2D array
print("Last 2 columns of first 2 rows of 2D Array=", array_2d[:2, -2:])

Output:
1D Numpy Array= [48 22 25 40 21]
2D Numpy Array= [[0.83544465 0.20750278 0.81936127]
 [0.37480703 0.92931545 0.44149517]
 [0.7756679  0.51987102 0.4371961 ]]
Sum of the 1d array= 156
Mean of the 1d array 31.2
Max of the 1d array 48
Min of the 1d array 21
Updated 1d Array(Replaced even with -1)= [-1 -1 25 -1 21]
Last 2 columns of first 2 rows of 2D Array= [[0.20750278 0.81936127]
 [0.92931545 0.44149517]]
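
Because the arrays are random, the numbers printed above change on every run. The following is a minimal sketch, assuming NumPy's Generator interface is acceptable for the lab, of how the same tasks could be made repeatable. The seed value 42 and the name replaced are illustrative choices, not part of the exercise. It also uses np.where, which returns a new array instead of overwriting the original.

```python
import numpy as np

# Seeded generator: the seed 42 is an arbitrary value chosen for this sketch.
rng = np.random.default_rng(42)

# endpoint=True makes the upper bound inclusive, so 50 itself can be drawn.
array_1d = rng.integers(10, 50, size=5, endpoint=True)

# np.where builds a new array; array_1d keeps its original values.
replaced = np.where(array_1d % 2 == 0, -1, array_1d)

print("Original:", array_1d)
print("Evens replaced:", replaced)
```

Running the sketch twice prints the same values both times, which makes a lab record easier to check.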

2.​ NumPy Matrix Multiplication

Objective: Perform matrix operations using NumPy.

Tasks:

1.​ Create two 2×2 matrices using NumPy arrays with integer values.
2.​ Perform element-wise addition and subtraction of these matrices.
3.​ Compute the matrix multiplication (dot product) of the two matrices.
4.​ Find the transpose and determinant of one of the matrices.

Program:
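import numpy as np

# Task 1: two 2x2 integer matrices
a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

# Task 2: element-wise addition and subtraction
print(a + b)
print(a - b)

# Task 3: matrix multiplication with the @ operator
print(a @ b)

# Task 4: transpose and determinant of a
print(a.transpose())
print(np.linalg.det(a))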

Output:
[[ 6  8]
 [10 12]]
[[-4 -4]
 [-4 -4]]
[[19 22]
 [43 50]]
[[1 3]
 [2 4]]
-2.0000000000000004
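
The determinant prints as -2.0000000000000004 rather than -2 because np.linalg.det computes in floating point. The sketch below reuses the same two matrices to show one way of comparing such a result against the exact value, and contrasts element-wise * with matrix @. The names elementwise, matmul, and det_rounded are illustrative.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

# * multiplies matching entries; @ performs row-by-column matrix multiplication.
elementwise = a * b   # [[ 5 12] [21 32]]
matmul = a @ b        # [[19 22] [43 50]]

# Compare floating-point results with a tolerance instead of ==.
det = np.linalg.det(a)
print(np.isclose(det, -2.0))

# For an integer matrix the exact determinant is an integer, so rounding recovers it.
det_rounded = int(round(float(det)))
print(det_rounded)
```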

3. NumPy Array Indexing and Slicing


Objective: Learn how to extract specific elements from NumPy arrays.

Tasks:

1.​ Create a 3×3 NumPy array with values from 10 to 90, increasing by 10.
2.​ Extract the middle row and middle column from the array.
3.​ Replace all values greater than 50 with 0.
4.​ Convert the 1D array [1, 2, 3, 4, 5, 6, 7, 8, 9] into a 3×3 matrix.

Program:
import numpy as np

# Task 1: values 10, 20, ..., 90 arranged as a 3x3 array
array = np.arange(10, 100, 10).reshape(3, 3)
print("3x3 Numpy array=", array)

# Task 2: middle row and middle column
print("Middle Row=", array[1, :])
print("Middle Column=", array[:, 1])

# Task 3: set every value greater than 50 to 0 (modifies the array in place)
array[array > 50] = 0
print("Updated array (Convert >50 to 0)=", array)

# Task 4: reshape a 1D array of nine elements into a 3x3 matrix
a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
print("1d Array=", a)
print("1d to 3x3 Matrix= ", a.reshape(3, 3))

Output:
3x3 Numpy array= [[10 20 30]
 [40 50 60]
 [70 80 90]]
Middle Row= [40 50 60]
Middle Column= [20 50 80]
Updated array (Convert >50 to 0)= [[10 20 30]
 [40 50  0]
 [ 0  0  0]]
1d Array= [1 2 3 4 5 6 7 8 9]
1d to 3x3 Matrix=  [[1 2 3]
 [4 5 6]
 [7 8 9]]
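
One detail of reshape is easy to miss: for a contiguous array like the one in Task 4, it returns a view that shares memory with the original, so writing to the matrix also changes the 1D array. The sketch below illustrates this under that assumption; the names m and independent are illustrative.

```python
import numpy as np

a = np.arange(1, 10)          # 1, 2, ..., 9

# -1 lets NumPy infer the missing dimension (3 columns here).
m = a.reshape(3, -1)

# m is a view on a, so this write is visible through both names.
m[0, 0] = 100
print(a[0])

# .copy() gives a matrix with its own memory.
independent = a.reshape(3, 3).copy()
independent[0, 0] = -5
print(a[0])
```

Both print calls show 100, since only the view is tied to the original array.
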
Pandas Lab Questions

1.​ Creating and Manipulating DataFrames

Objective: Create and modify Pandas DataFrames.

Tasks:

1.​ Create a Pandas DataFrame from the following dictionary:

data = {"Name": ["Alice", "Bob", "Charlie"], "Age": [25, 30, 35], "City": ["NY", "LA", "Chicago"]}

2.​ Add a new column called "Salary" with values [50000, 60000, 55000].
3.​ Display only the "Name" and "Age" columns.
4.​ Update Bob’s age to 32 and display the modified DataFrame.

Program:
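import pandas as pd

data = {"Name": ["Alice", "Bob", "Charlie"], "Age": [25, 30, 35],
        "City": ["NY", "LA", "Chicago"]}

# Task 1: build the DataFrame from the dictionary
df = pd.DataFrame(data)

# Task 2: add the Salary column
df["Salary"] = [50000, 60000, 55000]

# Task 3: show only the Name and Age columns
print(df[["Name", "Age"]])

# Task 4: update Bob's age by selecting his row with a boolean condition
df.loc[df["Name"] == "Bob", "Age"] = 32

print("Updated Dataframe:")
print(df.head())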

Output:
      Name  Age
0    Alice   25
1      Bob   30
2  Charlie   35
Updated Dataframe:
      Name  Age     City  Salary
0    Alice   25       NY   50000
1      Bob   32       LA   60000
2  Charlie   35  Chicago   55000
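
As an alternative to modifying df step by step, the same result can be reached without touching the original frame. The following is a sketch only, using assign to add the column and a Name index to address Bob's row by label. The names with_salary and by_name are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Alice", "Bob", "Charlie"],
                   "Age": [25, 30, 35],
                   "City": ["NY", "LA", "Chicago"]})

# assign returns a new DataFrame; df itself gains no Salary column.
with_salary = df.assign(Salary=[50000, 60000, 55000])

# With Name as the index, a single cell can be addressed by row and column label.
by_name = with_salary.set_index("Name")
by_name.at["Bob", "Age"] = 32

print(by_name)
```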

2.​ Reading and Analyzing CSV Data

Objective: Load and analyze real-world data using Pandas.

Tasks:

1.​ Load the dataset from a CSV file (e.g., sales_data.csv).


2.​ Display the first five rows of the dataset.
3.​ Find the total number of rows and columns in the dataset.
4.​ Calculate the average sales per product and display it.
5.​ Check for missing values and replace them with an appropriate value.

Program:
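import pandas as pd

# Task 1: load the dataset
df = pd.read_csv('sales_data.csv')

# Task 2: first five rows
print(df.head(5))

# Task 3: shape is a (rows, columns) tuple
print("Total number of rows and columns:")
print(df.shape)

# Task 4: average revenue per product
print(df.groupby("Product")["Revenue"].mean())

# Task 5: replace missing values with 0
df.fillna(0, inplace=True)
print("\nDataFrame after replacing missing values with 0:")
print(df)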

Output:
         Date  Day     Month  Year  Customer_Age       Age_Group  ...  Order_Quantity  Unit_Cost  Unit_Price  Profit  Cost  Revenue
0  2013-11-26   26  November  2013            19     Youth (<25)  ...               8         45         120     590   360      950
1  2015-11-26   26  November  2015            19     Youth (<25)  ...               8         45         120     590   360      950
2  2014-03-23   23     March  2014            49  Adults (35-64)  ...              23         45         120    1366  1035     2401
3  2016-03-23   23     March  2016            49  Adults (35-64)  ...              20         45         120    1188   900     2088
4  2014-05-15   15       May  2014            47  Adults (35-64)  ...               4         45         120     238   180      418

[5 rows x 18 columns]
Total number of rows and columns:
(113036, 18)
Product
AWC Logo Cap                   126.025700
All-Purpose Bike Stand         758.106195
Bike Wash - Dissolver          110.169069
Classic Vest, L                845.732704
Classic Vest, M                871.137500
                                  ...
Touring-3000 Yellow, 62        786.583333
Water Bottle - 30 oz.           69.113860
Women's Mountain Shorts, L     938.331395
Women's Mountain Shorts, M    1001.744713
Women's Mountain Shorts, S     961.753226
Name: Revenue, Length: 130, dtype: float64

DataFrame after replacing missing values with 0:
              Date  Day     Month  Year  Customer_Age  ...  Unit_Cost  Unit_Price  Profit  Cost  Revenue
0       2013-11-26   26  November  2013            19  ...         45         120     590   360      950
1       2015-11-26   26  November  2015            19  ...         45         120     590   360      950
2       2014-03-23   23     March  2014            49  ...         45         120    1366  1035     2401
3       2016-03-23   23     March  2016            49  ...         45         120    1188   900     2088
4       2014-05-15   15       May  2014            47  ...         45         120     238   180      418
...            ...  ...       ...   ...           ...  ...        ...         ...     ...   ...      ...
113031  2016-04-12   12     April  2016            41  ...         24          64     112    72      184
113032  2014-04-02    2     April  2014            18  ...         24          64     655   528     1183
113033  2016-04-02    2     April  2016            18  ...         24          64     655   528     1183
113034  2014-03-04    4     March  2014            37  ...         24          64     684   576     1260
113035  2016-03-04    4     March  2016            37  ...         24          64     655   552     1207

[113036 rows x 18 columns]
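
Task 5 asks for missing values to be checked before they are replaced. A minimal sketch of that check follows. Since sales_data.csv is not reproduced here, it uses a small made-up DataFrame as a stand-in. The product names, the revenue figures, and the choice of the column mean as the fill value are all assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature stand-in for sales_data.csv (values invented for this sketch).
df = pd.DataFrame({
    "Product": ["Cap", "Cap", None, "Vest"],
    "Revenue": [100.0, np.nan, 300.0, 500.0],
})

# Count the missing entries in each column before changing anything.
missing_counts = df.isna().sum()
print(missing_counts)

# Fill each column with a value suited to its type instead of a blanket 0.
filled = df.fillna({"Product": "Unknown", "Revenue": df["Revenue"].mean()})
print(filled)
```

Whether 0, the mean, or a placeholder string is the appropriate replacement depends on what the column represents.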

3.​ Filtering and Grouping Data in Pandas

Objective: Perform filtering and aggregation operations on data.

Tasks:

1. Create a Pandas DataFrame for the following sales data:

data = {"Product": ["Laptop", "Phone", "Tablet", "Laptop", "Phone"],
        "Sales": [1200, 800, 600, 1300, 950],
        "Region": ["North", "South", "East", "West", "North"]}

2. Filter the records where sales > 1000.
3. Group the data by "Region" and find the total sales in each region.
4. Sort the DataFrame based on Sales in descending order.

Program:
import pandas as pd

data = {"Product": ["Laptop", "Phone", "Tablet", "Laptop", "Phone"],
        "Sales": [1200, 800, 600, 1300, 950],
        "Region": ["North", "South", "East", "West", "North"]}

df = pd.DataFrame(data)
print("Original data:\n", df)

# Keep only the rows where Sales exceeds 1000
filtered_data = df[df["Sales"] > 1000]
print("Filtered data(Sales>1000):\n", filtered_data)

# Total sales per region
grouped = df.groupby("Region")["Sales"].sum()
print("Grouped Data:\n", grouped)

# Sort by Sales, highest first (named sorted_df to avoid shadowing the built-in sorted)
sorted_df = df.sort_values("Sales", ascending=False)
print("Sorted Data:\n", sorted_df)

Output:
Original data:
   Product  Sales Region
0  Laptop   1200  North
1   Phone    800  South
2  Tablet    600   East
3  Laptop   1300   West
4   Phone    950  North
Filtered data(Sales>1000):
   Product  Sales Region
0  Laptop   1200  North
3  Laptop   1300   West
Grouped Data:
 Region
East      600
North    2150
South     800
West     1300
Name: Sales, dtype: int64
Sorted Data:
   Product  Sales Region
3  Laptop   1300   West
0  Laptop   1200  North
4   Phone    950  North
1   Phone    800  South
2  Tablet    600   East
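
Grouping is not limited to a single total per region. As a sketch on the same sales data, named aggregation can compute several statistics in one pass. The column names total, average, and orders are illustrative labels chosen here.

```python
import pandas as pd

data = {"Product": ["Laptop", "Phone", "Tablet", "Laptop", "Phone"],
        "Sales": [1200, 800, 600, 1300, 950],
        "Region": ["North", "South", "East", "West", "North"]}
df = pd.DataFrame(data)

# Each keyword becomes an output column; its value names the aggregation to apply.
summary = df.groupby("Region")["Sales"].agg(total="sum", average="mean", orders="count")

# Rank the regions by total sales, highest first.
top = summary.sort_values("total", ascending=False)
print(top)
```

North leads with a total of 2150 from two orders, which matches the grouped output above.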

4.​ Analyzing Sales Data with Pandas

Objective: Use Pandas to clean, analyze, and visualize sales data.

Tasks:

1. Load a CSV file containing sales data (date, product, sales, region).
2. Handle missing values and format the date column properly.
3. Group data by product and region to analyze total sales trends.

Program:
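import pandas as pd

df = pd.read_csv("sales_data.csv")

# Fill missing values column by column: the median for Revenue, a placeholder for text columns
df.fillna({
    "Revenue": df["Revenue"].median(),
    "State": "Unknown",
    "Product": "Unknown"
}, inplace=True)

# Parse the Date column; entries that cannot be parsed become NaT
df["Date"] = pd.to_datetime(df["Date"], errors='coerce')

# Total revenue for each (Product, State) pair; State serves as the region in this dataset
grouped_sales = df.groupby(["Product", "State"])["Revenue"].sum()
print(grouped_sales)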

Output:
AWC Logo Cap                Alberta                720
                            Bayern               11450
                            Brandenburg           2112
                            British Columbia     65436
                            California          109891
                                                   ...
Women's Mountain Shorts, S  South Australia       8725
                            Tasmania              1020
                            Val de Marne           332
                            Victoria             20406
                            Washington           66835
Name: Revenue, Length: 2705, dtype: int64
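
The grouped result is a long Series indexed by product and state. To read it as a table, or to look at sales over time, the following sketch may help. It uses a small invented DataFrame in place of sales_data.csv, so the dates, products, states, and revenue figures are all assumptions, as are the names table and monthly.

```python
import pandas as pd

# Hypothetical stand-in data; the last date is deliberately invalid.
df = pd.DataFrame({
    "Date": ["2016-01-05", "2016-01-20", "2016-02-03", "not a date"],
    "Product": ["Cap", "Vest", "Cap", "Cap"],
    "State": ["Alberta", "Alberta", "Victoria", "Victoria"],
    "Revenue": [100, 250, 150, 50],
})

# errors="coerce" turns the unparseable entry into NaT instead of raising.
df["Date"] = pd.to_datetime(df["Date"], errors="coerce")

# unstack moves State from the row index into columns, giving a Product x State table.
table = df.groupby(["Product", "State"])["Revenue"].sum().unstack(fill_value=0)
print(table)

# Monthly totals, computed only from rows whose date parsed successfully.
valid = df.dropna(subset=["Date"])
monthly = valid.groupby(valid["Date"].dt.to_period("M"))["Revenue"].sum()
print(monthly)
```

Rows with an invalid date still count toward the product and state totals but are left out of the monthly trend.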
