Ids 1

The document outlines a practical file for a BCA course on Data Science, detailing various experiments involving data manipulation using Python's pandas library. Each experiment includes code snippets and expected outputs, covering topics such as creating Series and DataFrames, statistical calculations, and data visualization techniques. The file serves as a comprehensive guide for students to apply their theoretical knowledge in practical scenarios.

Uploaded by

rawatsumit9902

DEPARTMENT OF INFORMATION COMMUNICATION & TECHNOLOGY
PRACTICAL FILE
OF
BCA 212P
(INTRODUCTION OF DATA SCIENCE)
Academic session: 2024-25
Batch: 2023-26

Submitted to:
Ms. Mansi Jaiswal
(Assistant Professor)

Submitted by:
Name: Harsh Negi
Enrolment no: 00717002023
Program: BCA
Semester: 4th
Shift: 1st
Division: A
INDEX
S.no  Experiments  Sign
1  Create a pandas Series from a dictionary of values and an ndarray.
2  Create a Series and print all the elements that are above the 75th percentile.
3  Perform sorting on Series data and DataFrames.
4  Write a program to implement pivot() and pivot_table() on a DataFrame.
5  Write a program to find the mean absolute deviation on a DataFrame.
6  Two Series objects: Population stores the details of four metro cities of India, and AvgIncome stores the total average income reported in four years in these cities. Calculate income per capita for each of these metro cities.
7  Create a DataFrame based on E-Commerce data and generate mean, mode, median.
8  Create a DataFrame based on employee data and generate quartiles and variance.
9  Program to implement skewness on random data.
10  Create a DataFrame on any data and compute its kurtosis.
11  Series objects Temp1, Temp2, Temp3 and Temp4 store the temperatures of the days of week 1, week 2, week 3 and week 4. Write a script to:
    a. Print average temperature per week
    b. Print average temperature of entire month
12  Write a program to read a CSV file and create its DataFrame.
13  Consider the DataFrame QtrSales where each row contains the item category, item name and expenditure. Group the rows by category and print the average expenditure per category.
14  Create a DataFrame having age, name, weight of five students. Write a program to display only the weight of the first and fourth rows.
15  Write a program to create a DataFrame to store weight, age and name of three people. Print the DataFrame and its transpose.

Experiment -1
Create a pandas series from a dictionary of values and an
ndarray.
Code: -
import pandas as pd
import numpy as np
data=np.array([1,2,3,4,5])
Series1=pd.Series(data)
print(Series1)
data_dict={"a":10,"b":20,"c":30}
Series2=pd.Series(data_dict)
print(Series2)

Output: -
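The same two constructors also accept an explicit index. The sketch below is an illustrative addition, not part of the submitted program; the labels "p" to "t" and the missing key "d" are made up for the example.

```python
import numpy as np
import pandas as pd

# From an ndarray: without index= pandas numbers the rows 0..n-1;
# here illustrative labels are supplied instead.
values = np.array([1, 2, 3, 4, 5])
labelled = pd.Series(values, index=["p", "q", "r", "s", "t"])

# From a dictionary: keys become the index. Passing index= selects and
# reorders keys, and a label missing from the dictionary ("d") becomes NaN.
data_dict = {"a": 10, "b": 20, "c": 30}
reordered = pd.Series(data_dict, index=["c", "a", "d"])

print(labelled)
print(reordered)
```

Because of the NaN entry, the second Series is stored as floating point rather than integer.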

Experiment-2
Create a Series and print all the elements that are above 75th
percentile.
Code: -
import pandas as pd
import numpy as np

# Create a random Series
np.random.seed(42)  # For reproducibility
s = pd.Series(np.random.randint(1, 100, 10))  # 10 random integers from 1 to 99 (upper bound 100 is exclusive)
print("Original Series:\n", s)

# Calculate 75th percentile
percentile_75 = s.quantile(0.75)
print("\n75th Percentile:", percentile_75)

# Filter and print elements above 75th percentile
above_75th = s[s > percentile_75]
print("\nElements above 75th percentile:\n", above_75th)

Output: -

Experiment-3
Perform sorting on Series data and DataFrames.
Code: -
import pandas as pd

# Create a Series
my_series = pd.Series([5, 1, 9, 2, 7])
print("Original Series:\n", my_series)

# Sort the Series (smallest to largest)
sorted_series = my_series.sort_values()
print("\nSorted Series:\n", sorted_series)

# Sort Series from largest to smallest
sorted_series_desc = my_series.sort_values(ascending=False)
print("\nSorted Series (Descending):\n", sorted_series_desc)

# --- Sorting DataFrames (Easy) ---

# Create a DataFrame
data = {'Name': ['Charlie', 'Alice', 'Bob'],
        'Age': [25, 30, 22]}
my_df = pd.DataFrame(data)
print("\nOriginal DataFrame:\n", my_df)

# Sort the DataFrame by 'Age' (youngest to oldest)
sorted_df = my_df.sort_values(by='Age')
print("\nSorted DataFrame by Age:\n", sorted_df)

# Sort the DataFrame by 'Name' (alphabetical order)
sorted_df_name = my_df.sort_values(by='Name')
print("\nSorted DataFrame by Name:\n", sorted_df_name)

# Sort the DataFrame by 'Age' (oldest to youngest)
sorted_df_desc_age = my_df.sort_values(by='Age', ascending=False)
print("\nSorted DataFrame by Age (Descending):\n", sorted_df_desc_age)

Output: -

Experiment-4
Write a program to implement pivot() and pivot_table() on a
DataFrame.
Code: -
import pandas as pd

# Sample DataFrame
data = {
'Date': ['2023-01-01', '2023-01-01', '2023-01-02', '2023-01-02', '2023-01-03'],
'Category': ['A', 'B', 'A', 'B', 'A'],
'Value': [10, 20, 15, 25, 30]
}

df = pd.DataFrame(data)

# Display the original DataFrame


print("Original DataFrame:")
print(df)

# Using pivot() to reshape the DataFrame


pivot_df = df.pivot(index='Date', columns='Category', values='Value')
print("\nPivoted DataFrame using pivot():")
print(pivot_df)

# Using pivot_table() to reshape the DataFrame


# Here we will use pivot_table to handle potential duplicates by taking the mean
pivot_table_df = df.pivot_table(index='Date', columns='Category', values='Value',
                                aggfunc='mean')
print("\nPivoted DataFrame using pivot_table():")
print(pivot_table_df)

Output: -
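The sample data in this experiment has no repeated (Date, Category) pair, so pivot() and pivot_table() give the same table. The sketch below is an illustrative addition using a small hypothetical DataFrame (dup_df) in which one pair is repeated, to show where the two functions differ.

```python
import pandas as pd

# Hypothetical data: the ('2023-01-01', 'A') pair appears twice.
dup_df = pd.DataFrame({
    "Date": ["2023-01-01", "2023-01-01", "2023-01-02"],
    "Category": ["A", "A", "B"],
    "Value": [10, 20, 15],
})

# pivot() cannot decide which of the two values to keep, so it raises ValueError.
try:
    dup_df.pivot(index="Date", columns="Category", values="Value")
    pivot_failed = False
except ValueError as exc:
    pivot_failed = True
    print("pivot() failed:", exc)

# pivot_table() aggregates the duplicates instead (mean of 10 and 20 is 15).
table = dup_df.pivot_table(index="Date", columns="Category",
                           values="Value", aggfunc="mean")
print(table)
```

Cells with no data for a given date and category are left as NaN in the result.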

Experiment-5
Write a program to find mean absolute deviation on a
DataFrame.
Code: -
import pandas as pd

# Sample DataFrame
data = {
'A': [1, 2, 3, 4, 5],
'B': [5, 6, 7, 8, 9],
'C': [10, 11, 12, 13, 14]
}

df = pd.DataFrame(data)
print("Original DataFrame:\n", df)

# Function to calculate Mean Absolute Deviation
def mean_absolute_deviation(df):
    # Calculate the mean of each column
    mean = df.mean()
    # Calculate the absolute deviation from the mean
    absolute_deviation = abs(df - mean)
    # Calculate the mean of the absolute deviations
    mad = absolute_deviation.mean()
    return mad

# Calculate Mean Absolute Deviation for the DataFrame
mad_result = mean_absolute_deviation(df)

# Display the result
print("Mean Absolute Deviation for each column:")
print(mad_result)

Output: -
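The built-in DataFrame.mad() method was removed in pandas 2.0, which is why the deviation is computed by hand in this experiment. As a quick check, the sketch below (an illustrative addition) repeats the calculation in one chained expression and compares column A against a hand calculation.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [1, 2, 3, 4, 5],
    "B": [5, 6, 7, 8, 9],
    "C": [10, 11, 12, 13, 14],
})

# Same computation as the function above, written as one chained expression.
mad = (df - df.mean()).abs().mean()

# Hand check for column A: mean = 3, deviations = 2, 1, 0, 1, 2, so MAD = 6 / 5.
mad_a_by_hand = (2 + 1 + 0 + 1 + 2) / 5

print(mad)
print("Column A by hand:", mad_a_by_hand)
```

All three columns are the same sequence shifted by a constant, so they share the same mean absolute deviation.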

Experiment-6
Two Series objects: Population stores the details of four metro
cities of India, and AvgIncome stores the total average income
reported in four years in these cities.
Calculate income per capita for each of these metro cities.
Code:-
import pandas as pd

# Example data for Population (in millions)
# Sample data: three cities are used here for illustration
Population = pd.Series({
    'Dehradun': 20.4,
    'Almora': 18.9,
    'Nainital': 12.3,
})
print("Population of Different cities:")
print(Population, end="\n\n")

# Example data for AvgIncome (in millions)
AvgIncome = pd.Series({
    'Dehradun': 150,
    'Almora': 120,
    'Nainital': 100,
})
print("Average Income of Different cities:")
print(AvgIncome, end="\n\n")

# Calculate income per capita
IncomePerCapita = AvgIncome / Population

# Display the result


print("IncomePerCapita of Different Cities:")
print(IncomePerCapita)

Output:-
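The division in this experiment works because both Series use the same city labels. pandas matches values by label, not by position. The sketch below is an illustrative addition with hypothetical city names and figures, showing what happens when the labels do not match.

```python
import pandas as pd

# Hypothetical figures; the labels and values are for illustration only.
population = pd.Series({"CityA": 20.0, "CityB": 10.0, "CityC": 5.0})
income = pd.Series({"CityB": 120.0, "CityA": 150.0, "CityD": 90.0})

# Division matches values by label, so the different ordering does not matter.
per_capita = income / population
print(per_capita)

# CityC and CityD each appear in only one Series, so their result is NaN.
# dropna() keeps only the cities present in both.
common = per_capita.dropna()
print(common)
```

A misspelled city name in one of the two Series would therefore show up as NaN in the result rather than as an error.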

Experiment-7
Create a DataFrame based on E-Commerce data and generate
mean, mode, median.
Code:-
import pandas as pd

# Sample E-Commerce data


data = {
'OrderID': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
'Product': ['Laptop', 'Smartphone', 'Tablet', 'Laptop', 'Smartphone',
'Tablet', 'Laptop', 'Smartphone', 'Tablet', 'Laptop'],
'Quantity': [1, 2, 1, 1, 3, 2, 1, 1, 2, 1],
'Price': [1000, 500, 300, 1000, 500, 300, 1000, 500, 300, 1000]
}

# Create DataFrame
ecommerce_df = pd.DataFrame(data)

# Display the DataFrame


print("E-Commerce Dataframe:")
print(ecommerce_df)
print("\n")

# Calculate mean
mean_price = ecommerce_df['Price'].mean()

# Calculate mode
mode_price = ecommerce_df['Price'].mode()[0]

# Calculate median
median_price = ecommerce_df['Price'].median()

# Display the results


print(f"Mean Price: {mean_price}")
print(f"Mode Price: {mode_price}")
print(f"Median Price: {median_price}")

Output:-

Experiment-8
Create a DataFrame based on employee data and generate
quartile and variance.
Code:-
import pandas as pd
# Sample employee data
data = {
'EmployeeID': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
'Name': ['Krishna', 'Murali', 'Chaitanya', 'Shyam', 'Govind',
'Madhav', 'Gopal', 'Gopal', 'Murari', 'Keshava'],
'Age': [25, 30, 35, 40, 28, 32, 45, 50, 29, 38],
'Salary': [50000, 60000, 70000, 80000, 55000,
62000, 75000, 90000, 58000, 72000],
'YearsAtCompany': [1, 2, 3, 4, 1, 2, 5, 6, 2, 3]
}
# Create DataFrame
employee_df = pd.DataFrame(data)
# Display the DataFrame
print("Employees Data:")
print(employee_df)
# Calculate quartiles
quartiles_salary = employee_df['Salary'].quantile([0.25, 0.5, 0.75])
quartiles_years = employee_df['YearsAtCompany'].quantile([0.25, 0.5, 0.75])

# Calculate variance

variance_salary = employee_df['Salary'].var()
variance_years = employee_df['YearsAtCompany'].var()
# Display the results
print("\nQuartiles for Salary:")
print(quartiles_salary)
print("\nQuartiles for Years at Company:")
print(quartiles_years)
print(f"\nVariance for Salary: {variance_salary}")
print(f"Variance for Years at Company: {variance_years}")

Output: -
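Series.var() returns the sample variance by default (ddof=1, dividing by n - 1). The sketch below is an illustrative addition that reuses the Salary values from this experiment to show the population variance (ddof=0) alongside it.

```python
import pandas as pd

salary = pd.Series([50000, 60000, 70000, 80000, 55000,
                    62000, 75000, 90000, 58000, 72000])

sample_var = salary.var()            # default ddof=1: divide by n - 1
population_var = salary.var(ddof=0)  # divide by n

# The two results differ by the factor (n - 1) / n.
n = len(salary)
print("Sample variance:    ", sample_var)
print("Population variance:", population_var)
print("Ratio:", population_var / sample_var, "expected:", (n - 1) / n)
```

Which of the two is appropriate depends on whether the ten employees are treated as a sample or as the whole population.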

Experiment-9
Program to implement Skewness on Random data.
Code: -
# Program to implement Skewness on Random data.
import numpy as np
from scipy.stats import skew
# Generate random data
data = np.random.normal(1, 100, 15)  # 15 samples, mean 1, standard deviation 100
print("Random Numbers:")
print(data)
# Calculate skewness
data_skewness = skew(data)
# Print the skewness
print(f"\nSkewness of the data: {data_skewness}")

Output: -

Experiment-10
Create a DataFrame on any data and compute its kurtosis.
Code: -
import pandas as pd
from scipy.stats import kurtosis

# Step 1: Create a sample DataFrame


data = {
'EmployeeID': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
'Name':['Krishna', 'Murali', 'Chaitanya', 'Shyam', 'Govind',
'Madhav', 'Gopal', 'Gopal', 'Murari', 'Keshava'],
'Age': [25, 30, 35, 40, 28, 32, 45, 50, 29, 38],
'Salary': [50000, 60000, 70000, 80000, 55000,
62000, 75000, 90000, 58000, 72000],
'YearsAtCompany': [1, 2, 3, 4, 1, 2, 5, 6, 2, 3]
}

# Create DataFrame
employee_df = pd.DataFrame(data)

# Display the DataFrame


print("Employee DataFrame:")
print(employee_df)

# Step 2: Compute kurtosis for the 'Salary' column


kurtosis_salary = kurtosis(employee_df['Salary'], fisher=True)  # Fisher's definition (subtracts 3)

# Display the kurtosis result
print(f"\nKurtosis of Salary: {kurtosis_salary}")

Output: -
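scipy.stats.kurtosis uses bias=True by default, so the value printed in this experiment is the biased (population) excess kurtosis. The pandas method Series.kurt() applies a small-sample correction and gives a different number for the same data. The sketch below is an illustrative addition that computes both by hand with NumPy so the relationship is visible.

```python
import numpy as np
import pandas as pd

salary = pd.Series([50000, 60000, 70000, 80000, 55000,
                    62000, 75000, 90000, 58000, 72000], dtype=float)

# Biased (population) excess kurtosis: m4 / m2**2 - 3, using central moments.
x = salary.to_numpy()
dev = x - x.mean()
m2 = np.mean(dev ** 2)
m4 = np.mean(dev ** 4)
biased_excess = m4 / m2 ** 2 - 3

# Small-sample correction that turns the biased value into the pandas value.
n = len(x)
corrected = (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * biased_excess + 6)

print("Biased excess kurtosis:", biased_excess)
print("Corrected by formula:  ", corrected)
print("Series.kurt():         ", salary.kurt())
```

With only ten values the correction is noticeable, so it is worth stating which definition a reported kurtosis uses.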

Experiment-11
Series objects Temp1, Temp2, Temp3 and Temp4 store the
temperatures of the days of week 1, week 2, week 3 and week 4.
Write a script to:
a. Print average temperature per week
b. Print average temperature of entire month
Code: -
import pandas as pd

# Sample temperature data for four weeks (7 days each)


data = {
'Week 1': [30, 32, 31, 29, 28, 30, 31], # Week 1
'Week 2': [31, 30, 29, 32, 33, 31, 30], # Week 2
'Week 3': [28, 29, 30, 31, 32, 30, 29], # Week 3
'Week 4': [30, 31, 32, 33, 34, 30, 31] # Week 4
}

# Create DataFrame
temperature_df = pd.DataFrame(data)

# Display the DataFrame


print("Temperature DataFrame:")
print(temperature_df)

# a. Print average temperature per week


avg_temp_per_week = temperature_df.mean()
print("\nAverage temperature per week:")

print(avg_temp_per_week)

# b. Print average temperature of entire month


avg_temp_month = temperature_df.values.flatten().mean()
print(f"\nAverage temperature for the entire month: {avg_temp_month:.2f}°C")

Output: -
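The problem statement describes four separate Series objects, while the program above stores the weeks as columns of one DataFrame. The sketch below is an alternative, illustrative version that keeps the four Series separate, using the same temperature values; the lowercase variable names are a choice made for this example.

```python
import pandas as pd

# One Series per week, same values as in the DataFrame version.
temp1 = pd.Series([30, 32, 31, 29, 28, 30, 31])
temp2 = pd.Series([31, 30, 29, 32, 33, 31, 30])
temp3 = pd.Series([28, 29, 30, 31, 32, 30, 29])
temp4 = pd.Series([30, 31, 32, 33, 34, 30, 31])

weeks = {"Week 1": temp1, "Week 2": temp2, "Week 3": temp3, "Week 4": temp4}

# a. Average temperature per week
for name, temps in weeks.items():
    print(f"{name}: {temps.mean():.2f}")

# b. Average temperature of the entire month: join all 28 readings, then average.
month = pd.concat(list(weeks.values()), ignore_index=True)
month_avg = month.mean()
print(f"Entire month: {month_avg:.2f}")
```

Since every week has seven readings, the monthly average equals the mean of the four weekly averages; with weeks of unequal length only the concatenated version would be correct.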

Experiment-12
Write a Program to read a CSV file and create its DataFrame.
Code: -
CSV File
EmployeeID,Name,Age,Salary
1,Shyam,30,50000
2,Gopal,25,60000
3,Madhav,35,70000
4,Keshava,40,80000
5,Murari,28,55000
Python File
import pandas as pd

# Step 1: Read the CSV file


file_path = 'L12.csv' # Make sure this path is correct
employee_df = pd.read_csv(file_path)

# Step 2: Display the DataFrame


print("Employee DataFrame:")
print(employee_df)

# Optional: Display basic information about the DataFrame


print("\nBasic Information about the DataFrame:")
employee_df.info()  # info() prints its summary itself and returns None

# Optional: Display the first few rows of the DataFrame


print("\nFirst few rows of the DataFrame:")
print(employee_df.head())

Output: -
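The program above needs the file L12.csv to exist next to the script. To try read_csv() without creating a file, the same text can be wrapped in io.StringIO, which pandas accepts in place of a path. The sketch below is an illustrative addition using the CSV contents listed in this experiment.

```python
import io
import pandas as pd

# The CSV contents from this experiment, held in a string instead of a file.
csv_text = """EmployeeID,Name,Age,Salary
1,Shyam,30,50000
2,Gopal,25,60000
3,Madhav,35,70000
4,Keshava,40,80000
5,Murari,28,55000
"""

# read_csv() accepts any file-like object, so StringIO stands in for the file.
employee_df = pd.read_csv(io.StringIO(csv_text))

print(employee_df)
print(employee_df.dtypes)  # the header row becomes the column names
```

The numeric columns are parsed as integers automatically, and Name is read as text.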

Experiment-13
Consider the DataFrame QtrSales where each row contains
the item category, item name and expenditure and group the
rows by category, and print the average expenditure per
category.
Code: -
import pandas as pd

# Sample data for QtrSales DataFrame


data = {
'Category': ['Electronics', 'Electronics', 'Clothing', 'Clothing', 'Groceries',
'Groceries'],
'Item': ['Laptop', 'Smartphone', 'T-shirt', 'Jeans', 'Milk', 'Bread'],
'Expenditure': [1200, 800, 50, 60, 30, 20]
}

# Create DataFrame
QtrSales = pd.DataFrame(data)

# Display the DataFrame


print("QtrSales DataFrame:")
print(QtrSales)

# Group by 'Category' and calculate the average expenditure


average_expenditure = QtrSales.groupby('Category')['Expenditure'].mean()

# Display the average expenditure per category
print("\nAverage Expenditure per Category:")
print(average_expenditure)

Output: -
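The same grouping can report several statistics at once through agg(). The sketch below is an illustrative addition that reuses the QtrSales sample data; the variable name qtr_sales and the extra "sum" and "count" columns go beyond what the experiment asks for.

```python
import pandas as pd

qtr_sales = pd.DataFrame({
    "Category": ["Electronics", "Electronics", "Clothing", "Clothing",
                 "Groceries", "Groceries"],
    "Item": ["Laptop", "Smartphone", "T-shirt", "Jeans", "Milk", "Bread"],
    "Expenditure": [1200, 800, 50, 60, 30, 20],
})

# One row per category, one column per statistic.
summary = qtr_sales.groupby("Category")["Expenditure"].agg(["mean", "sum", "count"])
print(summary)
```

The "mean" column matches the average expenditure printed by the program above.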

Experiment-14
Create a DataFrame having age, name, weight of five
students. Write a program to display only the weight of first
and fourth rows.
Code: -
import pandas as pd

# Sample data for five students


data = {
'Name': ['Madhav', 'Shyam', 'Murari', 'Gopal', 'Keshava'],
'Age': [20, 21, 19, 22, 20],
'Weight': [55, 70, 60, 80, 65] # Weight in kg
}

# Create DataFrame
students_df = pd.DataFrame(data)

# Display the DataFrame


print("Students DataFrame:")
print(students_df)

# Display the weight of the first and fourth rows


weights = students_df.iloc[[0, 3]]['Weight']

print("\nWeight of the first and fourth students:")


print(weights)
Output: -
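The expression iloc[[0, 3]]['Weight'] selects the rows first and the column second. Both steps can also be done in a single indexing call. The sketch below is an illustrative addition showing a position-based and a label-based form on the same student data.

```python
import pandas as pd

students_df = pd.DataFrame({
    "Name": ["Madhav", "Shyam", "Murari", "Gopal", "Keshava"],
    "Age": [20, 21, 19, 22, 20],
    "Weight": [55, 70, 60, 80, 65],  # Weight in kg
})

# Position-based: rows 0 and 3, with the column looked up by its position.
by_position = students_df.iloc[[0, 3], students_df.columns.get_loc("Weight")]

# Label-based: works here because the default index labels are 0..4.
by_label = students_df.loc[[0, 3], "Weight"]

print(by_position)
print(by_label)
```

If the DataFrame had a custom index, such as student names, only the iloc form would still mean "first and fourth rows".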

Experiment-15
Write a program to create a DataFrame to store weight, age
and name of three people. Print the DataFrame and its
transpose.
Code: -
import pandas as pd

# Sample data for three people


data = {
'Name': ['Keshava', 'Madhav', 'Murari'],
'Age': [25, 30, 35],
'Weight': [55, 70, 80] # Weight in kg
}

# Create DataFrame
people_df = pd.DataFrame(data)

# Display the DataFrame


print("DataFrame:")
print(people_df)

# Print the transpose of the DataFrame


print("\nTranspose of the DataFrame:")
print(people_df.T)

Output: -
