
Google Cloud Data Analytics Lab Experiments

Experiment 1: Simple Sales Report


Aim:

To generate a sales report by cleaning missing data, computing total sales, and identifying top-performing products.

Algorithm:

- Load the dataset into a Pandas DataFrame.

- Fill missing values in the 'Price' column with the average price of the respective product.

- Create a new column: Total_Sales = Quantity × Price.

- Identify the product with the highest total sales.

- Visualize total sales by product using a bar chart.

Procedure:

- Open Google Colab and upload the dataset or use sample data.

- Handle missing 'Price' values by filling in average prices using group-by.

- Create a new column for total sales.

- Group by product and sum total sales.

- Identify the highest selling product.

- Plot a bar chart using matplotlib.

Code:
import pandas as pd
import matplotlib.pyplot as plt

# Sample sales data with missing prices.
data = {
    'Product': ['Pen', 'Pencil', 'Notebook', 'Pen', 'Pencil', 'Notebook'],
    'Quantity': [10, 15, 5, 12, 18, 7],
    'Price': [5, None, 20, 5, 3, None]
}
df = pd.DataFrame(data)

# Fill missing prices with the mean price of the same product.
df['Price'] = df.groupby('Product')['Price'].transform(lambda x: x.fillna(x.mean()))

# Total sales per row, then summed per product.
df['Total_Sales'] = df['Quantity'] * df['Price']
sales_by_product = df.groupby('Product')['Total_Sales'].sum()
top_product = sales_by_product.idxmax()

# Bar chart of product-wise totals.
sales_by_product.plot(kind='bar', title='Total Sales by Product')
plt.ylabel('Total Sales')
plt.show()
print("Product with highest total sales:", top_product)

Sample Output:

Product with highest total sales: Notebook

Result:

The program successfully computes and visualizes product-wise sales and identifies the top-selling item.

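The Procedure above also allows uploading a dataset rather than using the inline sample. A minimal sketch of that path, using an in-memory CSV string as a stand-in for the uploaded file (the contents below simply mirror the sample data; in Colab you would pass the uploaded file's path to pd.read_csv instead):

```python
import io

import pandas as pd

# Hypothetical contents of an uploaded sales.csv; blank Price fields
# become NaN when parsed, just like None in the sample dictionary.
csv_text = """Product,Quantity,Price
Pen,10,5
Pencil,15,
Notebook,5,20
Pen,12,5
Pencil,18,3
Notebook,7,
"""
df = pd.read_csv(io.StringIO(csv_text))

# Same cleaning and aggregation steps as in the experiment's code.
df['Price'] = df.groupby('Product')['Price'].transform(lambda x: x.fillna(x.mean()))
df['Total_Sales'] = df['Quantity'] * df['Price']
top_product = df.groupby('Product')['Total_Sales'].sum().idxmax()
print("Product with highest total sales:", top_product)
```

Because the CSV mirrors the sample data, this produces the same result as the dictionary-based version.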
Experiment 2: Daily Temperature Tracker
Aim:

To process temperature data, handle missing values, and visualize average temperature trends over time.

Algorithm:

- Load the temperature dataset.

- Fill missing values in Min_Temp and Max_Temp with their column means.

- Calculate Average_Temp = (Min_Temp + Max_Temp)/2.

- Find the date with the highest average temperature.

- Plot a line graph of average temperature over time.

Procedure:

- Load the dataset with dates, min temp, and max temp.

- Use fillna() to replace nulls with column averages.

- Compute Average_Temp and add to the DataFrame.

- Use idxmax() to find the date with the highest average.

- Plot temperature trends over time.

Code:
import pandas as pd
import matplotlib.pyplot as plt

# Five days of temperature readings with missing values.
data = {
    'Date': pd.date_range(start='2023-01-01', periods=5),
    'Min_Temp': [21, 23, None, 22, 25],
    'Max_Temp': [30, None, 35, 31, 34]
}
df = pd.DataFrame(data)

# Fill missing readings with the column means (assignment is used
# instead of inplace=True, which is deprecated for chained fillna calls).
df['Min_Temp'] = df['Min_Temp'].fillna(df['Min_Temp'].mean())
df['Max_Temp'] = df['Max_Temp'].fillna(df['Max_Temp'].mean())

# Daily average and the hottest day.
df['Average_Temp'] = (df['Min_Temp'] + df['Max_Temp']) / 2
hottest_day = df.loc[df['Average_Temp'].idxmax(), 'Date']

plt.plot(df['Date'], df['Average_Temp'], marker='o')
plt.title("Average Temperature Over Time")
plt.xlabel("Date")
plt.ylabel("Average Temp")
plt.grid(True)
plt.show()
print("Date with highest average temperature:", hottest_day.date())

Sample Output:

Date with highest average temperature: 2023-01-05

Result:

The trend line provides a visual representation of temperature changes, and the hottest day is identified.

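Mean-filling ignores the time ordering of the readings. For time series like this, linear interpolation between neighbouring days is a common alternative; a sketch on the same sample data (not part of the original experiment):

```python
import pandas as pd

# Same temperature data as in the experiment.
df = pd.DataFrame({
    'Date': pd.date_range(start='2023-01-01', periods=5),
    'Min_Temp': [21, 23, None, 22, 25],
    'Max_Temp': [30, None, 35, 31, 34]
})

# interpolate() fills each gap from its neighbouring values instead of
# the global column mean, so the fill respects the day-to-day trend.
df[['Min_Temp', 'Max_Temp']] = df[['Min_Temp', 'Max_Temp']].interpolate()
df['Average_Temp'] = (df['Min_Temp'] + df['Max_Temp']) / 2
print(df['Average_Temp'].tolist())
```

Here the missing Min_Temp on day 3 becomes 22.5 (midway between 23 and 22) rather than the column mean of 22.75; the hottest day is unchanged.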

Experiment 3: COVID-19 Daily Cases


Aim:

To analyze COVID-19 daily case data by cleaning missing values and visualizing trends.

Algorithm:

- Load dataset with Date and Cases.

- Fill missing case values with 0.

- Calculate total and average daily cases.

- Find the date with the highest case count.

- Plot a line chart of daily cases.

Procedure:

- Import dataset into a DataFrame.

- Use fillna(0) for missing cases.

- Use sum() and mean() to compute totals.

- Use idxmax() for the peak day.

- Plot the data as a line graph.

Code:
import pandas as pd
import matplotlib.pyplot as plt

# Daily case counts with missing entries.
data = {
    'Date': pd.date_range(start='2023-01-01', periods=5),
    'Cases': [100, None, 250, 400, None]
}
df = pd.DataFrame(data)

# Treat missing days as zero cases (assignment instead of the
# deprecated chained inplace=True pattern).
df['Cases'] = df['Cases'].fillna(0)

# Summary statistics and the peak day.
total_cases = df['Cases'].sum()
average_cases = df['Cases'].mean()
peak_day = df.loc[df['Cases'].idxmax(), 'Date']

plt.plot(df['Date'], df['Cases'], marker='o')
plt.title("COVID-19 Daily Cases")
plt.xlabel("Date")
plt.ylabel("Cases")
plt.grid(True)
plt.show()
print("Total cases:", total_cases)
print("Average daily cases:", average_cases)
print("Date with highest number of cases:", peak_day.date())

Sample Output:

Total cases: 750.0

Average daily cases: 150.0

Date with highest number of cases: 2023-01-04

Result:

The program correctly shows the daily case trend and identifies the peak infection date with a clear graph.

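Daily case counts are often noisy, so a rolling average is a standard way to smooth the trend line. A sketch on the same sample data (the 3-day window is an arbitrary illustrative choice, not part of the original experiment):

```python
import pandas as pd

# Same case data as in the experiment, with missing days set to 0.
df = pd.DataFrame({
    'Date': pd.date_range(start='2023-01-01', periods=5),
    'Cases': [100, None, 250, 400, None]
})
df['Cases'] = df['Cases'].fillna(0)

# 3-day rolling mean; min_periods=1 keeps the first days defined
# instead of NaN while the window is still filling up.
df['Rolling_Avg'] = df['Cases'].rolling(window=3, min_periods=1).mean()
print(df['Rolling_Avg'].tolist())
```

Plotting Rolling_Avg alongside Cases on the same axes would show the smoothed trend against the raw daily counts.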
Experiment 4: Movie Ratings Dataset
Aim:

To analyze movie ratings and identify top movies based on viewer feedback.

Algorithm:

- Load the movie dataset.

- Remove entries with missing ratings.

- Calculate the average rating.

- Find the top 3 movies with the highest ratings.

- Display a bar chart of top 5 movies.

Procedure:

- Load the movie dataset.

- Drop rows with null ratings.

- Use mean() to get average rating.

- Use nlargest() to get top movies.

- Visualize with matplotlib.

Code:
import pandas as pd
import matplotlib.pyplot as plt

# Movie ratings; one rating is missing and its row will be dropped.
data = {
    'Movie_Name': ['Movie A', 'Movie B', 'Movie C', 'Movie D', 'Movie E', 'Movie F'],
    'Viewer_Rating': [4.5, 4.8, None, 4.2, 4.9, 4.3]
}
df = pd.DataFrame(data)
df.dropna(inplace=True)

# Overall average and the highest-rated movies.
average_rating = df['Viewer_Rating'].mean()
top_movies = df.nlargest(3, 'Viewer_Rating')
top_5 = df.nlargest(5, 'Viewer_Rating')

# Bar chart of the five highest-rated movies.
plt.bar(top_5['Movie_Name'], top_5['Viewer_Rating'], color='skyblue')
plt.title("Top 5 Movie Ratings")
plt.ylabel("Rating")
plt.xticks(rotation=45)
plt.show()
print("Average Rating:", average_rating)
print("Top 3 Movies:")
print(top_movies[['Movie_Name', 'Viewer_Rating']])

Sample Output:

Average Rating: 4.54

Top 3 Movies:

  Movie_Name  Viewer_Rating
4    Movie E            4.9
1    Movie B            4.8
0    Movie A            4.5

Result:

The program identifies and displays the top 3 movies with a supporting bar chart visualization.

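nlargest(3, 'Viewer_Rating') is equivalent to sorting the column in descending order and taking the first three rows; a quick check on the same sample data:

```python
import pandas as pd

# Same ratings data as in the experiment, with the null row dropped.
df = pd.DataFrame({
    'Movie_Name': ['Movie A', 'Movie B', 'Movie C', 'Movie D', 'Movie E', 'Movie F'],
    'Viewer_Rating': [4.5, 4.8, None, 4.2, 4.9, 4.3]
}).dropna()

top_3 = df.nlargest(3, 'Viewer_Rating')
# Equivalent spelling with an explicit sort.
top_3_sorted = df.sort_values('Viewer_Rating', ascending=False).head(3)
print(top_3['Movie_Name'].tolist())
```

Both spellings yield Movie E, Movie B, Movie A; nlargest is simply the more direct idiom for "top N by column".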
Experiment 5: Online Course Completion Data
Aim:

To analyze student course completion status and visualize completion vs non-completion.

Algorithm:

- Load the dataset.

- Replace missing Completion_Status with 'No'.

- Count 'Yes' and 'No' entries.

- Plot a pie chart of the results.

Procedure:

- Load the dataset into Pandas.

- Replace nulls in Completion_Status with 'No'.

- Use value_counts() to count 'Yes' and 'No'.

- Visualize using a pie chart.

Code:
import pandas as pd
import matplotlib.pyplot as plt

# Completion records; a missing status is treated as not completed.
data = {
    'Student_ID': [101, 102, 103, 104, 105],
    'Completion_Status': ['Yes', None, 'No', 'Yes', None]
}
df = pd.DataFrame(data)

# Assignment instead of the deprecated chained inplace=True pattern.
df['Completion_Status'] = df['Completion_Status'].fillna("No")

# Count completions vs non-completions.
completion_count = df['Completion_Status'].value_counts()

plt.pie(completion_count, labels=completion_count.index, autopct='%1.1f%%',
        startangle=140)
plt.title("Course Completion vs Non-Completion")
plt.axis('equal')
plt.show()
print("Course Completion Counts:")
print(completion_count)

Sample Output:
Course Completion Counts:

No     3
Yes    2
Name: Completion_Status, dtype: int64

Result:

The pie chart clearly shows the distribution of student completion status.
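The percentages shown on the pie chart can also be computed directly with value_counts(normalize=True); a sketch on the same sample data:

```python
import pandas as pd

# Same completion data as in the experiment.
df = pd.DataFrame({
    'Student_ID': [101, 102, 103, 104, 105],
    'Completion_Status': ['Yes', None, 'No', 'Yes', None]
})
df['Completion_Status'] = df['Completion_Status'].fillna('No')

# normalize=True returns fractions instead of raw counts:
# No = 3/5 = 0.6, Yes = 2/5 = 0.4.
shares = df['Completion_Status'].value_counts(normalize=True)
print(shares.to_dict())
```

This is a convenient way to verify the 60.0% / 40.0% labels that autopct prints on the chart.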
