Intro To Py and ML - Part 2

This document outlines the topics and objectives of a session on data analytics in Python. The session aims to teach participants how to:
1) Design Python scripts to solve data analytics problems and visualize results
2) Solve data management problems in Python
3) Import, export, manage, and visualize data using Python libraries like Pandas

The document provides code examples for common data analytics tasks like importing CSV data, managing empty values, selecting data by conditions, calculating statistics, and visualizing data using scatter plots, bar graphs, boxplots, and geospatial maps.

Uploaded by

KAORU Amane


Dr Mohd Hilmi Hasan

DATA ANALYTICS
OAU5362/DAM5362

May 2021
OUTCOMES & OUTLINE

OUTCOMES

At the end of this session, you will be able to:
• Design Python scripts to solve data analytics problems and visualize the results.
• Solve data management problems in Python.

OUTLINE

• Managing Empty Cells
• Importing and Exporting Data
• Managing Data
• Visualization

MANAGING EMPTY CELLS

• Create a data frame as follows:

import pandas as pd
import numpy as np
data = {'Name':['Ali', 'Abu', 'George', 'Mike', 'Chan', 'Sammy'],
'Marks':[70, 65,np.nan, 82, 78, 75]}
score = pd.DataFrame(data)
print(score)

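• Type and run (Note: the mean function must first be imported with from statistics import mean):
▪ print(sum(score['Marks']))
▪ print(mean(score['Marks']))

• The empty cell (NaN) in Marks propagates into both results, so neither gives a usable total or mean. To resolve, drop the rows with empty cells first:
score2 = score.dropna()
print(sum(score2['Marks']))
print(mean(score2['Marks']))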
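
Dropping rows is one option; the pandas Series methods offer another. The following is a minimal sketch that reuses the score data frame from above. Series.sum() and Series.mean() skip NaN by default, and fillna() replaces the empty cell instead of removing the row. Filling with the column mean is only an illustrative choice and is not part of the original exercise.

```python
import numpy as np
import pandas as pd

data = {'Name': ['Ali', 'Abu', 'George', 'Mike', 'Chan', 'Sammy'],
        'Marks': [70, 65, np.nan, 82, 78, 75]}
score = pd.DataFrame(data)

# pandas methods skip NaN by default, so no dropna() is needed here
total = score['Marks'].sum()
average = score['Marks'].mean()

# count the empty cells in each column
missing = score.isna().sum()

# alternative to dropping the row: fill the empty cell with the column mean
# (an illustrative choice, not the only reasonable one)
score_filled = score.fillna({'Marks': average})

print(total, average)
print(missing)
print(score_filled)
```

Here total is 370.0 and average is 74.0, the same values that the dropna() approach produces, while score_filled keeps all six rows.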
IMPORTING & EXPORTING DATA

• The most common way of getting data for analysis is by importing a CSV or Excel dataset.

• Python provides this capability through pandas functions.

• Copy covid_my.csv into the working folder (working path), and code the following:
my = pd.read_csv("covid_my.csv")

• To display the data, call the variable (data frame) name:
my

• To verify that the data was imported correctly, use the head function. It displays only the first n observations (5 by default), which is useful when the dataset is large:
my.head()

• Summary of data containing some statistical measurements can be retrieved by using describe function:
my.describe()

• To save a csv file use to_csv function:
my.to_csv("covid_my2.csv")

• read_excel() and to_excel() functions are used for xlsx files.

• Path is used for files that are not in the working directory:
pd.read_csv("C:/Users/mhilmi_hasan/OneDrive/cerdas/Py/covid_my.csv")
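
The import and export steps can be tried without covid_my.csv. The following is a minimal sketch of a CSV round trip; the data frame, its values, and the file name toy_covid.csv are made-up placeholders, and only the column names follow the ones used in this session.

```python
import os
import tempfile

import pandas as pd

# Made-up placeholder values, only to keep the example self-contained
toy = pd.DataFrame({
    'State': ['A', 'B', 'C'],
    'Confirmed': [1200, 300, 4500],
    'Deaths': [10, 2, 40],
})

# hypothetical file name inside a temporary folder
path = os.path.join(tempfile.mkdtemp(), 'toy_covid.csv')

# index=False keeps the row index out of the file
toy.to_csv(path, index=False)

loaded = pd.read_csv(path)
print(loaded.head(2))     # first 2 rows only
print(loaded.describe())  # count, mean, std, min, quartiles and max per numeric column
```

Without index=False, to_csv also writes the row index, and read_csv then loads it back as an extra unnamed column.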
MANAGING DATA
• Type and run:
▪ my['Confirmed']
▪ my[7:16]
▪ max(my['Deaths'])
▪ min(my['Deaths'])
▪ mean(my['Confirmed'])

• Selecting data by condition:

▪ my[my['State']=='Perak']
▪ my[my['Confirmed'] > 1000]
▪ my.loc[my['Confirmed']>1000, 'State']

▪ max_deaths = max(my['Deaths'])
my.loc[my['Deaths']==max_deaths,'State']

▪ my2 = my.copy()  # copy() keeps the new columns out of my; plain my2 = my would modify both
my2['Perc_Confirmed'] = round((my2['Confirmed']/my2['Population'])*100,2)
my2['Perc_Deaths'] = (my2['Deaths']/my2['Population'])*100
my2
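
The selection steps above can be extended in a few common ways. The following is a minimal sketch on a made-up data frame (placeholder states A to D with invented numbers; only the column names follow covid_my.csv). It shows combining two conditions, finding the row with the largest value via idxmax(), and working on an independent copy.

```python
import pandas as pd

# Made-up placeholder values; column names follow covid_my.csv as used above
my = pd.DataFrame({
    'State': ['A', 'B', 'C', 'D'],
    'Confirmed': [1200, 300, 4500, 800],
    'Deaths': [10, 2, 40, 5],
    'Population': [100000, 50000, 300000, 80000],
})

# combine conditions with & (and) or | (or); each condition needs its own parentheses
busy = my[(my['Confirmed'] > 1000) & (my['Deaths'] > 20)]

# idxmax() returns the row label of the largest value
worst_state = my.loc[my['Deaths'].idxmax(), 'State']

# copy() gives an independent data frame, so the new column does not appear in my
my2 = my.copy()
my2['Perc_Confirmed'] = (my2['Confirmed'] / my2['Population'] * 100).round(2)

print(busy)
print(worst_state)
print('Perc_Confirmed' in my.columns)
```

With these placeholder values, busy contains only state C, worst_state is 'C', and the last line prints False because my was left untouched.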
VISUALIZATION – SCATTER PLOT

• Scatter plot:

import matplotlib.pyplot as plt

x = my['State']
y = my['Confirmed']
plt.scatter(x, y)
plt.title('Covid-19 Cases in MY by State')
plt.show()

• To fix the overlapping labels on the x-axis, add the following before the show function:

plt.xticks(rotation=90)
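
The same scatter plot can also be written with explicit figure and axes objects and saved to a file instead of shown on screen. The following is a minimal sketch under stated assumptions: the lists stand in for my['State'] and my['Confirmed'] with made-up values, and the output file name cases_scatter.png is hypothetical.

```python
import os
import tempfile

import matplotlib
matplotlib.use('Agg')  # non-interactive backend, so the script also runs without a display
import matplotlib.pyplot as plt

# Made-up placeholder values standing in for my['State'] and my['Confirmed']
states = ['A', 'B', 'C', 'D']
confirmed = [1200, 300, 4500, 800]

fig, ax = plt.subplots()
ax.scatter(states, confirmed)
ax.set_title('Covid-19 Cases by State (toy data)')
ax.set_xlabel('State')
ax.set_ylabel('Confirmed cases')
ax.tick_params(axis='x', labelrotation=90)  # rotates the x tick labels, like plt.xticks(rotation=90)
fig.tight_layout()                          # keeps the rotated labels inside the figure

# hypothetical output file in the system's temporary folder
out_path = os.path.join(tempfile.gettempdir(), 'cases_scatter.png')
fig.savefig(out_path)
```

Axis labels are added here because a rotated, unlabeled axis is easy to misread; whether to save or show the figure depends on where the script runs.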

VISUALIZATION – BAR GRAPH & BOXPLOT

• Bar graph:

x = my['State']
y = my['Confirmed']
plt.bar(x,y)
plt.xticks(rotation=90)
plt.title('Covid-19 Cases in MY by State')
plt.show()

• Boxplot:
plt.boxplot(my['Confirmed'])
plt.show()

VISUALIZATION – BAR GRAPH & BOXPLOT

• Create dataframe:

import pandas as pd
import numpy as np
data = {'Name':['Ali', 'Abu', 'George', 'Mike', 'Chan', 'Sammy'],
'Marks1':[70, 65,77, 82, 78, 75],
'Marks2':[80, 81,77, 82, 10, 85],
'Marks3':[70, 65,77, 10, 82, 75]}
score = pd.DataFrame(data)
print(score)

• Multiple Boxplot:
score.boxplot(column=['Marks1','Marks2','Marks3'])

• Outliers: the mark of 10 in Marks2 and in Marks3 is drawn as a separate point outside the whiskers.
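
The points a boxplot draws as outliers can also be found numerically. The following is a minimal sketch using the common 1.5 × IQR rule, which is the default whisker setting in matplotlib boxplots. It reuses the score data frame from this slide; the variable names are illustrative.

```python
import pandas as pd

score = pd.DataFrame({
    'Name': ['Ali', 'Abu', 'George', 'Mike', 'Chan', 'Sammy'],
    'Marks1': [70, 65, 77, 82, 78, 75],
    'Marks2': [80, 81, 77, 82, 10, 85],
    'Marks3': [70, 65, 77, 10, 82, 75],
})

outliers = {}
for col in ['Marks1', 'Marks2', 'Marks3']:
    q1 = score[col].quantile(0.25)  # first quartile (bottom of the box)
    q3 = score[col].quantile(0.75)  # third quartile (top of the box)
    iqr = q3 - q1                   # interquartile range (height of the box)
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    # values beyond the fences are the ones drawn as separate points
    mask = (score[col] < low) | (score[col] > high)
    outliers[col] = score.loc[mask, col].tolist()

print(outliers)  # {'Marks1': [], 'Marks2': [10], 'Marks3': [10]}
```

Marks1 has no outliers, while the single mark of 10 falls below the lower fence in both Marks2 and Marks3.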

VISUALIZATION – GEO & MAP

CASE STUDY

• Install folium first (from a terminal, or with %pip install folium inside a notebook cell):

pip install folium

• Then type and run the following code:

import pandas as pd
import matplotlib.pyplot as plt #importing plotting library
import folium #importing geospatial visualization library

# creating the map

my_map = folium.Map(location=[1.559580, 103.637489], zoom_start=6)
df = pd.read_csv("covid_my.csv")

# adding one marker per row, with the case and death counts in its popup

for _, cvd in df.iterrows():
    folium.Marker(
        location=[cvd["Lat"], cvd["Long"]],
        popup="Cases=" + str(cvd["Confirmed"]) + ", Deaths=" + str(cvd["Deaths"]),
    ).add_to(my_map)

# display the map

my_map
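
Each marker needs a usable latitude and longitude, so it can help to check the coordinate columns before building the map. The following is a minimal sketch, assuming the Lat and Long column names from the case study; the rows are made-up placeholders that include one empty cell and one out-of-range value.

```python
import numpy as np
import pandas as pd

# Made-up placeholder rows; Lat and Long follow the column names used above
df = pd.DataFrame({
    'State': ['A', 'B', 'C'],
    'Lat': [3.1, np.nan, 95.0],   # NaN = empty cell, 95.0 = outside the valid latitude range
    'Long': [101.7, 102.2, 103.6],
})

# drop rows with an empty coordinate, as in the Managing Empty Cells section
valid = df.dropna(subset=['Lat', 'Long'])

# keep only rows whose coordinates fall inside the valid ranges
valid = valid[valid['Lat'].between(-90, 90) & valid['Long'].between(-180, 180)]

print(valid)
```

Only state A remains in valid; the cleaned data frame can then be used in place of df when adding the markers.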