Week 1 v1.32 (Hidden) - Introduction To Data Analytics

The document outlines the structure and content of the ECS784U/P Data Analytics module for Week 1, 2024, led by Dr. Anthony Constantinou. It details the timetable for lectures and labs, module assessment criteria, and the resources available for students, including coursework and recommended readings. The introduction to data analytics covers key concepts, definitions, and the importance of data in decision-making processes.

Uploaded by

Yen-Kai Cheng

ECS784U/P DATA ANALYTICS

(WEEK 1, 2024)
INTRODUCTION TO DATA ANALYTICS

DR ANTHONY CONSTANTINOU
SCHOOL OF ELECTRONIC ENGINEERING AND COMPUTER SCIENCE
WEEK 1 LECTURE OVERVIEW
Introduction to Data Analytics
▪ Timetable.
▪ Staff.
▪ Module overview, assessment and pass criteria.
▪ Books.
▪ Lab 1.
▪ Coursework 1 and 2.
▪ Introduction to Data Analytics.

TIMETABLE (LECTURES)

Lectures:
▪ When: Thursdays, 11:00 – 13:00.

▪ Where: People's Palace, in Skeel Lecture Theatre.

▪ Weeks: 1-6 and 8-12 (11 lectures).

▪ No new module content will be taught in Week 7.
▪ Use Week 7 to digest the knowledge/skills you
have learnt in the first half of the term and catch
up on any learning gaps.
▪ Week 7 is not a ‘no work’ week.

TIMETABLE (LABS)
Labs:
▪ When: Wednesdays, 15:00 – 17:00.
▪ Where: TB ground floor G02 and first floor 101/102. You should go to the
ground floor, and move to first floor if and only if the ground floor is full.
▪ Weeks: 1-6 and 9-11 (9 labs).
▪ Week 1: Preparation lab. Attend if you are new to programming.
▪ Weeks 2, 3, and 4: Main material – 1st half of module.
▪ Week 5: Case study demo.
▪ Week 6: Revision.
▪ Weeks 7, 8: no lab.
▪ Week 9: Demo on causal structure learning (this will be online).
▪ Weeks 10 and 11: Main material – 2nd half of the module.

LECTURER
▪ Module Organiser: Dr. Anthony Constantinou.
▪ I will be delivering the lectures.
▪ At QMUL since Oct 2009 - joined as a PhD student.
▪ Most of my time is spent doing research:
▪ Head of MInDS (Machine Intelligence and Decision Systems):
https://minds.qmul.ac.uk/
▪ Head of the Bayesian Artificial Intelligence research lab:
http://bayesian-ai.eecs.qmul.ac.uk/
▪ Personal research website: http://www.constantinou.info
▪ Research interests in causal machine learning and intelligent decision
making under uncertainty.
▪ i.e., learning models of cause-and-effect relationships, and simulating
actions/interventions within those models to determine the optimal
sequence of decisions.
▪ Contact:
▪ Any questions about the module should be posted on the forum.
▪ For personal questions, you can contact me via e-mail:
[email protected]
TEACHING FELLOW
▪ Dr. Neville Kenneth Kitson, who prefers to be known as Ken.
▪ Physics B.Sc., then Ph.D. in Computer Modelling of Air Pollution.
▪ 'Hands-on' I.T. career: first commercial, at Reuters, then at a
non-profit using mobile and web technologies for health and election
projects in Africa and Asia.
▪ Big Data M.Sc. at QMUL, 2018-19.
▪ Project with Anthony, leading to a paper on learning to model causes
of diarrhoea from survey data.
▪ Studying for a 2nd Ph.D., supervised by Anthony.
▪ Practical limitations of causal machine learning.
▪ Active learning, where algorithms ask human experts for input on the
most uncertain cause-effect relationships.
LAB STAFF

▪ Mr Julien Guinot: PhD student in Self-Supervision for expert latent
representations of music.
▪ Ms Hyunkyung Park: PhD student in fact verification in dialogue.
▪ Ms Yuhan Li: PhD student in mathematical and digital epidemiology and
dynamics of networks.
▪ Mr Ishaku Anaobi: PhD student in Improving content moderation in
decentralised online social networks.
▪ Ms Kasia Adamska: PhD student in Predicting Hit Songs: Multimodal and
Data-Driven Approach.
▪ Ms Amani Abumansour: PhD student in Claim detection for Automated
fact-checking systems.
▪ Ms Parvathy Subramanianprasad: PhD student in ML to speed up traditional
EM simulations like design of coding metasurface array.
▪ Dr Ken Kitson: PhD student in active learning & causal discovery.
MODULE ASSESSMENT
▪ 100% Coursework Module (no exam!):

▪ Coursework 1, 60% weighting:

▪ Released in Week 2 with deadline Week 8.
▪ Focuses on classic machine learning methods using Python libraries.
▪ Done individually.
▪ Find/prepare your own data set.
▪ Do some basic coding, such as loading data and calling methods to analyse data.
▪ Write a 7-page (max) report that will include Introduction, Experiments, Results
and Conclusions.
▪ Coursework 2, 40% weighting:
▪ Released in Week 9 with deadline in Week 13, during exam period.
▪ Focuses on causal machine learning.
▪ Done individually.
▪ Find your own data (can reuse data set from coursework 1 if suitable).
▪ Involves no coding. You will be using an open-source Java UI research project.
▪ Answer a set of questions.
MODULE PASS CRITERIA
Level-7 module pass criteria apply (postgraduates).

▪ A minimum total mark of 50% is required to pass this module.

▪ Pass example 1: CW1 mark 30% and CW2 mark 80% (total 50%).
▪ Pass example 2: CW1 mark 40% and CW2 mark 65% (total 50%).

▪ Fail example 1: CW1 mark 15% and CW2 mark 100% (total 49%).
▪ Resit CW1.
▪ Fail example 2: CW1 mark 70% and CW2 mark 15% (total 48%).
▪ Resit CW2.
▪ Fail example 3: CW1 mark 45% and CW2 mark 45% (total 45%).
▪ Resit both CW1 & CW2.
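
The totals in the examples above follow from the 60%/40% weighting on the Module Assessment slide. As a minimal sketch (the function names are illustrative, and the resit rule is not modelled), the arithmetic can be checked as follows:

```python
# Weights taken from the Module Assessment slide: CW1 60%, CW2 40%.
CW1_WEIGHT = 60
CW2_WEIGHT = 40
PASS_MARK = 50  # minimum total mark (%) required to pass


def total_mark(cw1, cw2):
    """Return the weighted module total (%) from two coursework marks (%)."""
    return (CW1_WEIGHT * cw1 + CW2_WEIGHT * cw2) / 100


def passes(cw1, cw2):
    """True if the weighted total reaches the pass mark."""
    return total_mark(cw1, cw2) >= PASS_MARK


# The five (CW1, CW2) examples listed on the slide.
examples = [(30, 80), (40, 65), (15, 100), (70, 15), (45, 45)]
for cw1, cw2 in examples:
    print(f"CW1={cw1}%, CW2={cw2}% -> total={total_mark(cw1, cw2):.0f}%, "
          f"pass={passes(cw1, cw2)}")
```

Integer weights are used so that borderline totals such as exactly 50% are not affected by floating-point rounding.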

BOOKS

▪ Almost all books/book chapters to be provided as PDF files on QMPlus.

▪ They are either freely available online, or permission granted by authors.

BOOK 1
▪ Available on QMPlus as PDF.
▪ Over 400 pages:
▪ Use Table of Contents to quickly refer
to Python concepts of your interest.
▪ Made available from Week 1.

BOOK 2
▪ Not available on QMPlus.
▪ Not a critical book for coursework.
▪ Useful if you would like to study the data
analytic and machine learning
techniques covered in the first half of
this course in more detail.
▪ Over 600 pages:
▪ Use Table of Contents to quickly refer
to relevant data analytic concepts.

CHAPTERS FROM BOOKS 3, 4, AND 5
▪ The relevant chapters from the first two books shown
below will be made available on QMPlus in Week 6.

▪ The third book will be made available on QMPlus in Week 9.

LABS (WEEKS 1-6)
▪ There are different Integrated Development Environments (IDEs) and
code editors that can be used to write and run Python code.
▪ Jupyter Notebook
▪ Spyder
▪ PyDev with Eclipse
▪ Vim
▪ TextMate
▪ Gedit
▪ Idle
▪ PIDA (Linux)
▪ NotePad++ (Windows)

▪ On this module, we will be using the Jupyter Notebook available in
Anaconda Navigator.

LABS (WEEKS 9-11)
▪ Based on causal machine learning
▪ Focus will be on the discovery of cause-and-effect
relationships from data.
▪ Different algorithms are publicly available in R, Python,
Java, MATLAB, and in various Bayesian network software.
▪ On this module, we will be using a Java NetBeans
research project developed in our research lab.
▪ Comes with a basic User Interface (UI).
▪ No need to write code.
▪ Provides access to a list of algorithms and methods
needed to complete Coursework 2.

WEEK 1 LAB OVERVIEW
Programming basics
▪ Anaconda Navigator and Jupyter Notebook.
▪ Already installed on the machines in the ITL (machines now moved to
Temporary Building (TB) so hoping for no issues with software).
▪ If you can bring your own laptop with you, great. The demonstrators could also
help you set up Python, Anaconda Navigator and Jupyter Notebook on your
own machine.
▪ Follow the instructions provided in the Week 1 Lab document.

WEEK 1 LAB OVERVIEW
Lab 1 is aimed at those who are new to Python
(attendance optional but recommended):
▪ Programming basics with Python.
▪ Data types.
▪ Operators.
▪ Conditional statements and loops.
▪ Creating arrays/lists etc.
▪ Dictionaries.
▪ Functions.
▪ A few exercises.
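
To give a flavour of the topics listed above, here is a short illustrative sketch. It is not taken from the Week 1 Lab document; the variable and function names are made up, and the values reuse the lab timetable from earlier slides.

```python
# Data types
module_code = "ECS784"   # str
lab_count = 9            # int
cw1_weight = 0.6         # float
has_exam = False         # bool

# Operators
cw2_weight = round(1 - cw1_weight, 2)

# Lists, loops and conditional statements
lab_weeks = [1, 2, 3, 4, 5, 6, 9, 10, 11]
main_material_weeks = []
for week in lab_weeks:
    if week in (2, 3, 4, 10, 11):
        main_material_weeks.append(week)

# Dictionaries: map a week number to a description
special_labs = {
    1: "Preparation lab",
    5: "Case study demo",
    6: "Revision",
    9: "Demo on causal structure learning",
}


# Functions
def describe_lab(week):
    """Return a short description of the lab in a given week."""
    if week in special_labs:
        return special_labs[week]
    if week in main_material_weeks:
        return "Main material"
    return "No lab"


for week in range(1, 13):
    print(week, describe_lab(week))
```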

PYTHON INTERACTIVE SHELL
▪ Jupyter Notebook operates as an interactive shell.
▪ You can execute Python code one cell at a time; a cell can hold a single
line or several lines.
WHY PYTHON? (Reading slide)
▪ Relatively easy to learn:
▪ Simple syntax and more intuitive code
▪ Can do more with fewer lines of code.

▪ Is now widely used by the scientific community and industry.

▪ Extensive ecosystem of rapidly maturing libraries for data science, both for data processing
and data visualisation.
▪ We will experiment with some of them:
▪ NumPy: started by Dr Travis Oliphant, BSc & MSc Brigham Young Uni, and PhD Mayo
Clinic. Worked as Assistant Prof at Brigham Young Uni and founded Anaconda.
▪ Pandas: started by Wes McKinney, BSc MIT, who later began a PhD at Duke Uni. The author of
the Python for Data Analysis book available on QM+. Worked for, and founded, different companies.
▪ Scikit-learn: started by Dr David Cournapeau, MSc Telecom Paristech, PhD Kyoto Uni.
Started working on this library as a Google Summer of Code project. Moved to, and works in, Tokyo.
▪ In 2010 the French Institute for Research in Computer Science and Automation (INRIA)
took leadership of the project and made the first public release on February the 1st, 2010.
▪ These libraries are open-source and many other community members have contributed to
their development.
▪ Adequate to high computational performance.

▪ It’s free!
NEW TO PROGRAMMING?
▪ This course is NOT about programming.
▪ If you are new to programming, attend labs from Week 1
and practise the lab material.
▪ You are not expected to memorise all the methods we
cover in the labs!
▪ You can refer to some of those methods when analysing
data for your coursework 1.

10-MINUTE BREAK
WHAT IS DATA ANALYTICS?
[Diagram: Data Analytics shown alongside overlapping fields: Machine Learning,
Statistics, Databases, Data Mining, Data Science, Knowledge Discovery, and
Artificial Intelligence.]

▪ Interdisciplinary subfield of computer science at the intersection of
different fields.
WHAT IS DATA ANALYTICS? (Reading slide)
Try to distinguish these terms
Quick internet search says…
▪ Data Analytics:
▪ The process of inspecting, cleansing, transforming, and modelling data with the goal of
discovering useful information, informing conclusions, and supporting decision-making.
▪ Data Science:
▪ Interdisciplinary field that uses scientific methods, processes, algorithms and systems to
extract knowledge and insights from noisy, structured and unstructured data, and apply
knowledge and actionable insights from data across a broad range of application
domains. (Note: more focus on algorithms.)
▪ Data Mining and Knowledge Discovery:
▪ The process of extracting and discovering patterns in large data sets involving methods at
the intersection of machine learning, statistics, and database systems.
▪ Machine Learning:
▪ The study of computer algorithms that can improve automatically through experience and
by the use of data.
▪ Artificial Intelligence:
▪ Intelligence demonstrated by machines, as opposed to natural intelligence displayed by
animals including humans.
▪ Statistics:
▪ The discipline that concerns the collection, organisation, analysis, interpretation, and
presentation of data.

WHAT IS MACHINE LEARNING?
Machine Learning:
▪ The study of computer algorithms that can improve iteratively
through additional experience and/or data.

[Diagram: Data → Model → Prediction / Knowledge discovery → Decision Making]
▪ Humans learn by observation, by making comparisons, by looking
for patterns of repeated behaviour or sequences of events, by
generalisation, and by trial and error.
▪ Humans also act by imagining alternative outcomes. This is studied in a
field called counterfactual reasoning, which requires causal models (i.e., that
we make some assumptions about causality).
▪ In many ways, these are also the roots of machine learning since
many algorithms aim to mirror some of the above traits of human
learning.
GENERATIVE VS DISCRIMINATIVE
MACHINE LEARNING
Generative learning:
▪ Assumes data are generated by different probabilistic models (usually soft
boundaries), and tries to categorise data by answering the question of
“which of the sub-models is most likely to have generated a given data
point”.
▪ A generative model can also tell you about the population/distribution of
the data.
▪ Aim is to maximise the likelihood that each data point belongs to its
assigned class.

Discriminative learning:
▪ Divides the data into classes using hard decision boundaries, usually
based on distance metrics.
▪ Aim is to minimise classification mistakes at the boundaries.
[Figure panels: Generative; Discriminative (non-linear); Discriminative (linear)]
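
The contrast can be sketched on a toy one-dimensional problem. This is only an illustration under stated assumptions: the data values are made up, the generative side assumes one Gaussian per class, and the discriminative side is a simple threshold search rather than any specific algorithm from the module.

```python
import math
from statistics import mean, stdev

# Toy 1-D data (made-up values): (feature x, class label 0 or 1).
data = [(1.0, 0), (1.5, 0), (2.0, 0), (2.5, 0),
        (4.0, 1), (4.5, 1), (5.0, 1), (5.5, 1)]


def gaussian_pdf(x, mu, sigma):
    """Density of a normal distribution N(mu, sigma^2) at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


# Generative: model how each class generates x (one Gaussian per class).
params = {}
for label in (0, 1):
    xs = [x for x, y in data if y == label]
    params[label] = (mean(xs), stdev(xs), len(xs) / len(data))  # mu, sigma, prior


def generative_predict(x):
    """Pick the class whose model is most likely to have generated x."""
    scores = {c: prior * gaussian_pdf(x, mu, sigma)
              for c, (mu, sigma, prior) in params.items()}
    return max(scores, key=scores.get)


# Discriminative: search directly for the boundary with the fewest mistakes.
def mistakes(threshold):
    """Count training points on the wrong side of the threshold."""
    return sum((x > threshold) != bool(y) for x, y in data)


values = sorted(x for x, _ in data)
midpoints = [(a + b) / 2 for a, b in zip(values, values[1:])]
best_threshold = min(midpoints, key=mistakes)


def discriminative_predict(x):
    """Apply the hard decision boundary."""
    return int(x > best_threshold)


print(generative_predict(3.0), discriminative_predict(3.0), best_threshold)
```

Note that the generative model also yields per-class means and spreads (in `params`), which describe the data distribution, whereas the discriminative model only yields the boundary.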

ERROR: BIAS VS VARIANCE
Bias:
▪ The difference between the expected (average) prediction and the
observed values, which are the values we are trying to predict.

Variance:
▪ How much the predictions vary for a given data point.
▪ Note: variance can often be reduced by using more data and averaging
the predictions.
Source: http://scott.fortmann-roe.com/docs/BiasVariance.html
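
A small simulation can make the two error terms concrete. This is a sketch under assumed settings, not material from the linked article: the true value, noise level, and both estimators are invented for illustration.

```python
import random
from statistics import fmean, pvariance

random.seed(0)
TRUE_VALUE = 10.0  # the value we are trying to predict (assumed for the demo)


def draw_sample(n):
    """Noisy observations scattered around the true value."""
    return [random.gauss(TRUE_VALUE, 2.0) for _ in range(n)]


def plain_mean(sample):
    """Unbiased estimator: the sample average."""
    return fmean(sample)


def shrunk_mean(sample):
    """Deliberately biased estimator: halves the sample average."""
    return 0.5 * fmean(sample)


def bias_and_variance(estimator, n=20, repeats=2000):
    """Repeat the experiment and summarise the estimator's predictions."""
    predictions = [estimator(draw_sample(n)) for _ in range(repeats)]
    bias = fmean(predictions) - TRUE_VALUE   # average prediction vs. target
    variance = pvariance(predictions)        # spread of the predictions
    return bias, variance


plain_bias, plain_var = bias_and_variance(plain_mean)
shrunk_bias, shrunk_var = bias_and_variance(shrunk_mean)
_, more_data_var = bias_and_variance(plain_mean, n=80)

print(f"plain mean:  bias={plain_bias:.3f}, variance={plain_var:.3f}")
print(f"shrunk mean: bias={shrunk_bias:.3f}, variance={shrunk_var:.3f}")
print(f"plain mean with 4x data: variance={more_data_var:.3f}")
```

The shrunk estimator trades a large bias for a smaller variance, while giving the unbiased estimator more data lowers its variance without adding bias.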

[Image: sampling bias] https://sketchplanations.com/sampling-bias
[Image: confirmation bias] https://sketchplanations.com/confirmation-bias
DATA SET

Is a data set alone useful?

▪ A data set consists of historical observations.
▪ Data alone is not very useful:
▪ It does not provide knowledge.
▪ It does not answer predictive questions.
▪ It does not tell us what actions to perform.
▪ In data analytics, we seek to extract useful information
from data to support decision making.
Data → Info

DATA SET
▪ Each column is typically called a variable or feature.

▪ Each row is typically called a data instance or data sample.
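
As a plain-Python sketch of this terminology (the records and variable names below are hypothetical), a data set can be held as a list of instances, each with the same variables:

```python
# Hypothetical data set: each dict is one data instance (row);
# each key is a variable/feature (column).
data_set = [
    {"age": 34, "smoker": "no", "blood_pressure": 118},
    {"age": 51, "smoker": "yes", "blood_pressure": 135},
    {"age": 46, "smoker": "no", "blood_pressure": 127},
]

variables = list(data_set[0].keys())   # column names
n_instances = len(data_set)            # number of rows

# Read one variable (column) across all instances.
ages = [row["age"] for row in data_set]

# Read one instance (row) across all variables.
second_instance = data_set[1]

print(variables, n_instances, ages, second_instance)
```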

SOURCES OF DATA
▪ Sensor data (on cars, doors, etc.).
▪ Image/Text/Sound.
▪ Surveys/Questionnaires.
▪ Focus groups/Interviews.
▪ Social media.
▪ Documents/Records.
Generally two types of data:
▪ Observational: represents data collected based on what we
see without taking action.
▪ Experimental: represents data collected based on actions
that we deliberately take to manipulate the outcome.
▪ e.g., randomised controlled trials when testing new treatments.
CHALLENGES OF DATA ANALYTICS
Data noise…
▪ Missing data values.
▪ Missing variables.
▪ Corrupted data.
▪ Incorrect data.
▪ Limited data.
▪ Big data.
▪ Different evaluation criteria and metrics.
▪ Different distributional assumptions (e.g., normality).
▪ Different dependency assumptions (e.g., linear/non-linear).
▪ Data variety (e.g., discrete, continuous, text, sound, image).
▪ Spurious correlations.
▪ Correlation is not causation (recall observational and experimental data).
CHALLENGES OF DATA ANALYTICS
Ever heard of spurious correlations?

CORRELATION IS NOT CAUSATION

▪ Can the number of shark attacks help us predict ice cream sales?

[Image source: newsmax.com]
CORRELATION IS NOT CAUSATION

▪ Can we conclude from correlation that shark attacks cause ice cream sales
to increase?

[Image source: newsmax.com]
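
The shark attack and ice cream question can be simulated. In the sketch below all numbers are invented, and temperature is assumed to be the common cause of both quantities. The partial correlation used at the end is a standard statistic that is not named on the slides; it measures the association that remains once the common cause is accounted for.

```python
import math
import random
from statistics import fmean

random.seed(1)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = fmean(xs), fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def partial_correlation(xs, ys, zs):
    """Correlation between xs and ys after accounting for zs."""
    r_xy, r_xz, r_yz = pearson(xs, ys), pearson(xs, zs), pearson(ys, zs)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical daily data: temperature drives both quantities,
# but neither quantity influences the other.
temperature = [random.uniform(10, 35) for _ in range(365)]
ice_cream_sales = [20 * t + random.gauss(0, 40) for t in temperature]
shark_attacks = [0.1 * t + random.gauss(0, 0.5) for t in temperature]

r_overall = pearson(shark_attacks, ice_cream_sales)
r_given_temperature = partial_correlation(shark_attacks, ice_cream_sales, temperature)

print(f"correlation: {r_overall:.2f}")
print(f"correlation given temperature: {r_given_temperature:.2f}")
```

The strong overall correlation makes shark attacks useful for predicting sales, yet it largely disappears once temperature is taken into account, which is why it cannot be read as one causing the other.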
TYPES OF DATA (Reading slide)
▪ Nominal discrete or categorical variables:
▪ Take finite number of states.
▪ Order of states is not important.
▪ Distance between states is not defined.
▪ E.g., gender, colour, profession.
▪ Ordinal discrete or categorical variables:
▪ Same as above, but where the order of states matters.
▪ E.g., low/medium/high, distinction/merit/pass.

▪ Continuous variables:
▪ Have an infinite number of values between any two values.
▪ Can be represented by statistical distributions.
▪ E.g., age, time, temperature, profit.
▪ Note that a numerical or a continuous variable can generally be
converted into an ordinal categorical or a discrete variable.

Note that the above definitions may vary slightly depending on
book/discipline.
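
Converting a continuous variable into an ordinal one, as noted above, can be sketched as follows. The cut-offs and category names are hypothetical choices made for this example.

```python
# Ordered categories: the position in this list encodes the order of states.
ORDER = ["child", "adult", "senior"]


def to_ordinal(age):
    """Map a continuous age (in years) to an ordered category."""
    if age < 18:
        return "child"
    if age < 65:
        return "adult"
    return "senior"


ages = [4.5, 17.9, 18.0, 42.3, 70.1]           # continuous variable
categories = [to_ordinal(a) for a in ages]     # ordinal categorical variable
ranks = [ORDER.index(c) for c in categories]   # order of states as integers

print(categories)
print(ranks)
```

The conversion loses information: 18.0 and 42.3 fall into the same state, and the distance between states is no longer meaningful, only their order.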
TYPES OF DATA ANALYSIS
(focusing on healthcare for the examples)
▪ Descriptive or Analytical
▪ E.g., analysing health condition.

▪ Predictive
▪ Predicting health condition.

▪ Risk
▪ Assessing the risk of getting a particular disease over a period of time.

▪ Diagnostic or Inverse/Bayesian inference

▪ Determining the probability of having a particular disease given a positive test
result.
▪ Prescriptive, Intervention, Action, Decision Making
▪ Determining what treatment to perform given disease 𝐴.

▪ Counterfactual
▪ Given that a previous patient died from disease 𝐴, what would be the probability
of death had we also known information 𝐵 and on this basis taken action 𝐶?
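
The diagnostic (inverse/Bayesian inference) case above can be worked through with Bayes' theorem. The prevalence, sensitivity, and specificity below are hypothetical values chosen only to illustrate the calculation.

```python
def probability_of_disease_given_positive(prevalence, sensitivity, specificity):
    """Bayes' theorem: P(disease | positive test).

    prevalence  = P(disease)
    sensitivity = P(positive | disease)
    specificity = P(negative | no disease)
    """
    true_positives = sensitivity * prevalence
    false_positives = (1 - specificity) * (1 - prevalence)
    return true_positives / (true_positives + false_positives)


# Hypothetical test: 1% prevalence, 95% sensitivity, 90% specificity.
posterior = probability_of_disease_given_positive(0.01, 0.95, 0.90)
print(f"P(disease | positive) = {posterior:.3f}")
```

With these assumed numbers the posterior probability is below 10%, because false positives from the large healthy population outnumber true positives from the small diseased population.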
DISCIPLINES (Reading slide)
▪ Health data
▪ What is the diagnosis given the symptoms?
▪ What is the intervention given the diagnosis?

▪ Climate data
▪ What will the weather be tomorrow?
▪ What are the consequences of climate change?

▪ Sports
▪ What is the value of a player?
▪ Who is going to win the match?
▪ What payoff should a bookmaker offer for a given match event?

▪ Finance and economics

▪ How is inflation expected to change over the next few months?
▪ Are house prices going to increase?
▪ What profit should I expect over a certain period of time if I invest in the stock market?

▪ Marketing
▪ How much increase should I expect in sales if I decrease the sales price from X to Y?

▪ Politics
▪ Who is going to be the next prime minister?
▪ How many seats is a given Party expected to win?
DATA ANALYTICS IN PRACTICE
[Flow diagram, in order:]
▪ Identify objective.
▪ Collect data.
▪ Preprocess & clean data: reduce noise and complexity.
▪ Exploratory analysis: are data sufficient/useful? Need more data?
▪ Learn model: optimise model for a particular task.
▪ Apply model: use model in real-world.
▪ Assess model: evaluation of predictive and decision making accuracy. Results?
DATA ANALYTICS IN PRACTICE (Reading slide)

▪ Typically involves:
▪ Inference considerations
▪ What is the objective?
▪ Data pre-processing
▪ Cleansing, transformation, feature selection, etc.
▪ Model building
▪ Different assumptions, theorems, and algorithms to consider.
▪ Evaluation and optimisation on given data
▪ How do we judge the performance of the model?
▪ Application
▪ Apply the learnt model to a real case and obtain results.
▪ Evaluation on real case
▪ How did the model perform in practice?
▪ Documentation
▪ Preparation of a report covering all of the above.
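
The steps above can be sketched end to end on a toy problem. Everything here is assumed for illustration: the data are invented, missing values are simply dropped, and the "model" is a single threshold placed halfway between the two class means.

```python
from statistics import fmean

# Objective (assumed): predict a binary label from one measurement.
# Hypothetical raw data: (measurement, label); None marks a missing value.
raw = [(1.2, 0), (0.8, 0), (None, 0), (1.1, 0), (3.9, 1),
       (4.2, 1), (None, 1), (4.0, 1), (0.9, 0), (4.1, 1)]

# Data pre-processing: remove instances with a missing measurement.
clean = [(x, y) for x, y in raw if x is not None]

# Hold some data back so that evaluation uses instances unseen during learning.
train, test = clean[:6], clean[6:]

# Model building: threshold halfway between the two class means.
mean_0 = fmean([x for x, y in train if y == 0])
mean_1 = fmean([x for x, y in train if y == 1])
threshold = (mean_0 + mean_1) / 2


def predict(x):
    """Apply the learnt model to one measurement."""
    return int(x > threshold)


# Evaluation: share of held-out instances predicted correctly.
correct = sum(predict(x) == y for x, y in test)
accuracy = correct / len(test)

print(f"threshold={threshold:.2f}, accuracy={accuracy:.2f}")
```

In coursework-sized problems each step would use library routines instead, but the order of the steps stays the same.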
DOMAIN EXPERTS
Why are they generally needed?
▪ Domain expertise is the knowledge and understanding of a
particular field.
▪ Domain experts can:
▪ tell us if the data are missing any important variables.
▪ specify the project objectives as well as the questions that need to be
answered.
▪ specify a performance threshold that needs to be achieved for a
model to be useful.
▪ communicate the results effectively to decision makers.
▪ provide knowledge of factors that are important for prediction but
which historical data fail to capture.
▪ In most areas, ML/AI models are still used for decision support
rather than for decision making.
DATA ETHICS (Reading slide)

▪ Often we have to work with anonymised data for ethical and legal
reasons.

▪ The purpose is to eliminate or reduce discrimination.

▪ E.g., loan and insurance applications should not and often legally
cannot use gender, religion, or race in their assessment.

▪ However, anonymising data is difficult.

▪ Hidden information can be deduced from ‘anonymous’ data.
▪ Correlation can be used to re-identify anonymised data.
▪ E.g., postcode may correlate with ethnicity and religion.

▪ However, ethics depend on the domain.

▪ Gender and race might be useful to know in medicine.
▪ E.g., some treatment decisions may be based on gender.