IE5005 Lecture 00

Uploaded by

Braewyn Hsu
Copyright © All Rights Reserved

IE5005 Data Analytics for Industrial Engineers

Lecture 00: Course introduction and outline

Dr. Wang Zhiguo


[email protected]

Semester 1 AY2024/25
Ice-breaking activity

https://www.polleverywhere.com/clickable_images/zU9a790vyI7gx6VF7AdWv
Course Admins

Course Lecturer: Dr. Wang Zhiguo


• Email: [email protected]

• Consultation (before/during/after class, or email)

❑ Lecture hours/Venues
• Group 1: Monday 6-9pm @LT7
• Group 2: Wednesday 2-5pm @LT7
❑ Webcast recordings are available (Be cautious about “Procrastination”)

❑ Prior knowledge of programming/software is not required
Course Admins

❑ Assessments
• 4 Homework assignments (8% each x 4 = 32%)
• 2 Blended Learning assignments (4% each x 2 = 8%) [TBA later]
• Final exam (23 Nov, Saturday, 9am) (60%) [closed book with help sheet]

❑ Canvas

Homework Assignment
Submission format:
Microsoft Word File.

Name the file as:


YourName (e00xxx).doc

Late submission
• You may receive 0 marks if you have no valid reason.
• Mark deduction or extension with valid reasons (e.g. medical).
• Avoid last-minute submission!!!
• IT or technical issues are not considered valid reasons!!!

Be a Data Sleuth

"There is nothing more deceptive than an obvious fact."
-- Sherlock Holmes, The Boscombe Valley Mystery
This course serves as a good starting point for different career paths

❑ Data scientist: unearths future insights from raw data.

❑ Data engineer: focuses on the development and maintenance of data pipelines.

❑ Data analyst: mainly takes actions that affect the company’s scope (data cleaning, data conversion, data modelling).

[Figure: diagram of the three roles (Data Engineer, Data Analyst, Data Scientist) and their skill areas: Maths, Stats, Algorithms; Software Engineering; Data Communication]


Data Engineer
Data engineers build infrastructure and scalable pipelines to manage the flow of data and prepare it for analysis.

Data Analyst
Data analysts analyse and interpret data to extract meaningful information, find solutions to a business problem, and make critical decisions.

Data Scientist
Data scientists analyse and visualize existing data and implement algorithms to build predictive models for making future decisions.
Data Engineer
• Data acquisition
• Design, build and test data
• Integrate data into existing data pipelines
• Prepare data for modelling
• Perform ETL operations

Data Analyst
• Clean and process data
• Discover hidden patterns
• Perform exploratory data analysis
• Visualize dashboards
• Visualize data and use storytelling techniques

Data Scientist
• Develop machine learning models
• Identify trends in data for making decisions
• Explore data by performing statistical analysis
Data Engineer
• SQL and MongoDB
• Data architecture
• Data scripting
• Data warehousing and ETL
• Hadoop

Data Analyst
• SQL
• MS Excel
• Programming
• Data visualization
• Problem solving
• Domain knowledge
• Decision making

Data Scientist
• Programming
• Mathematics and statistics
• Analytical thinking
• Data visualization
• Machine learning
• Deep learning
Data analytics is everywhere in our daily life…

If you were to buy a new car, what color are you going to choose?

Chocolate consumption vs Nobel prize winners

Figure Source: Hristova, Krasimira. (2012). EDITORIAL - Perspectives on Chocolate Consumption and Risk of Cardiovascular Diseases and Cognitive Function. The Open Nutraceuticals Journal. 5. 207-212. 10.2174/1876396001205010207.
Smoking vs Life expectancy

Fig. Survival from age 35 for continuing cigarette smokers and lifelong non-smokers among UK male doctors born 1900-1930, with percentages alive at each decade of age. (Source of Figure: British Doctors Study)
Data analytics in the industry

https://www.youtube.com/watch?v=S2e8-Wa6_Kk
Science and Art of data visualization and effective storytelling

Data analytics for the commercial stores…

❑ What is the most purchased product in a supermarket? How does this help the shop design its shelf layout?

❑ How about shopping malls? How do they design their floor layout?

Image sources: "Supermarket Shelf Vector Art, Icons, and Graphics for Free Download"; "5 Psychological Tricks That Malls Use To Make You Spend More Money - Marketing Mind"
E-commerce: Product Recommendation

Different statistical learning models to identify patterns in the data……

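[Figure: Supervised learning vs unsupervised learning. Supervised learning: the training examples carry labels (cat, dog) and the model predicts a label for a new example ("It’s a cat"). Unsupervised learning: the labels are unknown (?) and the examples are grouped into Cluster 1 and Cluster 2.]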
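To make the contrast concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the four data points (weight and ear length) are made up, and the two methods used (a nearest-centroid classifier for the supervised case, 2-means clustering for the unsupervised case) are chosen for brevity and are not necessarily the methods used later in the course.

```python
from statistics import mean

# Hypothetical toy data: (weight_kg, ear_length_cm) for four labelled animals.
labelled = [((4.0, 6.5), "cat"), ((4.5, 7.0), "cat"),
            ((20.0, 11.0), "dog"), ((25.0, 12.0), "dog")]


def centroid(points):
    """Average each coordinate of a list of 2-D points."""
    return (mean(p[0] for p in points), mean(p[1] for p in points))


def sq_dist(a, b):
    """Squared Euclidean distance between two 2-D points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def fit_nearest_centroid(data):
    """Supervised: learn one centroid per label from labelled examples."""
    labels = {label for _, label in data}
    return {lab: centroid([x for x, y in data if y == lab]) for lab in labels}


def predict(model, x):
    """Give x the label whose learned centroid is closest."""
    return min(model, key=lambda lab: sq_dist(model[lab], x))


def two_means(points, iterations=10):
    """Unsupervised: split unlabelled points into two clusters (k-means, k=2)."""
    centres = [min(points), max(points)]  # simple deterministic starting centres
    assignment = []
    for _ in range(iterations):
        # Assign every point to its nearest centre, then recompute the centres.
        assignment = [min((0, 1), key=lambda k: sq_dist(centres[k], p))
                      for p in points]
        for k in (0, 1):
            members = [p for p, a in zip(points, assignment) if a == k]
            if members:
                centres[k] = centroid(members)
    return assignment


model = fit_nearest_centroid(labelled)
new_animal_label = predict(model, (5.0, 6.8))  # a new, unlabelled animal

unlabelled = [x for x, _ in labelled]          # same points, labels hidden
clusters = two_means(unlabelled)

print(new_animal_label)  # cat
print(clusters)          # [0, 0, 1, 1]
```

Note that the supervised model returns a named label because it was trained with labels, whereas the clustering only returns group indices (0 and 1); deciding what each cluster means is left to the analyst.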

AI/ML/DL/GenAI/NLP/LLM

[Figure: diagram relating AI, ML, DL, GenAI, LLM, NLP and Linguistics]
ChatGPT vs Traditional Chatbot

Traditional Chatbot
• Pre-determined responses
• Can answer a limited number of questions

ChatGPT
• More generalizable
• Uses its understanding of language to interpret the question and respond
• Wide range of potential applications
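The "pre-determined responses" of a traditional chatbot can be sketched in a few lines of Python. This is a hypothetical keyword-lookup bot, not any particular product: the keyword table and the function name `rule_based_reply` are assumptions made for illustration, and the canned answers simply reuse facts from the course admin slides.

```python
# Hypothetical FAQ table: keyword -> pre-determined response.
CANNED_RESPONSES = {
    "venue": "Lectures are held at LT7.",
    "exam": "The final exam is on 23 Nov, Saturday, 9am.",
    "homework": "There are 4 homework assignments, 8% each.",
}
FALLBACK = "Sorry, I can only answer questions about venue, exam or homework."


def rule_based_reply(question: str) -> str:
    """Return the canned response for the first keyword found in the question."""
    text = question.lower()
    for keyword, response in CANNED_RESPONSES.items():
        if keyword in text:
            return response
    # Anything outside the keyword table cannot be answered.
    return FALLBACK


print(rule_based_reply("When is the exam?"))
print(rule_based_reply("What will the weather be like?"))
```

The second question falls through to the fallback message, which is exactly the "limited number of questions" limitation listed above.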

Why is it called ChatGPT?

What does GPT mean?

Generative: means creating new content (or "next word prediction" in the earlier example).

Pre-trained: the foundation LLM is pre-trained on massive amounts of text from the internet and other public data.

Transformer: the model architecture introduced in "Attention Is All You Need" by Ashish Vaswani et al., published in 2017.
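The idea of "next word prediction" can be illustrated with a toy model. The sketch below is an assumption-laden simplification: it counts word pairs (a bigram model) in a made-up three-sentence corpus, whereas GPT uses a Transformer neural network trained on vastly more text. The names `train_bigram`, `predict_next` and `generate` are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical miniature "pre-training" corpus; a real LLM sees vastly more text.
CORPUS = "the cat sat on the mat . the cat ate the fish . the dog sat on the rug ."


def train_bigram(text):
    """'Pre-train': count, for every word, which words follow it and how often."""
    words = text.split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following


def predict_next(model, word):
    """Return the most frequent follower of `word`, or None if the word is unseen."""
    if word not in model:
        return None
    return model[word].most_common(1)[0][0]


def generate(model, start, length=4):
    """'Generate': repeatedly append the predicted next word."""
    words = [start]
    for _ in range(length):
        nxt = predict_next(model, words[-1])
        if nxt is None:
            break
        words.append(nxt)
    return " ".join(words)


model = train_bigram(CORPUS)
print(predict_next(model, "the"))  # cat
print(generate(model, "the"))
```

Even this tiny model shows why generated text can be fluent yet wrong: it only reproduces patterns that were frequent in its training text, with no check against facts.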

Myth/Fact about ChatGPT

Why does ChatGPT sometimes give a wrong answer or even make stuff up?

Shall I use ChatGPT for my homework?

Does ChatGPT possess human-like emotions?

How is ChatGPT going to change our workplace?

……

Text generator tools for “news bot”

Xiaomingbot is an automated system to generate and read news articles. It employs text generation algorithms to generate news from data and machine translation algorithms to translate the text into multiple languages.

Develop a chatbot……

Many more to see in this course……
Production planning

Many more to see in this course……

Inventory control and management

[Figure: demand over time. y-axis: Demand (50 to 150); x-axis: Month (1 to 58)]
Many more to see in this course……

Transportation planning

Many more to see in this course……

This course covers a complete data lifecycle

Data analytics contains 3 types of analysis:


• Descriptive, answers the question,
“What happened?”
• Predictive, answers the question,
“What might happen in the future?”
• Prescriptive, answers the question,
“What should we do next?”
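As a rough illustration of how the three types build on one another, the Python sketch below applies each to the same small demand series. Everything in it is an assumption made for the example: the demand numbers are invented, the forecast is a simple least-squares trend line, and the ordering rule (forecast plus a safety stock) is a placeholder for the optimisation methods covered later in the course.

```python
from statistics import mean

# Hypothetical monthly demand for one product (units); not data from the lecture.
demand = [100, 104, 108, 112, 116, 120]


def describe(history):
    """Descriptive: what happened? Summarise past demand."""
    return {"mean": mean(history), "min": min(history), "max": max(history)}


def forecast_next(history):
    """Predictive: what might happen? Fit a least-squares line, extrapolate one step."""
    n = len(history)
    xs = range(n)
    x_bar, y_bar = mean(xs), mean(history)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, history))
             / sum((x - x_bar) ** 2 for x in xs))
    intercept = y_bar - slope * x_bar
    return intercept + slope * n


def order_quantity(predicted, on_hand, safety_stock=10):
    """Prescriptive: what should we do? Order enough to cover forecast plus a buffer."""
    return max(0, predicted + safety_stock - on_hand)


summary = describe(demand)
next_month = forecast_next(demand)
to_order = order_quantity(next_month, on_hand=50)

print(summary)     # {'mean': 110, 'min': 100, 'max': 120}
print(next_month)  # 124.0
print(to_order)    # 84.0
```

Each step consumes the output of the previous one: the summary describes the history, the forecast extends it, and the order decision acts on the forecast.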

The data lifecycle describes the path data takes from when it’s first generated to when it’s interpreted into actionable insights.
Source of Figure: https://online.hbs.edu/Documents/a-beginners-guide-to-data-and-analytics.pdf
Lecture 01. Introduction to data analytics and data sampling [1. Data generation, 2. Data collection]

Lecture 02. Descriptive analytics [i. descriptive analytics] Hands-on: Tableau Public

Lecture 03. Art of visualization and storytelling [7. Data visualization, 8. Data presentation]
Homework #1 due on 17:00 Monday, 02 Sep 2024
Hands-on: Power BI
Lecture 04. Statistical inference and data resampling [3. Data exploration]

Lecture 05. Data quality, data wrangling, and data processing [3. Data processing] Hands-on: data processing

Lecture 06. Statistical learning, model selection and regularization [4. Data storage, 5. Data management]
Homework #2 due on 17:00 Monday, 30 Sep 2024
Lecture 07. Classification methods
[6. Data analysis]
[ii. predictive analytics]
Lecture 08. Descriptive data mining

Lecture 09. Introduction to GenAI and LLM [unstructured data] Hands-on: build a chatbot (no coding)
Homework #3 due on 17:00 Monday, 21 Oct 2024
Lecture 10. Monte Carlo simulation Hands-on: simulation with Excel

Lecture 11. Linear programming [iii. prescriptive analytics] Hands-on: LP with Excel Solver

Lecture 12. Decision analysis [Decision-making]


Homework #4 due on 17:00 Monday, 11 Nov 2024
Lecture 13. Summary lecture
Feel free to share your feedback anonymously
with me via this link/QR code throughout the
whole semester.

https://app.sli.do/event/hUgiGrg7Ln8KeEFVyCT9o3