Session 1
INFO9026
Siamak Khayyati
HEC Liège - University of Liège
2024-2025
Some Information
• Location: N1a 59 (0/59)
• Time: Thursdays 16:15-19:15
• Email: [email protected]
• Evaluation:
• Project (5/20)
• Groups of up to 5 students
• Exam (15/20)
• Bonus exercises
Tentative Outline
• Introduction
• What’s machine learning?
• Data preparation
• Linear and logistic regression
• Python basics
• Project introduction
• Time series
• K-nearest neighbors method
• Neural networks
• Decision trees
• Random forests
• Clustering
• Data visualization
• Sample exam
• Wrap up
Outline
• Predictive analytics
• What is machine learning?
• Data preparation with Excel
• Installing Jupyter notebooks
Predictive analytics
• Business analytics
• Knowing what has happened
• Understanding patterns in performance
• Predicting what is most likely to happen
• Identifying the best set of decisions
What is machine learning?
Machine learning
A branch of artificial intelligence: the study of algorithms and statistical models that allow computers to learn automatically from data.
What is machine learning?
Association analysis
35% of the purchases on Amazon are the result of its recommender system [McKinsey 2018]
What is machine learning?
Recommendations are responsible for 70% of the time people spend watching videos on YouTube.
https://www.cnet.com/news/youtube-ces-2018-neal-mohan/
What is machine learning?
• Association Analysis
• Provide Superior Customer Experience (customers benefit from the experience of others)
• Increase conversion rates (convert customers to a certain product/service)
• Anticipate customer behavior
• Target marketing campaigns
• Co-branding activations
What is machine learning?
Segmentation
Manual segmentation
The analyst defines the groups manually
Example: group by income
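Manual segmentation by income can be sketched in a few lines of Python. The customer records and the two income thresholds below are hypothetical values chosen only for illustration; in practice the analyst picks them.

```python
# Hypothetical customers; ids and incomes are made up for illustration.
customers = [
    {"id": 1, "income": 18_000},
    {"id": 2, "income": 42_000},
    {"id": 3, "income": 95_000},
    {"id": 4, "income": 30_000},
]


def income_segment(income, low_max=30_000, mid_max=60_000):
    """Assign a segment from analyst-chosen thresholds (assumed values)."""
    if income <= low_max:
        return "low"
    if income <= mid_max:
        return "middle"
    return "high"


# Map each customer id to its manually defined segment.
segments = {c["id"]: income_segment(c["income"]) for c in customers}
print(segments)
```

The groups come entirely from the analyst's thresholds: nothing is learned from the data.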
What is machine learning?
• Segmentation
• Provide Superior Customer Experience
• Segments based on individual preferences
• Develop Effective Retention Strategies
• Segments based on purchasing data and behaviour
• Better Ad Targeting
• Segments based on the lifecycle of a customer or on social channels they frequent
What is machine learning?
Predictive methods
Example: predict customer response, e.g. ordering a product online with a promotional code (their response to an ad campaign: did they buy something?)
1. Build a prediction rule using a training dataset (historical records)
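As a minimal sketch of what "building a prediction rule from a training dataset" can mean, the code below fits a one-feature threshold rule. The historical records, the feature (number of past purchases) and the helper names `fit_threshold_rule` and `predict` are all hypothetical; the methods covered later in the course are more elaborate.

```python
# Hypothetical historical records:
# (number of past purchases, did the customer respond to the campaign?)
training = [
    (0, False), (1, False), (2, False), (3, True),
    (5, True), (8, True), (1, True), (4, False),
]


def fit_threshold_rule(records):
    """Pick the threshold t for which the rule 'respond if x >= t'
    has the highest accuracy on the training records."""
    best_t, best_acc = None, -1.0
    for t in sorted({x for x, _ in records}):
        correct = sum((x >= t) == y for x, y in records)
        acc = correct / len(records)
        if acc > best_acc:  # ties keep the smallest threshold
            best_t, best_acc = t, acc
    return best_t, best_acc


threshold, accuracy = fit_threshold_rule(training)


def predict(past_purchases):
    """Apply the learned rule to a new customer."""
    return past_purchases >= threshold


print(threshold, accuracy, predict(6))
```

The accuracy reported here is measured on the training records themselves, so it says little about how the rule behaves on new customers.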
What is machine learning?
• Predictive Methods
• Predict consumer behavior
• Predict campaign performance (customer churn rate, i.e. the rate at which customers are lost)
What is machine learning?
• Applications
• Predict churn
• Mobile phone and utilities companies use machine learning to predict ‘churn’, i.e. when a customer leaves their company to get their phone/broadband from another provider.
• Fraud detection
• Identify customer fraud and risk behavior patterns across multiple lines of data. Score risky transactions that need further investigation.
What is machine learning?
• Applications
• CRM: identify prospective customers, social networks, recommendation
• Supply chain management: forecasting sales and production needs, diagnosis of machine faults, …
• Finance: credit risk (the chance that a borrower does not repay a loan or fulfill a loan obligation)
• Medicine: monitoring intensive care patients
What is machine learning?
• Applications
• Image detection
• Optical character recognition (OCR)
Data preparation with Excel
Exercise 1
• Using absolute and relative references
• In a new Excel workbook, efficiently compute a table with the final values of an investment, based on the formula $C \times (1 + r)^t$ for
• $C = 1000$
• $r = 0.01$ to $0.1$ by steps of $0.005$
• $t = 1$ to $20$ by steps of $1$
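For comparison with the Excel table, the same values can be computed in Python. This is only a sketch: the layout (one row per rate, one column per year) and the variable names are assumptions, and only a few year columns are printed.

```python
C = 1000  # initial capital, as in the exercise

# r = 0.01 to 0.1 by steps of 0.005; built from integers to avoid
# accumulating floating-point error when adding 0.005 repeatedly.
rates = [k / 1000 for k in range(10, 101, 5)]
years = range(1, 21)  # t = 1 to 20

# table[r][t] holds the final value C * (1 + r) ** t
table = {r: {t: C * (1 + r) ** t for t in years} for r in rates}

# Print a few columns of the table (t = 1, 5, 10, 20).
shown = (1, 5, 10, 20)
print(f"{'r':>6}" + "".join(f"{t:>10}" for t in shown))
for r in rates:
    print(f"{r:>6.3f}" + "".join(f"{table[r][t]:>10.2f}" for t in shown))
```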
Exercise 2
• The file Ecercise_2.xlsx contains, for each month of a given year, the price of one liter of gas and the number of liters consumed by a given family.
• Compute, in column E, the gas cost for each month.
• Compute, in E15, the total cost for the year.
• Let’s assume that the family negotiates a discount of 1.5% on the gas price for the year. Compute the reduced price for each month in column H, and the reduced total cost in cell H15.
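The arithmetic behind this exercise can be sketched in Python. The monthly prices and quantities below are hypothetical placeholders (the real values are in the workbook), and the variable names are illustrative.

```python
# Hypothetical monthly data: price in EUR per liter and liters consumed.
prices = [1.60, 1.62, 1.58, 1.55, 1.57, 1.61, 1.65, 1.66, 1.63, 1.60, 1.59, 1.64]
liters = [120, 110, 100, 90, 80, 70, 60, 65, 85, 95, 110, 125]

DISCOUNT = 0.015  # 1.5% discount on the gas price

# Cost per month (price * quantity) and total cost for the year.
monthly_cost = [p * q for p, q in zip(prices, liters)]
total_cost = sum(monthly_cost)

# Reduced price per month and reduced total cost for the year.
reduced_prices = [p * (1 - DISCOUNT) for p in prices]
reduced_total = sum(p * q for p, q in zip(reduced_prices, liters))

print(f"Total cost:         {total_cost:.2f}")
print(f"Reduced total cost: {reduced_total:.2f}")
```

Because the same percentage applies to every month, the reduced total equals the original total multiplied by 0.985, which is a quick way to check the spreadsheet result.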
Exercise 3
• The second sheet in Ecercise_2.xlsx only contains the information about the liters of gas consumed.
• Compute manually, in C15, the average number of liters consumed. Compare with the predefined function “average” in B15.
• Compute manually, in C16, the median and compare with the predefined function “median” in B16.
• In which case is it preferable to use the median instead of the mean?
• Compute manually, in B19, the standard deviation. Compare, in C19, with the predefined function “stdev.p”.
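The "manual versus predefined" comparison can also be tried in Python, using the standard `statistics` module as the predefined functions. The consumption values are hypothetical, and the last one is a deliberate outlier added to make the mean and the median differ.

```python
import math
import statistics

# Hypothetical monthly consumption in liters; 400 is a deliberate outlier.
liters = [120, 110, 100, 90, 80, 70, 60, 65, 85, 95, 110, 400]
n = len(liters)

# Mean: sum of the values divided by their count.
mean_manual = sum(liters) / n

# Median: middle value of the sorted data (average of the two middle
# values when the count is even).
ordered = sorted(liters)
mid = n // 2
median_manual = ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2

# Population standard deviation: divides by n, as Excel's STDEV.P does.
std_manual = math.sqrt(sum((x - mean_manual) ** 2 for x in liters) / n)

print("mean  ", mean_manual, statistics.mean(liters))
print("median", median_manual, statistics.median(liters))
print("stdev ", std_manual, statistics.pstdev(liters))
```

With this sample the single large value pulls the mean well above the median.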
Exercise 4
• Importing data
• Import the file Exercise_4.txt
• Compute a new column with the Total retail price per unit
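A rough Python equivalent of this import step is sketched below. Everything about the file layout is an assumption: the tab delimiter, the column names `Quantity` and `Total_Retail_Price`, and the reading of "price per unit" as total retail price divided by quantity. An in-memory string stands in for Exercise_4.txt so the snippet is self-contained.

```python
import csv
import io

# Hypothetical stand-in for Exercise_4.txt; the real delimiter and
# column names may differ.
raw = io.StringIO(
    "Order_ID\tQuantity\tTotal_Retail_Price\n"
    "1\t2\t40.00\n"
    "2\t1\t12.50\n"
    "3\t4\t100.00\n"
)

rows = list(csv.DictReader(raw, delimiter="\t"))

# New column: total retail price divided by the quantity ordered.
for row in rows:
    quantity = int(row["Quantity"])
    total = float(row["Total_Retail_Price"])
    row["Price_Per_Unit"] = total / quantity

for row in rows:
    print(row["Order_ID"], row["Price_Per_Unit"])
```

To read an actual file, `io.StringIO(...)` would be replaced by `open("Exercise_4.txt", newline="")`, with the delimiter adjusted to match the file.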
Exercise 5
• Histograms, samples and joining tables
• Use the file order_complete.xlsx and plot a histogram of the Total Retail Price
[Figure: histogram of the Total Retail Price]
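A histogram counts how many values fall into each interval (bin). The sketch below shows that counting step in plain Python; the price values, the bin width of 50 and the helper name `histogram` are hypothetical.

```python
# Hypothetical Total Retail Price values.
prices = [12.5, 40.0, 100.0, 19.9, 55.0, 230.0, 75.5, 8.0, 149.0, 33.3]


def histogram(values, bin_width):
    """Count the values falling in each bin [start, start + bin_width)."""
    counts = {}
    for v in values:
        start = int(v // bin_width) * bin_width  # lower edge of the bin
        counts[start] = counts.get(start, 0) + 1
    return dict(sorted(counts.items()))


counts = histogram(prices, bin_width=50)

# Text rendering: one '#' per value in the bin.
for start, count in counts.items():
    print(f"[{start:>3}, {start + 50:>3}): {'#' * count}")
```

Bins that contain no value are simply absent from `counts`. With matplotlib installed, `matplotlib.pyplot.hist(prices)` draws a comparable chart, with its own default choice of bins.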
Exercise 6
• In the file order_complete.xlsx, in the sheet customer_dim, replace the term “Orion Club Gold members low activity” by “low” and “Orion Club Gold members high activity” by “high”
• How can you select a random sample of 1000 customers?
• Complete the sheet order_fact with four columns containing information from the other sheets (not by copy and paste, but by looking values up with the product id, the customer id and the street id)
• the product category and product name
• the customer type
• the country
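The three operations in this exercise (replacing labels, looking values up by key, drawing a random sample) can be sketched in Python with tiny hypothetical tables. The ids, the third customer label and the column names are invented for illustration and do not come from order_complete.xlsx.

```python
import random

# Hypothetical miniature version of customer_dim: customer id -> type.
customer_dim = {
    101: "Orion Club Gold members low activity",
    102: "Orion Club Gold members high activity",
    103: "Orion Club members medium activity",  # invented label
}
# Hypothetical miniature version of order_fact.
order_fact = [
    {"order_id": 1, "customer_id": 102},
    {"order_id": 2, "customer_id": 101},
    {"order_id": 3, "customer_id": 103},
]

# 1. Replace the two long labels by short ones; others stay unchanged.
short = {
    "Orion Club Gold members low activity": "low",
    "Orion Club Gold members high activity": "high",
}
customer_dim = {cid: short.get(label, label) for cid, label in customer_dim.items()}

# 2. Join: look up the customer type by customer id for every order.
for order in order_fact:
    order["customer_type"] = customer_dim[order["customer_id"]]

# 3. Random sample without replacement (2 customers here, 1000 in the exercise).
random.seed(0)  # fixed seed so the sample is reproducible
sample_ids = random.sample(sorted(customer_dim), k=2)

print(order_fact)
print(sample_ids)
```

The dictionary lookup in step 2 plays the role of a key-based lookup formula in Excel: each order row fetches its value from another table through a shared id.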
Exercise 7
Python and Jupyter Notebook
• We will use the Python programming language to work with implementations of machine learning algorithms
• To ease the introduction to Python, Jupyter Notebook is used as an integrated development environment
• As a Python distribution, we will use Anaconda
• The following slides explain the required installation steps
Python and Jupyter Notebook
• You can find Anaconda installers at https://repo.anaconda.com/archive/
• Select the correct graphical installer for your operating system and download the executable
• Select and download the Anaconda3-2023.03-1 version
• Open and run the executable file, then click Next
• Select an installation location and click Next; make sure the path contains no spaces
• If the option is available, check the “Add Anaconda to my PATH environment variable” checkbox
• Click Install
Python and Jupyter Notebook
• Go to the Windows Start menu, look for “Jupyter Notebook” and click on it
Python and Jupyter Notebook
• The notebook interface will appear in a new browser window or tab
Python and Jupyter Notebook
• Click on New > Python 3
Python and Jupyter Notebook
• Write the following in the first line: print("Hello, world!")
• Press Shift + Enter to run the code
Python and Jupyter Notebook
• Click on File > Rename to rename the notebook file
Python and Jupyter Notebook
• Click on the save icon to save changes
Python and Jupyter Notebook
• After closing the window, you can navigate in the notebook interface and click on the file name to reopen the notebook
• Installing packages:
• In Anaconda Prompt:
• pip install pandas
• pip install numpy
• pip install matplotlib
• …
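After installing, a notebook cell like the one below can check which of the listed packages Python is able to find. It is a small sketch that uses only the standard library; the package list simply repeats the three names above.

```python
import importlib.util

packages = ["pandas", "numpy", "matplotlib"]

# find_spec returns None when a top-level package cannot be found.
installed = {name: importlib.util.find_spec(name) is not None for name in packages}

for name, ok in installed.items():
    status = "installed" if ok else f"missing (run: pip install {name})"
    print(f"{name}: {status}")
```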
Links
• https://python-course.eu/books/bernd_klein_python_and_machine_learning_a4.pdf