Chapter 0. Course Presentation

The Intermediate Econometrics and Data Analysis (IEDA) course aims to equip students with essential skills in data analysis and programming, focusing on Supervised Machine Learning using Python. The course is structured into four phases covering data preparation, statistical tools, classification algorithms, and a final interdisciplinary project. Assessment includes midterms, a final exam, and participation, with an emphasis on practical application through group projects utilizing classification models.

Uploaded by

mpineau2004


INTERMEDIATE ECONOMETRICS & DATA ANALYSIS
CHAPTER 0
ABOUT IEDA
PART I
A. COURSE PRESENTATION

• Given their significant contribution to business success, the demand for data analysts has been increasing and is expected to continue growing in the coming years.

• In this context, the “Intermediate Econometrics and Data Analysis (IEDA)” course is designed to prepare you for future job opportunities in this expanding field.
B. COURSE PLAN
(1/4)

• The course is designed to address real managerial problems using tools from “Supervised Machine Learning”, which is a subset of “Artificial Intelligence”.

• It consists of thirteen sessions designed to help you master essential skills such as data analysis and programming, with Python as the primary programming language.
B. COURSE PLAN
(2/4)

• The course consists of four main phases:

◦ Phase I (Chapter 1): Data preparation, including handling imbalanced datasets and addressing abnormal or missing values using Python.

◦ Phase II (Chapters 2 and 3): Preparation for data analysis, which includes a review of key statistical tools needed for the course and an overview of the prerequisites for Supervised Machine Learning (such as training/testing sets, cross-validation, and understanding variance and bias errors).
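As a rough illustration of the Phase II prerequisites, the sketch below splits a dataset into training and testing sets and runs 5-fold cross-validation on the training part. It assumes scikit-learn is available; the synthetic data, the K-NN model, and all parameter values are arbitrary choices for illustration, not the course's own script.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic two-class dataset (illustration only, not course data).
X, y = make_classification(n_samples=300, n_features=5, random_state=0)

# Hold out 20% of the observations as a testing set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# 5-fold cross-validation on the training set only.
model = KNeighborsClassifier(n_neighbors=5)
cv_scores = cross_val_score(model, X_train, y_train, cv=5)
print("Cross-validation accuracy per fold:", cv_scores)

# Final check on the held-out testing set.
model.fit(X_train, y_train)
test_accuracy = model.score(X_test, y_test)
print("Testing-set accuracy:", test_accuracy)
```

The testing set is touched only once, at the end, so it gives a fair estimate of how the model behaves on unseen data.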
B. COURSE PLAN
(3/4)

• The course consists of four main phases:

◦ Phase III (Chapters 4 to 7): In-depth exploration of various classification algorithms, including K-Nearest Neighbors, Decision Trees, Random Forest, and Neural Networks.

◦ Phase IV (Last session): Emphasis on the interdisciplinary project, focusing on applying the skills learned throughout the course to real-world scenarios and presenting the results.
B. COURSE PLAN
(4/4)

CHAPTER 0 • ABOUT IEDA

PHASE I
CHAPTER 1 • DATA PREPARATION

PHASE II
CHAPTER 2 • ECONOMETRICS & DATA ANALYSIS
CHAPTER 3 • SUPERVISED MACHINE LEARNING

PHASE III
CHAPTER 4 • K-NEAREST NEIGHBORS (K-NN)
CHAPTER 5 • DECISION TREES (DT)
CHAPTER 6 • RANDOM FOREST (RF)
CHAPTER 7 • NEURAL NETWORK (NN)

PHASE IV
PROJECT • COACHING
C. COURSE MATERIALS

MYCOURSES ORGANIZATION

• Three main components:
1. Chapter slides;
2. Datasets;
3. Python Scripts (i.e., code).

• Three types of Python scripts:
1. Full script;
2. In-Class Practice;
3. At-Home Practice.

• Before the class:
1. Read the chapter slides;
2. Download the chapter’s datasets;
3. Download the “In-Class Practice” files.

• After the class:
1. Review the slides;
2. Download and review the “Full script”;
3. Complete the “At-Home Practice” as homework for the next session.
D. ASSESSMENT METHODS

MIDTERM
• WEIGHT: 30%
• DESCRIPTION: Exercises and MCQs
• CONTENT: Chapters 0 to 3 included
• DATE: Mid-semester

FINAL EXAM
• WEIGHT: 55%
• DESCRIPTION: Mainly consists of exercises
• CONTENT: All chapters
• DATE: End of the semester

PARTICIPATION
• WEIGHT: 15%
• DESCRIPTION: Participation grades consider attendance, class engagement, attitude, and timely homework completion.
E. HOW IEDA DIFFERS FROM THE AI COURSE

IEDA
1. Focus on a specific subset of AI: “Supervised Machine Learning”;
2. Build classification models: Write the underlying algorithms;
3. Select the appropriate model: Evaluate the strengths and weaknesses of each.

AI
1. All subsets of AI: “Machine Learning” (Supervised and Unsupervised), and “Deep Learning”;
4. Application of AI in business: Utilize the algorithms developed in IEDA;
5. AI implementation and ethical issues in business.

→ MANAGERIAL APPROACH
→ SCIENTIFIC APPROACH
ABOUT THE PROJECT
PART II
A. PROJECT PRESENTATION
(1/3)

• The goal is to use classification models from the IEDA and AI courses to
predict the direction of price movement (up vs. down) of the explained
variable, rather than the actual price.

• To accomplish this, the quantitative explained variable (price) is converted into a qualitative variable with two categories: up and down. Please note that selecting the variables and obtaining the data are also part of the job.
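The conversion described above can be sketched with pandas as follows. The prices are made-up numbers, the column names are hypothetical, and the rule "strictly positive change means up, anything else means down" is an assumption made for this illustration; refer to the project guidelines for the rule your group should apply.

```python
import pandas as pd

# Hypothetical daily closing prices (made-up numbers, illustration only).
prices = pd.DataFrame(
    {"price": [100.0, 101.5, 101.0, 102.2, 102.2, 101.8]},
    index=pd.date_range("2024-01-01", periods=6, freq="D"),
)

# Change relative to the previous observation; the first row has no predecessor.
change = prices["price"].diff()

# Assumption: a strictly positive change is "up", everything else is "down".
prices["direction"] = (change > 0).map({True: "up", False: "down"})

# Drop the first row, whose direction is undefined.
prices = prices.iloc[1:]

print(prices)
```

How to treat an unchanged price (here labelled "down") is a modelling choice that should be stated explicitly in the project.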
A. PROJECT PRESENTATION
(2/3)

• Each group (5 students on average) must select only one topic from those
covered in the Financial Markets course.

• For the chosen topic, each group needs to identify one explained variable and
several explanatory variables (refer to the project guidelines for details).

• Your data source will be “Wharton Research Data Services (WRDS)”. Please
register at WRDS Registration to access the data.
A. PROJECT PRESENTATION
(3/3)

• In this regard, a dataset might look like this (just an example):

• Note that you have flexibility regarding the time frame and frequency of the
variables. However, be mindful of the frequencies. For example, do not use
explanatory variables with annual variation (e.g., GDP) to predict an explained
variable that varies daily (e.g., daily stock price) or monthly, and vice versa.
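One possible way to respect this constraint, when a daily series has to be combined with a monthly one, is to bring the daily series down to monthly frequency before building the dataset. The sketch below is only an illustration with made-up values and hypothetical names; keeping the last observation of each month is an assumption, not a course requirement.

```python
import pandas as pd

# Hypothetical daily prices over three months (made-up values).
days = pd.date_range("2024-01-01", "2024-03-31", freq="D")
daily = pd.Series(range(len(days)), index=days, dtype=float, name="price")

# Hypothetical monthly explanatory variable (made-up values).
monthly_x = pd.Series(
    [1.2, 1.4, 1.1],
    index=pd.period_range("2024-01", periods=3, freq="M"),
    name="indicator",
)

# Convert the daily series to monthly frequency: keep the last price of each month.
monthly_price = daily.groupby(daily.index.to_period("M")).last()

# Both series now share a monthly index and can be combined row by row.
dataset = pd.concat([monthly_price, monthly_x], axis=1)
print(dataset)
```

Going the other way (repeating one monthly value on every day of the month) would give an explanatory variable that barely varies at the frequency of the explained variable, which is the situation the slide warns against.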
B. EXPECTATIONS

1. FOR THE PROJECT

• Create four models using Python: K-NN, DT, RF, and NN;

• Compare all the models using performance metrics and evaluate their theoretical
strengths and weaknesses;

• Determine which model is the most effective for your dataset and study context.

2. FOR THE COACHING SESSION

• Ensure that data collection and preparation are completed;

• Bring your results, including the four Python models, to the coaching session.
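To give an idea of what comparing the four models can look like, here is a minimal sketch using scikit-learn on synthetic data. It is not the script provided in the course: the dataset, the hyperparameters, and the choice of accuracy as the only metric are assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic two-class dataset standing in for an "up"/"down" target.
X, y = make_classification(n_samples=400, n_features=6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# One instance of each model family; hyperparameters are arbitrary.
models = {
    "K-NN": KNeighborsClassifier(n_neighbors=5),
    "DT": DecisionTreeClassifier(max_depth=4, random_state=0),
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
    "NN": MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0),
}

# Fit each model on the training set and score it on the testing set.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

# Print the models from best to worst accuracy.
for name, acc in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: accuracy = {acc:.3f}")
```

Accuracy alone is rarely enough, especially with imbalanced classes; other performance metrics (for example precision and recall) and the theoretical strengths and weaknesses of each model should also inform the final choice.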
C. IMPORTANT INFORMATION

• Individual coaching, whether via email or in person, is not possible.

• Collecting data and fixing errors are part of the job.

• You will be provided with the complete Python script for all models.

• The provided code contains no errors. Therefore, adapting it to your own dataset and handling any potential issues are part of the job.

• No additional coaching or error support will be offered outside of the class.

D. CONTACT

• For any questions regarding group composition (e.g., missing members, individuals not assigned to any group, etc.), please contact the project coordinators:
◦ Aljona ZORINA: [email protected]
◦ Marc JOETS: [email protected]

• For any questions related to data, please contact your finance professor or the project coordinators, and for questions about Supervised Machine Learning, please contact your IEDA professor.

• All project information is available on MyCourses:

https://mycourses.ieseg.fr/course/view.php?id=6216
E. SUBMISSION
(1/4)

1. CODE
• Three “.ipynb” files (i.e., Google Colab notebooks) are expected:

◦ One file for K-Nearest Neighbors;

◦ One file for both Decision Trees and Random Forest;

◦ One file for Neural Network.

E. SUBMISSION
(2/4)

2. TEMPLATE

• A Word document should be completed for all courses involved in the AI project. For the IEDA section, you must:
◦ Include and interpret the performance measures of the four models you created.
◦ Analyze the theoretical strengths and weaknesses of each model.
◦ Select the best model for your study and justify your choice.

• In summary, your IEDA professors will evaluate both your code and the IEDA section of the template.

E. SUBMISSION
(3/4)

• Please note that:

◦ If the instructions are not followed (e.g., if you do not submit three .ipynb
files), you will get a zero for the coding part.
◦ Code that has not been adjusted to your dataset (i.e., using the provided
code without modifications) will also result in a zero for the coding part.
◦ Failure to submit the code in the correct format (e.g., submitting a
document instead of an .ipynb file or only providing a link to your code in
the template) will result in a zero for the coding part.
E. SUBMISSION
(4/4)

• A faulty model will result in a zero for the entire IEDA section. For example, if you include both the original prices and the converted qualitative variable (price direction) in your analysis, your results will be biased: using the actual prices to explain variations in the same prices is inherently flawed.
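In code, avoiding this mistake amounts to keeping the original price out of the explanatory variables once the direction has been derived from it. The sketch below uses made-up data and hypothetical column names (`volume`, `interest_rate`); it only illustrates the separation between target and features.

```python
import pandas as pd

# Made-up dataset with a price and two hypothetical explanatory variables.
df = pd.DataFrame(
    {
        "price": [50.0, 51.0, 50.5, 52.0, 53.0],
        "volume": [1000, 1200, 900, 1500, 1300],
        "interest_rate": [0.020, 0.020, 0.021, 0.021, 0.022],
    }
)

# Derive the qualitative target from the price, then drop the undefined first row.
df["direction"] = (df["price"].diff() > 0).map({True: "up", False: "down"})
df = df.iloc[1:]

# Target: the price direction.
y = df["direction"]

# Features: everything except the target AND the price it was derived from.
X = df.drop(columns=["direction", "price"])

print(list(X.columns))
```

A quick check that `price` is absent from `X.columns` before training any model is a cheap safeguard against this error.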
ABOUT PYTHON
PART III
A. INTRODUCTION TO PYTHON
(1/2)

• Python is one of the programming languages most in demand among companies. Its ease of learning does not diminish its power as a tool.

• A programming language is a means of communicating with computers. It involves writing a set of instructions, which programmers use to develop software.

• Once these instructions are organized, they form an algorithm. This algorithm represents the “behind-the-scenes” process of the software we use.

• Other programming languages include Java, JavaScript, C++, Ruby, and more.
A. INTRODUCTION TO PYTHON
(2/2)

• So far, we have used software like SPSS to create models.

• This year, however, we will be building our models from scratch, allowing you to
learn the underlying algorithms.

• Don’t worry, the course is not intended to teach coding. Instead, it focuses on
understanding the logic behind the algorithms and the instructions they require.

• To support this, the code will be provided. Your task will be to adapt the code to
your project dataset by making the necessary adjustments and modifications.
B. PYTHON CODE EDITORS
(1/3)

• In general, languages can be either written or spoken. Programming languages, including Python, fall into the written category.

• Like any other written language, Python requires a medium for writing. While
human languages are written using text editors (e.g., MS Word) or physical
media (e.g., paper), programming languages are written using “code editors”.

• Therefore, you will need a code editor to write, edit, and run Python code.
B. PYTHON CODE EDITORS
(2/3)

• To write Python code, you can choose between an “Integrated Development Environment (IDE)” and a simple code editor.

• On the one hand, an IDE (like PyCharm, Jupyter, Spyder, etc.) is a “program
dedicated to software development”*. It includes a wide range of tools and
features that facilitate coding, but it may require more memory and time to
download and install.
B. PYTHON CODE EDITORS
(3/3)

• On the other hand, a code editor is a text “editor designed to handle codes
(with, for example, syntax highlighting and auto-completion).”*

• For the IEDA course, we will use Google Colab, an online code editor that
only requires a Google account, with no need for download or installation.
C. PYTHON LIBRARIES
(1/3)

• The first step in conducting research is usually a literature review. This process saves time by ensuring we do not duplicate existing knowledge (e.g., researching whether smoking causes cardiovascular disease). It also provides valuable information for our study without starting from scratch.

• Coding is similar to research in this regard. Developers write lines of code, which are then stored in “libraries” and shared with the community for reuse. A Python library can therefore be defined as a “reusable chunk of code”*.
C. PYTHON LIBRARIES
(2/3)

• Using libraries in programming avoids the need to write code from scratch.
Instead, you can search for the appropriate library, import it, and use it.

• Just as in research where you visit a library to find the most relevant book for
your topic, in programming you use libraries to find the tools and functions
needed for your specific task.
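As a small illustration of "import it, and use it", the sketch below imports NumPy and pandas under their conventional aliases and calls existing functions instead of writing the computations by hand. The return values are made-up numbers.

```python
import numpy as np
import pandas as pd

# Made-up returns, for illustration only.
returns = np.array([0.01, -0.02, 0.015, 0.005])

# NumPy already provides the mean; there is no need to code the formula.
mean_return = np.mean(returns)
print("Mean return:", mean_return)

# pandas already provides summary statistics for a table of data.
table = pd.DataFrame({"return": returns})
summary = table.describe()
print(summary)
```

In Google Colab these two libraries are normally available without any installation step, so the `import` lines are all that is required.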
C. PYTHON LIBRARIES
(3/3)

Among the thousands of libraries available, these are the ones we will primarily use:

• pandas: Data cleaning & analysis (missing values, outliers…). https://pandas.pydata.org/
• NumPy: Scientific computing (mathematical operations). https://numpy.org/
• scikit-learn: Machine Learning (PCA, HCA, K-NN, DT, RF…). https://scikit-learn.org/stable/
• Matplotlib: Visualizations (basic graphs). https://matplotlib.org
• seaborn: Superset of the matplotlib library (applies themes and decorates matplotlib graphs). https://seaborn.pydata.org/
• imbalanced-learn: Imbalanced datasets (under-sampling, over-sampling…). https://imbalanced-learn.org/
