Quantum 1: Data-driven decision making
- Python for accessing web data
(17029)
2 ECTS
Professor Benjamin Maury
E-mail: [email protected]
Teaching assistants: Gustav Finne, Anh Nguyen, Sinh Mai, Elyas Saif, Joosua
Virtanen, Xiaogeng Xu and Marco Lambrecht
Course description
This course introduces frameworks for accessing web data and performing textual
analysis with Python 3. It begins with the basics of Python syntax and then focuses on
data access and use. It covers data analysis (pandas) and textual analysis, networked
programs including web scraping (BeautifulSoup), and web services including API
contracts. It also introduces interactive webpage development. The examples are
relevant for financial analysts.
Learning objectives
Upon completion of the course the students will
- have an understanding of the Python syntax,
- be able to perform basic data and textual analyses,
- have a framework for accessing web data, and
- have skills to develop simple interactive webpages.
Course requirements (Scale 0-100 points)
(1) Small projects (a-e), 0-80 points. Select a maximum of 4 (out of 5) small projects to
submit. Each project is individual work. Upload through Moodle before the deadline
(see the Moodle submission section).
a. Python basics [0-20 points] (including printing, comments, variables,
numbers, math, strings, if/else, lists, loops, dictionaries, functions)
b. Data (pandas) and textual analysis [0-20 points] (data analysis with pandas,
OLS regressions, and textual analysis)
c. Networked programs [0-20 points] (opening web pages to be treated like
files, web scraping, BeautifulSoup)
d. Web services [0-20 points] (APIs)
e. Interactive web pages [0-20 points]
(2) Learning diary 0-20 points. Individual work that discusses what you have learned
and reviews the course content and your projects.
• Review the course content and your projects as well as discuss what you have
learned.
• Discuss how you could automate repetitive tasks regarding (i) data access, and (ii)
data & textual analysis for an equity fund that updates its stock portfolio regularly.
• How do you think business students can benefit from Python programming in
general?
Length: 4-6 pages.
See the Assignments document on Moodle for more details.
Submit through the course Moodle page (deadline: October 8).
Total requirements: 50 points to pass the course.
Topics and readings:
Introduction to the course (August 30, 10:15-11:15, online with Microsoft Teams):
Overview of course and requirements
- Installing Python 3 via Anaconda
https://docs.anaconda.com/anaconda/install/windows/
- Using Jupyter Notebook
- Elyas Saif demonstrates how to install Python
Pre-assignment: Watch the video on how to install Python via Anaconda
Session 1 (September 1, 8:30-10:00, online with Microsoft Teams): Introduction to
the Python syntax
- We will review printing, comments, variables, numbers, math, strings, if/else,
lists, loops, dictionaries, functions, and writing scripts
- Elyas Saif demonstrates the basic syntax in Python
Pre-assignment: Watch the session 1 recording (on Moodle) on the Python syntax
Literature:
- Jake VanderPlas “A Whirlwind Tour of Python”
https://jakevdp.github.io/WhirlwindTourOfPython/
Background video:
- Intro to Python for Business with Mattan Griffel (Columbia University)
https://www.youtube.com/watch?v=32LiJFZC484
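To give a flavor of the Session 1 topics, the following minimal sketch combines a dictionary, a function with if/else, a list, a loop, and basic math. It is not taken from the course material: the tickers and prices are made-up values, and the function name classify is only an illustration.

```python
# Illustrative only: the tickers and prices below are made-up values.
prices = {"AAA": 12.5, "BBB": 40.0, "CCC": 7.25}  # dictionary: ticker -> price


def classify(price, threshold=10.0):
    """Return a label for a price using if/else (hypothetical helper)."""
    if price >= threshold:
        return "above"
    else:
        return "below"


labels = []  # a list that collects one formatted string per ticker
for ticker, price in prices.items():  # loop over the dictionary
    labels.append(f"{ticker}: {price:.2f} is {classify(price)} the threshold")

average = sum(prices.values()) / len(prices)  # simple arithmetic

print("\n".join(labels))
print("Average price:", round(average, 2))
```

Changing the threshold argument or adding tickers to the dictionary is a quick way to check how each piece of syntax behaves.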
Session 2 (September 10, 8:30-10:30, online with Microsoft Teams): Data analysis
with pandas and textual analysis
- Using pandas and other packages for data analysis
- Anh Nguyen shows how to (i) import data files to Jupyter, (ii) run regressions
with the data, and (iii) visualize data
- Joosua Virtanen will demonstrate the use of textual analysis
Pre-assignment: Watch the session 2 recording (on Moodle) on pandas and
textual analysis
Literature:
- Dr. Charles R. Severance: “Python for Everybody: Exploring Data Using
Python 3” (Chapter 11)
https://www.py4e.com/book.php
- Pandas:
https://pandas.pydata.org/pandas-docs/stable/getting_started/index.html
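As a rough illustration of what textual analysis can mean in this session, the sketch below counts words in a short text and computes a simple tone score. It uses only the Python standard library rather than pandas, and it is an assumption about one possible approach, not the method demonstrated in the session. The example text, the two word lists, and the names tokenize and tone are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical example text and word lists, for illustration only.
text = "Profit increased strongly this year. Costs increased slightly."
positive_words = {"profit", "strongly", "growth"}
negative_words = {"loss", "costs", "decline"}


def tokenize(s):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", s.lower())


tokens = tokenize(text)
counts = Counter(tokens)  # word -> number of occurrences

# A Counter returns 0 for words that do not occur in the text.
pos = sum(counts[w] for w in positive_words)
neg = sum(counts[w] for w in negative_words)

# Tone: net share of positive words among all tokens.
tone = (pos - neg) / len(tokens)

print(counts.most_common(3))
print("Tone score:", tone)
```

In practice the word lists would come from an established dictionary, and the per-document scores could be stored in a pandas DataFrame for further analysis.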
Session 3 (September 14, 8:30-10:30, online with Microsoft Teams): Networked
programs
- Networked programs are reviewed (HTTP) including web scraping
(BeautifulSoup)
- Accessing web data that can be treated like a file
- Joosua Virtanen and Sinh Mai demonstrate the use of BeautifulSoup
Pre-assignment: Watch the session 3 recording (on Moodle) on networked
programs
Literature:
- Dr. Charles R. Severance: “Python for Everybody: Exploring Data Using
Python 3” (Chapter 12)
https://www.py4e.com/book.php
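The idea behind web scraping can be sketched without a network connection by parsing an HTML string. The example below deliberately uses the standard library's html.parser instead of BeautifulSoup so that it runs without extra installs. The HTML snippet and the class name LinkCollector are made up for illustration. With BeautifulSoup, a comparable result would typically come from calling find_all("a") on the parsed document.

```python
from html.parser import HTMLParser

# Made-up HTML standing in for a page that would normally be downloaded.
html_doc = """
<html><body>
<h1>Example company news</h1>
<a href="/reports/q1.html">Q1 report</a>
<a href="/reports/q2.html">Q2 report</a>
</body></html>
"""


class LinkCollector(HTMLParser):
    """Collect the href attribute of every <a> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the current tag.
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.links.append(value)


collector = LinkCollector()
collector.feed(html_doc)
print(collector.links)
```

For a live page, the HTML string would instead be read from the response of an HTTP request, which is the part of the workflow where a web page is treated like a file.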
Session 4 (September 17, 8:30-10:00, online with Microsoft Teams): Web services
- Accessing web data via APIs
- We will use Python to access databases in Quantum through APIs
- Gustav Finne will demonstrate the use of APIs
Pre-assignment: Watch the session 4 recording (on Moodle) on web services
Literature:
- Dr. Charles R. Severance: “Python for Everybody: Exploring Data Using
Python 3” (Chapter 13)
https://www.py4e.com/book.php
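Accessing data via an API usually involves two steps: building a request URL with query parameters and parsing the JSON that comes back. The sketch below shows both steps without making a network call. The endpoint, the parameter names, and the JSON payload are hypothetical and do not describe the APIs used in the session.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint and parameters; this is not a real service.
base_url = "https://api.example.com/v1/prices"
params = {"symbol": "AAA", "from": "2021-09-01", "to": "2021-09-03"}
request_url = f"{base_url}?{urlencode(params)}"

# Made-up JSON text of the kind an API might return for that request.
response_text = """
{"symbol": "AAA",
 "prices": [{"date": "2021-09-01", "close": 10.0},
            {"date": "2021-09-02", "close": 10.5},
            {"date": "2021-09-03", "close": 10.2}]}
"""

data = json.loads(response_text)  # JSON text -> Python dict and lists
closes = [row["close"] for row in data["prices"]]

# Simple return from the first to the last closing price.
total_return = closes[-1] / closes[0] - 1

print(request_url)
print(f"Return over the period: {total_return:.2%}")
```

With a real service, response_text would be the body of the HTTP response, and the documentation of that API (its contract) would define the available parameters and the structure of the JSON.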
Session 5 (September 21, 8:30-10:00, online with Microsoft Teams): Basics of
developing webpages
- Introduction to oTree (software based on Django and Python)
- Introduction to Django (a Python-based web framework)
- Structure of oTree (models, pages, templates)
- A simple example of a webpage
- Marco Lambrecht will demonstrate the basics of developing webpages
Session 6 (September 22, 8:30-10:00, online with Microsoft Teams): Development
of interactive webpages
- Motivation: Why program webpages instead of using easy platforms like
Qualtrics?
- Developing a simple interactive app
- Deploying an app on an external host
- Styling webpages
- Xiaogeng Xu will demonstrate the development of interactive webpages
(JavaScript, HTML, and CSS)
Optional session (September 27, 8:30-10:00, online with Microsoft Teams):
- The session is not mandatory.
- Xiaogeng Xu and Marco Lambrecht will be available to help with the webpage assignment.
Updated: August 28, 2021.
Changes are possible.