Python Syllabus

Python as a tool for data collection and analysis

Lecturer: Alla A. Tambovtseva

Class teacher: Alla A. Tambovtseva

Course description
This is an optional one-module course. Students will learn the basics of programming, methods
for processing and visualizing qualitative and quantitative data, and approaches to retrieving
information from the Internet using web scraping and API requests. The ultimate goal of the
course is to provide students with techniques useful for data collection, data visualization and
exploratory data analysis. Classes will be taught in Russian, and materials will be available in
both Russian and English.
Course prerequisites
No special requirements.
Course objectives
The course aims to develop basic programming skills and to teach methods of data processing
and visualization using Python libraries, as well as methods of collecting data from the Web.
Learning outcomes
At the end of their studies, students should be able to:
• Work in Jupyter Notebook, use interactive widgets and other elements of interaction with
  users.
• Apply knowledge of different data structures to solve practical problems.
• Use conditional structures, loops and functions to work with real data.
• Apply methods of data processing and exploratory analysis using Pandas.
• Visualize qualitative and quantitative data using graphical Python libraries.
• Collect data from the Internet via web scraping and API requests.
Literature
1. Introduction to Python and Jupyter Notebook.
Anaconda 3 and Jupyter Notebook: installation and usage. Interface of Jupyter Notebook.
Variables and basic data types in Python.
Main texts

• Федоров Д. Ю. Программирование на языке высокого уровня Python [Programming in the
  High-Level Language Python]. Textbook for secondary vocational education. Moscow: Yurait
  Publishing, 2019. Electronic text // Yurait electronic library system -
  https://biblio-online.ru/book/programmirovanie-na-yazyke-vysokogo-urovnya-python-446505

Supplementary reading

• Программирование на Python [Programming Python]. Vol. 1, Лутц М., 2013


2. Data structures in Python: lists, tuples, dictionaries.
Mutability and immutability in programming. Lists and methods on lists. Tuples and
methods on tuples. Lists vs tuples. Dictionaries and methods on dictionaries. Dictionaries and
JSON files.
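As an illustration of the topics in this unit, here is a minimal sketch (not taken from the course materials) contrasting a mutable list with an immutable tuple and showing a dictionary round-tripped through JSON text. The variable names and sample values are hypothetical.

```python
import json

# Lists are mutable: they can be changed in place.
grades = [8, 9]
grades.append(10)

# Tuples are immutable: item assignment raises TypeError.
point = (1, 2)
try:
    point[0] = 5
except TypeError as error:
    tuple_error = str(error)

# A dictionary maps keys to values and converts naturally to JSON text.
student = {"name": "Anna", "grades": grades}  # hypothetical sample record
text = json.dumps(student)
restored = json.loads(text)

print(grades)
print(tuple_error)
print(restored == student)
```

The dictionary survives the round trip unchanged here because it holds only strings, integers and a list, all of which have direct JSON equivalents; a tuple stored in the dictionary would come back as a list.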
Main texts

• Федоров Д. Ю. Программирование на языке высокого уровня Python [Programming in the
  High-Level Language Python]. Textbook for secondary vocational education. Moscow: Yurait
  Publishing, 2019. Electronic text // Yurait electronic library system -
  https://biblio-online.ru/book/programmirovanie-na-yazyke-vysokogo-urovnya-python-446505

Supplementary reading

• Программирование на Python [Programming Python]. Vol. 1, Лутц М., 2013

3. Control structures and functions in Python.
If-else conditional structures. For loops and while loops. User-defined functions. Local and
global variables. Lambda functions. Code debugging.
Main texts

• Федоров Д. Ю. Программирование на языке высокого уровня Python [Programming in the
  High-Level Language Python]. Textbook for secondary vocational education. Moscow: Yurait
  Publishing, 2019. Electronic text // Yurait electronic library system -
  https://biblio-online.ru/book/programmirovanie-na-yazyke-vysokogo-urovnya-python-446505

Supplementary reading

• Программирование на Python [Programming Python]. Vol. 1, Лутц М., 2013

4. Working with data frames in Python with libraries NumPy and Pandas.
NumPy arrays for data analysis. Basic data handling using Pandas methods. Grouping and
aggregating data. Merging and melting data. Gathering descriptive statistics for
exploratory analysis.
Main texts

• Nelli, F. (2018). Python Data Analytics: With Pandas, NumPy, and Matplotlib (2nd ed.).
  New York, NY: Apress. Retrieved from
  http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1905344

Supplementary reading

• Изучаем pandas: высокопроизводительная обработка и анализ данных в Python [Learning
  pandas: High-Performance Data Processing and Analysis in Python], Хейдт М., Груздева А. В.,
  2018

5. Data visualization with graphical Python libraries.
Visualizing mathematical functions in Python. Visualization of qualitative and quantitative
data with Matplotlib. Visualization of data with Seaborn.
Main texts

• Nelli, F. (2018). Python Data Analytics: With Pandas, NumPy, and Matplotlib (2nd ed.).
  New York, NY: Apress. Retrieved from
  http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1905344

Supplementary reading

• Python и анализ данных [Python for Data Analysis], Маккинли У., Слинкина А. А., 2015

6. Collecting data from the Web using Python.
Introduction to HTML and web design. Parsing HTML files in Python with the libraries
requests and BeautifulSoup. Introduction to CSS selectors. Browser automation with
Selenium. API as a source of data. Working with the APIs of social networks.
Main texts

• Nair, V. G. (2014). Getting Started with Beautiful Soup. Birmingham, UK: Packt
  Publishing. Retrieved from
  http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=691839

Supplementary reading

• Скрапинг веб-сайтов с помощью Python: сбор данных из современного интернета [Web
  Scraping with Python: Collecting Data from the Modern Web], Митчелл Р., Груздева А. В., 2016

Grading system

The course grade consists of homework (40%), online practice (20%), and the final project (40%).
The homework grade is the average of the grades for all homework assignments. The online
practice grade is the percentage of tasks completed divided by 10. If the deadline for a home
assignment or the project is missed for a valid reason, an additional date may be set. There are
no blocking elements. Only the final project is subject to retakes.
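
The weighting described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names and the sample grades are hypothetical, all components are taken to be on a 10-point scale, and no rounding is applied because the syllabus does not specify a rounding rule here.

```python
def online_practice_grade(percent_completed):
    """Percentage of tasks completed divided by 10 (e.g. 80% gives 8.0)."""
    return percent_completed / 10


def course_grade(homework_grades, percent_completed, project_grade):
    """Weighted course grade: 40% homework, 20% online practice, 40% project."""
    homework = sum(homework_grades) / len(homework_grades)  # average of all homeworks
    online = online_practice_grade(percent_completed)
    return 0.4 * homework + 0.2 * online + 0.4 * project_grade


# Hypothetical student: homework grades 8, 10 and 9; 80% of tasks done; project grade 7.
example = course_grade([8, 10, 9], 80, 7)
print(round(example, 2))
```

For this hypothetical student the homework average is 9 and the online practice grade is 8, so the weighted sum is 0.4 × 9 + 0.2 × 8 + 0.4 × 7.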

Homework assignments consist of programming problems that vary in difficulty and are worth
different numbers of points. Each homework states at its beginning the minimum number of points
sufficient for the maximum grade, so a student may decide which problems to solve. For example,
if at least 8 points are needed to get a grade of 10, the student can solve 8 one-point
problems or 4 two-point problems.

Submission after the deadline incurs a penalty: 15% for a delay of up to 1 hour, 30% for a delay
of up to 1 day, 50% for a delay of up to 2 days. Assignments submitted later are not graded. If a
student's code cannot be run because of syntax errors, it may be graded as 0. If plagiarism is
detected in homework, all works involved will receive a score of 0. The programme administration
will be notified in writing of any cases of academic misconduct.
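
The late-submission schedule can be illustrated with a short sketch. Several points are assumptions rather than statements from the syllabus: that the penalty is a proportional reduction of the earned grade, that the boundaries (exactly 1 hour, 1 day, 2 days) fall into the milder tier, and that "not graded" is represented as a grade of 0. The function names are hypothetical.

```python
from datetime import timedelta
from typing import Optional


def late_penalty(delay: timedelta) -> Optional[float]:
    """Return the penalty fraction for a delay, or None if the work is not graded."""
    if delay <= timedelta(0):
        return 0.0          # submitted on time
    if delay <= timedelta(hours=1):
        return 0.15
    if delay <= timedelta(days=1):
        return 0.30
    if delay <= timedelta(days=2):
        return 0.50
    return None             # more than 2 days late: not graded


def penalized_grade(grade: float, delay: timedelta) -> float:
    """Apply the penalty proportionally (assumption); ungraded work counts as 0."""
    penalty = late_penalty(delay)
    if penalty is None:
        return 0.0
    return grade * (1 - penalty)


print(penalized_grade(10, timedelta(minutes=30)))
print(penalized_grade(10, timedelta(hours=5)))
print(penalized_grade(10, timedelta(days=3)))
```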

Online practice consists of tasks on the online platform DataCamp
(https://www.datacamp.com/home); free access is provided to students. Online practice should be
completed before the deadline (usually the next class); late submissions will not be graded.

For the final project, students are expected to write a program of practical use that requests
some input from a user, retrieves data from the Internet and processes these data. Students
should submit two files: a file with Python code (an ipynb file or a py file) and a file with
documentation for this code that describes its aims, usage and limitations. The project can be
done individually or in groups of up to 3 people.

The final grades are also converted to 10-point and 5-point grades in accordance with the ICEF
Grading Regulations (par. 3). Retakes are organized in accordance with the HSE Interim and
Ongoing Assessment Regulations (incl. Annex 8 for ICEF). Grade determination after retakes is
done in accordance with the ICEF Grading Regulations (par. 5). Sample materials for knowledge
assessment are available in the ICEF Information system and on request.

Distribution of hours of the course by topics and types of work

№  Topic                                                                Total  Lectures  Classes  Self-study
1  Introduction to Python and Jupyter Notebook                          4      2         2        2
2  Data structures in Python: lists, tuples, dictionaries               4      2         2        2
3  Control structures and functions in Python                           4      2         2        2
4  Working with data frames in Python with libraries NumPy and Pandas   8      4         4        4
5  Data visualization with graphical Python libraries                   4      2         2        2
6  Collecting data from the Web using Python                            8      4         4        4
   Total                                                                32     16        16       16
