Python for Econometrics

Lecturer: Fabian H. C. Raters

Institute: Econometrics, University of Goettingen

Version: April 28, 2019

© 2019 PyEcon.org. All rights reserved. Python is a trademark of the PSF.


Learning Python for econometrics

Welcome to this course and to the world of Python!

Learning objectives of this course:

Python: The course is about Python programming.
for: You will learn tools and methods.
Econometrics:
  Statistics: Numerical programming in Python.
  applied to: We will use it on examples.
  Economics: In an economic context.

Learning Python for econometrics

Knowledge after completing this course:

You have acquired a basic understanding of programming in general with Python and a special knowledge of working with standard numerical packages.
You are able to study Python in depth and absorb new knowledge for your scientific work with Python.
You know the capabilities and further possibilities to use Python in econometrics.

Learning Python for econometrics

What you should not expect from this course:

A guide on how to install or maintain an application.
An introduction to programming for beginners.
An introduction to professional development tools.
Non-scientific, general purpose programming (beyond the language essentials).
Little content and little effort...

Course organisation

This course can be seen as an applied lecture:

Lecture:
  We try to explain the partly theoretical knowledge on Python by simple, easy to understand examples. You can learn the programming language’s subtleties by reading literature.
Exercises:
  Digital work sheets in the form of Jupyter notebooks with applied tasks are available for each chapter. For all exercises there are sample solutions available in separate notebooks.
Self-tests:
  At the end of each of the five chapters there are typical exam questions.
Written exam:
  There will be a final exam. This will be a pure multiple choice exam: 60 questions, 90 minutes.

After the successful participation in the exam you will receive 6 ECTS.

Literature

The programming language Python is already established and very much in trend for numerical applications. Some keywords:

Data science,
Data wrangling,
Machine learning,
Numerical statistics,
...

Recommended literature while following this course:

Learning Python, 5th Edition by Mark Lutz,
Python Crash Course by Eric Matthes,
Python Data Science Handbook by Jake VanderPlas,
Python for Data Analysis, 2nd Edition by Wes McKinney,
Python for Finance by Yves Hilpisch.

Software: Python 3

We are using Python 3. There was a big revision in the migration from Python 2 to version 3 and the new version is no longer backwards compatible with the old version.

Python 3 running [command line]
    python3 --version

    ## Python 3.6.7

The normal execution mode is that the Python interpreter processes the instructions in the background – in other numeric programming languages such as R this is known as batch mode. It executes program code that is usually located in a source code file.

The interpreter can also be started in an interactive mode. It is used for testing and analytical purposes in order to obtain fast results when performing simple applications.

Software: IDEs

For everyday work with Python it would be extremely tedious to make all edits in interactive mode.

There are a number of excellent integrated development environments (IDEs) for Python, with three being emphasized here:

Jupyter (and IPython)
Spyder (scientific IDE)
PyCharm (by JetBrains)

Of course, you can also use a simple text editor. However, you would probably miss the comfort of an IDE.

Installing, adding and maintaining Python is not trivial at the beginning. Therefore, as a beginner, you are well advised to download and install the Python distribution Anaconda. Bonus: Many standard packages are supplied directly or you can post-install them conveniently.

Following this course

In this course – in a numerical and analytical context – we use only Jupyter with the IPython kernel.

That is why we have combined

1 all the code from the slides, and
2 all the exercises and solutions

into interactive Jupyter notebooks that you can use online without having to install software locally on your computer. The GWDG has set up a cloud-based Jupyter-Hub for you.

You can access the working environment with your university credentials at

https://jupyter.gwdg.de/

create a profile and get started right away – even using your smart devices. However, so far you are still asked to upload the course notebooks by yourself or rewrite the code from scratch.

Notebook workflow

A Jupyter notebook is divided into individual, vertically arranged cells, which can be executed separately:

[Figure not reproduced: notebook cells]

The notebook approach is not novel and comes from the field of computer algebra software.

Notebook workflow

Actually, an interactive Python interpreter called IPython is started “in the core”.

IPython running [command line]
    ipython3 --version

    ## 6.5.0

Roughly speaking, this is a greatly enhanced version of the Python 3 interpreter, which has numerous convenient advantages over the “normal” interpreter in interactive mode, such as

printing of return values,
color highlighting, and
magic commands.

Following this course

Finally, we wish you a lot of fun and success with and in this course!

Practice makes perfect!

Contribution and credits:

Fabian H. C. Raters
Eike Manßen
GWDG for the Jupyter-Hub

Table of contents

1 Essential concepts
  1.1 Getting started
  1.2 Procedural programming
  1.3 Object-orientation
2 Numerical programming
  2.1 NumPy package
  2.2 Array basics
  2.3 Linear algebra
3 Data formats and handling
  3.1 Pandas package
  3.2 Series
  3.3 DataFrame
  3.4 Import/Export data
4 Visual illustrations
  4.1 Matplotlib package
  4.2 Figures and subplots
  4.3 Plot types and styles
  4.4 Pandas layers
5 Applications
  5.1 Time series
  5.2 Moving window
  5.3 Financial applications

Chapter 1

Essential concepts

1.1 Getting started
1.2 Procedural programming
1.3 Object-orientation

Section 1.1

Essential concepts

Getting started

Motivation for learning Python

Python can be described as

a dynamic, strongly typed, multi-paradigm and object-oriented programming language,
for versatile, powerful, elegant and clear programming,
with a general, high-level, multi-platform application scope,
which is being used very successfully in the data science sector and is very much in trend.

Moreover, Python is relatively easy to learn and its successful language design supports everyone from novices to professional developers.

A short history of time

... of the Python era:

The language was first released in 1991 by Guido van Rossum. Its name was based on Monty Python’s Flying Circus. Its main identification feature is the novel markup of code blocks – by indentation:

Indentation example
    password = input("I am your bank. Password please: ")

    ## I am your bank. Password please: sparkasse

    if password == "sparkasse":
        print("You successfully logged in!")
    else:
        print("Fail. Will call the police!")

    ## You successfully logged in!

This increases the readability of code and should at the same time encourage the programmer to write neat code. Since the source code can be written more compactly with Python, an increased efficiency in daily work can be expected.

A short history of time

Overview of the Python development by versions and dates:

[Timeline figure, axis from 1990 to 2020, with the milestones Python’s birthday, Python 2.0, Python 3.0 and Python 3.6, and the annotations “Python 2.7 lives forever” and “Python 2.7 will die”]

In comparison

Comparing the way Python works with common programming languages, we briefly discuss a selection of popular competitors:

C/C++:
  CPython is interpreted, not compiled.
  C/C++ are statically typed, complex languages.
Java:
  CPython is not compiled just-in-time.
  Java has a C-type syntax.
MATLAB:
  In Python you primarily follow a scalar way of thinking, while in MATLAB you write matrix-based programs.
  In the numerical context, the matrix view and syntax are very similar to those of MATLAB.
  MATLAB is partially compiled just-in-time.

Here, CPython is the reference implementation – the “Original Python” – which is itself implemented in C.

In comparison

R:
  In Python you primarily follow a scalar way of thinking, while in R you write vector-based programs.
  R has a C-type syntax including additions to novel language concepts.
Stata:
  Any comparison would inadequately describe the differences.

Reference semantics

An extremely important difference between the first two languages, C/C++ and Java, together with Python itself, and the last three languages is that the former follow call-by-reference semantics, while MATLAB, R and Stata are call-by-copy.

Further specific differences and similarities to MATLAB and R will be addressed in other parts of this course.

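To make the reference semantics concrete for Python, here is a minimal sketch. It shows that assignment and argument passing hand over a reference to the same object rather than a copy. The variable names and the helper add_observation are made up for illustration and are not part of the course material.

```python
import copy

# Assignment binds a second name to the same list object; nothing is copied.
prices = [100.0, 101.5, 99.8]
alias = prices
alias.append(102.3)          # also visible through the name "prices"

# An explicit (shallow) copy creates an independent list.
snapshot = copy.copy(prices)
snapshot.append(0.0)         # does not affect "prices"


def add_observation(series, value):
    """Append in place; the caller's list object is modified as well."""
    series.append(value)


add_observation(prices, 103.1)

print(alias is prices)       # True: both names refer to one object
print(prices)
print(snapshot)
```

If an independent object is needed, it has to be requested explicitly, for example with copy.copy() or, for nested containers, copy.deepcopy().
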
Versatility – diversity

Python has become extremely popular:

[Figure not reproduced: chart taken from the source below]

Source: https://stackoverflow.blog/2017/09/06/incredible-growth-python/

Versatility – diversity

So, you’re on the right track – because who wants to bet on the wrong hoRse?

[Figure not reproduced: chart taken from the source below]

Source: https://stackoverflow.blog/2017/09/06/incredible-growth-python/

Versatility – diversity

Areas in which Python is used with great success:

Scripts,
Console applications,
GUI applications,
Game development,
Website development, and
Numerical programming.

Places where Python is used:

[Figure not reproduced]

Yet another outline

In this course we will successively gain the following insights:

1 General basics of the language.
2 Numerical programming and handling of data sets.
3 Application to economic and analytical questions.

Section 1.2

Essential concepts

Procedural programming

The first program

Programs can be implemented very quickly – this is a pretty minimal example. You can write this command to a text file of your choice and run it directly on your system:

Hello there
    print("Hello there!")

    ## Hello there!

Only one function print() (shown here as a keyword),
Function displays argument (a string) on screen,
Arguments are passed to the function in parentheses,
A string must be wrapped in " " or ' ',
No semicolon at the end.

User input

Let’s add a user input to the program:

Hello you
    name = input("Please enter your name: ")

    ## Please enter your name: Angela Merkel

    print("Hello " + name + "!")

    ## Hello Angela Merkel!

The function input() is used for interactive text input,
You can use the equal sign = to assign variables (here: name),
Strings can be joined by the (overloaded) operator +.

Determining weekdays

We are now trying to find out on which weekday a person was born (Merkel’s birthday is 17-07-1954):

Weekday of birth
    from datetime import datetime

    answer = input("Your birthday (DD-MM-YYYY): ")

    ## Your birthday (DD-MM-YYYY): 17-07-1954

    birthday = datetime.strptime(answer, "%d-%m-%Y")
    print("Your birthday was on a " + birthday.strftime("%A") + "!")

    ## Your birthday was on a Saturday!

It is really easy to import functionality from other modules,
Function strptime() is a method of class datetime,
Both methods, strptime() and strftime(), are used to convert between strings and date time specifications.

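The two conversion directions can be seen side by side in the following short sketch. The date string and the variable names are chosen freely for illustration; only strptime(), strftime() and weekday() from the standard datetime module are used.

```python
from datetime import datetime

# String -> datetime: the format codes describe how the text is laid out.
stamp = datetime.strptime("14-03-2018 09:30", "%d-%m-%Y %H:%M")

# datetime -> string: the same object rendered in a different layout.
iso_text = stamp.strftime("%Y-%m-%d")

# weekday() counts from Monday = 0 and does not depend on the system locale,
# whereas the weekday name produced by "%A" follows the locale settings.
weekday_index = stamp.weekday()

print(stamp)
print(iso_text)
print(weekday_index)
```
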
Time since birth

And how many days have passed since then (until Merkel’s 4th swearing-in as Federal Chancellor)?

Age in days
    # Continues the previous example: datetime is imported, birthday is defined
    someday = datetime.strptime("14-03-2018", "%d-%m-%Y")
    print("You are " + str((someday - birthday).days) + " days old!")

    ## You are 23251 days old!

You can create time differences, i. e., the operator - is overloaded,
The difference represents a new object, with its own attributes, such as days,
When using the overloaded operator +, you have to explicitly convert the number of days by means of str() into a string.

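As a small, self-contained sketch of what the difference object offers: subtracting two datetime objects yields a timedelta, and a timedelta can in turn be added to a datetime. The dates and names below are arbitrary illustrations.

```python
from datetime import datetime, timedelta

start = datetime(2018, 3, 14)
end = datetime(2018, 4, 1, 12, 0)

# Subtracting two datetime objects yields a timedelta object.
gap = end - start
whole_days = gap.days                      # full days in the difference
seconds_left = gap.seconds                 # remaining seconds within the last day
total_hours = gap.total_seconds() / 3600   # the whole difference in hours

# A timedelta can also be added to a datetime to shift it.
two_weeks_later = start + timedelta(weeks=2)

print("Gap of " + str(whole_days) + " days and " + str(seconds_left) + " seconds")
print(two_weeks_later)
```
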
Time since birth

How many years, weeks and days do you think that is?

Human readable age
    # Continues the previous example: someday and birthday are defined
    from dateutil.relativedelta import relativedelta

    delta = relativedelta(someday, birthday)
    print(f"That's {delta.years} years, {delta.months} months "
          f"and {delta.days} days!!")

    ## That's 63 years, 7 months and 25 days!!

You don’t have to keep reinventing the wheel – a wealth of packages and individual modules are freely available,
A lowercase f before "..." provides convenient formatting – there are other options as well,
Two strings in sequence are implicitly joined together – "That" "'s nice"!

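The formatting options mentioned above can be sketched as follows. The values and names are invented for illustration; the format specifications (alignment, width, thousands separator, precision, percent) belong to Python's standard format mini-language.

```python
ticker = "DAX"
level = 12345.678
rate = 0.0375

# f-string with format specs: left-align in 6 characters, right-align in 12
# with thousands separator and two decimals, right-align in 8 as a percentage.
row = f"{ticker:<6}{level:>12,.2f}{rate:>8.2%}"

# One of the "other options": the str.format() method.
older_style = "{} closed at {:.1f}".format(ticker, level)

# Adjacent string literals are joined by the parser.
joined = "That" "'s nice"

print(row)
print(older_style)
print(joined)
```
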
Getting help

When working with the interactive interpreter, i. e., in a notebook, you can quickly get useful information about Python objects:

Help system
    help(len)

    ## Help on built-in function len in module builtins:
    ##
    ## len(obj, /)
    ##     Return the number of items in a container.

Alternatively, e. g., for more complex problems, it is best to search directly with your preferred internet search engine.

You can find neat solutions to conventional challenges in literature.

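Besides help(), the same information can be reached programmatically. The following sketch is only an illustration with freely chosen names; it relies on the built-in dir() function and on the __doc__ attribute that holds an object's documentation string.

```python
# The text that help() displays is stored in the object's __doc__ attribute.
doc_text = len.__doc__

# dir() lists the attribute names of an object; here the public str methods.
string_methods = [name for name in dir(str) if not name.startswith("_")]

print(doc_text)
print(string_methods[:5])
```
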
Lexical structure

As with natural language, programming languages have a lexical structure. Source code consists of the smallest possible, indivisible elements, the tokens. In Python you can find the following groups of elements:

Literals
Variables
Operators
Delimiters
Keywords
Comments

These terms give us a rock-solid foundation for exploring the heart of a programming language.

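To see tokens in practice, the standard library module tokenize can split a line of source code. This is only an illustrative sketch; the sample line is made up. Note that the categories reported by tokenize are coarser than the groups listed above: variable names and keywords both appear as NAME, and operators and delimiters both appear as OP.

```python
import io
import tokenize

source = "x = 7 + 3  # sum\n"

# generate_tokens() expects a readline-like callable that returns text lines.
tokens = [
    (tokenize.tok_name[tok.type], tok.string)
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
]

for kind, text in tokens:
    print(f"{kind:<10}{text!r}")
```
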
Literals and variables

Basically, we distinguish between literals and variables:

Assigning variables with literals
    myint = 7
    myfloat = 4.0
    myboat = "nice"
    mybool = True
    myfloat = myboat

In this course, we will work with four different literals: integer (7), float (4.0), string ("nice") and boolean (True),
Literals are assigned to variables at runtime,
In Python the data type is derived from the literal and does not have to be described explicitly,
It is allowed to assign values of different data types to the same variable (name) sequentially,
If we do not assign a literal to any variable, we lose it.

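How the data type follows the assigned literal can be checked with the built-in type() function. The sketch below uses illustrative names only and records the type name after each reassignment of the same variable.

```python
kinds = []

value = 7                      # integer literal
kinds.append(type(value).__name__)

value = 4.0                    # the same name now refers to a float
kinds.append(type(value).__name__)

value = "nice"                 # ... then to a string
kinds.append(type(value).__name__)

value = True                   # ... and finally to a boolean
kinds.append(type(value).__name__)

print(kinds)
```
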
Operators and delimiters

Most operators and delimiters will be introduced to you during this course. Here is an overview of the operators:

Overview of operators
    ## +       -       *       /       **      //
    ## %       @       <<      >>      &       |
    ## ^       ~       ==      !=      <       >
    ## <=      >=      and     or      not     in
    ## not in  is      is not

An overview of the delimiters follows:

Overview of delimiters
    ## (    )    [    ]    {    }
    ## ,    :    .    =    ;    ->
    ## +=   -=   *=   /=   **=  //=
    ## %=   @=   <<=  >>=  &=   |=
    ## ^=   '    "    \    @    SPACE

Arithmetic operators

All regular arithmetic operations involving numbers are possible:

Pocket calculator
    10 + 5
    100 - 20
    8 / 2
    4 * (10 + 20)
    2**3

    ## 15
    ## 80
    ## 4.0
    ## 120
    ## 8

The result of dividing two integers is a floating point number,
The conventional rules apply: Parentheses first, then multiplication and division, etc.,
The operator ** is used for exponentiation.

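The operator overview also lists // and %, which complement the true division shown above. A brief sketch with arbitrary numbers and illustrative names:

```python
true_division = 7 / 2      # always a float
quotient = 7 // 2          # floor division: rounds towards minus infinity
remainder = 7 % 2          # remainder of the floor division
negative_floor = -7 // 2   # floor, not truncation, for negative numbers
both = divmod(7, 2)        # quotient and remainder in one call
mixed = 2 + 3.0            # int combined with float gives a float

print(true_division, quotient, remainder, negative_floor, both, mixed)
```
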
Boolean operators

In order to demonstrate the use of logical operators (and formatted strings and for-loops), we create a handy table summarizing some important results from boolean algebra:

Logical table
    # Create table head
    print("a   b  a and b   a or b   not a\n"
          "--------------------------------")

    # Loop through the rows
    for a in [False, True]:
        for b in [False, True]:
            print(f"{a:1} {b:3} {a and b:6} {a or b:8} {not a:7}")

    ## a   b  a and b   a or b   not a
    ## --------------------------------
    ## 0   0      0        0       1
    ## 0   1      0        1       1
    ## 1   0      0        1       0
    ## 1   1      1        1       0

Keywords and comments

The programmer explains the structure of his/her program to the interpreter via a restricted set of short commands, the keywords:

Overview of keywords
    ## and       as        assert    break     class     continue
    ## def       del       elif      else      except    False
    ## finally   for       from      global    if        import
    ## in        is        lambda    None      nonlocal  not
    ## or        pass      raise     return    True      try
    ## while     with      yield

There are two ways to make comments:

Provide some comments
    # Set variable to something - or nothing?
    something = None

    """
    I am a docstring!
    A multiline string comment hybrid.
    I will be useful for describing classes and methods.
    """

Data types

Python offers the following basic data types, which we will use in this course:

Data type   Description
int()       Integers
float()     Floating point numbers
str()       Strings, i. e., unicode (UTF-8) texts
bool()      Boolean, i. e., True or False
list()      List, an ordered array of objects
tuple()     Tuple, an ordered, immutable array of objects
dict()      Dictionary, an associative array of objects (unordered; insertion-ordered since Python 3.7)
set()       Set, an unordered array/set of objects
None        Nothing, emptiness, the void...

Each data type has its own methods, that is, functions that are applicable specifically to an object of this type.

You will gradually get to know new and more complex data types or object classes.

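The names in the table double as constructors that convert between types. The following sketch uses made-up values to illustrate a few such conversions. None is the exception: it is a single object and not callable.

```python
as_int = int("42")            # string -> integer
as_float = float("3.5")       # string -> float
as_text = str(7)              # integer -> string
empty_is_false = bool("")     # empty containers and zero count as False
letters = list("abc")         # any iterable -> list
pair = tuple([1, 2])          # list -> tuple
unique = set([1, 1, 2])       # duplicates are dropped
lookup = dict(a=1, b=2)       # keyword arguments -> dictionary
nothing = None                # the single "no value" object

print(as_int, as_float, as_text, empty_is_false)
print(letters, pair, unique, lookup)
print(type(nothing).__name__)
```
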
Lists

A list is an ordered array of objects, accessible via an index:

Listing tech companies
    stocks = ["Google", "Amazon", "Facebook", "Apple"]
    stocks[1]
    stocks.append("Twitter")
    stocks.insert(2, "Microsoft")
    stocks.sort()

    ## ['Google', 'Amazon', 'Facebook', 'Apple']
    ## Amazon
    ## ['Google', 'Amazon', 'Facebook', 'Apple', 'Twitter']
    ## ['Google', 'Amazon', 'Microsoft', 'Facebook', 'Apple', 'Twitter']
    ## ['Amazon', 'Apple', 'Facebook', 'Google', 'Microsoft', 'Twitter']

The constructor for new lists is [ ],
The first element has the index 0,
The data type list() possesses its own methods.

Tuples 40
Tuples are immutable sequences related to lists; they cannot, for example,
be extended. The drawbacks in flexibility are compensated by the
advantages in speed and memory usage:

Selecting elements in sequences
    lottery = (1, 8, 9, 12, 24, 28)
    len(lottery)
    lottery[1:3]
    lottery[:4]
    lottery[-1]
    lottery[-2:]

## (1, 8, 9, 12, 24, 28)
## 6
## (8, 9)
## (1, 8, 9, 12)
## 28
## (24, 28)

The same operations are also supported when using lists.
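What "immutable" means in practice can be sketched as follows. The variable names error_message and extended are introduced only for this illustration.

```python
lottery = (1, 8, 9, 12, 24, 28)

# Item assignment is not allowed on a tuple and raises a TypeError.
try:
    lottery[0] = 2
except TypeError as err:
    error_message = str(err)
print(error_message)

# "Extending" a tuple builds a new tuple; the original stays unchanged.
extended = lottery + (31,)
print(lottery, extended)
```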
Dictionaries 41
Dictionaries are associative collections of key-value pairs. The key must
be immutable and unique:

Internet slang dictionary
    slang = {"imho": "in my humble opinion",
             "lol": "laughing out loud",
             "tl;dr": "too long; didn't read"}
    slang["lol"]
    slang["gl&hf"] = "good luck & have fun"
    slang.keys()
    slang.values()

## {'imho': 'in...ion', 'lol': 'la...oud', 'tl;dr': 'to...ead'}
## laughing out loud
## good luck & have fun
## dict_keys(['imho', 'lol', 'tl;dr', 'gl&hf'])
## dict_values([... & have fun'])

The constructor for dict() is { } with :,
The pairs are iterable; since Python 3.7 they keep their insertion order.
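The requirement that keys be immutable can be tried out directly. The following is a small sketch; the exchange-rate figure is a made-up illustrative value, and the names rates, key_error and fallback are hypothetical.

```python
slang = {"imho": "in my humble opinion", "lol": "laughing out loud"}

# Iterate over key-value pairs with items().
pairs = [f"{key} = {value}" for key, value in slang.items()]
print(pairs)

# A tuple is immutable and therefore works as a key ...
rates = {("EUR", "USD"): 1.12}  # illustrative value only

# ... whereas a list is mutable and is rejected with a TypeError.
try:
    rates[["EUR", "GBP"]] = 0.86
except TypeError as err:
    key_error = str(err)
print(key_error)

# get() returns a fallback instead of raising KeyError for unknown keys.
fallback = slang.get("brb", "unknown")
```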
Sets 42
A set is an unordered collection of objects without duplicates:

Set operations
    x = {"o", "n", "y", "t"}
    y = {"p", "h", "o", "n"}
    x & y
    x | y
    x - y

## {'n', 't', 'o', 'y'}
## {'n', 'p', 'o', 'h'}
## {'o', 'n'}
## {'t', 'n', 'o', 'y', 'h', 'p'}
## {'t', 'y'}

The constructor for set() is { },
Defines its own operators that overload existing ones.
Empty set via set(), because {} already creates dict().
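The difference between {} and set() is easy to check. This is a minimal sketch with illustrative names:

```python
empty_dict = {}      # curly braces without content create a dictionary
empty_set = set()    # the empty set needs the explicit constructor
print(type(empty_dict).__name__, type(empty_set).__name__)

# Converting a list to a set removes duplicates.
tickers = ["AAPL", "AMZN", "AAPL", "GOOG", "AMZN"]
unique_tickers = set(tickers)
print(len(tickers), len(unique_tickers))
```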
Comparison operators 43
The <, <=, >, >=, ==, != operators compare the values of two objects
and return True or False.

Op.   True, only if the value of the left operand is
<     less than the value of the right operand
<=    less than or equal to the value of the right operand
>     greater than the value of the right operand
>=    greater than or equal to the value of the right operand
==    equal to the right operand
!=    not equal to the right operand

The comparison depends on the data type of the objects. For example,
"7" == 7 will return False, while 7.0 == 7 will return True.

Numbers are compared arithmetically.
Strings are compared lexicographically.
Tuples and lists are compared lexicographically using comparison
of corresponding elements. This behaviour can be altered for your own classes.
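The comparison rules listed above can be checked with a few expressions. This is only a sketch; the dictionary results exists just to collect the outcomes.

```python
results = {
    "apple < banana": "apple" < "banana",       # character by character
    "Zebra < apple": "Zebra" < "apple",         # uppercase sorts before lowercase
    "(1, 2, 3) < (1, 3)": (1, 2, 3) < (1, 3),   # decided at the second element
    "[1, 2] < [1, 2, 0]": [1, 2] < [1, 2, 0],   # a prefix is smaller
    "10 < 9": 10 < 9,                           # arithmetic comparison
    "'10' < '9'": "10" < "9",                   # lexicographic comparison
}
for expression, value in results.items():
    print(f"{expression} is {value}")
```

The last two lines show why numbers stored as strings should be converted before they are compared.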
Comparison operators 44
Comparing examples
    x, y = 5, 8
    print("x < y is", x < y)
## x < y is True
    print("x > y is", x > y)
## x > y is False
    print("x == y is", x == y)
## x == y is False
    print("x != y is", x != y)
## x != y is True
    print("This is", "Name" == "Name", "and not", "Name" == "name")
## This is True and not False

When comparing strings, case matters.
Chaining comparison operators 45
In Python, comparison operators can also be chained.

Chaining comparison examples
    x = 5
    5 >= x > 4
## True
    12 < x < 20
## False
    2 < x < 10
## True
    2 < x and x < 10  # unchained expression
## True

The comparison is performed for both sides and combined by and.
Logical operators 46
There are three logical operators: not, and, or.

Op.       Description
not x     Returns True only if x is False
x and y   Returns True only if x and y are True
x or y    Returns True only if x or y or both are True

Logical operators examples
    x, y = 5, 8
    (x == 5) and (y == 9)
## False
    (x == 5) or (y == 8)
## True
    not(x == 4) or (y == 9)
## True
Exclusive or 47
In some situations, you need a logical operation that is True only when
the operands differ (one is True, the other is False). This task can
be solved by using the logical operators not, and, or or simply !=.

Exclusive or
    x, y = 5, 8
    ((x == 5) and not (y == 8)) or (not (x == 5) and (y == 8))
## False
    x = 4
    ((x == 5) and not (y == 8)) or (not (x == 5) and (y == 8))
## True
    (x == 5) != (y == 8)
## True

In many other programming languages, a logical “exclusive or” (xor)
operator is explicitly part of the language, but not in Python.
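If exclusive or is needed repeatedly, it may be convenient to wrap the != idea in a small helper. The function name logical_xor is an assumption made for this sketch; it is not part of Python. For values that are already of type bool, the bitwise operator ^ gives the same result.

```python
def logical_xor(a, b):
    """Return True only if exactly one of the two operands is truthy."""
    return bool(a) != bool(b)

# Truth table: operands, helper result, and the bitwise ^ applied to booleans.
truth_table = [
    (a, b, logical_xor(a, b), a ^ b)
    for a in (False, True)
    for b in (False, True)
]
for row in truth_table:
    print(row)
```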
Binary numbers 48
Bitwise operators operate on numbers, but instead of treating that
number as if it were a single (decimal) value, they operate on its
representation as a string of bits, written in binary. A binary number is a
number expressed in the base-2 numeral system, also called binary numeral
system, which consists of only two distinct symbols: typically 0 (zero)
and 1 (one).

Binary numbers
## Decimal:  Binary:
##  0:       0
##  1:       1
##  2:       10
##  3:       11
##  4:       100
##  5:       101
##  6:       110
##  7:       111
##  8:       1000
##  9:       1001
## 10:       1010
Binary numbers 49
How to convert binary numbers to integers (the unknown keywords and
language structures will be introduced soon):

Binary to integer
    def bintoint(binary):
        binary = binary[::-1]
        num = 0
        for i in range(len(binary)):
            num += int(binary[i]) * 2**i
        return num

    bintoint("1101001")
## 105
    int("1101001", 2)  # compare with built-in function
## 105
Binary numbers 50
How to convert integers to binary numbers:

Integers to binary
    def inttobin(num):
        binary = ""
        if num != 0:
            while num >= 1:
                if num % 2 == 0:
                    binary += "0"
                    num = num // 2  # integer division keeps num an int
                else:
                    binary += "1"
                    num = (num - 1) // 2
        else:
            binary = "0"
        return binary[::-1]

    inttobin(105)
## '1101001'
    bin(105)[2:]  # compare with built-in function
## '1101001'
Bitwise operators 51
Python offers distinct bitwise operators. Some of them are redefined with an
entirely different meaning by extensions, e. g., for vectorized operations.

Bit. op.   Description
x >> y     Returns x with the bits shifted to the right by y places
x << y     Returns x with the bits shifted to the left by y places
x & y      Does a bitwise and
x | y      Does a bitwise or
~x         Returns the complement of x
x ^ y      Does a bitwise exclusive or

Bitwise operators
    a, b = 5, 7
    c = a & b  # bitwise and
## a: 101
## b: 111
## c: 101
    print(c)
## 5
Bitwise operators 52
Bitwise operators
    a, b = 5, 7
    c = a | b  # bitwise or
## a: 101
## b: 111
## c: 111
    print(c)
## 7
    a = 13
    b = a << 2  # bitwise shift
## a: 1101
## b: 110100
    a, b = 35, 37
    c = a ^ b  # bitwise exclusive or
## a: 100011
## b: 100101
## c: 000110
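The shift and complement operators are easiest to understand through their arithmetic meaning. As a small sketch with illustrative variable names: shifting left by y places multiplies by 2**y, shifting right divides by 2**y (discarding the remainder), and ~x equals -x - 1 for integers.

```python
a = 13
left = a << 2        # same as a * 2**2
right = left >> 2    # same as left // 2**2
complement = ~a      # same as -a - 1

# format(value, "b") shows the binary representation without the "0b" prefix.
print(format(a, "b"), format(left, "b"), format(right, "b"))
print(left, right, complement)
```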
Control flow: Conditional statements 53
Python has only one kind of conditional statement – if-elif-else:

Computer data sizes
    bytes = 100000000 / 8  # e.g. DSL 100000
    if bytes >= 1e9:
        print(f"{bytes/1e9:6.2f} GByte")
    elif bytes >= 1e6:
        print(f"{bytes/1e6:6.2f} MByte")
    elif bytes >= 1e3:
        print(f"{bytes/1e3:6.2f} KByte")
    else:
        print(f"{bytes:6.2f} Byte")

## 12.50 MByte

Control flow structures may be nested in any order:

Nestings
    if a > 1:
        if b > 2:
            pass  # special keyword for empty blocks
Control flow: The for loop 54
In Python, there are two conventional program loops. The first is for-in-else:

Total sum
    numbers = [7, 3, 4, 5, 6, 15]
    y = 0
    for i in numbers:
        y += i
    print(f"The sum of 'numbers' is {y}.")

## The sum of 'numbers' is 40.

Lists or other collections can also be created dynamically:

Powers of 2
    powers = [2 ** i for i in range(11)]
    teacher = ["***", "**", "*"]
    grades = {star: len(teacher) - len(star) + 1 for star in teacher}

## [1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024]
## {'***': 1, '**': 2, '*': 3}
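The else part of for-in-else does not appear in the listing above, so here is a minimal sketch of it. The function first_negative is a hypothetical helper written for this illustration: the else branch runs only when the loop finishes without hitting break.

```python
def first_negative(numbers):
    """Report the first negative value, or state that there is none."""
    for value in numbers:
        if value < 0:
            result = f"found {value}"
            break
    else:
        # Reached only if the loop was never left via break.
        result = "no negative number"
    return result

print(first_negative([3, -1, -5]))
print(first_negative([1, 2]))
```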
Control flow: continue and break 55
Loops can skip iterations (continue):

Continue the loop
    for x in ["a", "b", "c"]:
        a = x.upper()
        continue
        print(x)
    print(a)

## C

Or a loop can be aborted instantly (break):

Breaking the habit
    y = 0
    for i in [7, 3, 4, "x", 6, 15]:
        if not isinstance(i, int):
            break
        y += i
    print(f"The total sum is {y}.")

## The total sum is 14.
Control flow: The while loop 56
For loops where the number of iterations is not known at the beginning,
you use while-else.

Have you already noticed the keyword else? Python executes the else
branch only if the loop was not terminated by break:

Favorite lottery number
    import random
    n = 0
    favorite = 7
    while n < 100:
        n += 1
        draw = random.randint(1, 49)  # e.g. German lottery
        if draw == favorite:
            print("Got my number! :)")
            break
    else:
        print("My favorite did not show up! :(")
    print(f"I tried {n} times!")

## Got my number! :)
## I tried 10 times!
Functions 57
Functions are defined using the keyword def. The structure of function
signature and body is specified by indentation, too:

Drawing lottery numbers
    import random

    def draw_sample(n, first=1, last=49):
        numbers = list(range(first, last + 1))
        sample = []
        for i in range(n):
            ind = random.randint(0, len(numbers) - 1)
            sample.append(numbers.pop(ind))
        sample.sort()
        return sample

    draw_sample(6)
    draw_sample(6, 80, 100)
    draw_sample(3, first=5)

## [2, 3, 4, 16, 23, 28]
## [82, 84, 94, 95, 99, 100]
## [5, 12, 16]
Functions 58
Functions are callable objects (callable() returns True for them), defined
as closures, and can be created and used like other objects:

Prime numbers
    def primes(n):
        numbers = [2]

        def is_prime(num):
            for i in numbers:
                if num % i == 0:
                    return False
            return True

        if n < 2:  # there are no primes below 2
            return []
        for i in range(3, n + 1):
            if is_prime(i):
                numbers.append(i)
        return numbers

    primes(50)

## [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]

Seems weird? We discuss namespaces in the next section.
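That functions can be "used like other objects" may be sketched as follows. All names here (square, cube, apply_twice, transformations) are made up for the illustration: functions are stored in a dictionary and passed as arguments just like numbers or strings.

```python
def square(x):
    return x * x

def cube(x):
    return x ** 3

def apply_twice(func, x):
    """Call the function that was passed in two times in a row."""
    return func(func(x))

# Functions can be stored in collections like any other object.
transformations = {"square": square, "cube": cube}
results = {name: func(3) for name, func in transformations.items()}

print(callable(square), type(square).__name__)
print(results, apply_twice(square, 3))
```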
Section 1.3 59
Essential concepts

Object-orientation
Python is object-oriented 60
There are three widely known programming paradigms: procedural,
functional and object-oriented programming (OOP). Python supports
them all.

You have learned how to handle predefined data types in Python.
Actually, we have already encountered classes and instances, take for
example dict().

In this section you will learn the basics of dealing with (your own)
classes:

1 References
2 Classes
3 Instances
4 Main principles
5 Garbage collection

OOP is a wide field and challenging for beginners. Don’t get discouraged
and, if you notice gaps in your knowledge, consult the literature.
References 61
When you assign a variable, a reference to an object is set:

Equal but not identical
    a = ["Star", "Trek"]
    b = ["Star", "Trek"]
    c = a
    a == b
    a == c
    a is b
    a is c

## ['Star', 'Trek']
## ['Star', 'Trek']
## ['Star', 'Trek']
## True
## True
## False
## True

Two equal but not identical objects are created,
Variables a and c link to the same object.
Copying objects 62
When we introduced lists, we initially did not mention that they are a
prime example of mutable objects:

Collecting grades
    grades = [1.7, 1.3, 2.7, 2.0]
    result = grades.append(1.0)
    result
    grades
    finals = grades
    finals.remove(2.7)
    finals
    grades

## None
## [1.7, 1.3, 2.7, 2.0, 1.0]
## [1.7, 1.3, 2.0, 1.0]
## [1.7, 1.3, 2.0, 1.0]

Modifications can be in-place – the object itself is modified.
Changing an object that is referenced several times could cause
(un)intended consequences.
Side effects 63
In Python, arguments are passed by assignment (also called call-by-sharing):
the function receives a reference to the same object, not a copy:

Side effects
    def last_element(x):
        return x.pop(-1)

    a = stocks
    last_element(a)
    a

## ['Amazon', 'Apple', 'Facebook', 'Google', 'Microsoft', 'Twitter']
## Twitter
## ['Amazon', 'Apple', 'Facebook', 'Google', 'Microsoft']

There are side effects,
Referenced mutable objects might be modified,
Referenced immutable objects cannot be modified; operations on them create new objects.
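The different behaviour of mutable and immutable arguments can be sketched in one small function. The function add_bonus and its variables are hypothetical names chosen for this example.

```python
def add_bonus(grades, points):
    grades.append(1.0)  # in-place change: visible to the caller
    points += 10        # rebinds the local name only: invisible to the caller
    return points

my_grades = [1.7, 2.0]
my_points = 50
new_points = add_bonus(my_grades, my_points)

print(my_grades)              # the list was modified by the call
print(my_points, new_points)  # the integer outside is unchanged
```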
Copying objects 64
We are able to make an exact copy of the object:

Copying
    def last_element(x):
        y = x.copy()
        return y.pop(-1)

    a = stocks
    last_element(a)
    a

## ['Amazon', 'Apple', 'Facebook', 'Google', 'Microsoft']
## Microsoft
## ['Amazon', 'Apple', 'Facebook', 'Google', 'Microsoft']

We receive a new object,
The new object is not identical to the old one.
Deep and shallow copying 65
However, keep in mind that, in most cases, a method copy() will
create shallow copies, while only deep copying will also duplicate the
contents of a mutable object with a complex structure:

Cloning fast food
    fastfood = [["burgers", "hot dogs"], ["pizza", "pasta"]]
    italian = fastfood.copy()
    italian.pop(0)
    american = list(fastfood)
    american.pop(1)
    american[0] = american[0].copy()
    fastfood[0][1] = "chicken wings"
    fastfood[1][0] = "risotto"
    italian
    american

## [['risotto', 'pasta']]
## [['burgers', 'hot dogs']]

Both approaches, copy() and list(), create new list objects containing
new references to the original sub-lists. But for a deep copy,
you have to recursively create duplicates of all its objects.
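The recursive duplication does not have to be written by hand: the standard library module copy offers deepcopy() for this purpose. A minimal sketch, reusing the fast food data with illustrative variable names:

```python
import copy

fastfood = [["burgers", "hot dogs"], ["pizza", "pasta"]]
shallow = copy.copy(fastfood)      # new outer list, shared sub-lists
deep = copy.deepcopy(fastfood)     # new outer list and new sub-lists

fastfood[1][0] = "risotto"

print(shallow[1])  # shares the modified sub-list
print(deep[1])     # keeps the original content
```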
Classes 66
In Python everything is an object and more complex objects consist of
several other objects.

In OOP, we create objects according to patterns. These kinds of
blueprints are called classes and are characterized by two categories of
elements:

Attributes:
Variables that represent the properties of
    an object, object attributes, or
    a class, named class attributes.

Methods:
Functions that are defined within a class:
    (non-static) methods receive the instance and can access all attributes, while
    static methods receive no instance and can reach class attributes only via the class name.

Every generated object is an instance of such a construction plan.
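The distinction between class attributes, object attributes, and the two kinds of methods might look like this in code. The class Account and the interest rate are invented for this sketch.

```python
class Account:
    interest_rate = 0.02  # class attribute, shared by all instances (made-up value)

    def __init__(self, balance):
        self.balance = balance  # object attribute, individual per instance

    def interest(self):
        # A regular method reaches object and class attributes through self.
        return self.balance * self.interest_rate

    @staticmethod
    def describe():
        # A static method has no self; it names the class explicitly.
        return f"Accounts pay {Account.interest_rate:.0%} interest"

small = Account(1000)
large = Account(50000)
print(small.interest(), large.interest())
print(Account.describe())
```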
Class definition 67
Specifically, we want to create a “rectangle object” and define a separate
Rectangle class for it:

Rectangle class
    class Rectangle:
        width = 0
        height = 0

        def area(self):
            return self.width * self.height

    myrectangle = Rectangle()
    myrectangle.width = 10
    myrectangle.height = 20
    myrectangle.area()

## 200

New classes are defined using the keyword class,
The variable self always refers to the instance itself.
Class constructor 68
We add a constructor (method) __init__(), that is called to initialize
an object of Rectangle:

Rectangle class with constructor
    class Rectangle:
        width = 0
        height = 0

        def __init__(self, width, height):
            self.width = width
            self.height = height

        def area(self):
            return self.width * self.height

    myrectangle = Rectangle(15, 30)
    myrectangle.area()

## 450

In our example, we use the constructor to set the attributes. Methods
with names matching __fun__() have a special, standardized meaning
in Python.
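Two further special methods illustrate what "standardized meaning" implies. In this sketch, which extends the Rectangle class for illustration only, __repr__() controls how an object is displayed and __eq__() defines what == means for two instances.

```python
class Rectangle:
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def __repr__(self):
        # Used by repr() and by the interactive prompt to display the object.
        return f"Rectangle({self.width}, {self.height})"

    def __eq__(self, other):
        # Called for the == operator; compare by content instead of identity.
        if not isinstance(other, Rectangle):
            return NotImplemented
        return (self.width, self.height) == (other.width, other.height)

first = Rectangle(15, 30)
second = Rectangle(15, 30)
print(first)
print(first == second, first is second)
```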
Class inheritance 69
One of the most important concepts of OOP is inheritance. A class
inherits all attributes and methods of its parent class and can add new
or overwrite existing ones:

Square inherits Rectangle
    class Square(Rectangle):
        def __init__(self, length):
            super().__init__(length, length)

        def diagonal(self):
            return (self.width**2 + self.height**2)**0.5

    mysquare = Square(15)
    print(f"Area: {mysquare.area()}")
    print(f"Diagonal length: {mysquare.diagonal():7.4f}")

## Area: 225
## Diagonal length: 21.2132

The methods of the parent class, including the constructor, may be
referenced by super().
Garbage collection 70
You do not have to worry about memory management in Python. The
garbage collector will tidy up for you.

If there are no more references to an object, it is automatically disposed
of by the garbage collector:

Garbage collection in action
    class Dog:
        def __del__(self):
            print("Woof! The dogcatcher got me! Entering the void.. :(")

    # My old dog on a leash
    mydog = Dog()
    # A new dog is born
    newdog = Dog()
    # Using my leash for the new dog
    mydog = newdog

## Woof! The dogcatcher got me! Entering the void.. :(

The destructor __del__() is executed as the last act before an object
gets deleted.
Namespaces 71
We have already come into contact with namespaces in Python many
times. These are hierarchically linked layers in which the references to
objects are defined. A rough distinction is made between

    the global namespace, and
    the local namespace.

The global namespace is the outermost environment whose references
are known by all objects.

On the other hand, locally defined references are only known in a local,
i. e., internal environment.
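A small sketch may help to see both namespaces at work. The names rate, grow and grow_fast as well as the numbers are invented for this illustration: a function can read a global name, a local name of the same spelling hides the global one, and local names are unknown outside the function.

```python
rate = 0.05  # defined in the global namespace (made-up value)

def grow(capital):
    factor = 1 + rate  # 'rate' is looked up in the global namespace
    return capital * factor

def grow_fast(capital):
    rate = 0.10  # a local 'rate' hides the global one inside this function
    return capital * (1 + rate)

print(grow(100), grow_fast(100))
print(rate)                   # the global value is untouched
print("factor" in globals())  # local names do not leak out
```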
Namespaces 72
Reference names from the local namespace mask the same names in
an outer or in the global namespace:

Namespaces
    def multiplier(x):
        x = 4 * x
        return x

    x = "OH"
    multiplier("AH")
    multiplier(x)
    x

## OH
## AHAHAHAH
## OHOHOHOH
## OH
Namespaces 73
In fact, functions defined in Python are themselves objects that remember
and can access their own context where they were created. This
concept comes from functional programming and is called closure:

Closures
    def gen_multiplier(a):
        def fun(x):
            return a * x
        return fun

    multi1 = gen_multiplier(4)
    multi2 = gen_multiplier(5)
    multi1
    multi1("EH")
    multi2("EH")

## <function gen_multiplier.<locals>.fun at 0x7fe838606f28>
## EHEHEHEH
## EHEHEHEHEH
Managing code 74
In order to provide, maintain and extend modular functionality with
Python, its code-containing components can be described hierarchically:

    Packages
    Modules
    Classes
    Functions

The organization in Python is very straightforward and is based on the
local namespaces mentioned before.

When you download and use new packages, such as NumPy for numerical
programming in the next chapter, the packages are loaded and the
namespaces initialized.

The development of custom packages is an advanced topic and not
essential for a reasonable code structure of small projects, as it is in
other programming languages.
Importing modules 75
Modules provide classes and functions via namespaces. A module is Python
code that is executed in its own namespace and whose classes and
functions you can import. Basically, there are the following ways
to import from a module:

Import statements
    import datetime
    import datetime as dt
    from datetime import date, timedelta
    from datetime import *

    dt.date.today()
    dt.timedelta.days
    date.today()
    timedelta.days
    datetime.now()

In the latter case, all public names (by default, those not starting with an
underscore, or those listed in the module's __all__) are imported from the
datetime namespace.
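The import variants differ only in the names they make available, not in the objects behind them. As a sketch (the dates are arbitrary and the wildcard form is left out on purpose):

```python
import datetime
import datetime as dt
from datetime import date, timedelta

# All three routes lead to the very same class object.
print(datetime.date is dt.date is date)

# Date arithmetic with the directly imported names.
start = date(2019, 4, 28)
one_week = timedelta(days=7)
end = start + one_week
print(end.isoformat(), one_week.days)
```

Importing explicit names or using a short alias keeps it visible where a name comes from, which is one reason the wildcard form is usually avoided in larger scripts.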
Built-in modules 76
A Python installation ships with a standard library consisting of built-in
modules. These modules provide standardized solutions for many
problems that occur in everyday programming - “batteries included”.
For example, they provide access to system functionality such as file
management. The Python Docs give an overview of all built-in modules.

Usage of built-in modules
    import math
    from random import randint

    math.pi
## 3.141592653589793
    math.factorial(5)
## 120
    randint(10, 20)
## 18
Installing modules

Often you might want to use extended functionality. Python has a large
and active community of users who make their developments publicly
available under open source license terms. Packages are containers of
modules which can be imported and used within your Python code.

These third-party packages can be installed conveniently with the
command line package manager pip. The Python Package Index provides an
overview of the thousands of packages available. Basic commands for
maintaining, for example, the installation of the package "numpy":

Installing the package: pip install numpy
Upgrading the package: pip install --upgrade numpy
Installing the package locally for the current user: pip install --user numpy
Uninstalling the package: pip uninstall numpy
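After running one of the pip commands above, it can be useful to check from within Python whether a package is installed and in which version. The following is only a sketch: the helper installed_version is a hypothetical name, and it assumes Python 3.8 or newer, where importlib.metadata is part of the standard library.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string if numpy is installed, otherwise None.
print(installed_version("numpy"))
# A name that is assumed not to exist yields None.
print(installed_version("surely-not-installed-package-xyz"))
```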
Installing modules

Example: OpenCV is a package for image processing in Python. Here you
can see how the installation proceeds in a Unix terminal.

[Figure: terminal screenshot of the installation, not reproduced]
Writing modules

As your Python projects grow more complex, you will need to maintain the
code properly. To this end, you can break a large, unwieldy programming
task into separate, more manageable modules. Modules can be written in
Python itself or in C, but here we keep focusing on the Python language.

Creating modules in Python is very straightforward: a Python module is a
file containing Python code, for example:

s = "Hello world!"
l = [1, 2, 3, 5, 5]

def add_one(n):
    return n + 1

File: mymodule.py
Working with modules

If you import the module mymodule, the interpreter searches the module
search path (sys.path), which starts with the directory of the running
script or, in an interactive session, the current working directory, for
a file mymodule.py. It reads and interprets its contents and makes its
namespace available:

Usage of own modules
import mymodule
mymodule.s
mymodule.l
mymodule.add_one(5)

## Hello world!
## [1, 2, 3, 5, 5]
## 6
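The lookup of a module file can be sketched in a self-contained way by writing a small module into a temporary directory and adding that directory to the search path. The module name demo_module is a hypothetical stand-in for mymodule, chosen so that the sketch does not depend on any existing file.

```python
import importlib
import sys
import tempfile
from pathlib import Path

source = 's = "Hello world!"\n\ndef add_one(n):\n    return n + 1\n'

with tempfile.TemporaryDirectory() as folder:
    # Write the module file into a directory that is not yet on the search path.
    Path(folder, "demo_module.py").write_text(source)

    sys.path.insert(0, folder)          # make the directory searchable
    importlib.invalidate_caches()
    demo_module = importlib.import_module("demo_module")
    sys.path.remove(folder)             # restore the search path

result = demo_module.add_one(5)
print(demo_module.s, result)
```

Once imported, the module object stays usable even though its directory has been removed from the search path again.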
Python packages

Large projects could require more than one module. Packages allow you
to structure the modules and their namespaces hierarchically by using
the dot notation. They are simple folders containing modules and
(sub-)packages. Consider the following structure:

mypackage/
    mymodule.py
    somemodule.py

The directory mypackage contains two modules which we can import
separately:

Usage of own package
import mypackage.mymodule
import mypackage.somemodule
mypackage.mymodule.add_one(4)

## 5
Package initialization

If a package directory contains a file __init__.py, its code is invoked
when the package gets imported. The directory mypackage now contains the
two modules and the initialization file:

mypackage/
    __init__.py
    mymodule.py
    somemodule.py

The file __init__.py can be empty but can also be used for package
initialization purposes.
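A minimal sketch of package initialization, assuming a throwaway package named demo_package (a hypothetical name) that is created in a temporary directory: the code in __init__.py runs when the package is imported, and the submodule is reached via dot notation.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as folder:
    package_dir = Path(folder, "demo_package")
    package_dir.mkdir()
    # __init__.py runs once, when the package is imported for the first time.
    (package_dir / "__init__.py").write_text("initialized = True\n")
    (package_dir / "mymodule.py").write_text("def add_one(n):\n    return n + 1\n")

    sys.path.insert(0, folder)
    importlib.invalidate_caches()
    demo_package = importlib.import_module("demo_package")
    mymodule = importlib.import_module("demo_package.mymodule")
    sys.path.remove(folder)

print(demo_package.initialized, mymodule.add_one(4))
```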
The Zen of Python

import this

## The Zen of Python, by Tim Peters
##
## Beautiful is better than ugly.
## Explicit is better than implicit.
## Simple is better than complex.
## Complex is better than complicated.
## Flat is better than nested.
## Sparse is better than dense.
## Readability counts.
## Special cases aren't special enough to break the rules.
## Although practicality beats purity.
## Errors should never pass silently.
## Unless explicitly silenced.
## In the face of ambiguity, refuse the temptation to guess.
## ...
Further topics

A selection of exciting topics that are among the advanced basics but
are not covered in this lecture:

Dynamic language concepts, such as duck typing,
Further, complex type classes, such as ChainMap or OrderedDict,
Iterators and generators in detail,
Exception handling, raising exceptions, catching errors,
Debugging, introspection and annotations.
Chapter 2

Numerical programming

2.1 NumPy package
2.2 Array basics
2.3 Linear algebra
Section 2.1

Numerical programming
NumPy package
The NumPy package

The Numerical Python package NumPy provides efficient tools for
scientific computing and data analysis:

np.array(): Multidimensional array capable of doing fast and efficient computations,
Built-in mathematical functions on arrays without writing loops,
Built-in linear algebra functions.

Import NumPy
import numpy as np
Motivation

Element-wise addition
vec1 = [1, 2, 3, 4, 5, 6, 7, 8, 9]
vec2 = np.array(vec1)
vec1 + vec1    # lists are concatenated, not added

## [1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9]

vec2 + vec2    # arrays are added element-wise

## array([ 2,  4,  6,  8, 10, 12, 14, 16, 18])

for i in range(len(vec1)):
    vec1[i] += vec1[i]
vec1

## [2, 4, 6, 8, 10, 12, 14, 16, 18]
Motivation

Matrix multiplication
mat1 = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
mat2 = np.array(mat1)
np.dot(mat2, mat2)

## array([[ 30,  36,  42],
##        [ 66,  81,  96],
##        [102, 126, 150]])

mat3 = np.zeros([3, 3])
for i in range(3):
    for k in range(3):
        for j in range(3):
            mat3[i][k] = mat3[i][k] + mat1[i][j] * mat1[j][k]
mat3

## array([[ 30.,  36.,  42.],
##        [ 66.,  81.,  96.],
##        [102., 126., 150.]])
Motivation

Time comparison
import time
mat1 = np.random.rand(50, 50)
mat2 = np.array(mat1)

# perf_counter() has a finer resolution than time()
t = time.perf_counter()
mat3 = np.dot(mat2, mat2)
nptime = time.perf_counter() - t

mat3 = np.zeros([50, 50])
t = time.perf_counter()
for i in range(50):
    for k in range(50):
        for j in range(50):
            mat3[i][k] = mat3[i][k] + mat1[i][j] * mat1[j][k]
pytime = time.perf_counter() - t

times = str(pytime / nptime)
print("NumPy is " + times + " times faster!")

## NumPy is 17.29180230837526 times faster!  (varies between runs and machines)
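A single measurement as in the time comparison above can be noisy. As a hedged alternative, the sketch below uses the standard library module timeit, which repeats each call, and first checks that both approaches produce the same result. The function name loop_matmul, the matrix size, and the number of repetitions are illustrative assumptions.

```python
import timeit

import numpy as np

rng = np.random.default_rng(0)
mat = rng.random((30, 30))


def loop_matmul(a, b):
    """Naive triple loop, mirroring the pure Python version of the slides."""
    n, m, p = a.shape[0], a.shape[1], b.shape[1]
    out = np.zeros((n, p))
    for i in range(n):
        for k in range(p):
            for j in range(m):
                out[i, k] += a[i, j] * b[j, k]
    return out


# Both approaches must agree before timing them makes sense.
same_result = np.allclose(loop_matmul(mat, mat), mat @ mat)

# timeit repeats each call, which smooths out timer resolution issues.
loop_time = timeit.timeit(lambda: loop_matmul(mat, mat), number=3)
numpy_time = timeit.timeit(lambda: mat @ mat, number=3)
print(same_result, loop_time, numpy_time)
```

The measured ratio depends on the machine, the matrix size, and the NumPy build, so no particular speed-up factor should be expected.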
Section 2.2

Numerical programming
Array basics
Creating NumPy arrays

np.array(list): Converts a Python list into a NumPy array.
array.ndim: Returns the number of dimensions of the array.
array.shape: Returns the shape of the array as a tuple.

Creation
arr1 = [4, 8, 2]
arr1 = np.array(arr1)
arr2 = np.array([24.3, 0., 8.9, 4.4, 1.65, 45])
arr3 = np.array([[4, 8, 5], [9, 3, 4], [1, 0, 6]])
arr1.ndim

## 1

arr3.shape

## (3, 3)

From now on, the name array refers to a NumPy array as created by
np.array().
Array creation functions

np.arange(start, stop, step): Creates a vector of values from start
(inclusive) to stop (exclusive) with step width step.
np.zeros((rows, columns)): Creates an array with all values set to 0.
np.identity(n): Creates the n x n identity matrix.

Creation functions
np.zeros((4, 3))

## array([[0., 0., 0.],
##        [0., 0., 0.],
##        [0., 0., 0.],
##        [0., 0., 0.]])

np.arange(6)

## array([0, 1, 2, 3, 4, 5])

np.identity(3)

## array([[1., 0., 0.],
##        [0., 1., 0.],
##        [0., 0., 1.]])
Array creation functions

np.linspace(start, stop, n): Creates a vector of n evenly spaced values
from start to stop (both included).
np.full((rows, columns), k): Creates an array with all values set to k.

Array creation
np.linspace(0, 80, 5)

## array([ 0., 20., 40., 60., 80.])

np.full((5, 4), 7)

## array([[7, 7, 7, 7],
##        [7, 7, 7, 7],
##        [7, 7, 7, 7],
##        [7, 7, 7, 7],
##        [7, 7, 7, 7]])
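The two range-like creation functions treat the stop value differently, which is a common source of off-by-one errors. A short sketch with illustrative variable names:

```python
import numpy as np

steps = np.arange(0, 80, 20)      # stop value 80 is excluded
grid = np.linspace(0, 80, 5)      # stop value 80 is included

print(steps)   # [ 0 20 40 60]
print(grid)    # [ 0. 20. 40. 60. 80.]
```

Note also that np.arange with integer arguments returns integers, whereas np.linspace returns floats by default.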
Array creation functions

np.random.rand(rows, columns): Creates an array of random floats between
zero (included) and one (excluded).
np.random.randint(k, size=(rows, columns)): Creates an array of random
integers between 0 and k-1.

Array of random numbers
np.random.rand(3, 3)

## array([[0.01014591, 0.55955228, 0.48103055],
##        [0.30368877, 0.99078572, 0.61537046],
##        [0.83572553, 0.45976471, 0.63241975]])

np.random.randint(10, size=(5, 4))

## array([[7, 9, 7, 8],
##        [0, 6, 7, 5],
##        [7, 3, 4, 7],
##        [9, 4, 4, 8],
##        [8, 0, 6, 1]])
Copy arrays

Reference
arr3

## array([[4, 8, 5],
##        [9, 3, 4],
##        [1, 0, 6]])

arr = arr3
arr[1, 1] = 777
arr3

## array([[  4,   8,   5],
##        [  9, 777,   4],
##        [  1,   0,   6]])

arr3[1, 1] = 3

Assignment by reference
arr = arr3 binds arr to the existing arr3. They both refer to the
same object.
Copy array

array.copy(): Returns an independent copy of the array that shares no
data with the original.

Copy
arr3

## array([[4, 8, 5],
##        [9, 3, 4],
##        [1, 0, 6]])

arr = arr3.copy()
arr[1, 1] = 777
arr3

## array([[4, 8, 5],
##        [9, 3, 4],
##        [1, 0, 6]])

Reference
arr3

## array([[4, 8, 5],
##        [9, 3, 4],
##        [1, 0, 6]])

arr = arr3
arr[1, 1] = 777
arr3

## array([[  4,   8,   5],
##        [  9, 777,   4],
##        [  1,   0,   6]])

arr3[1, 1] = 3
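Whether two names refer to the same data can also be checked directly instead of modifying an element and inspecting the result. A minimal sketch, with alias and clone as illustrative names:

```python
import numpy as np

arr3 = np.array([[4, 8, 5], [9, 3, 4], [1, 0, 6]])
alias = arr3            # second name for the same object
clone = arr3.copy()     # independent data

print(alias is arr3)                  # True
print(np.shares_memory(arr3, clone))  # False
```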
Overview: Array creation functions

Function                   Description
array                      Converts input data (e.g., a list) into a NumPy array
arange(start, stop, step)  Creates an array of evenly spaced values from start up to, but excluding, stop
ones                       Creates an array containing only ones
zeros                      Creates an array containing only zeros
empty                      Allocates memory without initializing the values
eye, identity              Creates the N x N identity matrix
linspace                   Creates an array of evenly spaced values
full                       Creates an array with all values set to one number
random.rand                Creates an array of random floats
random.randint             Creates an array of random integers
Data types of arrays

array.dtype: Returns the data type of the array's elements.
array.astype(np.type): Performs an explicit type cast and returns a new
array.

Data types
arr1.dtype

## dtype('int64')

arr2.dtype

## dtype('float64')

arr1 = arr1 * 2.5
arr1.dtype

## dtype('float64')

arr1 = (arr1 / 2.5).astype(np.int64)
arr1.dtype

## dtype('int64')
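One detail of the explicit type cast is worth a short sketch: casting floats to integers with astype() drops the fractional part instead of rounding. The example values and variable names below are illustrative.

```python
import numpy as np

values = np.array([1.9, -1.9, 2.5])

truncated = values.astype(np.int64)          # drops the fractional part
rounded = np.rint(values).astype(np.int64)   # rounds to nearest (ties to even) first

print(truncated)   # [ 1 -1  2]
print(rounded)     # [ 2 -2  2]
```

When a float result is only approximately an integer because of floating point error, rounding before the cast is usually the safer choice.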
Array operations

Element-wise operations
Calculation operators on NumPy arrays operate element-wise.

Element-wise operations
arr3

## array([[4, 8, 5],
##        [9, 3, 4],
##        [1, 0, 6]])

arr3 + arr3

## array([[ 8, 16, 10],
##        [18,  6,  8],
##        [ 2,  0, 12]])

arr3**2

## array([[16, 64, 25],
##        [81,  9, 16],
##        [ 1,  0, 36]])
Array operations

Matrix multiplication
The operator * applied to arrays does not perform matrix multiplication;
it multiplies element-wise.

Element-wise operations
arr3 * arr3

## array([[16, 64, 25],
##        [81,  9, 16],
##        [ 1,  0, 36]])

arr = np.ones((3, 2))
arr

## array([[1., 1.],
##        [1., 1.],
##        [1., 1.]])

arr3 * arr  # shapes (3, 3) and (3, 2) do not match for element-wise multiplication
## ValueError: operands could not be broadcast together
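For completeness, a short sketch of the two operations that the element-wise product is often confused with: the matrix product via the @ operator (equivalent to np.dot for two-dimensional arrays) and an element-wise product with compatible shapes via broadcasting. The array row is an illustrative assumption.

```python
import numpy as np

arr3 = np.array([[4, 8, 5], [9, 3, 4], [1, 0, 6]])
arr = np.ones((3, 2))

# Matrix product: (3, 3) @ (3, 2) -> (3, 2); each entry is a row sum of arr3.
product = arr3 @ arr
print(product)

# Broadcasting: the vector of length 3 is applied to every row of arr3.
row = np.array([1, 10, 100])
scaled = arr3 * row
print(scaled)
```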
Integer indexing

array[index]: Selects the value at position index from the data.

Indexing with an integer
arr = np.arange(10)
arr

## array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

arr[4]

## 4

arr[-1]

## 9
Slicing

array[start : stop : step]: Selects a subset of the data from start
(included) to stop (excluded).

Slicing in one dimension
arr = np.arange(10)
arr

## array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

arr[3:7]

## array([3, 4, 5, 6])

arr[1:]

## array([1, 2, 3, 4, 5, 6, 7, 8, 9])
Slicing

Slicing in one dimension with steps
arr[:7]

## array([0, 1, 2, 3, 4, 5, 6])

arr[-3:]

## array([7, 8, 9])

arr[::-1]

## array([9, 8, 7, 6, 5, 4, 3, 2, 1, 0])

arr[::2]

## array([0, 2, 4, 6, 8])

arr[:5:-1]

## array([9, 8, 7, 6])
Slicing

Slicing in higher dimensions
In n-dimensional arrays the element at each index is an
(n - 1)-dimensional array.

Indexing rows
arr3

## array([[4, 8, 5],
##        [9, 3, 4],
##        [1, 0, 6]])

vec = arr3[1]
vec

## array([9, 3, 4])

arr3[-1]

## array([1, 0, 6])
Slicing

Slicing in two dimensions
arr3

## array([[4, 8, 5],
##        [9, 3, 4],
##        [1, 0, 6]])

arr3[0:2, 0:2]

## array([[4, 8],
##        [9, 3]])

arr3[2:, :]

## array([[1, 0, 6]])
Slicing

Figure: Python for Data Analysis (2017) on page 99
Views on arrays

Selecting by index numbers or slicing, as shown so far, belongs to basic
indexing in NumPy. With basic indexing you get NO COPY of your data but a
so-called view on the existing data set, i.e., a different perspective.

A view on an array can be seen as a reference to a rectangular memory
area of its values. The view is intended to

edit a rectangular part of a matrix, e.g., a sub-matrix, a column, or a single value,
change the shape of the matrix or the arrangement of its elements, e.g., transpose or reshape a matrix,
change the interpretation of the stored values, e.g., view the bytes of a float array as integers (a cast with astype(), in contrast, creates a copy),
make the same values accessible in other parts of the program.

The crucial point here is efficiency: data arrays in your working memory
do not have to be copied again and again for simple index operations,
which would cost considerable additional time and memory.
Creating views implicitly

A view is created automatically when you do basic indexing such as
slicing:

Create a view by slicing
column = arr3[:, 1]
column

## array([8, 3, 0])

column.base

## array([[4, 8, 5],
##        [9, 3, 4],
##        [1, 0, 6]])

column[1] = 100
arr3

## array([[  4,   8,   5],
##        [  9, 100,   4],
##        [  1,   0,   6]])
Creating views implicitly

Create a view by slicing
elem = column[1:2]
elem.base

## array([[  4,   8,   5],
##        [  9, 100,   4],
##        [  1,   0,   6]])

elem[0] = 3
arr3

## array([[4, 8, 5],
##        [9, 3, 4],
##        [1, 0, 6]])

The middle column is a view of the base array referenced by arr3,
Any changes to the values of a view directly affect the base data,
A view of a view is another view on the same base matrix.
Obtaining views explicitly

In addition, an array provides methods and attributes that return a
view of its data:

Obtain a view
arr3_t = arr3.T
arr3_t

## array([[4, 9, 1],
##        [8, 3, 0],
##        [5, 4, 6]])

arr3_t.flags.owndata

## False

arr3_r = arr3.reshape(1, 9)
arr3_r

## array([[4, 8, 5, 9, 3, 4, 1, 0, 6]])

arr3_r.flags.owndata

## False
Obtaining views explicitly

Obtain a view
arr3_v = arr3.view()
arr3_v.flags.owndata

## False

The transposed matrix is a predefined view that is available as an attribute,
Reshaping is also just another way of looking at the same set of data (whenever the memory layout allows it, otherwise reshape() returns a copy),
By means of the method view() you create a view with an identical representation.
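Whether an operation returned a view or a copy can be checked with np.shares_memory, as the following sketch illustrates. It also shows a case in which reshape() has to fall back to a copy, namely when the transposed layout cannot be described as a flat vector without moving data. The variable names are illustrative.

```python
import numpy as np

arr3 = np.array([[4, 8, 5], [9, 3, 4], [1, 0, 6]])

column = arr3[:, 1]               # basic slicing returns a view
flat = arr3.reshape(9)            # contiguous data can be reshaped without copying
flat_t = arr3.T.reshape(9)        # the transposed layout cannot, so NumPy copies

print(np.shares_memory(arr3, column))   # True
print(np.shares_memory(arr3, flat))     # True
print(np.shares_memory(arr3, flat_t))   # False
```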
Fancy indexing

The behavior described above changes with advanced indexing, i.e., if
at least one component of the index tuple is not a scalar index number
or slice. The case of fancy indexing is described below:

Advanced and basic indexing
arr3

## array([[4, 8, 5],
##        [9, 3, 4],
##        [1, 0, 6]])

arr = arr3[[0, 2], [0, 2]]
arr

## array([4, 6])

arr.base
Fancy indexing

Advanced and basic indexing
arr = arr3[0:3:2, 0:3:2]
arr

## array([[4, 5],
##        [1, 6]])

arr.base

## array([[4, 8, 5],
##        [9, 3, 4],
##        [1, 0, 6]])

Contrary to intuition, fancy indexing with arr3[[0, 2], [0, 2]] does not return a (2 × 2)-matrix, but a vector of the matrix elements (0, 0) and (2, 2). This is a complete copy, i.e., a new object and not a view of the original matrix.
A submatrix (view) with the corner elements of the initial matrix can be obtained with slicing.
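If the rows and columns of interest are not evenly spaced, slicing is not an option. One possible way to still obtain the submatrix is np.ix_, which builds an index for all combinations of the given rows and columns. The sketch below contrasts it with the pairwise fancy indexing from the slides; like every advanced indexing result, the submatrix is a copy. The variable names are illustrative.

```python
import numpy as np

arr3 = np.array([[4, 8, 5], [9, 3, 4], [1, 0, 6]])

# Pairwise selection: elements (0, 0) and (2, 2).
pairs = arr3[[0, 2], [0, 2]]

# All combinations of rows 0, 2 and columns 0, 2.
corners = arr3[np.ix_([0, 2], [0, 2])]

corners[0, 0] = -1        # modifies the copy only
print(pairs)              # [4 6]
print(corners)
print(arr3[0, 0])         # 4
```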
Boolean arrays

A boolean array is a NumPy array with boolean True and False values.
Such an array can be created by applying a comparison operator on
NumPy arrays.

Boolean arrays
bool_arr = (arr3 < 5)
bool_arr

## array([[ True, False, False],
##        [False,  True,  True],
##        [ True,  True, False]])

bool_arr1 = (arr3 == 0)
bool_arr1

## array([[False, False, False],
##        [False, False, False],
##        [False,  True, False]])

Comparisons on arrays can be combined by means of the bitwise operators
&, | and ^, which NumPy redefines to work element-wise on arrays.
Boolean arrays

Boolean arrays and bitwise operators
a = np.array([3, 8, 4, 1, 9, 5, 2])
b = np.array([2, 3, 5, 6, 11, 15, 17])
c = (a % 2 == 0) | (b % 3 == 0)  # or
c

## array([False,  True,  True,  True, False,  True,  True])

d = (a > b) ^ (a % 2 == 1)  # exclusive or
d

## array([False,  True, False,  True,  True,  True, False])

c ^ d  # exclusive or

## array([False, False,  True, False,  True, False,  True])

Boolean arrays
The logical functions of NumPy, such as np.logical_and(), np.logical_or()
and np.logical_xor(), work on arrays in a similar way to the bitwise
operators.
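Two details are easy to miss when combining conditions: the parentheses around each comparison are required, because & and | bind more tightly than comparison operators, and negation is written with ~. A minimal sketch with illustrative variable names:

```python
import numpy as np

a = np.array([3, 8, 4, 1, 9, 5, 2])

both = (a % 2 == 0) & (a > 3)          # element-wise "and"
neither = ~((a % 2 == 0) | (a > 3))    # element-wise "not" of an "or"

# The function form gives the same result as the operator form.
same = np.array_equal(both, np.logical_and(a % 2 == 0, a > 3))

print(both)
print(neither)
print(same)
```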
Indexing with boolean arrays

Boolean arrays can be used to select elements of other NumPy arrays.
If x is an array and y is a boolean array of the same shape, then
x[y] selects all the elements of x for which the corresponding value (at
the same position) of y is True.

Indexing with boolean arrays
arr3

## array([[4, 8, 5],
##        [9, 3, 4],
##        [1, 0, 6]])

y = arr3 % 2 == 0
y

## array([[ True,  True, False],
##        [False, False,  True],
##        [False,  True,  True]])

arr3[y]

## array([4, 8, 4, 0, 6])
Conditional indexing

Conditional indexing allows you to use boolean arrays to select subsets
of values and to avoid loops. When a comparison operator is applied to an
array, every element of the array is tested for whether it satisfies the
logical condition. Consider an application setting all even numbers to 5:

Find and replace values in arrays
a, b = arr3.copy(), arr3.copy()
for i in range(a.shape[0]):
    for j in range(a.shape[1]):
        if a[i, j] % 2 == 0:
            a[i, j] = 5

b[b % 2 == 0] = 5
b

## array([[5, 5, 5],
##        [9, 3, 5],
##        [1, 5, 5]])

np.allclose(a, b)

## True
Conditional indexing 119
Find and replace values in arrays, condition: equal

arr3
## array([[4, 8, 5],
##        [9, 3, 4],
##        [1, 0, 6]])

arr = arr3.copy()
arr[arr == 4] = 100
arr
## array([[100,   8,   5],
##        [  9,   3, 100],
##        [  1,   0,   6]])

In this example, arr == 4 creates a boolean array as described
before, which is then used to index the array arr.
Finally, every element of arr that is marked True in the boolean
index array is set to 100.

Best practice: Indexing arrays 120
Step 1a
Integer indexing array[row index, column index]: Indexing an
n-dimensional array with n integer indices returns the single value at
this position.

Best practice Step 1a

mat = np.arange(12).reshape((3, 4))
mat
## array([[ 0,  1,  2,  3],
##        [ 4,  5,  6,  7],
##        [ 8,  9, 10, 11]])

mat[2, 2]
## 10

mat[0, -1]
## 3

Keep in mind that, in this case only, the results are not arrays but
values!

Best practice: Indexing arrays 121
Step 1b
Integer indexing array[row index]: In n-dimensional arrays, the
element at each index is an (n − 1)-dimensional array.

Best practice Step 1b

mat = np.arange(12).reshape((3, 4))
mat
## array([[ 0,  1,  2,  3],
##        [ 4,  5,  6,  7],
##        [ 8,  9, 10, 11]])

mat[2]
## array([ 8,  9, 10, 11])

mat[0]
## array([0, 1, 2, 3])

By specifying the row index only, we create arrays which are views.

Best practice: Indexing arrays 122
Step 2a
Slicing array[start : stop : step]: Slicing can be used separately
for rows and columns.

Best practice Step 2a

mat = np.arange(12).reshape((3, 4))
mat
## array([[ 0,  1,  2,  3],
##        [ 4,  5,  6,  7],
##        [ 8,  9, 10, 11]])

mat[0:2]
## array([[0, 1, 2, 3],
##        [4, 5, 6, 7]])

mat[0:2, ::2]
## array([[0, 2],
##        [4, 6]])

Best practice: Indexing arrays 123
Step 2b
A frequent task is to get a specific row or column of an array. This can
be done easily by slicing.

Best practice Step 2b

mat
## array([[ 0,  1,  2,  3],
##        [ 4,  5,  6,  7],
##        [ 8,  9, 10, 11]])

row = mat[1]        # get second row
column = mat[:, 2]  # get third column
row
## array([4, 5, 6, 7])

column
## array([ 2,  6, 10])

Slicing with [:] means to take every element from the first to the last.

Best practice: Indexing arrays 124
Step 3
Fancy indexing array[rows list, columns list]: Returns a
one-dimensional array with the values at the index tuples specified
element-wise by the index lists.

Best practice Step 3

mat = np.arange(12).reshape((3, 4))
mat
## array([[ 0,  1,  2,  3],
##        [ 4,  5,  6,  7],
##        [ 8,  9, 10, 11]])

mat[[1, 2], [1, 2]]
## array([ 5, 10])

mat[[0, -1], [-1]]
## array([ 3, 11])

The index lists might also contain just a single element, which is
then paired with every element of the other list.

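Because fancy indexing pairs the two index lists element by element, it does not return the sub-matrix formed by the listed rows and columns. The following sketch contrasts the two cases; using np.ix_ for the sub-matrix is one common option, not something the slides prescribe, and the names pairs and block are illustrative.

```python
import numpy as np

mat = np.arange(12).reshape((3, 4))

# Paired indices: picks the elements (0, 1) and (2, 3).
pairs = mat[[0, 2], [1, 3]]

# Cross product of rows {0, 2} and columns {1, 3}: a 2 x 2 block.
block = mat[np.ix_([0, 2], [1, 3])]

print(pairs)
print(block)
```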
Best practice: Indexing arrays 125
Step 4
Conditional indexing: When comparison operators are applied to arrays,
the boolean operations are evaluated element-wise in a vectorized fashion.

Best practice Step 4

bool_mat = mat > 0
bool_mat
## array([[False,  True,  True,  True],
##        [ True,  True,  True,  True],
##        [ True,  True,  True,  True]])

mat[bool_mat] = 111  # equivalent to mat[mat > 0] = 111
mat
## array([[  0, 111, 111, 111],
##        [111, 111, 111, 111],
##        [111, 111, 111, 111]])

Best practice: Indexing arrays 126
Step 5
Replacing values in arrays: When assigning new values to a slice of an
array, the shape of the slice must be taken into account.

Best practice Step 5

mat[0] = np.array([3, 2, 1])  # Fails because the shapes do not fit
## Error: could not broadcast array from shape (3) into shape (4)

mat[2, 3] = 100
mat[:, 0] = np.array([3, 3, 3])
mat
## array([[  3, 111, 111, 111],
##        [  3, 111, 111, 111],
##        [  3, 111, 111, 100]])

mat[1:3, 1:3] = np.array([[0, 0], [0, 0]])
mat
## array([[  3, 111, 111, 111],
##        [  3,   0,   0, 111],
##        [  3,   0,   0, 100]])

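The shape rule in Step 5 is the usual NumPy broadcasting rule, so the assigned value does not have to match the slice exactly as long as it can be broadcast to it. A minimal sketch, using a fresh zero matrix and the illustrative flag name failed:

```python
import numpy as np

mat = np.zeros((3, 4), dtype=int)

mat[1] = 7                               # a scalar is broadcast to the whole row
mat[:, 1:3] = np.array([[1], [2], [3]])  # a (3, 1) column is repeated across both columns

try:
    mat[0] = np.array([3, 2, 1])         # shape (3,) cannot be broadcast to shape (4,)
    failed = False
except ValueError:
    failed = True

print(mat)
print(failed)
```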
Reshaping arrays 127
array.reshape((rows, columns)): Reshapes an existing array.
array.resize((rows, columns)): Changes array shape to rows x
columns and fills new values with 0.

Reshape

arr = np.arange(15)
arr.reshape((3, 5))
## array([[ 0,  1,  2,  3,  4],
##        [ 5,  6,  7,  8,  9],
##        [10, 11, 12, 13, 14]])

arr = np.arange(15)
arr.resize((3, 7))
arr
## array([[ 0,  1,  2,  3,  4,  5,  6],
##        [ 7,  8,  9, 10, 11, 12, 13],
##        [14,  0,  0,  0,  0,  0,  0]])

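Two details that the reshape and resize examples do not show, given here as a hedged sketch: a dimension of -1 lets NumPy infer the missing size, and the reshape of a contiguous array is a view on the same data. The function np.resize (as opposed to the method array.resize) returns a new array and fills it by repeating the input. The names mat and bigger are illustrative.

```python
import numpy as np

arr = np.arange(15)

mat = arr.reshape((3, -1))       # -1: NumPy infers 5 columns from the 15 elements
mat[0, 0] = 99                   # mat is a view here, so arr changes as well

bigger = np.resize(arr, (4, 5))  # new array, filled by repeating arr

print(mat.shape, arr[0])
print(bigger[3])
```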
Adding and removing elements of arrays 128
np.append(array, value): Appends value to the end of array.
np.insert(array, index, value): Inserts value before index.
np.delete(array, index, axis): Deletes the row or column at index
along the given axis.

Append, insert and delete

a = np.arange(5)
a = np.append(a, 8)
a = np.insert(a, 3, 77)
print(a)
## [ 0  1  2 77  3  4  8]

a.resize((3, 3))
np.delete(a, 1, axis=0)
## array([[0, 1, 2],
##        [8, 0, 0]])

Combining and splitting 129
np.concatenate((arr1, arr2), axis): Joins a sequence of arrays
along an existing axis.
np.split(array, n): Splits an array into n sub-arrays of equal size.
np.hsplit(array, n): Splits an array into multiple sub-arrays
horizontally (column-wise).

Concatenate and split

np.concatenate((a, np.arange(6).reshape(2, 3)), axis=0)
## array([[ 0,  1,  2],
##        [77,  3,  4],
##        [ 8,  0,  0],
##        [ 0,  1,  2],
##        [ 3,  4,  5]])

np.split(np.arange(8), 4)
## [array([0, 1]), array([2, 3]), array([4, 5]), array([6, 7])]

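np.hsplit is described above but not demonstrated. The following sketch shows it on a small matrix, together with np.vsplit as its row-wise counterpart; the matrix m and the result names are illustrative.

```python
import numpy as np

m = np.arange(12).reshape((3, 4))

left, right = np.hsplit(m, 2)   # two blocks of shape (3, 2), split column-wise
top, rest = np.vsplit(m, [1])   # split before row 1: shapes (1, 4) and (2, 4)

# Concatenating the horizontal pieces along axis 1 restores the original.
restored = np.concatenate((left, right), axis=1)

print(left)
print(rest.shape)
```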
Transposing array 130
array.T: Returns the transposed array (as a view).

Transpose

arr3
## array([[4, 8, 5],
##        [9, 3, 4],
##        [1, 0, 6]])

arr3.T
## array([[4, 9, 1],
##        [8, 3, 0],
##        [5, 4, 6]])

np.eye(3).T
## array([[1., 0., 0.],
##        [0., 1., 0.],
##        [0., 0., 1.]])

Matrix multiplication 131
np.dot(arr1, arr2): Conducts a matrix multiplication of arr1 and
arr2. The @ operator can be used instead of the np.dot() function.

Matrix multiplication

res = np.dot(arr3, np.arange(18).reshape((3, 6)))
res
## array([[108, 125, 142, 159, 176, 193],
##        [ 66,  82,  98, 114, 130, 146],
##        [ 72,  79,  86,  93, 100, 107]])

res2 = arr3 @ np.arange(18).reshape((3, 6))
res2
## array([[108, 125, 142, 159, 176, 193],
##        [ 66,  82,  98, 114, 130, 146],
##        [ 72,  79,  86,  93, 100, 107]])

np.allclose(res, res2)
## True

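A common source of confusion is that the * operator on arrays is not a matrix product. A minimal sketch with two illustrative 2 x 2 matrices A and B:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 1], [1, 0]])

elementwise = A * B   # multiplies entries at the same position
matprod = A @ B       # rows of A times columns of B

print(elementwise)
print(matprod)
```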
Array functions 132
Element-wise functions

arr3
## array([[4, 8, 5],
##        [9, 3, 4],
##        [1, 0, 6]])

np.sqrt(arr3)
## array([[2.        , 2.82842712, 2.23606798],
##        [3.        , 1.73205081, 2.        ],
##        [1.        , 0.        , 2.44948974]])

np.exp(arr3)
## array([[5.45981500e+01, 2.98095799e+03, 1.48413159e+02],
##        [8.10308393e+03, 2.00855369e+01, 5.45981500e+01],
##        [2.71828183e+00, 1.00000000e+00, 4.03428793e+02]])

Overview: Element-wise array functions 133
Function                                      Description
abs                                           Absolute value of integers and floating-point numbers
sqrt                                          Square root
exp                                           Exponential function
log, log10, log2                              Natural logarithm, log base 10, log base 2
sign                                          Sign (1: positive, 0: zero, -1: negative)
ceil                                          Round up to integer
floor                                         Round down to integer
rint                                          Round to nearest integer
modf                                          Returns fractional and integral parts
sin, cos, tan, sinh, cosh, tanh, arcsin, ...  Trigonometric and hyperbolic functions and their inverses

Binary functions 134
Binary

x = np.array([3, -6, 8, 4, 3, 5])
y = np.array([3, 5, 7, 3, 5, 9])
np.maximum(x, y)
## array([3, 5, 8, 4, 5, 9])

np.greater_equal(x, y)
## array([ True, False,  True,  True, False, False])

np.add(x, y)
## array([ 6, -1, 15,  7,  8, 14])

np.mod(x, y)
## array([0, 4, 1, 1, 3, 5])

Overview: Binary functions 135
Function              Description
add                   Add elements of arrays
subtract              Subtract elements in the second from the first array
multiply              Multiply elements
divide                Divide elements
power                 Raise elements in first array to powers in second
maximum               Element-wise maximum
minimum               Element-wise minimum
mod                   Element-wise modulus
greater, less, equal  Element-wise comparison, returns a boolean array

Data processing 136
np.meshgrid(array1, array2): Returns coordinate matrices from
coordinate arrays.

Evaluate the function $f(x, y) = \sqrt{x^2 + y^2}$ on a 10 x 10 grid

p = np.arange(-5, 5, 0.01)
x, y = np.meshgrid(p, p)
x
## array([[-5.  , -4.99, -4.98, ...,  4.97,  4.98,  4.99],
##        [-5.  , -4.99, -4.98, ...,  4.97,  4.98,  4.99],
##        [-5.  , -4.99, -4.98, ...,  4.97,  4.98,  4.99],
##        ...,
##        [-5.  , -4.99, -4.98, ...,  4.97,  4.98,  4.99],
##        [-5.  , -4.99, -4.98, ...,  4.97,  4.98,  4.99],
##        [-5.  , -4.99, -4.98, ...,  4.97,  4.98,  4.99]])

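The 1000 x 1000 coordinate matrices above are too large to inspect by eye. As a sketch, the same construction on a 3-point grid (the small p is chosen purely for illustration) shows what np.meshgrid returns: x repeats p along the rows, y repeats p along the columns, and a vectorized expression then evaluates the function at every grid point.

```python
import numpy as np

p = np.array([-1.0, 0.0, 1.0])
x, y = np.meshgrid(p, p)

# x varies along columns, y varies along rows.
print(x)
print(y)

# Distance of every grid point from the origin.
val = np.sqrt(x**2 + y**2)
print(val)
```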
Data processing 137
Evaluate the function $f(x, y) = \sqrt{x^2 + y^2}$ on a 10 x 10 grid.

import matplotlib.pyplot as plt
val = np.sqrt(x**2 + y**2)
plt.figure(figsize=(2, 2))
plt.imshow(val, cmap="hot")
plt.colorbar()
## <matplotlib.colorbar.Colorbar object at 0x7fe8375f8160>

Data processing 138
Evaluate the function $f(x, y) = \sqrt{x^2 + y^2}$ on a 10 x 10 grid.

plt.show()

[Figure: heat map of val (colormap "hot") with a colorbar; colorbar ticks at 2, 4 and 6]

Conditional logic 139
np.where(condition, a, b): Works element-wise. Where condition is
True, the value is taken from a, otherwise from b.

Conditional logic

a = np.array([4, 7, 5, -7, 9, 0])
b = np.array([-1, 9, 8, 3, 3, 3])
cond = np.array([True, True, False, True, False, False])
res = np.where(cond, a, b)
res
## array([ 4,  7,  8, -7,  3,  3])

res = np.where(a <= b, b, a)
res
## array([4, 9, 8, 3, 9, 3])

Conditional logic 140
Conditional logic, examples

arr3
## array([[4, 8, 5],
##        [9, 3, 4],
##        [1, 0, 6]])

res = np.where(arr3 < 5, 0, arr3)
res
## array([[0, 8, 5],
##        [9, 0, 0],
##        [0, 0, 6]])

even = np.where(arr3 % 2 == 0, arr3, arr3 + 1)
even
## array([[ 4,  8,  6],
##        [10,  4,  4],
##        [ 2,  0,  6]])

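When np.where is called with the condition only, it returns the positions at which the condition holds, rather than a new array of values. A short sketch with arr3 redefined for self-containment and the illustrative names rows, cols and hits:

```python
import numpy as np

arr3 = np.array([[4, 8, 5], [9, 3, 4], [1, 0, 6]])

# One index array per axis, listing where the condition is True.
rows, cols = np.where(arr3 > 5)

# The index arrays can be fed back in to fetch the matching values.
hits = arr3[rows, cols]

print(rows, cols)
print(hits)
```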
Statistical methods 141
array.mean(): Computes the mean of all array elements.
array.sum(): Computes the sum of all array elements.
array.argmin(): Returns the index of the minimum in the flattened array.

Statistical methods

arr3
## array([[4, 8, 5],
##        [9, 3, 4],
##        [1, 0, 6]])

arr3.mean()
## 4.444444444444445

arr3.sum()
## 40

arr3.argmin()
## 7

Overview: Statistical methods 142
Method          Description
sum             Sum of all array elements
mean            Mean of all array elements
std, var        Standard deviation, variance
min, max        Minimum and maximum value in array
argmin, argmax  Indices of minimum and maximum value

Axis 143
Axes are defined for arrays with more than one dimension. A
two-dimensional array has two axes. The first runs vertically
downwards across the rows (axis=0), the second runs horizontally
across the columns (axis=1).

Axis

arr3
## array([[4, 8, 5],
##        [9, 3, 4],
##        [1, 0, 6]])

arr3.sum(axis=0)
## array([14, 11, 15])

arr3.sum(axis=1)
## array([17, 16,  7])

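The axis argument combines naturally with broadcasting. The sketch below demeans each column and converts each row into shares of its row total, two operations that come up often with data matrices. The names col_means, demeaned, row_sums and shares are illustrative, and keepdims=True is used so that the row sums keep a shape that broadcasts against the rows.

```python
import numpy as np

arr3 = np.array([[4, 8, 5], [9, 3, 4], [1, 0, 6]])

col_means = arr3.mean(axis=0)               # one mean per column, shape (3,)
demeaned = arr3 - col_means                 # subtracted from every row

row_sums = arr3.sum(axis=1, keepdims=True)  # shape (3, 1) instead of (3,)
shares = arr3 / row_sums                    # each row now sums to 1

print(demeaned.mean(axis=0))
print(shares.sum(axis=1))
```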
Sorting 144
array.sort(axis): Sorts the array in place along an axis.

Sorting one-dimensional arrays

arr2
## array([24.3 ,  0.  ,  8.9 ,  4.4 ,  1.65, 45.  ])

arr2.sort()
arr2
## array([ 0.  ,  1.65,  4.4 ,  8.9 , 24.3 , 45.  ])

Sorting 145
Sorting two-dimensional arrays

arr3
## array([[4, 8, 5],
##        [9, 3, 4],
##        [1, 0, 6]])

arr3.sort()
arr3
## array([[4, 5, 8],
##        [3, 4, 9],
##        [0, 1, 6]])

arr3.sort(axis=0)
arr3
## array([[0, 1, 6],
##        [3, 4, 8],
##        [4, 5, 9]])

The default axis using sort() is -1, which means to sort along the
last axis (in this case axis 1).

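Since array.sort() overwrites the array, it may help to know the non-destructive alternatives. This sketch uses np.sort, which returns a sorted copy, and np.argsort, which returns the ordering itself. The values of arr2 are taken from the one-dimensional example; the other names are illustrative.

```python
import numpy as np

arr2 = np.array([24.3, 0.0, 8.9, 4.4, 1.65, 45.0])

sorted_copy = np.sort(arr2)      # arr2 itself stays unchanged
order = np.argsort(arr2)         # indices that would sort arr2
descending = arr2[order[::-1]]   # reuse the ordering, largest first

print(sorted_copy)
print(order)
print(descending)
```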
Section 2.3 146
Numerical programming

Linear algebra

Inverse matrix 147
Import numpy.linalg

import numpy.linalg as nplin

nplin.inv(array): Computes the inverse matrix.
np.allclose(array1, array2): Returns True if two arrays are
element-wise equal within a tolerance.

Note that arr3 was sorted in place before, so it now equals
[[0, 1, 6], [3, 4, 8], [4, 5, 9]].

Inverse

inv = nplin.inv(arr3)
inv
## array([[  4., -21.,  16.],
##        [ -5.,  24., -18.],
##        [  1.,  -4.,   3.]])

np.allclose(np.identity(3), np.dot(inv, arr3))
## True

Matrix functions 148
nplin.det(array): Computes the determinant.
np.trace(array): Computes the trace.
np.diag(array): Returns the diagonal elements as an array.

Linear algebra functions

nplin.det(arr3)
## -1.0

np.trace(arr3)
## 13

np.diag(arr3)
## array([0, 4, 9])

Eigenvalues and eigenvectors 149
nplin.eig(array): Returns a tuple containing the array of eigenvalues
and the array of eigenvectors. Column k of the eigenvector array
belongs to eigenvalue k.

Get eigenvalues and eigenvectors

A = np.array([[3, -1, 0], [2, 0, 0], [-2, 2, -1]])
eigenval, eigenvec = nplin.eig(A)
eigenval
## array([-1.,  1.,  2.])

eigenvec
## array([[ 0.        , -0.40824829, -0.70710678],
##        [ 0.        , -0.81649658, -0.70710678],
##        [ 1.        , -0.40824829,  0.        ]])

Eigenvalues and eigenvectors 150
Check eigenvalues and eigenvectors

eigenval * eigenvec
## array([[-0.        , -0.40824829, -1.41421356],
##        [-0.        , -0.81649658, -1.41421356],
##        [-1.        , -0.40824829,  0.        ]])

np.dot(A, eigenvec)
## array([[ 0.        , -0.40824829, -1.41421356],
##        [ 0.        , -0.81649658, -1.41421356],
##        [-1.        , -0.40824829,  0.        ]])

$$\begin{pmatrix} 3 & -1 & 0 \\ 2 & 0 & 0 \\ -2 & 2 & -1 \end{pmatrix} \cdot \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} = (-1) \cdot \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ -1 \end{pmatrix}$$

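Comparing the two printed matrices by eye works for a 3 x 3 example, but a programmatic check scales better. A minimal sketch that verifies A v = λ v for all eigenpairs at once, using np.allclose to absorb floating-point noise; the name ok is illustrative.

```python
import numpy as np
import numpy.linalg as nplin

A = np.array([[3, -1, 0], [2, 0, 0], [-2, 2, -1]])
eigenval, eigenvec = nplin.eig(A)

# eigenvec * eigenval scales column k by eigenvalue k (broadcast over rows),
# so both sides hold the vectors lambda_k * v_k in their columns.
ok = np.allclose(A @ eigenvec, eigenvec * eigenval)

print(ok)
```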
QR decomposition 151
nplin.qr(array): Conducts a QR decomposition and returns Q and
R as a tuple of arrays.

QR decomposition

Q, R = nplin.qr(arr3)
Q
## array([[ 0.        ,  0.98058068,  0.19611614],
##        [-0.6       ,  0.15689291, -0.78446454],
##        [-0.8       , -0.11766968,  0.58834841]])

R
## array([[ -5.        ,  -6.4       , -12.        ],
##        [  0.        ,   1.0198039 ,   6.07960019],
##        [  0.        ,   0.        ,   0.19611614]])

np.allclose(arr3, np.dot(Q, R))
## True

Linear system 152
nplin.solve(A, b): Returns the solution of the linear system Ax = b.

Solve linear systems

b = np.array([7, 4, 8])
x = nplin.solve(A, b)
x
## array([  2.,  -1., -14.])

np.allclose(np.dot(A, x), b)
## True

$$\begin{aligned} 3x_1 - 1x_2 + 0x_3 &= 7 \\ 2x_1 - 0x_2 + 0x_3 &= 4 \\ -2x_1 + 2x_2 - 1x_3 &= 8 \end{aligned} \quad \rightarrow \quad \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 2 \\ -1 \\ -14 \end{pmatrix}$$

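The same system could also be solved by forming the inverse explicitly. The sketch below shows that both routes give the same answer here; nplin.solve is generally the preferred one, because it avoids computing the inverse. The names x_solve and x_inv are illustrative.

```python
import numpy as np
import numpy.linalg as nplin

A = np.array([[3, -1, 0], [2, 0, 0], [-2, 2, -1]])
b = np.array([7, 4, 8])

x_solve = nplin.solve(A, b)   # solves Ax = b directly
x_inv = nplin.inv(A) @ b      # same result via the explicit inverse

print(x_solve)
print(np.allclose(x_solve, x_inv))
```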
Overview: Linear algebra 153
Function     Description
np.dot       Matrix multiplication
np.trace     Sum of the diagonal elements
np.diag      Diagonal elements as an array
nplin.det    Matrix determinant
nplin.eig    Eigenvalues and eigenvectors
nplin.inv    Inverse matrix
nplin.qr     QR decomposition
nplin.solve  Solve linear system

Chapter 3 154
Data formats and handling

3.1 Pandas package
3.2 Series
3.3 DataFrame
3.4 Import/Export data

Section 3.1 155
Data formats and handling

Pandas package

Pandas 156
The package pandas is a free software library for Python including the
following features:

Data manipulation and analysis,
DataFrame objects and Series,
Export and import data from files and web,
Handling of missing data.

→ Provides high-performance data structures and data analysis tools.

Motivation 157
With pandas you can import and visualize financial data in only a few
lines of code.

Motivation

import pandas as pd
import matplotlib.pyplot as plt
fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
dow = pd.read_csv("data/dji.csv", index_col=0, parse_dates=True)
close = dow["Close"]
close.plot(ax=ax)
ax.set_xlabel("Date")
ax.set_ylabel("Price")
ax.set_title("DJI")
fig.savefig("out/dji.pdf", format="pdf")

Motivation 158
[Figure: line plot titled "DJI"; x-axis "Date" (2006 to 2018), y-axis "Price" (7500 to 27500)]

Section 3.2 159
Data formats and handling

Series

Series 160
Series is a data structure in pandas:

One-dimensional array-like object,
Containing a sequence of values and a corresponding array of
labels, called the index,
The string representation of a Series displays the index on the left
and the values on the right,
The default index consists of the integers 0 through N-1.

String representation of a Series

## 0     3
## 1     7
## 2    -8
## 3     4
## 4    26
## dtype: int64

Create Series 161
pd.Series(): Creates a one-dimensional array-like object including
values and an index.

Importing Pandas and creating a Series

import numpy as np
import pandas as pd
obj = pd.Series([2, -5, 9, 4])
obj
## 0    2
## 1   -5
## 2    9
## 3    4
## dtype: int64

Simple Series formed only from a list,
An index is added automatically.

Create Series

Series indexing vs. NumPy indexing

obj2 = pd.Series([2, -5, 9, 4], index=["a", "b", "c", "d"])
npobj = np.array([2, -5, 9, 4])
obj2

## a    2
## b   -5
## c    9
## d    4
## dtype: int64

obj2["b"]

## -5

npobj[1]

## -5

NumPy arrays are indexed by integer position, while a Series can also be indexed by the labels of its manually set index.
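A minimal sketch of the difference between label-based and position-based access, assuming the same values as in the example above. The variable names (arr, ser, by_label, by_position) are chosen for illustration only. The explicit indexers .loc and .iloc make clear which kind of access is meant.

```python
import numpy as np
import pandas as pd

values = [2, -5, 9, 4]
arr = np.array(values)
ser = pd.Series(values, index=["a", "b", "c", "d"])

by_position_np = arr[1]    # NumPy: access by integer position
by_label = ser.loc["b"]    # Series: access by index label
by_position = ser.iloc[1]  # Series: access by integer position

print(by_position_np, by_label, by_position)
```

All three expressions select the same element; only the Series offers the label-based variant.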
Create Series

Pandas Series can be created from:

Lists,
NumPy arrays,
Dicts.

Series creation from NumPy arrays

npobj = np.array([2, -5, 9, 4])
obj2 = pd.Series(npobj, index=["a", "b", "c", "d"])
obj2

## a    2
## b   -5
## c    9
## d    4
## dtype: int64
Create Series

Series from dicts

dictdata = {"Göttingen": 117665, "Northeim": 28920,
            "Hannover": 532163, "Berlin": 3574830}
obj3 = pd.Series(dictdata)
obj3

## Göttingen     117665
## Northeim       28920
## Hannover      532163
## Berlin       3574830
## dtype: int64

The index of the Series can be set manually,
Unlike with a NumPy array, you can use the set index to select single values,
Data contained in a dict can be passed to a Series. The index of the resulting Series consists of the dict's keys.
Create Series

Dict to Series with manual index

cities = ["Hamburg", "Göttingen", "Berlin", "Hannover"]
obj4 = pd.Series(dictdata, index=cities)
obj4

## Hamburg            NaN
## Göttingen     117665.0
## Berlin       3574830.0
## Hannover      532163.0
## dtype: float64

When passing a dict to a Series, the index can still be set manually,
NaN (not a number) marks missing values where the index and the dict do not match,
Dict keys that do not appear in the index (here Northeim) are dropped.
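The matching of dict keys against a manual index can be checked with a small example. This is a sketch with a shortened, illustrative dict (population) and index list (wanted); these names are not part of the course material.

```python
import pandas as pd

population = {"Göttingen": 117665, "Northeim": 28920}
wanted = ["Hamburg", "Göttingen"]

# Only labels listed in `wanted` end up in the Series.
ser = pd.Series(population, index=wanted)

print(ser)
print(ser.dtype)  # float64, because NaN is a floating point value
```

The label without a matching key becomes NaN, the key without a matching label is dropped, and the integer values are converted to floats to make room for NaN.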
Series properties

Series.values: Returns the values of a Series.
Series.index: Returns the index of a Series.

Series properties

obj.values

## array([ 2, -5,  9,  4])

obj.index

## RangeIndex(start=0, stop=4, step=1)

obj2.index

## Index(['a', 'b', 'c', 'd'], dtype='object')

The values and the index of a Series can be printed separately.
The default index, if none was explicitly specified, is a RangeIndex.
RangeIndex inherits from the Index class.
Selecting and manipulating values

Series manipulation

obj2[["c", "d", "a"]]

## c    9
## d    4
## a    2
## dtype: int64

obj2[obj2 < 0]

## b   -5
## dtype: int64

NumPy-like operations can be applied to a Series:
For filtering data,
For scalar multiplication or applying math functions,
The index-value link will be preserved.
Selecting and manipulating values

Series functions

obj2 * 2

## a     4
## b   -10
## c    18
## d     8
## dtype: int64

np.exp(obj2)["a":"c"]

## a       7.389056
## b       0.006738
## c    8103.083928
## dtype: float64

"c" in obj2

## True

Mathematical functions applied to a Series are only applied to its values, not to its index.
Selecting and manipulating values

Series manipulation

obj4["Hamburg"] = 1900000
obj4

## Hamburg      1900000.0
## Göttingen     117665.0
## Berlin       3574830.0
## Hannover      532163.0
## dtype: float64

obj4[["Berlin", "Hannover"]] = [3600000, 1100000]
obj4

## Hamburg      1900000.0
## Göttingen     117665.0
## Berlin       3600000.0
## Hannover     1100000.0
## dtype: float64

Values can be manipulated by using the labels in the index,
Sets of values can be set in one line.
Detect missing data

pd.isnull(): True if data is missing.
pd.notnull(): False if data is missing.

NaN

pd.isnull(obj4)

## Hamburg      False
## Göttingen    False
## Berlin       False
## Hannover     False
## dtype: bool

pd.notnull(obj4)

## Hamburg      True
## Göttingen    True
## Berlin       True
## Hannover     True
## dtype: bool
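Because obj4 no longer contains missing values at this point, both checks above return the same answer for every entry. The following sketch uses a small, made-up Series (ser) that does contain a NaN, so that the two functions visibly differ. Series.dropna() is added as one common follow-up step.

```python
import numpy as np
import pandas as pd

ser = pd.Series([1.5, np.nan, 3.0], index=["x", "y", "z"])

missing = pd.isnull(ser)        # True where a value is missing
present = pd.notnull(ser)       # element-wise negation of `missing`
n_missing = int(missing.sum())  # True counts as 1
cleaned = ser.dropna()          # new Series without the missing entries

print(missing)
print(n_missing)
print(cleaned)
```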
Align differently indexed data

There are not two values to align for Hamburg and Northeim, so they are marked with NaN (not a number).

Data 1

obj3

## Göttingen     117665
## Northeim       28920
## Hannover      532163
## Berlin       3574830
## dtype: int64

Data 2

obj4

## Hamburg      1900000.0
## Göttingen     117665.0
## Berlin       3600000.0
## Hannover     1100000.0
## dtype: float64

Align data

obj3 + obj4

## Berlin       7174830.0
## Göttingen     235330.0
## Hamburg            NaN
## Hannover     1632163.0
## Northeim           NaN
## dtype: float64
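If a label that exists in only one of the two Series should count as zero instead of producing NaN, the method Series.add() with its fill_value parameter can be used. The sketch below uses two small illustrative Series (s1, s2) with made-up numbers.

```python
import pandas as pd

s1 = pd.Series({"Göttingen": 1, "Northeim": 2})
s2 = pd.Series({"Hamburg": 10, "Göttingen": 20})

plain = s1 + s2                    # NaN where a label is missing on one side
filled = s1.add(s2, fill_value=0)  # a missing side is treated as 0

print(plain)
print(filled)
```

Whether zero is a sensible substitute depends on the data; for population figures, as in the example above, NaN is usually the more honest result.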
Naming Series

Series.name: Returns the name of the Series.
Series.index.name: Returns the name of the Series' index.

Naming

obj4.name = "population"
obj4.index.name = "city"
obj4

## city
## Hamburg      1900000.0
## Göttingen     117665.0
## Berlin       3600000.0
## Hannover     1100000.0
## Name: population, dtype: float64

Assigning to the attribute name changes the name of the existing Series,
There is no default name for the Series or the index.
Series vs. NumPy arrays

NumPy arrays are accessed by their integer positions,
Series can be accessed by a user defined index, including letters and numbers,
Different Series can be aligned efficiently by the index,
Series can work with missing values, so operations do not automatically fail.
Section 3.3

Data formats and handling
DataFrame
DataFrame

The DataFrame is the primary data structure of pandas,
It represents a table of data with an ordered collection of columns,
Each column can have a different data type,
A DataFrame can be thought of as a dict of Series sharing the same index,
Physically a DataFrame is two-dimensional, but by using hierarchical indexing it can represent higher dimensional data.

String representation of a DataFrame

##    company   price   volume
## 0  Daimler   69.20  4456290
## 1     E.ON    8.11  3667975
## 2  Siemens  110.92  3669487
## 3     BASF   87.28  1778058
## 4      BMW   87.81  1824582
DataFrame

pd.DataFrame(): Creates a DataFrame, which is a two-dimensional tabular structure with labeled axes (rows and columns).

Creating a DataFrame

data = {"company": ["Daimler", "E.ON", "Siemens", "BASF", "BMW"],
        "price": [69.2, 8.11, 110.92, 87.28, 87.81],
        "volume": [4456290, 3667975, 3669487, 1778058, 1824582]}
frame = pd.DataFrame(data)
frame

##    company   price   volume
## 0  Daimler   69.20  4456290
## 1     E.ON    8.11  3667975
## 2  Siemens  110.92  3669487
## 3     BASF   87.28  1778058
## 4      BMW   87.81  1824582

In this example the DataFrame frame is constructed by passing a dict of equal-length lists,
Instead of passing a dict of lists, it is also possible to pass a dict of NumPy arrays.
Show DataFrames

Print DataFrame

frame2 = pd.DataFrame(data, columns=["company", "volume",
                                     "price", "change"])
frame2

##    company   volume   price change
## 0  Daimler  4456290   69.20    NaN
## 1     E.ON  3667975    8.11    NaN
## 2  Siemens  3669487  110.92    NaN
## 3     BASF  1778058   87.28    NaN
## 4      BMW  1824582   87.81    NaN

A requested column that is not contained in the dict is filled with NaN,
The default index will be assigned automatically as with Series.
Inputs to DataFrame constructor

Type                               Description
2D NumPy arrays                    A matrix of data
dict of arrays, lists, or tuples   Each sequence becomes a column
dict of Series                     Each value becomes a column
dict of dicts                      Each inner dict becomes a column
List of dicts or Series            Each item becomes a row
List of lists or tuples            Treated as the 2D NumPy arrays
Another DataFrame                  Same indexes
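Several rows of the table above can be tried side by side. The sketch below builds the same 2x3 table from four different inputs; the variable names and the toy numbers are illustrative only.

```python
import numpy as np
import pandas as pd

cols = ["a", "b", "c"]

# 2D NumPy array: a matrix of data, column labels passed separately
from_array = pd.DataFrame(np.arange(6).reshape(2, 3), columns=cols)

# dict of lists: each sequence becomes a column
from_dict_of_lists = pd.DataFrame({"a": [0, 3], "b": [1, 4], "c": [2, 5]})

# list of dicts: each item becomes a row
from_list_of_dicts = pd.DataFrame([{"a": 0, "b": 1, "c": 2},
                                   {"a": 3, "b": 4, "c": 5}])

# dict of dicts: each inner dict becomes a column, inner keys become the index
from_dict_of_dicts = pd.DataFrame({"a": {0: 0, 1: 3},
                                   "b": {0: 1, 1: 4},
                                   "c": {0: 2, 1: 5}})

print(from_array)
```

All four objects hold the same values, column labels and row index.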
Indexing and adding DataFrames

Add data to DataFrame

frame2["change"] = [1.2, -3.2, 0.4, -0.12, 2.4]
frame2["change"]

## 0    1.20
## 1   -3.20
## 2    0.40
## 3   -0.12
## 4    2.40
## Name: change, dtype: float64

Selecting a single column of a DataFrame returns a Series,
An attribute-like access, e.g., frame2.change, is also possible,
The returned Series has the same index as the initial DataFrame.
Indexing DataFrames

Indexing DataFrames

frame2[["company", "change"]]

##    company  change
## 0  Daimler    1.20
## 1     E.ON   -3.20
## 2  Siemens    0.40
## 3     BASF   -0.12
## 4      BMW    2.40

Indexing with a list of multiple columns returns a DataFrame,
The returned DataFrame has the same index as the initial one.
Changing DataFrames

del DataFrame[column]: Deletes column from DataFrame.

DataFrame delete column

del frame2["volume"]
frame2

##    company   price  change
## 0  Daimler   69.20    1.20
## 1     E.ON    8.11   -3.20
## 2  Siemens  110.92    0.40
## 3     BASF   87.28   -0.12
## 4      BMW   87.81    2.40

frame2.columns

## Index(['company', 'price', 'change'], dtype='object')
Naming DataFrames

Naming properties

frame2.index.name = "number:"
frame2.columns.name = "feature:"
frame2

## feature:  company   price  change
## number:
## 0         Daimler   69.20    1.20
## 1            E.ON    8.11   -3.20
## 2         Siemens  110.92    0.40
## 3            BASF   87.28   -0.12
## 4             BMW   87.81    2.40

In DataFrames there is no default name for the index or the columns.
Reindexing

DataFrame.reindex(): Creates new DataFrame with data conformed to a new index, while the initial DataFrame will not be changed.

Reindexing

frame3 = frame.reindex([0, 2, 3, 4])
frame3

##    company   price   volume
## 0  Daimler   69.20  4456290
## 2  Siemens  110.92  3669487
## 3     BASF   87.28  1778058
## 4      BMW   87.81  1824582

Index values that are not already present will be filled with NaN by default,
There are many options for filling missing values.
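Two of the filling options are the parameters fill_value and method of reindex(). The sketch below illustrates them on a small, made-up Series (ser); the same parameters exist for DataFrame.reindex(). Note that method="ffill" requires a monotonically increasing or decreasing index.

```python
import pandas as pd

ser = pd.Series([1.0, 2.0, 3.0], index=[0, 2, 4])

with_nan = ser.reindex(range(5))                   # new labels 1 and 3 become NaN
with_zero = ser.reindex(range(5), fill_value=0.0)  # new labels get a constant
forward = ser.reindex(range(5), method="ffill")    # carry the last value forward

print(with_nan)
print(with_zero)
print(forward)
```

In all three cases ser itself is left unchanged.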
Reindexing

Filling missing values

frame4 = frame.reindex(index=[0, 2, 3, 4, 5], fill_value=0,
                       columns=["company", "price", "market cap"])
frame4

##    company   price  market cap
## 0  Daimler   69.20           0
## 2  Siemens  110.92           0
## 3     BASF   87.28           0
## 4      BMW   87.81           0
## 5        0    0.00           0

frame4 = frame.reindex(index=[0, 2, 3, 4], fill_value=np.nan,
                       columns=["company", "price", "market cap"])
frame4

##    company   price  market cap
## 0  Daimler   69.20         NaN
## 2  Siemens  110.92         NaN
## 3     BASF   87.28         NaN
## 4      BMW   87.81         NaN
Fill NaN

DataFrame.fillna(value): Fills NaNs with value.

Filling NaN

frame4[:3]

##    company   price  market cap
## 0  Daimler   69.20         NaN
## 2  Siemens  110.92         NaN
## 3     BASF   87.28         NaN

frame4.fillna(1000000, inplace=True)
frame4[:3]

##    company   price  market cap
## 0  Daimler   69.20   1000000.0
## 2  Siemens  110.92   1000000.0
## 3     BASF   87.28   1000000.0

The option inplace=True modifies the current DataFrame (here frame4) directly. Without inplace, fillna() returns a new DataFrame with the NaN values filled and leaves the original unchanged.
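The difference between the returned copy and the in-place variant can be verified directly. This is a sketch with a small, made-up DataFrame (df); the names filled_copy and still_missing are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"price": [69.2, np.nan], "cap": [np.nan, 5.0]})

# Without inplace: a new DataFrame is returned, df keeps its NaNs.
filled_copy = df.fillna(0)
still_missing = int(df.isnull().sum().sum())  # number of NaNs left in df

# With inplace=True: df itself is modified and nothing is returned.
df.fillna(0, inplace=True)

print(still_missing)
print(df)
```

Assigning the result back, as in df = df.fillna(0), is an equally valid way to keep the filled data under the same name.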
Dropping entries

DataFrame.drop(index, axis): Returns a new object with labels in requested axis removed.

Dropping index

frame5 = frame  # frame5 is a second name for frame, not a copy
frame5

##    company   price   volume
## 0  Daimler   69.20  4456290
## 1     E.ON    8.11  3667975
## 2  Siemens  110.92  3669487
## 3     BASF   87.28  1778058
## 4      BMW   87.81  1824582

frame5.drop([1, 2])

##    company  price   volume
## 0  Daimler  69.20  4456290
## 3     BASF  87.28  1778058
## 4      BMW  87.81  1824582
Dropping entries

Dropping column

frame5[:2]

##    company  price   volume
## 0  Daimler  69.20  4456290
## 1     E.ON   8.11  3667975

frame5.drop("price", axis=1)[:3]

##    company   volume
## 0  Daimler  4456290
## 1     E.ON  3667975
## 2  Siemens  3669487

frame5.drop(2, axis=0)

##    company  price   volume
## 0  Daimler  69.20  4456290
## 1     E.ON   8.11  3667975
## 3     BASF  87.28  1778058
## 4      BMW  87.81  1824582
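Since drop() returns a new object, the original DataFrame keeps all rows and columns unless the result is assigned back. The sketch below checks this on a shortened, illustrative table (df). The keyword columns= is an alternative spelling of axis=1.

```python
import pandas as pd

df = pd.DataFrame({"company": ["Daimler", "E.ON", "Siemens"],
                   "price": [69.2, 8.11, 110.92]})

without_rows = df.drop([0, 1])          # axis=0 is the default: drop row labels
without_col = df.drop("price", axis=1)  # drop a column by label
same = df.drop(columns="price")         # equivalent keyword form

print(df.shape)  # the original is unchanged
print(without_rows)
print(without_col)
```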
Indexing, selecting and filtering

Indexing a DataFrame works like indexing a NumPy array; you can use the default index values or a manually set index.

Indexing

frame

##    company   price   volume
## 0  Daimler   69.20  4456290
## 1     E.ON    8.11  3667975
## 2  Siemens  110.92  3669487
## 3     BASF   87.28  1778058
## 4      BMW   87.81  1824582

frame[2:]

##    company   price   volume
## 2  Siemens  110.92  3669487
## 3     BASF   87.28  1778058
## 4      BMW   87.81  1824582
Indexing, selecting and filtering

Indexing

frame6 = pd.DataFrame(data, index=["a", "b", "c", "d", "e"])
frame6

##    company   price   volume
## a  Daimler   69.20  4456290
## b     E.ON    8.11  3667975
## c  Siemens  110.92  3669487
## d     BASF   87.28  1778058
## e      BMW   87.81  1824582

frame6["b":"d"]

##    company   price   volume
## b     E.ON    8.11  3667975
## c  Siemens  110.92  3669487
## d     BASF   87.28  1778058

When slicing with labels the end element is inclusive.
Indexing, selecting and filtering

DataFrame.loc[]: Selects a subset of rows and columns from a DataFrame using axis labels.
DataFrame.iloc[]: Selects a subset of rows and columns from a DataFrame using integer positions.

Both are indexers and are used with square brackets, not called like functions.

Selection with loc and iloc

frame6.loc["c", ["company", "price"]]

## company    Siemens
## price       110.92
## Name: c, dtype: object

frame6.iloc[2, [0, 1]]

## company    Siemens
## price       110.92
## Name: c, dtype: object
Indexing, selecting and filtering

Selection with loc and iloc

frame6.loc[["c", "d", "e"], ["volume", "price", "company"]]

##     volume   price  company
## c  3669487  110.92  Siemens
## d  1778058   87.28     BASF
## e  1824582   87.81      BMW

frame6.iloc[2:, ::-1]

##     volume   price  company
## c  3669487  110.92  Siemens
## d  1778058   87.28     BASF
## e  1824582   87.81      BMW

Both indexers work with slices or lists (of labels for loc, of integer positions for iloc),
There are many ways to select and rearrange pandas objects.
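One detail that is easy to miss is that label slices with loc include the end label, while position slices with iloc exclude the end position, as in plain Python. The sketch below shows this on a small illustrative DataFrame (df) and adds a boolean filter inside loc as a further selection option.

```python
import pandas as pd

df = pd.DataFrame({"price": [69.2, 8.11, 110.92, 87.28]},
                  index=["a", "b", "c", "d"])

by_label = df.loc["b":"c"]   # label slice: both endpoints are included
by_position = df.iloc[1:2]   # position slice: the end position is excluded

# Boolean mask for the rows, label for the column
expensive = df.loc[df["price"] > 80, "price"]

print(by_label)
print(by_position)
print(expensive)
```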
DataFrame indexing options

Type                 Description
df[val]              Select single column or set of columns
df.loc[val]          Select single row or set of rows
df.loc[:, val]       Select single column or set of columns
df.loc[val1, val2]   Select row and column by label
df.iloc[where]       Select row or set of rows by integer position
df.iloc[:, where]    Select column or set of columns by integer pos.
df.iloc[w1, w2]      Select row and column by integer position
Hierarchical indexing

Hierarchical indexing enables you to have multiple index levels.

Multiindex

ind = [["a", "a", "a", "b", "b"], [1, 2, 3, 1, 2]]
frame6 = pd.DataFrame(np.arange(15).reshape((5, 3)),
                      index=ind,
                      columns=["first", "second", "third"])
frame6

##      first  second  third
## a 1      0       1      2
##   2      3       4      5
##   3      6       7      8
## b 1      9      10     11
##   2     12      13     14

frame6.index.names = ["index1", "index2"]
frame6.index

## MultiIndex(levels=[['a', 'b'], [1, 2, 3]],
##            labels=[[0, 0, 0, 1, 1], [0, 1, 2, 0, 1]],
##            names=['index1', 'index2'])
Hierarchical indexing

Selecting from a multiindex

frame6.loc["a"]

##         first  second  third
## index2
## 1           0       1      2
## 2           3       4      5
## 3           6       7      8

frame6.loc["b", 1]

## first      9
## second    10
## third     11
## Name: (b, 1), dtype: int64
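Selecting on the inner level alone is not covered by the two calls above. One way to do it is DataFrame.xs() with its level parameter, sketched below on the same data. Here the index is built explicitly with pd.MultiIndex.from_arrays(), which is an alternative to passing a list of lists; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_arrays(
    [["a", "a", "a", "b", "b"], [1, 2, 3, 1, 2]],
    names=["index1", "index2"],
)
df = pd.DataFrame(np.arange(15).reshape(5, 3), index=index,
                  columns=["first", "second", "third"])

outer = df.loc["a"]               # all rows with index1 == "a"
single = df.loc[("b", 1)]         # one row, returned as a Series
inner = df.xs(1, level="index2")  # all rows with index2 == 1

print(outer)
print(single)
print(inner)
```

With xs() the selected level is removed from the result, so inner is indexed by index1 only.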
Operations between DataFrame and Series

Series and DataFrames

frame7 = frame[["price", "volume"]]
frame7.index = ["Daimler", "E.ON", "Siemens", "BASF", "BMW"]
series = frame7.iloc[2]
frame7

##           price   volume
## Daimler   69.20  4456290
## E.ON       8.11  3667975
## Siemens  110.92  3669487
## BASF      87.28  1778058
## BMW       87.81  1824582

series

## price         110.92
## volume    3669487.00
## Name: Siemens, dtype: float64

Here the Series was generated from the third row of the DataFrame (integer position 2, Siemens).
Operations between DataFrames and Series

Operations between Series and DataFrames down the rows

frame7 + series

##           price     volume
## Daimler  180.12  8125777.0
## E.ON     119.03  7337462.0
## Siemens  221.84  7338974.0
## BASF     198.20  5447545.0
## BMW      198.73  5494069.0

By default arithmetic operations between DataFrames and Series match the index of the Series on the DataFrame's columns,
The operation is then broadcast down the rows.
Operations between DataFrames and Series

Operations between Series and DataFrames down the columns

series2 = frame7["price"]
frame7.add(series2, axis=0)

##           price      volume
## Daimler  138.40  4456359.20
## E.ON      16.22  3667983.11
## Siemens  221.84  3669597.92
## BASF     174.56  1778145.28
## BMW      175.62  1824669.81

Here, the Series was generated from the price column,
The arithmetic operation will be broadcasted along a column matching the DataFrame's row index (axis=0).
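A typical use of both matching directions is centering columns and scaling rows. The following sketch uses a small DataFrame with made-up numbers (df); col_means, demeaned and relative are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({"price": [10.0, 20.0, 30.0],
                   "volume": [100.0, 200.0, 300.0]},
                  index=["x", "y", "z"])

# df.mean() is a Series indexed by the column names, so the default
# matching on columns subtracts each column's own mean.
col_means = df.mean()
demeaned = df - col_means

# df["price"] is indexed by the row labels, so axis=0 is needed to
# divide every column by the price of the same row.
relative = df.div(df["price"], axis=0)

print(demeaned)
print(relative)
```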
Operations between DataFrames and Series

Pandas vs NumPy

nparr = np.arange(12.).reshape((3, 4))
row = nparr[0]
nparr - row

## array([[0., 0., 0., 0.],
##        [4., 4., 4., 4.],
##        [8., 8., 8., 8.]])

Operations between DataFrames and Series are similar to operations between two- and one-dimensional NumPy arrays,
As with DataFrames and Series, the arithmetic operation is broadcast down the rows.
NumPy functions on DataFrames

DataFrame.apply(np.function, axis): Applies a NumPy function on the DataFrame axis. See also statistical and mathematical NumPy functions.

NumPy functions on DataFrames

frame7[:2]

##          price   volume
## Daimler  69.20  4456290
## E.ON      8.11  3667975

frame7.apply(np.mean)

## price          72.664
## volume    3079278.400
## dtype: float64

frame7.apply(np.sqrt)[:2]

##             price       volume
## Daimler  8.318654  2110.992657
## E.ON     2.847806  1915.195812
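The axis argument decides whether the function receives one column or one row at a time. The sketch below shows both cases on a small DataFrame with made-up numbers (df); the result names are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"price": [4.0, 16.0], "volume": [9.0, 25.0]},
                  index=["x", "y"])

col_mean = df.apply(np.mean)          # default axis=0: one value per column
row_mean = df.apply(np.mean, axis=1)  # axis=1: one value per row
roots = df.apply(np.sqrt)             # element-wise function: same shape as df

print(col_mean)
print(row_mean)
print(roots)
```

A reducing function such as np.mean yields a Series, while an element-wise function such as np.sqrt yields a DataFrame of the same shape.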
Grouping DataFrames

DataFrame.groupby([col1, col2]): Groups DataFrame by columns (grouping by one or more than two columns is also possible). See also how to import data from CSV files.

Groupby

vote = pd.read_csv("data/vote.csv")[["Party", "Member", "Vote"]]
vote.head()

##      Party     Member    Vote
## 0  CDU/CSU   Abercron     yes
## 1  CDU/CSU     Albani     yes
## 2  CDU/CSU  Altenkamp     yes
## 3  CDU/CSU   Altmaier  absent
## 4  CDU/CSU     Amthor     yes

Adding the functions count() or mean() to groupby() returns the number of non-missing entries or the mean of the remaining columns within each group.
Grouping DataFrames

Groupby

res = vote.groupby(["Party", "Vote"]).count()
res
##                      Member
## Party        Vote
## AfD          absent       6
##              no          86
## BÜ90/GR      absent       9
##              no          58
## CDU/CSU      absent       7
##              yes        239
## DIE LINKE.   absent       7
##              no          62
## FDP          absent       5
##              no          75
## Fraktionslos absent       1
##              no           1
## SPD          absent       6
##              yes        147

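The grouping logic can be tried without the vote.csv file. The following is a minimal sketch on a made-up miniature of the data; the party and member names are hypothetical.

```python
import pandas as pd

# Hypothetical miniature of the vote data (names and values are made up)
vote = pd.DataFrame({
    "Party": ["A", "A", "A", "B", "B"],
    "Member": ["m1", "m2", "m3", "m4", "m5"],
    "Vote": ["yes", "yes", "absent", "no", "no"],
})

# count() counts the non-missing entries of the remaining columns per group
counts = vote.groupby(["Party", "Vote"]).count()

# size() returns the number of rows per group as a Series
sizes = vote.groupby("Party").size()

# unstack() moves the inner index level ("Vote") into the columns
table = counts["Member"].unstack(fill_value=0)
```

The unstacked table has one row per party and one column per vote, which is often easier to read than the hierarchical index.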
Section 3.4

Data formats and handling

Import/Export data

Reading data in text format

ex1.csv

a, b, c, d, hello
1, 2, 3, 4, world
5, 6, 7, 8, python
2, 3, 5, 7, pandas

pd.read_csv("file"): Reads CSV into DataFrame.

Read comma-separated values

df = pd.read_csv("data/ex1.csv")
df
##    a  b  c  d    hello
## 0  1  2  3  4    world
## 1  5  6  7  8   python
## 2  2  3  5  7   pandas

Reading data in text format

tab.txt

a| b| c| d| hello
1| 2| 3| 4| world
5| 6| 7| 8| python
2| 3| 5| 7| pandas

pd.read_table("file", sep): Reads a table with any separator into a DataFrame.

Read table values

df = pd.read_table("data/tab.txt", sep="|")
df
##    a  b  c  d    hello
## 0  1  2  3  4    world
## 1  5  6  7  8   python
## 2  2  3  5  7   pandas

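As a small sketch of the sep argument: pd.read_csv() accepts it as well, so with an explicit separator both readers give the same result. The text is read from an in-memory buffer instead of data/tab.txt, which is an assumption made only to keep the example self-contained.

```python
import io
import pandas as pd

# In-memory stand-in for a pipe-separated text file
raw = "a|b|c|d|hello\n1|2|3|4|world\n5|6|7|8|python\n"

# read_table() uses the tab character as default separator ...
df_table = pd.read_table(io.StringIO(raw), sep="|")

# ... read_csv() uses the comma, but accepts the same sep argument
df_csv = pd.read_csv(io.StringIO(raw), sep="|")
```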
Reading data in text format

ex2.csv

1, 2, 3, 4, world
5, 6, 7, 8, python
2, 3, 5, 7, pandas

CSV file without header row:

Read CSV and header settings

df = pd.read_csv("data/ex2.csv", header=None)
df
##    0  1  2  3        4
## 0  1  2  3  4    world
## 1  5  6  7  8   python
## 2  2  3  5  7   pandas

Reading data in text format

ex2.csv

1, 2, 3, 4, world
5, 6, 7, 8, python
2, 3, 5, 7, pandas

Specify header:

Read CSV and header names

df = pd.read_csv("data/ex2.csv",
                 names=["a", "b", "c", "d", "hello"])
df
##    a  b  c  d    hello
## 0  1  2  3  4    world
## 1  5  6  7  8   python
## 2  2  3  5  7   pandas

Reading data in text format

ex2.csv

1, 2, 3, 4, world
5, 6, 7, 8, python
2, 3, 5, 7, pandas

Use hello-column as the index:

Read CSV and specify index

df = pd.read_csv("data/ex2.csv",
                 names=["a", "b", "c", "d", "hello"],
                 index_col="hello")
df
##          a  b  c  d
## hello
##  world   1  2  3  4
##  python  5  6  7  8
##  pandas  2  3  5  7

Reading data in text format

ex3.csv

1, 2, 3, 4, world
#+#-.,.-'*'-.,
5, 6, 7, 8, python
87646756754456978
2, 3, 5, 7, pandas

Skip rows while reading:

Read CSV and choose rows

df = pd.read_csv("data/ex3.csv", skiprows=[1, 3])
df
##    1  2  3  4    world
## 0  5  6  7  8   python
## 1  2  3  5  7   pandas

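The reading options shown so far can be combined in one call. The sketch below reads the content of ex3.csv from an in-memory buffer (an assumption, to avoid depending on the data folder). It also passes skipinitialspace=True, which is not used on the slides: it strips the blank after each comma, so that the text column holds "world" rather than " world".

```python
import io
import pandas as pd

# Content of ex3.csv, held in memory instead of read from disk
raw = """1, 2, 3, 4, world
#+#-.,.-'*'-.,
5, 6, 7, 8, python
87646756754456978
2, 3, 5, 7, pandas
"""

df = pd.read_csv(
    io.StringIO(raw),
    header=None,                          # the file has no header row
    names=["a", "b", "c", "d", "hello"],  # supply column names instead
    index_col="hello",                    # use the text column as index
    skiprows=[1, 3],                      # drop the two junk lines (0-based)
    skipinitialspace=True,                # strip the blank after each comma
)
```

Since header=None is set, the first line is treated as data, so all three valid rows end up in the DataFrame.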
Writing data to text file

DataFrame.to_csv("filename"): Writes DataFrame to CSV.

Write to CSV

df = pd.read_csv("data/ex3.csv", skiprows=[1, 3])
df.to_csv("out/out1.csv")

out1.csv

,1, 2, 3, 4, world
0,5,6,7,8, python
1,2,3,5,7, pandas

By default, both the index and the header are written to the .csv file. This is why the first line starts with ",1": the header cell above the index column is empty.

Writing data to text file

Write to CSV and settings

df = pd.read_csv("data/ex3.csv", skiprows=[1, 3])
df.to_csv("out/out2.csv", index=False, header=False)

out2.csv

5,6,7,8, python
2,3,5,7, pandas

Writing data to text file

Write to CSV and specify header

df = pd.read_csv("data/ex3.csv", skiprows=[1, 3, 4])
df.to_csv("out/out3.csv", index=False,
          header=["a", "b", "c", "d", "e"])

out3.csv

a,b,c,d,e
5,6,7,8, python

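The effect of the index and header arguments can be inspected without writing files: when no path is given, to_csv() returns the CSV text as a string. The small DataFrame below is a hypothetical example.

```python
import io
import pandas as pd

# Hypothetical example data
df = pd.DataFrame({"a": [5, 2], "b": [6, 3]}, index=["python", "pandas"])

# Without a path, to_csv() returns the CSV text instead of writing a file
with_index = df.to_csv()
without_index = df.to_csv(index=False)
renamed = df.to_csv(index=False, header=["x", "y"])

# index_col=0 turns the first column back into the index when reading
restored = pd.read_csv(io.StringIO(with_index), index_col=0)
```

Printing with_index shows the leading comma in the header line that was discussed for out1.csv.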
Reading Excel files

pd.read_excel("file.xls"): Reads .xls files.

Figure: goog.xls

Reading Excel

xls_frame = pd.read_excel("data/goog.xls")

Reading Excel files

Excel as a DataFrame

xls_frame[["Adj Close", "Volume", "High"]]
##      Adj Close   Volume         High
## 0  1169.939941  1538700  1173.000000
## 1  1167.699951  2412100  1174.000000
## 2  1111.900024  4857900  1123.069946
## 3  1055.800049  3798300  1110.000000
## 4  1080.599976  3448000  1081.709961
## 5  1048.579956  2341700  1081.780029

Remote data access

Extract financial data from Internet sources into a DataFrame. Different sources offer different kinds of data. Some sources are:

Robinhood
IEX
Yahoo Finance
World Bank
OECD
Eurostat

A complete list of the sources and their usage can be found in the pandas-datareader documentation.

Import pandas-datareader

from pandas_datareader import data

Data access: IEX

data.DataReader("symbol", "source", "start", "end"): Returns financial data of a stock in a certain time period.

IEX get data

ford = data.DataReader("F", "iex", "2017-01-01", "2018-01-31")
ford.head()[["close", "volume"]]
##               close    volume
## date
## 2017-01-03  10.7619  40510821
## 2017-01-04  11.2577  77638075
## 2017-01-05  10.9158  75628443
## 2017-01-06  10.9072  40315887
## 2017-01-09  10.7961  39438393

Stock code list

Data access: IEX

IEX handle data

ford.index
## Index(['2017-01-03', '2017-01-04',...
##       dtype='object', name='date',...

ford.loc["2018-01-26"]
## open      1.046130e+01
## high      1.056060e+01
## low       1.038010e+01
## close     1.051550e+01
## volume    5.249600e+07
## Name: 2018-01-26, dtype: float64

DataFrame index

The index of the returned DataFrame differs between sources. Always check DataFrame.index!

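Why the index matters can be sketched without a network connection. The frame below is a hypothetical stand-in for downloaded data whose dates arrive as plain strings, as in the ford example; the prices are illustrative.

```python
import pandas as pd

# Hypothetical price data with the dates stored as plain strings
prices = pd.DataFrame(
    {"close": [10.76, 11.26, 10.92]},
    index=pd.Index(["2017-01-03", "2017-01-04", "2017-01-05"], name="date"),
)

# A string index is not a DatetimeIndex
was_datetime = isinstance(prices.index, pd.DatetimeIndex)

# Convert the strings to timestamps
prices.index = pd.to_datetime(prices.index)

# With a DatetimeIndex, a whole month can be selected by a partial date
january = prices.loc["2017-01"]
```

Converting the index with pd.to_datetime() enables date-based selection and time series methods such as resampling.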
Data access: IEX

IEX

sap = data.DataReader("SAP", "iex", "2017-01-01", "2018-01-31")
sap[25:27]
##                open     high      low    close  volume
## date
## 2017-02-08  89.5382  90.0263  89.4405  89.6065  653804
## 2017-02-09  89.7139  89.9738  89.5284  89.5284  548787

sap.loc["2017-02-08"]
## open          89.5382
## high          90.0263
## low           89.4405
## close         89.6065
## volume    653804.0000
## Name: 2017-02-08, dtype: float64

Data access: Eurostat

Eurostat

population = data.DataReader("tps00001", "eurostat", "2007-01-01",
                             "2018-01-01")
population.columns
## MultiIndex(levels=[[Population on 1 January - total], [Albania,
## Andorra, Armenia, Austria, Azerbaijan, Belarus, Belgium, ...

population["Population on 1 January - total", "France"][0:5]
## FREQ             Annual
## TIME_PERIOD
## 2007-01-01   63645065.0
## 2008-01-01   64007193.0
## 2009-01-01   64350226.0
## 2010-01-01   64658856.0
## 2011-01-01   64978721.0

Eurostat Database

Read data from HTML

Website used for the example: Econometrics

Beautiful Soup

from bs4 import BeautifulSoup
import requests
url = "www.uni-goettingen.de/de/applied-econometrics/412565.html"
r = requests.get("https://" + url)
d = r.text
soup = BeautifulSoup(d, "lxml")
soup.title
## <title>Applied Econometrics - Georg-August-... ...</title>

Reading data from HTML in detail is beyond the scope of this course. If you are interested in this way of importing data, you can find detailed information in the Beautiful Soup documentation.

Motivation

Bollinger

import matplotlib.pyplot as plt

sap = data.DataReader("SAP", "iex", "2017-01-01", "2018-08-31")
sap.index = pd.to_datetime(sap.index)
# 20 days rolling mean and standard deviation of the closing price
boll = sap["close"].rolling(window=20, center=False).mean()
std = sap["close"].rolling(window=20, center=False).std()
# Bands: rolling mean plus/minus two standard deviations
upp = boll + std * 2
low = boll - std * 2
fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
boll.plot(ax=ax, label="20 days Rolling mean")
upp.plot(ax=ax, label="Upper Band")
low.plot(ax=ax, label="Lower Band")
sap["close"].plot(ax=ax, label="SAP Price")
ax.legend(loc="best")
fig.savefig("out/boll.pdf")

Motivation

Figure: SAP price with its 20 days rolling mean, upper band and lower band, plotted against date.

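The band computation itself can be sketched without downloading data. The series below is a simulated random walk, not the SAP price, and the names mid, upper and lower are chosen for this illustration only.

```python
import numpy as np
import pandas as pd

# Simulated closing prices on business days (not real market data)
rng = np.random.default_rng(0)
dates = pd.date_range("2017-01-02", periods=250, freq="B")
close = pd.Series(100 + rng.standard_normal(250).cumsum(), index=dates)

window = 20
mid = close.rolling(window=window).mean()   # rolling mean
std = close.rolling(window=window).std()    # rolling standard deviation
upper = mid + 2 * std                       # upper band
lower = mid - 2 * std                       # lower band

# Share of days on which the price stays inside the bands
valid = close.index[window - 1:]
inside = ((close[valid] >= lower[valid]) & (close[valid] <= upper[valid])).mean()
```

The first window - 1 values of the rolling statistics are missing, because a full window of 20 observations is not yet available.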
Chapter 4

Visual illustrations

4.1 Matplotlib package
4.2 Figures and subplots
4.3 Plot types and styles
4.4 Pandas layers

Section 4.1

Visual illustrations

Matplotlib package

matplotlib

The package matplotlib is a free software library for Python offering the following features:

Image plots, Contour plots, Scatter plots, Polar plots, Line plots, 3D plots,
Variety of hardcopy formats,
Works in Python scripts, the Python and IPython shell and the Jupyter notebook,
Interactive environments.

matplotlib

Usage of matplotlib

matplotlib has a vast number of functions and options, which are hard to remember. But for almost every task there is an example you can take code from. A great source of information is the examples gallery on the matplotlib homepage. Also note the best practice quick start guide.

Simple plot

plt.plot(array): Plots the values of a list or array. By default, the X-axis has the range [0, 1, ..., n-1].

Import matplotlib and simple example

import matplotlib.pyplot as plt
import numpy as np
plt.plot(np.arange(10))
plt.savefig("out/list.pdf")

Figure: line plot of the values 0 to 9 against the default X-axis range 0 to 9.

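The default X-values can be checked on the line object that the plot command returns. This sketch uses the non-interactive Agg backend and saves into an in-memory buffer; both are assumptions made so that the example runs without a display or an out folder.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no window needed
import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots()

# Only Y-values are given, so matplotlib uses 0, 1, ..., n-1 as X-values
(line,) = ax.plot(np.arange(10) ** 2)
x_used = line.get_xdata()

# Save to memory instead of a file
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
plt.close(fig)
```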
Section 4.2

Visual illustrations

Figures and subplots

Figures

Plots in matplotlib reside in a Figure object:
plt.figure(...): Creates a new Figure object allowing for multiple parameters.
plt.gcf(): Returns the reference of the active figure.

Create Figures

fig = plt.figure(figsize=(16, 8))
print(plt.gcf())
## Figure(1600x800)

A Figure object can be considered as an empty window,
The Figure object has a number of options, such as the size or the aspect ratio,
You cannot draw a plot in a blank figure. There has to be a subplot in the Figure object.

Saving plots to file

plt.savefig("filename"): Saves active figure to file.

Available file formats are among others:

Filename extension   Description
.png                 Portable Network Graphics
.pdf                 Portable Document Format
.svg                 Scalable Vector Graphics
.jpeg                JPEG File Interchange Format
.jpg                 JPEG File Interchange Format
.ps                  PostScript
.raw                 Raw Image Format

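Which formats are available depends on the installation. As a minimal sketch, the canvas of a figure can be asked for the file types it supports; the Agg backend is an assumption made so that the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no window needed
import matplotlib.pyplot as plt

fig = plt.figure()

# Dictionary mapping filename extensions to format descriptions
formats = fig.canvas.get_supported_filetypes()
for extension, description in sorted(formats.items()):
    print(f".{extension:6s} {description}")

plt.close(fig)
```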
Subplots

fig.add_subplot(): Adds subplot to the Figure fig.
Example: fig.add_subplot(2, 2, 1) divides the figure into a 2 x 2 grid and adds the first of the four possible subplots.

Adding subplots

ax1 = fig.add_subplot(2, 2, 1)
ax2 = fig.add_subplot(2, 2, 2)
ax3 = fig.add_subplot(2, 2, 3)
ax4 = fig.add_subplot(2, 2, 4)
fig.savefig("out/subplots.pdf")

The Figure object is filled with subplots in which the plots reside,
Using the plt.plot() command without creating a subplot in advance, matplotlib will create a Figure object and a subplot automatically,
The Figure object and its subplots can be created in one line.

Subplots

Figure: 2 x 2 grid of four empty subplots, each with both axes ranging from 0.0 to 1.0.

Subplots

Filling subplots with content

from numpy.random import randn
ax1.plot([5, 7, 4, 3, 1])
ax2.hist(randn(100), bins=20, color="r")
ax3.scatter(np.arange(30), np.arange(30) * randn(30))
ax4.plot(randn(40), "k--")
fig.savefig("out/content.pdf")

The subplots in one Figure object can be filled with different plot types,
When only plt.plot() is used, matplotlib draws the plot in the last Figure object and the last subplot selected.

Subplots

Figure: 2 x 2 grid of subplots showing a line plot, a histogram, a scatter plot and a dashed line plot.

Standard creation of plots

plt.subplots(nrows, ncols, sharex, sharey): Creates figure and subplots in one line. If sharex or sharey is True, all subplots share the same X- or Y-ticks.

Standard creation

fig, axes = plt.subplots(2, 3, figsize=(16, 8), sharey=True)
axes[1, 1].plot(np.arange(7), color="r")
axes[0, 2].plot(np.arange(10, 0, -1))
fig.savefig("out/standard.pdf")

Figure: 2 x 3 grid of subplots with a shared Y-axis; the upper right subplot shows a falling line, the lower middle subplot a rising line, and the remaining subplots are empty.

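A short sketch of what plt.subplots() returns: the subplots come back as a NumPy array that is indexed by row and column, and with sharey=True all subplots report the same Y-limits. The Agg backend is an assumption made so that the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no window needed
import matplotlib.pyplot as plt
import numpy as np

fig, axes = plt.subplots(2, 3, sharey=True)

# axes is a 2 x 3 array of subplots, indexed as axes[row, column]
axes[1, 1].plot(np.arange(7))
axes[0, 2].plot(np.arange(10, 0, -1))
shape = axes.shape

# With a shared Y-axis, empty and filled subplots have equal limits
same_limits = axes[0, 0].get_ylim() == axes[1, 2].get_ylim()

plt.close(fig)
```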
Section 4.3

Visual illustrations

Plot types and styles

Plot types

ax.scatter(x, y): Creates a scatter plot of x vs y.
ax.hist(x, bins): Creates a histogram.
ax.fill_between(x, y, a): Creates a plot of x vs y and fills the area between a and y.

Types

fig, ax = plt.subplots(1, 3, figsize=(16, 8))
ax[0].hist([1, 2, 3, 4, 5, 4, 3, 2, 3, 4, 2, 3, 4, 4],
           bins=5, color="yellow")
x = np.arange(0, 10, 0.1)
y = np.sin(x)
ax[1].fill_between(x, y, 0, color="green")
ax[2].scatter(x, y)
fig.savefig("out/types.pdf")

A vast number of plot types can be found in the examples gallery.

Plot types

Figure: three subplots showing a histogram, a filled sine curve and a scatter plot of the sine curve.

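The plot commands also return objects that can be inspected. As a minimal sketch with the data of the slide: ax.hist() returns the bin counts and bin edges, and the scatter plot keeps its points as (x, y) pairs. The Agg backend is an assumption made so that the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no window needed
import matplotlib.pyplot as plt
import numpy as np

data = [1, 2, 3, 4, 5, 4, 3, 2, 3, 4, 2, 3, 4, 4]
fig, ax = plt.subplots(1, 3)

# hist() returns the counts per bin, the bin edges and the drawn bars
counts, edges, bars = ax[0].hist(data, bins=5)

x = np.arange(0, 10, 0.1)
y = np.sin(x)
ax[1].fill_between(x, y, 0)   # fills the area between y and the zero line
points = ax[2].scatter(x, y)  # one marker per (x, y) pair

n_points = points.get_offsets().shape[0]
plt.close(fig)
```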
Adjusting the spacing around subplots

plt.subplots_adjust(left, bottom, ..., hspace): Sets the space between the subplots. wspace and hspace set the width and the height of the padding between subplots, as fractions of the average subplot width and height, respectively.

Adjust spacing

fig, axes = plt.subplots(2, 2, sharex=True, sharey=True)
for i in range(2):
    for j in range(2):
        axes[i][j].plot(randn(10))
plt.subplots_adjust(wspace=0, hspace=0)
fig.savefig("out/spacing.pdf")

Adjusting the spacing around subplots

Figure: 2 x 2 grid of line plots with shared axes and no space between the subplots.

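The spacing settings are stored on the figure and can be read back. The sketch below uses the method fig.subplots_adjust(), which acts on a given figure in the same way as plt.subplots_adjust() acts on the active one; the Agg backend is an assumption made so that the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no window needed
import matplotlib.pyplot as plt

fig, axes = plt.subplots(2, 2, sharex=True, sharey=True)

# Remove the padding between the subplots of this figure
fig.subplots_adjust(wspace=0, hspace=0)

# The current layout parameters are kept in fig.subplotpars
wspace = fig.subplotpars.wspace
hspace = fig.subplotpars.hspace

plt.close(fig)
```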
Colors, markers and line styles

ax.plot(data, linestyle, color, marker): Sets data and styles of subplot ax.

Styles

fig, ax = plt.subplots(1, figsize=(15, 6))
ax.plot(randn(10), linestyle="--", color="darkcyan", marker="p")
fig.savefig("out/style.pdf")

Figure: dashed line plot in dark cyan with pentagon markers.

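The styles can be given either as keyword arguments or as a short format string such as "k--" (black, dashed). A minimal sketch that reads the resulting styles back from the line objects follows; the Agg backend is an assumption made so that the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no window needed
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.lines import Line2D

fig, ax = plt.subplots()

# Styles as keyword arguments
(line,) = ax.plot(np.arange(10), linestyle="--", color="darkcyan", marker="p")

# The same kind of information as a format string: color "k", style "--"
(short,) = ax.plot(np.arange(10)[::-1], "k--")

style = (line.get_linestyle(), line.get_color(), line.get_marker())

# Line2D.markers maps each marker symbol to its description
marker_name = Line2D.markers["p"]

plt.close(fig)
```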
Plot colors

Figure: plot colors.

Plot line styles

Figure: plot line styles.

Plot markers

Marker   Description
"."      point
","      pixel
"o"      circle
"v"      triangle_down
"8"      octagon
"s"      square
"p"      pentagon
"P"      plus (filled)
"*"      star
"h"      hexagon1
"H"      hexagon2
"+"      plus
"x"      x
"X"      x (filled)
"D"      diamond

Ticks and labels

ax.set_xticks(): Sets list of X-ticks, analogously for Y-axis.
ax.set_xlabel(): Sets the X-label.
ax.set_title(): Sets the subplot title.

Ticks and labels - default

fig, ax = plt.subplots(1, figsize=(15, 10))
ax.plot(randn(1000).cumsum())
fig.savefig("out/withoutlabls.pdf")

Here, we create a Figure object as well as a subplot and fill it with a line plot of a random walk,
By default matplotlib distributes the ticks evenly along the data range. Individual ticks can be set with ax.set_xticks(),
By default there is no axis label or title.

Ticks and labels

Figure: line plot of a random walk with default ticks and without axis labels or title.

Ticks and labels

Set ticks and labels

ax.set_xticks([0, 250, 500, 750, 1000])
ax.set_xlabel("Days", fontsize=20)
ax.set_ylabel("Change", fontsize=20)
ax.set_title("Simulation", fontsize=30)
fig.savefig("out/labels.pdf")

The individual ticks are given as a list to ax.set_xticks(),
The label and title can be set to an individual size using the argument fontsize.

Ticks and labels

Figure: line plot of the random walk with the title "Simulation", X-label "Days", Y-label "Change" and X-ticks at 0, 250, 500, 750 and 1000.

Legends

When several plots share one subplot, a legend is needed.

ax.legend(loc): Shows the legend at location loc.
Some options: "best", "upper right", "center left", ...

Set legend

fig = plt.figure(figsize=(15, 10))
ax = fig.add_subplot(1, 1, 1)
ax.plot(randn(1000).cumsum(), label="first")
ax.plot(randn(1000).cumsum(), label="second")
ax.plot(randn(1000).cumsum(), label="third")
ax.legend(loc="best", fontsize=20)
fig.savefig("out/legend.pdf")

The legend displays the label and the color of the associated plot.
With the option "best", the legend is placed where it interferes least with the plots.

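A minimal sketch of how labels end up in the legend, assuming NumPy and matplotlib are available. The seed, the Agg backend and the extra unlabeled line are illustrative additions and not part of the slides.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(1)  # seeded for reproducibility (assumption)
fig, ax = plt.subplots(figsize=(15, 10))

# Each labeled line becomes one legend entry
for name in ["first", "second", "third"]:
    ax.plot(rng.standard_normal(1000).cumsum(), label=name)

# A line without a label is drawn but not listed in the legend
ax.plot(np.zeros(1000), color="black")

legend = ax.legend(loc="upper left", fontsize=20)
legend_labels = [text.get_text() for text in legend.get_texts()]
print(legend_labels)
```

A fixed location such as "upper left" gives the same layout on every run, whereas "best" depends on the plotted data.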
Legends

[Figure: three random walks labeled "first", "second" and "third" with a legend; x-axis from 0 to 1000.]

Annotations on a subplot

ax.text(x, y, "text", fontsize): Inserts a text into a subplot.
ax.annotate("text", xy, xytext, arrowprops): Inserts an arrow with an annotation.

Annotations

ax.text(400, -30, "here", fontsize=50)
ax.annotate("there",
            fontsize=40,
            xy=(0, 0),
            xytext=(400, 8),
            arrowprops=dict(facecolor="black",
                            shrink=0.05))
ax.set_yticks([-40, -30, -20, -10, 0, 10, 20, 30, 40])
fig.savefig("out/arrow.pdf")

With ax.annotate(), the arrow head points at xy and the bottom left corner of the text is placed at xytext.

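The roles of xy and xytext can be illustrated with a deterministic curve. The following is a sketch under the assumption that NumPy and matplotlib are installed; the sine curve and the chosen coordinates are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 200)
y = np.sin(x)
peak = int(np.argmax(y))  # position of the maximum in the arrays

fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(x, y)

# xy is where the arrow head points, xytext is where the text starts
note = ax.annotate("maximum",
                   xy=(x[peak], y[peak]),
                   xytext=(x[peak] + 1.5, 0.5),
                   arrowprops=dict(facecolor="black", shrink=0.05))

# Plain text without an arrow, placed in data coordinates
label = ax.text(4.0, 0.0, "sine curve", fontsize=12)
```

Computing xy from the data, as done here with np.argmax, keeps the arrow on the curve even if the data change.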
Annotations

[Figure: the three random walks with the text "here" placed at (400, -30) and an arrow labeled "there" pointing at (0, 0); y-ticks from -40 to 40.]

Annotations

Annotation Lehman

import pandas as pd
from datetime import datetime
date = datetime(2008, 9, 15)
fig = plt.figure(figsize=(16, 8))
ax = fig.add_subplot(1, 1, 1)
dow = pd.read_csv("data/dji.csv", index_col=0, parse_dates=True)
close = dow["Close"]
close.plot(ax=ax)
ax.annotate("Lehman Bankruptcy",
            fontsize=30,
            xy=(date, close.loc[date] + 400),
            xytext=(date, 22000),
            arrowprops=dict(facecolor="red",
                            shrink=0.03))
ax.set_title("Dow Jones Industrial Average", size=40)
fig.savefig("out/lehman.pdf")

Annotations

[Figure: "Dow Jones Industrial Average", closing prices from 2006 to 2018 (x-axis "Date", y-axis from 7500 to 27500) with a red arrow labeled "Lehman Bankruptcy" pointing at the close of 2008-09-15.]

Drawing on a subplot

plt.Rectangle((x, y), width, height, angle): Creates a rectangle.
plt.Circle((x, y), radius): Creates a circle.

Drawing

fig = plt.figure(figsize=(6, 6))
ax = fig.add_subplot(1, 1, 1)
ax.set_xticks([0, 1, 2, 3, 4, 5])
ax.set_yticks([0, 1, 2, 3, 4, 5])
rectangle = plt.Rectangle((1.5, 1),
                          width=0.8, height=2,
                          color="red", angle=30)
circ = plt.Circle((3, 3),
                  radius=1, color="blue")
ax.add_patch(rectangle)
ax.add_patch(circ)
fig.savefig("out/draw.pdf")

A list of all available patches can be found here: matplotlib-patches

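A short sketch of the patch workflow, assuming matplotlib is installed. The explicit axis limits, the equal aspect ratio and the transparency are illustrative additions: the limits are set by hand because the view may otherwise not cover the patches.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(6, 6))
ax.set_xlim(0, 5)
ax.set_ylim(0, 5)
ax.set_aspect("equal")  # same scale on both axes, so the circle stays round

# Patches are created first and then attached to the subplot
rectangle = plt.Rectangle((1.5, 1), width=0.8, height=2,
                          color="red", angle=30)
circle = plt.Circle((3, 3), radius=1, color="blue", alpha=0.5)
for patch in (rectangle, circle):
    ax.add_patch(patch)

n_patches = len(ax.patches)
print(n_patches)
```

The list ax.patches holds everything added with ax.add_patch(), which is handy for checking or removing shapes later.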
Drawing on a subplot

[Figure: a red rectangle rotated by 30 degrees and a blue circle on a subplot with ticks from 0 to 5 on both axes.]

Best practice: Visual illustrations

Step 1
Create a Figure object and subplots

Best practice Step 1

fig, ax = plt.subplots(1, 1, figsize=(16, 8))

Step 2
Plot data using different plot types
An overview of plot types can be found in the examples gallery.

Best practice Step 2

x = np.arange(0, 10, 0.1)
y = np.sin(x)
ax.scatter(x, y)

Best practice: Visual illustrations

[Figure: scatter plot of sin(x) for x from 0 to 10; y-axis from -1.00 to 1.00, no labels or title.]

Best practice: Visual illustrations

Step 3
Set colors, markers and line styles

Best practice Step 3

ax.scatter(x, y, color="green", marker="s")

Step 4
Set title, axis labels and ticks

Best practice Step 4

ax.set_title("Sine wave", fontsize=30)
ax.set_xticks([0, 2.5, 5, 7.5, 10])
ax.set_yticks([-1, 0, 1])
ax.set_ylabel("y-value", fontsize=20)
ax.set_xlabel("x-value", fontsize=20)

Best practice: Visual illustrations

[Figure: "Sine wave", scatter plot with green square markers; x-axis "x-value" with ticks at 0.0, 2.5, 5.0, 7.5, 10.0 and y-axis "y-value" with ticks at -1, 0, 1.]

Best practice: Visual illustrations

Step 5
Set labels

Best practice Step 5

ax.scatter(x, y, color="green", marker="s", label="Sine")

Step 6
Set legend (if you add another plot to an existing figure)

Best practice Step 6

ax.plot(np.arange(11) / 10, color="blue", linestyle="-",
        label="Linear")
ax.legend(fontsize=20)

Step 7
Save plot to file

Best practice Step 7

fig.savefig("out/sinewave.pdf")

Best practice: Visual illustrations

[Figure: "Sine wave", the green sine scatter plot together with a blue line labeled "Linear"; the legend lists "Linear" and "Sine"; x-axis "x-value", y-axis "y-value".]

Section 4.4

Visual illustrations
Pandas layers

Line plots

DataFrame/Series.plot(): Plots a DataFrame or a Series.

Simple line plot

plt.close("all")
p = pd.Series(np.random.rand(10).cumsum(),
              index=np.arange(0, 1000, 100))
p

## 0      0.669761
## 100    0.989702
## 200    1.655715
## 300    1.966073
## 400    2.151883
## 500    2.776987
## 600    2.839751
## 700    3.188431
## 800    4.169061
## 900    4.923286
## dtype: float64

p.plot()
plt.savefig("out/line.pdf")

Line plots

[Figure: line plot of the Series p; x-axis (index) from 0 to 900, y-axis from about 1 to 5.]

Line plots

Line plots

df = pd.DataFrame(np.random.randn(10, 3), index=np.arange(10),
                  columns=["a", "b", "c"])
df

##           a         b         c
## 0  1.703615 -1.376905 -1.336154
## 1 -1.402924  0.812501  1.739143
## 2  0.593504  0.699582  0.423217
## 3  1.140647 -1.454363  0.250578
## 4 -0.044809  0.438279 -0.821514
## 5  1.897959 -0.254581  0.157704
## 6  0.782639  1.196116  0.763081
## 7  0.577947  1.815039  1.175842
## 8 -0.278585 -0.538956  0.102930
## 9 -0.091891  0.310788 -0.857167

df.plot(figsize=(15, 12))
plt.savefig("out/line2.pdf")

Line plots

[Figure: line plot of the DataFrame df with one line per column and a legend listing a, b and c; x-axis from 0 to 9.]

Plotting and pandas

The plot method applied to a DataFrame plots each column as a separate line and shows the legend automatically. When plotting DataFrames, several arguments change the style of the plot:

Argument    Description
kind        "line", "bar", etc.
logy        Logarithmic scale on the y-axis
use_index   If True, use the index for tick labels
rot         Rotation of tick labels
xticks      Values for x ticks
yticks      Values for y ticks
grid        Show grid (True or False)
xlim        x-axis limits
ylim        y-axis limits
subplots    Plot each DataFrame column in a new subplot

Table: Pandas plot arguments

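Several of the arguments from the table can be combined in one call. The sketch below assumes NumPy, pandas and matplotlib are installed; the seeded random data and the chosen limits are illustrative only.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)  # seeded for reproducibility (assumption)
df = pd.DataFrame(rng.standard_normal((10, 3)).cumsum(axis=0),
                  index=np.arange(10), columns=["a", "b", "c"])

# One subplot per column, with grid, rotated tick labels and fixed y-limits
axes = df.plot(subplots=True, grid=True, rot=45, ylim=(-10, 10),
               figsize=(10, 8), title="Example")

# The same data as a bar chart on a single subplot
bar_ax = df.plot(kind="bar", grid=True, figsize=(10, 4))

n_subplots = len(axes)
print(n_subplots)
```

With subplots=True the call returns one subplot per column, otherwise a single subplot, so later calls such as set_ylabel() have to be applied accordingly.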
Pandas plot

Separated line plots

df.plot(grid=True, rot=45, subplots=True, title="Example",
        figsize=(15, 10))
plt.savefig("out/pandas.pdf")

[Figure: "Example", three subplots stacked vertically that show the columns a, b and c as separate line plots with grid lines.]

Standard creation of plots and pandas

dataframe.plot(ax=subplot): Plots a dataframe into subplot.

Standard creation

fig = plt.figure(figsize=(6, 6))
ax = fig.add_subplot(1, 1, 1)
guests = np.array([[1334, 456], [1243, 597], [1477, 505],
                   [1502, 404], [854, 512], [682, 0]])
canteen = pd.DataFrame(guests,
                       index=["Mon", "Tue", "Wed",
                              "Thu", "Fri", "Sat"],
                       columns=["Zentral", "Turm"])
canteen

##      Zentral  Turm
## Mon     1334   456
## Tue     1243   597
## Wed     1477   505
## Thu     1502   404
## Fri      854   512
## Sat      682     0

Standard creation of plots and pandas

Bar plot

canteen.plot(ax=ax, kind="bar")
ax.set_ylabel("guests", fontsize=20)
ax.set_title("Canteen use in Göttingen", fontsize=20)
fig.savefig("out/canteen.pdf")

The bar plot resides in the subplot ax.
The label and the title are set as shown before, without using pandas.

Bar plot

[Figure: "Canteen use in Göttingen", grouped bar plot of the guests in "Zentral" and "Turm" from Mon to Sat; y-axis "guests" from 0 to 1400.]

Bar plot

Bar plot - stacked

# New figure, so that the stacked bars are not drawn over the previous plot
fig = plt.figure(figsize=(6, 6))
ax = fig.add_subplot(1, 1, 1)
canteen.plot(ax=ax, kind="bar", stacked=True)
ax.set_ylabel("guests", fontsize=20)
ax.set_title("Canteen use in Göttingen", fontsize=20)
fig.savefig("out/canteenstacked.pdf")

Bar plot

[Figure: "Canteen use in Göttingen", stacked bar plot of the guests in "Zentral" and "Turm" from Mon to Sat; y-axis "guests" from 0 to 2000.]

Plot financial data

BTC chart

fig = plt.figure(figsize=(16, 8))
ax = fig.add_subplot(1, 1, 1)
ax.set_ylabel("price", fontsize=20)
ax.set_xlabel("Date", fontsize=20)
BTC = pd.read_csv("data/btc-eur.csv", index_col=0, parse_dates=True)
BTCclose = BTC["Close"]
BTCclose.plot(ax=ax)
ax.set_title("BTC-EUR", fontsize=20)
fig.savefig("out/btc.pdf")

Plot financial data

[Figure: "BTC-EUR", closing price from 2012 to 2019; x-axis "Date", y-axis "price" from 0 to 15000.]

Plot financial data

Compare - bad illustration

amazon = pd.read_csv("data/amzn.csv", index_col=0,
                     parse_dates=True)["Close"]
siemens = pd.read_csv("data/sie.de.csv", index_col=0,
                      parse_dates=True)["Close"]
fig = plt.figure(figsize=(16, 8))
ax = fig.add_subplot(1, 1, 1)
ax.set_ylabel("price")
amazon.plot(ax=ax, label="Amazon")
siemens.plot(ax=ax, label="Siemens")
ax.legend(loc="best")
fig.savefig("out/compare.pdf")

In this illustration, the trends of the two stocks can hardly be compared.
Using pandas, both series can be standardized in one line each.

Plot financial data

[Figure: closing prices of Amazon and Siemens from 2017-03 to 2018-03 on one price axis (ticks from 200 to 1400); x-axis "Date", y-axis "price".]

Plot financial data

Compare - good illustration

# Rescale both series so that the first observation equals 100
amazon = amazon / amazon.iloc[0] * 100
siemens = siemens / siemens.iloc[0] * 100
fig = plt.figure(figsize=(16, 8))
ax = fig.add_subplot(1, 1, 1)
ax.set_ylabel("percentage")
amazon.plot(ax=ax, label="Amazon")
siemens.plot(ax=ax, label="Siemens")
ax.legend(loc="best")
fig.savefig("out/comparenew.pdf")

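The rescaling step can be checked without the CSV files. The sketch below assumes pandas is installed and uses made-up prices for two hypothetical stocks A and B; it is not based on the Amazon or Siemens data.

```python
import pandas as pd

dates = pd.date_range("2017-03-01", periods=5, freq="D")

# Hypothetical closing prices on very different price levels
stock_a = pd.Series([850.0, 860.0, 845.0, 900.0, 935.0], index=dates)
stock_b = pd.Series([120.0, 121.0, 119.0, 123.0, 126.0], index=dates)

# Divide by the first observation and scale to 100
index_a = stock_a / stock_a.iloc[0] * 100
index_b = stock_b / stock_b.iloc[0] * 100

comparison = pd.DataFrame({"A": index_a, "B": index_b})
print(comparison.round(1))
```

After rescaling, both series start at 100 and every later value reads as a percentage of the starting price, which makes the trends directly comparable.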
Plot financial data

[Figure: Amazon and Siemens rescaled to 100 at the first observation, 2017-03 to 2018-03; x-axis "Date", y-axis "percentage" from 100 to 160.]

Chapter 5

Applications

5.1 Time series
5.2 Moving window
5.3 Financial applications

Section 5.1

Applications
Time series

Date and time data types

Data types for date and time are included in the Python standard library.

Datetime creation

from datetime import datetime
now = datetime.now()
now

## datetime.datetime(2019, 4, 28, 16, 26, 48, 256113)

now.day

## 28

now.hour

## 16

A datetime object provides the attributes year, month, day, hour, minute, second and microsecond.

Set datetime

datetime(year, month, day, ..., microsecond): Sets date and time.

Datetime representation

holiday = datetime(2018, 12, 24, 8, 30)
holiday

## datetime.datetime(2018, 12, 24, 8, 30)

exam = datetime(2018, 11, 9, 10)
print("The exam will be on the " + "{:%Y-%m-%d}".format(exam))

## The exam will be on the 2018-11-09

Time difference

timedelta(days, seconds, microseconds): Represents the difference between two datetime objects.

Datetime difference

from datetime import timedelta
delta = exam - now
delta

## datetime.timedelta(-171, 63191, 743887)

print("The exam will take place in " + str(delta.days) + " days.")

## The exam will take place in -171 days.

The number of days is negative because the exam date lies before now.

now

## datetime.datetime(2019, 4, 28, 16, 26, 48, 256113)

now + timedelta(10, 120)

## datetime.datetime(2019, 5, 8, 16, 28, 48, 256113)

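The negative result above follows from how timedelta stores its values: days may be negative, while seconds and microseconds are always non-negative. A small sketch using only the standard library; the fixed timestamp replaces datetime.now() so that the numbers can be reproduced.

```python
from datetime import datetime, timedelta

# Fixed timestamp instead of datetime.now(), so the result is reproducible
now = datetime(2019, 4, 28, 16, 26, 48, 256113)
exam = datetime(2018, 11, 9, 10)

delta = exam - now  # negative, because the exam lies in the past
later = now + timedelta(days=10, seconds=120)

print(delta.days, delta.seconds, delta.microseconds)
print(delta.total_seconds() / 86400)  # signed difference measured in days
print(later)
```

The attribute delta.days is -171 although the gap is only a little more than 170 days; total_seconds() gives the signed length without this normalization.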
Convert string and datetime

datetime.strftime("format"): Converts a datetime object into a string.
datetime.strptime(datestring, "format"): Converts a date given as a string into a datetime object.

Convert Datetime

stamp = datetime(2018, 4, 12)
stamp

## datetime.datetime(2018, 4, 12, 0, 0)

print("German date format: " + stamp.strftime("%d.%m.%Y"))

## German date format: 12.04.2018

val = "2018-5-5"
d = datetime.strptime(val, "%Y-%m-%d")
d

## datetime.datetime(2018, 5, 5, 0, 0)

Convert string and datetime

Converting examples

val = "31.01.2012"
d = datetime.strptime(val, "%d.%m.%Y")
d

## datetime.datetime(2012, 1, 31, 0, 0)

now.strftime("Today is %A and we are in week %W of the year %Y.")

## 'Today is Sunday and we are in week 16 of the year 2019.'

now.strftime("%c")

## 'Sun 28 Apr 2019 04:26:48 PM '

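Formatting and parsing are inverse operations as long as the same format string is used. The following sketch relies only on the standard library; the variable names are chosen for illustration.

```python
from datetime import datetime

stamp = datetime(2018, 4, 12)
german = stamp.strftime("%d.%m.%Y")  # datetime -> string
iso = stamp.strftime("%Y-%m-%d")

parsed = datetime.strptime("31.01.2012", "%d.%m.%Y")  # string -> datetime
round_trip = datetime.strptime(german, "%d.%m.%Y")

# Week of the year with Monday as the first day of the week
week = datetime(2019, 4, 28).strftime("%W")

# A string that does not fit the format raises a ValueError
try:
    datetime.strptime("2012-01-31", "%d.%m.%Y")
except ValueError as err:
    mismatch = str(err)

print(german, iso, parsed, week)
```

Catching the ValueError is useful when dates come from files with mixed formats.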
Overview: Datetime formats

Type  Description
%Y    4-digit year
%m    2-digit month [01, 12]
%d    2-digit day [01, 31]
%H    Hour (24-hour clock) [00, 23]
%I    Hour (12-hour clock) [01, 12]
%M    2-digit minute [00, 59]
%S    Second [00, 61]
%W    Week number of the year [00, 53]
%F    Shortcut for %Y-%m-%d (support depends on the platform)

Overview: Datetime formats

Type  Description
%a    Abbreviated weekday name
%A    Full weekday name
%b    Abbreviated month name
%B    Full month name
%c    Full date and time
%x    Locale-appropriate formatted date

Generating date ranges with pandas

pd.date_range(start, end, freq): Generates a date range.

Date ranges

import pandas as pd
index = pd.date_range("2018-01-01", now)
index[0:2]
index[15:16]
index = pd.date_range("2018-01-01", now, freq="M")
index[0:2]

## DatetimeIndex(['2018-01-01', '2...ype='datetime64[ns]', freq='D')
## DatetimeIndex(['2018-01-16'], dtype='datetime64[ns]', freq='D')
## DatetimeIndex(['2018-01-31', '2...ype='datetime64[ns]', freq='M')

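A small sketch of pd.date_range with fixed end points, assuming pandas is installed. Depending on the pandas version, month-end aliases such as "M" may trigger a deprecation warning in favour of "ME"; the sketch therefore sticks to daily and business-day frequencies.

```python
import pandas as pd

# Start and end given, daily frequency by default
daily = pd.date_range("2018-01-01", "2018-01-10")

# Business days only (Monday to Friday)
business = pd.date_range("2018-01-01", "2018-01-10", freq="B")

# Start and number of periods instead of an end date
by_count = pd.date_range("2018-01-01", periods=3, freq="D")

print(len(daily), len(business), by_count[-1])
```

Passing periods instead of an end date is convenient when the length of the sample is known but not its last day.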
Overview: Time series frequencies

Alias                   Offset type
D                       Day
B                       Business day
H                       Hour
T                       Minute
S                       Second
M                       Month end
BM                      Business month end
Q-JAN, Q-FEB, ...       Quarter end
A-JAN, A-FEB, ...       Year end
AS-JAN, AS-FEB, ...     Year begin
BA-JAN, BA-FEB, ...     Business year end
BAS-JAN, BAS-FEB, ...   Business year begin

Resample date ranges

DataFrame.resample("frequency"): Resamples time series by a specified frequency.

Resample date ranges

import numpy as np
start = datetime(2016, 1, 1)
ind = pd.date_range(start, now)
numbers = np.arange((now - start).days + 1)
df = pd.DataFrame(numbers, index=ind)

df.head()

##             0
## 2016-01-01  0
## 2016-01-02  1
## 2016-01-03  2
## 2016-01-04  3
## 2016-01-05  4

df.resample("3BM").sum().head()

##                 0
## 2016-01-29    406
## 2016-04-29   6734
## 2016-07-29  15015
## 2016-10-31  24205
## 2017-01-31  32246

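Resampling groups the observations into bins of the new frequency and then aggregates each bin. The sketch below assumes NumPy and pandas are installed; it uses the month-start alias "MS", an illustrative choice that is accepted by older and newer pandas versions alike, and the column name "value" is made up.

```python
import numpy as np
import pandas as pd

ind = pd.date_range("2016-01-01", "2016-03-31")  # 91 daily observations
df = pd.DataFrame(np.arange(len(ind)), index=ind, columns=["value"])

# Aggregate per calendar month; each bin is labeled by the month start
monthly_sum = df.resample("MS").sum()
monthly_mean = df.resample("MS").mean()

print(monthly_sum)
print(monthly_mean)
```

The call to resample() alone only defines the bins; the aggregation method that follows (sum(), mean(), ...) decides how the values inside each bin are combined.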
Section 5.2

Applications
Moving window

Moving window functions

DataFrame.rolling(window): Conducts rolling window computations.

Rolling mean

import matplotlib.pyplot as plt
amazon = pd.read_csv("data/amzn.csv", index_col=0,
                     parse_dates=True)["Adj Close"]
fig = plt.figure(figsize=(16, 8))
ax = fig.add_subplot(1, 1, 1)
ax.set_ylabel("price")
amazon.plot(ax=ax, label="Amazon")
amazon.rolling(window=20).mean().plot(ax=ax, label="Rolling mean")
ax.legend(loc="best")
ax.set_title("Amazon price and rolling mean", fontsize=25)
fig.savefig("out/amzn.pdf")

Frequently used rolling functions: mean(), median(), sum(), var(), std(), min(), max().

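What a rolling window computes is easiest to see on a short series. A minimal sketch, assuming pandas is installed; the prices are made up and the window of 3 is chosen only to keep the example small.

```python
import pandas as pd

prices = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0, 15.0],
                   index=pd.date_range("2018-01-01", periods=6, freq="D"))

# Each value summarizes the current and the two preceding observations
rolling_mean = prices.rolling(window=3).mean()
rolling_std = prices.rolling(window=3).std()

# min_periods relaxes the requirement of a completely filled window
partial_mean = prices.rolling(window=3, min_periods=1).mean()

print(pd.DataFrame({"price": prices, "mean": rolling_mean,
                    "std": rolling_std, "partial": partial_mean}))
```

The first window - 1 results are missing because the window is not yet filled, which is why a rolling line in a plot starts later than the price line.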
Moving window functions

[Figure: "Amazon price and rolling mean", the Amazon price and its 20-day rolling mean from 2017-03 to 2018-03; x-axis "Date", y-axis "price" from 900 to 1500.]

Standard deviation

    fig = plt.figure(figsize=(16, 8))
    ax = fig.add_subplot(1, 1, 1)
    pfizer = pd.read_csv("data/pfe.csv", index_col=0,
                         parse_dates=True)["Adj Close"]
    pg = pd.read_csv("data/pg.csv", index_col=0,
                     parse_dates=True)["Adj Close"]
    # collect the three price series in one DataFrame, aligned on Amazon's dates
    prices = pd.DataFrame(index=amazon.index)
    prices["amazon"] = amazon
    prices["pfizer"] = pfizer
    prices["pg"] = pg
    prices_std = prices.rolling(window=20).std()
    prices_std.plot(ax=ax)
    ax.set_title("Standard deviation", fontsize=25)
    fig.savefig("out/std.pdf")

[Figure: "Standard deviation". Series: amazon, pfizer, pg. x-axis: Date (2017-05 to 2018-03); y-axis: rolling standard deviation of prices.]

Logarithmic standard deviation

    fig = plt.figure(figsize=(16, 8))
    ax = fig.add_subplot(1, 1, 1)
    prices_std.plot(ax=ax, logy=True)
    ax.set_title("Logarithmic standard deviation", fontsize=25)
    fig.savefig("out/std_log.pdf")

[Figure: "Logarithmic standard deviation". Series: amazon, pfizer, pg. x-axis: Date (2017-05 to 2018-03); y-axis: rolling standard deviation on a logarithmic scale.]

Exponentially weighted functions

DataFrame.ewm(span): Computes exponentially weighted rolling window functions.

Exponentially weighted functions

    fig = plt.figure(figsize=(16, 8))
    ax = fig.add_subplot(1, 1, 1)
    amazon.rolling(window=40).mean().plot(ax=ax, label="Rolling mean")
    amazon.ewm(span=40).mean().plot(ax=ax, label="Exp mean",
                                    linestyle="--", color="red")
    amazon.plot(ax=ax, label="Amazon price")
    ax.legend(loc="best")
    ax.set_title("Exponentially weighted functions", fontsize=25)
    fig.savefig("out/mean.pdf")

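As a hedged illustration of what `span` means: pandas converts it to a smoothing factor alpha = 2 / (span + 1). The sketch below uses `adjust=False`, which is an assumption made for simplicity, because the weighted mean then follows a one-line recursion. The slide code uses the default `adjust=True`, which weights the early observations slightly differently. The data are hypothetical.

```python
import numpy as np
import pandas as pd

x = pd.Series([1.0, 2.0, 4.0, 8.0, 16.0])   # hypothetical data
span = 3
alpha = 2.0 / (span + 1)                     # pandas' definition of span

# Recursive form used by pandas when adjust=False:
#   y_0 = x_0,  y_t = (1 - alpha) * y_{t-1} + alpha * x_t
manual = [x.iloc[0]]
for value in x.iloc[1:]:
    manual.append((1 - alpha) * manual[-1] + alpha * value)

ewm_mean = x.ewm(span=span, adjust=False).mean()
print(ewm_mean.to_numpy())
print(np.array(manual))
```

A larger `span` gives a smaller alpha, so older observations fade more slowly and the curve becomes smoother.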
[Figure: "Exponentially weighted functions". Series: Rolling mean, Exp mean, Amazon price. x-axis: Date (2017-03 to 2018-03); y-axis: price.]

Binary moving window functions

DataFrame.pct_change(): Computes the percentage changes per period.

Percentage change

    fig = plt.figure(figsize=(16, 8))
    ax = fig.add_subplot(1, 1, 1)
    returns = prices.pct_change()
    returns.head()

    ##               amazon    pfizer        pg
    ## Date
    ## 2017-02-23       NaN       NaN       NaN
    ## 2017-02-24 -0.008155  0.005872 -0.000878
    ## 2017-02-27  0.004023  0.000584 -0.001757
    ## 2017-02-28 -0.004242 -0.004668  0.001980
    ## 2017-03-01  0.009514  0.008792  0.006479

    returns.plot(ax=ax)
    ax.set_title("Returns", fontsize=25)
    fig.savefig("out/returns.pdf")

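The following sketch, on hypothetical prices, spells out what `pct_change()` computes: the simple return p_t / p_{t-1} - 1. The first row is NaN because there is no previous price.

```python
import numpy as np
import pandas as pd

# Hypothetical prices for two assets.
p = pd.DataFrame({"a": [100.0, 110.0, 99.0],
                  "b": [50.0, 50.0, 55.0]})

ret = p.pct_change()

# Equivalent manual computation with a one-period shift.
manual = p / p.shift(1) - 1

print(ret)
print(manual)
```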
[Figure: "Returns". Series: amazon, pfizer, pg. x-axis: Date (2017-03 to 2018-03); y-axis: daily return.]

DataFrame.rolling().corr(benchmark): Computes correlation between two time series.

Correlation

    fig = plt.figure(figsize=(16, 8))
    ax = fig.add_subplot(1, 1, 1)
    DJI = pd.read_csv("data/dji.csv", index_col=0,
                      parse_dates=True)["Adj Close"]
    DJI_ret = DJI.pct_change()
    corr = returns.rolling(window=20).corr(DJI_ret)
    corr.plot(ax=ax)
    ax.grid()
    ax.set_title("20 days correlation", fontsize=25)
    fig.savefig("out/corr.pdf")

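A small sketch with simulated data may help to see what the rolling correlation returns. The column names `follows` and `noise` and the benchmark series are made up for illustration. Each column of the DataFrame is correlated with the benchmark over the trailing window, so the last value agrees with an ordinary correlation computed on the final 20 observations.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n, window = 60, 20

# Simulated benchmark returns and two hypothetical assets.
bench = pd.Series(rng.normal(size=n))
rets = pd.DataFrame({
    "follows": 0.8 * bench + 0.2 * rng.normal(size=n),  # tracks the benchmark
    "noise": rng.normal(size=n),                        # unrelated to it
})

# One rolling correlation per column, each against the benchmark.
corr = rets.rolling(window=window).corr(bench)

# Cross-check the last value against NumPy on the final window.
last = np.corrcoef(rets["follows"].iloc[-window:], bench.iloc[-window:])[0, 1]
print(corr.tail())
print(last)
```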
[Figure: "20 days correlation". Series: amazon, pfizer, pg (rolling correlation with the DJI returns). x-axis: Date (2017-05 to 2018-03); y-axis: correlation.]

Section 5.3

Applications: Financial applications

Cumulative returns

Returns

    fig = plt.figure(figsize=(16, 8))
    ax = fig.add_subplot(1, 1, 1)
    ret_index = (1 + returns).cumprod()
    stocks = ["amazon", "pfizer", "pg"]
    # the first row is NaN (no previous price); set the index base to 1
    ret_index.iloc[0] = 1
    ret_index.tail()

    ##               amazon    pfizer        pg
    ## Date
    ## 2018-02-15  1.715298  1.088693  0.932322
    ## 2018-02-16  1.699961  1.105461  0.934471
    ## 2018-02-20  1.723031  1.097840  0.920217
    ## 2018-02-21  1.740128  1.090218  0.907772
    ## 2018-02-22  1.742968  1.090218  0.914560

    ret_index.plot(ax=ax)
    ax.set_title("Cumulative returns", fontsize=25)
    fig.savefig("out/cumret.pdf")

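The cumulative product of (1 + return) rebuilds the price path relative to the starting price. The sketch below checks this on hypothetical prices: after the base is set to 1, the return index equals each price divided by the first price.

```python
import pandas as pd

# Hypothetical prices: +5 %, then -5 %, then +10 %.
p = pd.Series([100.0, 105.0, 99.75, 109.725])

r = p.pct_change()               # first entry is NaN
ret_index = (1 + r).cumprod()    # NaN is skipped, product starts at the 2nd row
ret_index.iloc[0] = 1.0          # base value of the index

print(ret_index)
print(p / p.iloc[0])             # same numbers: price relative to the start
```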
[Figure: "Cumulative returns". Series: amazon, pfizer, pg. x-axis: Date (2017-03 to 2018-03); y-axis: return index (base 1).]

Monthly returns

    # "BM" = business month end (recent pandas releases spell this alias "BME")
    returns_m = ret_index.resample("BM").last().pct_change()
    returns_m.head()

    ##               amazon    pfizer        pg
    ## Date
    ## 2017-02-28       NaN       NaN       NaN
    ## 2017-03-31  0.049110  0.002638 -0.013396
    ## 2017-04-28  0.043371 -0.008477 -0.020604
    ## 2017-05-31  0.075276 -0.028124  0.008703
    ## 2017-06-30 -0.026764  0.028790 -0.010671

Volatility calculation

Volatility

    fig = plt.figure(figsize=(16, 8))
    ax = fig.add_subplot(1, 1, 1)
    vola = returns.rolling(window=20).std() * np.sqrt(20)
    vola.plot(ax=ax)
    ax.set_title("Volatility", fontsize=25)
    fig.savefig("out/vola.pdf")

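The factor `np.sqrt(20)` scales the daily standard deviation to a 20-day horizon. This square-root-of-time rule rests on the assumption that daily returns are uncorrelated, so that variances add up over days. A minimal sketch on simulated returns (hypothetical data) confirms that the rolling result equals the sample standard deviation of the last window times the scaling factor.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
window = 20

# Simulated daily returns with a standard deviation of about 1 %.
daily = pd.Series(rng.normal(loc=0.0, scale=0.01, size=100))

# Rolling 20-day volatility, as in the slide code.
vola = daily.rolling(window=window).std() * np.sqrt(window)

# Manual check on the final window (pandas uses ddof=1 by default).
manual = daily.iloc[-window:].std(ddof=1) * np.sqrt(window)
print(vola.iloc[-1], manual)
```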
[Figure: "Volatility". Series: amazon, pfizer, pg. x-axis: Date (2017-05 to 2018-03); y-axis: 20-day volatility.]

Group analysis

DataFrame.describe(): Shows a statistical summary.

Describe

    prices.describe()

    ##             amazon      pfizer          pg
    ## count   252.000000  251.000000  252.000000
    ## mean   1044.521903   33.892665   87.934304
    ## std     158.041844    1.694680    2.728659
    ## min     843.200012   30.872143   79.919998
    ## 25%     953.567474   32.593733   86.241475
    ## 50%     988.680023   33.147469   87.863598
    ## 75%    1136.952484   35.331834   90.363035
    ## max    1485.339966   38.661823   92.988976

Return analysis

Histogram

    fig, ax = plt.subplots(3, 1, figsize=(10, 8), sharex=True)
    for i in range(3):
        ax[i].set_title(stocks[i])
        returns[stocks[i]].hist(ax=ax[i], bins=50)
    fig.savefig("out/return_hist.pdf")

[Figure: three stacked histograms of daily returns with a shared x-axis. Panels: amazon, pfizer, pg. x-axis: return; y-axis: frequency.]

Ordinary Least Squares

Using the statsmodels module to determine regressions:

Series.tolist(): Returns a list containing the Series values.
sm.OLS(Y, X).fit(): Computes OLS fit of data (X, Y).

Regression data

    import statsmodels.api as sm

    fig = plt.figure(figsize=(16, 8))
    ax = fig.add_subplot(1, 1, 1)
    Y = np.array(amazon.loc["2018-1-1":"2018-1-15"].tolist())
    X = np.arange(len(Y))
    ax.scatter(x=X, y=Y, marker="o", color="red")
    fig.savefig("out/reg_data.pdf")

[Figure: scatter plot of the regression data. x-axis: observation number (0 to 8); y-axis: Amazon price (about 1200 to 1300).]

Regression

    X_reg = sm.add_constant(X)      # prepend a column of ones for the intercept
    res = sm.OLS(Y, X_reg).fit()
    b, a = res.params               # b: intercept (constant), a: slope
    ax.plot(X, a * X + b)
    fig.savefig("out/ols.pdf")

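To show why the parameters are unpacked as `b, a` (intercept first, slope second), the following sketch solves the same least-squares problem with NumPy instead of statsmodels. This is a swapped-in technique for illustration only, and the data are hypothetical points lying exactly on the line y = 2 + 3x, so the fit should recover these two values.

```python
import numpy as np

# Hypothetical data on an exact line: intercept 2, slope 3.
X = np.arange(5, dtype=float)
Y = 2.0 + 3.0 * X

# Design matrix with the constant in the first column,
# which mirrors what sm.add_constant(X) produces by default.
X_reg = np.column_stack([np.ones_like(X), X])

# Least-squares solution of X_reg @ beta = Y.
beta, *_ = np.linalg.lstsq(X_reg, Y, rcond=None)
b, a = beta                     # same order as the columns: constant, slope

print(f"intercept b = {b:.4f}, slope a = {a:.4f}")
```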
Summary of the OLS regression. To print it in Python, use print(res.summary()).

[Figure: scatter plot of the regression data with the fitted OLS line. x-axis: observation number (0 to 8); y-axis: Amazon price (about 1200 to 1300).]

Newton-Raphson

The Newton-Raphson method is an algorithm for finding successively better approximations to the roots of real-valued functions.

Let $F : \mathbb{R}^k \to \mathbb{R}^k$ be a continuously differentiable function and $J_F(x_n)$ the Jacobian matrix of $F$. The recursive Newton-Raphson method to find the root of $F$ is given by:

$$x_{n+1} := x_n - J_F(x_n)^{-1} F(x_n)$$

with an initial guess $x_0$.

For $f : \mathbb{R} \to \mathbb{R}$ the process is repeated as

$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}.$$

Accordingly, we can determine the optimum of the function $f$ by applying the method instead to $f' = df/dx$.

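The multivariate update can be sketched in a few lines of NumPy. The system below (a circle of radius 2 intersected with the line x = y) and the names `F`, `J` and `newton_system` are hypothetical choices for illustration. Instead of forming the inverse Jacobian explicitly, the sketch solves the linear system J_F(x_n) d = F(x_n) and subtracts d, which is algebraically the same step.

```python
import numpy as np

def F(v):
    """Hypothetical system: x^2 + y^2 = 4 and x = y."""
    x, y = v
    return np.array([x**2 + y**2 - 4.0, x - y])

def J(v):
    """Jacobian matrix of F."""
    x, y = v
    return np.array([[2.0 * x, 2.0 * y],
                     [1.0, -1.0]])

def newton_system(F, J, x0, tol=1e-10, max_iter=50):
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        Fx = F(x)
        if np.linalg.norm(Fx) < tol:      # close enough to a root
            break
        # Newton step: x_{n+1} = x_n - J(x_n)^{-1} F(x_n)
        x = x - np.linalg.solve(J(x), Fx)
    return x

root = newton_system(F, J, x0=[1.0, 2.0])
print(root)   # both components approach sqrt(2)
```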
As an illustrative application, we consider the function

$$f(x) = 3x^3 + 3x^2 - 5x, \quad x \in \mathbb{R},$$

which is represented by the blue line in the following diagram. The figure depicts the iterative solution path of the Newton-Raphson method to find a root, i.e., an $x$ solving $f(x) = 0$, by tangent points and tangents starting from the initial guess $x_0 = -1$.

[Figure: graph of f(x) for x between -1.5 and 1.5, with the tangent points x0, x1, x2 and x3 marked along the x-axis.]

Newton-Raphson implementation

The first step involves the definition of the function $f(x)$ and its derivative $f'(x)$ in Python:

Newton-Raphson requirements

    def f(x):
        return 3*x**3 + 3*x**2 - 5*x

    def df(x):
        return 9*x**2 + 6*x - 5

Finally, we implement the Newton-Raphson algorithm as outlined above. We allow for a (small) absolute deviation between the target function and its target value, i.e., 0. In addition, for a better understanding, we plot the solution path using the tangent points for $x_0, x_1, \ldots, x_N$. The solution point is colored black. Hence, the lines starting with ax.scatter() are not part of the algorithm; they rely on global variables and are included just for the visual illustration.

Newton-Raphson

    def newton_raphson(fun, dfun, x0, e):
        delta = abs(fun(x0))
        while delta > e:
            # plotting only: mark the current point on the global f and ax
            ax.scatter(x0, f(x0), color="red", s=80)
            x0 = x0 - fun(x0) / dfun(x0)
            delta = abs(fun(x0))
        # plotting only: mark the solution point
        ax.scatter(x0, f(x0), color="black", s=80)
        return x0

    fig = plt.figure(figsize=(16, 8))
    ax = fig.add_subplot(1, 1, 1)
    x = np.arange(-1.5, 1.7, 0.001)
    ax.plot(x, f(x))
    ax.grid()
    x_root = newton_raphson(f, df, -1, 0.1)
    fig.savefig("out/newton_raphson_root.pdf")
    print(f"Root at: {x_root:.4f}")

    ## Root at: 0.8878

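For use outside the plotting context, a plot-free variant might look like the sketch below. The name `newton_raphson_safe`, the iteration cap `max_iter` and the zero-derivative check are additions for illustration and are not part of the course code. The stopping rule is the same as above: iterate until |f(x)| is at most `e`.

```python
import math

def f(x):
    return 3*x**3 + 3*x**2 - 5*x

def df(x):
    return 9*x**2 + 6*x - 5

def newton_raphson_safe(fun, dfun, x0, e, max_iter=100):
    """Newton-Raphson without plotting, with two simple safeguards."""
    x = x0
    for _ in range(max_iter):
        if abs(fun(x)) <= e:              # same stopping rule as the slide code
            return x
        slope = dfun(x)
        if slope == 0:
            raise ZeroDivisionError("derivative is zero; Newton step undefined")
        x = x - fun(x) / slope
    raise RuntimeError("no convergence within max_iter iterations")

root_coarse = newton_raphson_safe(f, df, -1, 0.1)     # tolerance from the slides
root_fine = newton_raphson_safe(f, df, -1, 1e-10)     # much tighter tolerance
exact = (-3 + math.sqrt(69)) / 6                      # positive root of 3x^2 + 3x - 5

print(f"coarse: {root_coarse:.4f}, fine: {root_fine:.6f}, exact: {exact:.6f}")
```

With the coarse tolerance of 0.1 the result is only accurate to about two decimals. Tightening `e` moves the iterate to the exact root, since f(x) = x(3x^2 + 3x - 5).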
[Figure: graph of f(x) for x between -1.5 and 1.5 with the Newton-Raphson solution path to the root; iterates in red, solution point in black.]

Newton-Raphson optimization

With the definition of the second derivative $f''$, i.e., the derivative of the derivative, we can employ the Newton-Raphson method to obtain an optimum of the target function $f(x)$ numerically. Hence, the previous example needs only minimal modifications:

Newton-Raphson

    def ddf(x):
        return 18*x + 6

    fig = plt.figure(figsize=(16, 8))
    ax = fig.add_subplot(1, 1, 1)
    x = np.arange(-1.5, 1.7, 0.001)
    ax.plot(x, f(x))
    ax.grid()
    x_opt = newton_raphson(df, ddf, 1, 0.1)
    fig.savefig("out/newton_raphson_optimum.pdf")
    print(f"Minimum at: {x_opt:.4f}")

    ## Minimum at: 0.4886

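Applying the method to f' only locates a stationary point. Whether it is a local minimum or a local maximum depends on the sign of f'' there and on the starting value. The sketch below uses a plot-free helper (`newton` is a hypothetical name, and the tight tolerance is an assumption) to find both stationary points of the example function and classify them.

```python
import math

def df(x):
    return 9*x**2 + 6*x - 5       # first derivative of f

def ddf(x):
    return 18*x + 6               # second derivative of f

def newton(fun, dfun, x0, e=1e-10, max_iter=100):
    """Plot-free Newton-Raphson; stops when |fun(x)| <= e."""
    x = x0
    for _ in range(max_iter):
        if abs(fun(x)) <= e:
            return x
        x = x - fun(x) / dfun(x)
    raise RuntimeError("no convergence within max_iter iterations")

def classify(x):
    # f'' > 0: local minimum, f'' < 0: local maximum
    return "local minimum" if ddf(x) > 0 else "local maximum"

x_min = newton(df, ddf, 1)        # start to the right, as in the slides
x_max = newton(df, ddf, -2)       # a different start finds the other point

print(f"{x_min:.4f}: {classify(x_min)}")
print(f"{x_max:.4f}: {classify(x_max)}")
```

Because f is a cubic, both points are local extrema only; the function is unbounded in both directions.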
[Figure: graph of f(x) for x between -1.5 and 1.5 with the Newton-Raphson solution path to the minimum; iterates in red, solution point in black.]

The End... but not finally