
Memory management and libraries in Python

Sukrit Gupta

October 26, 2023

Sukrit Gupta Intro to Computing and Data Structures 1/35


Outline

1 Dynamic Memory Management

2 Libraries
The standard ‘core’ library
External libraries

3 Some common external Python libraries
NumPy (Numeric Python)
SciPy (Scientific Python)
Matplotlib
Scikit-Learn


Acknowledgement and disclaimer
All mistakes (if any) are mine.
I have used several other sources which I have referred to in the
appropriate places.


Section 1

Dynamic Memory Management


What is Dynamic Memory Management?

Memory allocation (and deallocation) is the assignment (and release)
of a block of space in the computer's memory to a program.
Learning how a language allocates and deallocates memory helps you
write memory-efficient code.
In Python, memory allocation and deallocation are done behind the
scenes.
The memory manager allocates space for new objects, and an automatic
routine called garbage collection reclaims memory that is no longer
in use.
Memory is organized as a stack and a private heap (not the heap
data structure we talked about).


Stack memory for method/function calls

Allocation happens in contiguous blocks of memory, handled by the
compiler (or, in Python, the interpreter) using predefined routines.
When a function is called, a frame for it is pushed onto the program’s
call stack.
Any local memory assignments (variable initializations, etc.)
inside the function are stored in that function's stack frame.
These are deleted once the function returns, and the call stack
moves on to the next task.


Private heap for value objects

Not the heap data structure. It is called a heap because it is a pile
of memory space available to programmers.
Heap memory is allocated during the execution of instructions.
In Python, all objects (values) live on the private heap; names in a
stack frame only hold references to them.
This lets an object outlive the function that created it or be shared
among several functions.
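
A small sketch of this idea, with the hypothetical helpers `make_list` and `append_item`: the list is created inside one function, survives after that function returns, and is then modified by another function without being copied. `id()` returns an object's identity, which stays the same for as long as the object lives.

```python
def make_list():
    data = [1, 2, 3]        # the list object is created on the heap
    return data, id(data)   # the local name `data` disappears on return

def append_item(items, value):
    items.append(value)     # changes the shared object; no copy is made

result, id_inside = make_list()
same_object = id(result) == id_inside   # same object inside and outside
append_item(result, 4)
print(same_object, result)   # True [1, 2, 3, 4]
```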


How to think about memory allocation in Python

Here’s one way to think about this:
At top level, i.e., the level of the shell, a symbol table keeps
track of all names defined at that level and their current bindings.
When a function is called, a new stack frame is created. Its symbol
table keeps track of all names defined within the function (including
the formal parameters) and their current bindings.
If a function is called from within the function body, yet another
stack frame is created.
When the function completes, its stack frame goes away.
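
The two kinds of symbol table can be inspected with the built-ins `globals()` and `locals()`. This is a minimal sketch; `rate`, `scale`, and `factor` are names made up for the illustration.

```python
rate = 3   # bound in the top-level symbol table

def scale(value):
    factor = 2
    value = value * factor
    return dict(locals())   # a copy of this call's symbol table

frame_names = scale(rate)
print(frame_names)             # {'value': 6, 'factor': 2}
print("factor" in globals())   # False: the name lived only in the frame
print(rate)                    # 3: the top-level binding is unchanged
```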


Example 1

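def f(x):
    y = 1
    x = x + y
    print('x =', x)  # prints x = 4 (f's local x)
    print('y =', y)  # prints y = 1 (f's local y)
    return x

x = 3
y = 2
z = f(x)
print('z =', z)  # prints z = 4
print('x =', x)  # prints x = 3 (top-level x is unchanged)
print('y =', y)  # prints y = 2 (top-level y is unchanged)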


Example 2

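def f(x):
    def h():
        z = x
        print('z =', z)  # prints z = 4 (x is looked up in f's frame)
    def g():
        x = 'abc'
        print('x =', x)  # prints x = abc (g's own local x)
    x = x + 1
    print('x =', x)  # prints x = 4
    h()
    g()
    print('x =', x)  # prints x = 4 (g did not change f's x)
    return x

x = 3
z = f(x)
print('x =', x)  # prints x = 3 (top-level x is unchanged)
print('z =', z)  # prints z = 4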


Garbage Collection and Reference Counting

Garbage collection: the interpreter frees memory that is no longer in
use, making it available for other objects.
For objects referenced only by a function's local names, the memory is
released once the function has finished executing.
When an object has no reference to it, the garbage collector
deletes it from the heap memory.
Reference counting: count the number of references to an object held
by names and other objects.
Decrement the reference count when references to an object are
removed.
When the reference count becomes zero, the object is deallocated.
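
Reference counts can be observed with `sys.getrefcount`. The sketch below assumes CPython, where an object is freed as soon as its count reaches zero; other interpreters may reclaim memory later. `Node` is a throwaway class for the illustration, and the weak reference is used only as a probe, because it does not keep the object alive.

```python
import sys
import weakref

class Node:
    """A trivial class; its instances support weak references."""

obj = Node()
baseline = sys.getrefcount(obj)      # the absolute value includes temporary references
alias = obj                          # one more reference to the same object
after_alias = sys.getrefcount(obj)

probe = weakref.ref(obj)             # does not increase the reference count
del alias                            # count goes down by one
del obj                              # last reference removed
is_freed = probe() is None           # the weak reference now points at nothing

print(after_alias - baseline, is_freed)   # 1 True (on CPython)
```

Only the difference between the two counts is meaningful here, since `getrefcount` itself temporarily holds a reference to its argument.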


Section 2

Libraries


What is a library?

Library
A library is a collection of implementations of behavior, written in
terms of a language, that has a well-defined interface by which the
behavior is invoked.

People writing a higher-level program can use libraries to make
system calls instead of implementing them again.
Two types:
Standard Library; and
External modules/packages/libraries that support programming (we will
refer to these as external libraries in this lecture)


2.1 The standard ‘core’ library


Standard Library

A standard library in computer programming is the library made
available across implementations of a programming language.
A language’s standard library is often treated as part of the
language by its users, although the designers may have treated it
as a separate entity.
Many language specifications define a core set that must be made
available in all implementations, in addition to other portions
which may be optionally implemented.


Python Standard Library

Python distributions include the standard library along with some
optional components.
Python’s standard library contains built-in modules (written in C)
that provide access to system functionality, as well as modules
written in Python.
It contains data types that would normally be considered part of
the core of a language, such as numbers and lists.


The standard library also contains built-in functions and
exceptions (objects that can be used by all Python code without
the need of an import statement).
The bulk of the library, however, consists of a collection of
modules (math, random, itertools, etc.) that can be used with an
import statement.
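
A short sketch of the difference: built-in functions and exceptions are available immediately, while standard library modules such as `math` and `itertools` must be imported first. The variable names are made up for the illustration.

```python
import itertools
import math

count = len([10, 20, 30])        # built-in function, no import needed

try:
    int("abc")
except ValueError as exc:        # built-in exception, no import needed
    error_name = type(exc).__name__

root = math.sqrt(16)                             # from the math module
pairs = list(itertools.combinations("abc", 2))   # from the itertools module

print(count, error_name, root)   # 3 ValueError 4.0
print(pairs)                     # [('a', 'b'), ('a', 'c'), ('b', 'c')]
```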


2.2 External libraries


What are external libraries?

People build higher-level programs by importing external libraries
and using the classes/functions defined in them instead of
implementing them from scratch.
There is a growing collection of hundreds of thousands of libraries
(≈ 490k as of 24 October 2023) available from the Python Package Index
(PyPI).
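
Unlike standard library modules, external libraries have to be installed before they can be imported, typically with a command such as `python -m pip install numpy`. The sketch below checks whether a top-level module can be found without importing it; `is_installed` is a hypothetical helper written for this illustration, not part of any library.

```python
import importlib.util

def is_installed(module_name):
    """Return True if a top-level module with this name can be imported."""
    return importlib.util.find_spec(module_name) is not None

print(is_installed("math"))    # True: part of the standard library
print(is_installed("numpy"))   # depends on whether NumPy has been installed
```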


Why have external libraries separately?

The goal of the standard library is to provide bare-bones functionality;
The idea is to build complex functionality from the basics;
It includes functionality that is of use to a wide range of users;
This keeps the language lightweight;
A (minor) reason is that the code needs to be validated for
multiple use cases.


Programming in an external vs. standard library?

In the case of Python, the standard library has some code written in C
(or Java, depending on the implementation). Check out this link.
How about C/C++? Some of the standard library is written in
assembly language and some is written in C/C++. Check
out this link.
External libraries can build upon the language’s standard library
besides using assembly level code or code in another language.


Why is Python such a good fit for data science?

Ease of use and simple syntax. Thus, widely used by subject
experts who are non-engineers.
Large number of external libraries that provide functionality for
mathematics, statistics, and scientific computing.
Suited for quick prototyping.
Deep learning frameworks for Python have made it the go-to
language for ML/DL models.


Section 3

Some common external Python libraries


3.1 NumPy (Numeric Python)


NumPy and its usage

Figure: List Allocation

Figure: NumPy array allocation


NumPy brings the computational power of languages like C and
Fortran to Python.
In Python, lists serve the purpose of arrays, but they are slow to
process. NumPy arrays are faster because:
they contain homogeneous data and can therefore be stored in
contiguous memory locations;
operations are vectorized: they are applied to whole arrays in
compiled loops rather than element by element in the interpreter; and
NumPy integrates C, C++, and Fortran code for faster execution.
The speedup can be as much as 50x over traditional Python lists
(Link).
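
A rough way to see the difference on your own machine is to time the same computation on a list and on an array. This is a minimal sketch, assuming NumPy is installed; the array size and repeat count are arbitrary, and the measured times will vary from machine to machine.

```python
import timeit

import numpy as np

n = 100_000
py_list = list(range(n))
np_array = np.arange(n, dtype=np.int64)   # explicit 64-bit integers avoid overflow

def square_list():
    return [v * v for v in py_list]       # one interpreted step per element

def square_array():
    return np_array * np_array            # one vectorized operation on the whole array

list_time = timeit.timeit(square_list, number=20)
array_time = timeit.timeit(square_array, number=20)
print(f"list: {list_time:.4f} s, ndarray: {array_time:.4f} s")
```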


The array object in NumPy is called ndarray (n-dimensional
array). NumPy provides supporting functions that make working with
ndarray very easy.
Arrays are very frequently used in data science, where speed and
resources are very important.
Start exploring NumPy and its operations on arrays here.
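
A minimal sketch of working with an `ndarray`, assuming NumPy is installed. The values are made up for the illustration.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])

print(a.ndim)     # 2: number of dimensions
print(a.shape)    # (2, 3): two rows, three columns

col_sums = a.sum(axis=0)   # add up each column
doubled = a * 2            # arithmetic applies to every element
transposed = a.T           # rows and columns swapped

print(col_sums.tolist())   # [5, 7, 9]
print(doubled.tolist())    # [[2, 4, 6], [8, 10, 12]]
print(transposed.shape)    # (3, 2)
```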


3.2 SciPy (Scientific Python)


SciPy and its usage

SciPy provides algorithms for optimization, integration,
interpolation, differential equations, statistics and many other
classes of problems.
Extends NumPy and wraps highly-optimized implementations
written in low-level languages like Fortran, C, and C++.
Flexibility of Python with the speed of compiled code.
Start exploring SciPy here.
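
As a small taste, the sketch below uses two of these algorithm families, numerical integration and optimization. It assumes SciPy and NumPy are installed; the integrand and the function being minimized are made-up examples.

```python
import numpy as np
from scipy import integrate, optimize

# Integrate sin(x) from 0 to pi; the exact answer is 2.
area, error_estimate = integrate.quad(np.sin, 0, np.pi)

# Find the x that minimizes (x - 2)^2; the exact answer is x = 2.
result = optimize.minimize_scalar(lambda x: (x - 2) ** 2)

print(round(area, 6))       # 2.0
print(round(result.x, 4))   # 2.0
```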


3.3 Matplotlib


Matplotlib and its usage

Matplotlib is a comprehensive library for creating static,
animated, and interactive visualizations in Python.
From the matplotlib.org website: ‘Matplotlib makes easy things
easy and hard things possible.’
With matplotlib, you can:
Create publication quality plots.
Make interactive figures that can zoom, pan, update.
Customize visual style and layout.
Export to many file formats.
Basically, if you want any kind of plot in Python, you use
Matplotlib.
Start learning matplotlib here and about different types of plots
here.
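
A minimal plotting sketch, assuming Matplotlib and NumPy are installed. It selects the non-interactive `Agg` backend so that it also runs without a display, and it writes the figure to the system's temporary directory; the file name `sine_demo.png` is arbitrary.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")   # non-interactive backend; must be chosen before importing pyplot
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 100)

fig, ax = plt.subplots()
ax.plot(x, np.sin(x), label="sin(x)")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("A simple line plot")
ax.legend()

output_path = os.path.join(tempfile.gettempdir(), "sine_demo.png")
fig.savefig(output_path, dpi=150)   # the file extension selects the output format
plt.close(fig)                      # release the figure's memory
```

In an interactive session you would normally skip the `matplotlib.use("Agg")` line and call `plt.show()` to open the figure in a window.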


3.4 Scikit-Learn


Scikit-Learn and its usage

Open-source machine learning library for Python.
Built on top of NumPy, SciPy, and matplotlib.
It features various algorithms for classification, regression,
clustering, data preprocessing, and dimensionality reduction.
Sufficient for basic machine learning algorithms. However, not
preferred when things start getting complex.
Start learning the different offerings of Scikit-Learn here.
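
Most Scikit-Learn models follow the same pattern: create the model, `fit` it to data, then `predict`. The sketch below assumes Scikit-Learn and NumPy are installed and uses a tiny made-up data set that follows y = 2x + 1 exactly.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Four training points on the line y = 2x + 1 (made-up data).
X = np.array([[0.0], [1.0], [2.0], [3.0]])   # one feature per row
y = np.array([1.0, 3.0, 5.0, 7.0])

model = LinearRegression()
model.fit(X, y)                              # learn slope and intercept

prediction = model.predict(np.array([[4.0]]))

print(round(float(model.coef_[0]), 3))       # 2.0
print(round(float(model.intercept_), 3))     # 1.0
print(round(float(prediction[0]), 3))        # 9.0
```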


What did we learn today?

1 Dynamic Memory Management

2 Libraries
The standard ‘core’ library
External libraries

3 Some common external Python libraries
NumPy (Numeric Python)
SciPy (Scientific Python)
Matplotlib
Scikit-Learn


Thank you!
