Pre-Course Programming

Python Foundations and Tooling

Study Program Data Science


Prof. Dr. Tillmann Schwörer
Access to Course Materials

 Enrolled data science students have permanent access via:
https://collab.fh-kiel.de/course/view.php?id=321
 Students not yet enrolled can use the following Dropbox folder:
https://bit.ly/precourse-programming-2024
 Note that this Dropbox folder will be deleted after the precourse!
 The first materials are already online; further materials will be uploaded during the precourse

Kickoff Survey

https://forms.gle/TWUGUAcxUBcZ3F2d9

Python foundations
Data types
Operators
Functions
Control flow and iterators
Finding help

Data Science Workflow

Python
Tooling
Installation
Visual Studio Code
Jupyter Notebooks
Python Packages
Virtual Environments

Agenda (preliminary)

 Day 1
 Installation
 Primitive Types
 Using Python as a Calculator
 Python Packages and Virtual Environments
 Day 2
 Functions
 List, Tuple, Dictionary and Set
 Control Flow
 Day 3
 List and Dict Comprehensions
 Useful Iterators
 Developing our own quiz application
Python Setup and Tooling

Recommended Setup

 CPython (not Anaconda Python)


 Integrated Development Environment: Visual Studio Code
 Python Extension and Jupyter Extension for Visual Studio Code
 Installation of Python Packages in virtual environments (one virtual
environment per university course)

Python Installation Guide

 No Python so far → just follow the instructions


 I have an older Python version installed
 Python 2.X: you MUST install a new version
 Python 3.X: you CAN install a new version
 In general, it is absolutely fine to have multiple Python versions installed
 I have Anaconda Python installed
 Either stick with Anaconda: some aspects will be slightly different for you
 Or install CPython in addition: it is fine to have both distributions installed
 I have strange problems due to multiple Python versions installed on my system
 This happens occasionally: let's try to fix these problems NOW!

Which Python distribution?

CPython distribution (https://www.python.org)
 What is it? Standard Python distribution
 How are projects managed? The relevant Python version and Python packages are installed in a virtual environment
 How are packages installed? Packages are installed from the Python Package Index (https://pypi.org/) via pip install packagename
 What are pros and cons?
  • Requires little disk space
  • 395,000 Python packages are available from PyPI
  • Does not handle system dependencies

Anaconda distribution (https://anaconda.org/)
 What is it? Data Science Toolkit:
  • Python and R distribution
  • Package management system "conda"
  • Graphical User Interface (GUI) "Anaconda Navigator"
  • Selection of IDEs
 How are projects managed? The Python/R versions and all required packages, including requirements of your operating system, are defined in conda environments
 How are packages installed? Packages are installed from the conda package repository via conda install packagename
 What are pros and cons?
  • Requires a lot of disk space
  • Fewer Python packages are available, and sometimes not their most recent versions; installing missing packages via pip creates its own problems
  • Handles system dependencies
Integrated Development Environment (IDE)

Visual Studio Code
 Jupyter Notebooks (.ipynb): ✓
 Python Script (.py): ✓
 Git: ✓
 Pricing: Free
 Comments: General-purpose code editor; support for specific functionalities and programming languages is available via extensions

PyCharm
 Jupyter Notebooks (.ipynb): ✓
 Python Script (.py): ✓
 Git: ✓
 Pricing: Free for students
 Comments: Comprehensive functionalities for Python; somewhat overloaded interface

JupyterLab/Jupyter Notebook (via browser)
 Jupyter Notebooks (.ipynb): ✓
 Python Script (.py): x
 Git: (✓)
 Pricing: Free
 Comments: Data Science focus, but cannot replace a full IDE for larger projects and production code

Spyder
 Jupyter Notebooks (.ipynb): x
 Python Script (.py): ✓
 Git: x
 Pricing: Free
 Comments: Similarity to RStudio

Visual Studio Code

 File Explorer
 Git / Version Control
 Extensions
 Settings
 Command Palette (Ctrl + Shift + P): fast access to everything

Packages

 3rd party packages: e.g. for data science
 Standard library: built-in modules with core Python functionalities
 Python

Virtual Environments

 Different projects may need different versions of Python and/or packages → Virtual Environment
 Ensure reproducibility: code needs to
run stably on any system, today and in
the future
 Safely experiment
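As a rough illustration of the idea, the following sketch uses only the standard library to check whether the current interpreter runs inside a virtual environment and to create a throwaway environment. The folder name demo-env is a made-up example, and the environment is created in a temporary directory so nothing is left behind.

```python
import sys
import tempfile
import venv
from pathlib import Path

# Inside an active virtual environment, sys.prefix points to the environment,
# while sys.base_prefix still points to the underlying Python installation.
in_virtual_env = sys.prefix != sys.base_prefix
print(f"Running inside a virtual environment: {in_virtual_env}")

# Create a throwaway environment in a temporary folder
# ("demo-env" is a hypothetical name chosen for this example).
with tempfile.TemporaryDirectory() as tmp:
    env_dir = Path(tmp) / "demo-env"
    venv.create(env_dir, with_pip=False)  # with_pip=False keeps the demo fast
    config_exists = (env_dir / "pyvenv.cfg").exists()
    print(f"Environment created: {config_exists}")
```

In everyday work the same module is usually called from the terminal, for example with python -m venv .venv inside the project folder.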

Virtual Environments in VS Code

Python script vs Jupyter Notebook
Script (.py) Jupyter Notebook (.ipynb)

Python scripts

 Text file (editable in any text editor), with file ending .py
 Use cases:
 Focus on code
 Concise documents
 Re-use code elsewhere by importing the Python file
 Production setting
 Comments are marked by the # symbol. Comments won't be executed; all other lines will.
 Execution modes in VS Code: interactive (selection, current line) or script
mode (entire file)
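A minimal sketch of what such a script could look like. The file name greeting.py and the function greet are invented for illustration; the final block shows one common way to keep a file usable both as a script and as an importable module.

```python
# greeting.py (hypothetical file name)
# Lines starting with "#" are comments and are skipped when the file runs.

def greet(name):
    """Return a short greeting for the given name."""
    return f"Hello, {name}!"


# This block runs only when the file is executed as a script,
# not when it is imported from another file.
if __name__ == "__main__":
    print(greet("Data Science"))
```

Another file in the same folder could then re-use the function via from greeting import greet.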

Python scripts: execution modes

 Run selection/line: Shift + Enter
 Execute entire file

Jupyter Notebooks

Three views of a notebook: what we see, what we type, and what happens under the hood.

Jupyter Notebooks

Content aspects:
 Code, output, text, formulas, images, etc. all in one file. Text is formatted via Markdown
 Supports Julia, Python, and R code
 Use cases: exploration, presentation, documentation, books

Technical aspects:
 Files with JSON structure and file ending .ipynb (IPython Notebook)
 Web-based
 Editable in the browser via JupyterLab/Jupyter Notebook, and via IDEs such as VS Code
 Convertible into html, pdf, or python scripts
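To make the JSON structure tangible, here is a small sketch that builds a minimal notebook as a Python dictionary and serializes it with the standard json module. The field names follow notebook format version 4 as far as I know it; the cell contents are made up, and real notebooks carry more metadata.

```python
import json

# A minimal notebook with one Markdown cell and one code cell.
# Cell contents are invented for this example.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# My analysis"]},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": None,
            "outputs": [],
            "source": ["print(1 + 1)"],
        },
    ],
}

# This text is what would be stored in an .ipynb file.
notebook_text = json.dumps(notebook, indent=1)

# Reading it back shows that a notebook is ordinary JSON.
cell_types = [cell["cell_type"] for cell in json.loads(notebook_text)["cells"]]
print(cell_types)
```

Because every cell, output and metadata field lives in this one JSON document, small edits can produce large diffs, which is one reason notebooks are awkward under version control.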

Jupyter Notebooks

 Disadvantages
 Editing functionalities are limited compared to Python scripts
 Working productively requires that you remember many shortcuts
 Not useful if you want to re-use code
 Not ideal for version control due to JSON format

→ More information on working with Jupyter Notebooks in VS Code.

Markdown

 Formatting of text documents
 Intuitive, easy to read, platform-independent
 Applications: Jupyter Notebooks, R Markdown, GitHub, …
 Useful links:
 Cheat Sheet, Tutorial
 VS Code extension: Markdown Preview Enhanced

Python Foundations

Data types
 Primitive Types
 List
 Tuple
 Dictionary
 Set

Manipulating data
 create | subset | edit | delete

Operators
 Math | comparison | logical

Functions
 Using functions
 Writing functions
 Lambda functions
 Generator functions

Control flow and Iterators
 if, else and loops
 List and dict comprehensions
 Generator expressions

Concepts / Paradigms
 Mutability
 Object Oriented Programming
 Functional programming
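Built-In Data Types

 Integer (number, primitive, immutable): 20
 Float (number, primitive, immutable): 37.5
 Complex (number, primitive, immutable): 1+3j
 Boolean (primitive, immutable): True/False
 String (sequence, immutable): 'Jessa'
 Tuple (sequence, collection, immutable): (3, 4.5, 'b')
 List (sequence, collection, mutable): [2, 'a', 5.7]
 Dictionary (collection, mutable): {1: 'a', 2: 'b'}
 Set (collection, mutable): {2, 4, 6}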
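A short sketch that asks Python for the type of each example value from the overview, using the built-in type() function. The variable names are chosen for this illustration only.

```python
# One example value per built-in type, taken from the overview.
examples = [20, 37.5, 1 + 3j, True, "Jessa", (3, 4.5, "b"), [2, "a", 5.7], {1: "a", 2: "b"}, {2, 4, 6}]

type_names = []
for value in examples:
    # type(value).__name__ gives the short class name, e.g. "int"
    type_names.append(type(value).__name__)

print(type_names)
```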
Math Operators
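The slide content for this heading did not survive as text, so the following is only an illustrative sketch of Python's arithmetic operators with arbitrary example numbers, not a reproduction of the original slide.

```python
total = 7 + 2            # addition
difference = 7 - 2       # subtraction
product = 7 * 2          # multiplication
quotient = 7 / 2         # true division, always returns a float
floor_quotient = 7 // 2  # floor division, rounds down to a whole number
remainder = 7 % 2        # modulo, the remainder of the division
power = 7 ** 2           # exponentiation

print(total, difference, product, quotient, floor_quotient, remainder, power)
```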

Functions
 Formatting:
 Consistent indentation
 Docstring: function help
 Type Hints (optional)
 Default values (optional)

 A function may
 carry out some operations
 return objects
 print()
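The points above can be sketched in one small example. The function names and numbers are invented; the sketch shows a docstring, optional type hints, a default value, and the difference between returning a value and printing it.

```python
def rectangle_area(width: float, height: float = 1.0) -> float:
    """Return the area of a rectangle.

    The type hints (float) and the default value for height are optional.
    """
    area = width * height  # carry out an operation
    return area            # return an object to the caller


def report_area(width: float, height: float = 1.0) -> None:
    """Print the area instead of returning it."""
    print(f"Area: {rectangle_area(width, height)}")


result = rectangle_area(3.0, 2.0)     # both arguments given
default_result = rectangle_area(3.0)  # height falls back to 1.0
report_area(3.0, 2.0)
```

A function that only prints, such as report_area, returns None, so its result cannot be re-used in later calculations.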

Methods

 Methods are functions which are associated with a specific class


 'Data Science' is an instance of the class str (string)
 The split method operates on the string 'Data Science'
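A minimal sketch of the example described above: the string is an instance of str, and split is called on it with dot notation.

```python
text = "Data Science"

# type() confirms that text is an instance of the class str.
text_type = type(text)

# split() is a method of str: it is called on the string with dot notation.
words = text.split()     # splits on whitespace by default
parts = text.split("a")  # splits at every "a"

print(text_type, words, parts)
```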

Strings

 Create strings:
 'single quotes'
 "double quotes"
 """triple quotes""" (for multiline strings)
 Subsetting strings
 [start:stop:step]
 zero-based
 start is inclusive, stop is exclusive
 negative indexing allowed
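The slicing rules can be tried out on a short string. The example string is arbitrary; the variable names are only for illustration.

```python
text = "Data Science"

first_char = text[0]      # zero-based: index 0 is the first character
first_word = text[0:4]    # start is inclusive, stop is exclusive
second_word = text[5:]    # omitting stop runs to the end of the string
last_char = text[-1]      # negative indices count from the end
every_second = text[::2]  # step of 2 takes every second character
backwards = text[::-1]    # a negative step walks through the string in reverse

print(first_char, first_word, second_word, last_char, every_second, backwards)
```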

Strings

 Concatenate strings via +


 Repeat strings via *
 Iterate over string elements

 String Methods
 str.upper()
 str.lower()
 str.split()
 …

Comparison and Membership Operators
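The slide content for this heading did not survive as text, so the following is only an illustrative sketch of comparison operators and the membership operators in and not in, with arbitrary example values.

```python
is_equal = 3 == 3.0          # equality compares values, not types
is_different = "a" != "b"    # inequality
is_smaller = 2 < 1           # less than
in_range = 1 <= 5 <= 10      # comparisons can be chained

has_letter = "a" in "data"             # membership in a string
missing = 7 not in [2, 4, 6]           # negated membership in a list
has_key = 1 in {1: "Anna", 2: "Max"}   # for a dictionary, "in" checks the keys

print(is_equal, is_different, is_smaller, in_range, has_letter, missing, has_key)
```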

Assignment, names, values

 Assignment: x = 2337
 Variable x is a name that stores the reference to value 2337

Consequences:
 Dynamic typing: We can change the type of x by pointing x to another object
 Aliasing: Multiple names can point to the same object
 Side effects: If a mutable object is changed, this affects all aliases
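The three consequences can be observed directly. This is a small sketch with invented values; the is operator checks whether two names refer to the same object.

```python
x = 2337         # x refers to an int object
x = "now a str"  # dynamic typing: x now refers to a str object

a = [2, 3, 4]
b = a            # aliasing: b refers to the same list object as a
same_object = a is b

b[2] = 5         # side effect: the change is visible through both names

print(type(x).__name__, same_object, a, b)
```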

Mutability

 Mutable objects can be changed in place, i.e. without altering the memory address
 Use the copy method to avoid side effects between two names

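Figure (three panels): name a refers to the list [2, 3, 4] at address #123456; names a and b refer to the same list [2, 3, 4] at #123456; after an in-place change, a and b both see [2, 3, 5], still at #123456.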
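A minimal sketch of both points, using the built-in id() function as a stand-in for the memory address. The values mirror the figure; the variable names are only for illustration.

```python
a = [2, 3, 4]
address_before = id(a)

a[2] = 5               # in-place change of a mutable object
address_after = id(a)  # still the same object

b = a.copy()           # independent (shallow) copy with its own identity
b[0] = 99              # changing the copy does not affect a

print(address_before == address_after, a, b)
```

Note that copy() is shallow: nested mutable objects inside the list would still be shared between the original and the copy.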

List and Tuple

 Lists and tuples: ordered collections, can contain mixed types


 Tuples are immutable, lists are mutable
 Tuples have very few methods
 Lists have many methods: append, extend, pop, remove, …

Shared features of all sequence types:


 Subset via index position
 Concatenate sequences via + operator
 Repeat sequences via * operator
 Iterate over elements in a for loop
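The following sketch walks through the list methods named above and shows that a tuple rejects in-place changes. All values are arbitrary examples.

```python
numbers = [2, "a", 5.7]  # list with mixed types
numbers.append(8)        # add one element at the end
numbers.extend([9, 10])  # add several elements at the end
last = numbers.pop()     # remove and return the last element
numbers.remove("a")      # remove the first occurrence of "a"

point = (3, 4.5, "b")
try:
    point[0] = 1         # tuples are immutable, so this raises a TypeError
    tuple_changed = True
except TypeError:
    tuple_changed = False

# Features shared by sequence types
first_two = point[:2]    # subset via index positions
longer = point + (1, 2)  # concatenate via +
doubled = [0, 1] * 2     # repeat via *

print(numbers, last, tuple_changed, first_two, longer, doubled)
```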

If - elif - else statement

Condition that evaluates to True or False
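A small sketch of an if, elif, else chain. The function name and the temperature thresholds are arbitrary examples; only the first branch whose condition is True is executed.

```python
def describe_temperature(celsius):
    """Return a label for a temperature; the thresholds are arbitrary examples."""
    if celsius < 0:       # checked first
        return "freezing"
    elif celsius < 20:    # checked only if the first condition was False
        return "mild"
    else:                 # runs if no condition above was True
        return "warm"


print(describe_temperature(-5), describe_temperature(10), describe_temperature(25))
```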

While loop

 Initialize variable i
 Condition that evaluates to True or False
 Change the value of the variable for the next iteration
 Once the condition evaluates to False, the while loop stops
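A minimal sketch of a while loop with these parts, using an arbitrary limit of 3 and a helper list that records which values were visited.

```python
i = 0                  # initialize variable i
visited = []
while i < 3:           # condition that evaluates to True or False
    visited.append(i)
    i += 1             # change the value for the next iteration
# The loop stops once i < 3 evaluates to False.

print(visited, i)
```

Forgetting the line that changes i would leave the condition True forever and produce an endless loop.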

For loop

for number in [2, 4, 6]:
    print(number)

for <var> in <iterable>:
    # do something #

Iterables
for <var> in <iterable>:
    # do something #

 list: [2, 4, 6] → 2, 4, 6
 tuple: (2, 4, 6) → 2, 4, 6
 string: 'data' → d, a, t, a
 range: range(0, 10, 2) → 0, 2, 4, 6, 8
 dictionary: dict = {1: 'Anna', 2: 'Max'}
 dict.keys() → 1, 2
 dict.values() → 'Anna', 'Max'

Iterables

for <var1>, <var2>, <…> in <iterable>:
    # do something #

dict
for i, j in {1: 'Anna', 2: 'Max'}.items():
    print(f'{i}. {j}')
Output:
1. Anna
2. Max

enumerate
for i, j in enumerate('Joe'):
    print(f'{i}: {j}')
Output:
0: J
1: o
2: e

zip
names = ['Anna', 'Max']
ages = [18, 20]
for name, age in zip(names, ages):
    print(f'{name} is {age}')
Output:
Anna is 18
Max is 20
Special statements in a loop

 pass: placeholder that does nothing; avoids a syntax error caused by an empty loop body
 continue: skip the rest of the current iteration and continue with the next one
 break: exit the loop immediately
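The three statements can be seen side by side in one loop. This is an illustrative sketch with arbitrary conditions; the list collected records which numbers reach the end of the loop body.

```python
collected = []
for number in range(10):
    if number == 0:
        pass            # placeholder: does nothing, execution continues below
    if number % 2 == 1:
        continue        # skip odd numbers and move on to the next iteration
    if number > 6:
        break           # leave the loop entirely
    collected.append(number)

print(collected)
```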

List comprehension

Concise alternative to a list-creating for loop

For loop:

squares = []
for x in [1, 3, 5]:
    squares.append(x**2)
Result: [1, 9, 25]

List comprehension:

squares = [x**2 for x in [1, 3, 5]]

Dictionary comprehension

Concise alternative to a dictionary-creating for loop

For loop:

squares = {}
for x in [1, 3, 5]:
    squares[x] = x**2
Result: {1: 1, 3: 9, 5: 25}

Dict comprehension:

squares = {x: x**2 for x in [1, 3, 5]}

Conventions

PEP 20: The Zen of Python


 Beautiful is better than ugly.
 Explicit is better than implicit.
 Simple is better than complex.
 …

PEP 8: Style Guide for Python Code


 4 spaces per indentation level
 Code should always use UTF-8 encoding
 Imports should usually be on separate lines
 …

PEP 257: Docstring Conventions

Tools for keeping code consistent with PEP 8 and PEP 257:
• pip install autopep8
• VS Code docstring generator extension
