MSBA315_intro_to_python_for_ML
In preparation for MSBA 315, you should be familiar with the following concepts. Highlighted
concepts (in red font) are particularly important.
Environments
You should be familiar with these development environments:
Anaconda + Jupyter Notebooks (local computer)
o How to install Anaconda(python) on Windows 10 by Coder's Digest - 2021
https://www.youtube.com/watch?v=GEYK1dlDqgU
o How to Set Up Your Data Science Environment by Coder's Digest - 2021
https://www.youtube.com/watch?v=w3kXtaZEtRs
o Anaconda: Start here for data science in Python! by Karan Bhanot - 2020
https://towardsdatascience.com/anaconda-start-here-for-data-science-in-python-475045a9627
Google Colab (cloud-based):
o Google Colab Tutorial for Beginners by Doga Ozgon - 2021
https://youtu.be/RLYoEyIHL6A
o Deal with Files in Google Colab by Siddhant Sadangi - 2022
https://neptune.ai/blog/google-colab-dealing-with-files
o A Complete Guide to Google Colab by Ahmad Anis - 2020
https://www.kdnuggets.com/2020/06/google-colab-deep-learning.html
Python Basics
Data types: Numbers, Strings, Lists, Dictionaries, Booleans, Tuples, Sets
Comparison operators
if, elif, else Statements
for Loops
while Loops
range()
list comprehension
functions
lambda expressions
methods
string manipulation
split(), strip(), join()
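As a quick self-check, the Python Basics topics above can be sketched in one short script. This is a minimal illustration, not course material: the variable names, sample data, and grade thresholds are all made up for the example.

```python
# Data types: list, dictionary, tuple, set (all sample values are hypothetical)
scores = [72, 88, 95, 61]
student = {"name": "Ada", "passed": True}
point = (3, 4)
unique_scores = set(scores + [88])  # the duplicate 88 collapses in a set


# Function using comparison operators with if / elif / else
def grade(score):
    """Return a letter grade; the thresholds are illustrative only."""
    if score >= 90:
        return "A"
    elif score >= 70:
        return "B"
    else:
        return "C"


# for loop with range()
grades = []
for i in range(len(scores)):
    grades.append(grade(scores[i]))

# The same result with a list comprehension
grades_lc = [grade(s) for s in scores]

# while loop
countdown = 3
while countdown > 0:
    countdown -= 1

# lambda expression passed to map()
square = lambda x: x ** 2
squares = list(map(square, range(4)))

# String methods: strip(), split(), join()
raw = "  alice, bob ,carol  "
names = [part.strip() for part in raw.strip().split(",")]
joined = "-".join(names)

print(grades, squares, names, joined)
```

If every line here reads naturally, the Python Basics list should pose no difficulty.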
NumPy Basics
NumPy Arrays
o One dimensional Arrays (1D array)
o Two dimensional Arrays (2D array)
o From Python list to NumPy Array
Built-in Methods for Basic Array Data Generations
o arange(), linspace()
o zeros(), ones(), empty(), empty_like(), full(), eye()
o random, rand(), randn(), randint()
Array Attributes and Methods
o dtype
o shape
o itemsize
o reshape(), expand_dims(), and newaxis
o max(), min(), argmax(), argmin()
NumPy Indexing and Selection/Slicing
o Bracket Indexing and Slicing: 1D and 2D arrays
o Boolean Indexing and Slicing
https://www.pythontutorial.net/python-numpy/boolean-indexing/
o Array Stacking (row-wise and column-wise)
o Array Splitting (row-wise and column-wise)
NumPy Operations
o Arithmetic (array and scalar)
o Broadcasting
o Universal Array Functions (element-by-element)
sqrt(), exp(), log(), maximum(), minimum(), sin(), etc.
https://numpy.org/doc/stable/reference/ufuncs.html
o Vectorization
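The NumPy topics above can be tried out with a short sketch like the following. It is only one possible illustration; the array contents and variable names are arbitrary choices for the example.

```python
import numpy as np

# From Python list to NumPy array (1D), then reshape to 2D
a = np.array([1, 2, 3, 4, 5, 6])
m = a.reshape(2, 3)

# Built-in array generation
grid = np.arange(0, 10, 2)          # start, stop (exclusive), step
line = np.linspace(0.0, 1.0, 5)     # 5 evenly spaced points, endpoints included
zeros = np.zeros((2, 3))
ident = np.eye(3)
draws = np.random.randint(0, 10, size=4)  # random integers in [0, 10)

# Attributes and methods
print(m.shape, m.dtype, m.itemsize)
col = a[:, np.newaxis]              # shape (6, 1)
row = np.expand_dims(a, axis=0)     # shape (1, 6)
largest, where = a.max(), a.argmax()

# Bracket indexing, slicing, and boolean indexing
element = m[1, 2]                   # row 1, column 2
second_col = m[:, 1]
big = a[a > 3]                      # keeps entries where the mask is True

# Stacking and splitting
tall = np.vstack([m, m])            # row-wise, shape (4, 3)
wide = np.hstack([m, m])            # column-wise, shape (2, 6)
left, right = np.hsplit(wide, 2)    # two (2, 3) pieces

# Broadcasting: the length-3 vector is added to every row of m
shifted = m + np.array([10, 20, 30])

# Universal functions work element by element
roots = np.sqrt(a)

# Vectorization: one array expression replaces an explicit Python loop
doubled_loop = [v * 2 for v in a]
doubled_vec = a * 2
```

The last two lines compute the same values; the vectorized form is the one to prefer on large arrays.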
Pandas
Series
o Creation
o Convert a Python list, dictionary, or NumPy array to a Pandas Series
o Indexing and Selection
DataFrames
o Creation
o Convert a Python list, dictionary, or NumPy array to a Pandas DataFrame
o Creating and Dropping Columns
Data Input and Output
Loading data from CSV, Excel, etc. files: read_csv(), read_excel()
Saving data to CSV, Excel, etc. files: to_csv(), to_excel()
Indexing and Selection
Bracket and list selection by label or position
Boolean (or conditional) selection
Setting and resetting indexes
Group By
Merging, Joining, and Concatenating
Useful Operations
o unique(), nunique(), value_counts()
o sort_values()
o isnull()
o apply(func)
Plotting with Pandas
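A compact way to review most of the Pandas topics above is the sketch below. The city and sales data are invented for the example, and the CSV round trip uses an in-memory buffer instead of a real file so the script leaves nothing on disk. Excel I/O and plotting are left out because they need extra packages.

```python
import io

import numpy as np
import pandas as pd

# Series from a Python list, with label and position based access
s = pd.Series([10, 20, 30], index=["a", "b", "c"])
by_label, by_position = s["b"], s.iloc[0]

# DataFrame from a dictionary (hypothetical sample data)
df = pd.DataFrame({
    "city": ["Beirut", "Paris", "Beirut", "Rome"],
    "sales": [100, 250, 150, np.nan],
})

# Creating and dropping a column
df["sales_k"] = df["sales"] / 1000
df = df.drop(columns=["sales_k"])

# Selection by label, by position, and by condition
first_city = df.loc[0, "city"]
paris_sales = df.iloc[1, 1]
beirut = df[df["city"] == "Beirut"]

# Setting and resetting the index
indexed = df.set_index("city")
restored = indexed.reset_index()

# Group by: total sales per city (missing values are skipped)
totals = df.groupby("city")["sales"].sum()

# Merging and concatenating
regions = pd.DataFrame({
    "city": ["Beirut", "Paris", "Rome"],
    "region": ["MENA", "EU", "EU"],
})
merged = df.merge(regions, on="city", how="left")
stacked = pd.concat([df, df], ignore_index=True)

# Useful operations
cities = df["city"].unique()
n_cities = df["city"].nunique()
counts = df["city"].value_counts()
ranked = df.sort_values("sales", ascending=False)  # NaN rows go last
n_missing = df["sales"].isnull().sum()
doubled = df["sales"].apply(lambda v: v * 2)

# Data output and input, using an in-memory buffer in place of a file
csv_text = df.to_csv(index=False)
round_trip = pd.read_csv(io.StringIO(csv_text))
```

With a real file, the last two lines would take a path such as a CSV filename instead of the buffer.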
Matplotlib
Basic Plotting and Sub-Plotting
o plt.plot() with a different color, marker, line style, width, etc.
plt.xlabel(), plt.ylabel(), plt.title(), plt.legend(), etc.
o plt.subplot()
Create a Figure Instance and Add Axes (Object Oriented Way)
o fig = plt.figure() with specific figsize and dpi
axes1 = fig.add_axes()
axes2 = fig.add_axes()
axes1.plot()
axes2.plot()
…
o fig, axes = plt.subplots(nrows=1, ncols=2)
Saving:
o fig.savefig("filename.extension", dpi=200)
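The two plotting styles listed above can be compared side by side in a sketch like this one. The data, axes positions, and output filename are arbitrary choices for the example. The "Agg" backend line is an assumption made so the script runs without a display; in a Jupyter or Colab notebook it should be removed so figures render inline.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 50)

# Basic plotting and sub-plotting with the pyplot interface
plt.figure()
plt.subplot(1, 2, 1)  # 1 row, 2 columns, first panel
plt.plot(x, np.sin(x), color="red", marker="o", linestyle="--",
         linewidth=1.5, label="sin(x)")
plt.xlabel("x")
plt.ylabel("y")
plt.title("Sine")
plt.legend()
plt.subplot(1, 2, 2)  # second panel
plt.plot(x, np.cos(x), label="cos(x)")
plt.legend()
plt.close()

# Object-oriented way: create a Figure and add Axes by hand
fig = plt.figure(figsize=(6, 4), dpi=100)
axes1 = fig.add_axes([0.1, 0.1, 0.8, 0.8])      # [left, bottom, width, height]
axes2 = fig.add_axes([0.55, 0.55, 0.3, 0.3])    # smaller inset axes
axes1.plot(x, np.sin(x))
axes2.plot(x, np.cos(x))

# Object-oriented way: let subplots() lay out the Axes
fig2, axes = plt.subplots(nrows=1, ncols=2)
axes[0].plot(x, np.sin(x))
axes[0].set_title("Sine")
axes[1].plot(x, np.cos(x))
axes[1].set_title("Cosine")

# Saving: written to a temporary folder here (hypothetical filename)
out_path = os.path.join(tempfile.mkdtemp(), "example_plot.png")
fig2.savefig(out_path, dpi=200)
```

The file extension passed to savefig() selects the output format, so the same call can produce a PNG or a PDF.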
Resources:
Top Python Concepts to Know Before Learning Data Science by Ibrahim Abayomi Ogunbiyi - 2022
https://www.freecodecamp.org/news/top-python-concepts-for-data-science/#integers-and-floating-point-numbers-in-python
String Manipulation in Python by PFB Staff Writer - 2022
https://www.pythonforbeginners.com/basics/string-manipulation-in-python
The Python Tutorial [official reference]
https://docs.python.org/3/tutorial/
NumPy For Machine Learning by Paritosh Mahto - 2020
https://medium.com/mlpoint/numpy-for-machine-learning-211a3e58b574
Complete Python NumPy Tutorial by Keith Galli - 2019
Video: https://www.youtube.com/watch?v=GB9ByFAIAH4
Code: https://github.com/KeithGalli/NumPy
Vectorization in Python: Data Science Code by ritvikmath - 2019
https://www.youtube.com/watch?v=BR3Qx9AVHZE
The Markdown Guide
https://www.markdownguide.org/basic-syntax/
An interactive tutorial to learn Markdown’s syntax by Esteban Herrera.
http://eherrera.net/markdowntutorial/
Recommended Books:
Enjoy!