MSBA315_intro_to_python_for_ML

The document outlines the prerequisites for the MSBA 315 course, focusing on Python for Machine Learning, including essential development environments like Anaconda and Google Colab. It covers fundamental Python concepts, NumPy, Pandas, Matplotlib, and Markdown, along with recommended resources and books for further learning. Key topics include data types, array operations, data manipulation, and basic plotting techniques.


MSBA 315

Introduction to Python for Machine Learning

In preparation for MSBA 315, you should be familiar with the following concepts. Highlighted
concepts (in red font) are particularly important.

Environments
You should be familiar with these development environments:
• Anaconda + Jupyter Notebooks (local computer)
  o How to install Anaconda(python) on Windows 10 by Coder's Digest - 2021
    ▪ https://www.youtube.com/watch?v=GEYK1dlDqgU
  o How to Set Up Your Data Science Environment by Coder's Digest - 2021
    ▪ https://www.youtube.com/watch?v=w3kXtaZEtRs
  o Anaconda: Start here for data science in Python! by Karan Bhanot - 2020
    ▪ https://towardsdatascience.com/anaconda-start-here-for-data-science-in-python-475045a9627
• Google Colab (cloud-based)
  o Google Colab Tutorial for Beginners by Doga Ozgon - 2021
    ▪ https://youtu.be/RLYoEyIHL6A
  o Deal with Files in Google Colab by Siddhant Sadangi - 2022
    ▪ https://neptune.ai/blog/google-colab-dealing-with-files
  o A Complete Guide to Google Colab by Ahmad Anis - 2020
    ▪ https://www.kdnuggets.com/2020/06/google-colab-deep-learning.html

Python Basics
• Data types: Numbers, Strings, Lists, Dictionaries, Booleans, Tuples, Sets
• Comparison operators
• if, elif, else statements
• for loops
• while loops
• range()
• List comprehension
• Functions
• Lambda expressions
• Methods
• String manipulation
• split(), strip(), join()
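As a quick self-check on the list above, the following is a minimal sketch that touches each item once. All variable names and values (scores, student, grade, and so on) are made up for illustration and are not part of the course material.

```python
# Core data types (values are illustrative only).
count = 3                              # number (int)
price = 9.5                            # number (float)
name = "  machine learning  "          # string with stray whitespace
scores = [72, 88, 95, 60]              # list: ordered and mutable
student = {"name": "Ada", "age": 30}   # dictionary: key -> value
is_enrolled = True                     # boolean
point = (2, 5)                         # tuple: ordered and immutable
tags = {"ml", "python", "ml"}          # set: the duplicate "ml" is dropped


def grade(score):
    """Map a numeric score to a label using comparisons with if / elif / else."""
    if score >= 90:
        return "high"
    elif score >= 70:
        return "medium"
    else:
        return "low"


# for loop over range(): adds 0 + 1 + 2 + 3 + 4
total = 0
for i in range(5):
    total += i

# while loop: halve a value until it drops below 1
value = 16
halvings = 0
while value >= 1:
    value = value / 2
    halvings += 1

# List comprehension: apply grade() to every score in one expression
labels = [grade(s) for s in scores]

# Lambda expression used as a sort key (largest score first)
by_score_desc = sorted(scores, key=lambda s: -s)

# String methods: strip() trims whitespace, split() breaks on whitespace,
# join() glues the pieces back together with a separator
clean = name.strip()
words = clean.split()
slug = "-".join(words)

print(labels, by_score_desc, slug)
```

If any line in this sketch is unclear, the corresponding section of the official Python Tutorial listed under Resources is a good place to start.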

NumPy Basics
• NumPy Arrays
  o One-dimensional arrays (1D array)
  o Two-dimensional arrays (2D array)
  o From Python list to NumPy array
• Built-in Methods for Basic Array Data Generation
  o arange(), linspace()
  o zeros(), ones(), empty(), empty_like(), full(), eye()
  o random, rand(), randn(), randint()
• Array Attributes and Methods
  o dtype
  o shape
  o itemsize
  o reshape(), expand_dims(), and newaxis
  o max(), min(), argmax(), argmin()
• NumPy Indexing and Selection/Slicing
  o Bracket Indexing and Slicing: 1D and 2D arrays
  o Boolean Indexing and Slicing
    ▪ https://www.pythontutorial.net/python-numpy/boolean-indexing/
  o Array Stacking (row-wise and column-wise)
  o Array Splitting (row-wise and column-wise)
• NumPy Operations
  o Arithmetic (array and scalar)
  o Broadcasting
  o Universal Array Functions (element-by-element)
    ▪ sqrt(), exp(), log(), max(), min(), sin(), etc.
    ▪ https://numpy.org/doc/stable/reference/ufuncs.html
  o Vectorization
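The NumPy topics above can be tried out in a few lines each. This is a rough sketch rather than a complete tour: the array contents and variable names are arbitrary choices for illustration, and the random calls use the legacy np.random functions named in the list.

```python
import numpy as np

# From Python lists to 1D and 2D arrays
a = np.array([1, 2, 3, 4, 5, 6])
m = np.array([[1, 2, 3], [4, 5, 6]])

# Built-in generators
steps = np.arange(0, 10, 2)            # 0, 2, 4, 6, 8 (stop value excluded)
grid = np.linspace(0.0, 1.0, 5)        # 5 evenly spaced points, endpoints included
zeros = np.zeros((2, 3))
sevens = np.full((2, 2), 7)
identity = np.eye(3)
np.random.seed(0)                      # fixed seed so reruns give the same numbers
uniform = np.random.rand(2, 2)         # uniform samples in [0, 1)
normal = np.random.randn(3)            # standard normal samples
ints = np.random.randint(0, 10, size=4)  # integers in [0, 10)

# Attributes
print(m.dtype, m.shape, m.itemsize)

# Changing shape
r = a.reshape(3, 2)                    # same 6 values, 3 rows x 2 columns
col = a[:, np.newaxis]                 # shape (6, 1)
row = np.expand_dims(a, axis=0)        # shape (1, 6)

# Indexing, slicing, and boolean selection
last_of_second_row = m[1, 2]           # row 1, column 2
first_column = m[:, 0]                 # every row, column 0
above_three = a[a > 3]                 # keeps elements where the mask is True

# Stacking and splitting
v = np.vstack([m, m])                  # row-wise: shape (4, 3)
h = np.hstack([m, m])                  # column-wise: shape (2, 6)
left, right = np.hsplit(h, 2)          # two (2, 3) halves

# Arithmetic, broadcasting, and universal functions
doubled = m * 2                        # scalar applied to every element
broadcast = m + np.array([10, 20, 30]) # the 1D row is added to each row of m
roots = np.sqrt(a)                     # element-by-element square root

# Vectorization: the same sum of squares with and without a Python loop
loop_total = 0
for x in a:
    loop_total += x * x
vec_total = np.sum(a ** 2)
```

The last two computations give the same number; the vectorized form is the one to prefer, since it pushes the loop into NumPy's compiled code.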

Pandas
• Series
  o Creation
  o Convert a Python list, dictionary, or NumPy array to a Pandas Series
  o Indexing and Selection
• DataFrames
  o Creation
  o Convert a Python list, dictionary, or NumPy array to a Pandas DataFrame
  o Creating and Dropping Columns
• Data Input and Output
  o Loading data from CSV, Excel, etc. files: read_csv(), read_excel()
  o Saving data to CSV, Excel, etc. files: to_csv(), to_excel()
• Indexing and Selection
  o Bracket and list selection by label or position
  o Boolean (or conditional) selection
  o Setting and resetting indexes
• Group By
• Merging, Joining, and Concatenating
• Useful Operations
  o unique(), nunique(), value_counts()
  o sort_values()
  o isnull()
  o apply(func)
• Plotting with Pandas
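A small sketch of the Pandas operations listed above. The city and sales data are invented for illustration, and the CSV round trip goes through an in-memory buffer (io.StringIO) instead of a file on disk so the example leaves nothing behind; read_excel() and to_excel() follow the same pattern but need an extra Excel engine installed.

```python
import io

import numpy as np
import pandas as pd

# Series from a list (with labels), a dictionary, and a NumPy array
s = pd.Series([10, 20, 30], index=["a", "b", "c"])
from_dict = pd.Series({"x": 1, "y": 2})
from_array = pd.Series(np.array([1.5, 2.5]))
by_label = s["b"]          # selection by label
by_position = s.iloc[0]    # selection by position

# DataFrame from a dictionary of columns (made-up data)
df = pd.DataFrame({
    "city": ["Beirut", "Tripoli", "Beirut", "Sidon"],
    "sales": [100, 80, 120, np.nan],
})

# Creating and dropping a column
df["sales_k"] = df["sales"] / 1000
df = df.drop(columns=["sales_k"])

# Selection by label, by position, and by condition
first_city = df.loc[0, "city"]
second_sales = df.iloc[1, 1]
big = df[df["sales"] > 90]             # rows where the condition holds

# Setting and resetting the index
indexed = df.set_index("city")
restored = indexed.reset_index()

# Group by: total sales per city (missing values are skipped by sum())
totals = df.groupby("city")["sales"].sum()

# Merging and concatenating
regions = pd.DataFrame({
    "city": ["Beirut", "Tripoli", "Sidon"],
    "region": ["Central", "North", "South"],
})
merged = df.merge(regions, on="city", how="left")
stacked = pd.concat([df, df], ignore_index=True)

# Useful operations
n_cities = df["city"].nunique()
counts = df["city"].value_counts()
top = df.sort_values("sales", ascending=False)
n_missing = df["sales"].isnull().sum()
name_lengths = df["city"].apply(len)

# Data output and input: write CSV text to a buffer, then read it back
buffer = io.StringIO()
df.to_csv(buffer, index=False)
buffer.seek(0)
loaded = pd.read_csv(buffer)
```

For the last bullet, calling df.plot() on a DataFrame hands the drawing to Matplotlib, so the Matplotlib section applies there as well.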

Matplotlib
• Basic Plotting and Sub-Plotting
  o plt.plot() with a different color, marker, line style, width, etc.
    ▪ plt.xlabel(), plt.ylabel(), plt.title(), plt.legend(), etc.
  o plt.subplot()
• Create a Figure Instance and Add Axes (Object-Oriented Way)
  o fig = plt.figure() with specific figsize and dpi
    ▪ axes1 = fig.add_axes()
    ▪ axes2 = fig.add_axes()
    ▪ axes1.plot()
    ▪ axes2.plot()
    ▪ …
  o fig, axes = plt.subplots(nrows=1, ncols=2)
• Saving:
  o fig.savefig("filename.extension", dpi=200)
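The two plotting styles above can be compared side by side in the sketch below. Several details are assumptions made for illustration: the sine and cosine data, the axes rectangles passed to add_axes(), the output file name, and the "Agg" backend line, which only makes the script runnable without a display (in a Jupyter notebook you would normally drop that line so figures render inline).

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; omit this line in Jupyter
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 50)

# Functional style: plt.* calls act on the "current" figure and axes
plt.figure()
plt.subplot(1, 2, 1)                   # 1 row, 2 columns, first panel
plt.plot(x, np.sin(x), color="red", marker="o", linestyle="--",
         linewidth=1.5, label="sin(x)")
plt.xlabel("x")
plt.ylabel("y")
plt.title("Sine")
plt.legend()
plt.subplot(1, 2, 2)                   # second panel
plt.plot(x, np.cos(x), color="blue", label="cos(x)")
plt.title("Cosine")
plt.legend()
plt.close()                            # release the figure once done

# Object-oriented style: create a figure, then add axes to it explicitly.
# add_axes takes [left, bottom, width, height] as fractions of the figure.
fig = plt.figure(figsize=(6, 4), dpi=100)
axes1 = fig.add_axes([0.1, 0.1, 0.8, 0.8])      # main axes
axes2 = fig.add_axes([0.55, 0.55, 0.3, 0.3])    # smaller inset axes
axes1.plot(x, np.sin(x))
axes2.plot(x, np.cos(x))

# plt.subplots returns the figure and an array of axes in one call
fig2, axes = plt.subplots(nrows=1, ncols=2)
axes[0].plot(x, np.sin(x))
axes[0].set_title("Sine")
axes[1].plot(x, np.cos(x))
axes[1].set_title("Cosine")

# Saving: the file extension selects the output format
out_path = os.path.join(tempfile.mkdtemp(), "sine_cosine.png")
fig2.savefig(out_path, dpi=200)
```

The object-oriented style is a little more verbose, but it keeps each plot call tied to a named axes object, which helps once a figure has more than one panel.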

Markdown:
• Basic Markdown syntax to create headings (sections, subsections, etc.), bold or italic font, lists, ordered lists, hyperlinks, etc. to organize your Jupyter notebook (see resources).

Resources:
Top Python Concepts to Know Before Learning Data Science by Ibrahim Abayomi Ogunbiyi - 2022
• https://www.freecodecamp.org/news/top-python-concepts-for-data-science/#integers-and-floating-point-numbers-in-python
String Manipulation in Python by PFB Staff Writer - 2022
• https://www.pythonforbeginners.com/basics/string-manipulation-in-python
The Python Tutorial [official reference]
• https://docs.python.org/3/tutorial/
NumPy For Machine Learning by Paritosh Mahto - 2020
• https://medium.com/mlpoint/numpy-for-machine-learning-211a3e58b574
Complete Python NumPy Tutorial by Keith Galli - 2019
• Video: https://www.youtube.com/watch?v=GB9ByFAIAH4
• Code: https://github.com/KeithGalli/NumPy
Vectorization in Python: Data Science Code by ritvikmath - 2019
• https://www.youtube.com/watch?v=BR3Qx9AVHZE
The Markdown Guide
• https://www.markdownguide.org/basic-syntax/
An interactive tutorial to learn Markdown’s syntax by Esteban Herrera
• http://eherrera.net/markdowntutorial/

Recommended Books:

Python Data Science Handbook (2nd Edition) by Jake VanderPlas
• Chapter 2: Introduction to NumPy
• Chapter 3: Data Manipulation with Pandas

Python for Data Analysis (3rd Edition) by Wes McKinney
• Chapter 2: Python Language Basics, IPython, and Jupyter Notebooks
• Chapter 3: Built-In Data Structures, Functions, and Files
• Chapter 4: NumPy Basics: Arrays and Vectorized Computation
• Chapter 5: Getting Started with Pandas
• Chapter 6: Data Loading, Storage, and File Formats
• Chapter 9: Plotting and Visualization

These books are available via AUB libraries under O'Reilly Safari Books.

Enjoy!

Wael Khreich, PhD
Assistant Professor
Business Information and Decision Systems

American University of Beirut
PO Box 11-0236, Riad El Solh, Beirut 1107 2020, Lebanon
T +961 1 35 00 00 – Ext 000 | [email protected]
aub.edu.lb/OSB
