Data Science With Machine Learning Curriculum 2021
Updated April 2021
Data Science with Machine Learning Curriculum
Program Objective
Data science is a fast-evolving field and offers many employment opportunities for people with a
robust operational analysis background. In recent years, technological development in data
collection and storage and innovations in data science tools and methodologies have made it even
more important to have properly trained data analysts and data scientists to perform data analyses
to gain business insights.
NYC Data Science Academy designed the Data Science with Machine Learning bootcamp to
provide accelerated training to fulfill the need for data science professionals in the employment
market. The objective of the Data Science Bootcamp is to provide training in primary data science
tools and methods that prepares students for employment opportunities across all industries as
data science professionals.
Program Description
The Data Science with Machine Learning bootcamp is an advanced certificate program that is
designed primarily for individuals who have earned a baccalaureate or higher degree and want to
further their career in the field of data science. It is a highly accelerated training program in which
students learn the major tools and methods for performing data analyses and apply them to various
projects typically found in the data science field.
At the foundation level of the program, students learn to employ R and Python for data analytics
projects and for presenting research results effectively. Beyond the foundational level, students
study machine learning with Python and carry out research projects that involve advanced data
science methods and strategies. The program also exposes students to concepts and practices in
deep learning and big data.
Program stages: Prework, Data Analytics, Machine Learning, Capstone, Job Search
Prework
Once students are enrolled in the bootcamp, they are granted access to our online, self-paced prework
materials to prepare in linear algebra, statistics, and foundational coding.
Enrolled bootcamp students can also choose to take part-time, beginner-level courses hosted at our
NYC campus for preparation. Tuition paid for such courses will be credited as part of bootcamp
tuition.
• SQL: Part 4
o Window functions
o Comparison with Grouping and Aggregating
o Application
• SQL: Part 5
o Data manipulation
o Loading data from files
o Data integrity
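As a rough illustration of the "Window functions" versus "Grouping and Aggregating" comparison listed under SQL: Part 4, the sketch below runs both query styles against a small in-memory SQLite database from Python. The `sales` table, its columns, and its values are hypothetical and chosen only for illustration, and the window query assumes the bundled SQLite is version 3.25 or newer.

```python
import sqlite3

# Hypothetical in-memory table; names and values are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 10), ("east", 30), ("west", 20)],
)

# GROUP BY collapses each region into a single summary row.
grouped = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()

# A window function keeps every input row and attaches the region total to it.
# Assumes SQLite >= 3.25, the first release with window function support.
windowed = conn.execute(
    "SELECT region, amount, SUM(amount) OVER (PARTITION BY region) "
    "FROM sales ORDER BY region, amount"
).fetchall()

conn.close()
```

The grouped query returns one row per region, while the windowed query returns one row per sale with the regional total repeated alongside it.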
• Introduction to R: Part I
o Introduction to R
o Introduction to RStudio
o R objects
o Functional programming: apply
● Introduction to R: Part II
o More data types
o Control statements
o Functions
o Data transformations
● Manipulating Data with dplyr and tidyr
o Introduction to dplyr
o Built-in functions
o Join data sets
o Groupwise operations
o Reshape the layout
o Split/Combine cells
● Data Visualization with "ggplot2"
o Why ggplot2?
o The “Grammar of Graphics”
o Constructing a ggplot2 plot
o Scatterplots
o Bar charts
o Histograms
o Visualizing big data
o Saving graphs
● Advanced ggplot2
o Customized graphics
o Titles
o Coordinate systems
o Scales
o Themes
o Axis labels
o Legends
o Other visualizations
● Introduction to Shiny
o A quick introduction to shiny
o Building a Shiny App from scratch
o Improving your Shiny App
● Shiny Topics
o GoogleVis
o Leaflet
o Shiny dashboard
● Foundations of Statistics
o Descriptive statistics
o Introduction to inferential statistics
o Introduction to machine learning
• Advanced Statistics
o Distributions
o Code walkthrough
Project 1: Exploratory Visualization & Shiny
§ An example
§ Getting started
§ Items/spider/pipelines/settings.py
§ In-class lab
● NumPy
o NumPy overview
o Ndarray
o Vectorized/Element-wise Operations
o Subscripting and Slicing
o Intro to Matrix Multiplication and Inversion
o Random number generation
o Case study
● Data Manipulations with Pandas
o Series and common operations
o Data frames and common operations
o Data manipulation
o Time series
● Data Visualization in the NumPy Stack
o Matplotlib
o Seaborn
o Case study
● SciPy and Data Analysis Roundup
o Introduction to SciPy
o Case study
Project 2: Data Analytics with Web Scraping
Machine Learning I
● Simple Linear Regression
o Overview of Machine Learning
o Residuals, RSS, and the Coefficient of Determination
o Assumptions of Simple Linear Regression
o Model coefficients
§ Interpretation
§ Standard Errors
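The residuals, RSS, and coefficient of determination named in the topics above can be sketched in a few lines of plain Python. This is a minimal illustration using the standard least-squares formulas, not the course's own material; the function name and the sample data are hypothetical.

```python
def fit_simple_linear_regression(x, y):
    """Least-squares fit of y = intercept + slope * x for paired lists x and y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Slope is the ratio of the x-y co-variation to the variation in x.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residuals are observed minus fitted values; RSS is their sum of squares.
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    rss = sum(r ** 2 for r in residuals)

    # R^2 compares RSS with the total sum of squares around the mean of y.
    tss = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1 - rss / tss
    return slope, intercept, rss, r_squared


# Hypothetical sample data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 5.0, 8.0]
slope, intercept, rss, r_squared = fit_simple_linear_regression(x, y)
```

An R² close to 1 means the fitted line accounts for most of the variation in `y`.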
● Multiple Linear Regression
o Assumptions of Multiple (General) Linear Regression
o Coefficients of Continuous Features
o Multicollinearity and Overfitting
o Dummifying Categorical features
§ Coefficient Interpretation
o Case Study
● Penalized Linear Regression
o The Bias-Variance tradeoff
o Euclidean (L2) vs Manhattan (L1) Distances
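To make the Euclidean (L2) versus Manhattan (L1) comparison concrete, here is a small sketch in plain Python. In penalized regression, the L1 norm of the coefficient vector is the lasso penalty and the squared L2 norm is the ridge penalty. The function names and the sample vector are hypothetical.

```python
import math


def manhattan_distance(a, b):
    """L1 distance: sum of absolute coordinate differences."""
    return sum(abs(ai - bi) for ai, bi in zip(a, b))


def euclidean_distance(a, b):
    """L2 distance: square root of the sum of squared coordinate differences."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


# Hypothetical coefficient vector measured from the origin.
origin = [0.0, 0.0]
coefficients = [3.0, -4.0]

l1 = manhattan_distance(coefficients, origin)  # |3| + |-4|
l2 = euclidean_distance(coefficients, origin)  # sqrt(3^2 + 4^2)
```

Here `l1` is 7.0 and `l2` is 5.0, so the same vector is measured differently under each norm.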
Machine Learning II
● Tree-Based Models
o Decision Trees
o Bagging Trees
o Random Forests
o Case study
● Random Forests
o First step to increase stability
o OOB Errors and assessment
o Comparison to the Decision Tree
● Gradient Boosting
o Big picture behind Gradient Boosting
o Gradient Boosting with Trees
o Case study
● Support Vector Machines
o Maximum Margin Classifiers
o Support Vector Classifiers
o Support Vector Regressors
o Bias/Variance with SVM
o Case study
● Supervised Learning Roundup
o Regressors
o Classifiers
o Scalability and further topics
• Unsupervised 1: Clustering
o KMeans
o Hierarchical models
o Case study
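As a rough sketch of the KMeans topic above, the snippet below implements the usual assign-then-update loop (Lloyd's algorithm) in plain Python. It is a minimal illustration, not the course's implementation; the function name, the fixed starting centroids, and the sample points are all hypothetical.

```python
def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm on tuples of floats, starting from the given centroids."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid (squared distance).
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [
                sum((pi - ci) ** 2 for pi, ci in zip(p, c)) for c in centroids
            ]
            clusters[distances.index(min(distances))].append(p)

        # Update step: move each centroid to the mean of its assigned points.
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                new_centroids.append(
                    tuple(sum(dim) / len(cluster) for dim in zip(*cluster))
                )
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid in place

        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, clusters


# Hypothetical 2-D points and starting centroids, for illustration only.
points = [(1.0, 1.0), (1.0, 2.0), (9.0, 9.0), (10.0, 8.0)]
centroids, clusters = kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
```

In practice the starting centroids are usually chosen at random or with a seeding heuristic. They are fixed here so the result is reproducible.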
• Unsupervised 2: Matrix Factorization
o Principal component analysis
o The other LDA: Latent Dirichlet Allocation
o Case studies
• Machine Learning Roundup
o Shoutout to those forgotten
o Working together: where unsupervised and supervised cooperate
o Case study
• Machine Learning Project
Project 3: Machine learning on housing price