ML Lecture
Satellite Data
Sean Foley
What is Machine Learning?
Linear Algebra
● The language used by the other fields
● Vectors, matrices, spaces
● Numerical linear algebra: how to go fast!
● Leverage hardware
Optimization
● Make number go up!
● Iterative techniques
● Gradient-based methods
Signal Processing
● Signal vs. noise
● Mutual information / entropy
● Bits!
Data Science
● Garbage in, garbage out
● 90% of the work is getting good data
● The most important part (in my opinion)
“The unreasonable effectiveness of data” - Peter Norvig
High-Dimensional Probability & Stats.
● If I throw one trillion twenty-sided dice…
● Surprisingly geometrical
● Some intuitive concepts break down in higher dimensions!
Supervised Learning
● Decision Trees
● Random Forests
● Support Vector Machines (SVM)
[Diagram labels: Inputs, Model, Predictions, Targets / Labels, Loss Function]
Unsupervised Learning
● K-Means Clustering
● Principal Component Analysis (PCA)
● Gaussian Mixture Models
[Diagram label: Data]
Reinforcement Learning
[Diagram label: Agent]
Supervised Learning
[Diagram labels: Inputs, Model, Predictions, Targets / Labels, Loss Function]
Regress chlorophyll-a
[Figure labels: Loss, x]
Slide credit: Patrick Gray
Classify phytoplankton types
[Figure labels: chlorophyll-a, phycoerythrin, Synechococcus, Prochlorococcus]
Uh oh!!
Activation Functions
● Non-linear
● Historically:
○ Sigmoid
○ Tanh
● Modern:
○ ReLU
○ Leaky ReLU
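A minimal sketch of the four activation functions listed above, written for scalar inputs with only the standard library. The leaky-ReLU slope of 0.01 is an assumed default, not a value given on the slide.

```python
import math

def sigmoid(x: float) -> float:
    """Squash x into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x: float) -> float:
    """Squash x into the open interval (-1, 1)."""
    return math.tanh(x)

def relu(x: float) -> float:
    """Pass positive inputs through, clamp negative inputs to zero."""
    return max(0.0, x)

def leaky_relu(x: float, slope: float = 0.01) -> float:
    """Like ReLU, but negative inputs keep a small slope (assumed 0.01)."""
    return x if x > 0 else slope * x

# Compare the four functions at a few sample points.
for x in (-2.0, 0.0, 2.0):
    print(x, sigmoid(x), tanh(x), relu(x), leaky_relu(x))
```

All four are non-linear, which is the property the slide asks for; ReLU and leaky ReLU are piecewise linear but not linear overall.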
Et voilà…
Loss Function
● Differentiable
● Aligns with evaluation metrics
● e.g.
○ Mean Squared Error (L2 loss): for regression
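As a sketch of what "differentiable" buys us, here is mean squared error together with its gradient with respect to each prediction. The function names `mse` and `mse_grad` are illustrative, not from the lecture.

```python
def mse(predictions, targets):
    """Mean squared error: average of (prediction - target)^2."""
    n = len(targets)
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / n

def mse_grad(predictions, targets):
    """Gradient of the MSE with respect to each prediction: 2 * (p - t) / n."""
    n = len(targets)
    return [2.0 * (p - t) / n for p, t in zip(predictions, targets)]

predictions = [1.0, 2.0, 3.0]
targets = [1.0, 2.0, 5.0]
print(mse(predictions, targets))       # only the last prediction contributes
print(mse_grad(predictions, targets))  # zero wherever the prediction is exact
```

The gradient points in the direction that increases the loss, so an optimizer moves the predictions (through the model parameters) the opposite way.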
Gradient Descent
[Figure axes: W1, W2]
● Step size
● Stochastic Gradient Descent (SGD)
● Other Gradient-based methods
○ Adam: most commonly-used
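A minimal sketch of mini-batch stochastic gradient descent, assuming a one-feature linear model `y = w * x + b` and a mean-squared-error loss. The function name, the synthetic data, and the default step size, batch size, and epoch count are all illustrative choices. Adam is not implemented here; it builds on the same loop by adapting the step per parameter.

```python
import random

def sgd_linear_fit(xs, ys, step_size=0.05, epochs=200, batch_size=4, seed=0):
    """Fit y = w * x + b by mini-batch SGD on the mean squared error."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # "stochastic": visit the data in random order
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            grad_w = 0.0
            grad_b = 0.0
            for i in batch:
                err = (w * xs[i] + b) - ys[i]
                grad_w += 2.0 * err * xs[i] / len(batch)
                grad_b += 2.0 * err / len(batch)
            # Step against the gradient, scaled by the step size.
            w -= step_size * grad_w
            b -= step_size * grad_b
    return w, b

# Synthetic, noise-free data generated from w = 3, b = 1.
xs = [i / 10.0 for i in range(20)]
ys = [3.0 * x + 1.0 for x in xs]
w, b = sgd_linear_fit(xs, ys)
print(w, b)  # should land close to 3 and 1
```

Too large a step size makes the updates diverge and too small a one makes them crawl, which is why step size leads the list above.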
Evaluation Metrics (Classification)
Evaluation Metrics (Regression)
● Generalizability!
● Train: fit parameters
● Validation: choose hyperparameters
● Testing: tune nothing!
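One way to produce the three sets is a shuffled split, sketched below. The 70/15/15 proportions and the function name are assumptions for illustration; the slide does not give proportions.

```python
import random

def train_val_test_split(samples, val_fraction=0.15, test_fraction=0.15, seed=0):
    """Shuffle once, then cut into disjoint train / validation / test lists."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    n_val = round(len(shuffled) * val_fraction)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test

train, val, test = train_val_test_split(range(100))
print(len(train), len(val), len(test))
```

The test list is set aside until the very end: parameters come from `train`, hyperparameter choices from `val`, and nothing is tuned against `test`.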
Training / validation script overview
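A hedged outline of what such a script typically does, not the lecture's actual script: each epoch updates parameters on the training set only, then measures the loss on the validation set without updating anything, and remembers the best validation result. The linear model, the synthetic data, and all names are assumptions for illustration.

```python
def mean_squared_error(w, b, data):
    """MSE of the model y = w * x + b over (x, y) pairs."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

def train_and_validate(train, val, step_size=0.1, epochs=500):
    w, b = 0.0, 0.0
    best = {"epoch": -1, "val_loss": float("inf"), "w": w, "b": b}
    history = []
    for epoch in range(epochs):
        # Training pass: parameters are updated here, and only here.
        for x, y in train:
            err = (w * x + b) - y
            w -= step_size * 2.0 * err * x
            b -= step_size * 2.0 * err
        # Validation pass: measure, never update.
        train_loss = mean_squared_error(w, b, train)
        val_loss = mean_squared_error(w, b, val)
        history.append((train_loss, val_loss))
        if val_loss < best["val_loss"]:
            best = {"epoch": epoch, "val_loss": val_loss, "w": w, "b": b}
    return best, history

# Synthetic data from y = 2x + 0.5; validation x-values differ from training ones.
train = [(x / 10.0, 2.0 * (x / 10.0) + 0.5) for x in range(0, 11, 2)]
val = [(x / 10.0, 2.0 * (x / 10.0) + 0.5) for x in range(1, 10, 2)]
best, history = train_and_validate(train, val)
print(best["epoch"], best["val_loss"])
```

Keeping the parameters from the best validation epoch, rather than the last epoch, is one common way to use the validation set.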
Over/under fitting, revisited
[Figure labels: underfitting, overfitting]
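To make the two extremes concrete, here is a small sketch that uses k-nearest-neighbour regression as a stand-in model (a technique chosen only for this illustration; it is not from the lecture). With `k = 1` the model memorises the training set, and with `k` equal to the training-set size it predicts the same mean everywhere. The noisy sine data are synthetic.

```python
import math
import random

def knn_predict(train, x, k):
    """Average the targets of the k training points nearest to x."""
    nearest = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    return sum(y for _, y in nearest) / k

def knn_mse(train, data, k):
    """Mean squared error of the k-NN model on the given (x, y) pairs."""
    return sum((knn_predict(train, x, k) - y) ** 2 for x, y in data) / len(data)

def sample(n, rng):
    """Draw n noisy observations of sin(x) on [0, 6]."""
    points = []
    for _ in range(n):
        x = rng.uniform(0.0, 6.0)
        points.append((x, math.sin(x) + rng.gauss(0.0, 0.3)))
    return points

rng = random.Random(0)
train = sample(40, rng)
val = sample(40, rng)
for k in (1, 5, 40):
    print(k, knn_mse(train, train, k), knn_mse(train, val, k))
```

The `k = 1` row has zero training error because every point is its own nearest neighbour, so the training loss alone cannot reveal overfitting; the validation column is what to compare across rows.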
Graphics Processing Units (GPUs)
High-level code
CUDA
Spatial context!
Translational symmetry!
Convolution
flip g!
2D discrete convolution
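A minimal sketch of 2D discrete convolution in "valid" mode (no padding), following the "flip g!" note: the kernel `g` is flipped along both axes before it slides over the image `f`. The function name and the list-of-lists representation are illustrative choices.

```python
def conv2d_valid(f, g):
    """2D discrete convolution of image f with kernel g, without padding."""
    kh, kw = len(g), len(g[0])
    # Flip g along both axes; skipping this step gives cross-correlation instead.
    flipped = [row[::-1] for row in g[::-1]]
    out_h = len(f) - kh + 1
    out_w = len(f[0]) - kw + 1
    out = [[0] * out_w for _ in range(out_h)]
    for i in range(out_h):
        for j in range(out_w):
            out[i][j] = sum(
                f[i + m][j + n] * flipped[m][n]
                for m in range(kh)
                for n in range(kw)
            )
    return out

f = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
g = [[1, 0],
     [0, 0]]
print(conv2d_valid(f, g))  # → [[5, 6], [8, 9]]
```

Without the flip, the same kernel would pick out the top-left pixel of each window and return `[[1, 2], [4, 5]]`. Convolutional layers in deep learning libraries commonly skip the flip, which makes no practical difference when the kernel weights are learned.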
Image Cubes
[Figure: RGB and OCI image cubes with Height and Width axes; other labels: 80, ~2]
A Convolutional Layer: Basics
● bias
● activation function
● differentiable
# parameters:
https://adamharley.com/nn_vis/cnn/2d.html
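The formula after "# parameters:" did not survive in the text, so the sketch below uses the standard count for a 2D convolutional layer as an assumption: one `kernel_h × kernel_w × in_channels` weight block per output channel, plus one bias per output channel. The function name and the example sizes are illustrative.

```python
def conv2d_param_count(kernel_h, kernel_w, in_channels, out_channels, bias=True):
    """Number of learnable parameters in a standard 2D convolutional layer."""
    weights = kernel_h * kernel_w * in_channels * out_channels
    biases = out_channels if bias else 0
    return weights + biases

# A 3x3 kernel over a 3-channel (RGB) input producing 16 feature maps.
print(conv2d_param_count(3, 3, 3, 16))  # → 448
```

The count does not depend on the image height or width, because the same kernel is reused at every spatial position.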
Data Augmentation
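The slide text does not list specific augmentations, so the sketch below assumes three common geometric ones: horizontal flip, vertical flip, and 90-degree rotation, applied to a single-channel image stored as a list of rows. All function names are illustrative.

```python
import random

def flip_horizontal(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]

def flip_vertical(image):
    """Mirror the image top to bottom."""
    return [row[:] for row in image[::-1]]

def rotate90(image):
    """Rotate the image 90 degrees clockwise."""
    return [list(col) for col in zip(*image[::-1])]

def random_augment(image, rng):
    """Apply a random combination of flips and quarter-turn rotations."""
    if rng.random() < 0.5:
        image = flip_horizontal(image)
    if rng.random() < 0.5:
        image = flip_vertical(image)
    for _ in range(rng.randrange(4)):
        image = rotate90(image)
    return image

tile = [[1, 2],
        [3, 4]]
rng = random.Random(0)
augmented = [random_augment(tile, rng) for _ in range(3)]
for image in augmented:
    print(image)
```

Each augmented tile contains the same pixel values in a new arrangement, so the training set grows without any new data being collected.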
CNNs + Satellite Data: Caveats