Advanced Machine Learning: Course Overview
This image is very high-dimensional: with 1024*1024*3 = 3,145,728 values, it is a point in a roughly 3-million-dimensional real space
Words in a sentence
A sentence of 25 words over a 50K-word vocabulary is a point in a 25*50K = 1.25-million-dimensional discrete space (one 50K-way one-hot vector per word)
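The two dimension counts above can be checked with a few lines of arithmetic. This is only a sketch: it assumes that "50 K" denotes a vocabulary of 50,000 words and that each word position is one-hot encoded. The variable names are illustrative and not from the slides.

```python
# Dimensionality of the image example: height * width * colour channels.
image_height, image_width, channels = 1024, 1024, 3
image_dims = image_height * image_width * channels

# Dimensionality of the sentence example.
# Assumption: "50 K" is a vocabulary of 50,000 words, and each of the
# 25 word positions is represented by a one-hot vector over that vocabulary.
sentence_length = 25
vocab_size = 50_000
sentence_dims = sentence_length * vocab_size

print(image_dims)     # 3145728
print(sentence_dims)  # 1250000
```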
Different task settings
Given training data D, train a model M that can be used for
● Generation
○ Unconditional: Generate a sample X that is representative of D
○ Conditional: Given an input prompt X, generate a likely sample Y.
● Density estimation:
○ What is the probability that a given sample X comes from the training distribution D?
● Translation
● Text-to-tree generation
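The generation and density-estimation settings listed above can be illustrated with a toy count-based model. This is a minimal sketch under strong assumptions: the training set D, the prompt/response pairs, and all function names are hypothetical, and real models replace the counts with learned parameters.

```python
import random
from collections import Counter, defaultdict

# Hypothetical toy training data D: (prompt X, response Y) pairs.
D = [("hot", "sun"), ("hot", "sun"), ("hot", "fire"), ("cold", "ice")]

# "Model" M: empirical counts collected from D.
x_counts = Counter(x for x, _ in D)
xy_counts = defaultdict(Counter)
for x, y in D:
    xy_counts[x][y] += 1


def density(x):
    """Density estimation: estimated probability P(X = x) under D."""
    return x_counts[x] / len(D)


def generate_unconditional(rng):
    """Unconditional generation: draw X in proportion to its count in D."""
    xs = list(x_counts)
    return rng.choices(xs, weights=[x_counts[x] for x in xs], k=1)[0]


def generate_conditional(x, rng):
    """Conditional generation: draw Y from the estimated P(Y | X = x).

    Only valid for prompts that occur in D; unseen prompts have no counts.
    """
    ys = list(xy_counts[x])
    return rng.choices(ys, weights=[xy_counts[x][y] for y in ys], k=1)[0]


rng = random.Random(0)
print(density("hot"))   # 0.75
print(density("warm"))  # 0.0, since "warm" never appears in D
print(generate_unconditional(rng))
print(generate_conditional("hot", rng))
```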
Translation
Input: x
Predicted sequence: y
• We want the model to output a probability along with each translation, and not just
produce a single translation.
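One common way to attach a probability to a translation is the autoregressive chain rule, P(y | x) = P(y_1 | x) * P(y_2 | y_1, x) * ..., which is an assumption here since the slides do not name a factorization. The sketch below scores one candidate translation from hypothetical per-token probabilities, summing logs to avoid underflow on long sequences.

```python
import math

# Hypothetical per-step probabilities P(y_t | y_<t, x) that some model
# assigned to the three tokens of one candidate translation y.
step_probs = [0.9, 0.8, 0.5]

# Chain rule in log space: log P(y | x) = sum_t log P(y_t | y_<t, x).
log_prob = sum(math.log(p) for p in step_probs)
prob = math.exp(log_prob)

print(round(prob, 4))  # 0.36
```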
Density estimator
Course contents
Inference
● Boolean queries on conditional inference
● Marginalization and maximization (MAP) queries: P(X_i), max_x P(x)
○ Sum-product and max-product inference in graphical models
● Sampling
○ Classical methods of sampling in tractable models: forward sampling, importance-weighted
sampling, Markov chain Monte Carlo (MCMC) sampling
○ Recent methods usable in deep learning: Monte Carlo with Langevin dynamics
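The queries above can be made concrete on a tiny chain-structured Bayesian network. The sketch below is illustrative only: the binary chain X1 -> X2 -> X3 and its probability tables are hypothetical. It computes the marginal P(X3) by sum-product message passing, the value of max_x P(x) by max-product, and a forward-sampling estimate of P(X3 = 1).

```python
import random
from itertools import product

# Hypothetical binary chain X1 -> X2 -> X3.
prior = [0.6, 0.4]                 # P(X1)
trans = [[0.7, 0.3], [0.2, 0.8]]   # trans[a][b] = P(X_{t+1}=b | X_t=a), shared by both edges


def joint(x1, x2, x3):
    """Joint probability of one full assignment."""
    return prior[x1] * trans[x1][x2] * trans[x2][x3]


# Sum-product: pass a message along the chain, summing out one variable per step.
msg = prior
for _ in range(2):
    msg = [sum(msg[a] * trans[a][b] for a in range(2)) for b in range(2)]
marginal_x3 = msg  # P(X3)

# Max-product: same recursion with sum replaced by max.
best = prior
for _ in range(2):
    best = [max(best[a] * trans[a][b] for a in range(2)) for b in range(2)]
map_value = max(best)  # max_x P(x)

# Brute-force check over all 8 assignments (feasible only for tiny models).
brute_map = max(joint(*x) for x in product(range(2), repeat=3))


def forward_sample(rng):
    """Forward sampling: sample each variable given its already-sampled parent."""
    x = rng.choices([0, 1], weights=prior)[0]
    sample = [x]
    for _ in range(2):
        x = rng.choices([0, 1], weights=trans[x])[0]
        sample.append(x)
    return sample


rng = random.Random(0)
n = 20000
estimate = sum(forward_sample(rng)[2] for _ in range(n)) / n

print([round(p, 4) for p in marginal_x3])  # [0.45, 0.55]
print(round(map_value, 4))                 # 0.294
print(estimate)                            # Monte Carlo estimate of P(X3 = 1)
```

The exact methods cost a few multiplications per edge, while the sampling estimate only approaches the exact marginal as the number of samples grows.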
Inference (Continued)
● Inference challenges in modern LLMs (a special case of a Bayesian network)
○ Limitations of greedy decoding
○ Sampling multiple generations
○ Grammar-constrained decoding
○ Speculative decoding
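The limitation of greedy decoding can be seen on a toy two-step model. The sketch below uses hypothetical token probabilities, not any real LLM: picking the most likely token at each step yields a sequence that is less probable overall than the best sequence found by exhaustive search.

```python
# Hypothetical two-step autoregressive model.
first = {"a": 0.6, "b": 0.4}               # P(y1)
second = {                                 # P(y2 | y1)
    "a": {"x": 0.55, "y": 0.45},
    "b": {"x": 0.9, "y": 0.1},
}

# Greedy decoding: take the locally most probable token at every step.
g1 = max(first, key=first.get)
g2 = max(second[g1], key=second[g1].get)
greedy_seq = (g1, g2)
greedy_prob = first[g1] * second[g1][g2]

# Exhaustive search: score every full sequence (feasible only for toy models).
scores = {
    (y1, y2): first[y1] * second[y1][y2]
    for y1 in first
    for y2 in second[y1]
}
best_seq = max(scores, key=scores.get)
best_prob = scores[best_seq]

print(greedy_seq, round(greedy_prob, 4))  # ('a', 'x') 0.33
print(best_seq, round(best_prob, 4))      # ('b', 'x') 0.36
```

Greedy commits to "a" because it is the more likely first token, but every continuation of "a" is weaker than the strong continuation available after "b".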