Machine Learning - Till Chapter5

The document provides an overview of data analytics and machine learning, emphasizing the distinction between the two fields and their applications in optimization and online advertising. It details various predictive models, optimization techniques, and the structure and training of neural networks, particularly in relation to tabular data. Additionally, it outlines exam expectations, study tips, and key concepts necessary for understanding supervised learning and optimization in machine learning.


1. Introduction to Data Analytics and Machine Learning


• Data Analytics: Using data to build models for better decision-making.
o Descriptive: Summarizing patterns (e.g., visualization, clustering).
o Predictive: Forecasting outcomes (e.g., regression, classification).
o Generative: Modeling distributions (e.g., text/image generation).
o Prescriptive: Data-driven optimization and decision-making.
• Machine Learning vs. Data Analytics:
o ML focuses on patterns and predictions.
o Data analytics emphasizes decisions and value creation.

2. Optimization in Machine Learning


• Optimization plays a foundational role in analytics.
• Used in:
o Training models (gradient-based optimization).
o Decision-focused learning (end-to-end optimization).
o Reinforcement learning (dynamic decision-making).

3. Online Advertising and Ad Auctions


• Google AdWords:
o Revenue model based on Pay-Per-Click (PPC).
o Ads placed using a Generalized Second-Price Auction.
o Quality Score (QS) = Bid × Click-Through Rate (CTR) × 1000.
• CTR (Click-Through Rate):
o Measures probability of a user clicking an ad.
o Varies based on position, keyword, user profile, and device.
• Google's Optimization Problem:
o Maximize revenue while considering:
▪ Advertisers' budgets.
▪ CTR prediction models.
▪ Quality Score for ad ranking.
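A toy sketch of the ranking rule above (advertiser names, bids, and CTRs are made up, and the pricing step is one common form of the generalized second-price rule, not necessarily the exact one used in the lecture):

# Rank ads by score = bid * CTR * 1000 using hypothetical values.
ads = [
    {"advertiser": "A", "bid": 2.00, "ctr": 0.04},
    {"advertiser": "B", "bid": 3.50, "ctr": 0.02},
    {"advertiser": "C", "bid": 1.50, "ctr": 0.05},
]
for ad in ads:
    ad["score"] = ad["bid"] * ad["ctr"] * 1000      # ranking score from the lecture

ranking = sorted(ads, key=lambda a: a["score"], reverse=True)

# Generalized second price: each winner pays just enough (per click) to keep its slot;
# the lowest-ranked ad would pay a reserve price, omitted in this toy example.
for slot, (ad, nxt) in enumerate(zip(ranking, ranking[1:]), start=1):
    price_per_click = nxt["score"] / (ad["ctr"] * 1000)
    print(slot, ad["advertiser"], ad["score"], round(price_per_click, 2))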

4. Predictive Models in Online Advertising


• Supervised Learning Approaches:
o Linear Regression: Baseline approach.
o CART Decision Trees: Simple interpretable models.
o Random Forests & Boosting: High predictive accuracy.
• CTR Prediction Models:
o Factors include ad position, user demographics, past CTRs.
o Used to optimize ad placement and maximize clicks.

5. Key Course Topics for Exam


• Gradient-Based Optimization: Used in deep learning models.
• Deep Learning Architectures:
o Feedforward, CNNs, RNNs, Transformers.
• Generative AI:
o Language models (GPT), diffusion models.
• Survival Analysis:
o Demand prediction using hazard functions.
• Reinforcement Learning:
o Decision-making under uncertainty.

6. Exam Format and Expectations


• Midterm Date: March 20, 2025 (In-class).
• Exam Focus:
o Conceptual Questions (based on lectures and homework).
o Data Analysis Questions (interpreting models, optimization).
• Grading Breakdown:
o Homework (30%)
o Midterm (30%)
o Final Project (30%)
o Discussion Lab (10%).

7. Additional Study Tips


• Review Lecture Notes: Focus on key models and optimization techniques.
• Practice Homework Problems: Many exam questions are similar.
• Understand Google Ad Optimization: Expect questions on auctions and CTR.
• Work with Python & PyTorch: Coding fluency may be tested in data analysis.

Key Points and Review of Lecture Notes: Supervised Machine Learning &
Optimization (Lecture 2)
1. Supervised Machine Learning (ML) Overview
• Supervised Learning: Training a model using labeled data (X, Y) pairs.
• Key Goals:
o Prediction: Estimating Y given X.
o Inference: Understanding relationships between X and Y.
• Common Supervised ML Methods:
o Linear Regression
o Logistic Regression
o Decision Trees (CART)
o Random Forests
o Boosting
o Regularization (Lasso, Ridge)
o Feature Engineering
• Applications:
o Predicting wine quality, loan defaults, click-through rates, sales volume, etc.
4. Optimization in Machine Learning
• Optimization is fundamental to machine learning models.
• Key Optimization Problems:
o Ordinary Least Squares (OLS): Minimize residual sum of squares (RSS).
o Regularized Regression: Minimize RSS with a penalty term (Lasso, Ridge).
o Gradient-Based Optimization: Used in deep learning models.
• Convex Optimization:
o If the loss function is convex, global minimization is guaranteed.
o Example: Least Squares Regression.
o Non-convex problems (e.g., deep learning) require heuristics (e.g., SGD).
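A minimal scikit-learn sketch of the three objectives above, fit on synthetic data (the data and penalty strengths are illustrative only):

import numpy as np
from sklearn.linear_model import LinearRegression, Ridge, Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                              # 200 samples, 5 features (synthetic)
y = X @ np.array([1.0, 0.5, 0.0, -2.0, 0.0]) + rng.normal(scale=0.1, size=200)

ols = LinearRegression().fit(X, y)     # OLS: minimize RSS
ridge = Ridge(alpha=1.0).fit(X, y)     # RSS + alpha * (sum of squared coefficients)
lasso = Lasso(alpha=0.1).fit(X, y)     # RSS/(2n) + alpha * (sum of absolute coefficients)

print(ols.coef_, ridge.coef_, lasso.coef_, sep="\n")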
8. Cross-Validation & Best Practices
• Cross-Validation:
o Split Data into Training & Test sets.
o K-Fold CV: Uses multiple subsets to ensure robustness.
o Helps to prevent overfitting.
• Key Model Selection Criteria:
o Bias-Variance Tradeoff:
▪ High Bias → Model is too simple (underfitting).
▪ High Variance → Model memorizes noise (overfitting).
o Hyperparameter Tuning:
▪ Regularization (L1, L2), Learning Rate, Tree Depth, etc.
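A short K-fold cross-validation sketch in scikit-learn (5 folds; the model and synthetic data are placeholders):

import numpy as np
from sklearn.model_selection import KFold, cross_val_score
from sklearn.linear_model import Ridge

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 10))
y = X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.5, size=300)

cv = KFold(n_splits=5, shuffle=True, random_state=0)       # 5 train/validation splits
scores = cross_val_score(Ridge(alpha=1.0), X, y, cv=cv,
                         scoring="neg_mean_squared_error")
print("mean CV MSE:", -scores.mean())                      # average held-out error across folds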
9. Exam Prep Strategy
• Conceptual Understanding:
✔ Know the difference between Supervised vs. Unsupervised Learning.
✔ Understand Linear Regression, its assumptions, and optimization techniques.
✔ Be familiar with Gradient Descent and why it's useful.
✔ Recognize parametric vs. non-parametric models and their tradeoffs.
✔ Understand the Least Squares Solution in matrix form.
✔ Learn the importance of convexity in optimization.
• Practice Problems:
✔ Work through regression problems (OLS, Ridge, Lasso).
✔ Perform cross-validation and error analysis.
✔ Solve classification tasks (logistic regression, decision trees, Bayes classifier).
✔ Be comfortable with linear algebra concepts (matrices, gradients, convexity).

Key Points and Review of Lecture Notes: Supervised Machine Learning &
Optimization (Lectures 3 & 4)

1. Review of Optimization Concepts & Unconstrained Problems
• Optimization is a fundamental aspect of machine learning, used to minimize loss
functions.
• Unconstrained Optimization: Involves minimizing a function without explicit
constraints.
• First-Order Condition: The optimal solution occurs when the gradient of the function
is zero.
• Convexity: If the function is convex, then any local minimum is also a global minimum.

2. Regularized Loss Function Minimization


• Many ML models are trained by minimizing a loss function that measures prediction
error.
• Regularization helps prevent overfitting by adding a penalty term.

Application to Ames Housing Dataset

• Linear regression used to predict housing prices.


• Training data: 2006-2008 (1936 samples).
• Test data: 2009-2010 (989 samples).
• Loss function: Mean Squared Error (MSE), which is equivalent to Residual Sum of
Squares (RSS).

3. Machine Learning as Loss Function Minimization


• The goal of ML algorithms is to minimize the loss function.
• A loss function measures how well the model’s predictions match actual values.
• Formulation:
o Regression: Least Squares Loss (RSS).
o Classification: Different loss functions used.

4. Gradient Descent for Optimization


• Gradient Descent: Iterative method for minimizing functions.
• Key Idea: Move in the direction of steepest descent (negative gradient).
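A minimal NumPy gradient descent loop applied to least squares (the step size and iteration count are arbitrary illustrative choices):

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=100)

beta = np.zeros(3)
lr = 0.1                                           # learning rate (step size)
for _ in range(500):
    grad = 2 * X.T @ (X @ beta - y) / len(y)       # gradient of the mean squared error
    beta -= lr * grad                              # step in the negative gradient direction
print(beta)                                        # approaches the least squares solution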
Key Points for Exam Preparation - Feedforward Neural Networks (Lecture 5)
1. Overview of the Lecture

• The lecture introduces deep learning for tabular data using feedforward
networks.
• Topics covered:
o Feature engineering in Ames Housing Data.
o Introduction to neural networks and their structure.
o Single hidden layer feedforward networks.
o Backpropagation algorithm for training neural networks.

2. Ames Housing Data & Feature Engineering

• Ames Housing Dataset is used as a real-world dataset for regression tasks.


• Dependent Variable: Log of sale price.
• Independent Variables (~80 features):
o Zoning classification, dwelling type, year built, quality rating, living area, etc.
• Training/Test Split:
o Training Data: 2006-2008 sales (66% of data).
o Test Data: 2009-2010 sales (34% of data).
• Feature Engineering:
o Adding polynomial terms (e.g., 10-degree polynomials).
o Adding time-based trend features.
o Final dataset contains ~500 features.
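A hedged sketch of this kind of feature engineering (the column names follow the Ames dataset's usual naming, but the exact features and degrees used in the lecture are assumptions):

import pandas as pd
from sklearn.preprocessing import PolynomialFeatures

# Assume `df` is the Ames housing dataframe; this tiny frame is only a stand-in.
df = pd.DataFrame({"GrLivArea": [1500.0, 2100.0, 900.0], "YrSold": [2006, 2007, 2008]})

poly = PolynomialFeatures(degree=10, include_bias=False)   # degree-10 polynomial terms
area_poly = poly.fit_transform(df[["GrLivArea"]])          # (in practice the column is scaled first)

df["trend"] = df["YrSold"] - df["YrSold"].min()            # simple time-based trend feature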
3. Introduction to Neural Networks

• Definition: A neural network is a nonlinear statistical model that learns relationships between inputs and outputs.
• Applications:
o Natural Language Processing (NLP)
o Image Recognition
o Speech Recognition
o Autonomous Driving
• Timeline of Deep Learning:
o 1940s: Early neural networks (McCulloch-Pitts model).
o 1980s: Neural networks gain popularity.
o 1990-2010: Decline in favor of tree-based models (Random Forest,
Boosting).
o 2010-present: Resurgence with deep learning.
• Why the resurgence?
o Availability of large datasets.
o Advances in GPU computation.
o Efficient software frameworks.

4. Feedforward Networks & Tabular Data

• Definition: A feedforward neural network (FNN) is a multilayer nonlinear function.


• Structure:
o Input layer: Takes feature values as input.
o Hidden layer(s): Applies weighted transformations and activation functions.
o Output layer: Produces final predictions.
• Tabular Data:
o Conventional ML models (Random Forest, Boosting) often outperform deep
learning.
o Deep learning does not necessarily outperform classical models on
tabular data.
o However, deep learning automatically performs feature engineering.

5. Neural Network Structure

• Single Hidden Layer Network:


o Input Layer: Takes in feature values.
o Hidden Layer: Transforms inputs using weighted sums + activation
functions.
o Output Layer: Final predictions.
• Key Components:
o Nodes: Perform computations.
o Edges: Represent connections with weights.
o Bias Terms: Offsets for flexibility.
o Activation Functions: Introduce nonlinearity.
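A minimal NumPy sketch of one forward pass through such a single-hidden-layer network (the layer sizes and the ReLU activation are illustrative choices):

import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=5)                            # one input with 5 features

W1, b1 = rng.normal(size=(8, 5)), np.zeros(8)     # hidden layer: 8 units (weights + biases)
W2, b2 = rng.normal(size=(1, 8)), np.zeros(1)     # output layer: 1 unit (regression)

h = np.maximum(0, W1 @ x + b1)                    # weighted sum followed by ReLU activation
y_hat = W2 @ h + b2                               # identity activation for a regression output
print(y_hat)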

6. Training a Feedforward Neural Network

• Goal: Learn the best weights and biases to minimize the loss function.
• Steps:
1. Initialize network weights randomly.
2. Compute forward pass to get predictions.
3. Compute loss function (e.g., MSE for regression, cross-entropy for
classification).
4. Use backpropagation to compute gradients.
5. Update weights using Stochastic Gradient Descent (SGD).
6. Repeat until convergence.
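A compact PyTorch sketch of these six steps (the architecture, learning rate, and synthetic data are placeholder choices; the full batch is used here for brevity):

import torch
from torch import nn

torch.manual_seed(0)
X = torch.randn(256, 10)                                    # synthetic features
y = 2 * X[:, :1] - X[:, 1:2] + 0.1 * torch.randn(256, 1)    # synthetic target

model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))  # step 1: random init
loss_fn = nn.MSELoss()                                      # regression loss
opt = torch.optim.SGD(model.parameters(), lr=0.01)

for epoch in range(100):
    opt.zero_grad()
    loss = loss_fn(model(X), y)        # steps 2-3: forward pass and loss
    loss.backward()                    # step 4: backpropagation computes gradients
    opt.step()                         # step 5: SGD weight update (repeat = step 6)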

7. Backpropagation Algorithm

• Backpropagation = Gradient Descent + Chain Rule.


• Steps:
1. Forward Pass: Compute output layer activations.
2. Compute Loss: Compare predictions to ground truth.
3. Backward Pass: Compute partial derivatives of loss with respect to each
weight.
4. Update Weights: Apply gradient descent to minimize loss.
• Key Concepts in Backpropagation:
o Chain Rule: Computes gradients layer by layer.
o Gradient Descent: Updates weights based on gradient.
o Optimization: Uses learning rate to control step size.
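A tiny chain-rule check on a single sigmoid neuron, comparing a hand-derived gradient with PyTorch autograd (all numbers are arbitrary):

import torch

x, t = torch.tensor(1.5), torch.tensor(1.0)       # input and target
w = torch.tensor(0.3, requires_grad=True)
b = torch.tensor(-0.1, requires_grad=True)

y = torch.sigmoid(w * x + b)                       # forward pass
loss = (y - t) ** 2                                # squared-error loss
loss.backward()                                    # backward pass via autograd

manual = 2 * (y - t) * y * (1 - y) * x             # chain rule by hand: dL/dw
print(w.grad.item(), manual.item())                # the two values should agree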
9. Optimization in Neural Networks

• Gradient Descent:
o Updates weights based on loss gradients.
• Stochastic Gradient Descent (SGD):
o Uses mini-batches instead of full dataset for efficiency.
• Regularization:
o L2 Regularization (Weight Decay): Prevents overfitting by penalizing large
weights.
o Dropout: Randomly removes neurons during training to enhance
generalization.
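A short PyTorch sketch of the two regularizers named above: a Dropout layer inside the model and L2 weight decay on the optimizer (the rates are illustrative):

import torch
from torch import nn

model = nn.Sequential(
    nn.Linear(10, 64), nn.ReLU(),
    nn.Dropout(p=0.5),                 # randomly zeroes hidden activations during training
    nn.Linear(64, 1),
)

# weight_decay adds an L2 penalty on the weights to every SGD update
opt = torch.optim.SGD(model.parameters(), lr=0.01, weight_decay=1e-4)

model.train()   # dropout active while training
model.eval()    # dropout disabled at evaluation (PyTorch rescales during training instead)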

10. Summary of Key Takeaways

• Neural networks are nonlinear models useful in various applications.


• Feedforward networks consist of input, hidden, and output layers.
• Activation functions introduce nonlinearity (e.g., ReLU, Sigmoid).
• Training involves: Forward pass, loss computation, backpropagation, weight
updates.
• Backpropagation efficiently computes gradients using the chain rule.
• Loss functions vary based on the problem type (MSE, cross-entropy, etc.).
• Optimization uses gradient descent (SGD) with regularization techniques.
How to Prepare for the Exam

1. Understand the structure of neural networks (input, hidden, output layers).


2. Know the different activation functions and their roles.
3. Be comfortable with backpropagation (forward pass, loss function, gradient
computation).
4. Review feature engineering concepts (why it's needed in neural networks).
5. Understand loss functions (MSE, logistic loss, cross-entropy).
6. Learn optimization techniques (SGD, weight decay, dropout).
7. Be prepared for conceptual questions on why deep learning does/does not work
well for tabular data.
8. Solve numerical problems related to forward and backward propagation.

(c) Output Layer

• Produces the final prediction for the model.


• Output depends on the type of problem:
o Regression: Single output, uses identity activation function.
o Binary classification: Single output with sigmoid activation.
o Multiclass classification: Multiple outputs, uses softmax activation.
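A brief sketch of how these three output choices are typically paired with PyTorch losses (the layer sizes are placeholders):

from torch import nn

regression_head = nn.Linear(32, 1)     # identity output, paired with nn.MSELoss()
binary_head = nn.Linear(32, 1)         # logit output; nn.BCEWithLogitsLoss() applies the sigmoid
multiclass_head = nn.Linear(32, 10)    # logit outputs; nn.CrossEntropyLoss() applies softmax internally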
2. Activation Functions and Their Roles

Activation functions introduce non-linearity into neural networks, allowing them to model
complex relationships.
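For reference, the activation functions most often mentioned in these notes are:

ReLU(x) = max(0, x)
sigmoid(x) = 1 / (1 + e^(−x)), with outputs in (0, 1)
tanh(x) = (e^x − e^(−x)) / (e^x + e^(−x)), with outputs in (−1, 1)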
3. Backpropagation Algorithm

Backpropagation is used to compute gradients efficiently when training a neural network.

Steps in Backpropagation

1. Forward Pass
o Compute predictions using current weights.
o Calculate loss based on predictions and actual values.
2. Compute Loss Function
o Measures how far predictions are from actual values.
o Examples: MSE (for regression), cross-entropy (for classification).
3. Backward Pass (Gradient Computation)
o Compute derivatives of the loss function w.r.t. each weight using the chain
rule.
o Compute gradients layer-by-layer, from output to input.
4. Update Weights
o Apply gradient descent to adjust each weight and reduce the loss, repeating until convergence.

4. Feature Engineering in Neural Networks

Feature engineering is the process of transforming raw data into a format that improves model
performance.

Why is Feature Engineering Needed?

• Neural networks do not always automatically capture complex patterns in tabular data.
• Feature engineering helps in:
o Handling categorical variables (e.g., one-hot encoding).
o Scaling numerical features (e.g., standardization).
o Creating polynomial features for better non-linear relationships.
o Capturing temporal trends (e.g., time-based features in Ames Housing
Data).
Can Feature Engineering Be Automated?

• Deep learning can learn features automatically in domains like image and text
processing.
• However, for tabular data, feature engineering is still often necessary.
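A minimal scikit-learn sketch of these steps (the column names are hypothetical, not the actual Ames columns used in the lecture):

import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

df = pd.DataFrame({"zoning": ["RL", "RM", "RL"],             # categorical (hypothetical)
                   "living_area": [1500.0, 900.0, 2100.0],   # numerical
                   "year_sold": [2006, 2007, 2008]})

pre = ColumnTransformer([
    ("onehot", OneHotEncoder(handle_unknown="ignore"), ["zoning"]),   # one-hot encoding
    ("scale", StandardScaler(), ["living_area", "year_sold"]),        # standardization
])
X = pre.fit_transform(df)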

5. Loss Functions

Loss functions measure how well the neural network’s predictions match the actual values.

(a) Mean Squared Error (MSE)
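The average squared difference between predictions and actual values:

MSE = (1/n) × Σ_{i=1..n} (y_i − ŷ_i)²

Minimizing MSE over the training data is equivalent to minimizing the residual sum of squares (RSS).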


7. Why Deep Learning Does/Does Not Work Well for Tabular Data
(a) Why Deep Learning May NOT Work Well

• Tree-based models (Random Forest, XGBoost) often outperform deep learning for tabular data.
• Tabular data does not have hierarchical structure like images or text.
• Difficult to tune hyperparameters in neural networks.

(b) When Deep Learning Might Work

• If dataset is very large and contains complex interactions.


• If automated feature extraction is required.
Final Review Checklist

✔ Understand neural network structure (input, hidden, output layers).
✔ Know activation functions and their use cases.
✔ Be comfortable with forward and backward propagation.
✔ Understand feature engineering and when it's necessary.
✔ Learn loss functions for regression and classification.
✔ Understand SGD, weight decay, dropout for optimization.
✔ Be able to explain why deep learning works/does not work for tabular data.
✔ Solve numerical problems involving forward and backpropagation.
2. Activation Functions
Question 2: Activation Function Choice

You are training a binary classification model and need to choose an activation function.
(a) Which activation function should be used in the final output layer?
(b) What are two potential problems of using a sigmoid activation in hidden layers?

Solution:

• (a) For binary classification, the sigmoid activation is used in the output layer
because it maps predictions to the range (0,1), making it interpretable as a
probability.
• (b) Two problems of using sigmoid in hidden layers:
1. Vanishing Gradient Problem – Gradients become very small for extreme
values, slowing down training.
2. Outputs are not zero-centered – Sigmoid outputs are always positive,
making gradient updates less efficient.
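The vanishing-gradient point in (b) follows from the sigmoid's derivative,

sigmoid′(x) = sigmoid(x) × (1 − sigmoid(x)) ≤ 0.25,

which is near zero for large |x| and shrinks further as it is multiplied through successive layers by the chain rule.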
5. Conceptual Questions
Question 5: Deep Learning for Tabular Data
(a) Why do tree-based models (Random Forest, XGBoost) often outperform neural
networks on tabular data?
(b) When would deep learning be preferable for tabular data?

Solution:

• (a) Tree-based models perform better because:


1. They naturally handle missing values and categorical variables.
2. They do not require extensive feature scaling.
3. They work well on small-to-medium datasets without requiring large
amounts of data.
4. Feature interactions are automatically captured.
• (b) Deep learning is preferable when:
1. The dataset is very large (millions of records).
2. Feature interactions are too complex for tree-based models.
3. The data contains high-dimensional, continuous variables.

Conceptual Questions for Exam Preparation - Feedforward Neural Networks

1. Understanding Neural Network Structure


Question 1: Why Use Multiple Hidden Layers?
(a) Why might a single-layer perceptron be insufficient for complex problems?
(b) How do additional hidden layers improve a neural network’s performance?

Answer:

• (a) A single-layer perceptron can only learn linear decision boundaries. If the
data is not linearly separable, it will fail to learn meaningful patterns (e.g., XOR
problem).
• (b) Additional hidden layers allow the network to:
1. Capture non-linear relationships.
2. Learn hierarchical features (e.g., in images, edges → shapes → objects).
3. Model complex interactions in high-dimensional data.

2. Activation Functions
Question 2: Why Not Always Use ReLU?
(a) What are the advantages of ReLU over sigmoid and tanh?
(b) What are the potential problems with ReLU, and how can they be addressed?

Answer:

• (a) Advantages of ReLU:


1. Avoids vanishing gradient problem (gradient does not saturate for positive
inputs).
2. Computationally efficient (simple function: max(0, x)).
3. Sparse activations (some neurons output 0, leading to better
generalization).
• (b) Problems with ReLU and solutions:
o Dying ReLU Problem: Neurons can get stuck at zero output if gradients
become too small.
▪ Solution: Use Leaky ReLU (allows small gradients for negative
values) or ELU.
o Exploding activations: Can cause numerical instability.
▪ Solution: Use batch normalization.

3. Loss Functions
Question 3: Why Use Cross-Entropy Loss for Classification?
(a) Why is Mean Squared Error (MSE) a poor choice for classification?
(b) How does cross-entropy loss work in binary and multiclass classification?

Answer:

• (a) Problems with MSE for classification:


1. Slower convergence: It does not push predictions towards extreme values
(0 or 1).
2. Gradient vanishing issue: Sigmoid activation + MSE leads to small
gradients, making learning slow.
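For part (b), the standard forms of the loss are:

Binary: L = −[ y log(ŷ) + (1 − y) log(1 − ŷ) ], with ŷ the sigmoid output
Multiclass: L = −Σ_k y_k log(ŷ_k), with ŷ the softmax output

Cross-entropy heavily penalizes confident wrong predictions and, paired with sigmoid/softmax outputs, yields large gradients when the model is wrong, avoiding the slow-convergence issues noted in (a).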

4. Backpropagation
Question 4: Why Do We Need Backpropagation?
(a) Why can’t we compute weight updates directly like in linear regression?
(b) How does backpropagation efficiently compute gradients?

Answer:

• (a) Direct weight updates don’t work because:


1. The loss function is non-linear due to activation functions.
2. There is no closed-form solution like in linear regression.
3. We need gradient-based optimization to adjust weights iteratively.
• (b) Backpropagation computes gradients efficiently using:
1. Forward pass: Compute activations layer by layer.
2. Backward pass: Compute derivatives layer-by-layer using the chain rule.
3. Weight updates: Use gradient descent to adjust weights.

5. Optimization in Neural Networks


Question 5: Why Is Stochastic Gradient Descent (SGD) Used Instead of
Standard Gradient Descent?
(a) What is the problem with computing gradients on the full dataset?
(b) What are the advantages and trade-offs of SGD?

Answer:

• (a) Problems with full-batch gradient descent:


1. Slow computation on large datasets.
2. Can get stuck in local minima without exploration.
3. Memory inefficient, especially for deep networks.
• (b) Advantages of SGD:
1. Faster updates (computes gradient on small mini-batches).
2. Stochastic nature helps escape local minima.
3. Works well in online learning settings.
• Trade-off: SGD has higher variance in updates, requiring techniques like momentum
or Adam optimizer.
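A short PyTorch sketch of mini-batch SGD with momentum using a DataLoader (the batch size, momentum value, and synthetic data are arbitrary):

import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

X, y = torch.randn(1000, 10), torch.randn(1000, 1)            # placeholder data
loader = DataLoader(TensorDataset(X, y), batch_size=32, shuffle=True)

model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)  # momentum damps noisy updates

for xb, yb in loader:                       # one gradient step per mini-batch
    opt.zero_grad()
    nn.functional.mse_loss(model(xb), yb).backward()
    opt.step()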

6. Why Deep Learning May Not Work Well on Tabular Data


Question 6: Why Are Tree-Based Models Often Better for Tabular Data?
(a) What challenges do neural networks face when working with tabular data?
(b) When might deep learning still be useful?

Answer:

• (a) Challenges of Deep Learning for Tabular Data:


1. Feature interactions – Neural networks struggle with capturing interactions
between categorical and numerical features.
2. Need for large datasets – Tree-based models work well with smaller
datasets.
3. Harder to interpret – Decision trees offer explainability, whereas deep
networks are black-box models.
• (b) When Deep Learning Might Work Well:
1. Very large datasets with millions of observations.
2. Complex, high-dimensional data (e.g., genomic data).
3. Automated feature extraction needed (e.g., learned embeddings for
categorical data).

7. Regularization Techniques
Question 7: Why Do We Use Dropout?
(a) What problem does dropout solve?
(b) How does dropout work during training and testing?

Answer:

• (a) Problem: Overfitting – deep networks memorize training data instead of generalizing.
• (b) How Dropout Works:
o Training: Randomly drop (zero out) neurons with probability p.
o Testing: Use all neurons, scaling activations by the keep probability (1 − p) so expected activations match those seen during training.
8. Understanding Softmax and Probability Outputs
Question 8: Why Is Softmax Used in the Output Layer for Multiclass
Classification?
(a) What does the softmax function do?
(b) How does it differ from sigmoid?
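Answer (standard definitions):

• (a) softmax(z)_k = e^(z_k) / Σ_j e^(z_j) converts a vector of class scores into probabilities that are non-negative and sum to 1 across all classes.
• (b) Sigmoid maps each output to (0, 1) independently, which suits a single binary output, whereas softmax couples the outputs into one probability distribution over mutually exclusive classes.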

9. Weight Initialization
Question 9: Why Not Initialize All Weights to Zero?
(a) What happens if all weights start at zero?
(b) What is a better initialization strategy?

Answer:

• (a) If all weights start at zero:


1. Every neuron in a layer will have the same gradient and learn identically.
2. Symmetry is never broken, so neurons cannot learn distinct, useful features.
• (b) Better initialization strategies:
1. Xavier Initialization (Glorot) – Keeps variance balanced across layers.
2. He Initialization – Preferred for ReLU-based networks.
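A short PyTorch sketch of these two schemes (the layer sizes are placeholders):

import torch
from torch import nn

tanh_layer = nn.Linear(128, 64)
relu_layer = nn.Linear(128, 64)

nn.init.xavier_uniform_(tanh_layer.weight)                       # Xavier/Glorot: keeps variance balanced across layers
nn.init.kaiming_normal_(relu_layer.weight, nonlinearity="relu")  # He initialization: preferred with ReLU
nn.init.zeros_(relu_layer.bias)                                  # biases can start at zero; weights must not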
10. Bias-Variance Tradeoff
Question 10: Why Might a Deep Network Have High Bias or High Variance?
(a) When does a neural network have high bias?
(b) When does it have high variance?
(c) How do we address each issue?

Answer:

• (a) High Bias (Underfitting):


o Network too simple, not enough capacity to learn patterns.
o Fix: Increase hidden layers, units, or use complex activation functions.
• (b) High Variance (Overfitting):
o Network memorizes training data, does not generalize.
o Fix: Use dropout, weight decay (L2 regularization), or early stopping.
