Exam 1

1. Which technique converts categorical variables into a form that can be provided to ML models?
A) Normalization
B) One-hot encoding
C) PCA
D) Correlation analysis

2. What is the first step typically taken when beginning to clean a new dataset?
A) Apply normalization
B) Remove duplicate entries
C) Conduct an initial data assessment
D) Encode categorical variables

3. Why might you use target encoding for categorical variables?
A) To preserve the ordinal relationship between categories
B) To reduce the computational cost of the model
C) To incorporate the mean of the target variable into the feature
D) To increase the number of features

4. Which method is suggested for handling missing data in a dataset?
A) Deleting all rows with any missing data
B) Filling missing values with the mean or median
C) Using a predictive model to fill in missing values
D) Ignoring the presence of missing data during model training

5. What is the benefit of synthesizing date-related features from timestamp data?
A) Reduces the overall dataset size
B) Helps in identifying trends over time
C) Simplifies the model’s architecture
D) Reduces training time significantly

6. What role does one-hot encoding play in feature preparation?
A) It scales the data to a fixed range
B) It transforms categorical data into binary variables
C) It combines multiple features into one
D) It reduces the number of categories

7. How does injecting external information into a dataset typically affect model performance?
A) Decreases accuracy due to increased complexity
B) No impact, as external data is usually irrelevant
C) Can increase accuracy by adding relevant contextual details
D) Always leads to overfitting

8. What is a common approach to establishing initial performance on a new dataset?
A) Building a complex model
B) Setting up a baseline model
C) Performing deep learning techniques
D) Implementing reinforcement learning

9. How does handling ordinal categorical data differ from handling nominal data?
A) It is encoded using binary encoding
B) It requires the categories to maintain an order
C) It is always treated as continuous data
D) It does not require encoding

10. What is an effective strategy to handle large datasets with many missing values?
A) Use only complete cases for analysis
B) Create synthetic data to fill gaps
C) Apply a robust scaler
D) Use imputation techniques based on the data distribution

11. What impact does feature engineering have on machine learning models?
A) It generally decreases model accuracy
B) It has no impact on model performance
C) It can significantly improve model performance
D) It increases the computational load without benefits

12. What does synthesizing numeric features from categorical data involve?
A) Directly mapping numbers to categories based on their frequency
B) Extracting numerical values from categorical descriptions
C) Assigning random numbers to categories
D) Generating statistical summaries for each category

13. Which encoding technique might lead to a dimensionality explosion in large datasets?
A) Binary encoding
B) Label encoding
C) One-hot encoding
D) Hashing

14. Why might log transformation be used on a continuous variable?
A) To normalize the data
B) To encode categorical attributes
C) To handle underflows in computations
D) To correct multicollinearity

15. What is the primary goal of creating a baseline model?
A) To win machine learning competitions
B) To serve as a reference point for model improvement
C) To deploy the model into production
D) To test different types of neural networks

16. What does splitting apart complex descriptions into usable features typically involve?
A) Separating text into different data types
B) Extracting key phrases using NLP techniques
C) Creating features based on the length of descriptions
D) Parsing structured data from unstructured text

17. How does feature engineering affect the training time of a model?
A) Increases significantly due to more data processing
B) Decreases as models become more efficient
C) No change, as models are independent of features
D) Varies depending on the type of features created

18. Which technique is used to prevent model overfitting?
A) Increasing the number of features
B) Using simpler baseline models
C) Applying regularization methods
D) Enhancing model training time

Short Answer Questions


1. Explain the concept of a 'baseline model' and its importance in the model development process.

2. Describe how external neighborhood information can be used in feature engineering to enhance a model's predictive accuracy.

Multiple-Choice Answers:

1. B) One-hot encoding
2. C) Conduct an initial data assessment
3. C) To incorporate the mean of the target variable into the feature
4. B) Filling missing values with the mean or median
5. B) Helps in identifying trends over time
6. B) It transforms categorical data into binary variables
7. C) Can increase accuracy by adding relevant contextual details
8. B) Setting up a baseline model
9. B) It requires the categories to maintain an order
10. D) Use imputation techniques based on the data distribution
11. C) It can significantly improve model performance
12. B) Extracting numerical values from categorical descriptions
13. C) One-hot encoding
14. A) To normalize the data
15. B) To serve as a reference point for model improvement
16. D) Parsing structured data from unstructured text
17. A) Increases significantly due to more data processing
18. C) Applying regularization methods

Answers to Short Answer Questions

1. Concept of a 'Baseline Model':

A baseline model is a simple model set up at the beginning of the machine learning workflow to serve as a reference point for all subsequent modeling efforts. Its primary purpose is to provide a benchmark against which the performance of more complex models can be compared. By establishing a baseline, data scientists can measure the incremental value added by more sophisticated algorithms and feature engineering, ensuring that any increase in model complexity is justified by a substantial improvement in performance.
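
As a minimal illustration (assuming scikit-learn is available; the synthetic data and the choice of mean absolute error are stand-ins for the example, not part of the exam), a mean-predicting baseline can be set up like this:

    # Python: a simple mean-predicting baseline with scikit-learn
    from sklearn.datasets import make_regression
    from sklearn.dummy import DummyRegressor
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import mean_absolute_error

    # Synthetic stand-in data; in practice X and y come from the cleaned dataset.
    X, y = make_regression(n_samples=500, n_features=8, noise=10.0, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # A baseline that always predicts the training-set mean of the target.
    baseline = DummyRegressor(strategy="mean").fit(X_train, y_train)
    print("Baseline MAE:", mean_absolute_error(y_test, baseline.predict(X_test)))

Any candidate model should beat this score before its extra complexity is considered worthwhile.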

2. Using External Neighborhood Information in Feature Engineering:

External neighborhood information can significantly enhance a model’s predictive
accuracy by adding context that is not available from the internal dataset alone.
For instance, in real estate pricing models, incorporating neighborhood crime
rates, school district quality, and public transportation accessibility can provide
more accurate predictions of housing prices. This additional information helps the
model capture variations in prices due to external factors, leading to a more
nuanced understanding of the data. Feature engineering with external data involves
creating new features or modifying existing ones based on this external
information, which can be integrated during the data preprocessing stage.
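
A minimal sketch of this kind of join (the tables, column names, and values below are hypothetical, chosen only to illustrate the pattern):

    # Python: merging external neighborhood statistics into listing-level features
    import pandas as pd

    listings = pd.DataFrame({
        "listing_id": [1, 2, 3],
        "neighborhood": ["Elm Park", "Riverside", "Elm Park"],
        "sqft": [850, 1200, 990],
    })
    external = pd.DataFrame({
        "neighborhood": ["Elm Park", "Riverside"],
        "crime_rate_per_1k": [12.4, 5.1],
        "school_rating": [6.5, 8.9],
    })

    # Each listing inherits the statistics of its neighborhood as new features.
    features = listings.merge(external, on="neighborhood", how="left")
    print(features)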
