ClassTest1 DeepLearning


What does BERT stand for?

A) Basic Encoder for Robust Transformers

B) Bidirectional Encoder Representations from Transformers

C) Binary Encoded Recursive Transformers

D) Balanced Embedding Representation Technology


Which of the following is NOT a component of the transformer architecture?

A) Multi-head attention

B) Feed-forward neural networks

C) Positional encoding

D) Convolutional layers
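
For reference: a standard transformer encoder layer consists of multi-head self-attention and a position-wise feed-forward network (with residual connections and layer normalization), and positional encoding is added to the input embeddings; convolutional layers are not part of the original architecture. A quick way to inspect this, assuming PyTorch is installed, is to print a stock encoder layer:

    import torch.nn as nn

    # Instantiate one standard encoder layer and list its sub-modules.
    layer = nn.TransformerEncoderLayer(d_model=512, nhead=8)
    print(layer)
    # The printout shows self_attn (MultiheadAttention), linear1/linear2
    # (the feed-forward block), LayerNorm and Dropout modules, and no
    # convolutional layers.
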
What is the primary advantage of the transformer architecture over RNNs?

A) Lower computational complexity

B) Ability to handle variable-length sequences

C) Parallel processing of input sequences

D) Smaller model size


What pre-training task does BERT use to learn bidirectional context?

A) Next Sentence Prediction

B) Masked Language Modeling

C) Machine Translation

D) Both A and B
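
For intuition, below is a simplified sketch of the Masked Language Modeling corruption step. It is only illustrative: the 15% masking rate follows the original BERT paper, but real BERT replaces only 80% of the selected tokens with [MASK] (keeping 10% unchanged and swapping 10% for random tokens) and is additionally pre-trained with Next Sentence Prediction.

    import random

    def mask_tokens(tokens, mask_prob=0.15, mask_token="[MASK]"):
        """Replace ~15% of tokens with [MASK]; the model must predict the
        originals from both left and right context, which is what forces
        a bidirectional representation."""
        corrupted, labels = [], []
        for tok in tokens:
            if random.random() < mask_prob:
                corrupted.append(mask_token)
                labels.append(tok)      # this position is scored by the loss
            else:
                corrupted.append(tok)
                labels.append(None)     # this position is ignored by the loss
        return corrupted, labels

    random.seed(0)
    print(mask_tokens("the quick brown fox jumps over the lazy dog".split()))
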
What is the purpose of the [CLS] token in BERT?

A) To mark the end of a sentence

B) To represent the entire sequence for classification tasks

C) To separate two sentences in the input

D) To mask random words in the input
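
For reference, a minimal sketch of how the [CLS] representation is typically read out, assuming the Hugging Face transformers and torch packages are installed (the model name bert-base-uncased is just an example):

    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModel.from_pretrained("bert-base-uncased")

    inputs = tokenizer("Transformers handle long-range context well.",
                       return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # [CLS] is always the first token; its final hidden state is commonly
    # used as a summary vector of the whole sequence for classification heads.
    cls_vector = outputs.last_hidden_state[:, 0]
    print(cls_vector.shape)   # torch.Size([1, 768]) for bert-base
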


How do transformer models handle out-of-vocabulary words?

A) Ignore them

B) Use sub-word tokenization

C) Assign them a random embedding

D) Assign embedding of the closest word from vocabulary
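
For intuition, here is a simplified greedy longest-match-first splitter in the spirit of WordPiece; the tiny vocabulary is made up purely for illustration:

    def wordpiece_split(word, vocab):
        """Split an unseen word into known sub-word pieces instead of
        mapping the whole word to a single [UNK] token."""
        pieces, start = [], 0
        while start < len(word):
            end = len(word)
            while end > start:
                piece = word[start:end] if start == 0 else "##" + word[start:end]
                if piece in vocab:
                    pieces.append(piece)
                    break
                end -= 1
            else:
                return ["[UNK]"]        # no piece matched at this position
            start = end
        return pieces

    vocab = {"trans", "##form", "##er", "##s", "play", "##ing"}
    print(wordpiece_split("transformers", vocab))  # ['trans', '##form', '##er', '##s']
    print(wordpiece_split("playing", vocab))       # ['play', '##ing']
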


What is the key difference between BERT and GPT models?

A) BERT uses encoders only while GPT uses decoders only

B) BERT is bidirectional while GPT is unidirectional

C) BERT is for classification tasks only, while GPT is for generation

D) Both A and B

What is the primary purpose of self-attention in transformer models?

A) To reduce the model size

B) To speed up training

C) To eliminate the need for positional encoding

D) To capture dependencies between different positions in a sequence
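
For reference, a minimal NumPy sketch of single-head self-attention: every position attends to every other position, and the softmax weights express how strongly pairs of positions depend on each other. The weight matrices here are random placeholders.

    import numpy as np

    def self_attention(X, Wq, Wk, Wv):
        """Single-head self-attention over a sequence of embeddings X."""
        Q, K, V = X @ Wq, X @ Wk, X @ Wv
        d_k = Q.shape[-1]
        scores = Q @ K.T / np.sqrt(d_k)             # pairwise position scores
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
        return weights @ V                           # context-mixed outputs

    rng = np.random.default_rng(0)
    X = rng.normal(size=(5, 16))                     # 5 positions, 16-dim embeddings
    Wq, Wk, Wv = (rng.normal(size=(16, 16)) for _ in range(3))
    print(self_attention(X, Wq, Wk, Wv).shape)       # (5, 16)
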
What is the purpose of the scaling factor in scaled dot-product attention?

A) To normalize the input

B) To prevent vanishing gradients

C) To stabilize the gradients, especially for large-dimension inputs

D) To increase the model's capacity
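
For intuition on option C: with d_k-dimensional unit-variance queries and keys, the raw dot products have variance d_k, so without the 1/sqrt(d_k) factor the softmax saturates and its gradients become tiny. A quick NumPy illustration with random vectors:

    import numpy as np

    rng = np.random.default_rng(0)
    d_k = 512
    q = rng.normal(size=d_k)            # one query
    K = rng.normal(size=(8, d_k))       # eight keys

    softmax = lambda s: np.exp(s - s.max()) / np.exp(s - s.max()).sum()

    raw = K @ q                         # std grows like sqrt(d_k), ~22 here
    scaled = raw / np.sqrt(d_k)         # std back near 1

    print(softmax(raw).round(3))        # typically close to one-hot
    print(softmax(scaled).round(3))     # smoother, gradients stay usable
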


What is the purpose of positional encoding in transformer models?

A) To add information about the order of the sequence

B) To increase the model's vocabulary

C) To reduce computational complexity

D) To enable multi-head attention
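
For reference, the sinusoidal positional encoding from the original transformer paper, sketched in NumPy (assumes an even d_model):

    import numpy as np

    def positional_encoding(seq_len, d_model):
        """PE[pos, 2i]   = sin(pos / 10000**(2i/d_model))
        PE[pos, 2i+1] = cos(pos / 10000**(2i/d_model))
        Added to the token embeddings so the otherwise order-agnostic
        attention layers can tell positions apart."""
        pos = np.arange(seq_len)[:, None]               # (seq_len, 1)
        i = np.arange(0, d_model, 2)[None, :]           # even dimensions
        angles = pos / np.power(10000.0, i / d_model)
        pe = np.zeros((seq_len, d_model))
        pe[:, 0::2] = np.sin(angles)
        pe[:, 1::2] = np.cos(angles)
        return pe

    print(positional_encoding(seq_len=50, d_model=16).shape)   # (50, 16)
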
