Gen AI Revision
1. In LangChain, which retriever search type is used to balance between relevancy
and diversity?
Maximal Marginal Relevance (MMR). Plain similarity search is incorrect, as it ranks purely by relevance without promoting diversity among the retrieved documents.
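As a rough illustration, an MMR retriever can be requested from any LangChain vector store via search_type="mmr". The FAISS store, fake embeddings, and sample texts below are placeholders (this assumes the faiss and langchain-community packages, and import paths vary by LangChain version):

    from langchain_community.vectorstores import FAISS
    from langchain_community.embeddings import FakeEmbeddings

    # Toy store; real usage would embed your own documents.
    store = FAISS.from_texts(
        ["OCI offers GPU compute.", "OCI offers GPUs.", "LangChain supports retrievers."],
        FakeEmbeddings(size=128),
    )

    # lambda_mult near 1.0 favours relevance; near 0.0 favours diversity.
    retriever = store.as_retriever(
        search_type="mmr",
        search_kwargs={"k": 2, "fetch_k": 3, "lambda_mult": 0.5},
    )
    print(retriever.invoke("What does OCI offer?"))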
2. What does a dedicated RDMA cluster network do during model fine-tuning and inference?
It provides high-bandwidth, low-latency networking between the GPUs in a dedicated AI cluster, allowing fine-tuning and inference workloads to scale across GPUs with minimal communication overhead.
3. Which role does a "model endpoint" serve in the inference workflow of the OCI
Generative AI service?
It serves as a designated point for submitting user requests and receiving the model's responses.
In summary, the model endpoint is the bridge between the trained model and live production usage, enabling seamless and efficient inference in the OCI Generative AI service.
4. What is a distinguishing feature of "Parameter-Efficient Fine-tuning (PEFT)" as opposed
to the classic "Fine-tuning" in Large Language Model training?
PEFT updates only a small number of existing or newly added parameters and uses labelled, task-specific data.
The key distinguishing feature of PEFT is its focus on updating only a small subset of the
model’s parameters or adding a few new ones, thereby achieving task-specific
adaptation with much lower computational and memory overhead compared to the
classic fine-tuning approach, which updates all the parameters of the model. This makes
PEFT particularly advantageous for adapting large pre-trained models to new tasks in a
resource-efficient manner.
5. How does the Retrieval-Augmented Generation (RAG) Token technique differ from RAG
Sequence when generating a model's response?
RAG Token retrieves relevant documents for each part of the response and constructs the answer incrementally.
6. Which component of Retrieval-Augmented Generation (RAG) evaluates and prioritizes
the information retrieved by the retrieval system?
Ranker
7. Which statement describes the difference between "Top k" and "Top p" in selecting
the next token in the OCI Generative AI Generation models?
Top k selects the next token based on its position in the list of most probable tokens, whereas Top p selects based on the cumulative probability of the top tokens.
In summary, Top k sampling limits the choice to a fixed number of top tokens, while
Top p sampling adapts the number of tokens based on their cumulative probability,
offering a balance between ensuring high-probability selections and allowing for greater
diversity.
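A minimal NumPy sketch of the two strategies over a made-up next-token distribution (for intuition only, not the service's actual decoder):

    import numpy as np

    def top_k_filter(probs, k):
        # Keep only the k most probable tokens, then renormalize.
        idx = np.argsort(probs)[::-1][:k]
        kept = np.zeros_like(probs)
        kept[idx] = probs[idx]
        return kept / kept.sum()

    def top_p_filter(probs, p):
        # Keep the smallest set of top tokens whose cumulative probability reaches p.
        order = np.argsort(probs)[::-1]
        cum = np.cumsum(probs[order])
        n_keep = int(np.searchsorted(cum, p)) + 1
        kept = np.zeros_like(probs)
        kept[order[:n_keep]] = probs[order[:n_keep]]
        return kept / kept.sum()

    probs = np.array([0.5, 0.3, 0.1, 0.07, 0.03])  # hypothetical distribution
    print(top_k_filter(probs, k=2))    # always exactly two candidates
    print(top_p_filter(probs, p=0.75)) # two candidates here: 0.5 + 0.3 >= 0.75

Note how Top k always keeps a fixed count, while Top p keeps however many tokens the cumulative threshold demands.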
8. Which statement is true about the "Top p" parameter of the OCI Generative AI Generation models?
"Top p" limits token selection to the smallest set of most probable tokens whose cumulative probability reaches the threshold p.
9. What is the primary function of the "temperature" parameter in the OCI Generative AI
Generation models?
It controls the randomness of the model's output, thereby affecting its creativity.
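Conceptually, temperature divides the logits before the softmax; a small sketch with made-up logits:

    import numpy as np

    def softmax_with_temperature(logits, temperature):
        # Lower temperature sharpens the distribution (near-greedy);
        # higher temperature flattens it (more random, more "creative").
        scaled = np.asarray(logits) / temperature
        exp = np.exp(scaled - scaled.max())  # subtract max for numerical stability
        return exp / exp.sum()

    logits = [2.0, 1.0, 0.5]  # hypothetical next-token logits
    print(softmax_with_temperature(logits, 0.2))  # peaked
    print(softmax_with_temperature(logits, 1.0))  # baseline
    print(softmax_with_temperature(logits, 2.0))  # flatter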
10. What distinguishes the Cohere Embed v3 model from its predecessor in the OCI Generative AI service?
Improved retrieval performance for Retrieval-Augmented Generation (RAG) systems.
11. What is the purpose of the "stop sequence" parameter in the OCI Generative AI
Generation models?
The stop sequence parameter in OCI Generative AI Generation models is used to define a
specific point at which the model should stop generating text. This ensures control over
the length and relevance of the generated output, enhances the usability of the
generated text in various applications, and helps prevent over-generation, making the
outputs more precise and contextually appropriate.
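The effect can be mimicked with a simple truncation function (in the actual service, decoding halts as soon as the sequence is emitted; this sketch only illustrates the outcome, and also previews question 43 below):

    def apply_stop_sequence(text: str, stop: str) -> str:
        # Cut the generated text at the first occurrence of the stop sequence.
        idx = text.find(stop)
        return text if idx == -1 else text[:idx]

    generated = "The sky is blue. The grass is green. The sun is bright."
    print(apply_stop_sequence(generated, "."))  # -> "The sky is blue"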
12. What does a higher number assigned to a token signify in the "Show Likelihoods"
feature of the language model token generation?
For example, if the token "sunny" has the highest likelihood (0.75), the model predicts "sunny" as the most probable next word in the sequence, while a lower-likelihood token such as "stormy" (0.02) is considered far less probable in that context.
Summary
In the "Show Likelihoods" feature, a higher number assigned to a token signifies that the
token has a higher probability of being selected as the next token in the sequence. This
reflects the model's higher confidence that the token is the most appropriate and
contextually relevant choice given the preceding text.
13. Which statement is true about string prompt templates and their capability regarding variables?
A prompt template supports any number of variables, including the possibility of having none.
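For instance, a LangChain template can declare two variables or none at all (the import path may differ across versions):

    from langchain_core.prompts import PromptTemplate

    # Two variables...
    two_vars = PromptTemplate.from_template("Translate the following {language} text: {text}")
    print(two_vars.format(language="French", text="Bonjour"))

    # ...or none: a fixed string is still a valid template.
    no_vars = PromptTemplate.from_template("Tell me a joke.")
    print(no_vars.format())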
14. Which is NOT a built-in memory type in LangChain?
ConversationImageMemory
16. "Given a block of code: qa = Conversational Retrieval Chain. from 11m (11m,
retriever-retv, memory-memory) when does a chain typically interact with memory
during execution?"
After user input but before chain execution, and again after core logic but before output
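An expanded, runnable version of that snippet, using fake stand-ins for the LLM and retriever so no credentials are needed (exact class locations depend on the LangChain version, and ConversationalRetrievalChain is deprecated in newer releases):

    from langchain.chains import ConversationalRetrievalChain
    from langchain.memory import ConversationBufferMemory
    from langchain_community.llms import FakeListLLM
    from langchain_community.vectorstores import FAISS
    from langchain_community.embeddings import FakeEmbeddings

    llm = FakeListLLM(responses=["MMR balances relevance and diversity."])
    retv = FAISS.from_texts(
        ["MMR reranks retrieved results to add diversity."], FakeEmbeddings(size=64)
    ).as_retriever()

    # The chain reads chat history from memory after the user's input arrives
    # (to assemble the prompt) and writes the new turn back after the core
    # logic runs, before returning the output.
    memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
    qa = ConversationalRetrievalChain.from_llm(llm, retriever=retv, memory=memory)
    print(qa.invoke({"question": "What is MMR?"})["answer"])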
17. Which is NOT a category of pre-trained foundational models available in the OCI Generative AI service?
Translation models
Generation, summarization, and embedding models are the pre-trained foundational model categories available in the OCI Generative AI service.
18. How are fine-tuned customer models stored to enable strong data privacy and security in the OCI Generative AI service?
They are stored in OCI Object Storage and encrypted by default.
19. Why is normalization of vectors important before indexing in a hybrid search system?
It standardizes vector lengths for meaningful comparison using metrics such as Cosine Similarity.
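A quick NumPy check of why this matters: once vectors are unit length, the plain dot product is the cosine similarity:

    import numpy as np

    def normalize(v):
        # Scale to unit length so dot product equals cosine similarity.
        return v / np.linalg.norm(v)

    a = np.array([3.0, 4.0])   # length 5
    b = np.array([0.3, 0.4])   # same direction, length 0.5
    print(np.dot(a, b))                        # 2.5 -- dominated by magnitude
    print(np.dot(normalize(a), normalize(b)))  # 1.0 -- pure orientation match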
20. How does the architecture of dedicated AI clusters contribute to minimizing GPU memory overhead for T-Few fine-tuned model inference?
By sharing base model weights across multiple fine-tuned models on the same group of
GPUs
21. "You create a fine-tuning dedicated AI cluster to customize a foundational model with
your custom training data. How many unit hours are required for fine-tuning if the cluster
is active for 10 hours?"
20 unit hours. A fine-tuning dedicated AI cluster requires 2 units, so 2 units × 10 hours = 20 unit hours.
22. Which Oracle Accelerated Data Science (ADS) class can be used to deploy a Large
Language Model (LLM) application to OCI Data Science model deployment?
ChainDeployment. The ChainDeployment class in ADS is designed to deploy LangChain applications to OCI Data Science Model Deployment; GenerativeAI is an LLM wrapper, not a deployment class.
23. "Given the following prompts used with a Large Language Model, classify each as
employing the Chain-of- Thought, Least-to-most, or Step-Back prompting technique. 1.
Calculate the total number of wheels needed for 3 cars. Cars have 4 wheels each. Then,
use the total number of wheels to determine how many sets of wheels we can buy with
$200 if one set (4 wheels) costs $50. 2. Solve a complex math problem by first
identifying the formula needed, and then solve a simpler version of the problem before
tackling the full question. 3. To understand the impact of greenhouse gases on climate
change, let's start by defining what greenhouse gases are. Next, we'll explore how they
trap heat in the Earth's atmosphere."
25. What does "k-shot prompting" refer to when using Large Language Models for task-
specific applications?
Explicitly providing k examples of the intended task in the prompt to guide the model
output
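A hypothetical k=2 example, built as a plain prompt string for a sentiment task:

    examples = [
        ("The food was amazing!", "positive"),
        ("Service was slow and rude.", "negative"),
    ]
    query = "The view from the table was stunning."

    prompt = "Classify the sentiment of each review.\n\n"
    for text, label in examples:              # the k in-context examples
        prompt += f"Review: {text}\nSentiment: {label}\n\n"
    prompt += f"Review: {query}\nSentiment:"  # the model completes this line
    print(prompt)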
26. Which technique involves prompting the Large Language Model (LLM) to emit
intermediate reasoning steps as part of its response?
Chain-of-thought
27. Which is the main characteristic of greedy decoding in the context of language model word prediction?
It picks the most likely word (the token with the highest probability) at each step of decoding.
30. How does the integration of a vector database into Retrieval-Augmented Generation
(RAG)-based Large Language Models (LLMs) fundamentally alter their responses?
It shifts the basis of their responses from pretrained internal knowledge to real-time data
retrieval
31. How do Dot Product and Cosine Distance differ in their application to comparing text
embeddings in natural language processing?
Dot Product measures both the magnitude and the direction of vectors, whereas Cosine Distance focuses on orientation regardless of magnitude.
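A small NumPy demonstration: scaling a vector changes the dot product but leaves the cosine distance untouched:

    import numpy as np

    a = np.array([1.0, 2.0, 3.0])
    b = 10 * a  # same direction, ten times the magnitude

    dot = np.dot(a, b)  # grows with vector length
    cos_sim = dot / (np.linalg.norm(a) * np.linalg.norm(b))
    print(dot)           # 140.0
    print(1 - cos_sim)   # ~0.0 -- orientation identical, magnitude ignored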
32. What is a cost-related benefit of using vector databases with Large Language Models
(LLMs)?
They offer real-time updated knowledge bases and are cheaper than fine-tuned LLMs.
33. An AI development company is working on an advanced AI assistant capable of
handling queries in a seamless manner. Their goal is to create an assistant that can
analyze images provided by users and generate descriptive text, as well as take text
descriptions and produce accurate visual representations. Considering the capabilities,
which type of model would the company likely focus on integrating into their AI
assistant?
A diffusion model that specializes in producing complex outputs.
34. Which statement best describes the role of encoder and decoder models in natural
language processing?
Encoder models convert a sequence of words into a vector representation and decoder
models take that vector representation and generate output sequences based on it.
35. What issue might arise from using small data sets with the Vanilla fine-tuning method
in the OCI Generative AI service?
Overfitting
Fine-tuning on a small dataset may lead to overfitting. The model becomes too
specialized in replicating the training data, resulting in limited variety and poor
generalization to new data. To mitigate this, it’s crucial to strike a balance between
capturing patterns from the training data and maintaining the ability to generate diverse
and novel content
37. When should you use the T-Few fine-tuning method for training a model?
When the data set is small, on the order of a few thousand samples or fewer.
38. Which is a key advantage of using T-Few over Vanilla fine-tuning in the OCI
Generative AI service?
Faster training time and lower cost
39. How does the utilization of T-Few transformer layers contribute to the efficiency of the fine-tuning process?
By restricting weight updates to only a specific group of transformer layers, which reduces the number of parameters that must be trained and stored.
40. "What does ""Loss"" measure in the evaluation of OCI Generative AI fine-tuned
models? The difference between the accuracy of the model at the beginning of training
and the accuracy of the deployed model"
The level of incorrectness in the models predictions, with lower values indicating better
performance.
41. Which is a distinctive feature of GPUs in Dedicated AI Clusters used for generative AI
tasks?
The GPUs allocated for a customer’s generative AI tasks are isolated from other GPUs.
42. What is the purpose of frequency penalties in language model outputs?
To penalize tokens that have already appeared, based on the number of times they have been used.
43. What happens if a period (.) is used as a stop sequence in text generation?
The model stops generating text after it reaches the end of the first sentence, even if the
token limit is much higher.
44. What is the main advantage of using few-shot model prompting to customize a Large
Language Model (LLM)?
It provides examples in the prompt to guide the LLM to better performance with no
training cost.
45. What is the purpose of embeddings in natural language processing?
To create numerical representations of text that capture the meaning and relationships between words or phrases.