AI Professional Workshop
A general overview of ML/AI, and a detailed
dive into Generative AI applications
Abdelrahman Osama
29 April 24
01 AI Foundations Recap
02 Generative AI
03 How to Customize AI
04 Retrieval Augmented Generation
Traditional Computing
X + Y = ?
The computer is given explicit instructions ("+") on how to perform the task: the inputs are given, and the output is required from the computer.
User >> X = 3 and Y = 2
Computer >> answer = 5
1 What is Machine Learning (ML)
-Training data-
The computer finds the relation between input and output (the "+" in the earlier example) from training data.
• Classification: Assign data points into categories (e.g. identify the animal in a picture)
--The output is a category--
Multiclass Classification: the computer has to assign each input to one of many predefined categories, e.g. Cat / Dog / Mouse.
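The multiclass classification idea above can be sketched with a toy nearest-centroid classifier. All data, labels, and feature names below are made up purely for illustration:

```python
# Minimal multiclass classification sketch: learn one centroid per class
# from labelled examples, then assign new points to the nearest centroid.

def fit_centroids(samples, labels):
    """Average the feature vectors of each class."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def predict(centroids, x):
    """Pick the label whose centroid is nearest (squared distance)."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda y: dist(centroids[y]))

# Hypothetical features: (weight_kg, ear_length_cm).
train_x = [(4.0, 6.0), (4.5, 6.5), (30.0, 10.0), (28.0, 9.0), (0.03, 1.0)]
train_y = ["cat", "cat", "dog", "dog", "mouse"]
centroids = fit_centroids(train_x, train_y)
print(predict(centroids, (5.0, 6.2)))  # -> cat
```

Real systems use far richer models, but the structure is the same: learn a mapping from features to one of several predefined categories.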
Oracle AI stack:
o AI services (with Oracle AI Partners)
o ML for data platforms: OCI Data Science, ML in Oracle Database, MySQL HeatWave AutoML, OCI Data Labeling
o Data
o AI infrastructure: bare metal compute instances and VMs, OCI Supercluster with RDMA networking, block/object/file storage, HPC filesystems, NVIDIA GPUs
OCI Vision is a computer-vision service that provides users with several CV
capabilities, such as object detection and classification in images, with the ability
to customize (train) models for your specific use case.
3 OCI AI Services – OCI Anomaly Detection
o Generative AI refers to the type of AI that can create new content, ranging from
text and images to music and code.
o Unlike traditional AI, Generative AI understands the provided data and creates
new examples using that knowledge.
4 Generative AI – Large Language Models (LLMs)
Three ways to customize an LLM: Prompt Engineering, Fine Tuning, and RAG.
o Prompt Engineering guides the model using fixed and predefined prompts to
control the model's response.
o Chain-of-thought: This technique involves breaking down a problem into intermediate steps or
reasoning paths before arriving at the final answer, making the model's thinking process more
transparent and logical.
Ex:
If you have 8 apples and you give 3 to your friend, how many do you have left?
Prompt: "Start by noting the total number of apples, which is 8. Then, subtract the number of
apples given away, which is 3. So, 8 minus 3 equals 5. Therefore, you have 5 apples left.”
o Least-to-Most: This approach structures the prompt by guiding the model to answer from
simpler to more complex components of a question, ensuring clarity and thoroughness in
understanding each part.
o Step-Back: In this method, the prompt directs the model to reconsider or re-evaluate
its previous responses or steps, adding a layer of reflection to ensure accuracy and
depth.
Example:
Question: "What are the causes of the French Revolution?”
Prompt: "Start by listing the immediate economic and social causes. Now, step back and
consider broader political issues that contributed. Reflect on how these elements
interacted to precipitate the revolution."
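The three prompting patterns above can be expressed as simple string templates. A minimal sketch (the function names and exact wording are illustrative, not a real API):

```python
# Prompt templates for the techniques described above.

def chain_of_thought(question):
    """Ask the model to reason through intermediate steps first."""
    return f"{question}\nLet's think step by step before giving the final answer."

def least_to_most(question, subquestions):
    """Order simpler sub-questions before the full question."""
    steps = "\n".join(f"{i + 1}. {q}" for i, q in enumerate(subquestions))
    return f"Answer the simpler parts first, then combine them:\n{steps}\nFinally: {question}"

def step_back(question):
    """Ask the model to re-evaluate its own answer."""
    return (f"{question}\nFirst answer directly, then step back, "
            "re-examine your reasoning, and revise if needed.")

print(chain_of_thought("If you have 8 apples and give 3 away, how many are left?"))
```

In practice these strings would be sent to an LLM API; the templates only shape the model's reasoning, they do not guarantee correctness.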
o Prompt Injection (Jail-Breaking):
Prompt injection in large language models (LLMs) refers to the practice of deliberately inserting
specific commands or cues into the input prompt to manipulate or guide the model's output,
often used to bypass restrictions, alter behavior, or achieve specific responses.
>> Fine Tuning (FT): Traditional fine-tuning involves adjusting all the parameters of a
pre-trained neural network on a new, usually smaller dataset to adapt it to a specific task.
This is expensive ($$$$$): GPT-3.5 has over 175 billion parameters.
(Fine-tuned model weights are stored in encrypted Object Storage.)
Parameter-efficient fine-tuning methods: LoRA and T-Few.
T-Few
T-Few is a fine-tuning technique for large language models that uses a very small
number of training examples to efficiently adapt the model to specific tasks or
domains by selectively updating a subset (a fraction) of its parameters or layers.
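The low-rank idea behind parameter-efficient methods such as LoRA can be sketched in a few lines: instead of updating the full weight matrix W, two small matrices A and B are trained and their product is added to the frozen weight. A minimal pure-Python sketch (shapes and values are illustrative):

```python
# LoRA sketch: adapted weight = W + (alpha / r) * B @ A, where
# A is (r x d_in) and B is (d_out x r) with rank r much smaller than
# the weight dimensions, so far fewer parameters are trained.

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def lora_weight(W, A, B, alpha, r):
    """Frozen base weight plus the scaled low-rank update."""
    delta = matmul(B, A)
    s = alpha / r
    return [[W[i][j] + s * delta[i][j] for j in range(len(W[0]))]
            for i in range(len(W))]

W = [[1.0, 0.0], [0.0, 1.0]]   # frozen pretrained weight (2x2)
A = [[0.1, 0.2]]               # r=1 trainable down-projection
B = [[0.0], [0.0]]             # up-projection, initialised to zero
print(lora_weight(W, A, B, alpha=2.0, r=1))  # == W: adapter starts as a no-op
```

Initialising B to zero means training starts exactly at the pretrained model; only A and B (a fraction of the parameters) are updated.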
RAG architecture (diagram summary):

Indexing:
o Split the document into chunks.
o Generate embeddings for the chunks using an embedding encoder.
o Store the chunk embeddings, together with their chunk IDs, in a vector database (e.g. Oracle Vector Database).

Answering a question:
o Embed the user's question with the same encoder.
o Compare the question embedding with the stored chunk embeddings to find the IDs of the relevant chunks, then fetch the corresponding document chunks.
o The LLM (decoder) uses the question + the retrieved document chunks + a prompt to generate the answer.
How does RAG work?
The policy for paid leave is that employees have to submit their leave at least 5 working
days before the first leave day; the leave request has to be approved by the line
manager and project manager in order to be valid.
As for sick leave, employees can submit requests on the same day of leave, and only
line manager approval is required, but the employee has to submit a medical report.
Chunking
o Chunk 1: "The policy for paid leave is that employees have to submit their leave at least 5 working days before the first leave day"
o Chunk 2: "The leave (vacation) request has to be approved by the line manager and project manager to be valid"
o Chunk 3: "As for sick leave, employees can submit requests on the same day of leave, and only line manager approval is required, but the employee has to submit a medical report."
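Chunking itself can be as simple as sliding an overlapping word window over the text. A naive sketch (the window and overlap sizes are arbitrary illustrative choices):

```python
# Naive chunker: split text into overlapping word-window chunks so that
# sentences cut at a boundary still appear (partially) in two chunks.

def chunk_words(text, size=20, overlap=5):
    words = text.split()
    step = size - overlap
    return [" ".join(words[i:i + size])
            for i in range(0, max(len(words) - overlap, 1), step)]

policy = ("The policy for paid leave is that employees have to submit their leave "
          "at least 5 working days before the first leave day")
for chunk in chunk_words(policy, size=12, overlap=3):
    print(chunk)
```

Production systems usually chunk by sentences, paragraphs, or token counts rather than raw words, but the overlap idea is the same.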
Embedding Encoder
Each chunk becomes a vector representation, e.g.:
o Chunk 1: -0.005 0.012 -0.008 -0.007 0.01 ...
o Chunk 2: -0.006 0.012 -0.0075 -0.003 0.02 ...
o Chunk 3: -0.014 0.073 0.0096 -0.012 0.0099 ...
User question: "I want to apply for leave tomorrow for my vacation. Is this fine?"
Question embedding: -0.0055 0.02 -0.0085 -0.007 0.01 ...

Chunk retrieval: the question embedding is compared with the stored chunk embeddings (dot product / cosine distance), and the most similar chunks are retrieved:
o "The policy for paid leave is that employees have to submit their leave at least 5 working days before the first leave day"
o "The leave (vacation) request has to be approved by the line manager and project manager to be valid"

Final prompt to the LLM:
"Please answer the following question:
I want to apply for leave tomorrow for my vacation. Is this fine?"
If the answer to the question is not contained in the context, please give the
following response:
“This question requires information that is not part of my knowledge base, please
contact HR via the following link. Thanks”
Here, the model can simply answer the question, since the answer is
contained within the context provided to it.
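The retrieval step of this example can be sketched as a cosine-similarity search over chunk embeddings. The vectors and chunk IDs below are made-up stand-ins for real encoder output:

```python
import math

# Cosine similarity between two embedding vectors:
# dot(a, b) / (|a| * |b|).
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(y * y for y in b)))

# Hypothetical 3-dimensional embeddings, keyed by chunk ID.
chunk_embeddings = {
    "paid-leave-notice": [0.9, 0.1, 0.0],
    "leave-approval":    [0.7, 0.3, 0.1],
    "sick-leave":        [0.1, 0.1, 0.9],
}

def retrieve(question_embedding, top_k=2):
    """Return the IDs of the top_k most similar chunks."""
    ranked = sorted(chunk_embeddings,
                    key=lambda cid: cosine(question_embedding,
                                           chunk_embeddings[cid]),
                    reverse=True)
    return ranked[:top_k]

print(retrieve([0.8, 0.2, 0.05]))  # a vacation-like question -> paid-leave chunks
```

A real deployment would use a vector database for this search and embeddings with hundreds or thousands of dimensions, but the ranking logic is the same.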
Some Additional info about GenAI
o Sometimes, RAG systems use a ranker to prioritize (rank) the info retrieved from
the vector DB.
o Top-K: Top-K sampling selects the next token from a fixed set of the K highest-
ranked tokens according to their probability distribution, limiting the selection to
the most likely individual choices.
o Top-P (Nucleus) Sampling: Top-P sampling chooses the next token from a subset
of tokens whose combined probability adds up to or exceeds the threshold P,
dynamically adjusting the number of tokens considered based on their collective
likelihood.
o In an LLM, you may use a “Stop Sequence” parameter that simply tells the model
when to stop generating.
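The Top-K and Top-P filtering described above can be sketched over a toy next-token distribution (the tokens and probabilities are made up; a real LLM would score thousands of tokens):

```python
# Top-K keeps a fixed number of the most likely tokens; Top-P keeps the
# smallest set whose cumulative probability reaches the threshold P.

def top_k_filter(probs, k):
    """Keep only the k highest-probability tokens."""
    return dict(sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k])

def top_p_filter(probs, p):
    """Keep tokens in probability order until their total reaches p."""
    kept, total = {}, 0.0
    for tok, pr in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept[tok] = pr
        total += pr
        if total >= p:
            break
    return kept

probs = {"the": 0.5, "a": 0.2, "cat": 0.15, "dog": 0.1, "zzz": 0.05}
print(top_k_filter(probs, 2))    # {'the': 0.5, 'a': 0.2}
print(top_p_filter(probs, 0.8))  # 'the' + 'a' + 'cat' (0.85 >= 0.8)
```

After filtering, the model samples the next token from the surviving set (renormalised); Top-P adapts the set size to the shape of the distribution, while Top-K is fixed.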
Open Discussion
Let’s talk!