GenAI Workshop

Generative AI is a branch of artificial intelligence that creates new content autonomously, utilizing advanced models like GANs and Transformers. It involves training on large datasets to learn patterns and generate new data, with applications in various fields such as natural language processing and image generation. Vector databases are essential for managing high-dimensional embeddings generated by machine learning models, enabling efficient similarity searches and data retrieval.

21 SPHERES

A WORLD WITHIN

What is Gen AI?


Generative AI is a subset of artificial intelligence focused on creating new content or data (images, videos, music, 3D models) without human intervention.

It evolved from advancements in deep learning and leverages advanced machine learning models, especially Generative Adversarial Networks (GANs), Variational Autoencoders (VAEs), and Transformer-based models like GPT (Generative Pretrained Transformer).
How Generative AI Works
1 Training
The model is trained on large datasets of a specific domain (e.g., text, images).

2 Understanding Patterns
The model learns patterns, structures, and relationships within the data.

3 Generation
When prompted, it generates new content based on the learned patterns, often guided by parameters like context or user input.
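The three steps above can be sketched with a deliberately tiny model. This is only an illustration, assuming a character-level bigram model in place of a deep network; the corpus and the function names `train_bigram_model` and `generate` are hypothetical.

```python
import random
from collections import defaultdict

def train_bigram_model(text):
    """Training and pattern learning: count which character follows which."""
    counts = defaultdict(lambda: defaultdict(int))
    for current_char, next_char in zip(text, text[1:]):
        counts[current_char][next_char] += 1
    return counts

def generate(counts, prompt, length, seed=0):
    """Generation: extend the prompt by sampling from the learned counts."""
    rng = random.Random(seed)
    output = prompt
    for _ in range(length):
        followers = counts.get(output[-1])
        if not followers:  # no learned continuation for this character
            break
        chars = list(followers.keys())
        weights = list(followers.values())
        output += rng.choices(chars, weights=weights, k=1)[0]
    return output

# Hypothetical training data; real systems train on far larger datasets.
corpus = "the cat sat on the mat. the cat ate the rat."
model = train_bigram_model(corpus)
sample = generate(model, prompt="th", length=20)
print(sample)
```

The generated text only recombines character pairs seen in the corpus, which is the same idea, at toy scale, as generating from learned patterns.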

Popular Generative AI Applications
ChatGPT: NLP tool for human-like conversations and various tasks.
Google Bard: LLM chatbot for quick answers and interactive tasks.
DALL-E: Generates images from text prompts.
Generative Models

Machine learning models that generate new data instances similar to a training dataset.
Components of GAN
Generator (G)
The generator is responsible for creating new data samples. It takes random noise as input and transforms it into data that mimics the real dataset (could be images, text, or other types of data). The generator is typically structured as a deep neural network, and its goal is to produce samples that are indistinguishable from real data.

Discriminator (D)
The discriminator acts as a critic, evaluating the authenticity of the data it receives. It takes both real data samples (from the training dataset) and fake samples generated by G, and it outputs a probability score that indicates whether the input data is real or fake. The discriminator is also a neural network, trained to maximize its accuracy in distinguishing between genuine and generated samples.
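The roles of G and D can be sketched as a single forward pass with the two losses that training would minimize. This is a minimal sketch, assuming a linear generator and a logistic discriminator with random toy weights instead of deep networks; the generator loss shown is the commonly used non-saturating variant, and no training loop is included.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical toy parameters: a linear generator and a logistic discriminator.
g_weights = rng.normal(size=(4, 2))   # maps 4-d noise to 2-d "data"
d_weights = rng.normal(size=(2,))     # maps 2-d data to a single logit

def generator(noise):
    """G: transform random noise into fake samples."""
    return noise @ g_weights

def discriminator(samples):
    """D: output the probability that each sample is real."""
    return sigmoid(samples @ d_weights)

real = rng.normal(loc=3.0, size=(8, 2))       # stand-in for training data
fake = generator(rng.normal(size=(8, 4)))     # samples produced from noise

eps = 1e-12  # avoids log(0)
p_real = discriminator(real)
p_fake = discriminator(fake)

# D is trained to score real samples high and fake samples low.
d_loss = -np.mean(np.log(p_real + eps)) - np.mean(np.log(1.0 - p_fake + eps))
# G is trained to make D score its samples as real (non-saturating form).
g_loss = -np.mean(np.log(p_fake + eps))
print(d_loss, g_loss)
```

In a full GAN the two losses are minimized in alternation, each with respect to its own network's parameters.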
Transformers
RNN
Suffers from the vanishing gradient problem (long-term memory loss) and loses context over long sentences because it processes words sequentially.

LSTM
Addresses the vanishing gradient issue, but still processes words sequentially and uses static embeddings, limiting its ability to capture context-dependent word meanings.

Transformers are a neural network architecture introduced in the 2017 paper "Attention Is All You Need" by Vaswani et al. They perform various machine learning tasks, particularly in natural language processing (NLP).
Transformers Architecture

Transformers utilize self-attention to process entire sentences simultaneously, unlike older models that process words sequentially (RNNs, LSTMs).
Architecture Overview
Self-Attention
Transformers utilize self-attention to process entire sentences simultaneously, unlike older models that process words sequentially (RNNs, LSTMs).

Positional Encoding
Transformers lack inherent sequential order understanding, so positional encodings are added to token embeddings to indicate each token's position in a sequence.

Position-wise Feed-Forward Networks
Consist of two linear transformations with ReLU activation. Present in both encoder and decoder layers, processing features at each position in the sequence.
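Positional encoding and the position-wise feed-forward network can be sketched in a few lines of NumPy. The sine/cosine formula below is the one from the original Transformer paper; the sizes (`seq_len`, `d_model`, `d_ff`) and the random weights are assumptions for illustration, and `d_model` is assumed to be even.

```python
import numpy as np

def sinusoidal_positional_encoding(seq_len, d_model):
    """Return a (seq_len, d_model) matrix of sine/cosine position signals."""
    positions = np.arange(seq_len)[:, None]          # (seq_len, 1)
    even_dims = np.arange(0, d_model, 2)[None, :]    # (1, d_model / 2)
    angles = positions / np.power(10000.0, even_dims / d_model)
    encoding = np.zeros((seq_len, d_model))
    encoding[:, 0::2] = np.sin(angles)   # even dimensions
    encoding[:, 1::2] = np.cos(angles)   # odd dimensions
    return encoding

def position_wise_ffn(x, w1, b1, w2, b2):
    """Two linear transformations with a ReLU in between, applied per position."""
    hidden = np.maximum(0.0, x @ w1 + b1)
    return hidden @ w2 + b2

rng = np.random.default_rng(0)
seq_len, d_model, d_ff = 5, 8, 16                    # toy sizes
token_embeddings = rng.normal(size=(seq_len, d_model))

# Positional encodings are added to the token embeddings.
x = token_embeddings + sinusoidal_positional_encoding(seq_len, d_model)

w1, b1 = rng.normal(size=(d_model, d_ff)), np.zeros(d_ff)
w2, b2 = rng.normal(size=(d_ff, d_model)), np.zeros(d_model)
out = position_wise_ffn(x, w1, b1, w2, b2)
print(out.shape)
```

Because the feed-forward network uses the same weights at every position, running it on one row of `x` gives the same result as that row of the full output.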
Self-Attention Mechanism
Input Embedding
Words are transformed into vector representations.

Query, Key, Value
Each word generates three vectors:
Query (Q): The word looking for context.
Key (K): The word providing context.
Value (V): The information stored in the word.

Attention Score
The model computes the similarity between the query and all keys.

Weighted Sum
Combines the values based on attention scores to create a new context-aware representation of the input.
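The four stages above map directly onto scaled dot-product attention. The following is a minimal single-head sketch in NumPy, assuming random toy embeddings and projection matrices; the division by the square root of the key dimension follows the original Transformer paper.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def self_attention(embeddings, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention."""
    q = embeddings @ w_q                         # queries: what each word looks for
    k = embeddings @ w_k                         # keys: what each word offers
    v = embeddings @ w_v                         # values: the information carried
    scores = q @ k.T / np.sqrt(k.shape[-1])      # query-key similarity
    weights = softmax(scores, axis=-1)           # attention scores per word
    return weights @ v, weights                  # weighted sum of values

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(3, 4))             # 3 words, 4-d toy embeddings
w_q, w_k, w_v = (rng.normal(size=(4, 4)) for _ in range(3))

context, weights = self_attention(embeddings, w_q, w_k, w_v)
print(weights.round(2))
```

Each row of `weights` sums to 1 and says how much that word attends to every word in the sentence, including itself.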
Encoder-Decoder Architecture
Encoder
The encoder processes the input sequence into a vector; the decoder converts this vector back into a sequence.

Decoder
Each encoder and decoder layer includes self-attention and feed-forward layers. The decoder has an additional encoder-decoder attention layer to focus on relevant parts of the input.
Layer Structure of a Transformer Architecture
The original Transformer stacks six encoder layers and six decoder layers. Word embeddings are computed once, below the bottom encoder, transforming each word into a vector of size 512. Different encoder layers tend to capture different kinds of linguistic information (e.g., part of speech, semantic roles).
Applications of Transformers:
NLP Tasks
Machine translation, text summarization, named entity recognition, sentiment analysis.

Speech Recognition
Converts audio signals into transcribed text.

Computer Vision
Applied to image classification, object detection, and image generation.

Recommendation Systems
Provides personalized recommendations based on user preferences.

Text and Music Generation
Generates articles and composes music.
Text to Numerical Representations
Tokenization
Text is divided into smaller pieces called tokens.
Word-level tokenization: Splits text into words.
Example: "I love AI" → ["I", "love", "AI"]
Subword tokenization: Breaks words into smaller units to handle unknown words.
Example: "unbelievable" → ["un", "##believ", "##able"]
Character-level tokenization: Splits text into individual characters.
Example: "AI" → ["A", "I"]

Assigning Numeric IDs
Each token is mapped to a unique numeric ID from a vocabulary.
Example: Vocabulary = {"I": 1, "love": 2, "AI": 3}
Tokens: ["I", "love", "AI"]
Mapped IDs: [1, 2, 3]
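The tokenization styles and the ID mapping can be sketched in plain Python. This is a simplified illustration: the subword splitter is a greedy longest-match routine in the WordPiece style, and its small vocabulary is hypothetical, chosen so that the pieces rejoin to the original word. Production tokenizers learn their vocabularies from data.

```python
def word_tokenize(text):
    """Word-level: split on whitespace."""
    return text.split()

def char_tokenize(text):
    """Character-level: one token per character."""
    return list(text)

def subword_tokenize(word, subword_vocab):
    """Greedy longest-match split; '##' marks a piece that continues a word."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        while end > start:
            piece = word[start:end]
            if start > 0:
                piece = "##" + piece
            if piece in subword_vocab:
                pieces.append(piece)
                break
            end -= 1
        else:
            return ["[UNK]"]   # no known piece matches at this position
        start = end
    return pieces

def build_vocabulary(tokens, start_id=1):
    """Assign each distinct token the next free numeric ID."""
    vocab = {}
    for token in tokens:
        if token not in vocab:
            vocab[token] = len(vocab) + start_id
    return vocab

tokens = word_tokenize("I love AI")
vocab = build_vocabulary(tokens)
ids = [vocab[token] for token in tokens]

print(tokens)                 # ['I', 'love', 'AI']
print(vocab)                  # {'I': 1, 'love': 2, 'AI': 3}
print(ids)                    # [1, 2, 3]
print(char_tokenize("AI"))    # ['A', 'I']
print(subword_tokenize("unbelievable", {"un", "##believ", "##able"}))
```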
Converting IDs to Embeddings
The numeric IDs are converted into dense vectors (embeddings). Embeddings are high-dimensional numeric arrays that encode the semantic meaning of tokens.
Example: "I" → [0.12, 0.89, -0.34, ...]
"love" → [0.45, 0.67, -0.10, ...]

Representing Context (Contextual Embeddings)
Modern models like BERT and GPT generate contextual embeddings, meaning the vector for a word depends on the surrounding words.
Example: The word "bank" in:
"River bank" → [vector representing river context]
"Money bank" → [vector representing finance context]
Why This Matters
These numerical representations allow models to calculate
relationships, similarities, or patterns using mathematical operations
like dot products, cosine similarity, etc.
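A small sketch of how IDs become vectors and how those vectors are compared. The embedding values below are hand-picked placeholders, not output from a real model, and the word "adore" is added to the vocabulary only to give "love" a near neighbor.

```python
import numpy as np

vocab = {"I": 1, "love": 2, "AI": 3, "adore": 4}   # "adore" added for illustration

# Placeholder vectors; row 0 is unused so that IDs index rows directly.
embedding_table = np.array([
    [0.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],    # "I"
    [0.1, 0.8, 0.3],    # "love"
    [0.0, 0.2, 0.9],    # "AI"
    [0.2, 0.9, 0.2],    # "adore"
])

def embed(tokens):
    """Look up one embedding row per token ID."""
    ids = [vocab[token] for token in tokens]
    return embedding_table[ids]

def cosine_similarity(a, b):
    """Dot product of the two vectors divided by the product of their lengths."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

love, adore, ai = embed(["love", "adore", "AI"])
print(cosine_similarity(love, adore))   # close to 1: similar direction
print(cosine_similarity(love, ai))      # noticeably lower
```

This lookup is static: a word gets the same vector wherever it appears. Contextual models instead compute the vector from the whole sentence.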
What is a Vector Database?
A vector database is a specialized database for storing and managing high-dimensional vector embeddings. These embeddings are commonly generated by machine learning models and represent data like text, images, or audio.
Core Features of Vector Databases
Embedding Storage
Store embeddings as vectors of fixed dimensions.

Efficient Similarity Search
Perform nearest-neighbor searches to find vectors that are most similar to a query vector.

Metadata Handling
Store associated metadata (e.g., text, tags) with vectors.
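The three features can be sketched as a tiny in-memory store. This is not how any particular vector database is implemented: the class `InMemoryVectorStore` is hypothetical, the vectors are made up, and search is an exact brute-force scan, whereas real systems use approximate indexes to stay fast at scale.

```python
import numpy as np

class InMemoryVectorStore:
    """Toy store: fixed-dimension vectors, metadata, exact cosine search."""

    def __init__(self, dim):
        self.dim = dim
        self.vectors = np.empty((0, dim))
        self.metadata = []

    def add(self, vector, metadata):
        vector = np.asarray(vector, dtype=float)
        if vector.shape != (self.dim,):
            raise ValueError(f"expected a vector with {self.dim} dimensions")
        self.vectors = np.vstack([self.vectors, vector])
        self.metadata.append(metadata)

    def search(self, query, top_k=1):
        """Return the top_k (metadata, score) pairs by cosine similarity."""
        query = np.asarray(query, dtype=float)
        norms = np.linalg.norm(self.vectors, axis=1) * np.linalg.norm(query)
        scores = self.vectors @ query / norms
        best = np.argsort(-scores)[:top_k]
        return [(self.metadata[i], float(scores[i])) for i in best]

store = InMemoryVectorStore(dim=3)
store.add([0.9, 0.1, 0.0], {"text": "AI tools for image generation"})
store.add([0.1, 0.9, 0.0], {"text": "Cooking recipes"})
store.add([0.8, 0.3, 0.1], {"text": "Image synthesis models"})

results = store.search([1.0, 0.2, 0.0], top_k=2)
for metadata, score in results:
    print(round(score, 3), metadata["text"])
```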
Use Cases
Semantic Search
Retrieve documents or data similar to a query (e.g., "AI tools for image generation").

Recommendation Systems
Suggest items based on embedding similarities (e.g., product recommendations).

Image Retrieval
Find similar images by comparing their embeddings.
Examples of Vector Databases
Pinecone: Scalable vector database for real-time applications.
Weaviate: Open-source and schema-free vector database.
Milvus: High-performance vector database for massive datasets.
FAISS: A library for similarity search by Facebook AI.


What Are Embeddings?
Embeddings are dense, numerical representations of data (like words, sentences, or images) in a fixed-dimensional vector space.

Purpose of Embeddings
Embeddings encode the semantic meaning and relationships between data points in a way that is machine-readable. Similar concepts are represented by vectors that are close in the vector space.
Examples
Word Embeddings
Represent words like "king" and "queen" such that their relationship (e.g., "king - man + woman = queen") is preserved in the vector space.

Image Embeddings
Encodes features of an image (e.g., color, shape, objects).

Sentence Embeddings
Encodes entire sentences to capture context and meaning.
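The "king - man + woman" relationship can be checked on toy vectors. The 2-d values below are invented so that one axis loosely stands for royalty and the other for gender; learned embeddings only approximate this kind of arithmetic.

```python
import numpy as np

# Hypothetical 2-d embeddings: (royalty, gender). Not from a trained model.
vectors = {
    "king":  np.array([0.9, 0.8]),
    "queen": np.array([0.9, -0.8]),
    "man":   np.array([0.1, 0.8]),
    "woman": np.array([0.1, -0.8]),
    "apple": np.array([-0.5, 0.0]),
}

def cosine_similarity(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def analogy(a, b, c):
    """Return the word closest to vector(a) - vector(b) + vector(c)."""
    target = vectors[a] - vectors[b] + vectors[c]
    # Exclude the input words themselves, as is customary for analogy queries.
    candidates = [word for word in vectors if word not in (a, b, c)]
    return max(candidates, key=lambda word: cosine_similarity(vectors[word], target))

print(analogy("king", "man", "woman"))   # queen
```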
How Are Embeddings Generated?
Pretrained Models
Models like BERT, GPT, and CLIP generate embeddings.

Custom Models
Fine-tune models to generate embeddings for specific tasks.
How Embeddings Are Stored
1 Embedding Creation
Input data (e.g., text, image) → Model → Embedding vector.

2 Metadata Storage
Alongside embeddings, relevant metadata (e.g., text, IDs) is stored for easy identification and retrieval.

3 Indexing
Vectors are indexed using ANN (Approximate Nearest Neighbor) algorithms, like KD-Tree or HNSW, to make similarity search efficient.

4 Vector Database
Embeddings and metadata are stored in a vector database like Pinecone or Milvus.
Example Storage Structure
{
  "embedding": [0.12, 0.89, -0.34, ...],
  "metadata": {
    "text": "Artificial Intelligence",
    "category": "Technology"
  }
}
Why Specialized Storage Is Needed
Embeddings are high-dimensional (e.g., 768 dimensions for BERT).

Applications need efficient similarity search over them (e.g., find the top 5 closest embeddings).
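A back-of-the-envelope calculation shows why this adds up quickly. The collection size of one million vectors and float32 storage are assumptions for illustration; the 768 dimensions come from the BERT example above.

```python
num_vectors = 1_000_000     # assumed collection size
dimensions = 768            # BERT embedding size cited above
bytes_per_float = 4         # assuming float32 storage

# Raw storage for the vectors alone, before metadata or index structures.
storage_bytes = num_vectors * dimensions * bytes_per_float
storage_gib = storage_bytes / 1024**3

# A brute-force query compares against every stored vector,
# with one multiply-add per dimension.
multiply_adds_per_query = num_vectors * dimensions

print(f"{storage_bytes:,} bytes (about {storage_gib:.2f} GiB)")
print(f"{multiply_adds_per_query:,} multiply-adds per brute-force query")
```

Under these assumptions the vectors alone take about 3 GB, and every exact query touches all of it, which is the cost that ANN indexes are designed to avoid.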


LLM
Large language models (LLMs) are advanced AI systems that use neural networks with very large numbers of parameters to process and understand human language. They are trained with self-supervised learning and are significant in Natural Language Processing (NLP).

Applications include text generation, machine translation, summarization, image generation from text, coding, chatbots, and conversational AI.

Examples: ChatGPT (OpenAI), BERT (Google).


Advancements in GPT Models:
GPT-1: 117 million parameters (2018)
GPT-2: 1.5 billion parameters (2019)
GPT-3: 175 billion parameters (2020)
GPT-4: Parameter count not publicly disclosed (2023)
How LLMs Work
Positional Embeddings
Adds sequence order information to embeddings.

Input Embeddings
Tokenization and vector representation of text.

Encoder
Analyzes input text and generates hidden states.

Self-Attention Mechanism
Weighs importance of tokens, capturing dependencies.
Transformer-Based LLM Architecture Components:
• Input Embeddings: Tokenization and vector representation of text.
• Positional Encoding: Adds sequence order information to embeddings.
• Encoder: Analyzes input text and generates hidden states.
• Self-Attention Mechanism: Weighs importance of tokens, capturing dependencies.
• Feed-Forward Neural Network: Processes each token position independently, transforming its features.
• Decoder Layers: Used for autoregressive generation (in some models).
• Multi-Head Attention: Allows simultaneous attention to different input parts.
• Layer Normalization: Stabilizes learning and improves generalization.
• Output Layers: Varies by task (e.g., linear projection for language modeling).
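Two of the components listed above, layer normalization and the output layer for language modeling, are small enough to sketch directly. The hidden states and vocabulary projection below are random placeholders with toy sizes; a softmax is applied after the linear projection to turn scores into next-token probabilities.

```python
import numpy as np

def layer_norm(x, gamma, beta, eps=1e-5):
    """Normalize each token's features to zero mean and unit variance."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

def output_projection(hidden, w_vocab):
    """Linear projection to vocabulary size, then softmax over tokens."""
    logits = hidden @ w_vocab
    logits = logits - logits.max(axis=-1, keepdims=True)   # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
hidden = rng.normal(size=(3, 8))          # 3 tokens, 8 hidden features (toy sizes)
w_vocab = rng.normal(size=(8, 10))        # hypothetical vocabulary of 10 tokens

normed = layer_norm(hidden, gamma=np.ones(8), beta=np.zeros(8))
probs = output_projection(normed, w_vocab)

# For language modeling, the last position predicts the next token.
next_token_id = int(probs[-1].argmax())
print(next_token_id)
```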
Examples of LLMs
GPT-3
Developed by OpenAI, known for ChatGPT.

BERT
Developed by Google for various NLP tasks.

RoBERTa
Enhanced version of BERT by Facebook AI Research.

BLOOM
First multilingual LLM developed through collaborative efforts.
Fine-Tuning Large Language Models (LLMs)
1 Text generation
2 Translation
3 Summarization
4 Question-answering
Fine-Tuning Process

1 Select Base Model
Choose a pre-trained model aligned with task requirements and computational resources.

2 Choose Fine-Tuning Method
Determine the most suitable method (e.g., supervised, instruction-based, PEFT) based on task and dataset.

3 Prepare Dataset
Structure data for training, ensuring compatibility with model requirements.

4 Training
Utilize frameworks (TensorFlow, PyTorch, or libraries like Transformers) to fine-tune the model.

5 Evaluate and Iterate
Test, refine, and re-train the model as needed to boost performance.
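The five steps can be walked through on a toy problem. This sketch uses plain NumPy rather than any of the frameworks named above: a frozen random projection stands in for the pre-trained model, only a small task head is trained (loosely in the spirit of parameter-efficient methods), and the dataset is synthetic. All names and numbers are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# 1. Select base model: a frozen random projection stands in for a pre-trained encoder.
base_weights = rng.normal(size=(5, 8))

def base_model(inputs):
    return np.tanh(inputs @ base_weights)    # frozen features, never updated

# 2. Choose method: train only a small task head on top of the frozen base.
head = np.zeros(8)

def predict(x):
    return 1.0 / (1.0 + np.exp(-(base_model(x) @ head)))

def loss(x, y):
    p = np.clip(predict(x), 1e-9, 1 - 1e-9)
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))

# 3. Prepare dataset: synthetic inputs with binary labels, split into train and test.
inputs = rng.normal(size=(200, 5))
labels = (inputs[:, 0] + inputs[:, 1] > 0).astype(float)
train_x, test_x = inputs[:150], inputs[150:]
train_y, test_y = labels[:150], labels[150:]

# 4. Training: plain gradient descent on the head's weights.
initial_loss = loss(train_x, train_y)
for _ in range(300):
    gradient = base_model(train_x).T @ (predict(train_x) - train_y) / len(train_y)
    head -= 0.5 * gradient
final_loss = loss(train_x, train_y)

# 5. Evaluate: check held-out accuracy before deciding whether to iterate.
accuracy = float(np.mean((predict(test_x) > 0.5) == (test_y == 1.0)))
print(initial_loss, final_loss, accuracy)
```

If the held-out accuracy is not good enough, the "iterate" step means revisiting the earlier choices: more or cleaner data, a different method, or a different base model.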
Retrieval-Augmented Generation (RAG)
RAG combines retrieval-based and generation-based models to enhance text generation quality by leveraging large-scale databases for accurate and contextually relevant information.
Key Components of RAG
Retriever
Fetches relevant information from a large corpus (e.g., using a BERT-based encoder).

Generator
Generates coherent responses using retrieved information (e.g., GPT-3, T5).
Steps
1 Query
User provides input (e.g., "What are embeddings?").

2 Retrieval
Relevant documents or embeddings are fetched from a database.

3 Augmentation
Retrieved data is appended to the query.

4 Generation
The augmented query is passed to an LLM to generate a response.
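The four steps can be sketched end to end with standard-library Python. Everything here is a stand-in: the "embedding" is a bag-of-words count instead of a trained model, the document list plays the role of the database, and `generate` is a hypothetical placeholder for a real LLM call.

```python
import math
from collections import Counter

# Hypothetical mini-corpus standing in for the database.
documents = [
    "Embeddings are dense numerical representations of data.",
    "GANs pair a generator with a discriminator.",
    "Vector databases store embeddings for similarity search.",
]

def embed(text):
    """Toy bag-of-words 'embedding'; a real system would use a trained model."""
    return Counter(text.lower().replace("?", "").replace(".", "").split())

def cosine(a, b):
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, top_k=1):
    """Step 2: fetch the documents most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(documents, key=lambda doc: cosine(query_vec, embed(doc)), reverse=True)
    return ranked[:top_k]

def augment(query, retrieved):
    """Step 3: append the retrieved text to the query."""
    context = "\n".join(retrieved)
    return f"{query}\n\nContext:\n{context}"

def generate(prompt):
    """Step 4: hypothetical stand-in for an LLM call; it only reports prompt size."""
    return f"[LLM response based on a {len(prompt)}-character prompt]"

query = "What are embeddings?"                 # Step 1: user query
retrieved = retrieve(query)
prompt = augment(query, retrieved)
print(prompt)
print(generate(prompt))
```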
