Generative AI: An Overview

This document provides an overview of generative AI, focusing on recurrent neural networks (RNNs) and the rise of the Transformer architecture, particularly its impact on natural language processing and machine translation. It outlines a development timeline of large language models (LLMs) such as ChatGPT and discusses their training phases, limitations, and methods for improving their responses. It also surveys current LLMs and serving frameworks.
Understanding Recurrent Neural Networks (RNNs)

RNNs are a type of neural network designed to process sequential data. These architectures were widely used for NLP tasks, speech processing, and time-series modeling.

Challenge: an RNN consumes its input one token at a time, so computation cannot be parallelized across a sequence, and long-range dependencies are hard to learn. A step-by-step sketch follows.
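To see why that matters, here is a minimal sketch of a single vanilla RNN step (an illustration with assumed toy dimensions, not code from the original slides). Each hidden state depends on the previous one, which forces token-by-token processing:

```python
import numpy as np

def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    """One vanilla RNN step: combine the current input with the previous hidden state."""
    return np.tanh(x_t @ W_xh + h_prev @ W_hh + b_h)

# Toy dimensions (assumed): 4-dim inputs, 8-dim hidden state.
rng = np.random.default_rng(0)
W_xh = rng.normal(scale=0.1, size=(4, 8))
W_hh = rng.normal(scale=0.1, size=(8, 8))
b_h = np.zeros(8)

h = np.zeros(8)
sequence = rng.normal(size=(10, 4))  # 10 time steps of 4-dim features
for x_t in sequence:                 # inherently sequential: step t needs step t-1
    h = rnn_step(x_t, h, W_xh, W_hh, b_h)
```

Because each iteration depends on the previous hidden state, the loop cannot be parallelized across time steps; self-attention removes exactly this bottleneck.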
The Rise of Transformers: Self-Attention

In 2017, researchers at Google published the paper "Attention Is All You Need", which proposed a novel neural network architecture for sequence modeling known as the Transformer. It outperformed recurrent neural networks (RNNs) on machine translation tasks, both in translation quality and in training cost.
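The Transformer's core operation is scaled dot-product self-attention: every position attends to every other position in parallel. A minimal NumPy sketch (weights and dimensions are illustrative assumptions):

```python
import numpy as np

def self_attention(X, W_q, W_k, W_v):
    """Scaled dot-product self-attention over X with shape (seq_len, d_model)."""
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    scores = Q @ K.T / np.sqrt(K.shape[-1])          # pairwise similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over the keys
    return weights @ V                               # each row: weighted mix of values

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 16))                         # 5 tokens, 16-dim embeddings
W_q, W_k, W_v = (rng.normal(scale=0.1, size=(16, 16)) for _ in range(3))
out = self_attention(X, W_q, W_k, W_v)               # (5, 16), computed in parallel
```

Unlike the RNN loop above, all positions are processed with a few matrix multiplications, which is what cut training cost on translation tasks.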
A Timeline of Large Language Models

2019: GPT-2 (Generative Pre-trained Transformer 2)

2022: ChatGPT

2024: Meta's Llama 3, Anthropic's Claude 3, Qwen2, and Mistral's Mixtral 8x7B: larger and more powerful models

2025: DeepSeek-R1

Ongoing trend: multimodality across text, image, and video
Diving into ChatGPT

GPT stands for Generative Pre-trained Transformer:

Generative: the model produces text by next-word prediction.
Pre-trained: the LLM is pre-trained on a massive amount of text.
Transformer: the underlying architecture. The original Transformer was an encoder-decoder; GPT models use the decoder-only variant.
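To make next-word prediction concrete, here is a small sketch using the Hugging Face transformers library with the public gpt2 checkpoint (an illustration of the idea, not code from the slides):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "The Eiffel Tower is located in"
inputs = tok(prompt, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits       # (1, seq_len, vocab_size)

# The last position's logits score every vocabulary item as the next word.
top = torch.topk(logits[0, -1], k=5)
print([tok.decode(idx) for idx in top.indices.tolist()])
```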

Why couldn't ChatGPT replace Google Search? Its knowledge is frozen at its training cutoff, and it can hallucinate plausible-sounding but false answers, so it cannot reliably serve fresh factual queries.

How was ChatGPT trained?


Large Language Models

What do LLMs essentially do? Next-word generation.

This section frames LLMs as a machine learning task and as a deep learning task, and looks at the training data they require. A generation sketch follows.
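Generation is just next-word prediction applied repeatedly. A sketch using the same gpt2 checkpoint as above (greedy decoding; my example, not from the slides):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

ids = tok("Large language models are", return_tensors="pt").input_ids
# Greedy decoding: append the highest-probability next token 20 times.
out = model.generate(ids, max_new_tokens=20, do_sample=False)
print(tok.decode(out[0], skip_special_tokens=True))
```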
Phases of LLM Training

1. Pre-training: the model sees massive amounts of text data from the internet (books, research papers, websites) and learns to predict the next word (a sketch of this objective follows the list).
2. Instruction fine-tuning: a curated question-and-answer dataset trains the model to answer questions and follow instructions; the model learns to become a helpful assistant.
3. Reinforcement Learning from Human Feedback (RLHF): aligns the output closer to human-like responses; responses are updated according to human feedback and preferences.
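A compact sketch of the pre-training objective (assumed toy setup; a linear layer stands in for the full Transformer stack): position t is trained to predict token t+1 with cross-entropy.

```python
import torch
import torch.nn.functional as F

vocab_size, seq_len, d_model = 1000, 12, 64
embed = torch.nn.Embedding(vocab_size, d_model)
lm_head = torch.nn.Linear(d_model, vocab_size)       # stand-in for a full Transformer

tokens = torch.randint(0, vocab_size, (1, seq_len))  # a batch of one toy sequence
hidden = embed(tokens)                # a real model would apply attention layers here
logits = lm_head(hidden)              # (1, seq_len, vocab_size)

# Shift by one so position t is scored against token t+1.
loss = F.cross_entropy(
    logits[:, :-1].reshape(-1, vocab_size),
    tokens[:, 1:].reshape(-1),
)
loss.backward()                       # an optimizer step would follow in training
```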
Limitations of LLMs

1. Hallucination: generating fluent but factually wrong output
2. Weak mathematical problem solving
3. Finite context window
4. Cost of training and inference
How to make LLMs respond better?

Zero-shot: give the model instructions for solving a task, with no examples.
Few-shot: give a few examples of how to solve the task.
Chain-of-Thought (CoT): for complex tasks, prompt the LLM to "think step by step".

Example prompts for each style follow.
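Concretely, the three styles differ only in the prompt text. Illustrative templates (my examples, not taken from the slides):

```python
zero_shot = (
    "Classify the sentiment of this review as positive or negative: "
    "'The battery dies in an hour.'"
)

few_shot = """Classify the sentiment of each review.
Review: 'Great screen, fast shipping.' -> positive
Review: 'Arrived broken and support ignored me.' -> negative
Review: 'The battery dies in an hour.' ->"""

chain_of_thought = (
    "A store sells pens at 3 for $4. How much do 9 pens cost? "
    "Think step by step before giving the final answer."
)
```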
Latest LLMs & Frameworks

LLMs: Mistral, Mixtral, Llama, Gemini, DeepSeek

Frameworks:
Together AI: https://www.together.ai/
Groq: https://groq.com/
Replicate: https://replicate.com/
LiteLLM: https://www.litellm.ai/
Hugging Face: https://huggingface.co/
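Most of these services expose an OpenAI-style chat API. Here is a minimal sketch using LiteLLM; the provider, model id, and key handling are illustrative assumptions:

```python
import os
from litellm import completion

os.environ["GROQ_API_KEY"] = "..."  # substitute a real key for your chosen provider

response = completion(
    model="groq/llama3-8b-8192",    # provider-prefixed model id (illustrative)
    messages=[{"role": "user", "content": "Summarize self-attention in one sentence."}],
)
print(response.choices[0].message.content)
```

LiteLLM's appeal is that switching providers only requires changing the model string.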


Generative AI Project Lifecycle
