14 ChatGPT
https://arxiv.org/pdf/2203.02155.pdf
Goal of PPO: maximize the total reward of the responses
generated by the model by including the reward in the
loss. If a response is very good, the product of r and the
advantage function (Â) will be large. If the advantage
function is negative, the response is bad.
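The role of the product of r and Â can be sketched with the standard clipped PPO surrogate objective. This is a minimal illustration, not the training code behind ChatGPT: it assumes that r on the slide is the probability ratio between the updated policy and the old policy, and the function name, argument names, and the clipping value of 0.2 are hypothetical choices made for the example.

```python
import math


def ppo_clipped_objective(new_logprobs, old_logprobs, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective (a quantity to maximize).

    Assumption: r = exp(new_logprob - old_logprob), i.e. the ratio of the
    probability of a response under the updated policy to its probability
    under the old policy. clip_eps = 0.2 is an illustrative default.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_logprobs, old_logprobs, advantages):
        ratio = math.exp(new_lp - old_lp)
        # Keep the ratio inside [1 - eps, 1 + eps] so one update cannot
        # move the policy too far from the old one.
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


# Unchanged policy: the ratio is 1, so the objective is the mean advantage.
same_policy = ppo_clipped_objective([0.0, 0.0], [0.0, 0.0], [1.0, -1.0])

# Good response (positive advantage) made more likely: the gain is capped.
good_response = ppo_clipped_objective([1.0], [0.0], [1.0])

# Bad response (negative advantage) made more likely: the full penalty applies.
bad_response = ppo_clipped_objective([1.0], [0.0], [-1.0])
```

A positive advantage rewards raising the probability of a response, but only up to the clipping bound; a negative advantage penalizes it without that cap.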
Training ChatGPT
https://arxiv.org/pdf/2206.07682.pdf
https://ai.googleblog.com/2022/04/pathways-language-model-palm-scaling-to.html
https://arxiv.org/pdf/2204.02311.pdf
In an interview on CBS’s 60 Minutes on April 16:
• Google CEO Sundar Pichai confirmed that there are
elements of how AI systems learn and behave that
still surprise experts: “There is an aspect of this
which we call— all of us in the field call it as a ‘black
box’. You don’t fully understand. And you can’t quite
tell why it said this.” Pichai said the company has
“some ideas” why this could be the case, but it needs
more research to fully comprehend how it works.
• Asked if Google’s Bard is getting a lot of
“hallucinations,” Pichai responded: “Yes, you know,
which is expected. No one in the field has yet solved
the hallucination problems. All models do have this as
an issue.” The remedy, Pichai said, lies in developing
“more robust safety layers before we build, before we
deploy more capable models.”
What is Prompt Engineering?
Prompt: the instructions and context passed to ChatGPT or another language model to
accomplish a desired task
Prompt Engineering: developing and optimizing prompts for an application
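The definition of a prompt as instructions plus context can be made concrete with a small template. This is only a sketch: the function name, the section labels ("Instruction", "Context", "Question", "Answer"), and the sample text are hypothetical, and no model API is called.

```python
def build_prompt(instruction: str, context: str, question: str) -> str:
    """Assemble a prompt from an instruction, supporting context, and a question.

    The labels below are illustrative; prompt engineering consists largely of
    iterating on wording and layout like this for a given application.
    """
    parts = [
        f"Instruction: {instruction.strip()}",
        f"Context: {context.strip()}",
        f"Question: {question.strip()}",
        "Answer:",
    ]
    # Blank lines between sections keep the parts visually distinct.
    return "\n\n".join(parts)


prompt = build_prompt(
    instruction="Answer using only the context. Say 'unknown' if it is not stated.",
    context="PPO includes the reward in the loss used to update the model.",
    question="What does PPO include in the loss?",
)
print(prompt)
```

The resulting string is what would be sent to the model; changing the instruction wording or the amount of context and comparing the answers is the optimization step.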
Chatbot?
Prompt Engineering Links
● Check the basics in the OpenAI best practices article
a. Guides
i. Text completion - learn how to generate or edit text using our models
ii. Code completion - explore prompt engineering for Codex
iii. Fine-tuning - learn how to train a custom model for your use case
iv. Embeddings - learn how to search, classify, and compare text
v. Moderation
b. OpenAI cookbook repo - contains example code and prompts for accomplishing common tasks
with the API, including question answering with embeddings: https://github.com/openai/openai-cookbook/tree/main/examples
c. Community Forum