
How To Fine-Tune Your Models With

OpenAI GPT-3 Technology


Do you want to learn how to train an AI to write high-quality content for your own use
cases?

You're in luck! This E-book will teach you everything you need to know. It's perfect for
anyone who wants to get started with AI writing.

With this E-book, you'll be able to create AI models that write high-quality content tailored to your use case, often better than what generic, untuned tools produce. In addition, you'll be able to take your business to the next level with AI-generated content.

I will keep this as short as possible and offer you as much information in images as
possible.

Here are the tools we are going to need to make this work and train an OpenAI model.

To get started with OpenAI, the first thing you need to do is register for an account. You can do
this by visiting the OpenAI website and clicking on the "Sign Up" button. Once you have
registered, you can access the OpenAI platform and start using its various features.

So, go to: https://openai.com/.

Navigate to the “footer” section of the website and press “Login.” From there, you should be able to log in with Google and follow the steps to complete your account creation.

Or you can just sign up for a new account if you prefer it that way.

Once we have an account, we can freely play around.

Here is a link to the “fine-tuning section”: https://beta.openai.com/docs/guides/fine-tuning

I will teach you everything; however, it’s still good to look around and figure out some
things yourself. It will help in the long run!
The following tool we are going to need is Postman. You can sign up for a free account
here: https://www.postman.com/postman-account/.

Postman allows us to send API requests to OpenAI and fine-tune files without having to
code (neat, isn’t it?).

It should look like that after accessing the link. But, do not worry, it’s free to use up to
certain limits!

Alright, let’s continue.

Our next little helper is this: https://onlinetexttools.com/remove-empty-text-lines.

We will need that to create the datasets for the fine-tuning, which can be used afterward
in OpenAI to produce our content.

This one is also free; just click the link and go to the “chain with…”.

It should look something like the following picture:


Once you’ve pressed the “chain with…”, input “JSON Stringify” and click our available
option.

It should look like this:


How will this help us? An AI model requires a lot of data. This data needs to be in a
specific format so the AI can understand it and learn from it. Converting it into a JSON
string is one way to format it.

Here is the link to the JSON stringify tool: https://onlinetexttools.com/json-stringify-text


(Use this to avoid modifying things yourself, just copy/paste standard text in there)

JSON strings are easy for computers to read and understand, making them the perfect
choice for training an AI model. By converting our text data into JSON strings, we will be
able to teach our AI more effectively and help it learn more quickly. Ultimately, this will
help us create better AI models that can be used for various tasks.
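If you'd rather not depend on the online tool, Python's built-in `json.dumps` does the same job as the "JSON Stringify" step. This is just a small sketch of the idea, with example text of my own:

```python
import json

raw = """Making money online is a great way to earn some extra cash.
One popular option is to make money with YouTube."""

# json.dumps escapes quotes and turns real line breaks into \n,
# which is exactly what the "JSON Stringify" tool does for us.
stringified = json.dumps(raw)
print(stringified)
```

The printed result is safe to paste into a one-line JSONL record.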

We will get into this later on. But, first, let’s jump to the central part: creating our Postman
workspace and preparing for fine-tuning.

It’s good to have all the tools ready, so let’s download one more that will help us later.

That is Visual Studio Code: https://code.visualstudio.com/. Installing this will enable you
to write the JSON strings later, which are used to fine-tune.

Simply press the download button; the process is straightforward. You just need to install it,
and I will teach you later how to use it and what we need from it.
In the next section, we will learn about Postman and how to configure it for OpenAI
fine-tuning.

Postman - Creating our first workspace.

I will try to keep this as simple as possible, so you won’t have to spend too much time
reading. Images for the win!

Click on the “Create workspace.”

After that, the page you are redirected to is the workspace creation, which will require
you to input a “name” and a “summary,” basically a description of your project.

That will help you remember what the space is about.

The following picture should help you out:


After creating our environment in Postman, what follows is reading a little of OpenAI’s
documentation. This will help you follow along with this guide.

Link: https://beta.openai.com/docs/api-reference/files

Our first job is to upload a file to OpenAI using the API (that file will contain our datasets
later on). For this example, we shall use dummy content.

Great!

After having a workspace, we need to create a collection in Postman.

Check the image below for reference:


You can name the collection anything you want; for this purpose, I will do “tuning boss” ;).

Not a big deal. Then just press “enter,” and it should save just fine.
Now we want to add our first “request.”

Press “add a request.”

If you read anything on the OpenAI documentation, you know the URL and the type of
request we need to make.

If you did not, here it is:

https://api.openai.com/v1/files

That is the endpoint for uploading a file to OpenAI with the API. The request should be made using “POST.”

Here’s how it should look:
As you can see, the link is added under the “POST.” So just copy/paste it in the free
space.

You can also rename the request, so we know what it does; I will use “Upload file to
OpenAI.”

The example below will suffice. I am positive you can do this without much hassle.
Getting our API key from OpenAI

For all of this to work, we will need the API key from the OpenAI account. Don’t worry:
OpenAI grants you $18 of free credit, so the key is also available.

It can be found under Personal > View API keys.

After clicking that, you should be able to create or use an existing Key provided by
OpenAI.
Afterward, it’s just a matter of copying/pasting the key.

Configuring Postman to upload our dataset file

We want to create our upload file request in Postman. Therefore, we need to set up the
environment for it first.

Under the “Upload file to OpenAI” request we’ve just created, go to > Headers. Do not bother
yourself with what Postman has there by default; we will add our own entries.

Below the default entries (under the “Connection” row), add “Authorization” as a key.

It should look like this:

Great. Now on the right side of it, we will add our API key from OpenAI.

Go to OpenAI and the API keys and copy or create a new secret key to use in our
process.

Before adding it, follow this format:


Bearer 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

That’s how it should look for you, where

“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” is the OpenAI secret
key.

Please do not try using mine; it won’t work :).

We are done with configuring the Headers. We now need to jump into the “Body” section,
set a purpose, and attach our dataset file.

So under the “Body,” select “form-data.”

This will help us complete the process - please check the image below for
reference and help.
Now that you’ve found it, the next step is to add some keys and values that will
correspond to OpenAI.

Firstly we want to add a “purpose,” then a “file,” which will be our dataset file as already
specified.
Please ensure that the “file” is set to “file” and not to text, as this will differ and will not
allow you to upload/select anything.

Remember that OpenAI accepts only the .JSONL type of file when it comes to fine-tuning, so I
will give you my dummy data for this purpose.

It will also help you have an overview of the dataset.

Link to download: https://www.mediafire.com/file/119voo44qurj1j9/paragraphs.jsonl/file

We will use Visual Studio Code to open that file, so do not worry for now if you have
not installed it. However, I recommend installing it, as we’ll need it soon!

If you’ve downloaded the file, all that is left is to upload it. Press “select files” and just
choose the paragraphs.jsonl -

That’s what our final result should look like. While we are at it, do not forget to save your
progress in Postman. Just use “CONTROL + S” on your keyboard. That will keep it.

Our last step is to send the request and see if the answer is positive.
Press the “send” button.

That is the result we are looking for. OpenAI documentation shows us what we need to
see to recognize success. However, sometimes Postman glitches and the requests
do not send. Worry not. Just reload the site and log in again.
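If you ever want to script this upload step instead of clicking through Postman, here is a sketch of the same request using only Python's standard library. The key and the dataset line are placeholders of my own, and the request is built but deliberately not sent:

```python
import json
import urllib.request
import uuid

def build_multipart(purpose, file_name, file_bytes, boundary=None):
    """Hand-roll a multipart/form-data body (what Postman's form-data tab does)."""
    boundary = boundary or uuid.uuid4().hex
    body = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="purpose"\r\n\r\n'
        f"{purpose}\r\n"
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{file_name}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode() + file_bytes + f"\r\n--{boundary}--\r\n".encode()
    return boundary, body

# One dummy dataset line, like the ones paragraphs.jsonl contains.
dataset = json.dumps(
    {"prompt": "Topic: cats\nParagraph:", "completion": " Cats make great pets."}
).encode() + b"\n"

boundary, body = build_multipart("fine-tune", "paragraphs.jsonl", dataset)
req = urllib.request.Request(
    "https://api.openai.com/v1/files",
    data=body,
    method="POST",
    headers={
        "Authorization": "Bearer sk-YOUR-SECRET-KEY",  # placeholder, use your own
        "Content-Type": f"multipart/form-data; boundary={boundary}",
    },
)
# urllib.request.urlopen(req)  # uncomment to actually send the upload
```

This is the same request Postman builds for you; the only moving parts are the purpose, the file, and the Authorization header.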
Making the fine-tuning process in Postman

Now that we are done with the “file upload” and learned how to upload our file to OpenAI through
API request, things are becoming easier.

We will focus on creating the process that fine-tunes the file/dataset we make. So let’s jump right
into it!

Let’s go back a step and create a new request in Postman. We shall call this a “fine-tune
process.”

That will also have to be a “POST” request.

Use this: https://api.openai.com/v1/fine-tunes.

It can also be found here: https://beta.openai.com/docs/api-reference/fine-tunes/create.

Just input the link in the POST request as we did with the “file upload.”

It should be like that.

Afterward, we are going to do things similarly.


Go to > Headers > Key

Content-Type and under the value add application/json

Authorization > Bearer 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

It’s the same process of adding our API key from OpenAI again.

The final result should look like this one.

Now we are going to press on the “Body” section and go to > “raw.”

That is where you are going to add the settings for fine-tuning. Here are my current ones:

{
"training_file": " ",
"model": "davinci",
"suffix": "bgdtune"
}
Suffixes can be anything you want, from your name to your favorite pet :). So we will use that later on!
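For reference, the same fine-tune request can be sketched outside Postman with Python's standard library. The training file ID and secret key below are placeholders you would swap for your own values:

```python
import json
import urllib.request

payload = {
    "training_file": "file-GnQokZYstWYTzjg0IOlmrKcG",  # the ID from the upload step
    "model": "davinci",
    "suffix": "bgdtune",  # any suffix you like
}
req = urllib.request.Request(
    "https://api.openai.com/v1/fine-tunes",
    data=json.dumps(payload).encode(),
    method="POST",
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer sk-YOUR-SECRET-KEY",  # placeholder
    },
)
# urllib.request.urlopen(req)  # uncomment to actually start the fine-tune
```

Note that the body here is exactly the “raw” JSON you typed into Postman.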

Hopefully, you have followed that correctly, and that’s the result.

Now you may be wondering where to get the “training_file” ID. That’s simple; let’s return to our
“upload file” request and get it!
If you did not find it, press the “send” button to forward the request to OpenAI. Finally, let’s press
CONTROL + S to save our current progress in Postman.

Now that we are almost done with this, while you are on the “fine tune process” request, go to the
“send” button and click it.

That will send the file to OpenAI and start training it.

Your answer should look like this:

If you see this, then you are set. The file has been successfully sent to OpenAI, and it’s being
trained.

Congrats, seriously!

You made it this far, pat yourself on the shoulder, and let’s move to the next big part!
Retrieve our fine-tuned info from OpenAI

This will not be as complicated as the others, I promise. All you need is the API key from the
OpenAI account.

Create a new request in Postman and name it “Info on a specific fine-tune.”

Remember, this one is a “GET” request. Do NOT make it “POST.”

URL to add on the request: https://api.openai.com/v1/fine-tunes/{fine_tune_id}, where
“fine_tune_id” is your fine-tune file ID.

You can get that from the “fine tune process” request. After pressing send and getting the
response, the ID is listed right at the beginning.
As in the picture above, that’s how it looks. Now replace the (File_ID) with the real one.
That’s our final desired output, with the “Authorization” header and the API key added. DO NOT
FORGET THAT!

We have to press “send,” as usual :).

Result:

{
  "object": "fine-tune",
  "id": "ft-puPlI5XhwrzD0lvW0YfMH6Op",
  "hyperparams": {
    "n_epochs": 4,
    "batch_size": 1,
    "prompt_loss_weight": 0.1,
    "learning_rate_multiplier": 0.1
  },
  "organization_id": "org-Ygwg5D3pYmJxnsfIU6OUTd8D",
  "model": "davinci",
  "training_files": [
    {
      "object": "file",
      "id": "file-GnQokZYstWYTzjg0IOlmrKcG",
      "purpose": "fine-tune",
      "filename": "paragraphs.jsonl",
      "bytes": 41901,
      "created_at": 1658510960,
      "status": "processed",
      "status_details": null
    }
  ],
  "validation_files": [],
  "result_files": [],
  "created_at": 1658511065,
  "updated_at": 1658511070,
  "status": "pending",
  "fine_tuned_model": null,
  "events": [
    {
      "object": "fine-tune-event",
      "level": "info",
      "message": "Created fine-tune: ft-puPlI5XhwrzD0lvW0YfMH6Op",
      "created_at": 1658511065
    },
    {
      "object": "fine-tune-event",
      "level": "info",
      "message": "Fine-tune costs $1.05",
      "created_at": 1658511069
    },
    {
      "object": "fine-tune-event",
      "level": "info",
      "message": "Fine-tune enqueued. Queue number: 3",
      "created_at": 1658511070
    }
  ]
}

That’s what yours should look like as well. Which means? Yes, you guessed it! Our file is
being fine-tuned and is almost ready!

I did not add an image, as the response is too long to be visible. The fine-tuned model will be
available to us in the “Playground” of OpenAI. That takes some time. Worry not, it will be there
:).
Getting a list of fine-tunes info from OpenAI

Let’s say you have more files being fine-tuned right now. What are you going to do? Checking
each of them one at a time may be boring, so there’s an option to list them all.

Create a request in Postman as “GET” with the URL: https://api.openai.com/v1/fine-tunes

You can name it something like “List fine-tunes.”

In the “Headers,” add “Authorization” and the “API KEY.” - Then just press “send.”

Our final output should be a “200 OK,” and the response from the OpenAI shares details of your
fine-tuned data.

That will list all the fine tunes you have (If you have many).
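Both of these “GET” requests have the same shape, so if you are scripting instead of using Postman, a tiny helper covers them. This is an illustrative sketch of my own, not an official client, and the key is a placeholder:

```python
import urllib.request

API_BASE = "https://api.openai.com/v1"

def authorized_get(path, api_key):
    """Build an authorized GET request for an OpenAI endpoint."""
    return urllib.request.Request(
        f"{API_BASE}{path}",
        headers={"Authorization": f"Bearer {api_key}"},
    )

# Info on one specific fine-tune:
one = authorized_get("/fine-tunes/ft-puPlI5XhwrzD0lvW0YfMH6Op", "sk-YOUR-SECRET-KEY")
# List every fine-tune on the account:
all_jobs = authorized_get("/fine-tunes", "sk-YOUR-SECRET-KEY")
# urllib.request.urlopen(one)  # uncomment to actually send
```

The only thing that changes between "info on one" and "list all" is the path.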

This is it. Now you just wait for your file to be processed by OpenAI and appear in the
Playground.

Here is what it will look like:


Your trained file will appear there when it is ready. Just click on it and expand:

Final Task - Creating a dataset for our use case!


Oh boy, here we go! This is undoubtedly the most challenging part of training the AI to write for
us. It takes time and research.

Here is where you are going to install Visual Studio Code. If that’s too daunting, then here is a
video that you can follow: How to install Visual Studio Code on Windows 10/11 [ 2022 Upd…

After you’ve installed Visual Studio Code, we are ready to go forward and create our datasets.
Why? So we can submit them to OpenAI, as we’ve learned, and train our model.

Automated data entry can be inaccurate and lead to errors in your datasets.

By manually building your datasets, you can ensure that the data is accurate and reliable. In
addition, you have complete control over the prompts and completions, so you can customize
them to match your specific needs.
That is what Visual Studio Code looks like on the inside. If you successfully installed it, then
this is what you'll see.

Now, do you remember I gave you a file paragraphs.JSONL to download while we were
uploading dummy data to OpenAI?

If you still have it, we are going to use it here. If not, just look for the link above at the beginning
of this E-book. The link is there!

That file contains some data from my “paragraph” generator. It’s just a tiny part to get you
accustomed to the datasets.

In Visual Studio Code -> go to -> File -> Open File -> open paragraphs.JSONL

The inside should look just like this:

Look at the image below - >


Weird, right? I know. At first, I was shaking too. What is this? How am I supposed to do anything
with this?

Here is some OpenAI documentation about it:


https://beta.openai.com/docs/guides/fine-tuning/preparing-your-dataset

I assure you, it’s not super-duper-complicated once you get the hang of it.

What is a completion? The completion is how you train the AI: it is your desired output, which
the AI will parse and whose pattern it will follow.

Here is what a dataset should look like:

{"prompt":"Company: BHFF insurance\nProduct: allround insurance\nAd:One stop shop for all your insurance needs!\nSupported:", "completion":" yes"}
{"prompt":"Company: Loft conversion specialists\nProduct: -\nAd:Straight teeth in weeks!\nSupported:", "completion":" no"}

Remember the program that converts text to a JSON string? It’s linked in the “tools” section
above. That's your lifesaver. Check it out and keep it ready, because we have a storm coming.

As you can see, datasets are formed from a “prompt” and the “completion,” which means “ideal
text generated” - what you want the generated text to look like. So that’s what you are training
the AI on.
Let’s break it down in an example with my paragraph generator. For that, I have used Jasper.AI
to form our Prompt + Completion. Look at this ->
{"prompt": "How to make money online with YouTube\nKeywords: Make money
online, Youtube\nTone of voice:Informative\nParagraph:", "completion":"
Making money online is a great way to earn some extra cash, and there are
plenty of ways to do it. One popular option is to make money with YouTube.
You can make money on YouTube by creating and publishing videos that
generate advertising revenue, or by partnering with existing channels and
companies as an influencer or product promoter. To get started, you'll
need to create a channel and start building up a following. Once you have
a sizable audience, you can begin pursuing opportunities to make money.
There are a few different ways to make money on YouTube, so be sure to
research the options and find the one that best suits your needs. With a
little effort, you can start making money with YouTube in no time."}

Take a look at it. See how it’s formed?

This is what the prompt would look like in an AI writer:

What is your paragraph about: How to make money online with YouTube
Keywords: Make money online
The tone of voice: Informative
Paragraph: *Here is where the AI will generate your desired output after being trained*

Starting to understand it?

This is what a basic template (without the prompt and completion filled in) looks like:

{"prompt":"*this space is reserved for the prompt input* \nKeywords: \nTone of voice:\nParagraph:", "completion":" "}

Breaking that down: we are telling the AI what the paragraph is about, the keywords we
want it to follow, the tone of voice, and, at the end, the desired output (completion).
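To make that concrete, here is a small sketch that fills in the paragraph-generator template for each new training example. The template mirrors the format shown above; the helper name is my own:

```python
import json

# Mirrors the paragraph-generator prompt format used above.
PROMPT_TEMPLATE = "{topic}\nKeywords: {keywords}\nTone of voice:{tone}\nParagraph:"

def make_record(topic, keywords, tone, ideal_paragraph):
    """Return one JSONL line: the filled-in prompt plus its completion."""
    prompt = PROMPT_TEMPLATE.format(topic=topic, keywords=keywords, tone=tone)
    # Completions should start with a space, per OpenAI's dataset guide.
    return json.dumps({"prompt": prompt, "completion": " " + ideal_paragraph.lstrip()})

line = make_record(
    "How to make money online with YouTube",
    "Make money online, Youtube",
    "Informative",
    "Making money online is a great way to earn some extra cash...",
)
print(line)
```

Each call gives you one ready-made line for your .JSONL dataset.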

I know this may be a little overwhelming, but trust me, it takes you a little time to get used to it.
However, once your brain has adjusted to the situation, it is just a matter of time until you create
insane things like “paragraph generator,” “AIDA Marketing,” “Article generator,” etc.

And since my model is done training, here is what it will look like in the OpenAI’s playground.
As you can see, your trained model will appear under “Fine-Tunes”. All you have to do is
follow the same structure you used in the training file.

For our example with the “paragraph generator,” it will be this:


Isn’t that lovely? High-quality content in a matter of seconds! And the file used was tiny to train
the AI. Imagine what you can do with really GOOD datasets. BOOM! Explode! Create!

Creating a custom dataset

Now that you have an excellent idea of what it will be like, creating a dataset based on your use
case is not that complicated.

If you follow the patterns in the dummy file I’ve given you, nothing is stopping you. However, for
the sake of this guide, I will show you one example.

Let’s say you want an AIDA generator (Attention, Interest, Desire, Action) for that classic
marketing tactic. So that’s what we are going to build.

For this example, we will reverse-engineer the Jasper AI AIDA prompt.

Here is what the dataset will look like:

{"prompt": "Company: Jasper\nProduct Description: Jasper AI is an AI
writing assistant that helps you write content faster, better, and
easier.\nTone of voice: engaging\nAIDA Generated:", "completion":
"Attention: Feeling overwhelmed by the thought of writing a blog post,
article, or even an email?\n\nInterest: Jasper AI is here to help! This AI
assistant will take care of all the tedious work for you so that you can
focus on what’s important – creating content that engages your
audience.\n\nDesire: Writing doesn’t have to be hard. With Jasper AI, you
can write like a pro in no time. Plus, our software is constantly learning
and evolving so your content will always be fresh and original.\n\nAction:
Sign up for a free trial today and see how Jasper AI can help you write
better content faster!"}

Breaking that down for you:

Company: Jasper
Product Description: Jasper AI is an AI writing assistant that helps you write content faster,
better, and easier.
Tone of voice: engaging
AIDA Generated(Completion): “Attention: Feeling overwhelmed by the thought of writing a blog
post, article, or even an email?

Interest: Jasper AI is here to help! This AI assistant will take care of all the tedious work for you
so that you can focus on what’s important – creating content that engages your audience.

Desire: Writing doesn’t have to be complicated. With Jasper AI, you can write like a pro in no
time. Plus, our software is constantly learning and evolving so your content will always be fresh
and original.

Action: Sign up for a free trial today and see how Jasper AI can help you write better content
faster!”

That’s what Jasper generated using its AIDA framework. All you have to do is take output like
this and reverse-engineer it by training your own AI on it. Simple enough?

All I did with the text was run it through the tool that converts text to JSON strings (see the
tools above) and input it into the “completion.” BAM! One part of the fine-tuning is done. There
are 499 more to come :).

Here is how OpenAI tells you to format datasets:

{"prompt": "<prompt text>", "completion": "<ideal generated text>"}
{"prompt": "<prompt text>", "completion": "<ideal generated text>"}
{"prompt": "<prompt text>", "completion": "<ideal generated text>"}

Get it? It is that simple.
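Once you collect your prompt/completion pairs in a list, a few lines of Python can emit the whole training file in exactly this format. This is a sketch; the helper name and output file name are my own:

```python
import json

def to_jsonl(examples):
    """Turn (prompt, completion) pairs into fine-tuning JSONL text."""
    lines = []
    for prompt, completion in examples:
        if not completion.startswith(" "):
            completion = " " + completion  # completions should start with a space
        lines.append(json.dumps({"prompt": prompt, "completion": completion}))
    return "\n".join(lines) + "\n"

examples = [
    ("Company: BHFF insurance\nProduct: allround insurance\nAd:One stop shop for all your insurance needs!\nSupported:", "yes"),
    ("Company: Loft conversion specialists\nProduct: -\nAd:Straight teeth in weeks!\nSupported:", "no"),
]
with open("dataset.jsonl", "w", encoding="utf-8") as f:
    f.write(to_jsonl(examples))
```

The resulting dataset.jsonl is ready to upload through the "file upload" request we built earlier.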

Let’s say you want the AI to write product descriptions. Watch how I turn one of the above strings
into a prompt to train the AI.

You can start by asking yourself: “How do I want to format this?”

So we always have {"prompt": "the prompt text"}, followed by the completion, your ideal text
to train the AI on: {"completion": "here you input high-quality training content"}.

Does this sound easy? I hope so, and it is when you’re getting into it. Let’s jump back to the
product description example.

What do we want? To fine-tune our AI for writing quality product descriptions.

{"prompt": "Product: Google Ads \nDescription: Google Ads are the best way of advertising online.\nFinal Output:", "completion": "Introducing Google Ads. The right and the best way of advertising your product online on autopilot."}
This is how this would look in a prompt:

Product:
Write a short product description:
Final output (completion):

Then the AI would follow the “completion” given in the training to write a completely new and
high-quality one related to the product and product description input by the user.

-> that “\n” stands for “new line,” so basically, it helps the AI form an idea about the training
format we’re using.

->completion is the ideal text generated we are training the AI with. So you’re feeding one
example of a high-quality product description, and the AI will follow the pattern and formatting for
creating new ones.

->Product and Description represent precisely what they mean—teaching the AI how to format
this. First, we tell it that we need a product and a description. Then the user inputs a short
description of the product and gets to the “Completion,” which generates high-quality content.

Pay attention(Update!):

Here is a dataset for blog introductions:

-> https://www.mediafire.com/file/kya7ns5bspmzasw/Blog_Introductions.jsonl/file

Here is a dataset to create an “article generator.” Just follow the given examples for this
one, and you should be fine!

-> https://www.mediafire.com/file/1buyu1ybn1zbx3u/Testsv2.jsonl/file

As long as you follow these examples + the one I’ve given above (paragraph generator),
you should be able to create anything!

You can use this formatting model to train the AI on literally anything. I hope that was clear
enough and easy enough because it is the best way you can do this.

OpenAI said, “For a high-quality fine-tuning session, 500 or more completions are required.”
Urgh, that will take some time. But is it worth it? Completely, yes.

What’s left? Create a dataset using Visual studio code and the examples I gave you. Then, feed
it real examples from Jasper or Google articles. Do not worry; it won’t copy/paste and create
plagiarised content. The AI will only follow the pattern.
You can have a tool of your own in no time with custom-trained models that generate better and
faster outputs at a fraction of the cost. :)

I really hope this E-book was a solution for your needs.

Tips And Tricks On Datasets


Here are a few simple tips:

● First, keep your data training examples simple, and do not crowd your prompts.

● Make sure you follow the samples correctly.

● Always have the file in the .JSONL format (you probably know that already; just remember!)

● If the use cases are the same (it can be any use case), you can train it across multiple
niches. However, if your project is too big and requires many training examples, you
should narrow the scope.

● 500 lines are enough for small frameworks. (Up to 100-150 words outputs).

● Anything that requires quality outputs should be fine-tuned with more than 1,000 lines.

● Creating things like "Article generators" requires a lot of training examples. A LOT.

● That's why I suggest creating a dataset and uploading it to AI21 Labs (another AI tech like
OpenAI); they do better with longer-form outputs.
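Before uploading a big dataset, it's worth sanity-checking the file. Here's a small, illustrative validator (a helper of my own, not an official tool) that catches broken lines early:

```python
import json

def validate_jsonl(text):
    """Return a list of problems found in fine-tuning JSONL text."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            problems.append(f"line {number}: empty line")
            continue
        try:
            row = json.loads(line)
        except json.JSONDecodeError:
            problems.append(f"line {number}: not valid JSON")
            continue
        # Every fine-tuning record needs exactly these two keys.
        for key in ("prompt", "completion"):
            if key not in row:
                problems.append(f"line {number}: missing '{key}'")
    return problems

good = '{"prompt": "Topic: cats\\nParagraph:", "completion": " Cats are great."}'
print(validate_jsonl(good))   # an empty list: no problems
print(validate_jsonl("oops")) # flags the broken line
```

Run it on your file's contents before the upload step; an empty list means you're good to go.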

What exactly does Fine-tuning mean?

When we say “fine-tuning,” we are not teaching the AI new knowledge from scratch. Please
remember that it has already been trained on a massive amount of internet data!

When we fine-tune, we actually just do this: Hey, AI! So please take these instructions and
examples and write for me that way.

Fine-tuning = Narrowing it down to your specific use case.


Creating a dataset for your use case is literally you deciding on what you want to make.

Want a paragraph generator?

Here is an example: {"prompt":"Topic: making money\nBlog Paragraph: again, nothing comes in here. Just a separator", "completion":" your ideal paragraph that contains that topic"}
{"prompt":"Topic: \nBlog Paragraph:", "completion":" "} -> empty example

Want an AIDA framework?

{"prompt":"Topic: making money\nAIDA: again, nothing comes in here. Just a separator", "completion":" your ideal AIDA framework"}
{"prompt":"Topic: \nAIDA:", "completion":" "} -> empty example

It's pretty simple, as long as you know what you want.

Here are two other use-cases JSONL files. The AIDA and PAS frameworks are waiting for you to
be filled and fine-tuned!

The AIDA framework:

Download: https://www.mediafire.com/file/hjskdic99d0e6d0/AIDA.jsonl/file

The PAS framework:

Download: https://www.mediafire.com/file/ai3dfs02tpsri96/PAS.jsonl/file

These two files are also DIY (do it yourself). Just follow the given instructions to fill the files
correctly and get your own frameworks!

# Stay tuned; more updates to come in the future!

# New section! Updated 9/11/2022.

Google Ads Headlines:

Download: https://www.mediafire.com/file/c3zfzybzsk5270j/BHWgoo.jsonl/file

Blog Outlines Generator:

Download: https://www.mediafire.com/file/v6wmdj2g6gb6a3w/BHWout.jsonl/file
Blog Conclusion Paragraphs:

Download: https://www.mediafire.com/file/z5ztgjptbz9qy4b/BHWconc.jsonl/file

Product Descriptions:

Download: https://www.mediafire.com/file/hrzpcmaqocqcy8w/BHWprd.jsonl/file

AI21 Labs And How It Works


Many of you have asked what exactly “AI21 labs” is and how it compares to OpenAI.

Long story short: AI21 Labs is another AI tech that competes with OpenAI on text
generation (completion). They have been in the market for a decent amount of time,
and I think their technology is really great.

We’ve run some tests. Some of them were great, but some of them just failed to
deliver.

You can sign up for AI21 Labs here: https://www.ai21.com/.

They give you $90 in free credits! Amazing. You actually get to test their stuff
properly, and even fine-tune some models.

You can find their pricing here: https://studio.ai21.com/pricing.

I will now show you how it looks on the inside. Well, as you would imagine,
nothing is out of the ordinary. However, they do give you access to some
pre-made frameworks and get to play around.

You can even press the “new” button and create your own framework in no time at
all. Of course, that is not precisely fine-tuning, but it’s pretty cool to play around
until you get used to things.
How to fine-tune with AI21

Compared to OpenAI, which has an entire learning curve, AI21 comes in with a simple
design.
Let’s just look above and see what we find :D.

Yep, as simple as that. Just navigate to the “datasets” section and press the “upload”
button.

That will prompt you to select the .JSONL file (dataset) you want to upload.

Then you can see live how your fine-tuning is going by pressing “Models.”
Once everything is done, you will find a model similar to OpenAI in the Playground.

Take a look here:

All that is left is to select the model you’ve fine-tuned. Since you will give it a name, it’s
very easy to find.

And use it in the playground or take the “code” out to make an application.
It’s that simple with AI 21. Thank you for taking the time to read and learn.

How do I get information to feed the AI, and what is my process?

Do you know what’s worse than not being able to fine-tune? Yep, that’s right: not knowing
how to find the correct information and how to place it within your dataset.

All the datasets I’ve shared contain a prompt and completion; however, they are made
for you to fill with information. Otherwise, you can’t exactly learn the proper process and
how to find that info.

This is the best way to do everything, and it’s pretty easy.

Go to google.com (wow, complicated, thanks), and let’s say you want to fill in the product
description generator dataset linked a little above this section.

Yes? Ok.

And let’s search for “Best TVs 2022”. Of course, there are many results, and it’s confusing
which information you need to pull from there. Here is the situation where most people get
stuck.

Personally, I am picky. I hate taking information that I wouldn’t like on my website just
because I need to fine-tune something.

So here’s what I do:

- Find a website that ranks on the first/second/third page. (that's enough)


- Find all their “top 10 / best” content.
- Copy/paste the sections I love, the product, and the model on which I’d like to
train the AI.
Now let’s jump to an image example.
Let’s say we pick that one: https://www.digitaltrends.com/home-theater/best-tvs/.

Click on it, and let us begin extracting the correct information.

That’s the section with Cons and Pros, right?

What I think: Do I want my website to have the product's pros and cons? Definitely! So
what happens? I take them with me.

Now below, you have the product description. Do I like that? Most certainly. Then what? I
also pull that with me.
Here is how I would format this in a dataset:

{"prompt": "Product Name: 65-inch LG G2\nReview:", "completion": "Year after year, LG
continues to deliver the most impressive OLED TVs, and somehow, they continue to one-up
themselves when it comes to picture quality, design, and overall performance — and 2022 is no
exception. Whether you're a film fanatic, sports buff, or heavy-duty gamer, the LG G2 is the TV to
buy if you want a TV that can do it all.\n\nAvailable in four sizes (55-, 65-, 75-, and 83-inches),
the G2 OLED is the latest model in LG's Gallery Series, with the \"Gallery\" moniker referring to
the fact that this TV is designed to look like a wall-mounted painting. In fact, LG is so determined
to get the G2 on your sheetrock that you won't even find a pedestal or set of feet in the box (you
can purchase the stand separately).\n\nFrom regular cable TV channels to jaw-dropping 4K HDR
content, you're not going to beat the picture you'll get with the G2. Blacks are inky and deep,
colors are rich and abundant, and thanks to the Evo panel, this TV gets nice and bright, too. On
top of four HDMI 2.1 inputs, AMD FreeSync, NVIDIA G-Sync, and VRR support for all your
next-gen gaming needs, along with support for HDR10, Dolby Vision IQ, and HLG formats, the
G2 is also loaded with the best version of LG's WebOS system to date. Simply put: your Netflix
and Hulu streaming has never looked and felt so good.\n\nSure, OLED TVs don't always deliver
the brightest picture, and if your TV gets a lot of use during daylight hours, something like the
Samsung S95B may be a better overall fit. But we're betting that for most viewers, the brilliance
and clarity delivered by the LG G2 will be plenty.\n\nPROS:\n\nClass-leading
brightness\nOutstanding color accuracy\nPerfect black levels\nTop choice for gamers\nTop-notch
design\n\nCONS:\n\nDisappointing stand-mount option"}

Take a very close look. What do you notice?


We have the following format:

- A product name: 65-inch LG G2
- A section that separates the prompt from the completion, which is always left
empty for the AI to fill in (“Review:”)
- Then we have the completion, which contains the full product description taken
from that website.
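If you prefer to build these lines programmatically instead of hand-escaping quotes and newlines, Python’s `json` module can produce them for you. This is just a sketch under my own assumptions: the file name and the shortened review text are placeholders.

```python
import json

# Hypothetical, shortened example: one scraped product turned into a
# prompt/completion pair. json.dumps handles the quote and newline
# escaping that the dataset format requires.
record = {
    "prompt": "Product Name: 65-inch LG G2\nReview:",
    "completion": "Year after year, LG continues to deliver the most "
                  "impressive OLED TVs...\n\nPROS:\n\nClass-leading "
                  "brightness\n\nCONS:\n\nDisappointing stand-mount option",
}

line = json.dumps(record)

# Append the line to a .jsonl file -- one JSON object per line.
with open("product_reviews.jsonl", "a", encoding="utf-8") as f:
    f.write(line + "\n")
```

Each call appends exactly one line, so running this once per product gives you a valid JSONL dataset.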

What will this achieve for you?

- You will have the AI write content similar to what ranks on the first page of Google.
- You can have the AI write factual content in OpenAI’s Playground. (Oh, but
how?)

Here is how you can format that trained model in the Playground to keep it factual:

I am pretty sure this is easy to comprehend, and you already understand what this is and
how it works.
Short tip: You can add additional prompts in the playground after fine-tuning a model. I
know it’s mentioned above within the “Tips & tricks” section, but I will repeat it.

Keeping the dataset simple is very important. You can tweak everything afterward!

If you look closely, you will notice that I’ve included the “Pros and Cons” in that example,
so the AI will know exactly what I want and where I want it.

Ok, I will give you another example. Let’s take the “Blog outline generator” dataset and
try to fill in the information.

Here is a line from it:

{"prompt": "Blog Topic: Insert a topic (example: How to make money online)\nBlog
outline:", "completion": "Here paste your favorite outlines that you can find. Just find
them on Google."}

How would I turn this into a ready-to-tune line? Go on Google and search for a random
blog topic/idea. In this case, I will take “How to book a plane ticket.”

You can use this tool to stringify your text: https://onlinetexttools.com/json-stringify-text
(it’s also presented at the beginning of the book, in the “Tools” section).
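If you’d rather not rely on an online tool, the same stringifying can be done locally with Python’s `json` module. The text below is a made-up snippet just to show the escaping:

```python
import json

# Any raw text with quotes and newlines...
raw = 'She said: "Book early!"\nCheck deal sites.'

# ...becomes a safely escaped JSON string, ready to paste
# into a "prompt" or "completion" field.
print(json.dumps(raw))
# -> "She said: \"Book early!\"\nCheck deal sites."
```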

Now let’s say we found an article: https://www.wikihow.com/Book-an-Airline-Ticket.

Great! It does not precisely have an outline, but it has different points that look similar.

So we’ll use that!

{"prompt": "Blog Topic: How to book a plane ticket\nBlog outline:", "completion": " Look
for flights in advance.\n\nCheck airfare deal websites.\n\nEnter your trip details on an
aggregator site.\n\nChoose how many stops you want to make.\n\nClick through to the
airline’s website.\n\nChoose your seat.\n\nChoose to make it a package deal or
not.\n\nRequest special accommodations."}

What did I do here?

- We have a topic/idea.
- A section that separates the prompt from the completion (“Blog outline:”)
- Then we have the completion.

Here is what I took:

And I put them in the tool above like this:

Simple, efficient, and fast. You can do it too!
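The same copy-and-join step can be scripted, too. This sketch assumes you’ve pasted the outline points from the article into a Python list (abbreviated here):

```python
import json

# Outline points copied from the article (abbreviated here).
steps = [
    "Look for flights in advance.",
    "Check airfare deal websites.",
    "Enter your trip details on an aggregator site.",
    "Choose your seat.",
]

record = {
    "prompt": "Blog Topic: How to book a plane ticket\nBlog outline:",
    # Leading space, then points separated by blank lines, matching
    # the dataset line shown earlier in this section.
    "completion": " " + "\n\n".join(steps),
}

print(json.dumps(record))
```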

Hopefully, this has been clear enough and did not confuse you!

Now, here is what comes with the new update in the future and what you should
expect:

- New datasets on different frameworks
- Other AI technologies you can use to create content
- AI image generation (I will teach you how to get approved for DALL·E 2) and
others
- Official information on GPT-4 and what’s next
