Video 2: Machine Learning

The rise of AI has been largely driven by one tool in AI called machine learning. In this video, you'll learn what machine learning is, so that by the end you'll hopefully be able to start thinking about how machine learning might be applied to your company or to your industry. The most commonly used type of machine learning is a type of AI that learns A to B, or input to output, mappings, and this is called supervised learning. Let's see some examples. If the input A is an email and the output B you want is whether this email is spam or not (0/1), then this is the core piece of AI used to build a spam filter. Or if the input A is an audio clip and the AI's job is to output the text transcript, then this is speech recognition. More examples: if you want to input English and have it output a different language, Chinese, Spanish, something else, then this is machine translation. Or the most lucrative form of supervised learning, of this type of machine learning, may be online advertising, where all the large online ad platforms have a piece of AI that inputs some information about an ad and some information about you, and tries to figure out, will you click on this ad or not? By showing you the ads you're most likely to click on, this turns out to be very lucrative. Maybe not the most inspiring application, but certainly one having a huge economic impact today. Or if you want to build a self-driving car, one of the key pieces of AI is an AI that takes as input an image, and some information from the radar or from other sensors, and outputs the positions of other cars, so your self-driving car can avoid them. Or take manufacturing. I've actually done a lot of work in manufacturing, where you take as input a picture of something you've just manufactured, such as a picture of a cell phone coming off an assembly line. This is a picture of a phone, not a picture taken by a phone. You want to output: is there a scratch, or a dent, or some other defect on this thing you've just manufactured? This is visual inspection, which is helping manufacturers reduce or prevent defects in the things that they're making.
Supervised learning also lies at the heart of generative AI systems, like ChatGPT and chatbots that generate text. These systems work by learning from huge amounts of text downloaded from the Internet, so that when given a few words as the input, the model can predict the next word that comes after. These models, which are called large language models, or LLMs, generate new text by repeatedly predicting the next word they should output. Given the widespread attention on LLMs, let's look briefly, on the next slide, at how they work in greater detail. Large language models are built by using supervised learning to train a model to repeatedly predict the next word. For example, if an AI system has read on the Internet a sentence like "my favorite drink is lychee bubble tea," then this single sentence would be turned into a lot of A to B data points for the model to learn to predict the next word. Specifically, given this sentence, we now have one data point that says: given the phrase "my favorite drink," what do you predict is the next word? In this case, the right answer is "is." Given "my favorite drink is," what do you predict is the next word? The correct answer is "lychee," and so on, until you have used up all the words in the sentence. This one sentence is turned into multiple inputs A and outputs B for the model to learn: given a few words as input, what is the next word?
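To make that concrete, here is a minimal sketch of how one sentence can be split into such A to B training pairs. This is only an illustration: real LLMs operate on tokens rather than whole words and train on vastly more data.

# Turn one sentence into (input A, output B) next-word prediction examples.
sentence = "my favorite drink is lychee bubble tea"
words = sentence.split()

pairs = []
for i in range(1, len(words)):
    prompt = " ".join(words[:i])  # A: the words seen so far
    next_word = words[i]          # B: the word that should come next
    pairs.append((prompt, next_word))

for a, b in pairs:
    print(f"A: {a!r}  ->  B: {b!r}")
# e.g. A: 'my favorite drink'     ->  B: 'is'
#      A: 'my favorite drink is'  ->  B: 'lychee'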
When you train a very large AI system on a lot of data, say hundreds of billions or even over a trillion words, then you get a large language model like ChatGPT that, given an initial piece of text called a prompt, is very good at generating some additional words in response to that prompt. The description I presented here does omit some technical details, like how the model learns to follow instructions rather than just predict the next word found on the Internet, and how developers make the model less likely to generate inappropriate outputs, such as ones that exhibit discrimination or hand out harmful instructions. If you're interested, you can learn more about these details in the course Generative AI for Everyone. At the heart of LLMs, though, is this technology that learns from a lot of data to predict what the next word is, using supervised learning. In summary, supervised learning just learns input to output, or A to B, mappings. On one hand, input to output, A to B, seems quite limiting. But when you find the right application scenario, this turns out to be incredibly valuable. Now, the idea of supervised learning has been around for many decades, but it's really taken off in the last few years. Why is this? When my friends ask me, "Hey, Andrew, why is supervised learning taking off now?" there's a picture I draw for them, and I want to show you this picture now. You may be able to draw this picture for others that ask you the same question as well.
Let's say on the horizontal axis you plot the amount of data you have for a task. For speech recognition, this might be the amount of audio data and transcripts you have. In a lot of industries, the amount of data you have access to has really grown over the last couple of decades, thanks to the rise of the Internet and the rise of computers. A lot of what used to be, say, pieces of paper is now instead recorded on a digital computer, so we've just been getting more and more data. Now let's say on the vertical axis you plot the performance of an AI system. It turns out that if you use a traditional AI system, then the performance grows like this: as you feed it more data, its performance gets a bit better, but beyond a certain point it doesn't get that much better. It's as if your speech recognition system doesn't get that much more accurate, or your online advertising system doesn't get that much more accurate at showing the most relevant ads, even as you feed it more data. AI has really taken off recently due to the rise of neural networks and deep learning. I'll define these terms more precisely in later videos, so don't worry too much about what they mean for now. But with modern AI, with neural networks and deep learning, what we saw was that if you train a small neural network, then the performance looks like this, where as you feed in more data, performance keeps getting better for much longer. If you train an even slightly larger neural network, say a medium-sized neural net, then the performance may look like that. And if you train a very large neural network, then the performance just keeps on getting better and better.
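Here is a purely illustrative sketch of the picture described above, plotting made-up toy curves of performance versus amount of data for a traditional learning algorithm and for small, medium, and large neural networks; the curve shapes and numbers are my own assumptions, chosen only to convey the qualitative trend.

import numpy as np
import matplotlib.pyplot as plt

data = np.linspace(0, 10, 200)  # amount of data (arbitrary units)
curves = {
    "traditional algorithm": 0.50 * (1 - np.exp(-data / 1.0)),  # plateaus early
    "small neural network":  0.65 * (1 - np.exp(-data / 2.0)),
    "medium neural network": 0.80 * (1 - np.exp(-data / 3.0)),
    "large neural network":  0.95 * (1 - np.exp(-data / 4.0)),  # improves for longer
}

for name, performance in curves.items():
    plt.plot(data, performance, label=name)
plt.xlabel("Amount of data")
plt.ylabel("Performance of the AI system")
plt.legend()
plt.show()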
For applications like speech recognition, online advertising, and building self-driving cars, where having a high-performance, highly accurate system is important, this has enabled these AI systems to get much better, and made, say, speech recognition products much more acceptable to users and much more valuable to companies. Now, here are a couple of implications of this figure. If you want the best possible levels of performance, for your performance to be up here, to hit this level of performance, then you need two things. One is that it really helps to have a lot of data; that's why sometimes you hear about big data. Having more data almost always helps. The second thing is that you want to be able to train a very large neural network. The rise of fast computers, including Moore's law, but also the rise of specialized processors such as graphics processing units, or GPUs, which you'll hear more about in a later video, has enabled many companies, not just the giant tech companies but many other companies, to train large neural nets on a large enough amount of data to get very good performance and drive business value. In fact, it was also this type of scaling, increasing the amount of data and the size of the models, that was instrumental to the recent breakthroughs in training generative AI systems, including the large language models we discussed just now. The most important idea in AI has been machine learning, and specifically supervised learning, which means A to B, or input to output, mappings. What enables it to work really well is data. In the next video, let's take a look at what data is, what data you might already have, and how to think about feeding this into AI systems. Let's go on to the next video.
