Programming Large Language
Models with Azure Open AI:
Conversational programming and
prompt engineering with LLMs

Francesco Esposito
Programming Large Language Models with Azure Open AI:
Conversational programming and prompt engineering with
LLMs
Published with the authorization of Microsoft Corporation by: Pearson
Education, Inc.

Copyright © 2024 by Francesco Esposito.


All rights reserved. This publication is protected by copyright, and
permission must be obtained from the publisher prior to any prohibited
reproduction, storage in a retrieval system, or transmission in any form or by
any means, electronic, mechanical, photocopying, recording, or likewise. For
information regarding permissions, request forms, and the appropriate
contacts within the Pearson Education Global Rights & Permissions
Department, please visit www.pearson.com/permissions.
No patent liability is assumed with respect to the use of the information
contained herein. Although every precaution has been taken in the
preparation of this book, the publisher and author assume no responsibility
for errors or omissions. Nor is any liability assumed for damages resulting
from the use of the information contained herein.
ISBN-13: 978-0-13-828037-6
ISBN-10: 0-13-828037-1
Library of Congress Control Number: 2024931423

Trademarks
Microsoft and the trademarks listed at http://www.microsoft.com on the
“Trademarks” webpage are trademarks of the Microsoft group of companies.
All other marks are property of their respective owners.

Warning and Disclaimer


Every effort has been made to make this book as complete and as accurate as
possible, but no warranty or fitness is implied. The information provided is
on an “as is” basis. The author, the publisher, and Microsoft Corporation
shall have neither liability nor responsibility to any person or entity with
respect to any loss or damages arising from the information contained in this
book or from the use of the programs accompanying it.

Special Sales
For information about buying this title in bulk quantities, or for special sales
opportunities (which may include electronic versions; custom cover designs;
and content particular to your business, training goals, marketing focus, or
branding interests), please contact our corporate sales department at
[email protected] or (800) 382-3419.
For government sales inquiries, please contact
[email protected].
For questions about sales outside the U.S., please contact
[email protected].

Editor-in-Chief
Brett Bartow

Executive Editor
Loretta Yates

Associate Editor
Shourav Bose

Development Editor
Kate Shoup

Managing Editor
Sandra Schroeder

Senior Project Editor
Tracey Croom

Copy Editor
Dan Foster

Indexer
Timothy Wright

Proofreader
Donna E. Mulder

Technical Editor
Dino Esposito

Editorial Assistant
Cindy Teeters

Cover Designer
Twist Creative, Seattle

Compositor
codeMantra

Graphics
codeMantra

Figure Credits
Figure 4.1: LangChain, Inc
Figures 7.1, 7.2, 7.4: Snowflake, Inc
Figure 8.2: SmartBear Software
Figure 8.3: Postman, Inc
Dedication

To I.
Because not dedicating a book to you would have been a sacrilege.
Contents at a Glance

Introduction

CHAPTER 1 The genesis and an analysis of large language models


CHAPTER 2 Core prompt learning techniques
CHAPTER 3 Engineering advanced learning prompts
CHAPTER 4 Mastering language frameworks
CHAPTER 5 Security, privacy, and accuracy concerns
CHAPTER 6 Building a personal assistant
CHAPTER 7 Chat with your data
CHAPTER 8 Conversational UI

Appendix: Inner functioning of LLMs

Index
Contents

Acknowledgments
Introduction

Chapter 1 The genesis and an analysis of large language models


LLMs at a glance
History of LLMs
Functioning basics
Business use cases
Facts of conversational programming
The emerging power of natural language
LLM topology
Future perspective
Summary

Chapter 2 Core prompt learning techniques


What is prompt engineering?
Prompts at a glance
Alternative ways to alter output
Setting up for code execution
Basic techniques
Zero-shot scenarios
Few-shot scenarios
Chain-of-thought scenarios
Fundamental use cases
Chatbots
Translating
LLM limitations
Summary

Chapter 3 Engineering advanced learning prompts


What’s beyond prompt engineering?
Combining pieces
Fine-tuning
Function calling
Homemade-style
OpenAI-style
Talking to (separated) data
Connecting data to LLMs
Embeddings
Vector store
Retrieval augmented generation
Summary

Chapter 4 Mastering language frameworks


The need for an orchestrator
Cross-framework concepts
Points to consider
LangChain
Models, prompt templates, and chains
Agents
Data connection
Microsoft Semantic Kernel
Plug-ins
Data and planners
Microsoft Guidance
Configuration
Main features
Summary

Chapter 5 Security, privacy, and accuracy concerns


Overview
Responsible AI
Red teaming
Abuse and content filtering
Hallucination and performances
Bias and fairness
Security and privacy
Security
Privacy
Evaluation and content filtering
Evaluation
Content filtering
Summary

Chapter 6 Building a personal assistant


Overview of the chatbot web application
Scope
Tech stack
The project
Setting up the LLM
Setting up the project
Integrating the LLM
Possible extensions
Summary

Chapter 7 Chat with your data


Overview
Scope
Tech stack
What is Streamlit?
A brief introduction to Streamlit
Main UI features
Pros and cons in production
The project
Setting up the project and base UI
Data preparation
LLM integration
Progressing further
Retrieval augmented generation versus fine-tuning
Possible extensions
Summary

Chapter 8 Conversational UI
Overview
Scope
Tech stack
The project
Minimal API setup
OpenAPI
LLM integration
Possible extensions
Summary

Appendix: Inner functioning of LLMs

Index
Acknowledgments

In the spring of 2023, when I told my dad how cool Azure OpenAI was
becoming, his reply was kind of a shock: “Why don’t you write a book about
it?” He said it so naturally that it hit me as if he really thought I could do it.
In fact, he added, “Are you up for it?” Then there was no need to say more.
Loretta Yates at Microsoft Press enthusiastically accepted my proposal, and
the story of this book began in June 2023.
AI has been a hot topic for the better part of a decade, but the emergence
of new-generation large language models (LLMs) has propelled it into the
mainstream. The increasing number of people using them translates to more
ideas, more opportunities, and new developments. And this makes all the
difference.
Hence, the book you hold in your hands can’t be the ultimate and
definitive guide to AI and LLMs because the speed at which AI and LLMs
evolve is impressive and because—by design—every book is an act of
approximation, a snapshot of knowledge taken at a specific moment in time.
Approximation inevitably leads to some form of dissatisfaction, and
dissatisfaction leads us to take on new challenges. In this regard, I wish for
myself decades of dissatisfaction. And a few more years of being on the stage
presenting books written for a prestigious publisher—it does wonders for my
ego.
First, I feel somewhat indebted to all my first dates since May because
they had to endure monologues lasting at least 30 minutes on LLMs and
some weird new approach to transformers.
True thanks are a private matter, but publicly I want to thank Martina first,
who cowrote the appendix with me and always knows what to say to make
me better. My gratitude to her is keeping a promise she knows. Thank you,
Martina, for being an extraordinary human being.
To Gianfranco, who taught me the importance of discussing and
expressing, even loudly, when something doesn’t please us, and taught me to
always ask, because the worst thing that can happen is hearing a no. Every
time I engage in a discussion, I will think of you.
I also want to thank Matteo, Luciano, Gabriele, Filippo, Daniele,
Riccardo, Marco, Jacopo, Simone, Francesco, and Alessia, who worked with
me and supported me during my (hopefully not too frequent) crises. I also
have warm thoughts for Alessandro, Antonino, Sara, Andrea, and Cristian
who tolerated me whenever we weren’t like 25-year-old youngsters because I
had to study and work on this book.
To Mom and Michela, who put up with me before the book and probably
will continue after. To my grandmas. To Giorgio, Gaetano, Vito, and Roberto
for helping me to grow every day. To Elio, who taught me how to dress and
see myself in more colors.
As for my dad, Dino, he never stops teaching me new things—for
example, how to get paid for doing things you would just love to do, like
being the technical editor of this book. Thank you, both as a father and as an
editor. You bring to my mind a song you well know: “Figlio, figlio, figlio.”
Beyond Loretta, if this book came to life, it was also because of the hard
work of Shourav, Kate, and Dan. Thank you for your patience and for
trusting me so much.
This book is my best until the next one!
Introduction

This is my third book on artificial intelligence (AI), and the first I wrote on
my own, without the collaboration of a coauthor. The sequence in which my
three books have been published reflects my own learning path, motivated by
a genuine thirst to understand AI for far more than mere business
considerations. The first book, published in 2020, introduced the
mathematical concepts behind machine learning (ML) that make it possible to
classify data and make timely predictions. The second book, which focused
on the Microsoft ML.NET framework, was about concrete applications—in
other words, how to make fancy algorithms work effectively on amounts of
data hiding their complexity behind the charts and tables of a familiar web
front end.
Then came ChatGPT.
The technology behind astonishing applications like ChatGPT is called a
large language model (LLM), and LLMs are the subject of this third book.
LLMs add a crucial capability to AI: the ability to generate content in
addition to classifying and predicting. LLMs represent a paradigm shift,
raising the bar of communication between humans and computers and
opening the floodgates to new applications that for decades we could only
dream of.
And for decades, we did dream of these applications. Literature and
movies presented various supercomputers capable of crunching any sort of
data to produce human-intelligible results. An extremely popular example
was HAL 9000—the computer that governed the spaceship Discovery in the
movie 2001: A Space Odyssey (1968). Another famous one was JARVIS
(Just A Rather Very Intelligent System), the computer that served Tony
Stark’s home assistant in Iron Man and other movies in the Marvel Comics
universe.
Often, all that the human characters in such books and movies do is
simply “load data into the machine,” whether in the form of paper
documents, digital files, or media content. Next, the machine autonomously
figures out the content, learns from it, and communicates back to humans
using natural language. But of course, those supercomputers were conceived
by authors; they were only science fiction. Today, with LLMs, it is possible
to devise and build concrete applications that not only make human–
computer interaction smooth and natural, but also turn the old dream of
simply “loading data into the machine” into a dazzling reality.
This book shows you how to build software applications using the same
type of engine that fuels ChatGPT to autonomously communicate with users
and orchestrate business tasks driven by plain textual prompts. No more, no
less—and as easy and striking as it sounds!

Who should read this book


Software architects, lead developers, and individuals with a background in
programming—particularly those familiar with languages like Python and
possibly C# (for ASP.NET Core)—will find the content in this book
accessible and valuable. In the vast realm of software professionals who
might find the book useful, I’d call out those who have an interest in ML,
especially in the context of LLMs. I’d also list cloud and IT professionals
with an interest in using cloud services (specifically Microsoft Azure) or in
sophisticated, real-world applications of human-like language in software.
While this book focuses primarily on the services available on the Microsoft
Azure platform, the concepts covered are easily applicable to analogous
platforms. At the end of the day, using an LLM involves little more than
calling a bunch of API endpoints, and, by design, APIs are completely
independent of the underlying platform.
In summary, this book caters to a diverse audience, including
programmers, ML enthusiasts, cloud-computing professionals, and those
interested in natural language processing, with a specific emphasis on
leveraging Azure services to program LLMs.
Assumptions

To fully grasp the value of a programming book on LLMs, there are a couple
of prerequisites, including proficiency in foundational programming concepts
and a familiarity with ML fundamentals. Beyond these, a working knowledge
of relevant programming languages and frameworks, such as Python and
possibly ASP.NET Core, is helpful, as is an appreciation for the significance
of classic natural language processing in the context of business domains.
Overall, a blend of programming expertise, ML awareness, and linguistic
understanding is recommended for a comprehensive grasp of the book’s
content.

This book might not be for you if…

This book might not be for you if you’re just seeking a reference book to find
out in detail how to use a particular pattern or framework. Although the book
discusses advanced aspects of popular frameworks (for example, LangChain
and Semantic Kernel) and APIs (such as OpenAI and Azure OpenAI), it does
not qualify as a programming reference on any of these. The focus of the
book is on using LLMs to build useful applications in the business domains
where LLMs really fit well.

Organization of this book

This book explores the practical application of existing LLMs in developing
versatile business domain applications. In essence, an LLM is an ML model
trained on extensive text data, enabling it to comprehend and generate
human-like language. To convey knowledge about these models, this book
focuses on three key aspects:
The first three chapters delve into scenarios for which an LLM is
effective and introduce essential tools for crafting sophisticated
solutions. These chapters provide insights into conversational
programming and prompting as a new, advanced, yet structured,
approach to coding.
The next two chapters emphasize patterns, frameworks, and techniques
for unlocking the potential of conversational programming. This
involves using natural language in code to define workflows, with the
LLM-based application orchestrating existing APIs.
The final three chapters present concrete, end-to-end demo examples
featuring Python and ASP.NET Core. These demos showcase
progressively advanced interactions between logic, data, and existing
business processes. In the first demo, you learn how to take text from an
email and craft a fitting draft for a reply. In the second demo, you apply
a retrieval augmented generation (RAG) pattern to formulate responses
to questions based on document content. Finally, in the third demo, you
learn how to build a hotel booking application with a chatbot that uses a
conversational interface to ascertain the user’s needs (dates, room
preferences, budget) and seamlessly places (or denies) reservations
according to the underlying system’s state, without using fixed user
interface elements or formatted data input controls.

Downloads: notebooks and samples


Python and Polyglot notebooks containing the code featured in the initial part
of the book, as well as the complete codebases for the examples tackled in the
latter part of the book, can be accessed on GitHub at:
https://github.com/Youbiquitous/programming-llm

Errata, updates, & book support


We’ve made every effort to ensure the accuracy of this book and its
companion content. You can access updates to this book—in the form of a
list of submitted errata and their related corrections—at:
MicrosoftPressStore.com/LLMAzureAI/errata
If you discover an error that is not already listed, please submit it to us at
the same page.
For additional book support and information, please visit
MicrosoftPressStore.com/Support.
Please note that product support for Microsoft software and hardware is
not offered through the previous addresses. For help with Microsoft software
or hardware, go to http://support.microsoft.com.

Stay in touch
Let’s keep the conversation going! We’re on X / Twitter:
http://twitter.com/MicrosoftPress.
Chapter 1

The genesis and an analysis of large language models

Luring someone into reading a book is never a small feat. If it’s a novel, you
must convince them that it’s a beautiful story, and if it’s a technical book,
you must assure them that they’ll learn something. In this case, we’ll try to
learn something.
Over the past two years, generative AI has become a prominent buzzword.
It refers to a field of artificial intelligence (AI) focused on creating systems
that can generate new, original content autonomously. Large language
models (LLMs) like GPT-3 and GPT-4 are notable examples of generative
AI, capable of producing human-like text based on given input.
The rapid adoption of LLMs is leading to a paradigm shift in
programming. This chapter discusses this shift, the reasons for it, and its
prospects. Its prospects include conversational programming, in which you
explain with words—rather than with code—what you want to achieve. This
type of programming will likely become very prevalent in the future.
No promises, though. As you’ll soon see, explaining with words what you
want to achieve is often as difficult as writing code.
This chapter covers topics that didn’t find a place elsewhere in this book.
It’s not necessary to read every section or follow a strict order. Take and read
what you find necessary or interesting. I expect you will come back to read
certain parts of this chapter after you finish the last one.
LLMs at a glance

To navigate the realm of LLMs as a developer or manager, it's essential to
comprehend the origins of generative AI and to discern its distinctions from
predictive AI. This chapter has one key goal: to provide insights into the
training and business relevance of LLMs, reserving the intricate mathematical
details for the appendix.
Our journey will span from the historical roots of AI to the fundamentals
of LLMs, including their training, inference, and the emergence of
multimodal models. Delving into the business landscape, we’ll also spotlight
current popular use cases of generative AI and textual models.
This introduction doesn’t aim to cover every detail. Rather, it intends to
equip you with sufficient information to address and cover any potential gaps
in knowledge, while working toward demystifying the intricacies surrounding
the evolution and implementation of LLMs.

History of LLMs
The evolution of LLMs intersects with both the history of conventional AI
(often referred to as predictive AI) and the domain of natural language
processing (NLP). NLP encompasses natural language understanding (NLU),
which attempts to reduce human speech into a structured ontology, and
natural language generation (NLG), which aims to produce text that is
understandable by humans.
LLMs are a subtype of generative AI focused on producing text based on
some kind of input, usually in the form of written text (referred to as a
prompt) but now expanding to multimodal inputs, including images, video,
and audio. At a glance, most LLMs can be seen as a very advanced form of
autocomplete, as they generate the next word. Although they specifically
generate text, LLMs do so in a manner that simulates human reasoning,
enabling them to perform a variety of intricate tasks. These tasks include
sentiment analysis, summarization, translation, entity and intent recognition,
structured information extraction, document generation, and so on.
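
To make the autocomplete analogy concrete, the following minimal sketch asks a model to perform one such task (summarization) driven purely by a textual prompt. It assumes the openai Python package (version 1.x) and an Azure OpenAI resource; the endpoint, key, and deployment name are placeholders, not values prescribed by this book.

from openai import AzureOpenAI

# Placeholders: replace with your own Azure OpenAI resource values.
client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com",
    api_key="<your-api-key>",
    api_version="2024-02-01",
)

response = client.chat.completions.create(
    model="<your-gpt-deployment>",  # the name of your model deployment
    messages=[
        {"role": "system", "content": "You summarize text in one sentence."},
        {"role": "user", "content": "Large language models are generative AI "
                                    "systems trained on vast text corpora to "
                                    "produce contextually relevant text."},
    ],
)
print(response.choices[0].message.content)

Everything the model does here is "just" next-word generation; the summarization behavior emerges from the prompt.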
LLMs represent a natural extension of the age-old human aspiration to
construct automatons (ancestors to contemporary robots) and imbue them
with a degree of reasoning and language. They can be seen as a brain for such
automatons, able to respond to an external input.

AI beginnings
Modern software—and AI as a vibrant part of it—represents the culmination
of an embryonic vision that has traversed the minds of great thinkers since
the 17th century. Various mathematicians, philosophers, and scientists, in
diverse ways and at varying levels of abstraction, envisioned a universal
language capable of mechanizing the acquisition and sharing of knowledge.
Gottfried Leibniz (1646–1716), in particular, contemplated the idea that at
least a portion of human reasoning could be mechanized.
The modern conceptualization of intelligent machinery took shape in the
mid-20th century, courtesy of renowned mathematicians Alan Turing and
Alonzo Church. Turing’s exploration of “intelligent machinery” in 1947,
coupled with his groundbreaking 1950 paper, “Computing Machinery and
Intelligence,” laid the cornerstone for the Turing test—a pivotal concept in
AI. This test challenged machines to exhibit human behavior
(indistinguishable by a human judge), ushering in the era of AI as a scientific
discipline.

Note
Considering recent advancements, a reevaluation of the original
Turing test may be warranted to incorporate a more precise
definition of human and rational behavior.

NLP
NLP is an interdisciplinary field within AI that aims to bridge the gap
between computers and human language. While historically rooted in
linguistic approaches that set it apart from AI in its contemporary sense,
NLP has always been a branch of AI in the broader sense. In fact, the
overarching goal has consistently been to artificially replicate an expression
of human intelligence—specifically, language.
The primary goal of NLP is to enable machines to understand, interpret,
and generate human-like language in a way that is both meaningful and
contextually relevant. This interdisciplinary field draws from linguistics,
computer science, and cognitive psychology to develop algorithms and
models that facilitate seamless interaction between humans and machines
through natural language.
The history of NLP spans several decades, evolving from rule-based
systems in the early stages to contemporary deep-learning approaches,
marking significant strides in the understanding and processing of human
language by computers.
Originating in the 1950s, early efforts, such as the Georgetown-IBM
experiment in 1954, aimed at machine translation from Russian to English,
laying the foundation for NLP. However, these initial endeavors were
primarily linguistic in nature. Subsequent decades witnessed the influence of
Chomskyan linguistics, shaping the field’s focus on syntactic and
grammatical structures.
The 1980s brought a shift toward statistical methods, like n-grams, which
use co-occurrence frequencies of words to make predictions. An example was
IBM's Candide system for statistical machine translation. However, rule-based
approaches struggled with the complexity of natural language. The 1990s saw
a resurgence of statistical approaches and the advent of machine learning
(ML) techniques such as hidden Markov models (HMMs) and statistical
language models. The introduction of the Penn Treebank, a 7-million-word
dataset of part-of-speech-tagged text, and statistical machine translation
systems marked significant milestones during this period.
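
To illustrate the core statistical idea behind n-grams, here is a toy bigram sketch in plain Python: it counts which word follows which in a tiny corpus and predicts the most frequent follower. It is purely illustrative; real systems of that era relied on large corpora and smoothing techniques.

from collections import Counter, defaultdict

# Toy corpus; a real n-gram model would be trained on millions of words.
corpus = "the cat sat on the mat the cat ate the fish".split()

# Count bigram co-occurrences: bigrams[w1][w2] = times w2 followed w1.
bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def predict_next(word):
    # Return the most frequent observed follower of `word`, if any.
    followers = bigrams.get(word)
    return followers.most_common(1)[0][0] if followers else None

print(predict_next("the"))  # 'cat', seen twice after 'the'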
In the 2000s, the rise of data-driven approaches and the availability of
extensive textual data on the internet rejuvenated the field. Probabilistic
models, including maximum-entropy models and conditional random fields,
gained prominence. Begun in the 1980s but finalized years later, the
development of WordNet, a lexical-semantic database of English (organized
into groups of synonyms, or synsets, and their relations), contributed to a
deeper understanding of word semantics.
The landscape transformed in the 2010s with the emergence of deep
learning made possible by a new generation of graphics processing units
(GPUs) and increased computing power. Neural network architectures—
particularly transformers like Bidirectional Encoder Representations from
Transformers (BERT) and Generative Pretrained Transformer (GPT)—
revolutionized NLP by capturing intricate language patterns and contextual
information. The focus shifted to data-driven and pretrained language
models, allowing for fine-tuning on specific tasks.

Predictive AI versus generative AI


Predictive AI and generative AI represent two distinct paradigms, each
deeply entwined with advancements in neural networks and deep-learning
architectures.
Predictive AI, often associated with supervised learning, traces its roots
back to classical ML approaches that emerged in the mid-20th century. Early
models, such as perceptrons, paved the way for the resurgence of neural
networks in the 1980s. However, it wasn’t until the advent of deep learning in
the 21st century—with the development of deep neural networks,
convolutional neural networks (CNNs) for image recognition, and recurrent
neural networks (RNNs) for sequential data—that predictive AI witnessed a
transformative resurgence. The introduction of long short-term memory
(LSTM) units enabled more effective modeling of sequential dependencies in
data.
Generative AI, on the other hand, has seen remarkable progress, propelled
by advancements in unsupervised learning and sophisticated neural network
architectures (the same used for predictive AI). The concept of generative
models dates to the 1990s, but the breakthrough came with the introduction
of generative adversarial networks (GANs) in 2014, showcasing the power of
adversarial training. GANs pair a generator, which creates data, with a
discriminator, which distinguishes real data from generated data. By judging
the authenticity of generated samples during training, the discriminator
drives the refinement of the generator, pushing it to produce ever more
realistic data, from lifelike images to coherent text.
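
The following compact sketch shows the adversarial training loop on a trivial task: learning to generate samples from a one-dimensional Gaussian. It assumes PyTorch and is a didactic illustration, not a production GAN.

import torch
import torch.nn as nn

# Generator maps random noise to a candidate sample; the discriminator
# outputs the probability that a sample is real.
generator = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))
discriminator = nn.Sequential(nn.Linear(1, 16), nn.ReLU(),
                              nn.Linear(16, 1), nn.Sigmoid())

g_opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=1e-3)
bce = nn.BCELoss()

for step in range(2000):
    real = torch.randn(64, 1) * 2 + 3      # "real" data: mean 3, std 2
    fake = generator(torch.randn(64, 8))   # generated data

    # Discriminator step: label real samples 1, generated samples 0.
    d_opt.zero_grad()
    d_loss = bce(discriminator(real), torch.ones(64, 1)) + \
             bce(discriminator(fake.detach()), torch.zeros(64, 1))
    d_loss.backward()
    d_opt.step()

    # Generator step: try to make the discriminator output 1 on fakes.
    g_opt.zero_grad()
    g_loss = bce(discriminator(fake), torch.ones(64, 1))
    g_loss.backward()
    g_opt.step()

After training, samples from the generator cluster around the mean of the real distribution, which is exactly the refinement dynamic described above.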
Table 1-1 provides a recap of the main types of learning processes.

TABLE 1-1 Main types of learning processes

Type | Definition | Training | Use cases
Supervised | Trained on labeled data, where each input has a corresponding label | Adjusts parameters to minimize the prediction error | Classification, regression
Self-supervised | Unsupervised learning in which the model generates its own labels | Learns to fill in the blank (predict parts of the input data from other parts) | NLP, computer vision
Semi-supervised | Combines labeled and unlabeled data for training | Uses labeled data for supervised tasks, unlabeled data for generalizations | Scenarios with limited labeled data—for example, image classification
Unsupervised | Trained on data without explicit supervision | Identifies inherent structures or relationships in the data | Clustering, dimensionality reduction, generative modeling

The historical trajectory of predictive and generative AI underscores the
symbiotic relationship with neural networks and deep learning. Predictive AI
leverages deep-learning architectures like CNNs for image processing and
RNNs/LSTMs for sequential data, achieving state-of-the-art results in tasks
ranging from image recognition to natural language understanding.
Generative AI, fueled by the capabilities of GANs and large-scale language
models, showcases the creative potential of neural networks in generating
novel content.

LLMs
An LLM, exemplified by OpenAI’s GPT series, is a generative AI system
built on advanced deep-learning architectures like the transformer (more on
this in the appendix).
These models operate on the principle of unsupervised and self-supervised
learning, training on vast text corpora to comprehend and generate coherent
and contextually relevant text. They output sequences of text (which can be
prose but also protein structures, code, SVG, JSON, XML, and so on),
demonstrating a remarkable ability to continue and expand on given prompts
in a manner that emulates human language.
The architecture of these models, particularly the transformer architecture,
enables them to capture long-range dependencies and intricate patterns in
data. The concept of word embeddings, a crucial precursor introduced by
Mikolov et al. in 2013 with Word2Vec, represents words as continuous
vectors, contributing to the model's understanding of semantic relationships
between words. Word embeddings form the first “layer” of an LLM.
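
As a concrete taste of word embeddings, this sketch trains a tiny Word2Vec model and queries it. It assumes the gensim library; the corpus is far too small for meaningful semantics and serves only to show the mechanics.

from gensim.models import Word2Vec

# A toy corpus: lists of tokenized sentences.
sentences = [
    ["king", "rules", "the", "kingdom"],
    ["queen", "rules", "the", "kingdom"],
    ["dog", "chases", "the", "cat"],
    ["cat", "chases", "the", "mouse"],
]

# Each word becomes a 16-dimensional continuous vector.
model = Word2Vec(sentences, vector_size=16, window=2, min_count=1, epochs=200)

print(model.wv["king"][:4])                  # first components of a vector
print(model.wv.similarity("king", "queen"))  # cosine similarity of two words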
The generative nature of the latest models enables them to be versatile in
output, allowing for tasks such as text completion, summarization, and
creative text generation. Users can prompt the model with various queries or
partial sentences, and the model autonomously generates coherent and
contextually relevant completions, demonstrating its ability to understand and
mimic human-like language patterns.
The journey began with the introduction of word embeddings in 2013,
notably with Mikolov et al.’s Word2Vec model, revolutionizing semantic
representation. RNNs and LSTM architectures followed, addressing
challenges in sequence processing and long-range dependencies. The
transformative shift arrived with the introduction of the transformer
architecture in 2017, allowing for parallel processing and significantly
improving training times.