Week 5 Large Language Models
This module covers the practical use of large language models (LLMs), a relatively new area.
This module is experimental. Things may break when trying these notebooks or during your GA.
Try again, gently.
LLMs incur a cost. We have created API keys for everyone to use gpt-3.5-turbo and text-embedding-3-small. Your usage is limited to 50 cents for this course. Don't exceed that.
Use AI Proxy instead of OpenAI. Specifically:
1. Replace the API base URL https://api.openai.com/... with https://aiproxy.sanand.workers.dev/openai/...
2. Replace the OPENAI_API_KEY with the AIPROXY_TOKEN provided to you.
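The two steps above can be sketched in Python. This builds (but does not send) a chat completion request through AI Proxy; it assumes the proxy mirrors OpenAI's `/v1/chat/completions` path and that `AIPROXY_TOKEN` is set in your environment.

```python
import json
import os
from urllib.request import Request

# Step 1: swap the OpenAI base URL for the AI Proxy base URL.
OPENAI_URL = "https://api.openai.com/v1/chat/completions"
PROXY_URL = OPENAI_URL.replace(
    "https://api.openai.com/", "https://aiproxy.sanand.workers.dev/openai/"
)

def build_request(prompt: str) -> Request:
    """Build a chat completion request aimed at AI Proxy."""
    payload = {
        "model": "gpt-3.5-turbo",
        "messages": [{"role": "user", "content": prompt}],
    }
    return Request(
        PROXY_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            # Step 2: AIPROXY_TOKEN replaces OPENAI_API_KEY.
            "Authorization": f"Bearer {os.environ.get('AIPROXY_TOKEN', '')}",
        },
    )
```

Sending it with `urllib.request.urlopen(build_request("Hello"))` returns the same JSON shape as the OpenAI API, so the rest of your code is unchanged.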
This video explores using large language models (LLMs) for sentiment analysis and
classification. It demonstrates how to use the OpenAI API to analyze movie reviews for
sentiment and genre without any training.
Highlights:
You'll learn how to use large language models (LLMs) for sentiment analysis and classification,
covering:
• Sentiment Analysis: Use OpenAI API to identify the sentiment of movie reviews as positive
or negative.
• Prompt Engineering: Learn how to craft effective prompts to get desired results from
LLMs.
• LLM Training: Understand how to train LLMs by providing examples and feedback.
• OpenAI API Integration: Integrate OpenAI API into Python code to perform sentiment
analysis.
• Tokenization: Learn about tokenization and its impact on LLM input and cost.
• Zero-Shot, One-Shot, and Multi-Shot Learning: Understand different approaches to using
LLMs for learning.
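The zero-shot vs. one-shot vs. multi-shot distinction above boils down to how many labelled examples you put in the prompt. Here is a minimal sketch; the review texts are invented stand-ins, not the course's movie reviews dataset, and the resulting messages list would be sent to gpt-3.5-turbo.

```python
def sentiment_messages(review: str, examples: tuple = ()) -> list:
    """Build a chat-messages list for sentiment classification.

    Zero-shot when `examples` is empty; one-shot or multi-shot when
    (text, label) examples are supplied as in-context demonstrations.
    """
    messages = [{
        "role": "system",
        "content": "Classify the movie review as POSITIVE or NEGATIVE. "
                   "Answer with one word.",
    }]
    # Each labelled example becomes a user/assistant turn the model imitates.
    for text, label in examples:
        messages.append({"role": "user", "content": text})
        messages.append({"role": "assistant", "content": label})
    messages.append({"role": "user", "content": review})
    return messages

zero_shot = sentiment_messages("A dazzling, heartfelt film.")
one_shot = sentiment_messages(
    "A dazzling, heartfelt film.",
    (("Dull plot and wooden acting.", "NEGATIVE"),),
)
```

Note that every example you add consumes tokens, so multi-shot prompts trade higher cost for better accuracy.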
Here are the links used in the video:
• Jupyter Notebook
• Movie reviews dataset
• OpenAI Playground
• OpenAI Pricing
• OpenAI Tokenizer
• OpenAI API Reference
• OpenAI Docs
This video discusses how to systematically extract structured information from a dataset
using language models. It covers various techniques and tools, including JSON schemas and
OpenAI’s API, to extract and format data efficiently.
Highlights:
You'll learn how to use LLMs to extract structure from unstructured data, covering:
• LLM for Data Extraction: Use OpenAI's API to extract structured information from
unstructured data like addresses.
• JSON Schema: Define a JSON schema to ensure consistent and structured output from
the LLM.
• Prompt Engineering: Craft effective prompts to guide the LLM's response and improve
accuracy.
• Data Cleaning: Use string functions and OpenAI's API to clean and standardize data.
• Data Analysis: Analyze extracted data using Pandas to gain insights.
• LLM Limitations: Understand the limitations of LLMs, including potential errors and
inconsistencies in output.
• Production Use Cases: Explore real-world applications of LLMs for data extraction, such
as customer service email analysis.
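The JSON-schema and function-calling ideas above can be sketched as follows. The schema fields (`street`, `city`, `zip`) are illustrative assumptions, not the course's exact schema; the `tools` structure follows OpenAI's function-calling format, and the parser guards against the malformed output the "LLM Limitations" point warns about.

```python
import json

# A JSON schema constraining the structure the LLM must return.
address_schema = {
    "type": "object",
    "properties": {
        "street": {"type": "string"},
        "city": {"type": "string"},
        "zip": {"type": "string"},
    },
    "required": ["street", "city", "zip"],
    "additionalProperties": False,
}

# Passed to the chat API so the model replies with a tool call whose
# arguments match the schema, instead of free-form prose.
tools = [{
    "type": "function",
    "function": {
        "name": "extract_address",
        "description": "Extract a postal address from free text",
        "parameters": address_schema,
    },
}]

def parse_tool_call(arguments_json: str) -> dict:
    """Parse the model's tool-call arguments defensively: LLM output
    can still be invalid JSON or miss required fields."""
    try:
        out = json.loads(arguments_json)
    except json.JSONDecodeError:
        return {}
    if not all(key in out for key in address_schema["required"]):
        return {}
    return out
```

Rows that come back as `{}` can then be retried or flagged for manual review before the Pandas analysis step.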
Here are the links used in the video:
• Jupyter Notebook
• JSON Schema
• Function calling
This video explains the concept of embeddings in large language models (LLMs) and
demonstrates how to use OpenAI embeddings for topic modeling. It covers the process of
converting text into numerical arrays, visualizing embeddings, and classifying words into
categories using embeddings and clustering algorithms.
Highlights:
• Embeddings: How large language models convert text into numerical representations.
• Similarity Measurement: Understanding how similar embeddings indicate similar
meanings.
• Embedding Visualization: Using tools like TensorFlow Projector to visualize embedding
spaces.
• Embedding Applications: Using embeddings for tasks like classification and clustering.
• OpenAI Embeddings: Using OpenAI's API to generate embeddings for text.
• Model Comparison: Exploring different embedding models and their strengths and
weaknesses.
• Cosine Similarity: Calculating cosine similarity between embeddings for more reliable
similarity measures.
• Embedding Cost: Understanding the cost of generating embeddings using OpenAI's API.
• Embedding Range: Understanding the range of values in embeddings and their
significance.
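Cosine similarity, highlighted above, is just the dot product of two embeddings divided by the product of their lengths. The tiny three-dimensional vectors below are toy stand-ins for real OpenAI embeddings (which have 1,536 dimensions for text-embedding-3-small); the point is only that similar meanings give similar directions.

```python
import math

def cosine_similarity(a: list, b: list) -> float:
    """Cosine of the angle between two embedding vectors: 1.0 means
    identical direction, 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy embeddings: "king" and "queen" point in nearly the same direction,
# "pizza" points elsewhere.
king, queen, pizza = [0.9, 0.8, 0.1], [0.85, 0.82, 0.15], [0.1, 0.2, 0.95]
```

Because cosine similarity ignores vector length, it is a more reliable similarity measure than the raw dot product when embedding magnitudes vary.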
Here are the links used in the video:
• Jupyter Notebook
• Tensorflow projector
• Embeddings guide
• Embeddings reference
• Clustering on scikit-learn
• Massive text embedding leaderboard (MTEB)
• gte-large-en-v1.5 embedding model
• Embeddings similarity threshold
You will learn to implement Retrieval Augmented Generation (RAG) to enhance language models'
responses by retrieving relevant context and including it in the prompt.
Here are the links used in the video:
• Jupyter Notebook
• gte-large-en-v1.5 embedding model
• Awesome vector database
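The RAG loop can be sketched as: score each document chunk against the query, retrieve the top matches, and prepend them to the prompt. A real pipeline would score chunks by cosine similarity of embeddings (from a model such as gte-large-en-v1.5) stored in a vector database; here a toy word-overlap score stands in for embedding similarity so the sketch stays self-contained.

```python
def score(query: str, chunk: str) -> float:
    """Toy relevance score: word overlap (a stand-in for the cosine
    similarity of embeddings a real RAG system would use)."""
    q, c = set(query.lower().split()), set(chunk.lower().split())
    return len(q & c) / len(q | c) if q | c else 0.0

def retrieve(query: str, chunks: list, k: int = 2) -> list:
    """Return the k chunks most relevant to the query."""
    return sorted(chunks, key=lambda ch: score(query, ch), reverse=True)[:k]

def rag_prompt(query: str, chunks: list) -> str:
    """Augment the prompt with retrieved context before generation."""
    context = "\n".join(retrieve(query, chunks))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

# Invented example chunks, not from the course materials.
docs = [
    "Embeddings map text to vectors.",
    "RAG retrieves context before generation.",
    "Pandas analyzes tabular data.",
]
```

The string returned by `rag_prompt` is what you would send to the chat model, grounding its answer in the retrieved chunks rather than its training data alone.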