0% found this document useful (0 votes)
22 views43 pages

Agents 2

The document discusses recent advances in large language models (LLMs) focusing on function calling and tool integration. It outlines various models such as Gorilla, ToolAlpaca, ToolLLM, and APIGen, highlighting their datasets, capabilities, and the diversity of APIs they utilize. The document emphasizes the importance of multi-turn dialogs and the ability to handle complex tasks through effective function calling mechanisms.

Uploaded by

Rajdip Ingale
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
22 views43 pages

Agents 2

The document discusses recent advances in large language models (LLMs) focusing on function calling and tool integration. It outlines various models such as Gorilla, ToolAlpaca, ToolLLM, and APIGen, highlighting their datasets, capabilities, and the diversity of APIs they utilize. The document emphasizes the importance of multi-turn dialogs and the ability to handle complex tasks through effective function calling mechanisms.

Uploaded by

Rajdip Ingale
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 43

LLMs and Tools

Part-2: Function Calling


Large Language Models: Introduction and Recent Advances
ELL881 · AIL821

Dinesh Raghu
Senior Researcher, IBM Research
LLMs and Tools

Part 1: Incorporating Tools during Fine-tuning (Tool Augmentation)

Part 2: Teaching LLMs to Use APIs and Functions (Function Calling)

Part 3: Automating Complex, Multi-step Tasks (Agentic Workflows)

LLMs: Introduction and Recent Advances

LLMs: Introduction and Recent Advances Tanmoy Chakraborty Dinesh Raghu


Motivation
Let's say you are tasked to build a chatbot for IIT Delhi students using LLMs.

The goal is to add a conversational interface for the supports


1. institute rules/policy related queries
2. searching/adding/dropping courses
3. query academic calendar
4. pay fees

How would you build such a chatbot?


LLMs: Introduction and Recent Advances

LLMs: Introduction and Recent Advances Tanmoy Chakraborty Dinesh Raghu


What is Function Calling?

Image credit: https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/function-calling

LLMs: Introduction and Recent Advances

LLMs: Introduction and Recent Advances Tanmoy Chakraborty Dinesh Raghu


What is Function Calling?
1. User specifies tools and enters a query

Image credit: https://docs.mistral.ai/capabilities/function_calling/


LLMs: Introduction and Recent Advances

LLMs: Introduction and Recent Advances Tanmoy Chakraborty Dinesh Raghu


What is Function Calling?
2. Model identifies the function and its arguments

tool_calls=[ FunctionCall(name='payment_status',
arguments={"transaction_id": "T1001"}) ]

Image credit: https://docs.mistral.ai/capabilities/function_calling/


LLMs: Introduction and Recent Advances

LLMs: Introduction and Recent Advances Tanmoy Chakraborty Dinesh Raghu


What is Function Calling?
3. User executes the function to obtain tool results

Function Call Output:

{"status": "Paid"}

Image credit: https://docs.mistral.ai/capabilities/function_calling/


LLMs: Introduction and Recent Advances

LLMs: Introduction and Recent Advances Tanmoy Chakraborty Dinesh Raghu


What is Function Calling?
4. Model uses the results to generate the final answer

Function Call Output:

{"status": "Paid"}

The status of your transaction with ID T1001 is "Paid". Is there


anything else I can assist you with?

Image credit: https://docs.mistral.ai/capabilities/function_calling/


LLMs: Introduction and Recent Advances

LLMs: Introduction and Recent Advances Tanmoy Chakraborty Dinesh Raghu


What is Function Calling?

1. User specifies tools and enters a query

2. Model identifies the function and its arguments

3. User executes the function to obtain tool results

4. Model uses the results to generate the final answer

Image credit: https://docs.mistral.ai/capabilities/function_calling/


LLMs: Introduction and Recent Advances

LLMs: Introduction and Recent Advances Tanmoy Chakraborty Dinesh Raghu


What data do we need?
What's the status of my transaction T1001? Tools

tool_calls=[ FunctionCall(name='payment_status',
arguments={"transaction_id": "T1001"}) ]

LLMs: Introduction and Recent Advances

LLMs: Introduction and Recent Advances Tanmoy Chakraborty Dinesh Raghu


What data do we need?
What's the status of my transaction T1001? Tools

tool_calls=[ FunctionCall(name='payment_status',
arguments={"transaction_id": "T1001"}) ]

What's the status of my transaction T1001? Tool Output

The status of your transaction with ID T1001 is "Paid". Is there anything


else I can assist you with?

LLMs: Introduction and Recent Advances

LLMs: Introduction and Recent Advances Tanmoy Chakraborty Dinesh Raghu


What data do we need?
What's the status of my transaction T1001? Tools What's the status of my transaction? Tools

tool_calls=[ FunctionCall(name='payment_status',
tool_calls=[ FunctionCall(name='payment_status', arguments={"transaction_id": ”?"}) ]
arguments={"transaction_id": "T1001"}) ]

Hi there! I can help with that. Can you please provide your transaction ID?

What's the status of my transaction T1001? Tool Output

The status of your transaction with ID T1001 is "Paid". Is there anything


else I can assist you with?

LLMs: Introduction and Recent Advances

LLMs: Introduction and Recent Advances Tanmoy Chakraborty Dinesh Raghu


Outline

1. Gorilla

2. ToolAlpaca

3. ToolLLM

4. APIGen

LLMs: Introduction and Recent Advances

LLMs: Introduction and Recent Advances Tanmoy Chakraborty Dinesh Raghu


Gorilla: Large Language Model Connected with Massive APIs

1. Synthesized a dataset named APIBench


• Using model cards in HuggingFace Model Hub, Torch hub, and
Tensorflow Hub

2. Finetuned LLaMA-7B model with APIBench to create the Gorilla model

*Gorilla: Large Language Model Connected with Massive APIs, Patil et. al., Nov 2023

LLMs: Introduction and Recent Advances

LLMs: Introduction and Recent Advances Tanmoy Chakraborty Dinesh Raghu


Gorilla: Large Language Model Connected with Massive APIs

*Gorilla: Large Language Model Connected with Massive APIs, Patil et. al., Nov 2023

LLMs: Introduction and Recent Advances

LLMs: Introduction and Recent Advances Tanmoy Chakraborty Dinesh Raghu


Gorilla: Large Language Model Connected with Massive APIs

First to use retriever augmented


training for APIs

• Leads to better generalization

• Can adapt to change in API specs

*Gorilla: Large Language Model Connected with Massive APIs, Patil et. al., Nov 2023

LLMs: Introduction and Recent Advances

LLMs: Introduction and Recent Advances Tanmoy Chakraborty Dinesh Raghu


Gorilla: Large Language Model Connected with Massive APIs

AST Sub-Tree Matching For Evaluation

*Gorilla: Large Language Model Connected with Massive APIs, Patil et. al., Nov 2023

LLMs: Introduction and Recent Advances

LLMs: Introduction and Recent Advances Tanmoy Chakraborty Dinesh Raghu


Gorilla: Large Language Model Connected with Massive APIs

Error Types in Function Calling

*Gorilla: Large Language Model Connected with Massive APIs, Patil et. al., Nov 2023

LLMs: Introduction and Recent Advances

LLMs: Introduction and Recent Advances Tanmoy Chakraborty Dinesh Raghu


Gorilla: Large Language Model Connected with Massive APIs

*Gorilla: Large Language Model Connected with Massive APIs, Patil et. al., Nov 2023

LLMs: Introduction and Recent Advances

LLMs: Introduction and Recent Advances Tanmoy Chakraborty Dinesh Raghu


BFCL: Berkeley Function-Calling Leaderboard

image credits: screenshot of https://gorilla.cs.berkeley.edu/leaderboard.html

LLMs: Introduction and Recent Advances

LLMs: Introduction and Recent Advances Tanmoy Chakraborty Dinesh Raghu


Summary
1. APIBench (Gorilla)
• Low diversity in APIs
• Single turn dialogs

LLMs: Introduction and Recent Advances

LLMs: Introduction and Recent Advances Tanmoy Chakraborty Dinesh Raghu


ToolAlpaca*

1. Synthesized a dataset 3000 examples


• Using 400 real-world inspired tools spanning 50 distinct categories

2. Finetuned Vicuna-7B (and 13B) model to create ToolAlpaca-7B (and 13B )

*ToolAlpaca: Generalized Tool Learning for Language Models with 3000 Simulated Cases, Tang et. al., Sep 2023

LLMs: Introduction and Recent Advances

LLMs: Introduction and Recent Advances Tanmoy Chakraborty Dinesh Raghu


ToolAlpaca*

*ToolAlpaca: Generalized Tool Learning for Language Models with 3000 Simulated Cases, Tang et. al., Sep 2023

LLMs: Introduction and Recent Advances

LLMs: Introduction and Recent Advances Tanmoy Chakraborty Dinesh Raghu


ToolAlpaca*

*ToolAlpaca: Generalized Tool Learning for Language Models with 3000 Simulated Cases, Tang et. al., Sep 2023

LLMs: Introduction and Recent Advances

LLMs: Introduction and Recent Advances Tanmoy Chakraborty Dinesh Raghu


ToolAlpaca*

Evaluation results on unseen simulated tools and real-world APIs

*ToolAlpaca: Generalized Tool Learning for Language Models with 3000 Simulated Cases, Tang et. al., Sep 2023

LLMs: Introduction and Recent Advances

LLMs: Introduction and Recent Advances Tanmoy Chakraborty Dinesh Raghu


Summary
1. APIBench (Gorilla)
• Low diversity in APIs
• Single turn dialogs

2. ToolAlpaca Dataset
• High Diversity (400 real world inspired APIs)
• Multi turn dialogs with question generation and response generation

LLMs: Introduction and Recent Advances

LLMs: Introduction and Recent Advances Tanmoy Chakraborty Dinesh Raghu


ToolLLM*

1. Synthesized a dataset named ToolBench


• Scraped 16, 464 real-word REST APIs from RapidAPI Hub

2. Finetuned LLaMA-7B model with ToolBench to create the ToolLLaMA model

*ToolLLM: Facilitating Large Language Models to Master 16000+ Real World APIs, Qin et. al., Oct 2023

LLMs: Introduction and Recent Advances

LLMs: Introduction and Recent Advances Tanmoy Chakraborty Dinesh Raghu


ToolLLM*

*ToolLLM: Facilitating Large Language Models to Master 16000+ Real World APIs, Qin et. al., Oct 2023

LLMs: Introduction and Recent Advances

LLMs: Introduction and Recent Advances Tanmoy Chakraborty Dinesh Raghu


ToolLLM*

*ToolLLM: Facilitating Large Language Models to Master 16000+ Real World APIs, Qin et. al., Oct 2023

LLMs: Introduction and Recent Advances

LLMs: Introduction and Recent Advances Tanmoy Chakraborty Dinesh Raghu


ToolLLM*

OOD generalization experiments on APIBench

*ToolLLM: Facilitating Large Language Models to Master 16000+ Real World APIs, Qin et. al., Oct 2023

LLMs: Introduction and Recent Advances

LLMs: Introduction and Recent Advances Tanmoy Chakraborty Dinesh Raghu


Summary
1. APIBench (Gorilla)
• Low diversity in APIs
• Single turn dialogs

2. ToolAlpaca Dataset
• High Diversity (400 real world inspired APIs)
• Multi turn dialogs with question generation and response generation

3. ToolBench (ToolLLM)
• High diversity (16K real world APIs from RapidAPI Hub)
• Multi turn dialogs
• Has single tool setup and multi tool setup

LLMs: Introduction and Recent Advances

LLMs: Introduction and Recent Advances Tanmoy Chakraborty Dinesh Raghu


APIGen*

1. Synthesized a dataset named xlam-function-calling-60k


• Used only 3673 real-word REST APIs from RapidAPI Hub (scraped for ToolBench)

2. Finetuned DeepSeek-Coder-1.3B-instruct (and 7B) model with xlam-function-calling-


60k to create the s xLAM-1B(FC) (and 7B) model

* APIGen: Automated PIpeline for Generating Verifiable and Diverse Function-Calling Dataset, Liu et. al., Jun 2024

LLMs: Introduction and Recent Advances

LLMs: Introduction and Recent Advances Tanmoy Chakraborty Dinesh Raghu


APIGen*

* APIGen: Automated PIpeline for Generating Verifiable and Diverse Function-Calling Dataset, Liu et. al., Jun 2024

LLMs: Introduction and Recent Advances

LLMs: Introduction and Recent Advances Tanmoy Chakraborty Dinesh Raghu


APIGen*

* APIGen: Automated PIpeline for Generating Verifiable and Diverse Function-Calling Dataset, Liu et. al., Jun 2024

LLMs: Introduction and Recent Advances

LLMs: Introduction and Recent Advances Tanmoy Chakraborty Dinesh Raghu


APIGen*
Execution Checker

1. Functions are executed using the appropriate


backend
• Python functions are directly imported and
executed in a separate subprocess
• REST APIs are called to obtain results and
status codes).
2. Unsuccessful executions are filtered out

* APIGen: Automated PIpeline for Generating Verifiable and Diverse Function-Calling Dataset, Liu et. al., Jun 2024

LLMs: Introduction and Recent Advances

LLMs: Introduction and Recent Advances Tanmoy Chakraborty Dinesh Raghu


APIGen*
Semantic Checker

• Does the final answer semantically align with the


query’s objective?
• Query-answer pairs that execute successfully
can produce meaningless results due to
1. infeasible queries
2. incorrect arguments

* APIGen: Automated PIpeline for Generating Verifiable and Diverse Function-Calling Dataset, Liu et. al., Jun 2024

LLMs: Introduction and Recent Advances

LLMs: Introduction and Recent Advances Tanmoy Chakraborty Dinesh Raghu


APIGen*

* APIGen: Automated PIpeline for Generating Verifiable and Diverse Function-Calling Dataset, Liu et. al., Jun 2024

LLMs: Introduction and Recent Advances

LLMs: Introduction and Recent Advances Tanmoy Chakraborty Dinesh Raghu


APIGen*

image credits: screenshot of https://gorilla.cs.berkeley.edu/leaderboard.html

LLMs: Introduction and Recent Advances

LLMs: Introduction and Recent Advances Tanmoy Chakraborty Dinesh Raghu


Summary
1. APIBench (Gorilla)
• Low diversity in APIs
• Single turn dialogs

2. ToolAlpaca Dataset
• High Diversity (400 real world inspired APIs)
• Multi turn dialogs with question generation and response generation

3. ToolBench (ToolLLM)
• High diversity (16K real world APIs from RapidAPI Hub)
• Multi turn dialogs
• Has single tool setup and multi tool setup

4. xlam-function-calling-60k (APIGen)
• High diversity (3K high quality real world APIs from RapidAPI Hub)
• high quality multi turn dialogs – thanks to the 3 stage filtering
• Has single tool setup and multi tool setup along with its parallel variants

LLMs: Introduction and Recent Advances

LLMs: Introduction and Recent Advances Tanmoy Chakraborty Dinesh Raghu


Granite-Function-Calling-Model*
Issues with existing open models:
1. Openness: The best performing models are proprietary and the ones that have open licenses (e.g., Gorilla)
are trained using data generated from OpenAI models

2. Generalizability: Even though the datasets are generated using diverse sets of APIs (e.g., RapidAPIs),
Basu et al. (2024) has shown that models trained on these datasets have difficulty generalizing to out-of-
domain datasets

3. Granular tasks: Function calling encompasses multiple granular sub-tasks such as function-name
detection, slot filling, and detecting the ordered sequence of functions needed to be called. Existing
models trained to perform function calling lack the ability to handle these granular tasks independently

*Granite-Function Calling Model: Introducing Function Calling Abilities via Multi-task Learning of Granular Tasks, Abdelaziz et. Al., Jun 2024

LLMs: Introduction and Recent Advances

LLMs: Introduction and Recent Advances Tanmoy Chakraborty Dinesh Raghu


Granite-Function-Calling-Model*

1. Openness: synthesize training data using models with provide open license

2. Generalizability: repurposed existing datasets with permissible license and added new synthesized
datasets to get better generalization

3. Granular tasks: enabled support for granular tasks

*Granite-Function Calling Model: Introducing Function Calling Abilities via Multi-task Learning of Granular Tasks, Abdelaziz et. Al., Jun 2024

LLMs: Introduction and Recent Advances

LLMs: Introduction and Recent Advances Tanmoy Chakraborty Dinesh Raghu


Granite-Function-Calling-Model*

*Granite-Function Calling Model: Introducing Function Calling Abilities via Multi-task Learning of Granular Tasks, Abdelaziz et. Al., Jun 2024

LLMs: Introduction and Recent Advances

LLMs: Introduction and Recent Advances Tanmoy Chakraborty Dinesh Raghu


Summary 5. Granite-Function-Calling-Model
• High diversity
1. APIBench (Gorilla) • Has single tool, & multiple tool setup
• Low diversity in APIs • Has multi turn dialogs
• Single turn dialogs • Open-sourced data and model with
permissible license
2. ToolAlpaca Dataset
• High Diversity (400 real world inspired APIs)
• Multi turn dialogs with question generation and response generation

3. ToolBench (ToolLLM)
• High diversity (16K real world APIs from RapidAPI Hub)
• Multi turn dialogs
• Has single tool setup and multi tool setup

4. xlam-function-calling-60k (APIGen)
• High diversity (3K high quality real world APIs from RapidAPI Hub)
• high quality multi turn dialogs – thanks to the 3 stage filtering
• Has single tool setup and multi tool setup along with its parallel variants

LLMs: Introduction and Recent Advances

LLMs: Introduction and Recent Advances Tanmoy Chakraborty Dinesh Raghu

You might also like