
Quickstart

Purpose
Usage
Getting an API token
OpenAI
Python Library (v1.0 and above)
Python Library (v0.28.1 and below)
Synchronous
Event Loop
API Proxy
Supported Deployments
DALL-E
Python Library
REST API
AWS Bedrock
Python Library
API Proxy
Anthropic
Amazon Titan
Supported Deployments
Google Vertex
Python Library
API Proxy
PaLM 2 for Text
PaLM 2 for Chat
Supported Deployments
Proxy Hosts
Additional Links
Troubleshooting
Known Issues
Seeing exception AttributeError: 'str' object has no attribute 'get'
My requests to Azure OpenAI are timing out!
I want to automate rotation of my access token.
Don't see your issue?

Purpose
The LLM Proxy is a microservice that handles forward proxying of requests to Large Language Models (LLMs).

The general goal is that, with Common Identity (CI) credentials, users and machine accounts can access LLM APIs through requests to the LLM proxy. All functionality and paths available through direct usage of LLMs, such as the Azure OpenAI APIs, should continue to operate as normal.

NOTE: Avoid sending Cisco, customer, partner, or other third-party confidential or highly confidential information, restricted data, or personal
information when using Azure OpenAI or AWS Bedrock unless your use case has been CASPR reviewed.

Usage
Getting an API token
The easiest way to get a token is to browse to the Webex Developer Portal. After logging in, you can easily get your own bearer token in the
documentation. To obtain a token valid in both the dev and integration environments, go here.
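Once retrieved, the token is passed as a standard bearer token on every proxy request. A minimal sketch of building the required headers (the `LLM_PROXY_TOKEN` environment variable name is illustrative, not part of the service):

```python
import os
from typing import Optional

def auth_headers(token: Optional[str] = None) -> dict:
    """Build the headers the LLM proxy expects.

    Falls back to an environment variable when no token is passed;
    the LLM_PROXY_TOKEN name is an assumption for this sketch.
    """
    token = token or os.environ.get("LLM_PROXY_TOKEN", "")
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
```
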

OpenAI

Python Library (v1.0 and above)


from openai import AzureOpenAI

api_version = "2023-07-01-preview"
endpoint = "https://llm-proxy.us-east-2.int.infra.intelligence.webex.com/azure/v1"

client = AzureOpenAI(
    api_version=api_version,
    azure_endpoint=endpoint,
    api_key="<Your CI token here>",
)

completion = client.chat.completions.create(
    model="gpt-35-turbo",
    messages=[
        {
            "role": "user",
            "content": "Hello",
        },
    ],
)
print(completion.model_dump_json(indent=2))

Python Library (v0.28.1 and below)

Synchronous
import openai

openai.api_type = "azure_ad"
openai.api_base = "https://llm-proxy.us-east-2.int.infra.intelligence.webex.com/azure/v1"
openai.api_version = "2023-05-15"
openai.api_key = '<Your CI token here>'

openai.ChatCompletion.create(
    deployment_id="gpt-35-turbo",
    messages=[
        {"role": "system", "content": "You are a chatbot. You will only respond enthusiastically."},
        {"role": "user", "content": "Hi how are you?"},
        {"role": "assistant", "content": "I am doing absolutely amazing! How are you?"},
        {"role": "user", "content": "what do apples taste like?"}
    ],
    temperature=0.5,
    stream=False,
    n=5,
)

Event Loop
import asyncio

import openai

openai.api_type = "azure_ad"
openai.api_base = "https://llm-proxy.us-east-2.int.infra.intelligence.webex.com/azure/v1"
openai.api_version = "2023-05-15"
openai.api_key = '<Your CI token here>'

async def complete():
    return await openai.ChatCompletion.acreate(
        deployment_id="gpt-35-turbo",
        messages=[
            {"role": "user", "content": "Hi"},
        ],
        temperature=0.5,
        stream=False,
    )

asyncio.run(complete())

API Proxy
curl --location 'https://llm-proxy.us-east-2.int.infra.intelligence.webex.com/azure/v1/openai/deployments/<deployment-name>/chat/completions?api-version=2023-06-01-preview' \
-H "Content-Type: application/json" \
-H "Authorization: Bearer INSERT_CI_TOKEN" \
-d '{
    "messages": [{"role": "user", "content": "hi"}],
    "temperature": 0.5,
    "max_tokens": 4000,
    "top_p": 0.5
}'
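The same chat-completions call can also be made from Python with plain `requests` instead of the SDK. A sketch, where the URL layout, deployment name, and body fields mirror the curl example above (nothing else is assumed):

```python
def build_chat_request(deployment, messages, api_version="2023-06-01-preview"):
    """Assemble the proxy URL and JSON body for a chat-completions call."""
    base = "https://llm-proxy.us-east-2.int.infra.intelligence.webex.com/azure/v1"
    url = (f"{base}/openai/deployments/{deployment}"
           f"/chat/completions?api-version={api_version}")
    body = {
        "messages": messages,
        "temperature": 0.5,
        "max_tokens": 4000,
        "top_p": 0.5,
    }
    return url, body

if __name__ == "__main__":
    import requests  # only needed when actually sending the request
    url, body = build_chat_request(
        "gpt-35-turbo", [{"role": "user", "content": "hi"}])
    resp = requests.post(
        url, json=body,
        headers={"Authorization": "Bearer INSERT_CI_TOKEN"})
    print(resp.json())
```
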

Supported Deployments

Deployment Name          Model                    Version Supported

gpt-35-turbo-16k         gpt-35-turbo-16k         0301
gpt-35-turbo             gpt-35-turbo             0301
text-embedding-ada-002   text-embedding-ada-002   1
gpt-4                    gpt-4                    0314

DALL-E

NOTE: Per this documentation, DALL-E in Azure OpenAI is not yet supported in the OpenAI 1.0 SDK. Make sure to use a version lower than 1.0.

Python Library

import openai

api_base = "https://llm-proxy.us-east-2.int.infra.intelligence.webex.com/azure/v1"  # Enter your endpoint here
api_key = "token"  # Enter your API key here

openai.api_type = "azure_ad"  # make sure to use azure_ad, not azure, for authorization to work properly
openai.api_base = api_base
openai.api_version = "2023-06-01-preview"
openai.api_key = api_key

response = openai.Image.create(
    prompt="A big golden retriever, disposable camera",
    size="512x512",
    n=1,
)

image_url = response["data"][0]["url"]
print(image_url)

REST API
import requests
import json

url = "https://llm-proxy.us-east-2.dev.infra.intelligence.webex.com/azure/v1/openai/images/generations:submit?api-version=2023-06-01-preview"

payload = json.dumps({
    "prompt": "a multi-colored umbrella on the beach, disposable camera",
    "size": "1024x1024",
    "n": 1
})
headers = {
    'Content-Type': 'application/json',
    'Authorization': 'Bearer <your CI token here>'
}

response = requests.request("POST", url, headers=headers, data=payload)

print(response.text)

When making a manual REST request, you do have to do a little extra work to get the actual image. The initial response from the API is something like:

{
    "id": "random-job-token-here",
    "status": "notRunning"
}

with headers attached. One of the headers, operation-location, will give the URL of the resulting image.

Example code to handle this, adapted from an OpenAI sample:

import requests
import time

api_base = '<your_endpoint>'  # Enter your endpoint here - this will be the base LLM proxy URL
api_key = '<your_key>'  # Enter your API key here

# Assign the API version (DALL-E is currently supported for the 2023-06-01-preview API version only)
api_version = '2023-06-01-preview'

url = f"{api_base}openai/images/generations:submit?api-version={api_version}"
headers = {"api-key": api_key, "Content-Type": "application/json"}
body = {
    "prompt": "a multi-colored umbrella on the beach, disposable camera",  # Enter your prompt text here
    "size": "1024x1024",
    "n": 1
}
submission = requests.post(url, headers=headers, json=body)

# Poll the operation-location URL until the generation job succeeds
operation_location = submission.headers['operation-location']
status = ""
while status != "succeeded":
    time.sleep(1)
    response = requests.get(operation_location, headers=headers)
    status = response.json()['status']

image_url = response.json()['result']['data'][0]['url']
print(image_url)

AWS Bedrock
Python Library
Not supported yet. Coming soon!
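Until SDK support lands, the API Proxy endpoint can be called directly from Python. A sketch for an Anthropic model on Bedrock; the endpoint, model ID, and body fields mirror the curl example in this section, and everything else is illustrative:

```python
def build_bedrock_request(model, prompt, max_tokens=2048):
    """Assemble the proxy URL and body for an Anthropic model on Bedrock."""
    url = "https://llm-proxy.us-east-2.dev.infra.intelligence.webex.com/bedrock/v1/"
    body = {
        "model": model,
        "body": {
            # Anthropic completion models expect the Human:/Assistant: framing
            "prompt": f"Human: {prompt}\nAssistant:",
            "max_tokens_to_sample": max_tokens,
            "temperature": 0.5,
            "top_k": 250,
            "top_p": 1,
        },
    }
    return url, body

if __name__ == "__main__":
    import requests  # only needed when actually sending the request
    url, body = build_bedrock_request("anthropic.claude-v1", "Hello who are you?")
    resp = requests.post(
        url, json=body,
        headers={"Authorization": "Bearer <Insert your CI Token here>"})
    print(resp.text)
```
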

API Proxy

Anthropic
curl --location 'https://llm-proxy.us-east-2.dev.infra.intelligence.webex.com/bedrock/v1/' \
--header 'Authorization: Bearer <Insert your CI Token here>' \
--header 'Content-Type: application/json' \
--data '{
    "model": "anthropic.claude-v1",
    "body": {
        "prompt": "Human: Hello who are you?\nAssistant:",
        "max_tokens_to_sample": 2048,
        "temperature": 0.5,
        "top_k": 250,
        "top_p": 1
    }
}'

Amazon Titan
curl --location 'https://llm-proxy.us-east-2.dev.infra.intelligence.webex.com/bedrock/v1/' \
--header 'Authorization: Bearer <Insert your CI Token here>' \
--header 'Content-Type: application/json' \
--data '{
    "model": "amazon.titan-tg1-large",
    "body": {
        "inputText": "Tell me what you can do.",
        "textGenerationConfig": {
            "maxTokenCount": 512,
            "stopSequences": [],
            "temperature": 0.7,
            "topP": 1
        }
    }
}'

Supported Deployments

Deployment Name              Model                         Version Supported

Anthropic Claude V1          anthropic.claude-v1           1.3
Anthropic Claude Instant V1  anthropic.claude-instant-v1   1.3
Amazon Titan Large           amazon.titan-tg1-large        1.01

Google Vertex

Python Library
Not supported yet, sorry!
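In the meantime, the proxy's REST endpoint can be assembled directly from Python. A sketch for PaLM 2 text; the endpoint and parameter names mirror the PaLM 2 for Text curl example in this section, and the default values are illustrative:

```python
def build_palm_text_request(prompt, temperature=0.2, max_output_tokens=256):
    """Assemble the proxy URL and body for a text-bison predict call."""
    url = ("https://llm-proxy.us-east-2.dev.infra.intelligence.webex.com"
           "/vertexai/v1/text-bison:predict")
    body = {
        "instances": [{"prompt": prompt}],
        "parameters": {
            "temperature": temperature,
            "maxOutputTokens": max_output_tokens,
            "topK": 40,
            "topP": 0.95,
        },
    }
    return url, body

if __name__ == "__main__":
    import requests  # only needed when actually sending the request
    url, body = build_palm_text_request(
        "Give me ten interview questions for the role of program manager.")
    resp = requests.post(
        url, json=body,
        headers={"Authorization": "Bearer <Insert your CI token here>"})
    print(resp.json())
```
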

API Proxy
PaLM 2 for Text
curl --location 'https://llm-proxy.us-east-2.dev.infra.intelligence.webex.com/vertexai/v1/text-bison:predict' \
--header 'Authorization: Bearer <Insert your CI token here>' \
--header 'Content-Type: application/json' \
--data '{
    "instances": [
        { "prompt": "Give me ten interview questions for the role of program manager." }
    ],
    "parameters": {
        "temperature": 0.2,
        "maxOutputTokens": 256,
        "topK": 40,
        "topP": 0.95
    }
}'

PaLM 2 for Chat


curl --location 'https://llm-proxy.us-east-2.dev.infra.intelligence.webex.com/vertexai/v1/chat-bison:predict' \
--header 'Authorization: Bearer <Insert your CI token here>' \
--header 'Content-Type: application/json' \
--data '{
    "instances": [{
        "context": "My name is Ned. You are my personal assistant. My favorite movies are Lord of the Rings and Hobbit.",
        "examples": [
            {
                "input": {"content": "Who do you work for?"},
                "output": {"content": "I work for Ned."}
            },
            {
                "input": {"content": "What do I like?"},
                "output": {"content": "Ned likes watching movies."}
            }
        ],
        "messages": [
            {
                "author": "user",
                "content": "Are my favorite movies based on a book series?"
            },
            {
                "author": "bot",
                "content": "Yes, your favorite movies, The Lord of the Rings and The Hobbit, are based on book series by J.R.R. Tolkien."
            },
            {
                "author": "user",
                "content": "When were these books published?"
            }
        ]
    }],
    "parameters": {
        "temperature": 0.3,
        "maxOutputTokens": 200,
        "topP": 0.8,
        "topK": 40
    }
}'

Supported Deployments

Deployment Name   Model        Version Supported

PaLM Text         text-bison   001
PaLM Chat         chat-bison   001

Proxy Hosts
Environment: Production (US-EAST-2)
    Proxy Host: https://llm-proxy.us-east-2.intelligence.webex.com
    Paths: /azure/v1, /bedrock/v1
    Common Identity Host: https://idbroker.webex.com
    Access Control: Allow-listed machine accounts that have gone through RAI assessment

Environment: Production (EU-CENTRAL-1)
    Proxy Host: https://llm-proxy.eu-central-1.intelligence.webex.com
    Paths: /azure/v1, /bedrock/v1
    Common Identity Host: https://idbroker-eu.webex.com
    Access Control: Allow-listed machine accounts that have gone through RAI assessment

Environment: Integration
    Proxy Host: https://llm-proxy.us-east-2.int.infra.intelligence.webex.com
    Paths: /azure/v1, /bedrock/v1
    Common Identity Host: https://idbrokerbts.webex.com
    Access Control: All Cisco users and allow-listed machine accounts

Environment: Dev
    Proxy Host: https://llm-proxy.us-east-2.dev.infra.intelligence.webex.com
    Paths: /azure/v1, /bedrock/v1
    Common Identity Host: https://idbrokerbts.webex.com
    Access Control: All Cisco users and allow-listed machine accounts

Additional Links
INT/DEV Common Identity User Token Retrieval via Webex Developer Portal

Azure OpenAI API Documentation

Management APIs

Azure OpenAI is deployed as a part of the Azure AI services. All Azure AI services rely on the same set of management APIs for
creation, update and delete operations. The management APIs are also used for deploying models within an OpenAI resource.

The management APIs should work for querying quota / deployments, etc.

Troubleshooting
Known Issues

Seeing exception AttributeError: 'str' object has no attribute 'get'


Issue: Outgoing requests are returning an exception: AttributeError: 'str' object has no attribute 'get'.

Resolution: This is caused by an invalid access token, which makes the auth service return an unparseable response. Fetch a new token from User Token Retrieval via Webex Developer Portal and retry the request. We will be modifying the auth service response to return a more readable exception in a future update.

My requests to Azure OpenAI are timing out!


Issue: Requests to Azure OpenAI are regularly timing out with 504s: "Proxied request exceeded 120s".

Resolution: This is a known issue that Azure is actively working to fix in their datacenters. There is a lot of demand for LLM usage in Azure, and their GPUs are constantly being overtasked. Your request most likely hit one of those instances and therefore timed out. The best available workaround at the moment is to implement retry logic if your request does not complete within 10 seconds.
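One way to sketch that workaround in Python (the retry count and backoff are illustrative; only the 10-second per-request cutoff comes from the guidance above):

```python
import time

def call_with_retry(send, attempts=3, backoff=1.0):
    """Call send() up to `attempts` times, sleeping between failures.

    `send` should enforce its own per-request timeout, e.g.
    requests.post(..., timeout=10), so a request stuck on an
    overtasked instance is abandoned and retried promptly.
    """
    last_exc = None
    for attempt in range(attempts):
        try:
            return send()
        except Exception as exc:  # in practice: requests.Timeout or an HTTP 504
            last_exc = exc
            if attempt < attempts - 1:
                time.sleep(backoff)
    raise last_exc
```
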

I want to automate rotation of my access token.


Issue: Access token rotation

Resolution: The current standard practice is to provision a CI machine account from ServiceNow and to use the Common Identity APIs, following the process described in the Get Access Token Script, to renew the token.
Don't see your issue?
Join the Ask LLM Proxy Webex Teams space and we will try to help out!
