The concurrency of AsyncOpenAI cannot be fully utilized. #1725
Comments
Thanks for the report, what results do you get if you extract your …
Do you mean this?

import asyncio
import logging
from functools import wraps

import httpx
from openai import AsyncOpenAI

# Decorator that limits the number of concurrent requests
def limit_async_func_call(max_size: int):
    sem = asyncio.Semaphore(max_size)

    def final_decro(func):
        @wraps(func)
        async def wait_func(*args, **kwargs):
            async with sem:
                try:
                    return await func(*args, **kwargs)
                except Exception as e:
                    logging.error(f"Exception in {func.__name__}: {e}")

        return wait_func

    return final_decro

custom_http_client = httpx.AsyncClient(
    limits=httpx.Limits(max_connections=2048, max_keepalive_connections=1024),
    timeout=httpx.Timeout(timeout=None),
)
openai_async_client = AsyncOpenAI(
    api_key="EMPTY",
    base_url="http://localhost:8203/v1",  # simulated local server
    http_client=custom_http_client,
)

# Assume this is the function under concurrency test
@limit_async_func_call(max_size=1024)  # cap concurrency at 1024
async def custom_model_if_cache(prompt, system_prompt=None, history_messages=[], **kwargs):
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.extend(history_messages)
    messages.append({"role": "user", "content": prompt})
    # Assume this is the external API being called
    response = await openai_async_client.chat.completions.create(
        model="gpt-3.5-turbo", messages=messages, temperature=0, **kwargs
    )
    return "hi"
Yes!
Thanks, does this still happen if you just use …
Honestly, I don't really understand network programming; it's a bit beyond my skill set. If you could tell me clearly how the code should be changed (or, even better, provide a modified version), I can test it quickly! 😊
Of course! Here's what that code should look like (I haven't verified it):

http_client = httpx.AsyncClient(
    limits=httpx.Limits(max_connections=2048, max_keepalive_connections=1024),
    timeout=httpx.Timeout(timeout=None),
)
await http_client.post(
    "http://localhost:8203/v1/chat/completions",
    json=dict(model="gpt-3.5-turbo", messages=messages, temperature=0, **kwargs),
)
I assume the code should be like the one below:

http_client = httpx.AsyncClient(
    limits=httpx.Limits(max_connections=2048, max_keepalive_connections=1024),
    timeout=httpx.Timeout(timeout=None),
)

@limit_async_func_call(max_size=1024)  # cap concurrency at 1024
async def custom_httpx(prompt, system_prompt=None, history_messages=[], **kwargs):
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.extend(history_messages)
    messages.append({"role": "user", "content": prompt})
    response = await http_client.post(
        "http://localhost:8203/v1/chat/completions",
        json=dict(model="gpt-3.5-turbo", messages=messages, temperature=0, **kwargs),
    )
    return "hi"
Not sure if the following output would help:

ss -s
Total: 9464
TCP:   13509 (estab 3444, closed 9117, orphaned 10, timewait 5130)

Transport Total     IP        IPv6
RAW       7         2         5
UDP       5         5         0
TCP       4392      4361      31
INET      4404      4368      36
FRAG      0         0         0
Interesting, so you're getting similar results with the SDK and with …
Thanks so much for the investigation here! I'm going to close this in favour of #1596 for tracking. |
Confirm this is an issue with the Python library and not an underlying OpenAI API
Describe the bug
I attempted to run a stability test on the concurrency of AsyncOpenAI. I set the concurrency to 1024, but it kept running at a very low average level with heavy jitter, which is consistent with my production test results.
To Reproduce
I split my code into three parts: client.py, server.py, and main.py (used to create 100k clients in total).
server.py
client.py
main.py
To reproduce, open two terminals and run

python server.py
python main.py

separately. I also saved the log; you can use the following code to draw the plot:
draw.py
Code snippets
No response
OS
ubuntu
Python version
3.12
Library version
latest