Closed
Labels
:ml, >bug, Feature:GenAI, Team:ML
Description
The inference plugin uses a thread to check for inference requests that are waiting to be sent, subject to the rate limit settings. If there are no tasks to send, the thread sleeps, then wakes up and checks again later.
This issue is to investigate whether we can improve the code to avoid sleeping and doing nothing. We should look into transitioning to how the TransformScheduler handles it, by using `scheduleWithFixedDelay`. We should also confirm that the threadpool is not utilized until a request is sent.
In 8.18 I believe the threadpool will be used almost immediately, because the EIS authorization request is sent after a node boots up.
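To make the proposal concrete, here is a minimal sketch of the idea using the plain JDK `ScheduledExecutorService` rather than Elasticsearch's own threadpool classes. The class name `LazyRequestPoller` and its methods are hypothetical and are not taken from the inference plugin or the TransformScheduler. It only illustrates two points from the description: the periodic check is driven by `scheduleWithFixedDelay` instead of a sleeping thread, and nothing is scheduled until the first request is submitted.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for the request sender; not the plugin's actual class. */
class LazyRequestPoller {
    private final ScheduledExecutorService scheduler;
    private final Queue<Runnable> pending = new ConcurrentLinkedQueue<>();
    private final long delayMillis;
    private ScheduledFuture<?> pollTask; // stays null until the first request arrives

    LazyRequestPoller(ScheduledExecutorService scheduler, long delayMillis) {
        this.scheduler = scheduler;
        this.delayMillis = delayMillis;
    }

    /** Queues a request and starts the periodic check only on first use. */
    synchronized void submit(Runnable request) {
        pending.add(request);
        if (pollTask == null) {
            // The executor re-runs drain() delayMillis after each run completes,
            // so no thread is parked in a sleep loop between checks.
            pollTask = scheduler.scheduleWithFixedDelay(
                this::drain, 0, delayMillis, TimeUnit.MILLISECONDS);
        }
    }

    /** True once the periodic check has been scheduled. */
    synchronized boolean isScheduled() {
        return pollTask != null;
    }

    private void drain() {
        Runnable request;
        // Assumption: a real implementation would consult the rate limiter here
        // and stop early once the current allowance is used up.
        while ((request = pending.poll()) != null) {
            try {
                request.run();
            } catch (RuntimeException e) {
                // An exception escaping a scheduleWithFixedDelay task suppresses
                // all later runs, so failures are contained per request.
            }
        }
    }
}
```

With this shape, a node that never receives an inference request never touches the scheduler. A request sent at startup (such as the EIS authorization request mentioned above) would trigger the scheduling right away.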