The Unified Gateway for Visual Intelligence.
Understand, reason over, and act on images, video, and documents with Orion — our flagship visual agent.
Loved by leading
AI companies
Interact with images, videos, documents in a single API.
Meet the agent that sees, reasons and acts
Frontier models like GPT, Claude, and Gemini can describe what they see – but they can’t act on it. Orion unites the reasoning power of large Vision-Language Models with the accuracy of specialized computer-vision tools – all through one unified API.
One interface for all your visual AI needs.
Images, documents, and videos – all through a single chat-completions interface.
Compose through conversation.
Chain visual operations like: detect → crop → enhance → analyze in a single conversation.
Integrate at warp speed
Drop-in replacement for the OpenAI SDK.
Same API pattern – new visual powers.
Auditable outputs
Every response comes with visual proof. Build and integrate with confidence.

Ridiculously versatile.
Whatever your visual task – Orion knows how to act. Built with dozens of specialized computer-vision and multi-modal tools.
01
Image Intelligence
Caption & Tag
Generate rich, contextual descriptions and semantic labels for any image.
Detection
Segmentation
Pointing
Generate & Edit
UI Parsing
Image Tools

02
Document Intelligence
Document Parsing
Parse, summarize or split documents in seconds.
Structured OCR
Redaction
03
Video Intelligence
Caption & Tag
Generate comprehensive descriptions and metadata for video content.
Generate & Edit
Video Tools

OpenAI
from openai import OpenAI

# Point the standard OpenAI client at the Orion endpoint.
client = OpenAI(
    base_url="https://agent.vlm.run/v1/openai",
    api_key="<VLMRUN_API_KEY>",
)

# Send one user turn containing a text instruction and an image URL.
result = client.chat.completions.create(
    model="vlmrun-orion-1",
    messages=[
        {"role": "user", "content": [
            {"type": "text", "text": "Analyze the image & animate into a video."},
            {"type": "image_url", "image_url": {"url": "https://..."}},
        ]}
    ],
)
print(result.choices[0].message.content)
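Chaining operations such as detect → crop → enhance → analyze in one conversation comes down to growing the `messages` list turn by turn. The sketch below only builds that list and makes no network call. The helper names `image_turn` and `follow_up`, the example image URL, and the placeholder assistant replies are hypothetical, and it assumes Orion accepts the usual chat-completions `assistant` and `user` roles for follow-up turns.

```python
def image_turn(prompt, image_url):
    """One user turn carrying a text instruction and an image URL."""
    return {"role": "user", "content": [
        {"type": "text", "text": prompt},
        {"type": "image_url", "image_url": {"url": image_url}},
    ]}


def follow_up(history, assistant_reply, next_prompt):
    """Return a new history with the assistant's reply and the next user step appended."""
    return history + [
        {"role": "assistant", "content": assistant_reply},
        {"role": "user", "content": next_prompt},
    ]


# Hypothetical placeholders: the image URL and the assistant reply text.
history = [image_turn("Detect the product in this image.", "https://example.com/shelf.jpg")]
history = follow_up(history, "<assistant reply from step 1>", "Crop to the detected product.")
history = follow_up(history, "<assistant reply from step 2>", "Enhance the crop, then analyze it.")

print([m["role"] for m in history])
# ['user', 'assistant', 'user', 'assistant', 'user']
```

In a real session, each placeholder reply would be the `message.content` returned by the previous `client.chat.completions.create(...)` call, and `history` would be passed as `messages` on the next call.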
For Developers
Designed for developers.
Familiar API, unfamiliar power
All your favorite vision tools, in a single box.
Drop-in replacement for OpenAI SDK.
Handles images, documents, videos via URL or upload.
Streaming support for real-time responses.
Structured outputs with Pydantic / Zod support.
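As a rough illustration of the streaming item above: with the OpenAI SDK, passing `stream=True` yields chunks whose text arrives in `choices[0].delta.content`. The sketch below assumes Orion's stream follows that same chunk shape, which the drop-in claim suggests but this page does not spell out. `collect_stream` and `fake_chunk` are hypothetical helpers, and the fake chunks exist only so the example runs offline.

```python
from types import SimpleNamespace


def collect_stream(chunks):
    """Concatenate the text deltas from a chat-completions stream."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:
            continue  # some chunks carry no choices at all
        delta = chunk.choices[0].delta.content
        if delta:  # role-only and final chunks have no text
            parts.append(delta)
    return "".join(parts)


def fake_chunk(text):
    """Stand-in for a streamed chunk, used only so this sketch runs offline."""
    return SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=text))])


demo = [fake_chunk(None), fake_chunk("Two "), fake_chunk("cats."), SimpleNamespace(choices=[])]
print(collect_stream(demo))
# Two cats.
```

Against the live API, the iterator returned by `client.chat.completions.create(..., stream=True)` would take the place of `demo`.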
Only pay for what you use.
Granular usage billing keeps your costs in check.
100 credits = $1 USD
See credit-pricing table for more details.
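For a back-of-the-envelope estimate, the conversion above can be combined with the rate of 1 credit per image or page listed for the Starter and Pro tiers. The function name `estimate_cost_usd` is hypothetical, and the sketch assumes every unit costs exactly 1 credit; other tools may be priced differently in the credit-pricing table.

```python
CREDITS_PER_USD = 100   # from the page: 100 credits = $1 USD
CREDITS_PER_UNIT = 1    # from the tier list: 1 credit per image or page


def estimate_cost_usd(units, free_credits=0):
    """Estimated pay-as-you-go cost in USD for `units` images or pages."""
    credits = units * CREDITS_PER_UNIT
    billable = max(0, credits - free_credits)  # free credits are consumed first
    return billable / CREDITS_PER_USD


# 2,500 pages on the Starter tier, using the 1,000 free sign-up credits.
print(estimate_cost_usd(2500, free_credits=1000))
# 15.0
```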
Starter
$0
/mo
Pay-as-you-go
1 credit / image or page
1000 free sign-up credits
Up to 25K requests / month
Up to 10 requests / minute
Community Discord
Basic Usage Logs
Pro
$799
/mo
Usage-based
1 credit / image or page
100K credits included / month
Unlimited requests / month
Up to 100 requests / minute
Dedicated Slack support
Zero-Data Retention (ZDR)
Business Associate Agreement (BAA)
Enterprise
Custom
Invoiced Billing
Tier-based pricing
Volume discounts
Unlimited requests / month
Custom rate-limits
Dedicated Slack support
In-VPC Deployments, ZDR
SOC2, HIPAA, BAA Execution
Custom SLAs
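The per-minute request limits above (10 on Starter, 100 on Pro) can be respected on the client side with a small sliding-window limiter. This is a minimal sketch, not part of any VLM Run SDK: `MinuteRateLimiter` is a hypothetical class, and how the service itself counts requests or responds when the limit is exceeded is not described on this page.

```python
import time
from collections import deque


class MinuteRateLimiter:
    """Client-side sliding-window limiter: at most `limit` calls per 60 seconds."""

    def __init__(self, limit, clock=time.monotonic):
        self.limit = limit
        self.clock = clock    # injectable so the limiter can be tested without sleeping
        self.sent = deque()   # timestamps of recent requests

    def delay(self):
        """Seconds to wait before the next request is allowed (0.0 if none)."""
        now = self.clock()
        # Drop timestamps that have left the 60-second window.
        while self.sent and now - self.sent[0] >= 60.0:
            self.sent.popleft()
        if len(self.sent) < self.limit:
            return 0.0
        return 60.0 - (now - self.sent[0])

    def record(self):
        """Note that a request was just sent."""
        self.sent.append(self.clock())


# Demo with a fake clock: 10 requests at t=0 exhaust a Starter-sized budget.
now = [0.0]
limiter = MinuteRateLimiter(10, clock=lambda: now[0])
for _ in range(10):
    limiter.record()
print(limiter.delay())
# 60.0
```

In real use, call `time.sleep(limiter.delay())` and then `limiter.record()` immediately before each API request.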
For Enterprises
The new visual intelligence layer for your enterprise.
Deploy securely inside your VPC or private cloud – bringing visual intelligence directly to your infrastructure. Power document, image, and video understanding across teams. SOC 2 Type II and HIPAA-ready.






Frequently asked questions
How is Orion different from GPT-5, Claude, or Gemini?
Frontier models can describe what they see — but not act on it. Orion goes beyond perception by planning, executing, and validating visual tasks. Instead of just describing an image, Orion can detect, crop, segment, generate, and analyze in sequence — reliably and deterministically.
What can I build with Orion?
Developers are already using Orion for a wide range of applications – from visual ETL pipelines that detect, extract, and structure information, to automated product and marketing asset generation, document parsing and redaction, video summarization and clipping, and even medical and geospatial visual analysis. If your workflow involves visual data, Orion can make it intelligent and interactive.
How is Orion priced?
Orion’s pricing is designed to be flexible and transparent — based on the tools you use and the volume of visual tasks you run. Each visual capability (detection, segmentation, OCR, generation, etc.) is priced per use, allowing you to scale from experimentation to production without committing to fixed model tiers. Enterprise plans include predictable monthly or on-prem options for teams that need volume pricing, VPC deployments, or compliance guarantees.
How accurate is Orion compared to traditional CV models?
Orion taps into both modern VLM architectures and traditional computer-vision models, allowing it to reason and accurately perform visual tasks. In benchmarks across several multi-modal tasks (MMMU, MMBench, DocVQA, RefCOCO, etc.), Orion consistently outperformed leading VLMs on multi-step visual reasoning, structured OCR, and traditional CV tasks like detection, segmentation, and tracking. See our whitepaper for more details.
How do you keep data private?
Our Pro-tier offering runs entirely in our private cloud deployment. Requests made to our APIs are logged and made available in our observability dashboard. For our Enterprise tier, we can enforce higher privacy requirements (SOC 2, GDPR, HIPAA).
Can I run Orion on-prem or in-VPC?
Yes. For enterprise deployments, VLM Run offers both VPC Peering and In-VPC hosting options — ensuring data never leaves your environment. We’re SOC 2 Type II and HIPAA-ready for teams with compliance requirements.
Try Orion Free today.






by Autonomi AI Inc. All rights reserved. © 2025



