Priced to help you bring your app to the world
Available now
Our fastest multimodal model, with great performance on diverse, repetitive tasks and a 1 million token context window.
Free of charge*
Rate Limits**
15 RPM (requests per minute)
1 million TPM (tokens per minute)
1,500 RPD (requests per day)
Input price (per 1M tokens)
Free of charge
Context caching (per 1M tokens)
Free of charge, up to 1 million tokens of storage per hour
Output price (per 1M tokens)
Free of charge
Tuning price
Input/output prices are the same for tuned models. Tuning service is free of charge.
Prompts/responses used to improve our products
Yes
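A client can pace itself to stay under a requests-per-minute limit such as the 15 RPM listed above. The following is a minimal sketch of a sliding-window limiter; the class name `SlidingWindowLimiter` and its parameters are hypothetical and not part of any Gemini SDK. Because the listed limits are not guaranteed, this only avoids exceeding the published number on the client side.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Client-side pacing helper (hypothetical, not part of any SDK)."""

    def __init__(self, max_requests, window_seconds=60.0, clock=time.monotonic):
        self.max_requests = max_requests      # e.g. 15 for the free-tier RPM above
        self.window_seconds = window_seconds  # 60 s window for a per-minute limit
        self.clock = clock                    # injectable so the limiter can be tested
        self._sent = deque()                  # timestamps of recent requests

    def try_acquire(self):
        """Return True and record the request if it fits in the current window."""
        now = self.clock()
        # Forget requests that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) < self.max_requests:
            self._sent.append(now)
            return True
        return False
```

A caller would check `limiter.try_acquire()` before each request and sleep briefly or queue the request when it returns `False`.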
Pay-as-you-go (prices in USD)***
Rate Limits**
1,000 RPM (requests per minute)
4 million TPM (tokens per minute)
Input price (per 1M tokens)
$0.075 for <= 128K tokens
$0.15 for > 128K tokens
Context caching (per 1M tokens)
$0.01875 for <= 128K tokens
$0.0375 for > 128K tokens
$1.00 / 1 million tokens per hour (storage)
Output price (per 1M tokens)
$0.30 for <= 128K tokens
$0.60 for > 128K tokens
Tuning price
Input/output prices are the same for tuned models. Tuning service is free of charge.
Prompts/responses used to improve our products
No
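The tiered pay-as-you-go prices above can be turned into a per-request estimate. This is a rough sketch under two assumptions that the table does not state explicitly: that "128K" means 128,000 tokens, and that the tier is chosen by the prompt length and then applies to all input and output tokens of that request. The function and constant names are illustrative only.

```python
from decimal import Decimal

TIER_THRESHOLD = 128_000  # assumption: "128K" is read as 128,000 tokens

# USD per 1M tokens, taken from the pay-as-you-go prices above:
# (price when prompt <= 128K tokens, price when prompt > 128K tokens)
FLASH_PRICES = {
    "input": (Decimal("0.075"), Decimal("0.15")),
    "output": (Decimal("0.30"), Decimal("0.60")),
}


def estimate_request_cost(prompt_tokens, output_tokens, prices=FLASH_PRICES):
    """Estimate the USD cost of one request (hypothetical helper)."""
    # Assumption: the prompt length alone selects the tier for the whole request.
    tier = 0 if prompt_tokens <= TIER_THRESHOLD else 1
    input_cost = prices["input"][tier] * prompt_tokens / 1_000_000
    output_cost = prices["output"][tier] * output_tokens / 1_000_000
    return input_cost + output_cost
```

Under these assumptions, a 100,000-token prompt with a 2,000-token response comes to $0.0081, and a 200,000-token prompt with a 1,000-token response comes to $0.0306.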
Our next-generation model with a breakthrough 2 million token context window. Now generally available for production use.
Free of charge*
Rate Limits**
2 RPM (requests per minute)
32,000 TPM (tokens per minute)
50 RPD (requests per day)
Input price (per 1M tokens)
Free of charge
Context caching (per 1M tokens)
Not applicable
Output price (per 1M tokens)
Free of charge
Tuning price
Input/output prices are the same for tuned models. Tuning service is free of charge.
Prompts/responses used to improve our products
Yes
Pay-as-you-go (prices in USD)***
Rate Limits**
360 RPM (requests per minute)
4 million TPM (tokens per minute)
Input price (per 1M tokens)
$3.50 for <= 128K tokens
$7.00 for > 128K tokens
Context caching (per 1M tokens)
$0.875 for <= 128K tokens
$1.75 for > 128K tokens
$4.50 / 1 million tokens per hour (storage)
Output price (per 1M tokens)
$10.50 for <= 128K tokens
$21.00 for > 128K tokens
Tuning price
Input/output prices are the same for tuned models. Tuning service is free of charge.
Prompts/responses used to improve our products
No
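Whether context caching pays off depends on how long tokens are stored and how often they are reused. The sketch below compares the two using the pay-as-you-go prices above. It assumes that the cached-token price is charged each time a request reads the cache, that storage is billed in proportion to tokens and hours, and that "128K" means 128,000 tokens; none of these billing details are spelled out in the table. It also ignores any non-cached input tokens and all output tokens. All names are hypothetical.

```python
from decimal import Decimal

MILLION = Decimal(1_000_000)
TIER_THRESHOLD = 128_000  # assumption: "128K" is read as 128,000 tokens

# USD per 1M tokens, from the pay-as-you-go prices above: (<= 128K, > 128K)
INPUT_PRICE = (Decimal("3.50"), Decimal("7.00"))
CACHED_READ_PRICE = (Decimal("0.875"), Decimal("1.75"))
STORAGE_PRICE_PER_HOUR = Decimal("4.50")  # USD per 1M tokens per hour


def _tier(tokens):
    return 0 if tokens <= TIER_THRESHOLD else 1


def cached_cost(cached_tokens, hours_stored, num_requests):
    """Storage plus cached reads, under the assumptions stated above."""
    storage = STORAGE_PRICE_PER_HOUR * cached_tokens / MILLION * Decimal(str(hours_stored))
    reads = CACHED_READ_PRICE[_tier(cached_tokens)] * cached_tokens / MILLION * num_requests
    return storage + reads


def uncached_cost(prompt_tokens, num_requests):
    """Sending the same tokens as regular input on every request."""
    return INPUT_PRICE[_tier(prompt_tokens)] * prompt_tokens / MILLION * num_requests
```

For example, with 100,000 tokens kept for 2 hours and reused by 10 requests, this model gives $1.775 with caching versus $3.50 without.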
Our first-generation model offering only text and image reasoning. Generally available for production use.
Free of charge*
Rate Limits**
15 RPM (requests per minute)
32,000 TPM (tokens per minute)
1,500 RPD (requests per day)
Input price (per 1M tokens)
Free of charge
Context caching (per 1M tokens)
Not applicable
Output price (per 1M tokens)
Free of charge
Tuning price
Input/output prices are the same for tuned models. Tuning service is free of charge.
Prompts/responses used to improve our products
Yes
Pay-as-you-go (prices in USD)***
Rate Limits**
360 RPM (requests per minute)
120,000 TPM (tokens per minute)
30,000 RPD (requests per day)
Input price (per 1M tokens)
$0.50
Context caching (per 1M tokens)
Not available
Output price (per 1M tokens)
$1.50
Tuning price
Input/output prices are the same for tuned models. Tuning service is free of charge.
Prompts/responses used to improve our products
No
Our state-of-the-art text embedding model.
Free of charge*
Rate Limits**
1,500 RPM (requests per minute)
Input price (per 1M tokens)
Free of charge
Context caching (per 1M tokens)
Not applicable
Output price (per 1M tokens)
Free of charge
Tuning price
Not applicable
Prompts/responses used to improve our products
Yes
*Gemini API free tier usage restrictions apply to EEA (including EU), the UK and CH. See Billing FAQs for details.
**Specified rate limits are not guaranteed and actual capacity may vary. Apply for an increased maximum rate limit (for paid tier only).
***Tuned model inference costs are billed at the same price as the base models. To get help with billing, see Cloud Billing support.
****Prices may differ from the prices listed here and the prices offered on Vertex AI. For Vertex prices, see the Vertex documentation.