OpenAI O3-Mini - OpenAI
OpenAI o3-mini
Pushing the frontier of cost-effective reasoning.
We’re releasing OpenAI o3-mini, the newest, most cost-efficient model in our reasoning
series, available in both ChatGPT and the API today. Previewed in December 2024, this
powerful and fast model advances the boundaries of what small models can achieve,
delivering exceptional STEM capabilities—with particular strength in science, math, and
coding—all while maintaining the low cost and reduced latency of OpenAI o1-mini.
OpenAI o3-mini is our first small reasoning model that supports highly requested
developer features including function calling, Structured Outputs, and developer
messages, making it production-ready out of the gate. Like OpenAI o1-mini and OpenAI
o1-preview, o3-mini will support streaming. Also, developers can choose between three
reasoning effort options—low, medium, and high—to optimize for their specific use
cases. This flexibility allows o3-mini to “think harder” when tackling complex challenges
or prioritize speed when latency is a concern. o3-mini does not support vision
capabilities, so developers should continue using OpenAI o1 for visual reasoning tasks.
o3-mini is rolling out in the Chat Completions API, Assistants API, and Batch API starting
today to select developers in API usage tiers 3-5.
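As a sketch of how the options above might be combined, the snippet below builds a Chat Completions request that pairs a developer message with the reasoning effort setting. The prompt and the developer-message text are illustrative, and the request is assembled as a plain dict rather than sent, so no API key is needed:

```python
# Sketch of a Chat Completions request combining a developer message
# with the reasoning_effort option ("low", "medium", or "high").
# The prompt here is illustrative, not from the announcement.
request = {
    "model": "o3-mini",
    "reasoning_effort": "high",  # "think harder" on complex problems
    "messages": [
        {"role": "developer",
         "content": "You are a terse assistant for competition math."},
        {"role": "user",
         "content": "How many primes are there below 100?"},
    ],
}

# With the official `openai` Python SDK, this dict would be passed as
# client.chat.completions.create(**request); since o3-mini supports
# streaming, adding stream=True would return incremental chunks.
print(request["reasoning_effort"])
```

Lowering `reasoning_effort` to `"low"` trades some accuracy for latency, which matches the speed/accuracy trade-off the release describes.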
ChatGPT Plus, Team, and Pro users can access OpenAI o3-mini starting today, with
Enterprise access coming in a week. o3-mini will replace OpenAI o1-mini in the model
picker, offering higher rate limits and lower latency, making it a compelling choice for
coding, STEM, and logical problem-solving tasks. As part of this upgrade, we’re tripling
the rate limit for Plus and Team users from 50 messages per day with o1-mini to 150
messages per day with o3-mini. Additionally, o3-mini now works with search to find up-
to-date answers with links to relevant web sources. This is an early prototype as we work
to integrate search across our reasoning models.
Starting today, free plan users can also try OpenAI o3-mini by selecting ‘Reason’ in the
message composer or by regenerating a response. This marks the first time a reasoning
model has been made available to free users in ChatGPT.
While OpenAI o1 remains our broader general knowledge reasoning model, OpenAI o3-
mini provides a specialized alternative for technical domains requiring precision and
speed. In ChatGPT, o3-mini uses medium reasoning effort to provide a balanced trade-
off between speed and accuracy. All paid users will also have the option of selecting o3-
mini-high in the model picker for a higher-intelligence version that takes a little longer
to generate responses. Pro users will have unlimited access to both o3-mini and o3-
mini-high .
Similar to its OpenAI o1 predecessor, OpenAI o3-mini has been optimized for STEM
reasoning. o3-mini with medium reasoning effort matches o1’s performance in math,
coding, and science, while delivering faster responses. Evaluations by expert testers
showed that o3-mini produces more accurate and clearer answers, with stronger
reasoning abilities, than OpenAI o1-mini. Testers preferred o3-mini's responses to o1-
mini 56% of the time and observed a 39% reduction in major errors on difficult real-world
questions. With medium reasoning effort, o3-mini matches the performance of o1 on
some of the most challenging reasoning and intelligence evaluations including AIME
and GPQA.
FrontierMath
Research-level mathematics: OpenAI o3-mini with high reasoning effort performs better than its
predecessor on FrontierMath. When prompted to use a Python tool, it solves over 32% of problems
on the first attempt, including more than 28% of the challenging (T3) problems.
LiveBench Coding
LiveBench coding: OpenAI o3-mini surpasses o1-high even at medium reasoning effort, highlighting its
efficiency in coding tasks. At high reasoning effort, o3-mini further extends its lead, achieving
significantly stronger performance across key metrics.
General knowledge
With intelligence comparable to OpenAI o1, OpenAI o3-mini delivers faster performance
and improved efficiency. Beyond the STEM evaluations highlighted above, o3-mini
demonstrates superior results in additional math and factuality evaluations with
medium reasoning effort. In A/B testing, o3-mini delivered responses 24% faster than
o1-mini, with an average response time of 7.7 seconds compared to 10.16 seconds.
Model speed and performance
Latency: on average, o3-mini's time to first token is 2500 ms faster than o1-mini's.
Safety
One of the key techniques we used to teach OpenAI o3-mini to respond safely is
deliberative alignment, where we trained the model to reason about human-written
safety specifications before answering user prompts. Similar to OpenAI o1, we find that
o3-mini significantly surpasses GPT-4o on challenging safety and jailbreak evaluations.
Before deployment, we carefully assessed the safety risks of o3-mini using the same
approach to preparedness, external red-teaming, and safety evaluations as o1. We thank
the safety testers who applied to test o3-mini in early access. Details of the evaluations
below, along with a comprehensive explanation of potential risks and the effectiveness
of our mitigations, are available in the o3-mini system card.
Jailbreak Evaluations
What's next
The release of OpenAI o3-mini marks another step in OpenAI’s mission to push the
boundaries of cost-effective intelligence. By optimizing reasoning for STEM domains
while keeping costs low, we’re making high-quality AI even more accessible. This model
continues our track record of driving down the cost of intelligence—reducing per-token
pricing by 95% since launching GPT-4—while maintaining top-tier reasoning
capabilities. As AI adoption expands, we remain committed to leading at the frontier,
building models that balance intelligence, efficiency, and safety at scale.
Authors
OpenAI
Training
Brian Zhang, Eric Mitchell, Hongyu Ren, Kevin Lu, Max Schwarzer, Michelle Pokrass, Shengjia Zhao, Ted
Sanders
Eval
Adam Kalai, Alex Tachard Passos, Ben Sokolowsky, Elaine Ya Le, Erik Ritter, Hao Sheng, Hanson Wang,
Ilya Kostrikov, James Lee, Johannes Ferstad, Michael Lampe, Prashanth Radhakrishnan, Sean Fitzgerald,
Sebastien Bubeck, Yann Dubois, Yu Bai
Andy Applebaum, Elizabeth Proehl, Evan Mays, Joel Parish, Kevin Liu, Leon Maksin, Leyton Ho, Miles
Wang, Michele Wang, Olivia Watkins, Patrick Chao, Samuel Miserendino, Tejal Patwardhan
Engineering
Adam Walker, Akshay Nathan, Alyssa Huang, Andy Wang, Ankit Gohel, Ben Eggers, Brian Yu, Bryan
Ashley, Chengdu Huang, Christian Hoareau, Davin Bogan, Emily Sokolova, Eric Horacek, Eric Jiang,
Felipe Petroski Such, Jonah Cohen, Josh Gross, Justin Becker, Kan Wu, Kevin Whinnery, Larry Lv, Lee
Byron, Manoli Liodakis, Max Johnson, Mike Trpcic, Murat Yesildal, Rasmus Rygaard, RJ Marsan, Rohit
Ramchandani, Rohan Kshirsagar, Roman Huet, Sara Conlon, Shuaiqi (Tony) Xia, Siyuan Fu, Srinivas
Narayanan, Sulman Choudhry, Tomer Kaftan, Trevor Creech
Search
Adam Fry, Adam Perelman, Brandon Wang, Cristina Scheau, Philip Pronin, Sundeep Tirumalareddy, Will
Ellsworth, Zewei Chu
Product
Antonia Woodford, Beth Hoover, Jake Brill, Kelly Stirman, Minnia Feng, Neel Ajjarapu, Nick Turley, Nikunj
Handa, Olivier Godement
Safety
Andrea Vallone, Andrew Duberstein, Enis Sert, Eric Wallace, Grace Zhao, Irina Kofman, Jieqi Yu, Joaquin
Quinonero Candela, Madelaine Boyd, Mehmet Yatbaz, Mike McClay, Mingxuan Wang, Saachi Jain,
Sandhini Agarwal, Sam Toizer, Santiago Hernández, Steve Mostovoy, Young Cha, Tao Li, Yunyun Wang
External Red Teaming
Leadership
Aidan Clark, Dane Stuckey, Jerry Tworek, Jakub Pachocki, Johannes Heidecke, Kevin Weil, Liam Fedus,
Mark Chen, Sam Altman, Wojciech Zaremba