AB Data

Posted: **Sun Feb 09, 2025 8:58 am**

OpenAI unveils a new flagship model called O1 that can "reason" on its own, ushering in a new paradigm in artificial intelligence and machine learning.
OpenAI launches O1
OpenAI just unveiled its new O1 model, which can "reason" on its own before answering a user's query. OpenAI reports large gains over GPT-4o on benchmarks for complex tasks such as math, coding, and science questions.

The new OpenAI model, reportedly codenamed "Strawberry" internally (and earlier "Q*"), has been rumored for a long time, even sparking conspiracy theories on Twitter such as "What did Ilya see?" People have long suspected that it is a model that can reason and improve itself, and now the truth has finally been revealed.

How does it work?
OpenAI O1, or "Strawberry", is a reasoning model that works through multiple steps before answering a question. The model breaks a complex task down into smaller steps and then attempts to solve them one by one. It is also self-critical, meaning it can notice when it is heading in the wrong direction for the given context and correct itself.

This is very similar to how chain-of-thought (CoT) prompting works, but the key difference is that the CoT step itself is trained through reinforcement learning (RL), which opens up a new dimension of scaling. Hence the naming reset from GPT-4o to "O1".
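To make the contrast concrete, here is a minimal Python sketch of what chain-of-thought prompting looks like from the user's side. The function names and the prompt wording are illustrative assumptions, not anything published by OpenAI. The point is only that with plain CoT the step-by-step instruction lives in the prompt, whereas O1 is described as having that behavior trained into the model.

```python
def build_direct_prompt(question: str) -> str:
    """Plain prompt: the model is asked to answer right away."""
    return f"Question: {question}\nAnswer:"


def build_cot_prompt(question: str) -> str:
    """Chain-of-thought prompt: the user explicitly asks for intermediate steps.

    The instruction text is a common CoT phrasing, used here as an example only.
    """
    return (
        f"Question: {question}\n"
        "Let's think step by step, then state the final answer.\n"
        "Reasoning:"
    )


if __name__ == "__main__":
    question = "A train leaves at 3:40 pm and arrives at 6:15 pm. How long is the trip?"
    print(build_direct_prompt(question))
    print("---")
    print(build_cot_prompt(question))
```

With a model like O1, only the first kind of prompt should be needed, since the reasoning happens before the answer without being requested.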

Early LLMs relied on a long, compute-heavy pre-training step in which the model built up an internal model of the world and absorbed as much information as possible. At test time, when asked a question, such a model simply answered directly from what it had learned. With O1, the LLM instead takes multiple reasoning steps over its input before settling on an answer.

O1 starts with a relatively small amount of reasoning, around 10-20 steps taking 15-20 seconds, but OpenAI plans to scale this up to hours, days, and even weeks. Imagine asking an LLM to come up with a treatment for cancer and letting it reason for weeks before it answers.
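As a loose analogy for spending more compute at answer time, the toy Python sketch below runs a propose, check, revise loop under a step budget. It says nothing about how O1 is actually implemented: the function name `reason_with_budget`, the square-root task, and the bisection update are all assumptions chosen so the example runs on its own.

```python
def reason_with_budget(target: float, max_steps: int, tolerance: float = 1e-9):
    """Toy propose-check-revise loop: estimate sqrt(target) within a step budget.

    Returns the final guess and the number of revision steps actually used.
    """
    low, high = 0.0, max(1.0, target)
    guess = (low + high) / 2          # initial proposal
    steps_used = 0
    for _ in range(max_steps):
        error = guess * guess - target  # self-check of the current proposal
        if abs(error) <= tolerance:
            break                       # good enough, stop early
        if error > 0:
            high = guess                # guess was too large
        else:
            low = guess                 # guess was too small
        guess = (low + high) / 2        # revised proposal
        steps_used += 1
    return guess, steps_used


if __name__ == "__main__":
    for budget in (2, 10, 40):
        answer, used = reason_with_budget(2.0, budget)
        print(f"budget={budget:>2} used={used:>2} answer={answer:.10f} "
              f"error={abs(answer * answer - 2.0):.2e}")
```

A larger budget lets the loop keep checking and revising until its answer passes the self-check, which is the intuition behind letting a model "think" for longer. The analogy stops there, since a real model has no exact error signal to check against.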

OpenAI O1: A new paradigm for artificial intelligence
