Date: December 23, 2024
The last day of the 12-day event included the announcement of OpenAI’s o3 model, which many fans claim has achieved AGI capabilities.
OpenAI made some impressive reveals throughout its ‘12 Days of Ship-mas’ event. On the last day, it made one of its biggest announcements: a new reasoning model that will succeed the o1 family of AI models. The newly introduced model is called o3, and it has posted a breakthrough score on the ARC challenge, a prestigious AI reasoning test.
“This is a surprising and important step-function increase in AI capabilities, showing novel task adaptation ability never seen before in the GPT-family models,” said François Chollet, the creator of the ARC challenge, who has also worked as an engineer at Google.
Even though o3 achieved a breakthrough score in the ARC challenge, it did not win the competition's grand prize. Even so, the result suggests that the o3 model marks a major step up in AI capabilities on the path toward general intelligence.
Compared to o1, o3 performs better on several benchmarks covering tasks such as complex coding, scientific problem-solving, and advanced mathematics. For now, the newly introduced model is being cautiously rolled out to researchers for safety testing. The company has announced two models, o3 and o3-mini, which differ in the computing power they require.
The o3 model scored 71.7% accuracy on the SWE-bench Verified test, while o1 scored 48.9% when given the same computing power. If that comparison holds, the new reasoning model gets more done per unit of compute than o1, and it can also do more within an equivalent compute budget than the other top AI models on the market.
Another feather in its cap is Epoch AI's FrontierMath benchmark, on which o3 set a record-high score of 25.2%. No previous AI model had scored above 2% on this test. OpenAI is building the smaller o3-mini version with compute constraints in mind, positioning it as the option for lower-end task requirements.
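To put the benchmark figures quoted above in perspective, here is a minimal Python sketch of the arithmetic. The function names are hypothetical and chosen only for illustration. The 2% FrontierMath figure is treated as an upper bound on earlier models, as reported, so the resulting ratio is only a lower bound.

```python
def point_gap(new_score: float, old_score: float) -> float:
    """Absolute difference between two scores, in percentage points."""
    return new_score - old_score


def relative_gain(new_score: float, old_score: float) -> float:
    """Improvement of new_score over old_score, as a percentage of old_score."""
    return (new_score - old_score) / old_score * 100


# SWE-bench Verified scores reported in the article (percent accuracy).
O3_SWE_BENCH = 71.7
O1_SWE_BENCH = 48.9

# FrontierMath: o3's reported score, and the reported ceiling (an upper
# bound, not an exact score) for all earlier models.
O3_FRONTIER_MATH = 25.2
PREVIOUS_CEILING = 2.0

if __name__ == "__main__":
    gap = point_gap(O3_SWE_BENCH, O1_SWE_BENCH)
    gain = relative_gain(O3_SWE_BENCH, O1_SWE_BENCH)
    ratio = O3_FRONTIER_MATH / PREVIOUS_CEILING

    print(f"SWE-bench gap: {gap:.1f} percentage points")    # 22.8
    print(f"SWE-bench relative gain: {gain:.1f}%")          # 46.6%
    print(f"FrontierMath: at least {ratio:.1f}x the prior best")  # 12.6x
```

In other words, the SWE-bench result is a gap of 22.8 percentage points, or roughly a 47% relative improvement over o1, while the FrontierMath score is more than twelve times the previously reported ceiling.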
By Arpit Dubey
Arpit is a dreamer, wanderer, and tech nerd who loves to jot down tech musings and updates. Armed with a Bachelor's in Business Administration, a knack for crafting compelling narratives, and a sharp specialization in everything from Predictive Analytics to FinTech (and let’s not forget SaaS, healthcare, and more), Arpit crafts content that’s as strategic as it is compelling. With a Logician mind, he is always chasing sunrises and tech advancements while secretly preparing for the robot uprising.