OpenAI’s new o1 model is causing a stir in the AI community. The model pauses to think and reason as you use it. But is it really thinking?
Fascinating (premature?) victory lap from Sam Altman & OpenAI. This is quite the declaration.
Take this sort of stuff with a grain of salt, but also as useful signal about attitudes of AI insiders actually building new models. https://t.co/kN2NPq2Ud2
— Ethan Mollick (@emollick) September 23, 2024
And what does it mean if it is? Building such advanced AI has great potential but also serious risks. We need to weigh the threat of real harm against the chance of major improvements in human life.
Comparing the creativity of a representative human sample to GPT-4 finds "the creative ideas produced by AI chatbots are rated more creative than those created by humans… Augmenting humans with AI improves human creativity, albeit not as much as ideas created by ChatGPT alone."
— Ethan Mollick (@emollick) September 23, 2024
A popular tech podcast dives into the reasons OpenAI built the o1 model. The hosts discuss its unusual launch and the public’s worries about what this AI can do.
A mindworm that has caused great damage among AI researchers is the implicit, universal assumption that every piece of data is a (random) "sample" from a (static) "distribution". This is a valid way of modeling some phenomena, but it isn't applicable to the vast majority of…
— François Chollet (@fchollet) September 24, 2024
The podcast looks at what this moment in AI means as companies make “agents” that can do more tasks for us.
The podcast also covers major legal and political news. This includes the trials for TikTok and Google, which are testing new areas in tech regulation. They also talk about former President Trump’s new cryptocurrency project and how it could affect the tech world.
@Forbes tech roundup: @OpenAI issues a warning, @nvidia and @salesforce strategy to enhance #AI for #businesses, @Intuit launches enterprise suite, @SlackHQ new #AI-powered assistant, and @Microsoft new Copilot Pages #techsolutions
Read: https://t.co/uED1Zx24j7
— Gene Marks CPA (@genemarks) September 22, 2024
Lastly, the podcast answers a key question from listeners: how to manage and clean up data on devices. They share tips and tools to help users easily deal with digital clutter. Check out the podcast for more details on OpenAI, TikTok, Google, and digital cleanup tools.
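For readers who want to try the cleanup themselves, here is a small, hedged sketch of the kind of tool the hosts describe: a Python script that flags duplicate files by content hash. The target folder is an illustrative assumption; run it against a copy of your data first.

```python
# Sketch: flag duplicate files by content hash, a common digital-clutter fix.
# The target folder below is illustrative; test on a copy of your data first.
import hashlib
from collections import defaultdict
from pathlib import Path

def file_digest(path: Path) -> str:
    """Hash a file's contents in chunks so large files don't exhaust memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def find_duplicates(root: str) -> dict[str, list[Path]]:
    """Group files under `root` by hash; any group larger than one is a duplicate set."""
    groups: dict[str, list[Path]] = defaultdict(list)
    for path in Path(root).expanduser().rglob("*"):
        if path.is_file():
            groups[file_digest(path)].append(path)
    return {h: paths for h, paths in groups.items() if len(paths) > 1}

if __name__ == "__main__":
    for digest, paths in find_duplicates("~/Downloads").items():
        print(f"duplicates ({digest[:8]}):")
        for p in paths:
            print(f"  {p}")
```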
OpenAI’s latest model, o1, is stirring debate in the AI community due to unexpected and troubling hallucinations. These AI-generated falsehoods can show up both in the model’s outputs and in its chain-of-thought (CoT) reasoning.
“AI hallucinations” occur when AI systems create content that is wrong or made up. These errors can be obvious, like saying Abraham Lincoln flew jet planes, or they can be subtle and harder to spot.
Either way, they undermine the reliability and trustworthiness of AI-generated content. Chain-of-thought (CoT) reasoning is when an AI works through a problem step by step, much the way humans reason logically. The aim is to produce more complete and correct results by showing users the reasoning process.
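To make the idea concrete, here is a minimal sketch of classic CoT prompting using the OpenAI Python SDK. This illustrates the general technique, not OpenAI’s internal o1 mechanism; the model name and prompt are assumptions for the example.

```python
# Minimal chain-of-thought prompting sketch (illustrative only).
# Assumes the `openai` package and an OPENAI_API_KEY environment variable.
from openai import OpenAI

client = OpenAI()

question = "A train leaves at 2:40 pm and arrives at 5:05 pm. How long is the trip?"

response = client.chat.completions.create(
    model="gpt-4o",  # plain CoT prompting works with any chat model
    messages=[
        {
            "role": "user",
            # Asking for explicit intermediate steps is the heart of CoT prompting.
            "content": question + "\nThink step by step, then give the final answer.",
        }
    ],
)

print(response.choices[0].message.content)
```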
o1 applies CoT reasoning to every query. That can be a boon for fields like science, math, and coding, where complex problem-solving is needed. But because of the heavy computing involved, it also means slower responses and potentially higher costs.
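Those costs are visible in the API itself: o1-style models report hidden “reasoning tokens” that are billed as output tokens even though the raw chain of thought is never returned. A hedged sketch follows, with field names as I understand the OpenAI Python SDK; verify against the current API reference.

```python
# Sketch: inspecting reasoning-token usage for an o1-style model.
# Field names follow the OpenAI Python SDK as of o1's launch; treat
# them as assumptions and check the current API docs before relying on them.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="o1-preview",
    messages=[{"role": "user", "content": "Prove that the square root of 2 is irrational."}],
)

usage = response.usage
print("prompt tokens:    ", usage.prompt_tokens)
print("completion tokens:", usage.completion_tokens)
# Hidden chain-of-thought tokens: billed as completion tokens,
# but never shown to the caller.
print("reasoning tokens: ", usage.completion_tokens_details.reasoning_tokens)
```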
A key issue with o1 is that the CoT shown to users is not the actual reasoning the model produces internally. OpenAI gives users only a summary, saying that revealing the raw process could expose sensitive information. This lack of openness raises doubts about how faithful and reliable the displayed CoT really is.
OpenAI’s chain-of-thought method scrutinized
Traditionally, AI hallucinations have been tied to generated content like essays or articles. But with o1, hallucinations can also sneak into the displayed CoT.
Since the visible CoT is itself AI-generated content, it carries the same hallucination risk as any other model output. Hallucinations in the CoT are worrying because they can erode trust in AI systems meant to support human decisions. And because users usually cannot see the real CoT, it is harder to judge whether the outputs are correct and reliable.
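One common mitigation, independent of o1, is self-consistency checking: sample several independent answers and see whether they agree. A minimal sketch, assuming the same SDK setup as above; `extract_answer` is a hypothetical helper for the example, and a standard chat model is used because it allows a temperature setting.

```python
# Self-consistency sketch: sample several answers and majority-vote.
# A crude hallucination check, not a guarantee of correctness.
from collections import Counter

from openai import OpenAI

client = OpenAI()

def extract_answer(text: str) -> str:
    """Hypothetical helper: pull the final answer from a response.
    Here we simply take the last non-empty line."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    return lines[-1] if lines else ""

question = (
    "What is 17 * 24? Think step by step, "
    "then state only the final number on the last line."
)

samples = []
for _ in range(5):
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": question}],
        temperature=1.0,  # diversity between samples is the point
    )
    samples.append(extract_answer(response.choices[0].message.content))

answer, votes = Counter(samples).most_common(1)[0]
print(f"majority answer: {answer} ({votes}/{len(samples)} samples agree)")
```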
As AI evolves, fixing AI hallucinations, especially in CoT reasoning, is key to keeping trust in AI systems that generate content. Being open about AI processes and studying how to reduce these hallucinations will be important steps forward. For those invested in AI and its uses, seeing and tackling these limits will be vital to using AI’s full potential while lowering its risks.
OpenAI’s new model is a big deal because it goes beyond just language skills. It can do complex reasoning, which matters for physics, coding, and more. So far, most progress in large language models (LLMs) has been about language.
However, OpenAI’s new o1 model is good at multi-step reasoning. This skill is needed for advanced math, coding, and other STEM areas. OpenAI says the o1 model uses a “chain of thought” method.
This lets it spot and fix its own mistakes, break complex problems into simpler steps, and try a different approach when one fails. Test results are strong: the model ranks in the top 11% on competitive programming questions and would place among the top 500 US high school students in a qualifier for the USA Mathematical Olympiad.
It can also answer PhD-level questions in topics ranging from space science to chemistry. This matters because LLMs have so far focused mostly on language tasks, producing chatbots and voice assistants that can understand and generate text. But these LLMs often get facts wrong, and they lack the skills to solve key problems in fields like drug discovery, materials science, coding, or physics. OpenAI’s o1 model is one of the first to suggest that LLMs might soon be useful tools for researchers in these areas.
However, some experts warn we must be careful when comparing LLMs to human skills. It’s hard to measure how well a model like o1 can truly “reason.”
The model also still struggles with open-ended reasoning, and the reasoning-heavy approach makes it more expensive to use. OpenAI’s user surveys show that GPT-4o is still better for language-focused tasks. Even with these challenges, OpenAI’s new model is a big step forward. It hints that a race is beginning to build AI models that can outperform humans.
The new model could unlock new abilities and help solve complex problems. But we will only learn what it can really do as researchers and labs put it to work.
April Isaacs is a news contributor for DevX.com. She is a long-term, self-proclaimed nerd. She loves all things tech and computers and still has her first Dreamcast system, lovingly named Joni, after Joni Mitchell.