What AI can’t do

In June, a team of Apple researchers released a research paper titled “The Illusion of Thinking,” which found that “state-of-the-art [large reasoning models] still fail to develop generalizable problem-solving capabilities, with accuracy ultimately collapsing to zero beyond certain complexities across different environments.”

In other words, once problems get complex enough, reasoning models stop working. More concerning, the models’ abilities don’t generalize, which suggests they might just be memorizing patterns rather than coming up with genuinely new solutions.

“We can make it do really well on benchmarks. We can make it do really well on specific tasks,” said Ali Ghodsi, the CEO of AI data analytics platform Databricks. “Some of the papers you alluded to show it doesn’t generalize. So while it’s really good at this task, it’s awful at very common sense things that you and I would do in our sleep. And that’s, I think, a fundamental limitation of reasoning models right now.”