Trade Verdict
Economy

Has Ed Zitron Discovered the Fatal Flaw in OpenAI and Its Flagship ChatGPT?

By Editorial | November 13, 2025 | 15 Mins Read


Ed Zitron has been relentlessly pursuing the questionable economics of AI and has tentatively identified a bombshell in his latest post, Exclusive: Here's How Much OpenAI Spends On Inference and Its Revenue Share With Microsoft. If his finding is valid, large language models like ChatGPT are much farther from ever becoming economically viable than even optimists imagine. No wonder OpenAI chief Sam Altman has been talking up a bailout.

By way of background, over a series of often very long and relentlessly documented articles, Zitron has demonstrated (among many other things) the absolutely monumental capital expenditures of the leading AI incumbents versus comparatively thin revenues, let alone profits. Zitron's articles on the massive cash burn and massive capital misallocation that AI represents have the work of Gary Marcus on fundamental performance shortcomings as de facto companion pieces. A sampling of Marcus' badly needed sobriety:

5 recent, ominous signs for Generative AI

5 signs that Generative AI is losing traction

Could China devastate the US without firing a shot?

For a quick sense of how unsustainable OpenAI's economics are, see the opening paragraph from Marcus' November 4 article, OpenAI probably can't make ends meet:

A few days ago, Sam Altman got seriously pissed off when Brad Gerstner had the temerity to ask how OpenAI was going to pay the $1.4 trillion in obligations he was taking on, given a mere $13 billion in revenue.

By way of reference, most estimates of the size of the subprime mortgage market centered on $1.3 trillion. And the AAA tranches of the bonds on subprime mortgage pools were money good in the end, although they did fall in value during the crisis when that was uncertain. And in foreclosures, the homes nearly always had some liquidation value.

Now to Zitron's latest.

Many, particularly AI advocates in the business press, contend that even if the AI behemoths go bankrupt or are otherwise duds, they will still leave something of considerable value behind, as the building of the railroads (which spawned many bankruptcies) or the dot-com bubble did.

But these assumptions seem generally to rest on a naive view of AI economics: that having made a huge expenditure on training, the ongoing cost of running queries is not high and will drop to bupkis. This was the case with railroads, which had high fixed costs and negligible variable costs. The network effects of Internet businesses produce similar results, with increases in scale generating both considerable user benefits and lower per-customer costs.

That's not the case with AI. Not only are there very large training costs, there are also "inference" costs. And they aren't just considerable; they have vastly exceeded training costs. The viability of AI depends on inference costs dropping to a comparatively low level.

Zitron's potentially devastating find is breadcrumbs suggesting that OpenAI's inference costs are considerably higher than they pretend. Zitron further posits that the user charges for ChatGPT fall far short of covering the inference expenditures. Because the reporting on AI economics by all the big players is so abjectly terrible, Zitron's allegations may well pan out.

First, a detour to explain more about inference. From Primitiva Substack's All You Need to Know about Inference Cost, from the end of 2024. Emphasis original:

Over the first 16 months after the launch of GPT-3.5, the market's attention was fixated on training costs, which often made headlines for their staggering scale. However, following the wave of API price cuts in mid-2024, the spotlight has shifted to inference costs, revealing that while training is expensive, inference costs even more.

According to Barclays, training the GPT-4 series required roughly $150 million in compute resources. Yet, by the end of 2024, GPT-4's cumulative inference costs are projected to reach $2.3 billion, 15x the cost of training.

As an aside, Gary Marcus pointed out in October that GPT-5 did not arrive in 2024 as had been predicted and has been disappointing. Back to Primitiva:

The September 2024 launch of GPT-o1 further accelerated the shift of compute demand from training toward inference. GPT-o1 generates 50% more tokens per prompt compared to GPT-4o, and its enhanced reasoning capabilities result in the generation of inference tokens at 4x the output tokens of GPT-4o.

Tokens, the smallest units of textual data processed by models, are central to inference compute. Typically, one word corresponds to about 1.4 tokens. Each token interacts with every parameter in a model, requiring two floating-point operations (FLOPs) per token-parameter pair. Inference compute can be summarized as:

Total FLOPs ≈ Number of Tokens × Model Parameters × 2 FLOPs.
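The rule of thumb above can be put into numbers. A minimal sketch, where the model size and output length are made-up illustrative values, not figures from the post:

```python
def inference_flops(num_tokens: int, num_params: int) -> int:
    """Rule-of-thumb inference compute: 2 FLOPs per token-parameter pair."""
    return 2 * num_tokens * num_params

# Hypothetical example: a 1-trillion-parameter model producing a
# 500-word response (~700 tokens, at the quoted ~1.4 tokens per word).
tokens = round(500 * 1.4)  # 700
params = 10**12
print(f"{inference_flops(tokens, params):.2e}")  # 1.40e+15
```

At roughly 1.4 petaFLOPs per medium-length response, the point of the quoted passage is clear: this compute bill recurs on every single query, unlike the one-time training spend.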

Compounding this volume expansion, the price per token for GPT-o1 is 6x that of GPT-4o, resulting in a 30-fold increase in total API costs to perform the same task with the new model. Research from Arizona State University shows that, in practical applications, this cost can soar to as much as 70x. Understandably, GPT-o1 has been available only to paid subscribers, with usage capped at 50 prompts per week….

The cost surge of GPT-o1 highlights the trade-off between compute costs and model capabilities, as theorized by the Bermuda Triangle of GenAI: everything else equal, it is impossible to make simultaneous improvements on inference costs, model performance, and latency; improvement in one necessarily comes at the sacrifice of another.

Still, advancements in models, systems, and hardware can expand this "triangle," enabling applications to lower costs, increase capabilities, or reduce latency. Consequently, the pace of these cost reductions will ultimately dictate the speed of value creation in GenAI….

James Watt's steam engine was such an example. It was invented in 1776, but it took 30 years of innovations, such as the double-acting design and the centrifugal governor, to raise thermal efficiency from 2% to 10%, making steam engines a viable power source for factories…

For GenAI, inference costs are the equivalent barrier. Unlike pre-generative AI software products, which were regarded as a superior business model to "traditional businesses" largely because of their near-zero marginal cost, GenAI applications must pay for GPUs for real-time compute.

Zitron is suitably cautious about his findings; perhaps some heated denials from OpenAI will clear things up. Do read the full post; I have excised many key details as well as some qualifiers to highlight the central issue. From Zitron:

Based on documents seen by this publication, I am able to report OpenAI's inference spend on Microsoft Azure, along with its payments to Microsoft as part of its 20% revenue share agreement, which was reported in October 2024 by The Information. In simpler terms, Microsoft receives 20% of OpenAI's revenue….

The numbers in this post differ from those that have been reported publicly. For example, earlier reports had said that OpenAI spent $2.5 billion on "cost of revenue" – which I believe represents OpenAI's inference costs – in the first half of CY2025.

According to the documents seen by this publication, OpenAI spent $5.02 billion on inference alone with Microsoft Azure in the first half of calendar year 2025 (CY2025).

As a reminder: inference is the process by which a model creates an output.

This is a pattern that has continued through the end of September. By that point in CY2025, three months later, OpenAI had spent $8.67 billion on inference.

OpenAI's inference costs have risen consistently over the last 18 months, too. For example, OpenAI spent $3.76 billion on inference in CY2024, meaning that OpenAI had already doubled its inference costs in CY2025 by September.

Based on its reported revenues of $3.7 billion in CY2024 and $4.3 billion for the first half of CY2025, it appears that OpenAI's inference costs easily eclipsed its revenues.

Yet, as mentioned previously, I am also able to shed light on OpenAI's revenues, as these documents also reveal the amounts that Microsoft takes as part of its 20% revenue share with OpenAI.

Concerningly, extrapolating OpenAI's revenues from this revenue share does not produce numbers that match those previously reported.

According to the documents, Microsoft received $493.8 million in revenue share payments from OpenAI in CY2024, implying revenues for CY2024 of at least $2.469 billion, or around $1.23 billion less than the $3.7 billion that has been previously reported.

Similarly, for the first half of CY2025, Microsoft received $454.7 million as part of its revenue share agreement, implying OpenAI's revenues for that six-month period were at least $2.273 billion, or around $2 billion less than the $4.3 billion previously reported. By September, Microsoft's revenue share payments totalled $865.9 million, implying OpenAI's revenues were at least $4.329 billion.
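Zitron's back-out of revenue from the Microsoft payments is simple division. A sketch of the arithmetic, taking the 20% revenue share at face value (payment figures in $ millions, from the quoted passage):

```python
REVENUE_SHARE = 0.20  # Microsoft's quoted cut of OpenAI revenue

def implied_revenue_floor(payment_mm: float) -> float:
    """Minimum revenue ($mm) implied by a given revenue-share payment."""
    return payment_mm / REVENUE_SHARE

payments = [("CY2024", 493.8), ("H1 CY2025", 454.7), ("Jan-Sep CY2025", 865.9)]
for period, payment in payments:
    # Yields ~2,469.0 / ~2,273.5 / ~4,329.5, matching the quoted floors
    print(f"{period}: at least ${implied_revenue_floor(payment):,.1f}mm")
```

The word "floor" matters: if any OpenAI revenue were excluded from the sharing agreement, actual revenue could be higher, which is why Zitron phrases each figure as "at least."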

According to Sam Altman, OpenAI's revenue is "well more" than $13 billion. I am not sure how to reconcile that statement with the documents I have seen….

Due to the sensitivity and importance of this information, I am taking a far more blunt approach with this piece.

Based on the information in this piece, OpenAI's costs and revenues are potentially dramatically different from what we believed. The Information reported in October 2024 that OpenAI's revenue could be $4 billion, and inference costs $2 billion, based on documents "which include financial statements and forecasts," and specifically added the following:

OpenAI appears to be burning far less cash than previously thought. The company burned through about $340 million in the first half of this year, leaving it with $1 billion in cash on the balance sheet before the fundraising effort. But the cash burn could accelerate sharply in the next couple of years, the documents suggest.

I do not know how to reconcile this with what I am reporting today. In the first half of CY2024, based on the information in the documents, OpenAI's inference costs were $1.295 billion, and its revenues at least $934 million.

Indeed, it is tough to reconcile what I am reporting with much of what has been reported about OpenAI's costs and revenues.

So this is quite a gauntlet to have thrown down. Not only is Zitron saying that OpenAI may still have business-potential-wrecking compute costs, but his evidence indicates that OpenAI has also been making serious misrepresentations about costs and revenues. Because OpenAI is not public, OpenAI has not necessarily engaged in fraud; one presumes it has been accurate with those to whom it has financial reporting obligations on money matters. But if Zitron has this right, OpenAI has been telling howlers to other important stakeholders.

The Financial Times, which reviewed Zitron's data with him before publication, is amplifying his findings. From How high are OpenAI's compute costs? Possibly a lot higher than we thought:

Pre-publication, Ed was kind enough to discuss with us the information he has seen. Here are the inference costs as a chart:

The article then appropriately presents caveats, as did Zitron's long-form piece, including kinda-sorta comments from Microsoft and OpenAI:

The best place to start is by saying what the numbers don't show. The above is understood to be for inference only…

More importantly, is the data correct? We showed Microsoft and OpenAI versions of the figures presented above, rounded to a multiple, and asked if they recognized them as broadly accurate. We also put the data to people familiar with the companies and asked for any guidance they could offer.

A Microsoft spokeswoman told us: "We won't get into specifics, but I can say the numbers aren't quite right." Asked what exactly that meant, the spokeswoman said Microsoft wouldn't comment and didn't respond to our subsequent requests. An OpenAI spokesman didn't respond to our emails other than to say we should ask Microsoft.

A person familiar with OpenAI said the figures we had shown them didn't give a complete picture, but declined to say more. In short, though we've been unable to verify the data's accuracy, we've been given no reason to doubt it significantly either. Make of that what you will.

Taking everything at face value, the figures appear to show a disconnect between what's been reported about OpenAI's finances and the operating costs that are going through Microsoft's books…

As Ed writes, OpenAI appears to have spent more than $12.4bn at Azure on inference compute alone in the last seven calendar quarters. Its implied revenue for the period was a minimum of $6.8bn. Even allowing for some fudging between annualised run rates and period-end totals, the apparent gap between revenues and operating costs is far more than has been reported previously. And, like Ed, we're struggling to explain how the numbers can be so far apart.
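The FT's seven-quarter totals line up with the figures quoted earlier in the post. A rough reconciliation in $bn, on the assumption that the seven quarters are all of CY2024 plus the first three quarters of CY2025:

```python
# Inference spend at Azure, $bn, from the figures quoted above:
# full-year CY2024 plus January-September CY2025 = seven quarters.
inference = 3.76 + 8.67

# Implied revenue floors backed out of Microsoft's 20% share, $bn,
# for the same two periods.
implied_revenue = (0.4938 + 0.8659) / 0.20

print(round(inference, 2))        # 12.43, i.e. "more than $12.4bn"
print(round(implied_revenue, 2))  # 6.8, i.e. "a minimum of $6.8bn"
```

On these numbers, inference spend alone runs close to double the implied revenue floor over the whole period, which is the gap the FT says it cannot explain.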

If the data is accurate (which we can't guarantee, to reiterate, but we're writing this post after giving both companies every opportunity to tell us that it isn't), then it would call into question the business model of OpenAI and nearly every other general-purpose LLM vendor. At some point, going by the figures, either operating costs have to collapse or customer charges have to rise dramatically. There's no hint of either trend taking hold yet.

A quick search on Twitter finds no one yet trying to lay a glove on Zitron. In the pink paper's comments section, a few contend that Microsoft's weak protests about the data mean it can't be relied upon. While that's narrowly correct, one would expect a more robust debunking given the implications. And some of the supportive comments add value, like:

Bildermann
It explains why ChatGPT has become so dumb. They're trying to reduce inference costs.

His name is Robert Paulson
The fact we have to use a gypsy with a magic 8 ball to figure out these numbers for the company that's "going to revolutionize every industry" is more telling than the numbers themselves

No F1 key

Zitron has definitely been hitting that haterade, but Microsoft press saying the numbers "aren't quite right" makes me think this is pretty accurate.

manticore
That creaking noise is the lid being prized off the can of worms –

MS had better get on top of this. That earnings stream is highly unlikely – because straight line etc. etc. – which means that their projections are going to be badly affected and presumably there has to be a K split in the projection line at some point. MS getting holed below the waterline has real-world impacts.

Multipass
I've been reading Ed's blog for a while now, and while he's clearly biased in one direction, it comes across as infinitely more credible than anything Sam Altman has said in years.

The real issue in my eyes is that the revenue numbers are so opaque and obfuscated that nobody has any idea if any of this can make money.

The fact that Microsoft and Google seem to be intentionally muddying the waters regarding non-hosting-related LLM-driven revenues, and that OpenAI and Anthropic have been disclosing basically nothing, should come across as a major red flag, and yet nobody seems to care.

Angry Analyst
Spoiler alert: technology maturity will not help.

They will train and train and train ever larger models (parameter counts in the trillions), feeding them all the data they can get or fabricate, using more powerful supercomputers than the ones running the physics simulations of the US nuclear arsenal. They will manually hack (which is why they need thousands of developers) extra logic around the model, fine-tuned for more and more scenarios.

But it will all just be papering over the inescapable truth that a generative pre-trained transformer model is intelligence as much as CGI is reality: that is, exactly zero; it's all a crude, approximate imitation devoid of the underlying nature of the thing. GPTs, for example, can't solve logical problems because GPT models lack the facilities to hold a conceptual representation of a problem, or to hold onto any "thought" of their own. That's also why, whenever you try to use a GPT to carefully fine-tune a response, it mostly can't; it will just regenerate everything even when explicitly instructed not to do so.

The essential question is: does it matter?

It may very well be that the imitation game will reach a point (with all that manual hacking and testing of thousands of trajectories to select and condense the most likely response during inference) where it will be able to create and maintain the illusion of intelligence, even sentience, such that hundreds of millions will end up just using it anyway, regardless of accuracy or substance. There are early warnings of that already.
It also stands to reason that most tech bros know this, but go along with the game because 1) it's all about relevance and engagement, and there's a lot of money to be made even from mere imitation, and 2) most likely they believe they need to take part in this phase of AI development to be in position for the next one.

In any case, there is no path for GPT toward intelligence; it isn't a scaling or maturity issue.

Let us see if and when some shoes drop after this report. The bare minimum should be sharper questions on analyst calls.


