Artificial Confidence #2: The week AI labs became Palantir
Anthropic and OpenAI both stood up consulting arms this month. GitHub Copilot quietly admitted its subscription pricing never made sense. Vercel published the receipts.
There’s been a shift: the model layer used to be the prize, yet this week the AI industry quietly conceded that it might be the loss leader. Anthropic launched a $1.5 billion enterprise services joint venture with Blackstone, Hellman & Friedman, and Goldman Sachs on May 4. Seven days later, OpenAI countered with a $4 billion subsidiary called DeployCo, valued internally at $14 billion, anchored by 150 forward-deployed engineers (this cycle’s Hot New Job Title for “engineers who don’t embarrass themselves trying to hold a conversation”) acquired from a London consultancy you have not previously heard of. Three management consulting firms, Bain & Company, Capgemini, and McKinsey, wrote checks into the entity that seems explicitly designed to replace them. Either they’re hedging against their own obsolescence, or they would simply like to stop doing PowerPoints; both readings are supported by the press release. I too would like to stop doing PowerPoints, but that’s the job sometimes.
The week’s other money headlines line up underneath this thesis. If I can stuff them into one sentence like a flailing Labrador into an apartment bathtub: Cerebras traded, Anthropic floated $900 billion, OpenAI capped Microsoft’s revenue share at $38 billion, GitHub Copilot conceded its subscription pricing was never going to survive contact with agentic workloads, and Vercel published seven months of production data showing exactly why. That’s great, but to see most of what actually shipped, you had to read past the IPO frenzy. And that, dear reader, is why I’m writing this.
What Actually Changed (Adjusted For Spin)
Anthropic launched Claude Platform on AWS, demoting Bedrock to “legacy” in its own docs
On May 11, Anthropic announced general availability of Claude Platform on AWS, a direct Anthropic API surface accessed through an AWS account. Authentication is IAM. Billing is through AWS Marketplace. Audit goes through CloudTrail. AWS is reduced to authentication, audit, and billing, which they do well—but none have anything to do with running the actual product. The hyperscaler has been demoted to “being Stripe, if Stripe’s UX was absolute dogshit.” Note: the inference runs on Anthropic-managed infrastructure outside the AWS security boundary, which is the sort of sentence that would have made a compliance officer eat a stapler in 2024 and is now a marketing bullet that the compliance officer is debating eating instead.
The framing AWS chose is “additive.” The framing Anthropic’s own docs chose is less beneficial to AWS, and consequently more honest. The existing Bedrock integration is now labeled “legacy Amazon Bedrock integration.” Claude Opus 4.7, Anthropic’s current flagship, does not have an ARN-versioned model ID on the legacy interface at all, which is the corporate equivalent of moving someone into a smaller office next to the printer spewing toner into the air and forgetting to give them the new keycard.
I clock two details the press release did not lead with. First, migrating from Bedrock changes the SigV4 signing context, the base URL, the API format, the model IDs, the SDK client, the streaming format, the request headers, and the region availability, with an implicit customer message of “good luck, asspony.” Eight independent changes is not a “migration,” it’s a goddamned rewrite. Second, negotiated discounts and AWS Marketplace private offers do not transfer automatically between Bedrock and Claude Platform on AWS. Translation: if you spent a quarter negotiating your Bedrock pricing, you get to spend another quarter negotiating again, and your existing EDP commit is not portable. For added fun, the AWS-side pricing structure on this brings new meaning to “absurd:”
Usage is denominated in Claude Consumption Units (CCUs) at $0.01 USD per CCU. The CCU price is fixed and never discounted. Anthropic rates your token usage in USD at standard per-model, per-feature rates, applies any negotiated discount, then converts the result to CCUs at $0.01 per CCU. Discounts result in fewer CCUs metered, not a lower CCU price. CCUs are not prepaid credits; there is no CCU balance or commitment.
That’s a lot of words to say “you will not know what any of this costs until the bill shows up.”
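For anyone who wants the CCU paragraph as arithmetic instead of legalese, here is a minimal sketch of the metering steps as the quoted text describes them. The function names and the 15% discount are made up for illustration; only the fixed $0.01 CCU price and the order of operations come from the pricing language above.

```python
from decimal import Decimal

CCU_PRICE_USD = Decimal("0.01")  # fixed and never discounted, per the quoted pricing text


def metered_ccus(rated_usage_usd: Decimal, discount_rate: Decimal) -> Decimal:
    """Turn rated token usage (USD) into metered CCUs.

    rated_usage_usd: usage already priced at standard per-model rates
                     (hypothetical input; the real rates are not in this article).
    discount_rate:   negotiated discount as a fraction, e.g. Decimal("0.15").
    """
    # The discount is applied to the dollar amount first...
    discounted_usd = rated_usage_usd * (Decimal("1") - discount_rate)
    # ...and only then converted to CCUs at the fixed price.
    return discounted_usd / CCU_PRICE_USD


def bill_usd(ccus: Decimal) -> Decimal:
    """What actually shows up on the invoice for a given CCU count."""
    return ccus * CCU_PRICE_USD


if __name__ == "__main__":
    ccus = metered_ccus(Decimal("1000"), Decimal("0.15"))
    print(ccus, bill_usd(ccus))
```

The takeaway is the one the quote buries: a discount shrinks the number of CCUs metered, while the unit price on the bill never moves, so comparing CCU counts across accounts tells you nothing without knowing each account's discount.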
GitHub Copilot is moving to usage-based billing June 1, and the multipliers portend darkness
GitHub announced that on June 1, all Copilot plans transition from premium request units to GitHub AI Credits, where one AI Credit equals one cent. Token consumption gets billed at published API rates, base subscription prices stay the same, code completions stay unlimited, and everything agentic gets metered.
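To see how differently a chat question and an agent session land under this scheme, here is a rough sketch of the credit conversion. The one-cent credit value is from the announcement; the per-million-token rates and the token counts are hypothetical placeholders, not GitHub's or any lab's published prices.

```python
CREDIT_VALUE_USD = 0.01  # one AI Credit equals one cent, per the announcement


def credits_for_usage(input_tokens, output_tokens,
                      input_usd_per_mtok, output_usd_per_mtok):
    """Convert token usage into AI Credits at given per-million-token API rates."""
    usd = (input_tokens / 1_000_000) * input_usd_per_mtok \
        + (output_tokens / 1_000_000) * output_usd_per_mtok
    return usd / CREDIT_VALUE_USD


if __name__ == "__main__":
    # Hypothetical rates: $5 per million input tokens, $25 per million output tokens.
    chat = credits_for_usage(2_000, 500, 5.0, 25.0)
    agent = credits_for_usage(4_000_000, 200_000, 5.0, 25.0)
    print(f"quick chat question: {chat:.2f} credits")
    print(f"long agent session:  {agent:.2f} credits")
```

Under those made-up inputs the two requests differ by roughly three orders of magnitude, which is the gap a flat per-request unit could never represent.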
The honest version of why came from GitHub’s own chief product officer, Mario Rodriguez: “a quick chat question and a multi-hour autonomous coding session can cost the user the same amount.” It’s the kind of sentence you give a Senate committee when you have stopped trying to pretend the previous answer made sense; you bury the implementation in complexity that the senator from Pennsyltucky has no hope of untangling, since they used to be a surgeon instead of a cloud economist.
That quote still doesn’t fully explain the why, so we go deeper, to the most useful disclosure: the annual-plan model multiplier table. For annual subscribers who stay on premium request billing after June 1, the Claude Opus 4.7 multiplier moves from 7.5x to 27x. The GPT-5.4 multiplier moves from 1x to 6x. GPT-4.1, previously a free 0x model under paid plans, is being pulled from the free tier entirely. The “prices aren’t changing” framing is technically accurate and also wildly misleading. The math under the prices changed by a factor of roughly four to six depending on which frontier model you use. This is GitHub negotiating a contract cancellation through pricing. Annual subscribers running Opus 4.7 in agentic mode are being told, in the language of multipliers, that their existing contract no longer makes economic sense for GitHub, and would they please consider switching to monthly. Anyone who has ever taken a corporate “voluntary” buyout will recognize the structure.
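If you want to check the multiplier math yourself, a small sketch follows. The before-and-after multipliers are the ones quoted above; the helper name is mine.

```python
# Premium-request multipliers quoted above for annual subscribers: (before, after June 1).
MULTIPLIERS = {
    "Claude Opus 4.7": (7.5, 27.0),
    "GPT-5.4": (1.0, 6.0),
}


def increase_factor(old: float, new: float) -> float:
    """How many times more premium requests one call burns after the change."""
    return new / old


if __name__ == "__main__":
    for model, (old, new) in MULTIPLIERS.items():
        factor = increase_factor(old, new)
        print(f"{model}: {old}x -> {new}x ({factor:.1f}x more per request)")
```

That works out to 3.6x for Opus 4.7 and 6x for GPT-5.4, which is where the roughly-four-to-six range comes from. Same sticker price, a fraction of the requests.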
If your team was paying $10 a month for Copilot Pro and burning Opus 4.7 in agentic mode, the unit economics you have been depending on were never real. The bill that arrives in July is the first one that reflects what your usage actually costs to serve, and a whole lot of folks are very much not going to enjoy this experience.
OpenAI deleted DALL·E 2 and DALL·E 3 from the API on May 12
Not deprecated like an AWS deprecation, deprecated like a Google deprecation: completely removed. The DALL·E 2 and DALL·E 3 model snapshots are no longer available through the OpenAI API as of May 12. The Realtime API Beta got the same “Old Yeller” treatment the same day. Your migration paths are gpt-image-2, gpt-image-1, gpt-image-1-mini, and the GA Realtime API respectively. Honestly, the hardest part of modern AI is teasing meaning from the model strings. Christ, I never thought I’d be nostalgic for AWS’s crap-ass naming “strategy.” Sure, Amazon DocumentDB (with MongoDB Compatibility) is a bad name, but at least you knew what the hell it was for.
If you’re one of the developers who built on the public OpenAI Image API in 2024 and you were not paying attention to the deprecation calendar, congratulations: you shipped a broken product last Tuesday. I am old; one of the things I always appreciated about AWS is that it’s vanishingly rare for one of their deprecations to mean a thing that worked last week is broken this week. Meanwhile, that’s kinda the lived experience of being a Google customer. You get used to rapid change, invariably by surprise.
IBM announced Red Hat AI Inference on IBM Cloud, GA May 22
On the other end of the change continuum, over in IBM-land they’re launching a serverless inference API too. It’s powered by vLLM and exposes an OpenAI-compatible API; the catalog includes Granite, Mistral-Small-3.2, Llama 3.3 70B Instruct, GPT-OSS-120B, and Nemotron-3-Nano-30B-FP8... If that doesn’t sound painful enough, it’s billed through IBM Cloud IAM. This is Bedrock and Vertex and AI Foundry, only with an IBM logo on it. Every hyperscaler and also IBM now sells inference-with-IAM as the product. The audit trail, the business relationships, and a significant install base held in place by various forms of hostage-taking are the new moat.
Follow The Money (Or Watch It Follow Itself)
Cerebras traded, priced at 111x trailing revenue, then dropped 10%
Cerebras Systems priced at $185 on May 13, sold 30 million Class A shares, and raised $5.55 billion. Shares opened at $350 on May 14, peaked at $385, closed the first day at $311.07, and dropped 10% on Friday. The marketed range moved from $115–125 to $150–160 to $185 in the days before pricing, which is what happens during a roadshow when you can feel and also unfortunately smell the room. At the IPO price, the implied fully diluted valuation was $56.4 billion. At the day-one peak, it was north of $120 billion.
You’ll have to forgive me, but I’m from the 1900s, an era where it seems money meant something different than it does today. Cerebras’s FY25 revenue was $510 million, up 76% from $290 million the year before. At the IPO price, that valued the company at roughly 111 times trailing revenue. At the day-one close, more than 180 times. The Cerebras pitch is that this is not a normal chip company being valued like a chip company, it is scarce AI infrastructure being valued like scarce AI infrastructure, and the difference is the entire bull case. The Cerebras bull case is that we will, in fact, never have enough compute. The Cerebras bear case is that we will. Both cases were priced at $185 a share.
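For those who like to see the multiples fall out of the inputs, here is a back-of-envelope sketch using only figures quoted above. One assumption is mine and is flagged in the code: the fully diluted share count is treated as constant between the IPO price and the first-day close, which a real cap table would not guarantee.

```python
IPO_PRICE = 185.00
FIRST_DAY_CLOSE = 311.07
SHARES_SOLD = 30e6
IPO_VALUATION_USD = 56.4e9   # implied fully diluted valuation at the IPO price
FY25_REVENUE_USD = 510e6
FY24_REVENUE_USD = 290e6


def revenue_multiple(valuation_usd: float, revenue_usd: float) -> float:
    return valuation_usd / revenue_usd


def scale_valuation(valuation_usd: float, from_price: float, to_price: float) -> float:
    # Assumption: fully diluted share count is unchanged between the two prices.
    return valuation_usd * (to_price / from_price)


gross_proceeds = SHARES_SOLD * IPO_PRICE
growth = FY25_REVENUE_USD / FY24_REVENUE_USD - 1
ipo_multiple = revenue_multiple(IPO_VALUATION_USD, FY25_REVENUE_USD)
close_valuation = scale_valuation(IPO_VALUATION_USD, IPO_PRICE, FIRST_DAY_CLOSE)
close_multiple = revenue_multiple(close_valuation, FY25_REVENUE_USD)

if __name__ == "__main__":
    print(f"gross proceeds:     ${gross_proceeds / 1e9:.2f}B")
    print(f"revenue growth:     {growth:.0%}")
    print(f"multiple at IPO:    {ipo_multiple:.0f}x")
    print(f"multiple at close:  {close_multiple:.0f}x")
```

Nothing here is sophisticated, which is rather the point: the entire valuation argument is two divisions and a belief about scarcity.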
Vercel published April production data, and the labs are not competing on the same axis
On May 12, Vercel published seven months of AI Gateway production data covering more than 200,000 unique teams. The headline numbers for April: Anthropic took 61% of spend on 26% of token volume. Google took 21% of spend on 38% of volume. OpenAI took 12% of spend on 13% of volume, with spend share roughly tripling between March and April after the GPT-5.4 and GPT-5.5 releases. The labs are not competing for the same call. Anthropic is winning the high-stakes layer. Google is winning the high-volume-low-cost layer, which was basically the only pitch that Amazon’s Nova models had. And OpenAI is winning whatever it just shipped last week. Vercel’s own framing is that “spend follows the cost of being wrong”, which is the cleanest one-line summary of inference economics anyone has shipped this year.
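One way to read those share numbers is to divide spend share by token share, which gives each lab's effective price per token relative to the gateway-wide average. This is my framing of Vercel's figures, not a metric Vercel publishes; the sketch below uses only the April percentages quoted above.

```python
# April shares quoted above: (percent of spend, percent of token volume).
APRIL_SHARES = {
    "Anthropic": (61, 26),
    "Google": (21, 38),
    "OpenAI": (12, 13),
}


def price_premium(spend_pct: float, volume_pct: float) -> float:
    """Spend share over token share; 1.0 means the gateway-average price per token."""
    return spend_pct / volume_pct


if __name__ == "__main__":
    for lab, (spend, volume) in APRIL_SHARES.items():
        print(f"{lab}: {price_premium(spend, volume):.2f}x the average price per token")
```

By this yardstick Anthropic's tokens sell at well over twice the average and Google's at a bit over half, so the gap between them is about fourfold. Different axis, different business.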
The same dataset shows 22.2% of AI Gateway requests in April ended with a tool call, carrying 58.9% of total token volume. The agentic share roughly doubled from October. The cost surface of production AI is now shaped like an agent, not a chat, and at the top of the request-volume curve, the average team is routing across 35 distinct models. The standard story about lab lock-in inverts the higher you go on the curve. Lab lock-in is a sales pitch. Routing graphs are infrastructure.
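The tool-call numbers imply a per-request token weight you can back out directly. A minimal sketch, assuming every request either ends in a tool call or does not, so the two groups' shares sum to 100%:

```python
TOOL_REQUEST_SHARE = 0.222   # share of April requests ending in a tool call
TOOL_TOKEN_SHARE = 0.589     # share of April token volume those requests carried


def relative_tokens_per_request(request_share: float, token_share: float) -> float:
    """Tokens per request for a group, relative to the all-traffic average."""
    return token_share / request_share


tool = relative_tokens_per_request(TOOL_REQUEST_SHARE, TOOL_TOKEN_SHARE)
# Assumption: everything that is not a tool-call request is the complement.
other = relative_tokens_per_request(1 - TOOL_REQUEST_SHARE, 1 - TOOL_TOKEN_SHARE)
ratio = tool / other

if __name__ == "__main__":
    print(f"tool-call requests: {tool:.2f}x the average tokens per request")
    print(f"everything else:    {other:.2f}x")
    print(f"tool-call vs other: {ratio:.1f}x")
```

On those figures a tool-call request carries about five times the tokens of a request that ends any other way, which is what "shaped like an agent" means on an invoice.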
Vercel has skin in this game; the AI Gateway is their pitch to be the routing layer between those workloads and the labs, and “look at the multi-model fleets at the top of the curve” is exactly the argument a company selling routing infrastructure would want to make. The data is still the cleanest production-traffic read anyone has published, in part because nobody else with the volume has been editorially willing. I’m hesitant to overindex on their list of top providers, because unless customers override it the default selection is “whatever Vercel wants to use.” One wonders if they’re making these decisions based on their own commercial terms with various inference providers.
Anthropic’s valuation cycle is now playing speed chess
In February, Anthropic closed its Series G at a $380 billion post-money valuation on a $30 billion raise led by GIC and Coatue (gesundheit). On April 29, TechCrunch reported Anthropic had received preemptive offers at $800–900 billion and was sizing a $40–50 billion round. On May 12, Bloomberg reported the talks had crystallized into “at least $30 billion” at “more than $900 billion” pre-money, closing by the end of this month.
Four numbers, three weeks, one company. What’s interesting is that each leak walked the prior number in a particular direction. The raise size started at $40–50 billion (April 29), then narrowed to $30 billion (May 12). The valuation floor went from $800 billion (April 29) to $900 billion (May 12). The pattern is what it looks like: a series of trial balloons sized to discover the elastic limit of the room. Compare to the same company at $61.5 billion in March 2025 and you have a roughly fifteen-fold private-market valuation move in fourteen months, which is statistically indistinguishable from a meme stock with a science publication.
The (Reported! We have nothing concrete!) revenue figure has done its own dance. End of 2025: $9 billion. Mid-February: $14 billion. Late February: $19 billion. April 7: $30 billion, per CFO Krishna Rao. April 29: TechCrunch sources said “closer to $40 billion.” Some mid-May reports cite $44 billion. OpenAI, which has its own reasons to argue this, maintains the $30 billion figure is overstated by approximately $8 billion on a gross-versus-net cloud revenue accounting argument, which would make the comparable number $22 billion. If you would like to know Anthropic’s annualized revenue today, please specify a date, a source, and which Magic 8-ball you consulted. They will not agree, and neither will Anthropic.
The growth itself is clearly real; I mean, eight of the Fortune 10 are paying customers (who the hell are the two holdouts, and can we talk?). One thousand of their customers spend over $1 million a year (theoretically on purpose), doubled from the February disclosure. Claude Code reached $2.5 billion in run-rate revenue within nine months of public launch. I want to be clear here: I’m not an idiot who denies reality, and I don’t have an agenda. My skepticism isn’t around whether the revenue exists, but rather which number, on which day, with which accounting treatment, ends up in the S-1.
OpenAI capped Microsoft’s revenue share at $38B, $54B below Microsoft’s planning target
The Information reported on May 11 that OpenAI and Microsoft agreed to cap total revenue-sharing payments at $38 billion, which is coincidentally how much the first publicly announced OpenAI AWS deal was worth in November, so maybe that’s the default amount in OpenAI’s QuickBooks installation or something. Microsoft had internally been targeting approximately $92 billion in returns from its OpenAI stake, per planning documents disclosed in the Musk v. Altman trial. The cap takes roughly $54 billion off Microsoft’s modeling and puts it back on OpenAI’s side of the table, which is exactly the kind of number you want to wave at IPO bookbuilders. Meanwhile Microsoft will presumably make up the shortfall and then some by putting ads into the GitHub service outage notifications.
In the same renegotiation, Microsoft’s license to OpenAI models was extended to 2032 but also made non-exclusive. OpenAI can now serve all its products across any cloud provider. Microsoft’s previous revenue share to OpenAI was eliminated, leaving the cash flow one-directional. Read together, this is the contractual end of OpenAI’s Azure-exclusive era, made just visible enough that a public-market investor reading the S-1 will not accidentally believe the words “strategic partnership” mean anything specific.
OpenAI launched DeployCo, raised $4B, and bought 150 Palantir-style engineers
OpenAI’s corporate ADHD struck again as they announced DeployCo on May 11, a majority-owned subsidiary capitalized with more than $4 billion from nineteen investors led by TPG. The implied valuation reported by Axios is $14 billion, which is the number you produce by assuming a consulting practice that has existed for one day will scale faster than every consulting practice that has ever existed in the history of the world. The investor structure reportedly includes a 17.5% guaranteed return, which is the rate at which OpenAI has chosen to borrow $4 billion while calling the borrowing equity. That’s a similar guaranteed rate of return to that of many crypto emails lurking in my spam folder from 2019.
Three management consulting firms wrote checks: Bain & Company, Capgemini, and McKinsey. My snark aside, companies are generally not run by idiots. Therefore, the polite reading is they are buying option value on the disruption of their own business. The less polite but spot-on reading is they have correctly priced the future cost of saying no.
Concurrent with the launch, OpenAI agreed to acquire Tomoro, an Edinburgh-and-London consultancy founded in 2023 in alliance with OpenAI, employing approximately 150 forward-deployed engineers. At a $14 billion unit valuation, those engineers are valued at approximately $93 million per head, which is generous even by 2026 AI hiring standards. Anthropic shipped the same play seven days earlier with Blackstone, Hellman & Friedman, and Goldman Sachs on a $1.5 billion joint venture. Both labs have now formally conceded that the company that sells the model is not necessarily the company that captures the margin on its deployment; these are likely the early days of the frontier labs devouring their own ecosystems as pressure to show revenue builds.
The deeper reason I suspect drives these moves is that token revenue has structural unit economics problems that a public market analyst will, sooner rather than later, notice. Consultant time does not. Forward-deployed engineering is a revenue line that gets booked in dollars, not in inference losses, and converts cleanly to a chart that ends with the line going up. The labs aren’t pivoting to consulting because consulting is a great business. They’re pivoting to consulting because consulting is the only revenue line on the deck that does not require a footnote.
Reliability: A Brief Retrospective
The Claude status page records at least one investigated incident on May 12, 13, 14, 15, 16, and 18. The AI fanboys will no doubt point out that taking Sunday off has biblical precedent, and we’re closer than ever to summoning God via JSON. The May 13 cluster includes two separate investigations totaling roughly two and a half hours. The May 14 investigation lasted about two hours. The May 15 incident is the most interesting one editorially: the status update specifically notes that “success rates for Opus 4.7 have returned to normal” while Opus 4.6 and Sonnet 4.6 were still degraded. The newer flagship recovered first. The older model and the smaller cheaper one stayed down longer.
The 90-day uptime numbers on the same status page tell a similar story by tier. Claude API: 98.99%. Claude Code: 99.14%. Claude Cowork: 99.45%. Claude for Government: 99.87%. The government tier gets approximately eight times less downtime than the public API, which is a useful way to think about exactly how much your federal contracting line item should cost to be worth it. OpenAI’s equivalent number is 99.82%.
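Uptime percentages hide how lopsided the tiers are until you convert them to hours. A quick sketch using the status-page figures above over a 90-day window:

```python
# 90-day uptime percentages quoted above.
UPTIME_90D = {
    "Claude API": 98.99,
    "Claude Code": 99.14,
    "Claude Cowork": 99.45,
    "Claude for Government": 99.87,
}


def downtime_hours(uptime_pct: float, window_days: int = 90) -> float:
    """Hours of downtime implied by an uptime percentage over the window."""
    return (100 - uptime_pct) / 100 * window_days * 24


if __name__ == "__main__":
    for tier, pct in UPTIME_90D.items():
        print(f"{tier}: {downtime_hours(pct):.1f} hours down in 90 days")
    ratio = downtime_hours(UPTIME_90D["Claude API"]) / downtime_hours(
        UPTIME_90D["Claude for Government"]
    )
    print(f"public API vs government tier: {ratio:.1f}x the downtime")
```

That is roughly 22 hours against roughly 3, which is the "approximately eight times" above expressed in time you actually spent staring at a spinner.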
The number that offers more insight than either of those is the one Vercel published on May 12 from seven months of AI Gateway data. About 3.5% of requests on the gateway end up rescued by failover to a healthy alternative. Measured by tokens, the rescue rate runs at 5.1%. Measured by dollars, 4.9%. The expensive end of the workload (long contexts, multi-step agent runs, heavy reasoning calls) is also the end most likely to need rescuing. A provider’s SLA measures request-level uptime. A production application experiences cost-weighted uptime, and the two blow themselves apart on exactly the calls that paid for the model. If your CFO is reading the SLAs and believing them, the CFO is reading the wrong document.
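The difference between request-level and cost-weighted uptime is easy to show with a toy log. Everything in this sketch is invented for illustration: the `Request` shape, the costs, and the failure pattern are not Vercel's data or schema, and the sample is deliberately exaggerated so the gap is visible.

```python
from dataclasses import dataclass


@dataclass
class Request:
    cost_usd: float   # what the call cost or would have cost (hypothetical field)
    succeeded: bool   # True if the primary provider served it without failover


def request_uptime(requests: list[Request]) -> float:
    """Share of requests that succeeded; this is what an SLA measures."""
    return sum(r.succeeded for r in requests) / len(requests)


def cost_weighted_uptime(requests: list[Request]) -> float:
    """Share of dollars that succeeded; this is what the application feels."""
    total = sum(r.cost_usd for r in requests)
    served = sum(r.cost_usd for r in requests if r.succeeded)
    return served / total


# Made-up traffic: 96 cheap chat calls that all succeed,
# plus 4 expensive agent runs, 2 of which fail.
sample = (
    [Request(0.01, True)] * 96
    + [Request(2.00, True)] * 2
    + [Request(2.00, False)] * 2
)

if __name__ == "__main__":
    print(f"request-level uptime: {request_uptime(sample):.1%}")
    print(f"cost-weighted uptime: {cost_weighted_uptime(sample):.1%}")
```

The same two failures barely dent the request count and take out a large share of the spend, because failures that cluster on expensive calls weigh more in dollars than in requests. Real traffic is far less dramatic than this toy, but the direction of the gap is the one Vercel's 3.5% versus 4.9% figures point at.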
The Hype Audit Department
It’s worth saying out loud, because the prospectus does not. Cerebras shipped 30 million shares to public markets last week on the strength of a 76% revenue growth rate from $290 million to $510 million. That growth is real, or at least “real enough that if it’s not somebody will theoretically be going to jail.” The customers driving it are two government-funded UAE entities operating in the same emirate under the same sovereign sponsorship, plus a Master Relationship Agreement with OpenAI whose payments do not start materializing in the income statement until 2027 and whose existence assumes that OpenAI will be a buyer of physical inference compute four years from now in the volumes its current cap-table mathematics requires.
The phrase “diversified customer base” appears in the prospectus. The phrase “the same emirate’s two largest AI procurement vehicles” does not. The first phrase is technically accurate. The second is also technically accurate, and if we’re being direct it’s the one that should be priced in. Cerebras’s 86% two-customer concentration in 2025 is one percentage point higher than its 85% single-customer concentration in 2024. The change is a new LLC name on the second-largest line, not a new geography.
Both LLCs sit inside the same sovereign portfolio, and the prospectus knows it. On page 22, the company discloses that “G42 and MBZUAI are considered related parties with respect to each other as defined by Accounting Standards Codification 850.” For those of you who aren’t giant nerds, ASC 850 is the accounting rule that requires companies to flag related-party connections; Cerebras checked the box, used the accounting-standard citation as the entire structural acknowledgment, and stopped, hoping everyone else would too. The prospectus never specifies the relationship.
The relationship is this: G42 is chaired and controlled by Sheikh Tahnoun bin Zayed Al Nahyan, the UAE’s National Security Advisor since 2016, who oversees roughly $1.5 trillion in sovereign capital and was deputy national security advisor at the time of Project Raven, the surveillance program Reuters documented in 2019 that hired former NSA personnel to spy on American citizens, journalists, and dissidents. MBZUAI (pronounced like an Amazon seller who’s about to scam you) stands for “Mohamed bin Zayed University of Artificial Intelligence” and is named after his brother, Sheikh Mohamed bin Zayed Al Nahyan, the President of the UAE.
The closest the S-1 comes to acknowledging any of this is one passing reference, in the same risk factor, to “laws or regulations applicable to OpenAI, G42 or MBZUAI, or the United Arab Emirates.” That is the entire UAE disclosure. The investor is left to assemble the rest, which is the work AC exists to do.
The bear case for CBRS is not “the growth is fake,” it’s that “Mohamed bin Zayed University of Artificial Intelligence” and “G42” are not the names of a diversification strategy any more than “we are diversified between a guy and also his brother, both of whom have diplomatic immunity” is.
One last thing
This week the AI industry decided what kind of company it actually is. It turns out that it’s not a model vendor with a consulting practice attached (usually called “Professional Services”). Rather, it’s a consulting practice with a model vendor attached, and the model vendor’s job is to keep the consulting practice differentiated from Accenture. Anthropic and OpenAI together put $5.5 billion of investor capital toward this thesis in the same fortnight. Cerebras went public on the back of a procurement relationship with two foreign-government-adjacent buyers. Microsoft accepted a $54 billion haircut on its OpenAI returns so the cap table would look right for an IPO. GitHub admitted the subscription pricing it has been running for two years was never going to survive the agentic workloads it explicitly built the product around. Vercel published the data. The model layer is still the thing investors are buying, but it is not the thing they are paying for.
If you run AI workloads and you have not renegotiated your cloud commit this quarter, your counterparty just made it harder for you. If you sell consulting and you have not noticed that the labs are now your competitors, your counterparty also just made it harder for you. And if you have a Copilot Pro subscription on auto-renew and you have not looked at the June multiplier table, you are about to be one of the case studies in a future issue of Artificial Confidence. Either way, the bill changed.
See you next week.
— C




Finally, an AI newsletter I can stand reading!
That Vercel stat, where 22.2% of requests end in a tool call but carry 58.9% of token volume, is the whole story. GitHub didn't "wake up" to agentic economics; everyone is discovering at once that the most valuable workloads are also the most expensive to serve, and nobody priced for that. The labs aren't pivoting to consulting because they want to. They're pivoting because token revenue has a footnote and consultant hours don't.