OpenAI's checkout retreat wasn't a technology failure. It was a positioning one. Why AI shopping agents belong inside the brand experience, not above it.


OpenAI quietly pulled back from instant checkout earlier this month. The feature let users purchase products directly through ChatGPT: no redirect, no brand site, just a prompt and a transaction. After months of near-zero purchase conversions, they've pulled it. Users were browsing and comparing products through the chat interface. They just weren't buying there.
Two years ago this was the future. Not one platform's future. Everyone's. AI shopping agents were going to replace the product page, the checkout, the brand experience. Consumers would describe what they wanted and an AI would handle the rest. Big confident predictions. Lots of investment. I remember reading the announcements and thinking it made sense.
We've heard this before.
Facebook spent years and significant capital telling us that the metaverse was where shopping, socialising, and working would happen. It wasn't. Not because the technology didn't work. Because the assumption underneath it didn't. People were not waiting to strap on a headset to browse a virtual mall. The experience was technically possible. It just wasn't something anyone had asked for.
The same assumption runs under agentic checkout in chat: that friction is the only thing people want removed from shopping. It isn't.
People want to see products. Feel some confidence about what they're buying. Understand what they're choosing. Sometimes they want to be surprised by something they didn't know they needed. The reason product photography gets a budget, the reason "try on" features keep appearing on fashion sites, the reason brands obsess over how a product page looks and feels: presentation is where the conversion happens. Often it's most of the work.
A chat interface that abstracts away the product page isn't reducing friction. It's removing the part of the experience that actually turns interest into a purchase. What you're left with isn't shopping anymore. It's just ordering.

But the underlying idea is right: AI should handle the parts of buying that nobody enjoys. The gap is in where that AI lives.
Some products you simply need to buy. Weekly groceries are the obvious case. I order from Redmart most weeks, mostly the same things, just topping up. There's no discovery happening. No excitement about the interface. I'm clicking through a mediocre experience to buy oat milk and pasta because I have to. An agent that handles that for me is useful. When the decision's already made, get out of the way. For this category, standalone AI agents will move fast, and the platforms that build them first will win.
But then there's the other kind of purchase. The ones where you don't know exactly what you want, or you know what you want but not which version, or you're not sure you need it but you're open to being convinced. That's where the experience actually matters.
Travel insurance is the one I think about most. We worked on a travel insurance flow last year where the highest abandonment wasn't at payment. It was plan selection. Users had to compare coverage options, figure out what applied to their trip, and make a decision while wading through dense text. An AI that sits in that moment, helps you compare two plans, asks what kind of trip you're taking, and points you to what covers you? That's not replacing anything. That's fixing the part of the experience that was actually broken.
The model that actually works is embedded, not separate. Not a chat interface you open before you visit the site. An AI that lives inside the brand's own experience, handles the moments users hate (form-filling, plan comparison, the "which one is right for me" loop), and then gets out of the way when it's done. The brand keeps doing what it does well. The user gets help exactly where they get stuck.
This is where AI commerce actually goes. Not above the brand experience. Inside it. A sidekick that removes friction without removing the shopping.

Most brands aren't ready for this.
Here's the thing: even the sidekick model requires your experience to be structured for it. That means three different layers, and most platforms need work on all three:
UX. The handoff points need to be designed explicitly. Where the AI steps in, how it frames the question, when it steps back: all of that has to be deliberate. An AI that appears at the wrong moment, or doesn't know when to step aside, creates a different kind of friction.
Operational. The agent needs live data: what's in stock, what applies to this user, what the current options actually are. Static content doesn't work. The agent is only as useful as what it can access in real time.
Technical. The foundation has to be built for it, not retrofitted to a checkout flow that was designed before any of this existed. That retrofit is exactly the kind of experience OpenAI just walked away from.
And there's a detail in OpenAI's data that says everything: users were happy discovering and comparing products inside ChatGPT. They used the AI for browsing. Then the moment it came to payment, they went back to Amazon or Walmart. Which isn't the concept failing. It's the concept proving itself, just not in the way OpenAI structured it. People used the AI for the help they needed and the brand for the part they trusted. They figured out the sidekick model through their own behaviour, without anyone designing it explicitly.
If you're working out what this means for your platform, this is the kind of problem we work on. Let's look at it together.