Custom Apps vs. AI-First Mobile Operating Systems
Some failed POCs helped me better understand what we want out of our mobile apps and how AI might impact UX at the OS level, not the app level
In my last post, I speculated about customizable apps and where UI might head in the future. Since then, I’ve experimented with a few proof-of-concept “DevBot” ideas using a chatbot-style window. In one POC, prompts through this chatbot called ChatGPT models, which generated code that was then integrated into the React-based front end to add features. In another POC, I tried calling Cursor’s background agent API to implement new features.
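For the curious, the first POC’s backend boiled down to something like the sketch below. This is a minimal sketch, not the actual code: it assumes the official OpenAI Node SDK, a hypothetical `applyFeature` helper that writes generated code where the React build will pick it up, and an arbitrary model name.

```typescript
import OpenAI from "openai";
import { writeFile } from "node:fs/promises";

const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

// Ask the model to turn a DevBot prompt into a self-contained React component.
async function generateFeature(userPrompt: string): Promise<string> {
  const completion = await client.chat.completions.create({
    model: "gpt-4o", // assumption: any recent chat model works here
    messages: [
      {
        role: "system",
        content:
          "You generate a single React function component in TypeScript. " +
          "Return only code, no prose.",
      },
      { role: "user", content: userPrompt },
    ],
  });
  return completion.choices[0].message.content ?? "";
}

// Hypothetical integration step: write the generated component somewhere
// the existing React build pipeline will pick it up.
async function applyFeature(code: string): Promise<void> {
  await writeFile("src/generated/DevBotFeature.tsx", code);
}

const code = await generateFeature(
  "Add a box to the home page showing today's weather."
);
await applyFeature(code);
```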
Both proof-of-concept websites mostly proved two already-obvious points:
It’s hard to graft together LLM outputs to create a simple one-shot code generation environment
It’s dangerous for a website to perform brain surgery on itself
The main blocker was that the “DevBot” interface needed much more control over the backend code builder, and the chat UI needed to be far more interactive. Basically, the interface needed to be more like the agent chat interfaces in Cursor/VS Code. When I asked the DevBot to add real functionality, such as “add a box to the home page showing today’s weather,” the task was completely beyond its capabilities, even after I forced the backend to carefully parse error messages and iterate on the code.
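The “parse error messages and iterate” part was essentially a blunt retry loop: build, capture the compiler output, feed it back to the model, try again. A rough sketch, reusing the hypothetical `generateFeature` and `applyFeature` helpers from the earlier snippet:

```typescript
import { exec } from "node:child_process";
import { promisify } from "node:util";

const sh = promisify(exec);

// Run the project's build; return compiler output on failure, null on success.
async function buildProject(): Promise<string | null> {
  try {
    await sh("npm run build");
    return null;
  } catch (err: any) {
    return `${err.stdout ?? ""}\n${err.stderr ?? ""}`;
  }
}

// Generate, build, and re-prompt with the build errors until the code
// compiles or we run out of attempts.
async function generateWithRetries(userPrompt: string, maxAttempts = 3): Promise<void> {
  let feedback = "";
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const prompt = feedback
      ? `${userPrompt}\n\nThe previous attempt failed to build with:\n${feedback}`
      : userPrompt;
    const code = await generateFeature(prompt); // from the earlier sketch
    await applyFeature(code);                   // from the earlier sketch

    const errors = await buildProject();
    if (errors === null) return; // build succeeded
    feedback = errors;           // feed the errors back into the next attempt
  }
  throw new Error(`Gave up after ${maxAttempts} attempts`);
}
```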
All these excuses aside, I could get the chat window to make superficial CSS changes successfully (“make the background color of this page dark blue”). I could also make simple HTML edits through the interface (“add a link to the text on the contact page”).
Anyone who’s dived headfirst into a project without sufficient planning will probably recognize that feeling where every step only reveals two more steps before you reach your goal. That’s how these projects felt.
One thought struck me as I experimented with these POCs, staring at the empty text box in the “DevBot” chat window: how intimidating that text box was. Even though I knew my clunky little DevBot POC could barely manage anything beyond font and color changes, the sheer possibility of requesting any feature from my phone felt overwhelming.
I began to wonder whether my speculation in my last post was wrong-headed. Is there really any value in making the mobile app user experience extremely customizable? Maybe the evolution of modern mobile apps from their starting point in the 2000s to today provides some insights.
Why apps are the way they are
iOS and Android apps in the 2000s were siloed, isolated experiences. They couldn’t run tasks in the background, sharing options were limited, and notifications didn’t really exist.
Android’s innovations on first-generation mobile apps found their way into iOS with a lag of a year or more. Android 1.5 supported home screen widgets, which were among the first instances of a mobile app surfacing data outside the app interface itself. Widgets have since evolved into information-rich app surfaces supported by both iOS and Android.
Notifications came next, allowing apps to send short messages to users outside the app itself. iOS’s Notification Center gathered all these notifications in one place. App developers probably worried about this change, since it meant they had less control over when and how users saw their content. However, notifications were designed to bring users back into the apps. My guess is that app developers ended up with more total user engagement and revenue from notifications than they lost from people spending less time browsing within their apps.
Over time, notifications became more descriptive and interactive. Tapping a notification could open the corresponding app. Later, notifications offered richer UI, letting users type responses into text boxes within the notification to complete an interaction (e.g., “Send this message”) without ever opening the app.
Mobile apps’ evolution toward interactive surfaces outside siloed apps indicates the likely path forward. We’re already seeing AI integration with notification surfaces, where notifications are summarized by AI to condense what has become a cacophony of mobile app notifications.
Next steps as AI intercepts UI in mobile apps
Deeper AI integration into mobile apps will likely change mobile applications in two major ways:
Cross-app integrations for agentic, autonomous operation
Deep, personalized natural-language summaries and notifications
It remains to be seen how long mass market adoption will take. I find it hard to imagine mainstream users handing over purchasing power to agents without multiple authorization gates. Similarly, as the potential drawbacks of using AI for data access become well known, a stigma about hallucinating AI will likely mean most users approach AI summarization with a “trust but verify” mindset. Users will still want to click into a summary and find information directly.
Also, any deeper app customization that breaks down existing app design patterns will run up against a simple fact: broadly speaking, we like the way our apps work today. We enjoy (or at least we’ve been conditioned to enjoy) endlessly scrolling through social media feeds. Our scrolling thumbs have muscle memory.
Jumping back to the anxiety of that blank text box where I could potentially ask for any feature: I think given sufficient design power, most users would still design user interfaces more or less like the ones we’ve all been conditioned to use. Instead of some bold new agentic “build your own apps” interface on mobile, we’ll see a continued, slow, but relentless march toward opening up more unified app surfaces where AI-driven summarizations and intuitive operations still support business models that rely on driving users into the apps themselves.
I think that will continue being the case, even after LLMs begin running apps from within ChatGPT and similar chatbot interfaces.
Running apps within ChatGPT
OpenAI recently announced a new SDK that lets app developers integrate their features into ChatGPT’s chat interface. The pitch is straightforward: users can chat with apps like Spotify and Zillow through ChatGPT’s natural language interface.
OpenAI enabled this using APIs built around the Model Context Protocol (MCP). App developers can use the SDK to build these features now, with a wider release planned later this year.
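For a sense of what that plumbing looks like, here is a minimal sketch of an MCP server using the Model Context Protocol TypeScript SDK. The music-app server and its search_playlists tool are hypothetical stand-ins, not any real app’s integration, and the Apps SDK layers more on top of this than a bare tool endpoint.

```typescript
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

// A toy MCP server that a music app might expose so a chat client can call into it.
const server = new McpServer({ name: "music-app", version: "0.1.0" });

// Hypothetical tool: search the user's playlists by keyword.
server.tool("search_playlists", { query: z.string() }, async ({ query }) => ({
  content: [
    { type: "text", text: `Playlists matching "${query}" would be returned here.` },
  ],
}));

// A client connects over a transport; stdio keeps the sketch simple,
// though a hosted integration would use an HTTP-based transport.
await server.connect(new StdioServerTransport());
```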
This represents another step in app developers seemingly giving away precious user attention from their own mobile apps and websites. While this could look like an existential threat to their business model, their experience with mobile apps tells a different story: notifications and widget surfaces still drive users toward their apps, or more precisely, toward their data and algorithms.
OpenAI’s deep integration with other apps feels less like a threat to app developers and more like a way to encroach on mobile operating systems’ control of attention. OpenAI’s agentic operations using deep linkage to third-party algorithms and data could act as a flanking maneuver around what Android and Apple are already attempting with their own AI integrations. If OpenAI or Anthropic reaches a critical mass of third-party app developers and nails the UX for third-party integrations before mobile operating systems do, it will be much harder for Apple and Google to keep up.
Maybe the real paradigm shift isn’t that mobile apps will break down under AI customization pressure, but that mobile operating systems themselves will buckle under the pace of innovation at leading AI companies.
