Challenging the Chatbot
Intelligent applications need new interaction patterns
The chatbot is genuinely an incredible interface.
I personally love using chatbots. I use Cursor for coding every day. I use ChatGPT for looking things up, exploring ideas, and getting unstuck. Even many of the products we build for ourselves are conversational in nature.
What makes them so compelling is how well they capture intent. You can show up with a vague thought, a half-formed question, or a messy goal, and the system can still meet you there. That is a real shift in how people interact with software.
I heard Naval describe Large Language Models (LLMs) as something like natural language computing. That phrase has stuck with me because it captures what makes chatbots feel so powerful. They let you compute with open-ended language. Instead of learning a command syntax, searching through menus, or translating your intent into the shape of an application, you can just say what you mean.
"Large language models let you compute with open-ended language."
From a product design perspective, though, especially for the operational products people use every day, this is where things get more complicated.
Tools like CRMs, email clients, dashboards, workflow builders, and collaborative systems have a lot of value in their fixed, deterministic interfaces. They expose state. They make repeated actions fast. They let teams look at the same object and understand what is happening. They give users buttons, filters, tables, timelines, approvals, and views that are optimized for the work.
Chatbots have excelled in single-player use cases, especially when the user is exploring, writing, searching, or asking for help. But they struggle more in collaborative, high-frequency, operational workflows.
You would not want to use a conversational chatbot at a checkout kiosk. In that context, the job is not to explore ambiguous intent. The job is to finish a repeated workflow with as few mistakes and as little friction as possible. A deterministic interface, optimized around the right sequence of actions, is simply better for that kind of work.
That is the tension. Chat is an amazing way to express intent, but it is not always the best way to represent state, coordinate action, or run a product workflow.
What Exactly Is Software?
To understand where chatbots fit, it is worth asking a more basic question: what exactly is software?
At the simplest level, software is the human-computer interface. Computers can store information, run calculations, automate work, coordinate systems, and produce outputs faster than any person could by hand. But that power is only useful if humans have a way to direct it, inspect it, and trust what comes back.
Even as models become more capable, the interface still matters. Someone still has to design how people express intent to the computer, how the computer represents its state, and how the user knows what to do next.
One useful way to look at that interface is through a systems design lens: inputs, process, and output. The product gives users a way to provide information or intent. The system stores and transforms that information through some process. Then it produces an output the user can consume, trust, or act on.
For decades, a lot of the software we use every day has followed a fairly deterministic pattern.
We build interfaces that gather information from users, store that information as core entities or data models, apply some business logic on top, and then produce an output the user can rely on.
| Stage | Examples |
|---|---|
| Input surface | Forms, fields, tables, prompts |
| Data model | Employees, accounts, records |
| Business logic | Rules, calculations, workflows |
| Output surface | Reports, records, artifacts |
Take payroll software. The product gives you an input surface for defining employees, compensation, bank details, tax information, and pay schedules. Underneath that interface, the software applies the calculations and business rules required to run payroll. The output is not a message in a transcript. It is a durable artifact: payroll that has been processed, recorded, and can be returned to later.
That pattern shows up everywhere. The interface collects structured information. The system applies logic. The product produces state or an artifact that users can inspect, trust, and act on again.
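To make the loop concrete, here is a minimal sketch of the payroll example in Python. Everything in it is an assumption made for illustration: the `Employee` and `PayStub` shapes, the `run_payroll` function, and especially the flat 20% withholding rate, which stands in for real tax rules.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Employee:
    """Data model: the structured record the input surface collects."""
    name: str
    annual_salary: Decimal
    pay_periods_per_year: int


@dataclass(frozen=True)
class PayStub:
    """Output surface: a durable artifact that can be stored and reopened."""
    employee: str
    gross: Decimal
    tax_withheld: Decimal
    net: Decimal


# Hypothetical flat rate, used only to keep the example small.
FLAT_TAX_RATE = Decimal("0.20")


def run_payroll(employee: Employee) -> PayStub:
    """Business logic: the same input always produces the same stub."""
    cents = Decimal("0.01")
    gross = (employee.annual_salary / employee.pay_periods_per_year).quantize(cents)
    tax = (gross * FLAT_TAX_RATE).quantize(cents)
    return PayStub(employee.name, gross, tax, gross - tax)


# Input surface: in a real product these values would come from a form.
stub = run_payroll(Employee("Ada", Decimal("120000"), 24))
print(stub)
```

The point of the sketch is the shape, not the numbers: structured input, fixed logic, and an output object that anyone on the team can inspect later and get the same answer from.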
This is why deterministic interfaces have mattered so much. Forms, tables, dropdowns, dashboards, approvals, and workflow states may feel rigid, but they make the system legible. They turn work into shared state that can be inspected, edited, repeated, and trusted by more than one person.
What Chatbots Unlocked
This is where chatbots changed the input surface of software.
In a traditional application, the user has to know where to go before they can do the thing they want to do. They navigate to the right page, find the right workflow, open the right form, and then translate their intent into the fields the product exposes.
Chat changes that sequence. Instead of navigating through the product first, the user can describe what they want. The system can interpret that intent, route them toward the relevant workflow, and extract the inputs the workflow needs. The input surface is no longer only a deterministic form. It can be open-ended language.
This gets even more powerful when the chatbot has access to more than the immediate message. If it can search the web, look across an internal database, retrieve documents, remember past context, or understand the user's history, the input process becomes personalized. The user is not filling out a blank form from scratch. They are collaborating with a system that can infer context, retrieve missing details, and ask for clarification only when it needs to.
This is why chat feels so magical. It collapses navigation and input into one natural language surface.
It also changes the shape of the workflow itself.
In the past, companies had to define rigid deterministic pipelines for getting things done. They could not capture every possible user intent, so they settled on a fixed number of input fields required to execute a fixed workflow. Once the product had access to that data, it could run the application logic and produce the output.
That is still useful. A lot of software should be deterministic. But LLMs are unusually good at joining the dots between things. They can understand how pieces of context relate to each other, decide which capabilities are relevant, and assemble a path through a larger surface area.
This is the argument we made in "Primitives over Pipelines": instead of forcing agents through rigid backend workflows, give them modular capabilities they can compose.
Chatbots make that idea visible at the product level. Instead of building only narrow vertical tools with one prescribed workflow, products can expose primitives to an LLM and let the model assemble the right path for the user's intent.
Take something like Notion. In the past, it was primarily a document editor and workspace. But once you expose its primitives to an LLM, it can start to feel like more than a document editor. It can become an application builder, an internal operating tool, or a flexible layer for assembling workflows across the information already inside the workspace.
The product does not need to anticipate every possible path up front. If the model can accept open-ended input and the system exposes the right primitives, the model can assemble workflows and application logic at runtime.
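One way to picture runtime composition is a small registry of primitives plus an executor that runs whatever plan it is handed. The sketch below is hypothetical throughout: the primitive names, the `PAGES` data, and the plan format are invented for illustration, and the plan is written by hand to stand in for what a model might propose.

```python
from typing import Any, Callable, Dict, List

Context = Dict[str, Any]

# Hypothetical workspace content, standing in for real documents.
PAGES = ["Q3 hiring plan", "Q3 budget review", "Onboarding checklist"]


def search_pages(ctx: Context, query: str) -> Context:
    """Primitive: select pages whose title contains the query."""
    hits = [p for p in PAGES if query.lower() in p.lower()]
    return {**ctx, "pages": hits}


def make_checklist(ctx: Context) -> Context:
    """Primitive: turn the selected pages into unchecked to-do items."""
    return {**ctx, "checklist": [f"[ ] Review {p}" for p in ctx.get("pages", [])]}


REGISTRY: Dict[str, Callable[..., Context]] = {
    "search_pages": search_pages,
    "make_checklist": make_checklist,
}


def run_plan(plan: List[Dict[str, Any]]) -> Context:
    """Run steps in order, rejecting any primitive the product never exposed."""
    ctx: Context = {}
    for step in plan:
        name = step["primitive"]
        if name not in REGISTRY:
            raise ValueError(f"unknown primitive: {name}")
        ctx = REGISTRY[name](ctx, **step.get("args", {}))
    return ctx


# Hand-written stand-in for a plan a model might propose for
# "make me a review checklist for everything about Q3".
plan = [
    {"primitive": "search_pages", "args": {"query": "q3"}},
    {"primitive": "make_checklist"},
]
result = run_plan(plan)
print(result["checklist"])
```

The product still decides which primitives exist and what each one is allowed to do. The model only chooses the order and the arguments, which is why the executor checks every step against the registry.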
So the chatbot paradigm shifts two parts of the old software loop. It changes the input surface, because users can describe intent in language. And it changes the application logic surface, because the system can compose primitives instead of only executing predetermined pipelines.
It also changes the output surface.
In traditional software, the output was usually a deterministic artifact or state that everyone consumed in roughly the same way. A report, a record, a dashboard, an invoice, a payroll run, a task list. The product decided the shape of the artifact, and the user consumed it.
LLMs make that output more malleable. The system can assemble an artifact that depends on a specific person's use case, context, role, or intent. One user might need a summary. Another might need a table. Another might need a workflow, a draft, a dashboard, a checklist, or a diff.
That means the full lifecycle of software is changing. How we capture input intent, how we process application logic, and how we represent the output to the user can all become more adaptive.
The chatbot is the most obvious expression of that shift, but the deeper change is not the chat box itself. The deeper change is that every part of the software loop has become more flexible.
Chatbots are taking off because they are an incredible form factor for open-ended work. They are especially strong for coding, writing, research, and general-purpose assistants where the user is exploring intent as they go.
But much of the software we use every day is not primarily exploratory. It is operational. Payroll, checkout, inventory, scheduling, CRM, analytics, support queues, and approval workflows all depend on visible state, repeated actions, and shared artifacts. In those contexts, the best interface is often not a conversation. Sometimes it is a table, a scanner, a dashboard, a diff, a form, a calendar, or a button. Sometimes, as with Uniqlo's self-checkout, where shoppers drop items into a bin and RFID tags do the scanning, the best interface is almost no interface at all.
So if chat is not the whole interface, what else should we design?
Language Into State
One useful pattern is to let language capture intent, then immediately turn that intent into visible, editable application state.
Filters Become Intent
Traditional dashboards often make users translate their goal into filter logic. If you want to find large software purchases from last quarter that still need approval, you have to know which fields exist, which menus to open, and how the product names each status or category.
An intelligent interface can let the user start with the goal instead. The system can interpret the request, apply the relevant structured filters, and show the resulting filter state back to the user so it can be inspected, edited, or cleared.
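A rough sketch of what "filter state" could mean in code: a small, explicit object the interface can apply, display, edit, and clear. The field names, the `chips` helper, and the sample rows are assumptions made for this example, not the schema of any real product.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class Transaction:
    merchant: str
    amount: float
    department: str
    policy: str


@dataclass(frozen=True)
class FilterState:
    """Each field is optional; None means no constraint on that field."""
    min_amount: Optional[float] = None
    department: Optional[str] = None
    policy: Optional[str] = None

    def matches(self, t: Transaction) -> bool:
        return (
            (self.min_amount is None or t.amount >= self.min_amount)
            and (self.department is None or t.department == self.department)
            and (self.policy is None or t.policy == self.policy)
        )

    def chips(self) -> List[str]:
        """Render active filters as labels the user can read and remove."""
        labels = []
        if self.min_amount is not None:
            labels.append(f"Amount >= ${self.min_amount:,.0f}")
        if self.department is not None:
            labels.append(f"Department: {self.department}")
        if self.policy is not None:
            labels.append(f"Policy: {self.policy}")
        return labels


# Hypothetical sample rows.
ROWS = [
    Transaction("AWS", 2840.0, "Engineering", "Needs review"),
    Transaction("OpenAI", 1320.0, "Engineering", "Needs review"),
    Transaction("Figma", 450.0, "Design", "Approved"),
]

# State an interpreter might produce for
# "engineering spend over $2,000 that still needs review".
state = FilterState(min_amount=2000, department="Engineering", policy="Needs review")
visible = [t.merchant for t in ROWS if state.matches(t)]

# The user removes one chip instead of rephrasing the whole request.
edited = replace(state, min_amount=None)
visible_after_edit = [t.merchant for t in ROWS if edited.matches(t)]

# Clearing is just the empty state.
cleared = FilterState()
print(state.chips(), visible, visible_after_edit)
```

Because the state is ordinary data, it does not matter whether it came from language or from clicks. Two people holding the same `FilterState` see the same rows.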
Let's look at a concrete example. Imagine a traditional banking interface with a table of transactions and a handful of filters for fields like amount, merchant, department, and policy. If you use a product like Brex or Ramp, this interface probably feels familiar.
In traditional software, those fields usually become a dynamic form made of filter buttons, selects, combo boxes, and inputs. The user has to assemble the right combination by hand, then watch the table update as each piece of state changes. Try clicking around the figure below to feel how much of the interaction is spent translating intent into interface mechanics.
| Merchant | Amount | Department | Policy |
|---|---|---|---|
| AWS | $2,840 | Engineering | Needs review |
| OpenAI | $1,320 | Engineering | Needs review |
Even in this small static example, you probably spent a few seconds learning the interface: opening menus, checking which values exist, and seeing how each choice changes the table. And even when it works, the outcome may not map exactly to what you had in mind. The interface can only express the shapes its developers anticipated. If you want to ask for the same information in a slightly different way, you run into the edges of the product.
That is one reason traditional software can feel frustrating. There is an onboarding curve, an activation energy, and then a memory burden. You have to learn the product's vocabulary and remember how to operate it later. But this model has a real advantage: it is predictable. The same filters produce the same view for you, your colleague, and everyone else on the team. That shared deterministic state creates trust in a way a pure chatbot often does not. With a chatbot, two people can ask similar questions and receive different outputs depending on their context.
So the opportunity is not to replace the interface with chat. The opportunity is to find a compromise: let language express intent, then turn that intent into visible, editable state. Try the version below by describing the transactions you want to see.
| Merchant | Amount | Department | Policy |
|---|---|---|---|
| AWS | $2,840 | Engineering | Needs review |
| OpenAI | $1,320 | Engineering | Needs review |
The demo above is making a real request to Meta Llama 3.1 8B Instruct through OpenRouter. As you type, the app debounces the input, asks the model to return a structured filter object, then renders those filters back into the same visible table state.
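The debounce step is worth sketching because it is what keeps a model call per keystroke from turning into a model call per character. This is an illustrative Python version using `asyncio`, not the demo's actual code: the `Debouncer` class, the 50 ms delay, and `fake_model` (a stub that replaces the real request) are all assumptions.

```python
import asyncio
import json
from typing import Awaitable, Callable, List, Optional


class Debouncer:
    """Run `callback` only after input has been quiet for `delay` seconds."""

    def __init__(self, delay: float, callback: Callable[[str], Awaitable[None]]):
        self.delay = delay
        self.callback = callback
        self._pending: Optional[asyncio.Task] = None

    def on_input(self, text: str) -> None:
        # A newer keystroke cancels whatever was scheduled for the older one,
        # including a request that is already in flight.
        if self._pending is not None and not self._pending.done():
            self._pending.cancel()
        self._pending = asyncio.create_task(self._fire(text))

    async def _fire(self, text: str) -> None:
        await asyncio.sleep(self.delay)
        await self.callback(text)


async def fake_model(text: str) -> str:
    """Stub for the model call; returns a JSON filter object as a string."""
    filters = {}
    if "engineering" in text.lower():
        filters["department"] = "Engineering"
    return json.dumps(filters)


async def main() -> List[dict]:
    rendered: List[dict] = []

    async def request_filters(text: str) -> None:
        # In a real app this is where the table would re-render.
        rendered.append(json.loads(await fake_model(text)))

    debouncer = Debouncer(0.05, request_filters)
    for partial in ["eng", "enginee", "engineering spend"]:
        debouncer.on_input(partial)
        await asyncio.sleep(0.01)  # typing faster than the debounce delay
    await asyncio.sleep(0.2)  # let the final request complete
    return rendered


rendered = asyncio.run(main())
print(rendered)
```

Three keystrokes produce one request, and only the final text reaches the model. In a production interface you would also want to validate the returned object before rendering it.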
Small aside: we love using OpenRouter because it makes this kind of prototyping weirdly fun. They recently released an MCP server that can help you search for the cheapest and fastest model for a specific use case. We used that to explore options for this demo. The first recommendation we tried was inclusionai/ling-2.6-flash, which looked ideal on paper: extremely cheap, low-latency, and advertised support for structured outputs. But our tiny eval caught the product-level details that matter here. It sometimes returned merchant names with the wrong casing, and in one case invented a policy filter while missing the amount filter. So we switched to the Meta model, which passed the demo harness and gave us a better balance of speed, cost, and structured-output reliability.
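A tiny eval for this kind of task can be as simple as comparing the returned filter object against an expected one, key by key. The sketch below is a hypothetical harness, not the one we ran: the cases are invented, and `FLAKY_OUTPUTS` is canned data that imitates the two failure modes described above rather than real output from any model.

```python
from typing import Callable, Dict, List, Tuple

Filters = Dict[str, object]

# Hypothetical eval cases: a prompt and the exact filter object expected.
CASES: List[Tuple[str, Filters]] = [
    ("openai charges", {"merchant": "OpenAI"}),
    ("engineering spend over $2,000", {"department": "Engineering", "min_amount": 2000}),
]


def score(expected: Filters, actual: Filters) -> List[str]:
    """List every way `actual` differs from `expected` (empty means pass)."""
    problems = []
    for key in expected.keys() - actual.keys():
        problems.append(f"missing filter: {key}")
    for key in actual.keys() - expected.keys():
        problems.append(f"invented filter: {key}")
    for key in expected.keys() & actual.keys():
        # Exact comparison, so "openai" does not pass for "OpenAI".
        if expected[key] != actual[key]:
            problems.append(f"wrong value for {key}: {actual[key]!r}")
    return sorted(problems)


def run_eval(model: Callable[[str], Filters]) -> Dict[str, List[str]]:
    return {prompt: score(expected, model(prompt)) for prompt, expected in CASES}


# Canned outputs imitating wrong casing and an invented filter.
FLAKY_OUTPUTS: Dict[str, Filters] = {
    "openai charges": {"merchant": "openai"},
    "engineering spend over $2,000": {"department": "Engineering", "policy": "Needs review"},
}

report = run_eval(lambda prompt: FLAKY_OUTPUTS[prompt])
for prompt, problems in report.items():
    print(prompt, "->", problems or "pass")
```

Exact matching is deliberately strict. For filters that drive a shared table, a merchant name with the wrong casing is a wrong answer, even if a human would read it as close enough.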
The practical takeaway is that this no longer feels like a science project bolted onto the side of the interface. In our local runs, most requests landed in the 300-700ms range, which is close enough to the latency people already tolerate in normal software interactions. The cost is also small enough to change the product math. One representative request cost $0.00000842, which means you could run roughly 1.2 million filter searches before spending $10.
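The arithmetic behind that estimate is easy to check. The per-request cost is the figure quoted above; the variable names are ours, and the per-thousand figure is simply derived from it.

```python
COST_PER_REQUEST_USD = 0.00000842  # the representative request cited above
BUDGET_USD = 10.0

# How many whole requests fit in the budget at this price.
requests_per_budget = int(BUDGET_USD / COST_PER_REQUEST_USD)

# The same price expressed per 1,000 requests.
cost_per_thousand = COST_PER_REQUEST_USD * 1000

print(f"{requests_per_budget:,} requests for ${BUDGET_USD:.0f}")
print(f"${cost_per_thousand:.5f} per 1,000 requests")
```

That works out to just under 1.19 million requests, which rounds to the 1.2 million quoted above.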
And this is the expensive, slow version of the future. Inference costs keep falling, latency keeps improving, and smaller models keep getting better at narrow structured tasks. That makes it reasonable to imagine interactions like this eventually feeling instantaneous enough that the user does not think about the model at all.
The interesting design question becomes: where is it worth spending a little intelligence on each keystroke? Anywhere the product is trying to capture intent, translate vague language into application state, or help the user shape a query, this pattern starts to open up.
That is why challenging the chatbot matters. The future interface may not look like one big conversation. It may look like the software we already know, with intelligence quietly embedded wherever users are trying to express what they mean.
Rubric is an applied AI lab helping teams design and ship intelligent products.


