From PRD to Prototype: Ship Your Spec as Working Software
AI prototyping tools cut the cost of a working prototype from a sprint to a session. The workflow, what the prototype must still carry, and where a written PRD still wins.
Lovable, v0, Bolt, and Replit Agent can turn a written prompt into a running web app in under an hour. In 2023, a PM who wanted a clickable prototype booked a designer for a week of Figma work or borrowed an engineer for a sprint. In 2026, the same PM builds it alone in a working session, with real interactions and seeded data. That changes what a spec can be. A stakeholder who would argue with six pages of prose for a week will react to a working screen in five minutes, and the reaction is more honest, because they are responding to behavior instead of to their own mental model of your paragraph.
This guide covers the workflow, the spec work that still has to happen around the prototype, and the handoff. It sits inside the broader role shift described in the AI product management field guide: AI absorbed the production layer of the job and left the judgment where it was.
What changed
Two things moved at once.
First, the tools crossed a usefulness line. Earlier generations produced static mockups or code that fell over on the second click. The current generation produces software you can put in front of a user: forms that validate, sortable tables, state that persists across a session, and seeded data that looks plausible. The output is far from production quality (more on that below), but it is real enough to test a flow against.
Second, hiring expectations moved with the tools. PM interview loops at AI-forward companies have started replacing the take-home strategy doc with a build exercise: here is a problem, here is a tool, come back with something we can click. Job postings increasingly name prototyping tools alongside SQL and analytics. We have not found a survey that puts a reliable number on this shift, so treat it as a pattern from job boards and interview reports rather than a measured trend. The direction is hard to miss, though: the artifact that proves you can do the job is shifting from a document to a demo.
The workflow: problem to prototype in a session
The order of the steps carries most of the value.
1. Frame the problem in writing first. Before opening the tool, write about 150 words: who the user is, what specific pain you are removing, one metric that would prove it worked, and what is out of scope. This is the same thinking a PRD forced, compressed. If you cannot write those 150 words, the tool will happily build a guess, and you will spend the session reacting to its design choices. The prompt is the thinking. The techniques in prompt engineering for product work apply directly here, especially context and constraints up front.
2. Build one flow, fake the rest. Prototype the single flow you need a reaction to. Stub the navigation and hardcode the login. Seed it with realistic data, because stakeholders react differently to "Acme Corp, $4,200 MRR, churned March 12" than to "Lorem ipsum."
3. Put it in front of people the same week. Five users or three stakeholders, screen share or a shared link. Watch where they click before you explain anything. A prototype turns "I think users want X" into "two of five users tried to do X and got stuck here," which is a different quality of input.
4. Revise while feedback is fresh. When feedback lands, change the prototype during the call if you can. The cost of a revision dropped from a Jira ticket to a sentence, so spend revisions freely.
5. Hand off deliberately. Covered in its own section below, because this is where teams get hurt.
Your prompt thread is the design rationale: every constraint you added, every direction you rejected. Export it and attach it to the handoff. An engineer reading "I told it to block submission until both fields validate because users were losing drafts" gets the why, which screenshots never carry.
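The seeded data from step 2 can be generated rather than typed by hand. The following is a minimal sketch, not a prescribed method: the company names (other than Acme Corp), the plan tiers and prices, the 30% churn share, and the fixed reference date are all assumptions made up for illustration.

```python
import json
import random
from datetime import date, timedelta

# Hypothetical account names and plan prices, invented for illustration.
COMPANIES = ["Acme Corp", "Northwind Traders", "Globex", "Initech", "Umbrella Logistics"]
PLANS = {"free": 0, "team": 1200, "business": 4200}


def make_seed_accounts(n, seed=7, today=date(2026, 3, 31)):
    """Return n plausible account records; a fixed seed keeps demos repeatable."""
    rng = random.Random(seed)
    accounts = []
    for i in range(n):
        plan = rng.choice(list(PLANS))
        churned = rng.random() < 0.3  # assumed churn share, not a real benchmark
        churn_date = today - timedelta(days=rng.randint(1, 90)) if churned else None
        accounts.append({
            "id": i + 1,
            "name": COMPANIES[i % len(COMPANIES)],  # names repeat once n exceeds the list
            "plan": plan,
            "mrr_usd": PLANS[plan],
            "status": "churned" if churned else "active",
            "churned_on": churn_date.isoformat() if churn_date else None,
        })
    return accounts


def seed_json(n):
    """Serialize the seed records so they can be pasted into a prompt or a fixture file."""
    return json.dumps(make_seed_accounts(n), indent=2)
```

Calling `seed_json(5)` yields a JSON array you can paste into the tool's prompt. Because the random generator is seeded, the same records come back on every run, so a stakeholder sees the same accounts in the follow-up session.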
What the prototype must still carry
A prototype is biased toward the happy path. The demo shows a user with clean data doing the intended thing. Most of what a PRD existed to pin down is exactly what the demo does not show, and that work does not disappear because the document did.
Pair the prototype with a one-page companion that covers four things:
Edge cases. What happens with zero items, 10,000 items, a 400-character name, a user on a free plan? Click through your own prototype hunting for these. The tool will have invented an answer for each one. Some of those answers will be wrong in ways that look fine.
Error states. The prototype never loses network or hits a 500, but production will. Write down what the user sees in each case, even if it is one line per state.
Non-goals. Prose was good at "this release does not include bulk editing." A prototype cannot show an absence. Engineers who infer scope from what they can click will infer wrong, in both directions.
The why. The prototype shows what you decided. It does not record what you rejected or what evidence backed the call. Two sentences per decision is enough.
The prototype's job is to communicate intent precisely. That bar is much lower than production quality and much higher than a sketch. Judge it as a communication artifact: does an engineer looking at it know what to build, and does the companion page tell them what not to build?
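One way to hunt for the edge cases listed above is to build the extreme inputs once and track which of them the companion page has answered. This is a sketch under assumptions: the record shape, the "pro" plan name, and both function names are hypothetical, and the four cases are simply the ones named in the edge-case paragraph.

```python
def edge_case_fixtures():
    """Build the extreme inputs named in the edge-case paragraph."""
    def item(i):
        return {"id": i, "name": f"Item {i}"}

    return {
        "zero_items": {"plan": "pro", "items": []},
        "ten_thousand_items": {"plan": "pro", "items": [item(i) for i in range(10_000)]},
        "long_name": {"plan": "pro", "items": [{"id": 0, "name": "N" * 400}]},
        "free_plan_user": {"plan": "free", "items": [item(0)]},
    }


def undocumented_cases(fixtures, companion_answers):
    """Return the edge cases that have no non-empty written answer in the one-pager."""
    return [name for name in fixtures if not companion_answers.get(name, "").strip()]
```

Load each fixture into the prototype, note what the tool invented, and write the intended behavior into `companion_answers`. Whatever `undocumented_cases` still returns is the part of the one-pager left to write.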
The production gap
AI-generated code looks finished. It compiles and the demo runs clean. Security research says the polish is surface-deep: Veracode's 2025 GenAI Code Security Report found that roughly 45% of AI-generated code samples introduced security flaws from the OWASP Top 10, across more than 100 models, and the rate did not improve with newer or larger models [1]. Developer experience data points the same way. In Stack Overflow's 2025 survey, the top frustration with AI tools was solutions that are "almost right, but not quite," and close to half of respondents said debugging AI-generated code takes more time than expected [2]. A METR randomized trial found experienced open-source developers took 19% longer on tasks with AI assistance while believing they had been faster [3].
For a prototype none of this matters, because there is no real data and no attack surface. It starts mattering the moment someone says "this is basically done, can we just ship it?"
Anything touching authentication, payments, or personal data gets a fresh build from engineering. Do not let the prototype's code become the starting point. Treat vibe-coded output as disposable by default, and if the prototype ever held real customer data during a demo, flag that to security too.
The handoff, done properly, looks like this: the prototype link, the one-page companion (edge cases, error states, non-goals, the why), the prompt history, and an explicit list of what is faked. Then engineering estimates the production build as new work. Sometimes that estimate kills the feature, which is the system working: you spent a session finding out, instead of a quarter.
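The four-part handoff can be treated as a checklist that is either complete or not. The sketch below is one possible shape, assuming hypothetical field names and a hypothetical `missing_parts` helper; nothing about it is tied to a particular tool.

```python
from dataclasses import dataclass, field

# The four companion sections named in this guide.
COMPANION_SECTIONS = ("edge_cases", "error_states", "non_goals", "rationale")


@dataclass
class HandoffPackage:
    prototype_url: str = ""
    companion_page: dict = field(default_factory=dict)  # section name -> list of notes
    prompt_history_path: str = ""
    faked_items: list = field(default_factory=list)     # e.g. "login is hardcoded"


def missing_parts(pkg):
    """List what is still absent before the package goes to engineering."""
    gaps = []
    if not pkg.prototype_url:
        gaps.append("prototype link")
    for section in COMPANION_SECTIONS:
        if not pkg.companion_page.get(section):
            gaps.append(f"companion: {section}")
    if not pkg.prompt_history_path:
        gaps.append("prompt history")
    if not pkg.faked_items:
        gaps.append("list of what is faked")
    return gaps
```

An empty result from `missing_parts` means the package carries everything listed above. It says nothing about whether the contents are any good, which is still a human review.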
When the estimate comes back, you are doing investment math: build cost against expected return, and how long until the feature pays for itself. Run the numbers on paper rather than in the hallway.
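The investment math reduces to two small formulas: payback period is build cost divided by monthly return, and simple ROI is net gain over a chosen horizon divided by build cost. The sketch below assumes a flat monthly return and ignores discounting and ongoing maintenance cost; the function names and the figures in the usage note are hypothetical.

```python
def payback_months(build_cost, monthly_return):
    """Months until cumulative return covers the build cost; None if it never does."""
    if monthly_return <= 0:
        return None
    return build_cost / monthly_return


def simple_roi(build_cost, monthly_return, horizon_months):
    """Net gain over the horizon as a fraction of build cost (1.0 means 100%)."""
    if build_cost <= 0:
        raise ValueError("build_cost must be positive")
    return (monthly_return * horizon_months - build_cost) / build_cost
```

With an assumed $60,000 build estimate and $5,000 per month in expected return, `payback_months(60_000, 5_000)` gives 12.0 and `simple_roi(60_000, 5_000, 24)` gives 1.0, meaning the feature returns its cost once over again within two years. If the payback period runs longer than the feature's plausible lifetime, that is the estimate killing the feature.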
When a written PRD still wins
The prototype-as-spec works when the thing being specified is an interface a person clicks. Plenty of product work is not that.
| Spec format | Communicates well | Hides | Best for |
|---|---|---|---|
| Clickable prototype | Flow, layout, interaction feel, copy in context | Edge cases, non-goals, system behavior, rationale | Single-team UI features, concept tests, stakeholder alignment |
| Written PRD | Scope boundaries, dependencies, contracts, rationale, audit trail | How the thing actually feels to use | API work, regulated domains, multi-team programs |
| Prototype + one-pager | Both, at the cost of maintaining two artifacts | Little, if you keep them in sync | Most UI feature work in 2026 |
A written PRD remains the better spec when the interface is organizational rather than visual. An API contract is consumed by other teams' code; a prototype of it specifies nothing they can build against. Regulated domains (healthcare, lending, anything with a compliance sign-off) need a reviewable, versioned document, because an auditor cannot approve a Lovable link. And when four teams have dependencies on the same release, the spec's job is sequencing and ownership, which prose and a dependency table handle and a demo does not.
If you are building agentic or AI-native features, the spec also has to carry evals and failure budgets, which no prototype expresses. Building agentic products covers that layer.
Prototype plus one-pager beats either alone for standard feature work. The prototype settles the arguments prose is bad at (does this flow feel right); the page settles the ones a demo is bad at (what is out, what can fail, why).
Your first prototype this week
Three steps, sized for one working week:
- Pick a feature your team has already validated but not yet designed. You want the framing work done, so the session tests the tool, not your discovery.
- Write the 150-word frame, then spend 90 minutes in one tool (v0, Lovable, or Bolt; the differences matter less than starting).
- Show the result to one engineer and one user. Ask the engineer what the demo hides. Ask the user to complete the task without help.
The engineer's answer becomes your one-pager outline. The user's struggle becomes your next iteration.
FAQ
Do I need to know how to code to build AI prototypes? No. You need to write clear constraints and debug by description ("the save button does nothing after I edit the second row"). Reading basic code helps when the tool gets stuck, but the bottleneck is clarity of intent.
Which tool should I start with? Whichever your team can open today. v0 is strong for React-style UI, Lovable and Bolt build fuller apps with hosting, Replit Agent suits anything you want to keep running, and Figma Make fits orgs that live in Figma. Tool choice matters less than the framing step before it.
Does the prototype replace the PRD entirely? It replaces the part of the PRD that described the interface, which was the longest and least-read part. Keep a one-pager for edge cases, error states, non-goals, and rationale. For API contracts, regulated work, and multi-team programs, keep the full written spec.
Can we ship the prototype to production? Treat it as disposable. Roughly 45% of the AI-generated code samples in Veracode's testing introduced OWASP Top 10 flaws [1], and the prototype was built with zero review. A low-stakes internal tool with no sensitive data is the only defensible exception, and even then have an engineer read it first.
What exactly do I hand to engineering? Four things: the prototype link, the one-page companion, the prompt history, and a list of everything that is faked or hardcoded. Then let engineering estimate the production build as new work.
Sources
[1] 2025 GenAI Code Security Report, Veracode. Veracode tested over 100 large language models across Java, Python, C#, and JavaScript; 45% of generated code samples failed security tests and introduced OWASP Top 10 vulnerabilities.
[2] 2025 Developer Survey, AI section, Stack Overflow. "AI solutions that are almost right, but not quite" was the top frustration at 66%, followed by "debugging AI-generated code is more time-consuming" at 45%.
[3] Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity, METR. A randomized controlled trial with experienced open-source developers: tasks took 19% longer with AI assistance, while developers believed AI had sped them up by 20%.