Mob Elaboration: What Happens When AI Runs the Requirements Room

The first time I facilitated a Mob Elaboration session, the Product Owner read the Intent aloud: a new customer onboarding flow for a financial services platform. Within four minutes, the AI had generated twelve user stories, acceptance criteria for each, a proposed domain model, and a decomposition into three independent units of work.

A senior developer stared at the screen and said nothing for about ten seconds. Then: “Half of these are wrong.”

He was right. The AI had generated a clean, internally consistent set of stories that completely missed the regulatory constraint governing how customer identity verification works in that jurisdiction. The domain model had no concept of verification tiers. The acceptance criteria assumed a single approval path where the business actually required three, depending on the customer’s risk profile.

The interesting part was not that the AI got it wrong. The interesting part was what happened next. The team spent the following ninety minutes doing something they had never done in a traditional planning session: they systematically interrogated every assumption the AI had made, surfaced domain knowledge that had been living in people’s heads for years, and produced a specification more precise than anything they had written before.

That session changed how they think about requirements. The onboarding flow also shipped three weeks ahead of the original estimate, because the team spent ninety minutes arguing about the specification instead of three sprints discovering it was wrong.

The Shift: Why Traditional Requirements Break When AI Executes

For decades, requirements have been written with an implicit assumption: a human developer will read them, fill in the gaps with judgment, ask clarifying questions when something is ambiguous, and make reasonable decisions about edge cases. Requirements could afford to be imprecise because the person executing them would compensate.

AI does not compensate; it executes. When you write “the system should handle edge cases gracefully” for a human developer, they know to ask what the edge cases are. When an AI agent reads the same requirement, it generates code that handles whatever edge cases it infers from the context, which may or may not resemble the ones that matter to your business. Requirements are no longer just a communication artifact between people. They are becoming an executable input: the closer they are to code in precision, the more predictable the system becomes.

This is the shift that the developer community is already discovering independently. The specification is the only thing constraining what the AI will confidently get wrong. But the specification itself needs a different process when AI is both the consumer and the first-draft author.

Hunt and Thomas put it well in The Pragmatic Programmer: “The real world is messy, conflicted, and unknown. In that world, exact specifications of anything are rare, if not downright impossible. That’s where we programmers come in. Our job is to help people understand what they want.” When the developer is an AI agent, that job becomes the entire value proposition of the requirements room. As Coding with AI (2025) warns: “Without precise, well-defined requirements, the tools will help you build a bad project faster.” The speed that makes AI valuable is the same speed that makes ambiguity dangerous.

Mob Elaboration is the ritual that structures this conversation: the process through which a team transforms a vague business intent into a specification precise enough for an AI agent to execute correctly.

The Ritual in Action

A session runs two to four hours with five to eight people: the Product Owner, two to three developers, a QA representative, and the facilitator. The AI is a participant, not a tool running in the background. The facilitator manages the interaction between the team and the AI, enforcing time discipline and challenging both human assumptions and AI-generated artifacts.

The session moves through five phases, each with a specific purpose and a clear exit condition.

Intent Clarification opens the session. The Product Owner reads a one-to-three-sentence Intent statement aloud. The AI asks clarifying questions. The facilitator’s job in this phase is to ensure the AI asks deep enough questions and the team answers with real context, not abstractions. In one session, a PO described the intent as “automate the invoice approval workflow.” The AI’s first three questions were generic. The facilitator pushed: “Ask about the exceptions.” That single redirect surfaced a manual override process that handled roughly a third of all invoices and that nobody had mentioned because it was “just how we do it.” The exit signal is repetition: when the AI’s questions start repeating, the marginal value of additional context has dropped and the team can move on. This phase typically runs fifteen to thirty minutes, and skipping it is the single most common cause of session failure.

Story Generation follows. The AI produces user stories with acceptance criteria based on the clarified intent. The facilitator reads each story aloud with the team. The Product Owner confirms whether each story matches their expectation; developers confirm whether each story is buildable. The danger here is rubber-stamping: approving AI output without scrutiny because it looks polished and internally consistent. Consistency is not correctness. A healthcare team once approved a clean appointment scheduling decomposition without noticing the model had no concept of appointment types: in-person visits, telehealth, and lab appointments had different scheduling rules that the AI had silently collapsed into one.

Unit Division groups validated stories into independent, team-sized units of work. The key question is dependency: can Unit A be built without waiting for Unit B? If not, the division is wrong. Hidden dependencies between units are the second most common source of rework.
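The dependency question can be made mechanical once stories carry explicit prerequisites. What follows is a minimal sketch in Python, not a tool the ritual prescribes: the story IDs, unit names, and the cross_unit_dependencies helper are all hypothetical, and it assumes the team has already recorded which story depends on which.

```python
from typing import Dict, List, Set, Tuple


def cross_unit_dependencies(
    unit_of: Dict[str, str],
    depends_on: Dict[str, Set[str]],
) -> List[Tuple[str, str]]:
    """Return (story, prerequisite) pairs whose stories sit in different units."""
    crossings = []
    for story, prerequisites in depends_on.items():
        for prerequisite in prerequisites:
            if unit_of[story] != unit_of[prerequisite]:
                crossings.append((story, prerequisite))
    return sorted(crossings)


# Hypothetical story-to-unit assignment, for illustration only.
unit_of = {
    "S1": "unit-a",
    "S2": "unit-a",
    "S3": "unit-b",
    "S4": "unit-b",
}

# Hypothetical prerequisites captured during the session.
depends_on = {
    "S2": {"S1"},        # stays inside unit-a
    "S4": {"S3", "S1"},  # S1 lives in unit-a: a hidden cross-unit dependency
}

hidden = cross_unit_dependencies(unit_of, depends_on)
print(hidden)  # [('S4', 'S1')]
```

An empty result means each unit can be built without waiting on another. Any pair that shows up is a prompt for the room: move the story, split it, or accept the dependency on purpose.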

Risk and NFR Analysis is the phase teams most want to skip and least can afford to. The facilitator insists on at least three identified risks and measurable non-functional requirements before moving on. “It should be fast” is not an NFR. “Response time under 200ms at the 95th percentile” is. In one session, the team tried to move past this phase in five minutes. The facilitator held the room: “What happens if this service goes down during peak hours?” Silence. Then the architect said, “We don’t have a fallback.” That conversation added forty minutes to the session and saved the team from deploying a service with no degradation path. The fallback they designed in that room would have taken two weeks to retrofit after launch. This phase also defines how the team will know the Intent succeeded: which metric is expected to move.
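A measurable NFR is one you can turn into a check. As a rough sketch, here is the “under 200ms at the 95th percentile” example expressed in Python. The nearest-rank percentile method, the function names, and the sample latencies are my assumptions for illustration; a real team would agree on the percentile definition and pull samples from its own monitoring.

```python
import math
from typing import Sequence


def percentile_nearest_rank(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of values at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def meets_latency_nfr(
    samples_ms: Sequence[float],
    threshold_ms: float = 200.0,
    pct: float = 95.0,
) -> bool:
    """True when the chosen percentile of response times is strictly under the threshold."""
    return percentile_nearest_rank(samples_ms, pct) < threshold_ms


# Hypothetical response times in milliseconds, including one slow outlier.
latencies_ms = [
    120, 135, 140, 150, 155, 160, 165, 170, 175, 180,
    182, 185, 188, 190, 192, 194, 195, 196, 198, 450,
]

print(percentile_nearest_rank(latencies_ms, 95))  # 198
print(meets_latency_nfr(latencies_ms))            # True
```

The single 450ms outlier does not fail the check, which is exactly what a percentile-based NFR is meant to express. “It should be fast” gives you nothing to assert against.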

Bolt Planning closes the session by breaking units into time-boxed execution blocks, each scoped to roughly one day of Mob Construction work. The team identifies which bolts can run in parallel, assigns ownership, and commits to a timeline.
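Deciding which bolts can run in parallel is a dependency-ordering problem. One way to sketch it, assuming the team has written down each bolt’s prerequisites, is with the TopologicalSorter class from Python’s standard graphlib module. The bolt names and the plan_waves helper are hypothetical.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set


def plan_waves(prerequisites: Dict[str, Set[str]]) -> List[List[str]]:
    """Group bolts into waves; every bolt in a wave can run in parallel."""
    sorter = TopologicalSorter(prerequisites)
    sorter.prepare()  # raises graphlib.CycleError if bolts depend on each other in a loop
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # bolts whose prerequisites are all done
        waves.append(ready)
        sorter.done(*ready)
    return waves


# Hypothetical bolts mapped to the bolts they must wait for.
bolts = {
    "bolt-1": set(),
    "bolt-2": set(),
    "bolt-3": {"bolt-1"},
    "bolt-4": {"bolt-1", "bolt-2"},
    "bolt-5": {"bolt-3", "bolt-4"},
}

waves = plan_waves(bolts)
print(waves)  # [['bolt-1', 'bolt-2'], ['bolt-3', 'bolt-4'], ['bolt-5']]
```

With each bolt scoped to roughly one day, the number of waves gives a lower bound on the timeline in days, assuming enough people to staff every bolt in a wave at once.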

The session ends with a recap, PO confirmation, and a scheduled date for the first Mob Construction session.

What Teams Discover

Three patterns emerge consistently across sessions, regardless of industry or team size.

Stories get more precise than anything the team has written before. The combination of AI-generated first drafts and systematic human validation produces specifications that are simultaneously more detailed and more accurate than traditional user stories. The AI forces completeness by generating acceptance criteria for every story. The team forces correctness by challenging every assumption. Neither could produce the result alone. As Vlad Khononov argues in Learning Domain-Driven Design, “DDD is about letting your business domain drive software design decisions.” Mob Elaboration operationalizes this: the domain experts are in the room, the AI surfaces the questions, and the facilitator ensures the answers get captured.

Domain models get explicit earlier. In traditional development, domain models emerge gradually through code. Implicit assumptions about how the business works get encoded in class hierarchies, database schemas, and API contracts, often inconsistently across different parts of the system. Schwentner describes this problem precisely in Domain-Driven Transformation (2025): classic requirements engineering works like a telephone game, with interviews conducted separately and requirements derived in isolation, leading to misunderstandings that compound through the system. Collaborative Modeling methods solve this by bringing business and technical people together to exchange ideas directly. In the ritual, the AI generates an explicit domain model during story generation. The team sees it, debates it, and corrects it before a single line of code exists. Decisions that traditional agile postpones until they become expensive to change get made when they are cheap to change.

Teams catch architectural assumptions they would have deferred. The AI’s first draft is a mirror. It reflects back what the team told it, including the gaps. When the financial services team saw the AI’s domain model with no verification tiers, they did not just add the missing concept. They realized they had been carrying an implicit assumption about identity verification for years, one that had never been written down because everyone “just knew.” As Hohpe observes in The Software Architect Elevator (2020), tacit knowledge “exists only in employees’ heads but isn’t documented or encoded anywhere,” and encoding it into explicit artifacts “eliminates unwritten rules and undesired variation.” The AI’s inability to read minds forced the team to make their knowledge explicit.
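To make “explicit” concrete, here is a small Python sketch of what writing down that kind of rule can look like. It is purely illustrative: the team’s actual tiers and rules are not reproduced here, and every name below (RiskProfile, ApprovalPath, the three path labels, the mapping itself) is a hypothetical stand-in. The only detail taken from the session is that the approval path depends on the customer’s risk profile and that there were three paths, not one.

```python
from dataclasses import dataclass
from enum import Enum


class RiskProfile(Enum):
    # Hypothetical risk categories; the real ones come from the business.
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


class ApprovalPath(Enum):
    # Hypothetical labels for the three approval paths.
    AUTOMATIC = "automatic"
    ANALYST_REVIEW = "analyst_review"
    COMPLIANCE_REVIEW = "compliance_review"


# The rule that used to live in people's heads, now written down and reviewable.
APPROVAL_PATH_BY_RISK = {
    RiskProfile.LOW: ApprovalPath.AUTOMATIC,
    RiskProfile.MEDIUM: ApprovalPath.ANALYST_REVIEW,
    RiskProfile.HIGH: ApprovalPath.COMPLIANCE_REVIEW,
}


@dataclass(frozen=True)
class Customer:
    customer_id: str
    risk_profile: RiskProfile


def approval_path_for(customer: Customer) -> ApprovalPath:
    """Look up the approval path from the customer's risk profile."""
    return APPROVAL_PATH_BY_RISK[customer.risk_profile]


print(approval_path_for(Customer("c-1", RiskProfile.HIGH)))  # ApprovalPath.COMPLIANCE_REVIEW
```

Once the mapping exists as an artifact, a missing tier is a visible gap in a table rather than something a senior developer has to notice from memory.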

These discoveries are not accidental. They are structural consequences of the ritual’s design: AI proposes, humans validate, the facilitator enforces rigor. The combination produces artifacts that neither party could produce independently.

The Facilitator’s Role

The facilitator is not a project manager, not a Scrum Master, and not a technical lead. The facilitator manages the conversation between the team and the AI, and that conversation has failure modes that traditional facilitation does not prepare you for.

The most dangerous failure mode is approval fatigue. AI-generated artifacts are internally coherent: stories align with the domain model, the logical design follows from the domain model, code implements the logical design. That consistency creates a false sense of correctness. Everything fits together, so it must be right. Wells describes the underlying mechanism in Enabling Microservice Success (2024) through cognitive load theory: teams have a finite capacity for germane cognitive load, the kind that produces real understanding. Each AI-generated artifact the team reviews consumes that capacity. By the third or fourth model, the team’s ability to genuinely scrutinize is depleted, but the artifacts keep coming. I watched it happen in real time: a team spent twenty minutes rigorously debating the first domain model the AI produced, caught two errors, and felt good about their diligence. By the third model, they were nodding through it in four minutes. The facilitator stopped the room: “We’re approving faster. Is that because the quality improved, or because we’re tired?” It was the latter. They found a missing entity in the model they had been about to approve. The facilitator’s job is to periodically break the internal frame: “Forget what AI generated. Based on what you know about this business, is anything missing or wrong?”

The second failure mode is domain blind spots. The team validates what AI proposes without asking what AI omitted. AI decomposes based on the information it has; it does not know what it does not know. The facilitator must explicitly ask at the end of each phase: “What’s missing from this model? What business rules aren’t represented here?”

The third is the loudest-voice problem, amplified by AI. When AI generates a first draft, the person who speaks first about it anchors the team’s evaluation. The facilitator manages participation: written observations before group discussion, explicit rotation of who leads the review, direct questions to quieter team members.

This is an emerging leadership skill. Adkins wrote in Coaching Agile Teams that “the coach creates the container; the team creates the content.” In Mob Elaboration, the facilitator creates the container for a conversation that did not exist before AI entered the room. The content, the validated specification, emerges from the collision between AI’s speed and the team’s domain knowledge.

The goal is not a permanent facilitator role. After six to eight sessions, the team should absorb the capability. The facilitator’s success is measured by how quickly they become unnecessary.

The Requirements Room Didn’t Disappear

The developer community is converging independently on specification before generation, role separation, and quality gates. What they lack is the facilitated team dimension.

Solo developers building three-agent pipelines are solving the right problem at the individual level. But a specification written by one person, even a skilled one, carries that person’s assumptions and blind spots. Mob Elaboration works precisely because it is a team ritual: the Product Owner brings the business intent, the developers bring technical constraints, QA brings the failure scenarios, and the AI brings speed and completeness. The facilitator ensures none of these perspectives gets lost. As Helfand documents in Dynamic Reteaming (2020), mob programming creates a “group memory” where “the context was built from the communications that we’d been having all day.” In Mob Elaboration, that shared context does not just help the team; it becomes the specification that the AI executes.

The requirements room has not disappeared. It has become the most important room in the building, because the quality of what comes out of that room now determines the quality of everything the AI produces downstream. A vague intent produces vague stories, which produce vague code, which produces vague bugs that are expensive to diagnose and expensive to fix. A precise intent, refined through structured team elaboration, produces specifications that AI agents can execute correctly the first time.

The bottleneck in AI-driven development was never code generation. It was always intent definition. The difference now is that ambiguity does not slow you down; it compounds instantly, because the AI will generate confidently from whatever it has. Mob Elaboration is how teams remove that ambiguity before the first line of code makes it expensive.

Ricardo