
Most engineering organisations that have reached the Agent stage are focused on one question: how fast can it build? How quickly can AI generate code, write tests, produce documentation?
That is the wrong question.
When Xceptor's engineering team first watched AI agents work through their delivery pipeline, the instinct was to scrutinise the code. Senior engineers dissected every function, every variable name, every architectural choice. The first few connectors were delivered faster than expected, but the team was spending almost as much time reviewing to fix as they had previously spent building.
Around the third connector, something shifted. Engineers stopped line-by-line critique and started asking a different question: is the plan right?
That shift, from reviewing code to reviewing plans, is the actual insight from twelve months of building an agentic product development lifecycle at Xceptor. And it has implications for every engineering team trying to move AI from individual tool to team-level operating model.
Xceptor runs a data automation platform serving financial institutions across 170+ SaaS instances. Their engineering team (39 people at the start of this engagement, 49 today) had rolled out GitHub Copilot and was experimenting with Claude. Individual developers were faster. Team-level delivery metrics had not moved.
The problem was sequential handoffs. Features moved through product, architecture, development, QA, and DevOps as a relay race, and wait time between stages often exceeded the work time within them. AI was making each runner faster. The baton drops between them were unchanged.
The Agentic PDLC addressed that directly. It covers nine stages: feature requirements, architecture and design, test strategy, story breakdown, implementation planning, build, test, PR review, and release. Two roles govern it, Product and Builder, and it is deployed as a governed plugin, currently at v1.13 with 46 org-wide installs. Every stage runs on slash commands. AI proposes at every stage. Humans approve every gate. No artefact moves forward without sign-off.
That last part is the design principle that makes everything else work.
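To make the gate principle concrete, here is a minimal sketch of a pipeline that refuses to advance without sign-off. The stage names come from the list above; the `GatedPipeline` class, its methods, and the approver strings are hypothetical illustrations, not Xceptor's plugin.

```python
from dataclasses import dataclass, field

# The nine stages named in the article, in order.
STAGES = [
    "feature requirements",
    "architecture and design",
    "test strategy",
    "story breakdown",
    "implementation planning",
    "build",
    "test",
    "PR review",
    "release",
]


@dataclass
class GatedPipeline:
    """Hypothetical model: an artefact advances only after human sign-off."""
    position: int = 0
    approvals: dict = field(default_factory=dict)  # stage name -> approver

    @property
    def current_stage(self) -> str:
        return STAGES[self.position]

    def approve(self, approver: str) -> None:
        """Record a human sign-off for the current stage."""
        self.approvals[self.current_stage] = approver

    def advance(self) -> str:
        """Move to the next stage, refusing if the current gate is unsigned."""
        if self.current_stage not in self.approvals:
            raise PermissionError(f"no sign-off for stage: {self.current_stage}")
        if self.position == len(STAGES) - 1:
            raise RuntimeError("pipeline is already at the release stage")
        self.position += 1
        return self.current_stage


pipeline = GatedPipeline()
pipeline.approve("Product")   # a human approves the requirements
print(pipeline.advance())     # architecture and design
```

The point of the sketch is that the check lives in `advance`, not in the reviewer's memory: an unsigned stage cannot be skipped by accident.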
The value of this pipeline is measurable, not claimed. Two proof points from production, both demonstrated on stage at CTO Craft.
Connector build (MS Graph and Databricks). Two connectors the team estimated at two weeks each were delivered in two days each. That was an 83% cost reduction compared to the traditional approach, while requiring only a modest investment of Product and Builder resources. AI token cost was £110. The original target was one day per connector. "I didn't get there," Michael Kinloch, Xceptor's SVP Engineering, said on stage, "but I think we can streamline it a bit more."
Config Builder (Loan Notices). The pipeline extended beyond code into platform configuration, translating plain-language business process descriptions into structured Xceptor configuration. The Loan Notices project was delivered in 6 days against a 26-day estimate, for less than a quarter of the anticipated conventional cost. The same pipeline now handles both code and platform configuration. The agent determines which one it is building.
Both results were presented with updated figures at CTO Craft London in March 2026 and CTO Craft Toronto in May 2026: the same programme, verified results, presented by the engineering leader responsible for them.
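For readers who want to check the timeline figures, the arithmetic is short. This sketch assumes a two-week estimate means ten working days (an assumption, not stated on stage). It covers elapsed time only; the cost figures quoted above are separate and are not derived here.

```python
def time_reduction_pct(estimated_days: float, actual_days: float) -> float:
    """Percentage reduction in elapsed delivery time against the estimate."""
    return round((1 - actual_days / estimated_days) * 100, 1)


# Assumption: a two-week estimate equals 10 working days.
connector = time_reduction_pct(estimated_days=10, actual_days=2)
loan_notices = time_reduction_pct(estimated_days=26, actual_days=6)

print(connector)     # 80.0
print(loan_notices)  # 76.9
```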
Before any code is written, the agent produces an implementation plan: every file it will create or modify, what changes it will make, what tests it will write. The pattern across every connector and config build is consistent: when the plan is correct, execution is accurate. When the plan has a gap, the output reflects that gap, reliably.
Course-correcting a plan takes minutes. Course-correcting generated code takes hours.
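One way to picture a reviewable plan is as structured data that can be checked for gaps before the build stage starts. The sketch below is an illustration under assumptions: the `FileChange` fields mirror the three things the plan is described as listing (files, changes, tests), but the class, the `find_plan_gaps` helper, and the file paths are hypothetical, not the format the Xceptor pipeline uses.

```python
from dataclasses import dataclass, field


@dataclass
class FileChange:
    path: str      # file the agent will touch (hypothetical paths below)
    action: str    # "create" or "modify"
    summary: str   # what the agent intends to change
    tests: list = field(default_factory=list)  # tests planned for this change


def find_plan_gaps(plan: list) -> list:
    """Return human-readable gaps a reviewer should resolve before build."""
    gaps = []
    for change in plan:
        if change.action not in ("create", "modify"):
            gaps.append(f"{change.path}: unknown action '{change.action}'")
        if not change.summary.strip():
            gaps.append(f"{change.path}: no description of the change")
        if not change.tests:
            gaps.append(f"{change.path}: no tests planned")
    return gaps


plan = [
    FileChange("connector/client.py", "create", "API client wrapper",
               tests=["tests/test_client.py"]),
    FileChange("connector/auth.py", "create", "token handling"),
]

for gap in find_plan_gaps(plan):
    print(gap)  # connector/auth.py: no tests planned
```

A mechanical check like this does not replace the reviewer. It only catches the cheap omissions, so the human can spend those review minutes on whether the plan is the right plan.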
This changes where senior engineers should concentrate their attention. The instinct is to review the code because that is the deliverable. But by the time the code exists, you are already downstream of the decision that matters. The architecture document, the threat model, the implementation plan: these are the artefacts worth the scrutiny. The code is downstream output.
The plan is the primary review artefact. When the plan is right, execution follows. That is not a claim about model intelligence, but a structural observation about where errors originate.
Xceptor's rework rate before this programme was 30%. Requirements gaps were surfacing during build or QA rather than during planning. That is a sequencing failure, not a quality failure: the right questions were not being asked at the right stage.
The agentic pipeline asks the same questions at every stage, every time, regardless of whether the standup ran over or someone skipped a step. It runs the same security checks. It applies the same structure to architecture documents, threat models, and release notes. Artefact quality became consistent across the team, not dependent on who happened to own the ticket that week.
Rework dropped from 30% to under 10%.
Story creation went from 3 hours to under 1 hour. Test scripting time dropped 75% per engineer. 80% of the engineering team (39 of 49 engineers) adopted the pipeline within six months, measured by actual usage rather than licence installs.
As Mike Kinloch, SVP Engineering at Xceptor, put it at CTO Craft: "for us it is not really about speed and delivery. The biggest thing was the quality of the artefacts and the process... really robust architecture design documents, really robust threat modelling, great release notes... which beats speed hands down."
Three problems are worth naming for teams building this kind of pipeline.
Story granularity mismatch. The pipeline initially broke features into small, human-readable stories. For an AI, each story requires reloading full context, and a small story costs nearly as much context as a large one. One larger story per connector worked better. The approach that makes sense for human sprint planning is the wrong approach for agentic execution.
Artefact detail calibration. The agent generates more detail than humans need to review. Early design documents created overhead rather than reducing it. Expect two to three iterations on prompt design before output consistently matches how your team actually works. This is not a failure; it is the calibration phase every team will go through.
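A rough model shows why granularity matters for an agent. The token counts below are invented purely for illustration (they are not Xceptor's measurements); the only claim carried over from the text is that every story reloads the full context regardless of its size.

```python
def total_tokens(stories: int, context_tokens: int, work_tokens: int) -> int:
    """Each story reloads the full context; the work itself is a fixed total."""
    return stories * context_tokens + work_tokens


# Hypothetical figures, chosen only to show the shape of the cost.
CONTEXT_TOKENS = 40_000  # context reloaded at the start of every story
WORK_TOKENS = 60_000     # total generation work for the whole connector

ten_small_stories = total_tokens(10, CONTEXT_TOKENS, WORK_TOKENS)
one_large_story = total_tokens(1, CONTEXT_TOKENS, WORK_TOKENS)

print(ten_small_stories)  # 460000
print(one_large_story)    # 100000
```

Under these assumed numbers the same connector costs more than four times as much when split ten ways, and almost all of the difference is repeated context.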
Token consumption patterns. The agent sometimes loaded entire documents when it needed only specific sections. The team is evaluating whether non-engineering stages should run on smaller, cheaper models, reserving the primary model for the build stage. Model selection by pipeline stage, not by preference, is worth building into your architecture from the start.
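Building model selection into the architecture can be as simple as a lookup keyed by stage. The sketch below is one possible shape, not what Xceptor runs: the team is still evaluating this, the model names are placeholders, and treating every stage other than build as a candidate for the smaller model is an assumption made for illustration.

```python
# Placeholder identifiers, not real model names.
PRIMARY_MODEL = "primary-model"
SMALL_MODEL = "small-model"

# Assumption: only the build stage is reserved for the primary model.
MODEL_BY_STAGE = {
    "feature requirements": SMALL_MODEL,
    "architecture and design": SMALL_MODEL,
    "test strategy": SMALL_MODEL,
    "story breakdown": SMALL_MODEL,
    "implementation planning": SMALL_MODEL,
    "build": PRIMARY_MODEL,
    "test": SMALL_MODEL,
    "PR review": SMALL_MODEL,
    "release": SMALL_MODEL,
}


def model_for(stage: str) -> str:
    """Look up the model for a stage; unknown stages fail loudly."""
    try:
        return MODEL_BY_STAGE[stage]
    except KeyError:
        raise ValueError(f"no model configured for stage: {stage}") from None


print(model_for("build"))            # primary-model
print(model_for("story breakdown"))  # small-model
```

Keeping the mapping explicit, rather than defaulting unknown stages to one model, means a new stage cannot silently inherit the wrong cost profile.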
The cognitive shift was visible before the metrics confirmed it. Engineers moved from "reviewing to fix" to "reviewing to perform." They were not debugging AI output. They were steering it.
Product owners shifted from writing stories to reviewing and directing agent-generated requirements. Developers moved from writing boilerplate to making architecture decisions. QA moved from scripting tests manually to approving AI-generated suites and catching edge cases.
The pipeline does not remove human judgment from the process. It relocates where that judgment is applied: upstream, at the plan stage, where it compounds. Downstream, in the code, it ratifies.
Three streams are scaling in parallel. The Connectors AI PDLC is rolling out to the full delivery team. Features AI PDLC and Config Builder AI PDLC are each on a 6-week sprint to MVP. The Code PDLC and Config PDLC then merge into a single pipeline: one lifecycle, two output types.
The commercial implication of that merger is significant, and worth a separate piece. When the same governed pipeline that delivers code in days also delivers customer-facing platform configuration, the timeline compression does not stay inside the engineering org. It reaches customer delivery, pre-sales, and professional services. Project timelines compress. Customer self-service expands. The professional services model shifts.
That is not a future state. The Loan Notices Config Builder was demonstrated to a customer at PoC stage. As Xceptor’s team recounted: "we put it in front of our customer and demoed it... they were like, 'Hey, this works. How did you build it so fast? It normally takes much longer than this.'"
For engineering leaders working through their own AI adoption: the leverage in an agentic PDLC is not the speed of code generation. It is the consistency of the process upstream of code, and the compounding that follows when artefact quality is no longer variable.
Start with the plan. Make the plan the primary review artefact. The code will follow.
For the full Xceptor case study with metrics across all delivery phases, read: How Xceptor Moved AI Out of the Pilot Phase and Into Every Stage of Delivery.
For the three-stage adoption roadmap, read: The Augment - Automate - Agent Journey: A Practical Roadmap for Engineering Leaders.