Reducing software delivery lead time from around twelve days to under eight days does not require a rewrite. It requires a sequence: measure where time is actually lost, shrink batch size at the planning stage, automate the deploy path on the system you have, and remove human approval gates that have no defect-prevention value. Each step compounds the next. The system you already have is the only system you can actually ship from.
The most common response to a 12-day lead time is a rewrite proposal.
The pitch sounds reasonable. "The existing system is too brittle to deploy quickly. The new platform will be designed for CD from day one. Give us six months and we will get lead time under one day."
Twelve months later, the new platform is half-built. The existing system is shipping at 14 days, not 12. The team has zero new continuous-delivery experience because they have been waiting for the new system to start practicing it. The transformation has gone backwards.
This is the rewrite trap. This post is about the alternative: how to take a team with a 12-day lead time and get them under 8 days on the system they already have, in a sequence that is more predictable than most teams expect.
The result is not theoretical. Teams that follow this sequence consistently land in the 7-to-8-day range within a quarter or two. The exact number depends on starting state. The trajectory does not.
What Lead Time Actually Measures
Lead time for changes is one of the four DORA metrics. It measures the time from code commit to code running in production. DORA's performance bands shift a little between annual reports, but the shape is stable: elite teams ship within hours, high performers within days, medium performers within weeks, and low performers take a month or more.
Most teams reporting "we ship every two weeks" actually have lead times of three to five weeks. The sprint cadence is not the lead time. Lead time includes everything: code review, manual QA, change advisory board approval, deploy windows, hotfix interruptions, rework after staging finds a defect. The clock starts at commit. It stops in production.
This matters because lead time is one of the strongest signals of organisational delivery health. Long lead time means defects sit in queues longer, learning loops are slower, and feature throughput is capped by batch coordination overhead. Cutting lead time in half is not just faster delivery. It changes which engineering practices are economically viable. Short lead time makes trunk-based development cheap. Long lead time makes long-lived branches mandatory.
Why Teams Default to Rewrites
The default response to long lead time is structural reasoning: the system is the bottleneck, so replace the system. This feels rigorous. It is usually wrong.
When we measure where the 12 days actually go, the system is rarely the largest line item. We typically see something like this:
- Code review wait time: 2.5 days
- Manual QA cycle: 3 days
- Staging rebuild and integration check: 1 day
- Change Advisory Board approval: 2 days
- Deploy window scheduling: 1.5 days
- Active development and rework: 2 days
The actual deployment of the artifact is often under an hour. The system can deploy faster than the organization will let it.
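As a quick check on the breakdown above, the line items can be summed and ranked. This is a minimal sketch: the durations are copied from the list, while the split into "active" and "process" stages is an assumption made here for illustration.

```python
# Stage durations in days, copied from the breakdown above.
stage_days = {
    "Code review wait": 2.5,
    "Manual QA cycle": 3.0,
    "Staging rebuild and integration check": 1.0,
    "Change Advisory Board approval": 2.0,
    "Deploy window scheduling": 1.5,
    "Active development and rework": 2.0,
}

# Assumption for illustration: only this stage counts as engineering work;
# everything else is process or waiting.
ACTIVE_STAGES = {"Active development and rework"}

total_days = sum(stage_days.values())
process_days = sum(d for s, d in stage_days.items() if s not in ACTIVE_STAGES)


def share(days: float) -> float:
    """Percentage of total lead time, rounded to one decimal place."""
    return round(100 * days / total_days, 1)


# Print stages from largest to smallest so the biggest line items lead.
for stage, days in sorted(stage_days.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{stage:<40} {days:>4.1f} d  {share(days):>5.1f}%")
print(f"{'Total':<40} {total_days:>4.1f} d")
print(f"Process and waiting share: {share(process_days)}%")
```

Under that split, the items add up to the full 12 days, and ten of them sit outside active development.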
A rewrite addresses none of these. The new platform will still have code review wait time, manual QA, integration checks, CAB approval, and deploy windows, unless the organisation changes those things. And organisations that defer the change ("we will fix this on the new platform") almost never get around to making it.
The Sequence That Actually Works
The path from 12 days to 7 days has four steps in a specific order. Skipping a step or reordering them does not work. Each step creates the preconditions for the next.
Step 1: Measure where the time actually goes
Before changing anything, instrument the pipeline to capture timestamps at every stage transition. Commit pushed. PR opened. First review comment. PR merged. Deployed to staging. QA started. QA passed. CAB approved. Deployed to production.
This is one week of work and usually requires no code changes. Most CI systems already capture this data and the team has just never aggregated it. The output is a stage-by-stage histogram of where time is spent.
Almost every team is surprised by the result. The bottleneck is rarely where they think. Teams that "knew" code review was slow discover the actual bottleneck is the gap between PR-approved and PR-merged. Teams that "knew" deployment was hard discover deployment takes 40 minutes and the CAB approval averages 2 days.
You cannot fix what you cannot see. Step 1 is unglamorous and unblocks every later step.
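Turning the stage transitions listed in this step into per-stage durations takes very little code. The sketch below assumes the timestamps have already been exported as ISO 8601 strings keyed by event name. The event names, the `stage_durations` and `lead_time_days` helpers, and the sample data are all hypothetical, not the output of any particular CI system.

```python
from datetime import datetime

# Stage transitions in pipeline order, named after the events listed above.
EVENTS = [
    "commit_pushed", "pr_opened", "first_review_comment", "pr_merged",
    "deployed_to_staging", "qa_started", "qa_passed", "cab_approved",
    "deployed_to_production",
]

SECONDS_PER_DAY = 86400


def stage_durations(timestamps: dict) -> dict:
    """Days elapsed between each consecutive pair of recorded events."""
    recorded = [
        (event, datetime.fromisoformat(timestamps[event]))
        for event in EVENTS
        if event in timestamps  # tolerate stages that were not captured
    ]
    return {
        f"{a} -> {b}": (tb - ta).total_seconds() / SECONDS_PER_DAY
        for (a, ta), (b, tb) in zip(recorded, recorded[1:])
    }


def lead_time_days(timestamps: dict) -> float:
    """Commit-to-production time in days: the clock the metric defines."""
    start = datetime.fromisoformat(timestamps["commit_pushed"])
    end = datetime.fromisoformat(timestamps["deployed_to_production"])
    return (end - start).total_seconds() / SECONDS_PER_DAY


# Invented sample data for a single change.
example = {
    "commit_pushed": "2024-03-04T09:00:00",
    "pr_opened": "2024-03-04T10:00:00",
    "pr_merged": "2024-03-06T10:00:00",
    "deployed_to_production": "2024-03-08T09:00:00",
}

durations = stage_durations(example)
for stage, days in durations.items():
    print(f"{stage:<40} {days:.2f} d")
print(f"Lead time: {lead_time_days(example):.2f} d")
```

Aggregating these per-stage durations across a few weeks of changes gives the stage-by-stage histogram.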
Step 2: Shrink the batch at the planning stage
The single highest-leverage intervention is reducing the size of what enters the pipeline. A 500-line change spends longer in every stage than a 50-line change does. Larger PRs sit in review longer because reviewers wait for time to read them carefully. Larger releases require longer QA cycles because the surface area is bigger. Larger batches make CAB approval slower because more risk attaches to each decision.
Most teams skip this step because they treat batch size as a property of the work, not the team. It is not. Batch size is a planning choice. A feature can ship as one batch or as five. The teams shipping in 7 days are not working on smaller features. They are slicing the same features differently.
The discipline here is at the plan phase, not the code phase. If your planning process produces tickets that take a week of coding, your lead time is bounded below by a week regardless of what you do to the rest of the pipeline. Most teams need to learn vertical slicing: every ticket must be independently shippable and independently valuable. Backend-only and frontend-only stories are a smell. So is any ticket with the word "and" in the title.
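The two smells named above are easy to check mechanically at planning time. The following is a rough heuristic sketch, not a substitute for judgment: the `slicing_smells` helper and its patterns are hypothetical and would need tuning for a real tracker.

```python
import re

# Hypothetical patterns for the two smells named above.
AND_WORD = re.compile(r"\band\b", re.IGNORECASE)
LAYER_ONLY = re.compile(r"\b(backend|frontend)[- ]only\b", re.IGNORECASE)


def slicing_smells(title: str) -> list:
    """Return human-readable warnings for a ticket title."""
    smells = []
    if AND_WORD.search(title):
        smells.append("title contains 'and': may bundle two tickets")
    if LAYER_ONLY.search(title):
        smells.append("single-layer story: may not be independently valuable")
    return smells


# Invented ticket titles for illustration.
for title in [
    "Add export button and audit log",
    "Backend-only: orders endpoint",
    "Show order total on checkout page",
]:
    print(title, "->", slicing_smells(title) or "ok")
```

The word-boundary match keeps titles such as "Brand refresh" from being flagged just because they contain the letters "and".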
Step 3: Automate the deploy path on the system you have
Now, and only now, invest in deploy automation. This step is almost always done out of order, which is why it so often fails. Teams automate deployment first, before measuring, and discover that deploy time was 5% of the lead time. Months of pipeline work produce hours of lead-time reduction.
In sequence, this step is high-leverage because the previous steps have already shrunk batch size. Smaller batches make automated deployment safer (less to go wrong) and faster (less to coordinate). The pipeline you build now is one you can actually run frequently, which builds the operational experience needed for step 4.
The principle: do not optimise a pipeline you are not yet using. Automation amplifies whatever cadence already exists. If you deploy once a fortnight, automating the deploy path takes you from "painful fortnightly release" to "cheap fortnightly release." That is a useful improvement, but it does not move the lead-time needle. The needle moves when the automation enables a release cadence that was previously impossible. That requires step 2 first.
Step 4: Remove approval gates with no defect-prevention value
The last step is the hardest because it is organisational, not technical. Change Advisory Boards, deploy windows, release-manager sign-off, and manual gate approvals exist in almost every long-lead-time team. They were added for legitimate reasons, usually after an incident. They persist because nobody has authority to remove them and nobody can prove they are not preventing defects.
The fix is to measure. For every approval gate, ask one question: in the last 12 months, how many defects has this gate caught that would otherwise have reached production? If the answer is zero or near-zero, the gate is theatre. It is adding lead time without adding safety. Remove it.
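The audit question translates directly into a small table and a filter. This is a sketch under stated assumptions: the `Gate` record, the `near_zero` threshold, and every number in the sample data are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Gate:
    name: str
    avg_delay_days: float      # lead time the gate adds per change
    defects_caught_12mo: int   # defects it stopped in the last 12 months


def audit(gates, near_zero=0):
    """Split gates into theatre (caught at most near_zero defects) and the rest."""
    theatre = [g for g in gates if g.defects_caught_12mo <= near_zero]
    caught_defects = [g for g in gates if g.defects_caught_12mo > near_zero]
    return theatre, caught_defects


# Invented sample data; real counts would come from incident and review records.
gates = [
    Gate("Change Advisory Board", 2.0, 3),
    Gate("Deploy window", 1.5, 0),
    Gate("Release-manager sign-off", 0.5, 0),
]

theatre, caught_defects = audit(gates)
reclaimable_days = sum(g.avg_delay_days for g in theatre)
print("Removal candidates:", [g.name for g in theatre])
print(f"Lead time reclaimable by removal: {reclaimable_days} d")
```

Recording the delay next to the defect count turns the removal argument into a concrete trade: days of lead time against defects actually prevented.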
Gates that have caught defects are different. Do not remove those. Automate them. The CAB that catches database migration risks should be replaced by a pipeline check that fails when a destructive migration is committed without a corresponding rollback plan. The release manager who catches missing changelog entries should be replaced by a CI rule. Every legitimate gate has an automation path. Manual review at scale is not safety. It is queueing.
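The migration example can be sketched as a pipeline check. The version below assumes migrations are plain SQL and that a "rollback plan" means a non-empty companion down script. Both conventions, the `check_migration` helper, and the deliberately incomplete pattern list are assumptions for illustration, not a complete safety check.

```python
import re
from typing import Optional

# Deliberately incomplete list of destructive statements (assumption).
DESTRUCTIVE = re.compile(r"\b(DROP\s+TABLE|DROP\s+COLUMN|TRUNCATE)\b", re.IGNORECASE)


def check_migration(up_sql: str, down_sql: Optional[str] = None) -> Optional[str]:
    """Return an error message if the check fails, otherwise None."""
    has_rollback = bool((down_sql or "").strip())
    if DESTRUCTIVE.search(up_sql) and not has_rollback:
        return "destructive migration committed without a rollback plan"
    return None


# Invented migrations for illustration.
drop_column = "ALTER TABLE users DROP COLUMN email;"
restore_column = "ALTER TABLE users ADD COLUMN email text;"

print(check_migration(drop_column))                    # fails: no rollback
print(check_migration(drop_column, restore_column))    # passes
print(check_migration("CREATE TABLE t (id int);"))     # passes: not destructive
```

In a pipeline, a wrapper would run this over each changed migration and exit non-zero on the first error message, so the build fails where the CAB would previously have objected.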
Why This Order Matters
Teams that try these steps out of order plateau. The most common mistake is starting with step 3 (deploy automation) because it feels like the obvious technical fix. The second most common is jumping to step 4 (removing approvals) because it is the most frustrating bottleneck. Both fail.
Without step 1, you cannot prove that step 4 is safe to do. You have no defect-prevention data to defend removing a gate, and the first incident after removal becomes evidence that the gate was load-bearing. Without step 2, deploy automation produces a fast pipeline you still cannot run frequently because each release is too large to risk. The steps are sequenced because each one supplies the evidence and the precondition the next one needs.
This is also why "we will fix it on the new system" fails. The new system will need exactly the same four steps, in the same order. The team that has not run them on the existing system has no experience to apply when the new system goes live.
What This Looks Like in Numbers
A representative trajectory for a team starting at 12.4 days:
- Week 1-2 (measure): No lead-time change. The histogram reveals roughly 8 days in queues, 1 day in active work, and 3 days in approvals.
- Week 3-6 (shrink batch): Average PR size drops from 480 lines to 180 lines. Review wait drops from 2.5 days to 1.2 days. Lead time drops to 9.8 days.
- Week 7-10 (automate deploy): Deploy time from artifact-ready to production drops from 4 hours to 12 minutes. The bigger effect is psychological. Engineers stop batching changes to amortise deploy pain. Lead time drops to 8.6 days.
- Week 11-14 (remove gates): CAB replaced with pipeline-enforced migration checks. Deploy windows removed for changes flagged low-risk by the pipeline. Lead time drops to 7.2 days.
The exact numbers vary. The shape does not. The largest single jump usually comes from step 2 (batch size). The most psychologically satisfying step is usually step 4 (removing gates), but it does not work until the prior steps have built the safety net that replaces those gates.
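The per-step contribution can be read straight off the trajectory. A minimal sketch, using only the lead-time figures quoted above:

```python
# Lead time in days after each phase, from the trajectory above.
trajectory = [
    ("start", 12.4),
    ("measure", 12.4),
    ("shrink batch", 9.8),
    ("automate deploy", 8.6),
    ("remove gates", 7.2),
]

# Drop attributable to each phase: previous lead time minus the new one.
drops = {
    step: round(prev - cur, 1)
    for (_, prev), (step, cur) in zip(trajectory, trajectory[1:])
}

largest_step = max(drops, key=drops.get)
start_days, end_days = trajectory[0][1], trajectory[-1][1]
total_reduction_pct = round(100 * (start_days - end_days) / start_days, 1)

for step, drop in drops.items():
    print(f"{step:<16} -{drop} d")
print(f"Largest drop: {largest_step}")
print(f"Total reduction: {total_reduction_pct}%")
```

On these figures, batch shrinking accounts for 2.6 of the 5.2 days removed, and the overall reduction is a little over 40 percent.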
Notice what is not on this list. No platform migration. No new framework. No containerisation initiative. No CI tool switch. The codebase did not change. The team did not change. The organisation around the codebase changed.
How to Apply This Monday Morning
Pick one team. Not the whole organisation. One team that wants to do this and has cover from leadership.
That team:
- This week: Add timestamp logging to the deploy pipeline. By Friday, have a histogram of where time is spent on a real PR from last week. Share it with the team.
- Next two weeks: Pick one in-flight ticket. Slice it into the smallest independently shippable increment. Ship that first. Repeat for the next ticket. Measure the lead time on the slices versus the original tickets.
- Following month: Build the deploy automation for the existing system. Not the new system. The one you actually run today.
- Then: Run the approval-gate audit. For each gate, find the defects it has caught in the last year. Remove every gate that has caught zero. Automate the ones that have caught something.
This is not a transformation programme. It is a four-step diagnostic and intervention sequence. The team that runs it ends up at a 7-day lead time on the system they already had. The team that waits for the rewrite ends up at 14 days a year later.
The Bottom Line
Lead time is a property of the system and the organisation around it. Rewrites address the system. The organisation is usually the larger problem. The sequence above attacks both, in the order that compounds: measure, slice, automate, remove. The result is not a rewrite. It is a team that has cut its lead time by roughly 40 percent on the same codebase, and now has the muscle memory to keep going.
Twelve days to seven days is not the destination. It is the proof that the destination is reachable from here.