What to Kill in Your Engineering Org This Quarter
Most engineering orgs added AI but kept the rituals built for a slower era. Here are eleven things to retire this quarter, and what to replace them with.

Most engineering orgs spent the last year adopting AI tools and kept everything else exactly as it was. The standups still happen. The sprints still close. The velocity charts still get presented to leadership. The PRDs still arrive verbose. PR review still happens at uniform depth. Manual QA still runs on every change. The same job descriptions still circulate for engineering hires.
This is the actual reason most AI adoptions are underperforming. The tools are not the bottleneck. The accumulated process from the pre-AI era is. Every ritual, role, report, and artifact you kept assumed humans were the throughput constraint. Agents broke that assumption. The process did not get the memo.
Here are eleven specific things to kill this quarter. Not all at once, and not necessarily all of them, but a serious list of candidates for retirement in a team that wants AI adoption to be more than a velocity bump. For each one: what it was for, why it is now expensive, and what to replace it with.
1. Two-Week Sprints
Sprints were a coordination ritual for slow, expensive human execution. The two-week window let teams batch planning and replan based on what was learned. Neither job survives AI-native execution. Planning needs to anchor on outcomes that take longer than a sprint to deliver. Replanning needs to be continuous, not batched.
The replacement is the outcome cycle: a four-to-six-week unit anchored on a specific outcome, with continuous flow inside and an outcome review at the end. Sprints get retired. The work gets better paced. Leadership gets a clearer story to tell upward.
2. Daily Standups
The standup surfaced blockers and kept the team aware of each other's work. AI-native systems do both better. Blockers surface in the work because state is visible in the systems the work runs through. Awareness of each other's work is better served by an async end-of-day post (three lines: what I worked on, what I learned, what I am picking up next) that persists as searchable context.
The 9am standup costs the whole team thirty minutes plus the context switching around it. The async post costs two minutes per person and produces a written record that the team can search a month later. Replace it.
3. Story Points
Story points were an attempt to estimate work in a unit that abstracted away from human hours. They worked imperfectly even when human hours were the constraint. They do not work at all when an agent does the implementation in a fraction of the time and the variance is in the specification quality.
The unit of estimation that matters now is risk: how reversible is this change, how unknown is the outcome, how hard is it to verify. Points should be retired in favour of risk tiers. A change is low risk, medium risk, or high risk. That is the only granularity the team needs.
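To make that concrete, here is a minimal sketch of a risk-tier classifier. The three dimensions come straight from the paragraph above; the mapping from failed dimensions to tiers is an assumption a team would tune.

```python
from dataclasses import dataclass
from enum import Enum


class RiskTier(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


@dataclass
class Change:
    reversible: bool      # can we roll this back cleanly?
    known_outcome: bool   # do we understand what "done" looks like?
    verifiable: bool      # can automated checks confirm correctness?


def risk_tier(change: Change) -> RiskTier:
    # Count the dimensions working against us. The mapping from
    # failed dimensions to tiers is an assumption, not a standard.
    failures = sum(
        not flag
        for flag in (change.reversible, change.known_outcome, change.verifiable)
    )
    if failures == 0:
        return RiskTier.LOW
    if failures == 1:
        return RiskTier.MEDIUM
    return RiskTier.HIGH
```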
4. Velocity as a Primary Metric
Velocity measures throughput. Throughput, in an AI-native team, is not a useful primary metric, because it can rise while the system gets worse. A team can ship more PRs while the change failure rate climbs and the incident rate compounds. Velocity will tell you the first half of that story and stay silent about the second half.
The replacement is change failure rate as the primary metric, with velocity as a secondary signal. A team with rising CFR and rising velocity is in worse shape than a team with flat velocity and falling CFR. The current convention of presenting velocity to leadership and not presenting CFR is a leadership reporting failure. Fix it this quarter.
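CFR here is the standard DORA definition: the share of deployments that cause a failure in production. A worked sketch of the comparison above, with invented numbers:

```python
def change_failure_rate(failed_deploys: int, total_deploys: int) -> float:
    """DORA-style CFR: deployments causing a production failure / all deployments."""
    return failed_deploys / total_deploys


# Hypothetical quarters for two teams. Team A ships more but degrades;
# Team B ships a flat volume and gets healthier.
team_a = change_failure_rate(18, 120)  # rising velocity, rising CFR
team_b = change_failure_rate(4, 80)    # flat velocity, falling CFR
print(f"Team A CFR: {team_a:.0%}, Team B CFR: {team_b:.0%}")
# Team A CFR: 15%, Team B CFR: 5%
```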
5. AI Tool Licence Counts as a Maturity Signal
The first slide of every AI adoption deck shows licence counts: "we rolled out Copilot to 200 engineers." This metric is meaningless. Tool access does not produce outcomes. The teams getting durable returns from AI adoption are not the ones with the most licences. They are the ones with the deepest context infrastructure, the cleanest agent-readable test suites, and the most adapted review processes.
Replace licence counts with maturity signals: context layer completeness, test suite agent-readiness, review process risk-tiering, agent ticket close rate. These are the things that actually move outcomes. The licence count is a vanity metric. Treat it as one.
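As a sketch of what the replacement slide might compute (the four signals are from the list above; the 0-to-1 scoring and equal weighting are assumptions a team would adapt):

```python
# A quarterly maturity scorecard to replace the licence-count slide.
maturity_signals = {
    "context_layer_completeness": 0.6,   # share of domains with current docs
    "test_suite_agent_readiness": 0.4,   # share of suites agents can run and read
    "review_process_risk_tiering": 1.0,  # is risk-tiered review live? 0 or 1
    "agent_ticket_close_rate": 0.3,      # tickets closed by agents without rework
}

overall = sum(maturity_signals.values()) / len(maturity_signals)
print(f"AI adoption maturity: {overall:.0%}")  # 57%, instead of "200 licences"
```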
6. PR Review at Uniform Depth
The traditional rule is that every PR gets at least one review, sometimes two, regardless of what changed. This made sense when PR volume was bounded by human production speed. It does not survive AI-native PR volume, because reviewers cannot pay equal attention to ten times the changes.
The replacement is risk-tiered review. Low-risk PRs with full automated check passes go through a lighter review, sometimes auto-merging after a cooldown window. High-risk PRs get deep review. Senior engineer attention is allocated by risk, not by queue order. Teams that do this well find their senior engineers are doing more architecture work and less reviewing. The team's overall review depth goes up where it matters and down where it does not.
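A sketch of the routing logic, reusing the risk tiers from item 3. The four-hour cooldown is an assumption; set it to whatever window your team trusts.

```python
from datetime import timedelta
from enum import Enum


class RiskTier(Enum):  # same tiers as the item 3 sketch
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


COOLDOWN = timedelta(hours=4)  # assumption: tune per team


def review_path(tier: RiskTier, checks_green: bool, age: timedelta) -> str:
    """Route a PR to a review depth by risk tier, not queue order."""
    if tier is RiskTier.HIGH:
        return "deep review by a senior engineer"
    if tier is RiskTier.MEDIUM or not checks_green:
        return "standard single review"
    # Low risk, all automated checks green: light review, then auto-merge
    # once the cooldown window has passed without objection.
    if age >= COOLDOWN:
        return "auto-merge"
    return "light review, auto-merge after cooldown"
```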
7. The Single Engineering Manager Owning Everything
The EM job that includes context layer maintenance, agent stack ownership, verification infrastructure, AND people management, coaching, hiring, and stakeholder syncs is two jobs in one. Most EMs are doing both badly, because doing both well is more than one person should be expected to carry.
The replacement is splitting the role: EM for people and delivery, Engineering Lead for technical depth and the agent stack. Both report to the same manager. Both have equal seats at the leadership table. This split costs at most one promotion. It unlocks the technical leadership work that nobody currently has time to do.
8. Engineer-as-Implementation-Machine Job Descriptions
The job description for "senior engineer" most companies are still using describes someone who writes code, reviews code, and ships features. That description is for the pre-AI engineer. The AI-native senior engineer specifies work clearly enough that agents and other engineers can act on it, evaluates AI-generated output for system fit, maintains the parts of the context layer their domain covers, owns verification depth in their area, and yes, still writes code where judgment requires it.
The job description rewrite is overdue in almost every company. Do it this quarter. The right description will change who you hire and who you promote, which is what makes it the highest-leverage HR change available right now.
9. Quarterly Architecture Review
The quarterly architecture review meeting was an attempt to surface and approve significant technical decisions. It was inadequate even when it worked, because architecture decisions do not respect quarterly calendars. In an AI-native team, the cadence is even more wrong: significant technical decisions are being made weekly as the team's agent stack and context layer evolve.
The replacement is a continuous architecture decision record practice. Significant decisions get written down when they happen. Reviews are ad-hoc and lightweight. The quarterly meeting is retired. The decisions still get made, and they get documented better.
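If you want a concrete starting point, here is a sketch of a lightweight scaffold that writes one Nygard-style ADR file per decision. The directory path is an assumption; the Context/Decision/Consequences sections are the common convention.

```python
from datetime import date
from pathlib import Path

ADR_DIR = Path("docs/decisions")  # assumption: wherever your repo keeps docs

TEMPLATE = """# {n:04d}. {title}

Date: {today}
Status: proposed

## Context
What forces are at play? Why is a decision needed now?

## Decision
What we are doing, stated in full sentences.

## Consequences
What becomes easier, what becomes harder, what we are betting on.
"""


def new_adr(title: str) -> Path:
    """Create the next numbered decision record and return its path."""
    ADR_DIR.mkdir(parents=True, exist_ok=True)
    n = len(list(ADR_DIR.glob("*.md"))) + 1
    path = ADR_DIR / f"{n:04d}-{title.lower().replace(' ', '-')}.md"
    path.write_text(TEMPLATE.format(n=n, title=title, today=date.today()))
    return path
```

The point is not the tooling. It is that writing the decision down costs one function call instead of a quarterly meeting.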
10. Manual QA on Every Change
The full-time manual QA team running scripted test cases against every change was a rational response to "tests are expensive to write and maintain, so verify manually." That equation has flipped. AI tools can generate comprehensive test coverage faster than any human team can write it, and they can generate it as part of every change.
Manual QA still has a role in exploratory testing and in catching the things automated testing misses, but as the default verification path for every change, it is retired. The replacement is automated coverage that the team trusts, agent-generated where possible, with manual QA reserved for the high-value exploratory work where human judgment is the actual asset.
11. Verbose PRDs and Vague Tickets
The PRD got longer over the years as product managers added context that humans missed. The ticket got vaguer as engineers learned to fill in the gaps from their own judgment. Both directions were wrong, and both compound badly in AI-native execution.
The replacement is the agent-ready ticket: outcome statement, constraints, acceptance criteria specific enough to verify, and risk tier. This is shorter than the average PRD and clearer than the average ticket. The teams that have invested in this format are getting agent close rates that are not available to teams running on prose-heavy specifications.
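Here is the same format as a structured sketch. The field names mirror the four parts above; the example content is invented.

```python
from dataclasses import dataclass


@dataclass
class AgentReadyTicket:
    outcome: str                    # one sentence: the state of the world when done
    constraints: list[str]          # what the solution must not do or touch
    acceptance_criteria: list[str]  # specific enough for automated verification
    risk_tier: str                  # "low" | "medium" | "high"


ticket = AgentReadyTicket(
    outcome="Users can export their billing history as CSV from the account page.",
    constraints=[
        "No changes to the billing service's public API.",
        "Export must stream; do not load the full history into memory.",
    ],
    acceptance_criteria=[
        "GET /account/billing/export returns a CSV with one row per invoice.",
        "An account with zero invoices gets a header-only CSV, not an error.",
    ],
    risk_tier="low",
)
```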
How to Run the Retirement
Do not try to retire all eleven at once. The team will lose its rhythm before the new rhythm has formed. A reasonable sequence:
This quarter: pick three. The ones with the highest leverage in your specific team. For most teams, that is items 4 (velocity as primary metric), 6 (uniform PR review), and 7 (single EM owning everything). These three retirements unlock the conditions for the rest.
Next quarter: pick four more, with at least one ritual change (1, 2, or 9), one role change (7 or 8), and one artifact change (11).
Quarter after that: finish the list, with adjustments based on what worked. By the end of three quarters, the team is running on a different operating model. The transition was steady, not chaotic. The new shape held.
The retirement list is not a manifesto. It is an honest inventory of what most engineering orgs are still doing because they always have. The leaders who run this inventory and act on it will be operating on a fundamentally different system than the leaders who do not. That gap will widen over the next year. The question is whether you want to be on the right side of it.
The Audit That Most Engineering Leaders Have Not Run
There is one exercise I'd recommend every engineering leader do before the next planning cycle. Pull up the last quarter's calendar. Look at every recurring meeting. Look at every standing report. Look at every artifact your team produced more than once. For each one, ask: what was this for, and is the answer still true?
Some of them will be: yes, this is still load-bearing, keep it. Some of them will be: this was load-bearing two years ago and is now ceremonial, retire it. A few will be: I have no idea why we do this. Those are the cheapest wins on the list.
The exercise is not glamorous. It takes a focused half-day. The return is a calendar and a process inventory that reflect the team you actually have, running on the tools you actually use, doing the work that actually matters. That is the foundation on which the rest of the AI-native transition rests. Without it, the tools are an expensive overlay on a system that is fighting them.
Hiring more engineers will not fix this. New tools will not fix this. The fix is leadership willing to look at the inventory and act on it. The list above is a starting set.
I help engineering teams close the gap between "we use AI tools" and "AI actually changed how we deliver." Book a 20-minute call and I'll tell you where the leverage is.