Technical Debt Is a Risk Conversation
A practical framework for turning engineering pain into decision-ready technical debt proposals that managers, product partners, and tech leads can evaluate.
Most failed technical debt conversations do not fail because the debt is imaginary.
They fail because the proposal asks everyone else to translate engineering pain into delivery risk.
An engineer says, “The payment provider code is tightly coupled and hard to test.” A manager hears something fuzzier: “The code is ugly, and engineering would like time to make it nicer.” Product hears: “The feature roadmap is about to get more expensive, but I am not sure why.”
That translation gap is where good proposals go to die.
The work is real. The pain is real. But the decision is not ready yet.
Image Placeholder
Technical debt as a risk map
Create a hero illustration showing a code module connected to delivery risk, release confidence, team knowledge, and roadmap predictability.
The problem is not debt
Technical debt is not automatically bad.
Sometimes debt is exactly the right trade-off. If a team is still validating whether a feature should exist, a fast and slightly awkward implementation can be responsible engineering. Spending two extra weeks polishing an architecture for a feature nobody uses is not quality. It is expensive decoration.
Debt becomes a problem when it changes the economics of future work.
That shift usually shows up in ordinary symptoms:
- a feature that should take two days starts taking two weeks
- every release needs manual checking because the automated tests are not trusted
- one vendor integration leaks into unrelated parts of the system
- engineers avoid a module because nobody understands the side effects
- product cannot plan confidently because estimates become random
At that point, the technical issue has become a planning issue.
And planning issues are business issues.
Managers do not prioritize “clean code”
This can feel frustrating, but it is healthy.
A manager’s job is not to make the codebase beautiful. Their job is to decide priorities under constraints. They are usually thinking about the next quarter, release risk, predictability, team health, and the cost of saying yes to one thing instead of another.
So when we say:
This module has high coupling and weak test coverage.
we are asking them to perform a translation:
Does this delay a release? Does this increase production risk? Does this burn the team out? Does this block a business goal?
That translation is our job.
Not because managers are incapable of understanding engineering detail. Good ones often can. But because a prioritization conversation should not begin with a reverse-engineering exercise.
The more decision-ready the proposal is, the more likely the conversation becomes about trade-offs instead of taste.
Use the smallest useful framework
The most useful format I know is boring:
Problem. Impact. Proposal.
That is it.
The hard part is not remembering the three words. The hard part is refusing to hide behind engineering vocabulary when the actual decision is about risk.
1. Problem: one sentence, not an architecture review
The problem should be small enough to hold in your head.
Bad:
The provider integration violates separation of concerns, creates tight coupling between billing, authentication, onboarding, and plan management, and makes tests hard to write because provider logic is spread across multiple services.
This might be technically accurate. It is also too much for a planning conversation.
Better:
Our payment provider logic is spread across four parts of the system, so every plan change requires updates in multiple unrelated places.
Now the shape of the problem is obvious. There are multiple places. A plan change touches all of them. The risk is easier to see.
A good problem statement does not explain every implementation detail. It gives people the handle they need to pick up the conversation.
2. Impact: architecture is the cause, not the consequence
This is where engineers often lose the room.
We stop at “the architecture is bad.” But bad architecture is not the impact. It is the cause.
Impact explains what the technical issue does to users, the business, or the team. It is stronger when it is concrete:
- Delivery: make time visible. “Last quarter this added about six weeks of extra work across three billing-related changes.”
- Release risk: name the confidence problem. “Every plan change now requires manual regression testing because automated tests do not cover provider-specific edge cases.”
- Roadmap: connect it to an upcoming goal. “This creates a real risk that the enterprise plan slips because SSO and billing both depend on this module.”
- Team health: surface knowledge debt. “Two engineers are the only people comfortable changing this flow, which creates a delivery risk if either is unavailable.”
Notice what changed.
We are not hiding the technical problem. We are connecting it to something the organization can reason about.
3. Proposal: do not hand someone homework
A weak proposal sounds like this:
Let’s think about what to do with this.
That is not a request. That is homework for someone else.
A stronger proposal sounds like this:
I propose we spend one sprint isolating the payment provider behind a small internal interface. After that, the next plan iteration should take less than a week. We can validate this on the upcoming SSO plan change.
This is easier to evaluate because it has edges:
- scope: one sprint
- action: isolate the provider
- expected result: future plan iteration under one week
- validation: test the improvement on the SSO change
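For reference, “isolate the provider behind a small internal interface” usually means something like the sketch below. All names here are hypothetical (mine, not the proposal’s); the point is only to show why the isolated version becomes testable without manual regression passes.

```python
from abc import ABC, abstractmethod


class PaymentProvider(ABC):
    """The small internal interface. Billing code depends on this,
    never on the vendor SDK directly."""

    @abstractmethod
    def change_plan(self, customer_id: str, plan_id: str) -> None:
        ...


class StripeProvider(PaymentProvider):
    """Adapter: the one place allowed to know provider specifics."""

    def change_plan(self, customer_id: str, plan_id: str) -> None:
        # Vendor SDK calls would live here, and only here.
        raise NotImplementedError("calls the real provider in production")


class FakeProvider(PaymentProvider):
    """Test double: records calls instead of hitting a network."""

    def __init__(self) -> None:
        self.calls: list[tuple[str, str]] = []

    def change_plan(self, customer_id: str, plan_id: str) -> None:
        self.calls.append((customer_id, plan_id))


class BillingService:
    """Plan changes go through the interface, so tests inject a fake."""

    def __init__(self, provider: PaymentProvider) -> None:
        self.provider = provider

    def upgrade(self, customer_id: str, plan_id: str) -> None:
        self.provider.change_plan(customer_id, plan_id)


# A plan-change test no longer needs a real provider account:
fake = FakeProvider()
BillingService(fake).upgrade("cust_42", "enterprise")
assert fake.calls == [("cust_42", "enterprise")]
```

The design choice doing the work is dependency injection: because `BillingService` receives the interface rather than constructing a vendor client, the provider-specific edge cases stay in one adapter instead of leaking into four places.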
The manager may still say no. That is fine. There are always trade-offs.
But now the conversation is about whether one sprint is worth reducing a specific delivery risk. That is a much better conversation than “does clean code matter?”
Same facts, different framing
Different roles care about different kinds of risk. The facts should stay the same, but the framing should change.
| Audience | Usually cares about | Useful framing |
|---|---|---|
| Product | Delivery speed, feature sequencing, user value | “This affects when we can ship the next valuable feature.” |
| Engineering manager | Risk, predictability, team health, staffing | “This creates or removes a delivery risk.” |
| Tech lead | Maintainability, boundaries, scalability, technical risk | “This design increases future change cost in these specific areas.” |
With a tech lead, it is useful to talk about coupling, testability, boundaries, and system design. Still connect those details to impact.
With a manager, the central question is usually:
What risk does this create or remove?
With product, the central question is usually:
How does this affect when we can ship the next valuable thing?
This is ordinary communication. If I explain a database indexing issue to a user, I do not start with B-trees. I say, “The page is slow because the system has to scan too much data.”
Same idea here.
A concrete before and after
Imagine a billing system where payment provider logic has leaked everywhere.
The engineer version might sound like this:
The billing code is too coupled to Stripe. We need to abstract it because it is hard to test and hard to add new plan types.
This is not terrible. But it still asks the reader to do work.
A planning-ready version is sharper:

Problem: Our payment provider logic is spread across four parts of the system, so every plan change requires updates in multiple unrelated places. Impact: Last quarter this added about six weeks of extra work across three billing-related changes, and it puts the enterprise plan at risk because SSO and billing both depend on this code. Proposal: One sprint to isolate the provider behind a small internal interface, so the next plan iteration takes less than a week, validated on the upcoming SSO plan change.
Now people can evaluate the trade-off:
- Is one sprint worth reducing this risk?
- Is the enterprise plan important enough?
- Do we believe the estimate?
- Can we do a smaller version?
- What happens if we postpone it?
Those are the right questions.
Make the proposal small enough to believe
“Let’s refactor the billing system” is usually not a proposal. It is a fog machine.
It sounds expensive. It sounds open-ended. It sounds like the kind of thing that starts with optimism and ends with everyone crying into Jira tickets.
A believable proposal is usually smaller:
- one sprint to isolate one provider
- two days to add tests around the riskiest checkout flows
- one week to remove duplicate plan calculation logic
- one migration path for the next feature only
The goal is not to fix everything. The goal is to reduce a specific risk.
That distinction matters because technical debt work often fails when the finish line is “make the architecture better.” Nobody knows when that is done.
If the finish line is “make the next plan iteration take less than a week,” you have something measurable. Maybe you hit it. Maybe you do not. Either way, you learn.
Image Placeholder
From vague refactor to bounded risk reduction
Create a simple before/after diagram: a large fuzzy 'refactor billing' cloud on one side, and three small scoped risk-reduction cards on the other.
Do not hide the trade-off
Technical debt work is still work.
If we ask for one sprint, that sprint comes from somewhere. It may delay a feature. It may reduce short-term velocity. It may require help from QA or product.
A credible proposal names that cost:
This will likely delay the small admin reporting improvement by one sprint. I think it is worth it because the enterprise plan depends on the billing flow, and fixing this now reduces release risk for that work.
That is much stronger than pretending refactoring is free.
The question is not:
Should we care about quality?
The question is:
Is this the right quality investment at this moment?
Sometimes the answer is no.
If the feature may be deleted next month, maybe do not refactor it. If the system is used by five internal users and changes twice a year, maybe the ugly code is acceptable. If the company has a hard regulatory deadline, maybe the debt is real but still not the top priority.
Good engineering includes knowing when not to polish.
Knowledge debt is delivery risk too
There is another kind of debt that often hides behind technical debt: knowledge debt.
If only one engineer understands a critical system, the company has a risk. Not because that engineer did anything wrong. Often it happens because they are very good and keep solving problems quickly.
But if they are unavailable, the team slows down or gets stuck.
That is not a personal problem. It is a system problem.
Sometimes the best proposal is not “give me time to fix this myself.” It is:
I propose that another engineer implements this cleanup while I supervise and review the design. That reduces the single-person dependency and gives the team more coverage in this area.
This improves the system and the team at the same time.
Learning to delegate technical debt work is a senior engineering skill. Not because delegation sounds fancy, but because it reduces operational risk.
A practical exercise
Take one painful technical problem from your current project and write it in this format:
Prompt Set
Problem: [one sentence]
Impact: [what this means for users, business, delivery, or team health]
Proposal: [specific request with scope, cost, and expected result]
Then ask one question:
If someone reads this, will they understand why we should do it?
Not what we want to do.
Why.
If the answer is “I am not sure,” rewrite it. Even better, show it to someone on your team and ask, “Is it clear why this matters?”
The first draft will probably be clumsy. That is normal. The goal is not to write a perfect business case. The goal is to practice translating engineering pain into product and business impact.
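If it helps to keep the three parts separate while drafting, the format is small enough to sketch as a data structure. This is purely illustrative, not an existing tool; the class and field names are mine.

```python
from dataclasses import dataclass


@dataclass
class DebtProposal:
    problem: str   # one sentence, not an architecture review
    impact: str    # what this does to users, business, delivery, or team health
    proposal: str  # specific request: scope, cost, expected result, validation

    def render(self) -> str:
        """Format the three parts in the order a reader needs them."""
        return (
            f"Problem: {self.problem}\n"
            f"Impact: {self.impact}\n"
            f"Proposal: {self.proposal}"
        )


draft = DebtProposal(
    problem="Payment provider logic is spread across four parts of the system.",
    impact="Every plan change touches multiple unrelated places; "
           "about six weeks of extra work last quarter.",
    proposal="One sprint to isolate the provider behind a small interface, "
             "validated on the upcoming SSO plan change.",
)
print(draft.render())
```

The structure forces the discipline the exercise asks for: you cannot fill in `proposal` without noticing whether `impact` actually explains why anyone should care.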
The checklist
When you want to prioritize technical debt, try this:
- Do not start with implementation details.
- Explain the problem in one sentence.
- Connect it to user, business, delivery, or team impact.
- Make a concrete request.
- Keep the scope small enough to believe.
- Include how you will validate the result.
- Be honest about what work gets delayed.
- Change the framing for the audience without changing the facts.
This does not guarantee approval.
But it changes the conversation from:
Engineering wants time to clean code.
To:
We can spend one sprint to reduce a specific delivery risk for an important release.
That is the conversation technical debt deserves.
The work starts before the refactor. It starts when we make the risk clear enough that other people can make a good decision.
Further reading
Related notes that continue the same thread.
I Stopped Asking the LLM to Remember Everything
A practical story about replacing full-history prompting with a small state machine for faster, calmer, more predictable LLM conversations.
AI Doesn’t Make You Learn Faster. It Changes What You Learn.
Anthropic’s research points to a useful distinction: AI can improve output while weakening skill formation when you are still learning the domain.