Tech Debt and Pragmatism
The judgment test. Can you ship fast when speed matters, say no to over-engineering, AND make the case for paying down debt when it's earning interest - or do you have a single mode?
About this theme
What interviewers are evaluating
- →Do you have all three modes (ship fast, decline gold-plating, pay down debt), or just one?
- →Can you articulate the heuristic you used to decide which mode the situation called for?
- →When you shipped fast and rough, did you flag the debt explicitly and track it - or did you abandon it?
- →When you declined to over-engineer, did you make the case with concrete reasoning about future cost - or just say no?
- →When you paid down debt, did you justify the investment with concrete data, or did you just rewrite because the code was ugly?
- →Did you respect the prior engineers' decisions when paying down debt, or assume they were idiots?
- →At senior+ levels: did you generalize the heuristic into a team norm or planning input?
- →Did the trade-off you made hold up over time, or did the debt become a real problem you had to revisit?
Common prompts
Variations on these are asked at every level. Have a story pre-loaded for at least three of them.
- ?Tell me about a time you shipped something rough on purpose to hit a deadline.
- ?Describe a situation where you declined to over-engineer something. How did you make the call?
- ?Walk me through a time you justified paying down significant tech debt.
- ?Tell me about a time you disagreed with a teammate about whether to ship fast or do it right.
- ?Describe a time the rough version you shipped came back to bite you.
- ?Walk me through how you decide between the quick fix and the proper fix.
- ?Tell me about a time you inherited a system with significant debt. What did you do?
- ?Describe a time you had to push back on a refactoring proposal that wasn't worth doing.
Sample STAR answers
Both strong and weak examples, with notes on what makes each work (or fail). Read the weak examples carefully - the patterns they show up are the ones interviewers are trained to spot.
Strong: Shipped a deliberately rough v1, paid down later
- Situation
- I was the lead engineer on a fraud-rule engine for a payments company. We had a 5-week window to ship something before a known fraud-ring escalation hit the platform. The 'right' design was a config-driven rule engine with hot-reload, audit log, and dry-run testing. The estimated build time was about 10 weeks.
- Task
- I had to decide between the right design at 10 weeks (which would miss the fraud window and likely cost us $300-500K in fraud losses) and a rougher version that could land in 5. The temptation in both directions was real: my tech lead instinct wanted the right design; the deadline wanted the hack.
- Action
- I sketched the rough version explicitly, then made the case for it in writing. The rough version: rules were hardcoded in a single file, deploy-required to update, no audit log (just git history), no dry-run mode. To compensate, I designed three guardrails: (1) Every rule had an explicit 'enabled' boolean and a 'shadow_mode' boolean, so we could push a rule live in shadow first and check the metrics before flipping enabled. (2) Every rule emitted a structured log event we could query - that gave us audit-via-query even without a proper audit log. (3) I wrote a one-page 'this is the rough version, here's what's missing, here's the deferred work' doc and added it to the on-call runbook so the next engineer would understand the seams. I shared the proposal with my EM and the security review. Security pushed back on no-audit-log; I walked them through the structured-log approach and showed a query that produced an audit-equivalent view. They accepted it as an interim posture with the deferred work tracked. I shipped in 4.5 weeks. The fraud-ring detection went live in shadow on day 32, we validated the metrics for 3 days, and turned it on day 35. We blocked an estimated $420K of fraud in the first month. Six months later, I came back and built the full rule engine - now with real load data, real query patterns, and three engineers' worth of operational experience telling us which abstractions actually mattered. The v2 was about 40% smaller in scope than what I'd originally planned because I'd learned which features didn't matter.
- Result
- Direct outcome: we hit the fraud window and blocked $420K of fraud in month one. The deferred work was tracked, paid down, and the v2 shipped 7 months in. The deeper outcome: shipping the rough version with explicit guardrails (shadow mode, structured logs, deferred-work doc) meant the debt was visible and bounded, not invisible and growing. I learned that the difference between 'shipped fast' and 'shipped fast and accumulated debt I'll regret' is whether the seams are documented and the deferred work is tracked. I now write a 'what's deferred' note as part of any rushed launch - it's the single highest-leverage habit I have around tech debt.
What makes this strong: (1) The candidate had a clear case for shipping rough (the fraud window), not just 'we were under pressure.' (2) Designed compensating guardrails (shadow mode, structured logs) instead of just cutting corners blindly. (3) Engaged with the security pushback substantively. (4) Tracked the deferred work explicitly - the v2 actually got built, it didn't become permanent. (5) The v2 was 40% smaller because the candidate had learned what mattered - that's the senior pattern of using the rough version to inform the right one. (6) Concrete outcome ($420K fraud blocked). (7) Generalizable habit (the 'what's deferred' note). (8) Both modes - shipped fast AND came back for the right version - in a single story.
Strong: Declined a teammate's over-engineering proposal
- Situation
- On a 4-person backend team, a senior peer proposed building an event-sourced architecture for a new feature - tracking subscription state for a B2B SaaS product. We had ~200 enterprise customers and a relatively simple state machine (active, paused, canceled, with a few transitions). The proposed design included an event store, projections, snapshots, and a CQRS read path.
- Task
- I was the tech lead. The peer was credible and had used event sourcing well at a previous company. The proposal wasn't crazy - it would generalize. But my read was that we were 200 customers in, our state transitions were boring, and the projected complexity would slow every future feature. I had to push back without dismissing his thinking.
- Action
- I did three things. First, I asked him to walk me through the cases where event sourcing would pay off. He named four: audit trail (real but solvable other ways), debugging via replay (real but rare), bitemporal queries (we had no requirement), and 'we'll need it eventually as we scale.' I took those seriously - not as 'wrong' but as 'how likely, how soon, what's the cost of not having it.' Second, I did the cost math. Implementing event sourcing for this state machine would add roughly 3 weeks to the initial build, ~30% more lines of code to maintain per feature touching subscription state, and a non-trivial onboarding load on new engineers (event sourcing is unfamiliar to most). I also pulled our roadmap - we had 6 subscription-state-touching features in the next 4 quarters. Third, I wrote a one-pager: 'Event sourcing for subscriptions: when it pays off and when it doesn't.' I named the four use cases he'd raised, gave each a probability and a deferred-cost estimate (e.g., 'audit trail - we'll need this in 12-18 months for SOC2 - cost of retrofitting is ~2 weeks; cost of building now is ~3 weeks plus ongoing complexity tax'). I proposed a middle path: ship a normal CRUD design now, add an append-only state-change log table (1 day of work) that gives us audit and replay if we need them, and revisit event sourcing at the point we hit a real bitemporal or replay-debugging need. I shared the doc with him before posting it for the team. He pushed back on one point - my replay-debugging probability was too low. We went back and forth on it; he was partly right, but the conclusion didn't change because the cost of replay-debugging without event sourcing was 'occasional manual reconstruction,' not 'impossible.' He came around. We shipped the CRUD design plus the append-only log.
- Result
- Roughly 18 months later, we'd shipped 9 subscription-touching features. None of them required event sourcing. The append-only log was used twice for audit-style queries and once for a tricky bug investigation - the equivalent of 'replay-style' debugging at a fraction of the complexity. SOC2 happened, and the audit log met the requirement. We never did add event sourcing - the bitemporal use case never materialized. The peer told me later he thought I'd been right; the team had moved faster than his previous company on subscription work, and he attributed about half of that to not having the event-sourcing tax. The deeper lesson: 'we'll need it eventually' is the most expensive phrase in engineering. The right response is 'when we need it, we'll know specifically why - until then, leave a seam (the append-only log) and stay simple.'
What makes this strong: (1) Took the peer's reasoning seriously and walked through it case by case, didn't dismiss. (2) Did real cost math (lines of code, onboarding load, roadmap touch frequency). (3) Proposed a middle path with a cheap seam (append-only log) that captured most of the option value without the tax. (4) The decision held - 18 months later the simpler design was still right. (5) The peer's pushback got engaged with substantively. (6) Generalizable lesson framed sharply ('we'll need it eventually' is the most expensive phrase). (7) Doesn't dismiss event sourcing as a pattern - just says it's wrong here. That's pragmatism, not anti-rigor.
Weak: Refactored because the code was ugly
- Situation
- I joined a team and the code was really messy. There were tons of inconsistencies and the patterns weren't clean.
- Task
- I wanted to clean it up to make it more maintainable.
- Action
- I spent about 6 weeks refactoring the codebase. I introduced a cleaner architecture, standardized patterns, and improved test coverage. There was some pushback but I made the case that the long-term maintainability was worth it.
- Result
- The codebase was much cleaner and easier to work in. The team appreciated the improvements.
Why this is weak: (1) No business case - 'the code was messy' is aesthetic, not a justification for 6 weeks of work. (2) No specifics about what the debt was actually costing the team in real terms (incidents, velocity, onboarding time, bugs). (3) 'Some pushback but I made the case that long-term maintainability was worth it' is exactly the answer interviewers screen against - it's the over-engineer's instinct dressed up as judgment. (4) No measurement of whether the refactor actually paid off - 'cleaner' is not a metric. (5) Treats the prior engineers' code as obviously bad without engaging with why they made the choices they did. The strongest version would name what the debt was costing in dollars or velocity, what the candidate considered before deciding to pay it down, and how they measured whether the investment paid off.
Common pitfalls
- ×Single-mode stories. If all your tech-debt stories are 'I cleaned up someone else's mess,' you read as the engineer who refactors but doesn't ship. If they're all 'I shipped fast,' you read as the engineer who leaves disasters.
- ×Aesthetic-driven refactors. 'The code was ugly' is not a business case. Strong debt-paydown stories have a real cost (incidents, velocity, onboarding, bugs).
- ×No deferred-work tracking. Shipping fast without flagging the debt explicitly is the difference between 'pragmatism' and 'leaving a mess.'
- ×Over-engineering disguised as best practices. 'I introduced a cleaner architecture' is exactly the language interviewers screen against.
- ×Disrespecting prior engineers. Stories that frame inherited code as obviously bad reveal a candidate who'll be hard to work with. Strong stories engage with why the prior choices were made.
- ×No measurement. 'The codebase was cleaner' or 'we shipped faster' without a number reads as opinion, not signal.
- ×'We'll need it eventually' framing. Justifying complexity with possible future needs is the textbook over-engineer's argument; interviewers know it.
- ×Stories where the rough version became permanent debt and the candidate doesn't acknowledge it. Mature pragmatism includes the cases where the debt outlived its justification.
Follow-up strategies
Interviewers will probe. Be ready for the follow-up questions that test the depth of your story.
- →If asked 'how do you decide between the quick fix and the proper fix?' - have a heuristic. 'Reversibility plus blast radius' or 'how much downstream code depends on getting this right' are concrete frames.
- →If asked 'how do you justify debt paydown to non-engineers?' - have a translation. Velocity tax (X% slower on every feature touching this), incident frequency, onboarding cost are all stakeholder-readable.
- →If asked 'have you ever shipped fast and regretted it?' - have one. 'I shipped without instrumenting and we couldn't tell when the bug started' is credible.
- →If asked 'have you ever over-engineered and regretted it?' - have one. The strongest answer is a real story where simpler would have been right.
- →If asked 'how do you balance debt paydown against feature work?' - have a specific approach. '20% time for debt' or 'every Nth sprint is a debt sprint' are concrete; 'we balance them carefully' is filler.
- →If asked 'what's the right amount of test coverage for a rough v1?' - have an opinion. 'High coverage on the boundaries (input validation, output contracts) and low coverage on the internals' is a strong frame.
- →If asked 'how do you respect prior engineers' decisions when paying down debt?' - have a practice. 'I read the original PR and its discussion before refactoring' is concrete.
- →If asked 'what's the worst tech debt you've ever seen?' - the question is whether you'll dunk on prior teams or engage with the trade-offs that produced it. Engage.
Related behavioral themes
Bias for Action
Amazon LPSpeed matters. But the principle is reversible-vs-irreversible reasoning, not 'I work fast.' Get this distinction wrong and the answer reads as reckless.
Dive Deep
Amazon LPLeaders operate at all levels. The interviewer is testing whether you actually understand your own systems - or whether you summarize what your team built.
Missed Deadline / Incident
GeneralThe honesty test. Can you own a missed commitment or production incident specifically and without flinching - or do you blame the team, the requirements, or the on-call rotation?
Companies that test this theme
Practice these stories live
Reading STAR answers is the floor. The interview signal is in delivering them out loud, with follow-ups, under pressure. The AI mock interview probes your stories the way real interviewers do.
Start an AI mock interview →