AI Coding Agents Are Changing What “Hireable” Means
Inside hiring loops, a new kind of competence is starting to separate candidates

Last week Business Insider quoted Andrew Hsu saying, “Software engineering has changed so much that we’ve changed our hiring process.”
A little later he added, “All the traditional tech screen coding questions are gone entirely.”
I saw the screenshot, had the reaction I usually have to that genre, and closed the app.
I have been in tech long enough to know how often a local workflow gets dressed up as destiny. One company changes a loop, finds the right audience, says it loudly, and suddenly half the industry treats it like a labor market bulletin. Most of that stuff is branding with good timing.
Then Monday morning I was in a hiring debrief, and the whole thing felt less theoretical.
We were discussing a candidate who would have looked solid in a very familiar way a year ago: good background, decent systems work, clear communicator, sensible interview overall. The profile was easy to picture. Bring them in, give them a reasonable slice of the roadmap, expect steady output.
One interviewer had asked about coding agents in day-to-day work.
The answer floated. Boilerplate, rough drafts, repetitive tasks, the usual language about speeding up the boring parts. Nothing broken in the answer, just no weight in it. No real examples, no clear account of where the tool had gone sideways, widened a small change, or turned a simple task into something expensive to review.
A second candidate came up a few minutes later. Rougher around the edges, better answer by a mile. They described using an agent to draft a refactor, then cutting most of it because the generated version had crossed a transaction boundary and pulled validation logic into the wrong layer. They mentioned a test file that looked thorough until they noticed the mocks had removed the retry path that actually mattered. They talked about reading generated diffs for state changes, side effects, ownership, observability, rollback shape.
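If that second answer sounds abstract, here is the shape of the test problem in miniature. A sketch only, with invented names rather than anyone's actual diff: the stub never fails, so the retry behavior the team actually depends on is never exercised, and the file still looks thorough.

```
# Hypothetical names throughout; the pattern is the point.

def fetch_invoice(transport, invoice_id, retries=1):
    """Call a flaky upstream service, retrying once on a transient failure."""
    for attempt in range(retries + 1):
        try:
            return transport(invoice_id)
        except ConnectionError:
            if attempt == retries:
                raise


def test_fetch_invoice_returns_payload():
    # The generated stub succeeds on the first call every time, so the
    # retry loop above never runs. Green suite, untested failure path.
    def transport(invoice_id):
        return {"id": invoice_id, "total": 100}

    assert fetch_invoice(transport, 42)["total"] == 100
```

A stronger test would make the stub fail once, then assert that the call still succeeds and that the transport was invoked twice.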
At that point the weekend post landed differently. Still too neat, still too pleased with itself. Underneath it, though, there was a real pressure I recognized immediately from inside the job.
What Changed in Practice
The public version of this conversation still gets pulled toward the wrong drama.
One side wants amazement: look how much code the machine can produce. The other side wants purity: real engineers should keep their hands clean and stay loyal to the old shape of the work. Neither view is very useful once you are actually staffing a team.
The real change is narrower, more operational, more annoying.
A lot of work that used to function as a decent proxy for engineering strength now comes cheaper than it used to. Scaffolding, repetitive backend changes, first-pass tests, migration drafts, internal scripts, integration glue, the kind of implementation work that used to tell you whether somebody could move without needing constant rescue. Agents can cover some of that surface now. Sometimes well enough to help. Sometimes in a way that pushes cost into review and leaves the team to discover the invoice later.
So when a candidate tells me they are fast, I need more than the word.
Fast through what.
Fast with what level of verification.
Fast in a way that preserves the shape of the system, or fast in the way a model is fast: one plausible answer after another, each one carrying a little hidden damage for somebody else to find.
That split carries a lot more weight now than raw throughput on its own.
What Stronger Candidates Sound Like
The answers I trust are usually less polished and more specific.
They come with edges. Actual failure. A little scar tissue.
Stronger candidates talk about generated code in the language of systems, not vibes. Permission checks that slipped into the wrong helper. Retry logic that looked covered until somebody followed the real path. Migrations that technically ran but said nothing useful about rollback, lock contention, partial backfill, or what happens halfway through deployment when a worker dies and leaves uneven state behind. Refactors that reduced duplication on paper while flattening ownership boundaries that had been there for a reason.
That kind of answer tells me the person has spent time in the expensive part of the loop.
One candidate recently said they are happy to use an agent for serializer churn, thin client expansion, broad test scaffolding, maybe the first pass on a repetitive integration. Billing paths, permission evaluation, concurrency-sensitive code, retry-heavy jobs, anything near idempotency, they keep tighter. Good answer. Somebody has clearly had to unwind a generated patch that looked tidy and behaved badly under real traffic.
Another candidate described how they review AI-generated tests: remove the implementation mentally, then ask whether the test would still fail for the bug they actually care about. Also good. That is a person who has seen how often machine-written tests verify the model’s assumptions instead of the system’s behavior.
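That heuristic is easy to demonstrate. A minimal sketch, invented names again, not that candidate's code: the first test stubs out the permission check, so deleting the check entirely would leave it green; the second would still fail for the bug that matters.

```
from unittest import mock


def can_export(user, report):
    """The invariant under review: only the owner may export a report."""
    return user.id == report.owner_id


def export_report(user, report, checker=can_export):
    if not checker(user, report):
        raise PermissionError("not the owner")
    return {"report_id": report.id, "exported_by": user.id}


def test_export_allows_owner():
    # Verifies the model's assumption, not the system: the real check is
    # replaced, so removing it from export_report changes nothing here.
    user = mock.Mock(id=1)
    report = mock.Mock(id=7, owner_id=1)
    assert export_report(user, report, checker=lambda u, r: True)["report_id"] == 7


def test_export_rejects_non_owner():
    # Would still fail if the permission check disappeared.
    user = mock.Mock(id=2)
    report = mock.Mock(id=7, owner_id=1)
    try:
        export_report(user, report)
    except PermissionError:
        pass
    else:
        raise AssertionError("non-owner export should have been rejected")
```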
This is the level I care about now.
Code appearing quickly is the easy part. The more expensive skill shows up right after: judgment, suspicion, boundary control, the ability to say no to a patch that looks respectable and still weakens the system.
The Hiring Signal Got Dirtier
The market is going to make a mess of this for a while.
Weak engineers can package themselves much better than before. Fuller repos, tighter demos, cleaner take-homes, more branch activity, more visible output, more confidence in the first thirty minutes. Even the mistakes arrive with nicer formatting now. That makes the screening layer noisier. Hiring teams that already struggle to distinguish polished from durable are not having an easier time.
Then the person joins, and the bill starts showing up in ordinary places.
Review takes longer. Scope drifts. Generated work arrives in plausible shapes that need line-by-line scrutiny. The team pays in attention for speed that looked free during the interview loop. Senior engineers start spending time stripping abstractions back down to system size, fixing tests that went green by asserting less, cleaning up helper layers that make code feel tidier while making the execution path harder to reason about.
That is why I barely react anymore when someone says they use AI heavily.
I want to know where they stop trusting it.
I want to hear which invariants they check after a generated change touches a boundary that matters.
I want to know whether they have any instinct for containment, any reflex at all for cutting a patch back down before it starts exporting cleanup cost to the rest of the team.
Those answers tell me much more than tool fluency.
Mid-Level Is Where the Shift Gets Expensive
The sharpest pressure is landing around mid-level engineers.
A solid mid-level engineer used to build trust through steady implementation: pick up tickets, handle the rough edges, write tests that mean something, merge changes without creating a wake of secondary problems for somebody more senior to sort out later. That still matters. It just no longer separates people the way it used to.
Straightforward implementation is cheaper now. The costly part sits one layer above it: preserving judgment while the output comes faster.
You can see the divide clearly in practice. One engineer uses the tool to clear routine work while keeping boundaries intact, preserving state transitions, and protecting the operational shape of the system. Another fills the branch with generated volume and leaves review holding the bag: blurred ownership, weaker logging, duplicate side effects, wider abstractions, generalized patterns that ignore the constraints of the codebase they landed in.
From ten feet away, both can look productive.
Inside the system, they are completely different hires.
And yes, this is exactly where experience starts to matter. A tech lead can hear the difference before any dashboard catches it. It comes through in the verbs people use, the risks they bring up unprompted, the categories they have in their head when they talk about generated work. Stronger engineers mention observability, migration safety, rollback, caching semantics, retry behavior, permission paths, test isolation. Weaker ones say the code looked fine.
That answer used to pass more often than it should have. It is aging badly.
The Entry Ramp Is Getting Worse
The part I like least sits at the bottom of the ladder.
A lot of junior engineers learned the job through repetitive features, maintenance work, low-risk fixes, coverage, internal tooling, endpoint extensions, the long unremarkable stretch of software work that teaches you where systems resist and how easily a “simple change” turns into a longer afternoon. Nothing prestigious there. Plenty of value in it.
That layer is getting squeezed.
Some of the work disappears into tooling. Some survives in a more review-heavy form. Some gets pushed upward into workflows where a junior is expected to supervise generated output before they have enough reps to know what deserves suspicion.
That is a rough trade.
The market starts wanting signs of judgment earlier, while some of the work that used to help build judgment gets thinner. Then everybody acts surprised when entry-level hiring feels harsher and stranger. The advice aimed at younger engineers still sounds older than the job itself: practice the classic screen, build a few projects, sharpen the visible ritual. Fine, but the gate moved. Teams are looking for maturity around review burden, boundary control, containment, tool skepticism, and they are looking for it sooner.
Those muscles still come from time in real systems, around real failures, with stronger engineers correcting your read. There is no prompt shortcut for that.
What I Listen For Now
I still care about fundamentals: system sense, debugging, tradeoffs, clear thinking under pressure, all the old ingredients that kept software from turning into a landfill.
My follow-ups changed.
A generated change that got killed before merge says a lot. So does a patch that passed and still violated an invariant somebody cared about. I pay attention when a candidate can explain where they keep the tool away from first drafts, how they review an AI-generated migration, why a supposedly faster workflow ended up slower once the rollback risk and verification burden were counted honestly.
The answers I trust get concrete very quickly: stale cache keys, swallowed exceptions, duplicate side effects, brittle mocks, permission gaps, partial writes, logging that looked adequate until there was an actual incident, schema assumptions smuggled into a helper because the model likes to tidy things it does not understand.
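One of those is worth showing, because it turns up so often in generated helpers. A sketch with invented names: the exception is swallowed, the caller sees success, and the only trace of the partial write is a warning nobody is paging on.

```
import logging

logger = logging.getLogger(__name__)


def sync_order(db, payments_api, order):
    db.save(order)                      # first write lands
    try:
        payments_api.capture(order.id)  # second step fails under real traffic
    except Exception:
        # Swallowed: the order is now half-synced, and this warning line is
        # the only evidence until an incident forces someone to go looking.
        logger.warning("capture failed for order %s", order.id)
```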
That is engineering language. That is somebody who has been in the work.
“AI-native” tells me almost nothing.
“Here’s where I stop the tool from crossing a boundary it does not understand” tells me a lot.
What Looks Hireable From Inside a Team
From inside a team, hireability has a different shape than the internet version.
I still want somebody who can build.
I also want somebody who can read through machine output without getting seduced by speed, formatting, or volume. Somebody who notices when the architecture is being softened one plausible patch at a time. Somebody who knows when the green test suite is covering for a bad change. Somebody who can use an agent for leverage without turning the rest of the team into cleanup staff.
The deeper hiring question now sits closer to control than to mere tool adoption.
The post from last week got attention because it sounded bold. Fine. The more interesting version happens in a debrief room when two candidates look similar from a distance, then one of them starts talking in the language of invariants, rollback, scope control, test quality, review burden, and the other one does not.
The conversation gets much more honest right there.
You are listening for whether the person can keep the quality bar in place once the tools start producing plausible work at speed.
