
Comment sections on AI threads tend to split into "we're all cooked" and "AI is useless." I'd like to cut through the noise and learn what's actually working and what isn't, from concrete experience.
If you've recently used AI tools for professional coding work, tell us about it.
What tools did you use? What worked well and why? What challenges did you hit, and how (if at all) did you solve them?
Please share enough context (stack, project type, team size, experience level) for others to learn from your experience.
The goal is to build a grounded picture of where AI-assisted development actually stands in March 2026, without the hot air.
Haven't seen this mentioned yet, but the worst part for me is that a lot of management LOVES to use Claude to generate 50-page design documents, PRDs, etc., and send them to us to "please review as soon as you can". Nobody reads them, not even the people making them. I'm watching some employees just generate endless slide decks of nonsense and then waffle when asked any specific questions. If any of that is read, it is by other people's Claude.
It has also enabled a few people to write code or plan out implementation details who haven't done so in a long (sometimes decade or more) time, and so I'm getting some bizarre suggestions.
Otherwise, it really does depend on what kind of code. I hand write prod code, and the only thing that AI can do is review it and point out bugs to me. But for other things, like a throwaway script to generate a bunch of data for load testing? Sure, why not.
I've been tasked with reviewing code written by the Claude chatbot (not Claude Code, which has RAG and can browse the file system). It always lacks any understanding of our problem area, 75% of the time it only works for a specific scenario (the prompted case), and almost 100% of the time, when I comment about this, I'm told to take it over and make it work... and to use Claude.
I've kind of decided this is my last job, so when this company folds or fires me, I'm just going to retire to my cabin in the rural Louisiana woods, and my wife will be the breadwinner. I only have a few 10s of thousands left to make that home "free" (pay off the mortgage, add solar and batteries, plant more than just potatoes and tomatoes).
Though, post retirement, I will support my wife's therapy practice, and I have a goal of silly businesses that are just fun to do (until they aren't), like my potato/tomato hybrid (actually just a graft) so you can make fries and ketchup from the same plant!
>so you can make fries and ketchup from the same plant!
We should be friends. I like your ideas.
You got any land down there? I would like to be close to you and, post retirement, eat said french fries daily.
I'll keep that in mind when marketing. I was going to go with French Fry Tree.
Name to consider: twoatos (pot- and tom-)
That sounds lovely. I think too many people get attached to the structure of life as they've lived it for the last n years and resist natural phase transitions for far too long. Good luck with retirement and your dream of being the botanical equivalent of the mean kid from Toy Story:p
I noticed that what previously would take 30 minutes now takes a week. For example, we had a performance issue with a DB. Previously I'd just create a GSI (global secondary index); now there is a 37-page document with explanation, mitigation, planning, steps, reviews, risks, deployment plan, obstacles and a bunch of comments. But sure, it looks cool and very professional.
I'm now out of the workforce and can’t even imagine the complexity of the systems as management and everyone else communicate plans and executions through Claude. It must already be the case that some codebases are massive behemoths few devs understand. Is Claude good enough to help maintain them and help devs stay on top of the codebase?
The code is fine. Strong reviews help, and the fact that we're slower because of all the slop communication also helps.
I quit my last job because of this. I’m pretty sure my manager was using free ChatGPT with no regard for context length too, because not only was it verbose, it was also close to gibberish. Being asked to review urgently and estimate deadlines got old real fast.
I've definitely seen this. I have a theory as to how this kind of thing would actually affect AI predictions, since people seem to focus only on the pure productivity-enhancing effects of AI and discount the fact that a large portion of work was never productive to begin with...
If you shove clearly AI generated content at me, I will use an AI to summarize it.
Or I'll walk up to your desk and ask you to explain it.
Jump straight to the second option. You have to presume that the content they sent you has no relation whatsoever to their actual understanding of the matter.
Be prepared for "I asked Claude and it said: ...". At some point you will just ask Claude via a microphone.
We all use Claude at my work and I have a very strict rule for my boss and my team: we don’t say “I asked Claude”. We use it a lot, but I expect my team to own it.
I actually think there’s almost an acceptable workflow here of using LLMs as part of the medium of communication. I’m pretty much fine with someone sending me 500 lines of slop with the stated expectation that I’ll dump it into an LLM on my end and interact with it.
It’s the asymmetric expectations—that one person can spew slop but the other must go full-effort—that for me personally feels disrespectful.
I also don't mind that. Summarized information exchange feels very efficient. But for sure, it seems like a societal expectation is emerging around these tools right now - expect me to put as much effort into consuming data as you did producing it. If you shat out a bunch of data from an LLM, I'm going to use an LLM to consume that data as well. And it's not reasonable for you to expect me to manually parse that data, just as well as I wouldn't expect you to do the same.
However, since people are not going to readily reveal that they used an LLM to produce said output, it seems like the most logical way to do this is just always use an LLM to consume inputs, because there's no easy 100% way to tell whether it was created by an LLM or a human or not anymore.
Concept -> LLM fluff -> LLM summary -> Recipient
This kinda risks the broken telephone problem, or what happens when you translate from one language to another and then again to another: context and nuance are always lost.
Just give me the bullet points, it's more efficient anyway. No need to add tons of adjectives and purple prose around it to fluff it up.
Some day someone brilliant will discover the idea of "sharing prompts" to get around this issue. So, instead of sending the clean and summarized LLM output, you'll just send your prompt, and then the recipient can read that, and in response, share their prompt back to the original sender.
A true prisoners dilemma!
I think we'll eventually move away from using these verbose documents, presentations, etc for communication. Just do your work, thinking, solving problems, etc while verbally dumping it all out into LLM sessions as you go. When someone needs to be updated on a particular task or project, there will be a way to give them granular access to those sessions as a sort of partial "brain dump" of yours. They can ask the LLM questions directly, get bullet points, whatever form they prefer the information in.
That way, thinking is communication! That's kind of why I loved math so much - it felt like I could solve a problem and succinctly communicate with the reader at the same time.
That sounds intriguing. LLM as moderator or coordinator or similar.
If you write 3 bullet points and produce 500-pages of slop why would my AI summarise it back to the original 3 bullet points and not something else entirely?
It won't, and that's the joke. They will write three bullet points, but their AI will only focus on the first two and hallucinate two more to fill out the document. Your AI will ignore them completely and go off on some unrelated tangent based on one of the earlier hallucinations. Anthropic collects a fee from both of you and is the only real winner here.
is this better than normal communication in any way, or just not much worse?
> It’s the asymmetric expectations—that one person can spew slop but the other must go full-effort—that for me personally feels disrespectful.
This has always been the case. Have some junior shit out a few thousand lines of code, leave, and leave it for the senior cleanup crew to figure out what the fuck just happened...
There's a discussion going on that if you use an LLM to generate code, should the prompts (and related stuff) be a part of the pull request.
If you shove content at me that I even suspect was AI generated I will summarily hit the delete button and probably ban you from sending me any form of communication ever again.
It's a breach of trust. I don't care if you're my friend, my boss, a stranger, or my dog - it crosses a line.
I value my time and my attention. I will willingly spend it on humans, but I most certainly won't spend it on your slop when you didn't even feel me worth making a human effort.
I highly recommend you let your dog use LLMs. They have trouble composing long messages on human-centric keyboards.
I've found in my (admittedly limited) use of LLMs that they're great for writing code if I don't foresee a need to review it myself either, but if I'm going to be editing the code myself later I need to be the one writing it. Also LLMs are bad at design.
Master Foo and the Programming Prodigy: https://catb.org/~esr/writings/unix-koans/prodigy.html
What code do you write that you don't need to maintain/read again later?
For me it's throwaway scripts and tools. Or tools in general. But only simple tools that it can somewhat one-shot. If I ever need to tweak it, I one-shot another tool. If it works, it's fine. No need to know how it works.
If I'm feeling brave, I let it write functions with very clear and well defined input/output, like a well established algorithm. I know it can one-shot those, or they can be easily tested.
But when doing something that I know will be further developed, maintained, I mainly end up writing it by hand. I used to have the LLM write that kind of code as well, but I found it to be slower in the long run.
Definitely a lot of one-shot scripts for a given environment... I've started using a run/ directory for shell scripts that will do things like spin up a set of containers defined in a compose file.. build and test certain sub-projects, initialize a database, etc.
For the most part, many of them work the first time and just continue to do so to aid a project. I've done similar in terms of scaffolding a test/demo environment around a component that I'm directly focused on... sometimes similar for documentation site(s) for gh pages, etc.
Some things have gone surprisingly well.
> Also LLMs are bad at design.
I've found that SoTA LLMs sometimes implement / design differently (in the sense that "why didn't I think of that"), and that's always refreshing to see. I may run the same prompt through Gemini, Sonnet, and Codex just to see if they'd come up with some technique I didn't even know to consider.
> don't foresee a need to review it myself either
On the flip side, SoTA LLMs are crazy good at code review and bug fixes. I always use "find and fix business logic errors, edge cases, and api / language misuse" prompt after every substantial commit.
Obviously you should also use Claude to consume those 50 pages. It sounds cynical, but it's not. It's practical.
What I've learned in 2 years of heavy LLM use - ChatGPT, Gemini, and Claude, is that the significance is on expressing and then refining goals and plans. The details are noise. The clear goals matter, and the plans are derived from those.
I regularly interrupt my tools to say, "Please document what you just said in ...". And I manage the document organization.
At any point I can start fresh with any AI tool and say, "read x, y, and z documents, and then let's discuss our plans". Although I find that with Gemini, despite saying, "let's discuss", it wants to go build stuff. The stop button is there for a reason.
I use an agents.md file to guide Claude, and I include a prominent line that reads UPDATE THIS FILE WITH NEW LEARNINGS. This is a bit noisy -- I have to edit what is added -- but works well and it serves as ongoing instruction. And as you have pointed out, the document serves as a great base if/when I have to switch tools.
One group of people pretends to have written something and another group of people pretends to have read something. Much productivity is gained.
Zizek had a great point about this.
At least both get paid in not-pretend money.
Similarly, managers at my workplace occasionally use LLMs to generate jira tickets (with nonsense implementation details), which has led junior engineers astray, leaving senior engineers to deal with the fallout.
And then you use AI to summarize those 50 pages :)
Getting similar vibes from freelance clients sending me overly-articulated specs for projects, making it sound like they want sophisticated implementations. Then I ask about it and they actually want like a 30 row table written in a csv. Huge whiplash.
I instituted a simple “share the inputs” along with the outputs rule which prevents people doing exactly this. Your only value contribution is the input and filtering the output but for people with equal filtering skill, there’s no value in the output
The first point is so true. How do people expect me to work with their 20 page "deep research" document that's built by a crappy prompt and they didn't even bother to proofread.
The best thing to do is to schedule meetings with those people to go over the docs with them. Now you force them to eat their own shit and waste their own time the more output they create.
Love the intent, but isn't that wishful if you don't have any leverage? e.g., the higher up will trade you for someone who doesn't cause friction or you waste too much of your own time?
I had Claude review one. It was... not complimentary. Seemed to help a bit.
I've had this experience too. In the case of vibe code, there is at least some incentive from self-preservation that prevents things from getting too out of hand, because engineers know they will be on the hook if they allow Claude to break things. But the penalties for sloppy prose are much lower, so people put out slop tickets/designs/documentation, etc. more freely.
It makes my work suck, sadly. Team dynamics also contributes to that, admittedly.
Last year I was working on implementing a pretty big feature in our codebase. It required a lot of focus to get the business logic right, and at the same time you had to be very creative to make it feasible to run without hogging too many resources.
When I was nearly done and working on catching bugs, team members grew tired of waiting and started taking my code from x weeks ago (I have no idea why), feeding it to Claude or whatever, and then coming back with a solution. So instead of finishing my code, I had to go through their version of my code.
Each one of the proposals had one or more business requirements wrong and several huge bugs. Not one was any closer to a solution than mine was.
I would have appreciated any contribution to my code, but thinking that it would be so easy to just take my code and finish it by asking Claude was rather insulting.
I completely understand.
We're in a phase where founders are obsessed with productivity, so everything seems to work just fine and as intended, with little slop.
They're racing to be as productive as possible so we can get who knows where.
There are times when I honestly don't even know why we're automating certain tasks anymore.
In the past, we had the option of saying we didn't know something, especially when it was an area we didn't want to know about. Today, we no longer have that option, because knowledge is just a prompt away. So you end up doing front-end work for a backend application you just built, even though your role was supposed to be completely different.
This feels similar to the slow encroachment of devops onto everything. We're making so much shit nowadays that there is nobody left but developers to shepherd things into production, with all the extra responsibility and none of the extra pay commensurate with being a sysadmin too.
> Today, we no longer have that option, because knowledge is just a prompt away
Something resembling knowledge anyway. A sort of shambling mound wearing knowledge like a skinsuit
While I agree, I can't deny that AI is doing the job most of the time. But the hunt for the supreme productivity feels disgusting sometimes.
There’s a lot more going on there than AI …
Not really, this is exactly what I expect due to baseless lies from the AI companies and a disdain for employee payroll by the C-suite.
They fantasize about unpaid interns writing specs and nobody ever needing to look at the code within a few years.
This seems to be a team problem more than anything? Why are your coworkers taking on your responsibilities? Where's your manager on this?
Could be an emergent team problem that wouldn’t have had cause to exist before AI.
That works when it's humans you can talk to. The same problem happens with AI agents though and "no, use the latest code" doesn't really help when multiple agents have each compacted their own version of what "latest" means.
I'm running Codex on a Raspberry Pi, and Claude Code CLI, Gemini CLI, and Claude in Chrome all on a Mac, all touching the same project across both machines. The drift is constant. One agent commits, the others don't know about it, and now you've got diverged realities. I'm not a coder so I can't just eyeball a diff and know which version is right.
Ended up building a mechanical state file that sits outside all the context windows. Every commit, every test run, every failed patch writes to it. When a new session starts, the agent reads that file first instead of trusting its own memory. Boring ops stuff really, but it's the only thing that actually stopped the "which version is real" problem.
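For anyone wanting to try the state-file idea above, here is a minimal sketch of one way it could look. The comment doesn't describe the actual format, so the file name `agent_state.jsonl`, the field names, and the helper functions are all assumptions for illustration: an append-only JSON Lines log that every agent writes to and reads before doing anything else.

```python
import json
import time
from pathlib import Path

# Hypothetical file name; the comment above does not say what the file is called.
STATE_FILE = Path("agent_state.jsonl")


def record_event(kind, detail, agent, path=STATE_FILE):
    """Append one event (commit, test run, failed patch) as a single JSON line."""
    event = {"ts": time.time(), "agent": agent, "kind": kind, "detail": detail}
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(event) + "\n")
    return event


def load_state(path=STATE_FILE):
    """Read every recorded event, oldest first; a missing file means no history."""
    if not path.exists():
        return []
    with path.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


def latest_commit(events):
    """Return the detail of the most recent commit event, or None if there is none."""
    for event in reversed(events):
        if event["kind"] == "commit":
            return event["detail"]
    return None
```

A new session would call `load_state()` first and treat `latest_commit(...)` as the source of truth instead of whatever the agent remembers. Appending rather than rewriting keeps the history intact even when several tools write to the same file, though this sketch does no locking across machines.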
Not really an AI problem, more like garbage coworkers.
It has made my job an awful slog, and my personal projects move faster.
At work, the devs up the chain now do everything with AI – not just coding – then task me with cleaning it up. It is painful and time consuming, the code base is a mess. In one case I had to merge a feature from one team into the main code base, but the feature was AI coded so it did not obey the API design of the main project. It also included a ton of stuff you don’t need in the first pass - a ton of error checking and hand-rolled parsing, etc, that I had to spend over a week unrolling so that I could trim it down and redesign it to work in the main codebase. It was a slog, and it also made me look bad because it took me forever compared to the team who originally churned it out almost instantly. AI tools are not good at this kind of design deconflicting task, so while it’s easy to get the initial concept out the gate almost instantly, you can’t just magically fit it into the bigger codebase without facing the technical debt you’ve generated.
In my personal projects, I get to experience a bit of the fun I think others are having. You can very quickly build out new features, explore new ideas, etc. You have to be thoughtful about the design because the codebase can get messy and hard to build on. Often I design the APIs and then have Claude critique them and implement them.
I think the future is bleak for people in my spot professionally – not junior, but also not leading the team. I think the middle will be hollowed out and replaced with principals who set direction, coordinate, and execute. A privileged few will be hired and developed to become leaders eventually (or strike gold with their own projects), but everyone in between is in trouble.
If you don't take a stand and refuse to clean their mess, aren't you part of the problem? No self-respecting proponent of AI-enabled development should suggest that the engineers generating the code are not still personally responsible for its quality.
Ultimately that's only an option if you can sustain the impact to your career (not getting promoted, or getting fired). My org (publicly traded, household name, <5k employees) is all-in on AI with the goal of having 100% of our code AI generated within the next year. We have all the same successes and failures as everyone else, there's nothing special about our case, but our technical leadership is fundamentally convinced that this is both viable and necessary, and will not be told otherwise.
People who disagree at all levels of seniority have been made to leave the organization.
Practically speaking, there's no sexy pitch you can make about doing quality grunt work. I've made that mistake virtually every time I've joined a company: I make performance improvements, I stabilize CI, I improve code readability, remove compiler warnings, you name it: but if you're not shipping features, if you're not driving the income needle, you have a much more difficult time framing your value to a non-engineering audience, who ultimately sign the paychecks.
Obviously this varies wildly by organization, but it's been true everywhere I've worked to varying degrees. Some companies (and bosses) are more self-aware than others, which can help for framing the conversation (and retaining one's sanity), but at the end of the day if I'm making a stand about how bad AI quality is, but my AI-using coworker has shipped six medium sized features, I'm not winning that argument.
It doesn't help that I think non-engineers view code quality as a technical boogeyman and an internal issue to their engineering divisions. Our technical leadership's attitude towards our incidents has been "just write better code," which... Well. I don't need to explain the ridiculousness of that statement in this forum, but it undermines most people's criticism of AI. Sure, it writes crap code and misses business requirements; but in the eyes of my product team? That's just dealing with engineers in general. It's not like they can tell the difference.
Hi thanks for this brilliant feature. It will really improve the product. However it needs a little bit more work before we can merge it into our main product.
1) The new feature does not follow the existing API guidelines found here: see examples a and b.
2) The new feature does not use our existing input validation and security checking code, see example.
Once the points above have been addressed we will be happy to integrate it.
All the best.
The ball is now in their court and the feature should come back better
This is a politics problem. Engineers were sending each other crap long before AI.
..so they copy/paste your message into Claude and send you back a +2000, -1500 version 3 minutes later. And now you get to go hunting for issues again.
There is an alternative way to make the necessary point here. Let it go through with comments to the effect that you cannot attest to the quality or efficacy of the code, and let the organization suffer the consequences of this foray into LLM usage. If they can't use these tools responsibly and are unwilling to listen to the people who can, then they deserve to hit the inevitable quality wall, where endless passes through the AI still can't deliver working software and their token budget goes through the ceiling attempting to make it work.
I think you're falling victim to the just-world fallacy.
I am absolutely certain the world isn't just. I'm also absolutely certain the world can't get just unless you let people suffer consequences for their decisions. It's the only way people can learn.
IME that simply doesn't work in professional environments. People will either misrepresent the failure as a success or find someone else to pin the blame on. Others won't bother taking the time to understand what actually happened because they're too busy and often simply don't care. And if it's nominally your responsibility to keep something up, running, and stable then you're a very likely scapegoat if it fails. Which is probably why people are throwing stuff that doesn't work at you in the first place. Trying to solve the problem through politics is highly unlikely to work because if you were any good at politics you wouldn't have been in that situation in the first place.
> ... I make performance improvements, I stabilize CI, I improve code readability, remove compiler warnings, you name it ...
These are exactly the kind of tasks that I ask an AI tool to perform.
Claude, Codex, et al are terrible at innovation. What they are good at is regurgitating patterns they've seen before, which often mean refactoring something into a more stable/common format. You can paste compiler warnings and errors into an agentic tool's input box and have it fix them for you, with a good chance for success.
I feel for your position within your org, but these tools are definitely shaking things up. Some tasks will be given over entirely to agentic tools.
> These are exactly the kind of tasks that I ask an AI tool to perform.
Very reasonable nowadays, but those were things I was doing back in 2018 as a junior engineer.
> Some tasks will be given over entirely to agentic tools.
Absolutely, and I've found tremendous value in using agents to clean up old techdebt with oneline prompts. They run off, make the changes, modify tests, then put up a PR. It's brilliant and has fully reshaped my approach... but in a lot of ways expectations on my efficiency are much worse now because leadership thinks I can rewrite our techstack to another language over a weekend. It almost doesn't matter that I can pass all this tidying off onto an LLM because I'm expected to have 3x the output that I did a year ago.
> My org [...] is all-in on AI with the goal of having 100% of our code AI generated within the next year.
> People who disagree at all levels of seniority have been made to leave the organization.
So either they're right (100% AI-generated code soon) and you'll be out of a job or they'll be wrong, but by then the smart people will have been gone for a while. Do you see a third future where next year you'll still have a job and the company will still have a future?
"100% AI-generated code soon" doesn't mean no humans, just that the code itself is generated by AI. Generating code is a relatively small part of software engineering. And if AI can do the whole job, then white collar work will largely be gone.
I agree, but it seems like if we can tell the AI "follow these requirements and use this architecture to make these features", we're a small step away from letting the AI choose the requirements, the architecture and the features. And even if it's not 100% autonomous, I don't see how companies will still need the same number of employees. If you're the lead $role, you'll likely stay, but what would be the use of anyone else?
Unfortunately not many companies seem to require engineers to cycle between "feature" and "maintainability" work. Hence those who look for low-hanging fruit and know how to virtue signal seem to build their careers on "features", while engineers passionate about correct solutions are left to pay for it while also being labelled "inefficient" by management. It's all a clown show, especially now with vibe-coding. No wonder we have big companies that have had multiple incidents since vibing started taking off.
Culture and accountability problems aren't limited to software.
It's best to sniff out values mismatches ASAP and then decide whether you can tolerate some discomfort to achieve your personal goals.
Shipping “quality only” work for a long time can be stressful for your colleagues and the product teams.
You’re much better off mixing both (quality work and product features).
> Shipping “quality only” work for a long time can be stressful for your colleagues and the product teams.
I buried the lede a bit, but my frustration has been feeling like _nobody_ on my team prioritizes quality and instead optimizes for feature velocity, which then leaves some poor sod (me) to pick up the pieces to keep everything ticking over... but then I'm not shipping features.
At the end of the day if my value system is a mismatch from my employer's that's going to be a problem for me, it just baffles me that I keep ending up in what feels like an unsustainable situation that nobody else blinks at.
"aren't you part of the problem?"
Yes? In the same way any victim of shoddy practices is "part of the problem"?
Employees, especially ones as well leveraged and overpaid as software engineers, are not victims. They can leave. They _should_ leave. Great engineers are still able to get better paying jobs all the time.
> Great engineers are still able to get better paying jobs all the time
I know a lot of people who tried playing this game frequently during COVID, then found themselves stuck in a bad place when the 0% money ran out and companies weren’t eager in hiring someone whose resume had a dozen jobs in the past 6 years.
You obviously haven't gone job hunting in 2026
I hope you get the privilege soon
Employees are not victims. Sounds like a universal principle.
Came here to say this. The right solution to this is still the same as it always was - teach the juniors what good code looks like, and how to write it. Over time, they will learn to clean up the LLM’s messes on their own, improving both jobs.
> and refuse to clean their mess
You can and should speak up when tasks are poorly defined, underestimated, or miscommunicated.
Try to flat out “refuse” assigned work and you’ll be swept away in the next round of layoffs, replaced by someone who knows how to communicate and behave diplomatically.
ramraj07 went on to clarify that they were advocating for putting the onus for cleanup back on mess generators.
They clearly were not advocating for flat out refusing.
Just reply with this to every AI programming task: https://simonwillison.net/2025/Dec/18/code-proven-to-work/
It's just plain unprofessional to YOLO shit with AI and force actual humans to read the code even if the "author" hasn't read it.
Also API design etc. should be automatically checked by tooling and CI builds, and thus PR merges, should be denied until the checks pass.
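As a rough illustration of the kind of automated check mentioned above, here is a minimal sketch. The route convention, `ROUTE_PATTERN`, and the function names are hypothetical and not taken from any project in this thread; a real setup would encode whatever the team's own API guidelines say.

```python
import re

# Hypothetical convention: a /v<N> prefix followed by lowercase, hyphen-separated
# segments or {snake_case} path parameters.
ROUTE_PATTERN = re.compile(r"^/v\d+(/[a-z0-9]+(-[a-z0-9]+)*|/\{[a-z_]+\})+$")


def find_violations(routes):
    """Return the routes that do not match the agreed convention."""
    return [route for route in routes if not ROUTE_PATTERN.match(route)]


def main(routes):
    """Print each violation and return an exit code (non-zero should fail the CI job)."""
    violations = find_violations(routes)
    for route in violations:
        print(f"API guideline violation: {route}")
    return 1 if violations else 0
```

A CI step could collect the routes a PR adds, pass them to `main`, and exit with its return value so the merge is blocked until the list of violations is empty.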
> did not obey the API design of the main project
If they're handing you broken code call them out on it. Say this doesn't do what it says it does, did you want me to create a story for redoing all this work?
That is definitely one tell: the hand-rolled input parsing or error handling that people would never have done at their own discretion. The bigger issue is that we already do the error checking and parsing at the different points of abstraction where it makes the most sense. So it's bespoke, and redundant.
That is on the people using the AI and not cleaning up/thinking about it at all.
> At work, the devs up the chain now do everything with AI – not just coding – then task me with cleaning it up.
This has to be the most thankless job for the near future. It's hard and you get about as much credit as the worker who cleans up the job site after the contractors are done, even though you're actually fixing structural defects.
And god forbid you introduce a regression bug cleaning up some horrible redundant spaghetti code.
Near future being the key term here imo. The entire task I mentioned was not an engineering problem, but a communication issue. The two project owners could have just talked to each other about the design, then coded it correctly in the first pass, obviating the need for the code janitor. Once orgs adapt to this new workflow, they’ll replace the code janitors with much cheaper Claude credits.
Lol you may be on to something there.. 'a code janitor'.
We’ve had this too and made a change to our code review guidelines to mention rejection if code is clearly just AI slop. We’ve let like four contractors go so far over it. Like ya, they get work done fast, but then when it comes to making it production ready they’re completely incapable. Last time we just merged it anyway to hit a budget, it set everyone back, and we’re still cleaning up the mess.
> It was a slog, and it also made me look bad because it took me forever compared to the team who originally churned it out almost instantly.
What the hell are you playing hero for? Delegate the choice to your manager: ruin the codebase or allocate two weeks for clean-up, their choice. If the magical AI team claims they can do the integration faster, let them.
IME one thing that makes this choice a very difficult one is oncall responsibilities. The thing that incentivizes code owners to keep their house in order is that their oncall experience will be a lot better. And you're the only one who is incentivized to think this way. Management certainly doesn't care. So by delegating the choice to management you're signing up for a whole bunch of extra work in the form of sleepless oncall shifts.
If someone is making the kind of mistakes that cause oncall issues to increase, put that person on call. It doesn't matter if they can't do anything, call them every time they cause someone else to be paged.
IME too many don't care about on call unless they are personally affected.
> If someone is making the kind of mistakes that cause oncall issues to increase
The problem is that identifying the root cause can take a lot of time, and often the "mistakes" can't be clearly traced to an individual.
So someone oncall just takes the hit (i.e., waking up at 3am and having to do work). That someone may or may not be the original progenitor of said mistake(s).
Framed less blamefully, that's basically the central thesis of "devops". That is the notion that owning your code in production is a good idea because then you're directly incentivized to make it good. It shouldn't be a punishment, just standard practice that if you write code you're responsible for it in production.
I think you need coding style guide files in each repo, including preferred patterns & code examples. Then you will see less and less of that.
I've heard of human engineers who are like that. "10x", but it doesn't actually work with the environment it needs to work in. But they sure got it to "feature complete" fast. The problem is, that's a long way from "actually done".