
Ask HN: Where is the programming profession going?

news.ycombinator.com · Read Story HN original

I had been running a small (3 people) software company for about 4 years. Since closing down, I recently hung out at a friend's company to see what they were working on (15 ppl). To preface: I'm a heavy user of Claude (rarely write code by hand), but what I'm seeing in person has been rather shocking to me, and I wanted to calibrate with others.

In particular:

- the code is not the source of truth anymore; it's ask Claude to write, and ask Claude to explain
- LoC, abstractions, and all those "software development principles" do not seem to matter to people
- Code review is not done by humans
- Actually understanding the problem deeply seems to be offloaded to Claude
- Some developers are running like 5+ simultaneous Claude sessions, and no code is being looked at
- Explosion of LLM-generated tests

First off, is this similar to what's going on at your company?

If this company is representative, it feels like software development is going from a precise occupation that requires a high degree of understanding to something probabilistic where understanding is offloaded (and eventually to not being an occupation at all, honestly).

I'm interested to hear other folks' perspectives.

Comments

I'm a Senior Freelance Programmer, I can see many of my past and present clients moving towards the exact path you described. I keep warning them during meetings that Claude model isn't sustainable for long, eventually the VCs will come for their revenues and Claude will be forced to close their access to all but the most enterprisey ones with deep pockets. The mere electricity cost for that kind of high level reasoning and abstraction can't be subsidized forever. However, there are other forces which pull them towards Claude and AI workflows. Most of the clients are in a "wait and watch" mode right now, using LLM assistance for code generation but not fully depending on them.

Before LLMs came, there used to be the technical debt to deal with in a project, now there is also the added cognitive debt which is way more subtle and impactful long-term. If your source of truth isn't source code but a prompt (or even a series of prompts with branches) and the executor of prompts is a non-deterministic agent, I think you've already lost the battle there.

You ignore that Claude is not alone, that tech progresses and reduces costs, and that there are always the Chinese alternatives, which keep getting better over time.
> Claude model isn't sustainable for long, eventually the VCs will come for their revenues

This is cope. There are multiple open models that are already good enough and cheap enough at API rates to sustain this.

Using today's model prices as a rebuttal is a very weak argument.

Two years ago, SOTA was gpt-o1, and it was much more expensive than Fable. Now, for $4,699, you can easily run a much smarter Qwen3.6-35B locally with DGX Spark.

Think about where we are. This is an era where a new SOTA arrives every two months. It took LLMs only about 18 months to go from chain-of-thought reasoning to disproving the unit-distance conjecture. chatGPT itself is only three and a half years old.

DeepSeek V4, released two months ago, costs almost as little as the electricity it uses, and would absolutely be a top-tier model by 2025 standards.

This. And the "individual with technical savvy saving his child-like clients" scenario described earlier in this thread is delusional and Dilbertian.
The electricity cost per unit of machine “reasoning” is vastly less than the cost of salary for human reasoning. That’s a weak argument. You should focus on the second part… LLMs (at least today’s) don’t build simple solutions, and the complexity they introduce has a cost.
> LLMs (at least today’s) don’t build simple solutions ...

... by default.

Not sure why this is downvoted but you truly can get a lot of mileage out of asking Claude to simplify things.
Definitely. One of my fav techniques is to ask an LLM to simplify something by 90% or 99% if it looks overly complicated. Or asking the output to be critically reviewed by another agent too.
As a Freelance Programmer, are you even getting consistent clients at decent rates? If so, how are you getting clients consistently and how do you convince businesses that you are better than AI?
I fully agree with that. Well said.

You're standing on the shore, and your clients are having fun in the water. The tide is going up, and you're screaming at your clients "come back! it's not safe". And so, they show you the face. You appear to them like the boring guy who's not fun to hang out with. Eventually the tide is high, there is strong current, and they are being swept away further and further from the shore and they are panicking : "pyeri! help us! please!"

People (the non tech people, the MBA people) don't want to hear what you, the tech guy has to say. You're the not fun guy. Stay in touch until they do need you and say : you were right. That's the day you charge them a dear price for the service.

AI is still at the bait stage of rollout. They subsidize it; they want you to get hooked on it to the point where you cannot do without it. Only then do they start to charge. I used Google code assist for around 9 months. It was free. I would ask it questions from time to time, to help fix bugs and to avoid spending an hour browsing SO. Now, it's around $30 per month. They are losing too much too fast atm; they have reached the stage where they have to start charging. Another one of their strategies is an IPO. Once they (OpenAI/Anthropic) are listed on the Nasdaq, you will pay whether you want to or not (via your exposure to the Nasdaq/S&P 500 through your ETFs).

> Stay in touch until they do need you and say : you were right. That's the day you charge them a dear price for the service.

That assumes OP will still be in business if that day even happens. Odds are that cheap AI access will be around for longer than freelancers can remain solvent.

I have had some truly spectacular results that still kind of stagger me in the last few months using Claude in my hobby projects -- but even though Claude insists on trying to slip its name into the git history as credit it's not Claude -- it's me. Someone who has studied CS and software engineering for decades will craft different prompts from someone without that background. A suggested axiom: there is nothing I can build with Claude that I could not build myself with my current level of CS knowledge, assuming I had infinite focus and time. In my hands it can go as far I could anyway, and no further. (But it is faster!) My experience bears that out so far.
> Someone who has studied CS and software engineering for decades will craft different prompts from someone without that background.

This, to me, is the biggest differentiator. In terms of results, there's a huge yawning chasm between the person who says "Claude make me a $thing" versus the person who puts in the effort to lay down the overall architecture, gives some thoughts to libraries and dependencies, performance trade-offs etc, and only then begins prompting.

Knowing how to implement Dijkstra or a linked list by heart is no longer important. Actual software engineering skills are more important than ever.

> Knowing how to implement Dijkstra or a linked list by heart is no longer important.

This was never important. The important part was always knowing when to use them.

>The important part was always knowing when to use them.

Two things can be true simultaneously. I think there was a time when deep familiarity with implementing algorithms was important.

always was. Still is.
It is for coding interviews
Even more narrow - you only need to know when to consider them.

I only ever bothered remembering enough of any algorithm to know my options and a few rules of thumb. If I ever actually need to consider the details of the algorithm, I certainly need to spend a lot more time thinking through the problem and its solution. Knowing a specific algorithm well enough to pump out in 15 minutes is a party trick that is as useful as being able to change a tire in 3 minutes flat. A great time saver that will be functionally useful maybe 3 times in your life...
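For reference, the algorithm being argued about is short once you look it up. A minimal Python sketch of Dijkstra's shortest-path algorithm with a binary heap follows; the graph and its node names are made up for illustration, and edge weights are assumed to be non-negative.

```python
import heapq


def dijkstra(graph, source):
    """Return the shortest distance from source to every reachable node.

    graph maps each node to a list of (neighbor, weight) pairs.
    Weights are assumed to be non-negative.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry; a shorter path was already found
        for neighbor, weight in graph.get(node, []):
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return dist


# Hypothetical example graph, not taken from the thread.
graph = {
    "a": [("b", 4), ("c", 1)],
    "c": [("b", 2), ("d", 7)],
    "b": [("d", 1)],
}
distances = dijkstra(graph, "a")
```

The judgment call the commenters describe is everything outside this function: recognizing that the problem is a shortest-path problem, and that the weights really are non-negative.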

The gap is closing; a shitty wannabe programmer will eventually learn the structures one way or another. Agentic coding just got many new people involved, and these new people create noise.
> hobby projects

Unfortunately despite being impressive for solo stuff, such results don’t scale to software you’d give to others.

Claude writes probably 95% of our code now, fintech, amongst top 5 in the world in what we do. I am 100% certain we're not even at the forefront of using agents for coding compared to some others.

It definitely can scale.

Are there any code reviews? Or is AI reviewing code also?
We still review everything. And we guide by planning, prompting, speccing.

So we're not actually much faster at the core code, because reviews still take time. Ultimately, we're on the finance markets and we have regulatory pressures and I, as the human, am responsible for putting the code out there.

But we're freeing up a lot of time to get other things correct. We have n x more metrics now because plumbing in basic stuff is now trivial. We now have dozens more tools and skills to help analyse issues (e.g. why this price and not x), answer questions, etc.

I now have skills to scrape logs, download, unpack and scrape our bus persistence, link to kdb, and so on, all in my Claude prompt, joining it all together, and the AI is figuring things out. I can diagnose things, I don't know, maybe 100 times faster?

It is revolutionary, and I am highly sceptical of the motivations of people who keep saying otherwise.

When you review everything, do you understand every line of code before approving? Do you make it rewrite code that is too abstract or unclear for future humans to understand? Does AI write the tests and do you review those with the same diligence?

I don't disagree that it's revolutionary in many ways, but I am seeing lots of companies make very costly mistakes by relying too heavily on AI without fully understanding the code it writes and without having a human fact-check its outputs.

Yes we understand all code merged. Yes, I have coding rules and standards that the agent follows to ensure it writes code in virtually the same style as the whole team does. Someone isn't going to let my change be merged if they don't understand it.

My hunch however is that in a few years we're not going to be reviewing as heavily as we improve guardrails and can trust the AI code and review cycle more. I'm not sure what to expect for my career at that point.

Fair enough, but speed, especially the kind that comes with LLMs, is fast enough to open new ways of working and doing things. We don't have infinite time, and if there's something that can give me, for example, multiple UI suggestions in a minute which I can pick from, it's a different way of working than sitting with a UI designer for several hours having discussions. So, while I agree with you in theory, I don't fully agree with what I think you're implying when it comes to practice.
For the last 6 decades or so, a computer was a machine assumed to operate with high levels of precision and deterministic outputs. Such precision enabled spacecraft like Voyager 1 & 2 to travel billions of miles from Earth, staying on course, semi-operational and sending telemetry- 50 years after launch.

Now we have machines that, when asked to produce a paperclip, may instead produce a butter knife, or a banana, or maybe just a "try again later".

These modern "tools" are quite a different animal. They're more akin to roulette wheels that generate massive amounts of heat and CO2.

This is cope. SOTA agents produce exactly what's asked. Usually it's the asking that's the problem, not the result; improve the prompt and the output improves drastically.
> ask claude to write, and ask claude to explain

This works, until it doesn’t. I’m continuously shocked by these stories, where so many people put the future of their job/company in the hands of these agents after only a few months of existing.

I still constantly run into bad output from LLMs, from code to basic questions. I don’t understand how anyone can hand things over to something that is laughably wrong on a pretty regular basis, often in subtle ways that won’t be noticed by someone who isn’t reading closely and thinking critically.

They’ve gotten better, but I still regularly give them the old Nick Burns treatment, push it out of the way, and do it myself.

It's a really fun philosophical exercise to ask what it means for them to be "wrong." My perspective is that they are fantastic at association and generalization (of language and symbols in particular), but whether they're identifying the associations you care about or generalizing to the level of abstraction you're aiming for is a complete crapshoot. If you aren't checking and correcting them, and discarding the misfires, you will end up with a very pretty Tower of Babel.
One area where I feel safe saying they are “wrong”, rather than just going with a different assumption that was left unsaid, would be when it makes up API endpoints. It sees the general pattern in an API, then makes up an endpoint that sounds good, follows the pattern, but isn’t actually implemented.

I’ve also seen a lot of issues with co-workers using an LLM to write their readme files. I look at the readme for what return values I should get, go to use them, and get an error. I check the code, and sure enough, none of the variables in the readme exist. The LLM just thought they sounded good. Things like this I would say are pretty objectively wrong.
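This kind of readme drift can be caught mechanically. As a rough sketch, assuming the readme marks identifiers with backticks and the code is importable as a module, a check could look like the following. The function name `names_missing_from_code`, the sample readme text, and the stand-in module are all hypothetical.

```python
import re
from types import SimpleNamespace


def names_missing_from_code(readme_text, module):
    """Return backticked identifiers in readme_text that module does not define."""
    names = set(re.findall(r"`([A-Za-z_][A-Za-z0-9_]*)`", readme_text))
    return sorted(name for name in names if not hasattr(module, name))


# Hypothetical stand-ins for a real README and a real imported module.
readme = "Call `fetch_user` and read `status_code`; retry with `retry_count`."
module = SimpleNamespace(fetch_user=lambda user_id: None, status_code=200)

missing = names_missing_from_code(readme, module)
```

Here `missing` contains only `retry_count`, the one name the readme mentions that the code never defines. Run in CI, a check like this fails the build whether a person or an LLM wrote the stale documentation.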

> One area where I feel safe saying they are “wrong”, rather than just going with a different assumption that was left unsaid, would be when it makes up API endpoints. It sees the general pattern in an API, then makes up an endpoint that sounds good, follows the pattern, but isn’t actually implemented.

I remember seeing this maybe 6+ months ago, but using paid plans, RAG, and a high thinking mode has eliminated a ton (almost all) of those kinds of hallucinations. Open models and free tiers are not there yet though.

> I’ve also seen a lot of issues with co-workers using an LLM to write their readme files. I look at the readme for what return values I should get, go to use them, and get an error. I check the code, and sure enough, none of the variables in the readme exist. The LLM just thought they sounded good. Things like this I would say are pretty objectively wrong.

LLMs don't co-sign the quality of PRs though; your coworkers do. It's not unusual for docs to get outdated and not be maintained enough in small codebases, but that's not an LLM-specific problem.

There's nothing shocking about this. The vast majority of software/source code is pretty terrible anyways, code that is full of bugs, slow to use, has little to no automated tests and very hard to maintain.

To the extent that it gets fixed or works at all, it's not because of competent developers doing rigorous analysis of the software, it's because either someone testing it or using it gets annoyed, reports an issue, and then that specific issue gets patched out.

If using LLMs to perform a similar function shocks you, then you should have been shocked already by the proliferation of pretty bad software for the better part of the last couple of decades.

So many criticisms of LLMs assume that people have been writing software very diligently, applying a high standard of engineering, subjecting the code to a battery of rigorous tests, passing it through a strict review process... and that does happen for some software, especially software that is commonly used, but it's not true for the vast majority of software developed.

AI is no good, but neither are people, isn’t a great sales pitch.

I think for small tools that people want to make for themselves, that’s great. Where I see problems is when other people and money get involved. If something goes wrong, who is accountable? Claude wrote it, Claude reviewed it, Claude submitted the PR… yet Claude can’t have any real accountability.

It's an absolutely phenomenal sales pitch to executives. A ton of automation is sold on the basis that it's probably not going to be as good as having a dedicated person do it, but that automation leads to much lower maintenance, scales better, and is more deterministic and reproducible.
"A computer can never be held accountable

Therefore a computer must never make a management decision"

-- Internal IBM training manual, 1979

I think small tools people make for themselves is realistically less than 1% of software produced. Most of the code, and - to the GP’s point - bad code, is produced in corporations with plenty of money and budget.

There is just such a tremendous amount of waste at every company, in that the headcount and software expands to fill the budget. I’m not defending Elon, but look at how much he slashed from X (80% or so?) and the company still has its core product functioning and an active user base.

There is a ton of software (especially internal) at essentially every company that also is low accountability before Claude. “Oh Ted built that but he’s working on a new important project. I understand it’s broken and that’s impacting you but we won’t be able to prioritize this until next quarter at least. Can you set up a meeting next month to discuss?”

Honestly the outcome for all of these LLMs is indeed likely a higher amount of software with no accountability, but it’s also an improved ability to juggle more of that software to the same (realistically low) standard.

> little to no automated tests

I'm still amazed people don't achieve extremely high test quality, since you get tests "for free" now.

One of the limitations of testing was always that people "design" things so they're hard to test.

And then they argue "This can't be tested", or "Refactoring this for testing is not worth it."

It is now. Yet, I work on codebases with no tests and lots of yolo co-authoring.

You get quantity of tests, but the tests are not good quality by default, at all.
I’m not sure how you can say something general about the quality of tests unless you mean by simply prompting “make tests” or similar.

Yes, I’ve experienced that those tests succeed, and the app still breaks trivially on first run.

What I mean is: you design the tests. You analyse patterns. You insist on making testable code (average code by humans isn’t, so neither is average code by LLMs unless you specify testability as a design constraint).

One way to get testable code is to mock all interfaces. This is usually expensive, but not difficult for an AI, because you can set the success criterion to be interface exactness of your mock for a series of plausible and somewhat extensive interactions.

The tests you can make with AI are as good as you can make them otherwise, you just save time doing them, which should justify making more extensive testing.
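To make the "mock all interfaces" idea concrete, here is a small Python sketch using the standard library's `unittest.mock.create_autospec`, which builds a mock that enforces the real method signatures. That is one way to approximate the "interface exactness" mentioned above, not necessarily what the commenter does. `PaymentGateway` and `checkout` are hypothetical names invented for the example.

```python
from unittest.mock import create_autospec


class PaymentGateway:
    """Hypothetical interface to an external service."""

    def charge(self, account_id: str, amount_cents: int) -> str:
        raise NotImplementedError


def checkout(gateway: PaymentGateway, account_id: str, amount_cents: int) -> str:
    """Code under test: validates input, then delegates to the gateway."""
    if amount_cents <= 0:
        raise ValueError("amount must be positive")
    return gateway.charge(account_id, amount_cents)


# The autospec'd mock accepts only calls that match PaymentGateway's signatures.
gateway = create_autospec(PaymentGateway, instance=True)
gateway.charge.return_value = "receipt-1"

receipt = checkout(gateway, "acct-42", 1500)
gateway.charge.assert_called_once_with("acct-42", 1500)

# A call that the real interface would reject is rejected by the mock too.
try:
    gateway.charge("acct-42")  # missing amount_cents
    signature_enforced = False
except TypeError:
    signature_enforced = True
```

Because the mock is derived from the interface, a test written against it cannot silently rely on a method or argument that does not exist, which is the failure mode of hand-rolled or LLM-invented mocks.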

It was hype all day long, and managers forgot that AI is a tool and not some magic wand. A tool like DeWalt or Makita. After AI came out, some colleagues at my company expected me to generate 600-700 lines of code or more. I tried to explain that I cannot read or understand what's actually happening that fast, but they were like: just push, go, copy-paste it. Complete autodrive mode, insane. Then I spent the weekend fixing it, which made me doubly mad. What's actually happening is absurd: because of all the stories out there, managers think that Claude generates perfect code and that you could make a Twitter clone in half a day...
AI is just a tool, and, as always, people will use it incorrectly and lazily. Are we forgetting the good old days of Copy/Paste from Stack Overflow?

LLMs just made it more convenient for the same people to take the lazy route.

I saw this, all of this happening years before ChatGPT existed, but with outsourcing to Indian dev shops.

You'd be shocked how often I see the meat-space equivalent of vibe coding!

"I trust the developers."

"You really shouldn't!"

The thing to realise is that there is no fundamental difference between outsourcing a development task to other human developers versus outsourcing[1] it to LLMs.

Either way, total and complete understanding is being sacrificed in the name of productivity and scalability.

It's just there's one extra layer of work assignment now, with ICs handing off tasks to agents.

What this has revealed to ICs is the BIG issue that has plagued all software development for decades, especially since outsourcing became so popular: Oversight is critical, and more importantly: authority can be delegated, but responsibility cannot.

LLM output is fine, as long as you review everything it does.

This is the same as any competent dev team manager reviewing PRs for quality, paying attention to critical matters such as security, adherence to high level design and low-level style standards, etc.

Some do.

Many never did.

[1] This doesn't have to be a contract with an overseas provider, by "outsourcing" I mean any variant of not-your-own-hands-on-keyboard. Any scenario where a customer or manager assigns tasks to developers other than themselves.

Even if / when it does work, the value being produced is reduced to the dollars paid to Anthropic or OpenAI or whoever. What are you even contributing? What’s stopping the ai provider from coming in and eating your lunch?
We're still running the race, but it's just not on foot anymore. You can still run it into the wall if you're not careful where you're going.
Remember you had to quit social media to keep your sanity in check? Ok, now AI. Same thing.
Not the same thing. Developers' clients are being approached by thousands of people instead of a handful. It creates the illusion that everyone can do the same thing for cheaper.
I mean, literally the answer is that nobody knows. Maybe the robots replace us all. Maybe they shift those who remain into being some combination of Product Manager and QA. Maybe there's still a role for a technical overseer even in the medium-long run.

But it sounds like you're really asking about the state of the world today. If so, I don't think that ideal state is like your friend's company (or at least, as it appeared to be to you). It might be possible that you can make that "dark factory" pattern work (StrongDM seems to be doing it), but it would require infrastructure and discipline that I doubt they're mustering. Think about how CD didn't involve taking a sloppy build process with no testing or observability and just going straight to prod -- it required building up a lot of infra and discipline first.

But on the other hand, I don't think the ideal present involves artisan hand-crafting code either. I haven't written a line of code by hand in enough months that it would genuinely feel weird if I were to try to program that way despite decades of having done just that. That era's done with, and moderate normie practices right now today are more about supervising and guiding agents than about chiseling code into clay tablets.

I've posted a recent article about the future of software development https://saturnino.substack.com/p/out-of-the-loop?r=7eqhw&utm...

Basically, in a decade or so, we'll be completely out of the loop in software development; even this title won't exist anymore (like the 2000's webmaster). We'll still be around, but with different roles.

For what it’s worth, I find comments and articles with assertive predictions like this difficult to take at face value.

I don’t even disagree with the premise, but it shifts the burden of assessing likelihood back onto the reader, so it doesn’t really add much value to me.

This has always been a very different profession depending on where you work and what you're working on.

I haven't worked at a startup in over a decade, but the stories I hear now sound the same as back then. There's lots of wasted effort for mediocre to poor code destined to be rewritten or thrown away until there's enough investment to justify more work. At which point, "more work" just means more sprawling slop instead of fixing the technical debt rotting at the foundation.

AI just put a spotlight on the futility of trying to run before you can walk. Whether so many founders are going to stay in denial about it is yet to be seen. Statistics about any line of business says yes. This is how most businesses fail and most of them have to fail.

From what you said: Not looking at code is bad, not because Claude can slip a few bugs (it can), but because LLMs tend to default to writing more code and features than needed, which isn't a good thing. I see a lot of people making 10+ PRs per day, but most of them are just going back to fix earlier PRs.

Claude always likes to "go big," for example, by choosing tools that can support millions of concurrent users or by adding unnecessary layers of abstraction that create more maintenance pain. I guess that's good for LLM companies, since more tokens are spent fixing the mess it caused.

Every time I enter plan mode for a huge feature, I end up cutting about 30-60% of the task scope before the LLM can actually start the work. I review the final code, and I still find things to cut. As said before "The best code is no code, or code you don’t have to maintain" [0]

0: https://www.simplethread.com/20-things-ive-learned-in-my-20-...

My personal experience: writing code has always been the easy part. AI does most of that now.

Understanding the problem and the existing system well enough to design the right solution, even with AI assistance, is a higher cognitive load. I’m doing a lot more of that lately.

I’m more productive, but also more tired. This may be due in part to the breadth of what my team owns, which makes my day a bit more context-switchy than other teams.

As others in this thread have noted, the situation is still evolving. However, I worry less each day about being replaced by AI. There has always been more work than available bandwidth in my experience.

What seems clear to me is that expectations around velocity and throughput will increase (are increasing). AI use will be required to meet those expectations. Learning to use this new tool effectively will be essential for career progression (and preservation).

Spot on, in my experience.
Agree. Also, there is a lot of fog at the moment. AI generates more code, we need a lot of markdowns now to teach it how to write "good code"... and <insert here a lot of AI processes>. But at the end... a programmer has to take ownership of that code and responsibility, meaning: reading A LOT of code and/or coding more code.
Responding to my own comment to add that I think this moment favors the curious and passionate. None of what I wrote above is a complaint. I’m having more fun now than I have in a long time.
> My personal experience: writing code has always been the easy part. AI does most of that now.

The only reason dev jobs paid more (by a factor of two or more) than pure solution modeling was because "writing code" was the hard part.

If you wanted to get paid just modeling the solution and handing it off to a coding team, those jobs were available for decades, typically called Business Analysts but few devs moved from dev to BA.

> Understanding the problem and the existing system well enough to design the right solution, even with AI assistance, is a higher cognitive load.

I've found that the act of physically writing refines my understanding a lot more than simply reading.

We don't typically expect a person to read a trigonometry textbook and then perform well on an exam. They have to drill problems to surface their misunderstandings to themselves.

My fear is that, with developers adopting your approach, they're "designing" systems in much the same way that a read-the-book-only trigonometry student solves trigonometry problems.

Perhaps solution was the wrong word for me to use here. It was intended to encompass the implementation details (abstractions, architecture, observability, etc)… All the decisions the engineers would normally make during planning and execution. Once I have that nailed down, the act of writing the code is largely mechanical.

That’s the source of my “easy” framing. It has always had the lower cognitive load in my experience. Now that I can offload the mechanical part to AI, I spend more time on the hard parts.

I still read plenty of code along the way, maybe less of it now because it’s easier to surface which parts of the code I need to read.

GP's "design the right solution" is a role between "programmer" and "business analyst" that got merged with "programmer" to become "developer" decades ago. That's where the high salary came from. It's been reemerging as "architect" now that "developer" has been watered down to include "programmer".
Who hires “pure solution modelers”? I don’t think I’ve ever encountered someone like that.
Aren't they simply called "consultants"?
> Who hires “pure solution modelers”? I don’t think I’ve ever encountered someone like that.

They're called Business Analysts, sometimes simply Analysts, and that's effectively their job - come up with a spec and give it to the software engineers.

I've never seen BAs execute that way. I don't think that is an accurate description of their role and its link to SWEs.
Agreed; typically (especially if a client is involved) I’ve seen creative and dev define and articulate spec and expected behavior, and BA documents it. Then, when it inevitably evolves, BA’s job is to capture the changed spec. The artifact is simply documentation as a client deliverable that’s then often never referenced or used for anything outside of maybe complaining that a feature doesn’t do what the spec claimed, or maybe as context when the client takes their whole project to another vendor to be rebuilt from scratch lol
It’s still lower level than a business analysis though so it’s not the same
Thank you for putting into words that which has been hard for me to describe. I’ve noticed the worse a dev was at their job, the higher their opinion of AI seems to be. The textbook analogy (the trig book in your example) is a perfect frame of reference for why that might be the case…

To further that example, many people with the help of AI are ostensibly copy-pasting trig problems from the book without understanding the mechanics running through them, and labouring under the impression that they’ve become closer to skilled mathematicians.

The other thing could also be true: if you are a great developer who spent a decade honing your craft (vim, working on hobby projects, grinding while your friends party), you would hold a cognitive bias against it, as it flips the script. I don't think our profession is going away, but the shift is happening and it's not a very comfortable one.
There was a time back in the 1980s (and probably before) when "analyst" paid better than "programmer". The programmer wrote the code; the analyst figured out what the code was supposed to do to meet the business need.

In my view, "programmer" merged with "analyst" to become "software engineer".

Coding is the easy part, huh? Sure, buddy.
> My personal experience: writing code has always been the easy part. AI does most of that now.

That's exactly why I don't have AI writing my code. It is doing the easiest part of the job (making symbols appear in the text), which isn't actually valuable to me. A good tool should help me to do hard things, not easy things.

I seem to only have discussions about architecture with it.
> What seems clear to me is that expectations around velocity and throughput will increase (are increasing).

This is why I don’t understand why folks around here (that are employed) feel so enthusiastic about AI. We are going to be working more in a rush to produce stuff that we won’t be feeling as proud of as we did before AI. Unless you were in the profession for the money, the delights of crafting software simply go away and AI is pushing us closer to be just… well, I don’t know, but I don’t like it. Sure thing, if you are a CEO, this new state of things must be wonderful

There was a recent interview with Dax Raad on the Pragmatic Programmer podcast, and they talked briefly about it. We would like a future where we do just a bit more work and are happier with legacy codebases or work on getting rid of tech debt, but that definitely won't be something our employers are interested in.
I didn't know modern (2015-2026) software engineers were making such a strong distinction between "writing code" and "designing solutions". It's not as if the majority of engineers "design" and then hand the implementation over to others (at least I've never seen that).

From my experience, a typical software engineer needs to understand the business (e.g., knowing who your users are), design a solution (e.g., we probably need an event-driven architecture right here) and write the code (e.g., we should use SELECT ... FOR UPDATE SKIP LOCKED to avoid over-claiming). They are all equally challenging, imho.
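
As a rough illustration of that last example: in databases that support it, such as PostgreSQL, workers pulling from a shared jobs table can claim rows with SELECT ... FOR UPDATE SKIP LOCKED, so a row already locked by one worker is skipped by the others instead of being claimed twice. The sketch below is not a database client. It only emulates the skip-locked idea in plain Python, and the `JobQueue` class, its methods, and the job IDs are all hypothetical names made up for the illustration.

```python
import threading


class JobQueue:
    """Hypothetical in-memory stand-in for a jobs table with one lock per row."""

    def __init__(self, job_ids):
        self._locks = {job_id: threading.Lock() for job_id in job_ids}
        self._done = set()

    def claim(self, limit):
        """Claim up to `limit` jobs, skipping rows another worker holds."""
        claimed = []
        for job_id, lock in self._locks.items():
            if len(claimed) == limit:
                break
            # Non-blocking acquire: a locked row is skipped, not waited on.
            if not lock.acquire(blocking=False):
                continue
            if job_id in self._done:
                # Row was finished earlier; release it and move on.
                lock.release()
                continue
            claimed.append(job_id)
        return claimed

    def complete(self, job_id):
        """Mark a claimed job as finished and release its row lock."""
        self._done.add(job_id)
        self._locks[job_id].release()


queue = JobQueue([1, 2, 3, 4])
first_worker = queue.claim(2)   # takes the first two free rows
second_worker = queue.claim(2)  # skips the locked rows, takes the next two
print(first_worker, second_worker)  # [1, 2] [3, 4]
```

Without the skip, the second worker would either block on rows 1 and 2 or, with no locking at all, claim the same jobs again, which is the over-claiming the comment refers to.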

For me in large tech:

- Humans still own the code

- All code reviewed by humans

- LLM adoption varies across the org. Some are heavy users, some less so. Some are suspicious, some less so.

Where are we heading? Depends on model/harness capabilities. Likely some sort of mix where some projects will still require expert humans and others will just be vibe coded. How much we lean in that direction - we'll see.

How is that company doing?

I think that is a more important question, and one you shouldn't ignore.

do they have growing revenue?

And more important, how will they be doing in a year or two?
What are you writing that Claude is actually writing all of it? Every time I get past the greenfield stage, I just end up throwing out what it writes half the time since it's trash. Claude seems really great at "fix this unit test", "generate this boilerplate", "take this UML and build this framework out". But when I am doing refactorings, or implementing things that are beyond monotonous, I end up writing it all by hand. My best luck is still to do the design, query AI for possible choices, sketch out the framework of what I am writing, have AI critique my plan, and then have AI design individual methods, then fix what it writes.
What you say could be theoretically possible, but it's probably an issue with your usage of it. For example: if any of these hard, non-promptable projects is available on GitHub, or you've seen this problem in any large-scale GitHub project, you can share that. I've rarely seen a repo and a problem that Claude can't chew through with the right prompt.
People keep saying things like

> it's probably an issue with your usage of it

> I've rarely seen a repo and a problem that claude can't chew through with the right prompt

> a skill/PEBKAC issue

But then I remember how Anthropic couldn't fix the flickering issue for many months. It just does not compute.

Is it that people working at Anthropic can't prompt and it's a "skill issue" too? I mean, the terminal does not flicker in a lot of other complex TUI apps that I use every day - Midnight Commander, Emacs, tmux, etc. These are open source, Claude could be prompted to "just do what Midnight Commander does". So what is it?

What does a random bug in one LLM's frontend app have to do with learning how to do prompt engineering well?
Because it proves that even the greatest prompt engineers in the world are unable to vibe code their way out of a simple bug. The fact that this example is a small, annoying, random bug that is relatively harmless does not mean that the next bug will be as harmless, or even as apparent.
pretty big leap of faith there
Opencode had the same flickering issue not too long ago, and they fixed it by switching from ink to opentui, if I am not mistaken. So a solution is possible and known. Anthropic just doesn't seem to care.
I mean this with no disrespect, but

> Every time I get past the green field stage, I just end up throwing out what it writes half the time since it's trash.

Is a skill/PEBKAC issue. You still need to exercise engineering best practices like decomposing work into the smallest unit before taking a task on, brainstorming design first and implementation last, clearly defining your success criteria and requirements before beginning any work, etc.

I'm on a >10yr old codebase and have been able to get my org to orchestrate entire features, fully unit tested, e2e tested, storybooked, from scratch without touching an IDE. Refactorings and the endless mountain of 80% completed migrations from one pattern to another are now trivial to offload.

Point your SOTA du jour at the original docs and a few of the original examples/PRs, and have it draft a skill describing the work, the scope, and the success metrics. Iterate on the skill with the main agent, using subagents to test the skill, until you are happy with the result and it mostly gets it right with the guardrails you've defined. Again - keep the scope extremely small. It gives the agents much less rope to hang themselves with, and it is less cognitive load when you have to review/test the PR.

Then set up a reasonable cadence for it to run as an autonomous thread, and review the output once you get comfortable.

----

The issue I've been running into lately is simply that we've got so many PRs coming in that doing thorough human reviews on them is not sustainable relative to the rate at which the team is creating agents to open them. People (especially juniors and mid-level) are getting burned out by having entire days where they are just doing code reviews.

> What are you writing that Claude is actually writing all of it? Every time I get past the green field stage, I just end up throwing out what it writes half the time since it's trash.

For the current state of frontier models, you need to break the steps down so that the LLM follows the process the way you expect it to go (which is often different for everyone).

i.e., get it to agree to a spec, then get it to agree to a build plan, agree on unit test signatures, UI etc as needed, then let it build, ...

"Prompt engineering"

What role do you serve in this process?

I can take all of those steps, turn them into separate skills, then give them to a product manager or business analyst who makes half your salary but has far more knowledge about the customers' needs than you do.

the creative inputs to the whole process and judgment aspects of telling it how to refine and edit