On LLMs, Hype, and Billionaire Assholes
I use LLMs as a daily tool in software development, and I have very mixed feelings about that. The hype makes honest assessment nearly impossible.
CEOs like Sam Altman have been remarkably successful at controlling the narrative: promising imminent superintelligence while burning through water, energy, and compute at extraordinary scale, boiling the ocean in the process. It's hard to evaluate a tool clearly when the people selling it are this loud and this willing to deceive. So here is my take on it all.
When ChatGPT first came out, I paid $20 per month for access while I evaluated it. Friends and colleagues were telling me it was pretty amazing. I was working through some genuinely hard problems at the time and it couldn't help with any of them. The chat interface felt clumsy and slow. I cancelled after a few months.
What changed my mind, partially, was Claude Code. Not because it's smarter in some abstract sense, but because it runs in my terminal, which is where I spend 90% of my working life. I've lived in the terminal since I started using Linux in the mid-90s, and a tool that works there is a fundamentally different thing than a chat window in a browser. Claude Code could try something, see what happened, adjust, and try again. That's the same feedback loop I use, just faster for certain tasks.
I tested its limits deliberately. I tried YOLO mode, turning it loose with a plan and minimal intervention, the way the enthusiasts describe, and got exactly zero results I'd ship. The outputs worked in the sense that they ran, but the bugs were subtle: the kind that don't surface immediately at runtime, yet that a junior developer reading the code would have caught before committing. The "wow" reaction people have watching an AI generate something from scratch is real, but it fades fast once you actually start using the thing.
What it's actually good for, in my experience, is narrower than the hype suggests: getting past the small frictions that quietly eat time. Bash syntax I can never quite remember. One-off scripts for a specific task I'll never need again. Boilerplate I'd otherwise write on autopilot. It doesn't make me faster at the hard parts of my job. It handles some of the tedious parts so I can stay focused on the parts that require actual thought.
Is that useful? Yes. Is it a revolution? Maybe? Is it worth the resource consumption and the hype and Sam Altman's ego polishing? Absolutely not!
That said, the capabilities keep growing, and I've noticed a real shift in the efficacy of the latest models. Claude has gotten genuinely impressive at writing Rust when I provide expectations and coding style guidelines. I can give it a fairly detailed plan, clear expectations, and a way to test and validate the result, and it will get the job done. More often than before, the abstractions are concise and meaningful. It's writing code because the code needs to exist, not because generating code is the goal.
The opposite failure mode is still real and worth mentioning specifically: LLMs are inclined to produce volume. It's very easy to end up with dozens of unit tests that dutifully exercise Rust's standard library rather than anything specific to your application. The model will cheerfully tell you "All 150 tests are passing!!", which says exactly nothing about whether your application works. Quantity is the wrong metric, because the hardest part of professional software engineering is always maintenance, and every one of those tests is more code to maintain.
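To make the test-volume point concrete, here is a minimal sketch of the difference. Everything in it is an assumption for illustration: `apply_discount` is a hypothetical application function, not code from any real project, and the two tests are hand-written stand-ins for the kind of output I'm describing, not something copied from a model.

```rust
/// Hypothetical application function (an assumption for this example):
/// applies a percentage discount to a price in cents.
/// Returns None for a percentage above 100 or on arithmetic overflow.
pub fn apply_discount(price_cents: u64, percent: u8) -> Option<u64> {
    if percent > 100 {
        return None;
    }
    // Integer division rounds the discount down.
    let discount = price_cents.checked_mul(u64::from(percent))? / 100;
    Some(price_cents - discount)
}

#[cfg(test)]
mod tests {
    use super::*;

    // Low-value test: it passes, but it only proves that Vec works.
    #[test]
    fn vec_push_increases_len() {
        let mut items = Vec::new();
        items.push(42);
        assert_eq!(items.len(), 1);
    }

    // Useful test: pins down behavior that is specific to this application.
    #[test]
    fn discount_rounds_down_and_rejects_bad_input() {
        assert_eq!(apply_discount(1000, 25), Some(750));
        assert_eq!(apply_discount(999, 33), Some(670));
        assert_eq!(apply_discount(1000, 101), None);
        assert_eq!(apply_discount(u64::MAX, 50), None);
    }
}
```

Both tests go green under `cargo test`, and both count toward the "150 passing" total. Only the second one would fail if someone broke the rounding or the input validation, which is the only reason a test is worth maintaining.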
I carry a small amount of unfounded guilt for finding genuine utility in something I think is mostly wasteful and overhyped. But I think that's actually the honest engineer's response to it. As engineers we can't control the hype because we aren't billionaire assholes looking for more money and more power and more fame. What we can do is keep learning, think critically, and distill truth from the madness.