Failure Modes of AI in Software Development

Everything old is new again.

1. The Cheap Developer Fallacy

I can get ten cheap developers for the same price as one expensive one, so obviously I'll be better off.
My ten cheap developers are not getting anything done. They're producing lots of code, but it doesn't work and it's not converging to anything. I need more cheap developers.
It's not working. I need a cheap manager.
My cheap manager just made things worse. Now the cheap developers are completely out of control, and I can't get my cheap manager to execute my plan, which I still have to come up with myself.

We know where this line of thinking led. A lot of people seem willing to run down that well-trodden path, but now with "agents" instead of cheap developers.

2. You Just Have to Review the Code

It used to be that everyone wrote code in C or C++, and we'd have bugs that would lead to exploits - in particular, memory-safety bugs. The wisdom was that you needed "good enough programmers" who would "check every return code" and "test things well".

Boy did we ever have bugs. Turns out nobody was "good enough".

Now we have coding agents that produce code that fulfills the requirements but takes shortcuts and does some Clever Hans stuff, which leads to it failing in different ways. The wisdom is that you need "good enough code reviews" that "check all the code" and "test things well".

Been there, done that.

Humans are really bad at reviewing code, just as we are worse at debugging than we are at writing new code. Having the machine write the code and the human review it therefore gets things backwards: the machine excels at reviewing unfamiliar code it is airdropped into, while the human takes forever just to build up enough context to start and then can't stay focused for long enough.