
Deep Learning and the Entry Barrier to Software Development

Ever since Linus Torvalds provided the final piece of the puzzle by getting his Linux operating system self-hosted in 1991, we software developers have basically owned the means of production. Anyone with computer access could get an operating system and, thanks to the massive undertaking that is the GNU project, also a top-notch compiler (GCC) and editor (vi or Emacs). Later, with the birth of a viable platform for in-browser web applications, anyone with access to a web browser and a text editor could learn JavaScript.

This abundance of free resources had a great impact on the IT business:

  • Anyone who wanted to, and who had $120 to spend on a laptop, had free access to everything they needed to learn to code. This resulted in a massive widening of the talent pool.

  • Since everyone - even large commercial entities[1] - used the same tools, the skills transferred easily.

  • Professionals could retrain themselves with ease and at no cost beyond their time. New job requires you to write Python but you haven't written a line of Python in your life? No problem. Download and cram that stuff.

  • It was possible to go from idea to prototype very cheaply and quickly, resulting in a startup-friendly environment with low barriers to entry.

With deep learning this changes - not completely, but somewhat. Writing a new application is still something "two people in a garage" can do - but training a new Large Language Model is simply out of their reach, with training costs ranging from "only" millions of dollars (GPT-3[2]) to over a hundred million dollars (Google Gemini Ultra[3]).

For an open source analogy: it is as if the source code is available, but compiling it is such a computationally arduous process that only the IT megacorporations can do it.

On a smaller scale, anyone getting serious about developing AI tools is almost required to buy a GPU costing about $500 to run models locally, and that's just getting started. Hugging Face, in an article about LoRA, writes excitedly that "[t]he greater memory-efficiency allows you to run fine-tuning on consumer GPUs like the Tesla T4, RTX 3080 or even the RTX 2080 Ti!"[*] The RTX 2080 Ti costs about $1,000. This is not meant as a slight on Hugging Face: being able to run the fine-tuning on a machine with a four-digit price tag is cause for excitement, but it is meant as an example of where the threshold for entry sits. You can run things online as well, but then you pay per second.
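
To make that threshold concrete, here is a minimal sketch of what setting up such a LoRA fine-tune looks like with Hugging Face's transformers and peft libraries; the base model name and the hyperparameters below are illustrative assumptions, not values from the Hugging Face article. The memory savings come from freezing the base weights and training only small low-rank adapter matrices.

```python
# Minimal LoRA fine-tuning setup (sketch). Assumes torch, transformers and peft
# are installed, a CUDA GPU is available, and the base model fits in its memory.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base_model = "facebook/opt-1.3b"  # illustrative choice of a smallish causal LM

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.float16,  # half precision to fit on a consumer GPU
).to("cuda")

# LoRA: keep the original weights frozen and inject small trainable
# low-rank matrices into the attention projections.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of all parameters
```

From here the model can go into an ordinary training loop or transformers' Trainer; the point is that the trainable parameter count, and with it the GPU memory bill, shrinks by orders of magnitude - which is exactly why a four-digit consumer card becomes viable at all.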

It's impossible not to notice that the entry barrier has gone up; and for someone who lived through the era before open source and then saw the abundance of resources that replaced it, it is impossible not to be a little sad to watch that window of opportunity and cheap access close, even if only partway.

As for the future, I'm just going to make a prediction: gigantism is often the last stage in the evolution of a technology - if you can't make it better or smarter, just make more of it and crush your problem with sheer mass. I don't think these billion- or trillion-weight LLMs are going to keep scaling up. We're already seeing their limitations. We may therefore see a process whereby consumer technology catches up.