Our role as software engineers is on the cusp of a fundamental transformation, driven by now widely available large language models (LLMs), deployed as generative AI. (I’ll avoid for now the question of whether this is actually artificial intelligence, but there’s no question these LLMs have some impressive capabilities.) As I write this, ChatGPT is the dominant name in this game, but alternatives such as Google Bard are quickly gaining ground.
This revolution in software engineering is unstoppable. LLMs have become so good that we would be foolish not to train them to craft quality code. Any IT department or software engineering company that refuses to touch AI will inevitably fall behind and fade into irrelevance. Any developer who refuses to touch it will soon become replaceable.
This is all going to either augment our ability to tackle the most challenging problems the world presents us, or it’s going to accelerate the rate at which we can deploy horrible, damaging, dangerous code into the wild.
All of this has happened before
When I was young and building my first websites, I used Dreamweaver, a tool which promised to automate website development. You could choose a template or create your own. Add some copy and images. Make some WYSIWYG adjustments. And Dreamweaver would spit out the HTML and CSS for your site.
Dreamweaver worked, sort of. The generated code was generally an unmaintainable mess, so I’d always go in and edit the HTML before deploying. It was still a lot faster than writing raw HTML and CSS from scratch.
What the tools have done is free developers from the more repetitive tasks of developing a simple website, while catching some of their more common errors before or as they make them. This has sped up the development of the easy, familiar stuff that goes into a website, letting human developers focus on the more exciting challenges and bespoke capabilities.
The tools haven’t replaced the people. They’ve simply augmented their abilities.
LLMs get loquacious
I don’t have to tell you that LLMs and generative AI have hit a watershed moment. GPT-3, GPT-4, ChatGPT, Google Bard, and others have broken out from fringe curiosity to cultural obsession. Sure, there’s some hype, but also a lot of really interesting innovation happening as people find clever ways to apply these new technologies.
At my own company, we believe in investing in quality tools that make our engineers’ lives easier and more efficient, so we recently purchased GitHub Copilot licenses for all our engineers. (We’re watching the development of Amazon CodeWhisperer with interest.) GitHub Copilot is built on OpenAI Codex, created by the same organization that developed GPT-4 and ChatGPT. So Copilot is basically a cousin of ChatGPT that specializes in software engineering.
Here’s how our engineers are using GitHub Copilot:
- Streamlining tedious tasks: Copilot allows us to more quickly complete tedious tasks such as filling out variables in an interface or templating a function or behavior.
- Intelligent autocomplete: Copilot is excellent at intuiting what an engineer is going for and completing it for them. This is particularly helpful for writing repetitive code or test cases, where Copilot can generate the test itself and accurately predict the next condition under the test.
- Detecting patterns and generating error-handling code: Copilot can detect new patterns in code and rewrite it accordingly, as well as generate error-handling code for functions.
Our engineers have found GitHub Copilot to be a force multiplier, saving them time and increasing their efficiency. However, it's important to note that Copilot isn't foolproof and will frequently generate incorrect code. (The folks at GitHub and OpenAI are fully transparent about this and warn us not to trust the code until a qualified human software engineer has checked it.)
Did I say “incorrect” code? I mean truly random, weird, garbage code that will make you throw up in your mouth if you have any sense of good programming style.
In one early experiment, I asked ChatGPT (not Copilot) to implement C’s notorious strcpy(), and it built me a function that was vulnerable to buffer overflow attacks. It fixed the problem when I asked it to, mind you, but the naive approach it took was nonetheless bad. My engineers have also shared with me generated code samples that, while functional, would have been a nightmare to maintain, refactor, or extend.
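To make that anecdote concrete, here is a minimal sketch in C of the kind of problem I mean. It is not the code ChatGPT actually produced, which I haven't reproduced here. The names naive_copy and bounded_copy are hypothetical, invented purely for illustration, and the bounded version is just one reasonable way to close the hole, not the only one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Naive copy in the style of strcpy(): it keeps writing until it reaches
 * the terminating NUL in src and never asks how much room dst has.
 * If src is longer than the destination buffer, it writes past the end. */
char *naive_copy(char *dst, const char *src)
{
    char *out = dst;
    while ((*out++ = *src++) != '\0') {
        /* one byte per iteration, including the final NUL */
    }
    return dst;
}

/* Bounded copy (hypothetical helper, not a standard library function).
 * Writes at most dst_size bytes in total and always NUL-terminates.
 * Returns 0 on success, or -1 if dst_size is 0 or src had to be truncated. */
int bounded_copy(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL);
    if (dst_size == 0) {
        return -1; /* no room even for the terminator */
    }
    size_t src_len = strlen(src);
    size_t n = src_len < dst_size - 1 ? src_len : dst_size - 1;
    memcpy(dst, src, n);
    dst[n] = '\0';
    return src_len == n ? 0 : -1;
}
```

Calling naive_copy with an 8-byte buffer and a 16-character string is undefined behavior, while bounded_copy on the same input truncates the string and reports the truncation to the caller. Spotting that difference is exactly the sort of review a qualified human still has to do.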
Human developers are not obsolete… not even close.
Generate accelerated insanity
Generative AI—whether GitHub Copilot, Amazon CodeWhisperer, Google Bard, or whatever else emerges from the coming singularity before I can finish typing these words with my human meat sticks—will amplify what we choose to value most in software engineering.
For developers, departments, and organizations that prioritize speed over quality, LLMs will accelerate the rate at which they can hack horrible software together and deploy it into production. We’ll be cleaning up the mess this makes for the next decade if it continues.
For those who view code as a commodity and engineers as cogs in a code-generating machine, LLMs will automate the assembly of derivative, unimaginative software that addresses already well-understood needs with conventional solutions. This will lead to stagnation, with no sustainable gains.
But for those who value innovative solutions to interesting problems—crafted with quality and validated with critical thinking—LLMs offer a subtler, more satisfying potential. Human engineers and generative AI can each bring their strengths to a hybrid partnership that prioritizes crafting secure, stable, sustainable solutions to important challenges.
Unlocking this potential will require us to train LLMs for quality. Meanwhile, we humans will have to learn how to get the very best out of our new collaborators, while bringing our own unique contributions to the team.
The Iron Age
We’re in the very early days of this AI revolution. LLMs are in their raw iron ingot stage: so many possibilities held within them, but still just big rocks that we can hit together to make some noise. We have to forge the steel sword. Or the plowshare. Or whatever tools we decide are best suited to our needs and values.
For me, that means training these models to generate code that is secure, stable, scalable, extensible, maintainable, highly available, clean, and maybe even well styled. We need to train them to do test-driven development. And most importantly, we need to train them to be better copilots for their human captains.
Those with different values than mine might choose instead to build a fully automated code-assembling machine that will completely commoditize our craft and condemn humanity to at best derivative and often dangerous systems and software. That would be a tragic path for our industry to follow, squandering transformative new possibilities while accelerating vulnerabilities and fatal flaws into the world.
A decade from now, I don’t want my company’s revenue to rest on cleaning up the dystopian dump of irresponsibly applied AI. We’d rather be doing our part to advance humanity’s finer ambitions, not rebuilding in the aftermath of our automated folly.
I assume smart people are already at work training these generative AIs to be quality collaborators with human engineers, as I’m recommending. If they aren’t, sign me up for the project. Because, if we do this right, we’re not going to make good engineers obsolete. Instead, we’re going to make good engineers into absurdly good cyborg hybrid engineers, mind melded with our machines, able to solve problems and create solutions more powerful than either humans or AI could ever build alone.
I, for one, welcome our robot collaborators. They’re not here to take our jobs. But if we train them well and apply them responsibly, they are going to make us better.