Working with Legacy Code

By Ahmed Ismail

Chances are if you’ve worked with software, you’ve heard the term “legacy code.” If you were to ask your buddy sitting at the desk across from you (theoretically at least — thanks, COVID) what the definition of legacy code is, they might give you a brain dump that includes things like:

– Code you’ve gotten from someone else

– Code that hasn’t been touched in living memory

– Code you’re afraid to touch

– Code that’s written by someone who is no longer around

– Code without tests

– Code without tests …

Code without tests? Cue the feeling of guilt when you realize you might be writing legacy code right now. Without tests, no matter how well written, encapsulated, or object-oriented your code is, it will be difficult to change its behavior quickly and verifiably. Without tests, you cannot be certain whether the code is getting better or worse. By contrast, clean, tested code is easier to understand, easier to refactor, and easier to extend with new features.

If you’ve read Michael Feathers’s Working Effectively with Legacy Code, you may remember the two distinct ways of changing code that it describes. The first is “Edit and Pray.” This seems to be the industry-standard approach. It refers to when you read a requirement, carefully plan the changes to be made, make sure you understand the current code, and then make the changes. If you’re asking yourself what’s wrong with that, then you may need to continue that line of questioning with, “How will you know that you’ve made those changes correctly and that you haven’t broken anything?”

On the other side of this is “Cover and Modify.” This is like changing code with a safety net. It refers to covering the code with tests first, so that you can make changes quickly and find out soon afterward whether their effects were good or bad.

If that analogy didn’t work for you, another that might tickle your fancy is the software vise. A vise is a tool that holds an object firmly in place while work is done on it. A software vise is a set of tests that detect change, so that the behavior of the code is fixed in place. Placing this type of vise around our code gives us more control over our work.
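To make the vise concrete, here is a minimal sketch in Python of what Feathers calls a characterization test: a test that records what the code does today, not what you think it should do. The function legacy_shipping_cost and its pricing rules are invented for illustration; in a real codebase you would call the existing function and copy the values it actually returns into the assertions.

```python
import unittest


def legacy_shipping_cost(weight_kg, express):
    """Hypothetical legacy function, invented for this example."""
    cost = 5.0
    if weight_kg > 10:
        cost += (weight_kg - 10) * 0.5
    if express:
        cost = cost * 2 + 1  # nobody remembers why the +1 is here
    return round(cost, 2)


class ShippingCostCharacterization(unittest.TestCase):
    """Pins down current behavior, quirks included, before any change."""

    def test_light_standard_parcel(self):
        self.assertEqual(legacy_shipping_cost(2, False), 5.0)

    def test_heavy_standard_parcel(self):
        self.assertEqual(legacy_shipping_cost(14, False), 7.0)

    def test_heavy_express_parcel(self):
        # The odd +1 surcharge is recorded as-is, not "fixed" in the test.
        self.assertEqual(legacy_shipping_cost(14, True), 15.0)
```

Saved to a file, these run with python -m unittest. If a later refactoring changes any of these numbers, the vise has done its job: you find out right away, and you get to decide whether the change was intended.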

I know that you’ve read 101 blog posts about the benefits of TDD and testing. The benefits are well known: it can lead to faster development, more robust code, and less unneeded code (YAGNI), and it makes it easier to find and fix bugs, add features, and so on.

But can you apply these test-first techniques when you’re trying to debug or optimize legacy code? Or even when attempting to add a new feature or refactor? In short, yes, although when you inherit an older codebase, the rules do change a bit.

To apply the test-first methodology, you’ll have to give up a few things. For example, forget about achieving 100% test coverage (though that can quickly become a vanity metric anyhow). I am not saying that you don’t need tests. On the contrary, more tests are better than fewer. But the idea of perfect test coverage needs to get thrown out of the window.

Presumably, when you inherit a legacy codebase, the application is already (at least mostly) working. Starting with a broad test that covers a large portion of the application is not a bad idea. It can be a great first step in helping you notice if any of your changes broke something. Software developers have a finite amount of time to spend writing tests — your product manager or business analyst will agree. It may be a good idea to head for broad coverage instead of spending your nights writing unit tests.

If you’re used to doing TDD, you may be used to switching context somewhat frequently — such as switching between writing failing tests and writing your production code until all tests pass. In a legacy codebase, most of the production code has already been written. So you may need to write many tests before switching back to writing your business logic. Of course, if you’re adding something new, you can always go with the practice of writing a unit test first.

So far, there’s been a hint of a dilemma that you may have already noticed: When code is changed, tests should be in place; but to put tests in place, code has to change. This is where a little bit of ingenuity and creativity comes in. But it doesn’t have to be complex. Start by writing one test that is as broad as possible. You may call it a smoke test or an end-to-end test. Whatever you call it, make sure you automate it.
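One possible shape for such a broad test is a snapshot comparison, sometimes called a golden master: run the application through a high-level entry point, save the output once, and fail whenever a later run differs. The sketch below is only an illustration under assumptions. legacy_invoice_report stands in for whatever entry point your application really has, and the helper name and snapshot file name are made up.

```python
from pathlib import Path


def legacy_invoice_report(orders):
    """Hypothetical stand-in for a high-level legacy entry point."""
    lines = []
    total = 0
    for name, quantity, unit_price in orders:
        subtotal = quantity * unit_price
        total += subtotal
        lines.append(f"{name}: {quantity} x {unit_price} = {subtotal}")
    lines.append(f"TOTAL {total}")
    return "\n".join(lines)


def check_against_golden_master(actual, snapshot_path):
    """Record the output on the first run; compare against it afterwards."""
    path = Path(snapshot_path)
    if not path.exists():
        path.write_text(actual, encoding="utf-8")
        return True
    return path.read_text(encoding="utf-8") == actual


# Representative input; in practice, use as wide a sample as you can afford.
SAMPLE_ORDERS = [("widget", 3, 250), ("gadget", 1, 1999)]


def test_report_matches_golden_master(snapshot_path="invoice_report.golden.txt"):
    output = legacy_invoice_report(SAMPLE_ORDERS)
    assert check_against_golden_master(output, snapshot_path), "report output changed"
```

The first run writes the snapshot file, so review it by hand and commit it alongside the test. From then on, any change to the report text fails the test, whether you meant it or not, which is exactly the early warning a broad test is for.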

If you want an algorithm, you can find a simple one in Working Effectively with Legacy Code:

  1. Identify change points.

  2. Find test points.

  3. Break dependencies.

  4. Write tests.

  5. Make the changes and refactor.

All of these steps have been talked about ad nauseam, so I won’t bore you (any more than I already have!) with the details. If you are interested, you should pick up the book, which I’ve linked at the bottom. It’ll go into many different situations and dependency-breaking techniques such as how to get a stubborn class/method into a test harness, how to extract an interface, or how to use mocks/stubs/fake objects.
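For a taste of what steps 3 and 4 can look like, here is a small Python sketch that combines two of those techniques: extracting an interface and passing the dependency in through the constructor so a fake can replace it in tests. Every name here (Mailer, SmtpMailer, FakeMailer, OverdueNotifier) is hypothetical, and the 30-day rule is invented; the assumption is a class that used to create its own mail connection internally and was therefore impossible to test without sending real email.

```python
from typing import Optional, Protocol


class Mailer(Protocol):
    """Extracted interface: the only thing the notifier needs from a mailer."""

    def send(self, to: str, subject: str) -> None: ...


class SmtpMailer:
    """Production implementation (hypothetical); the real SMTP call is omitted."""

    def send(self, to: str, subject: str) -> None:
        raise NotImplementedError("real SMTP call omitted in this sketch")


class FakeMailer:
    """Test double that records messages instead of sending them."""

    def __init__(self) -> None:
        self.sent = []

    def send(self, to: str, subject: str) -> None:
        self.sent.append((to, subject))


class OverdueNotifier:
    def __init__(self, mailer: Optional[Mailer] = None) -> None:
        # Existing callers keep working: the default is still the real mailer.
        self.mailer = mailer if mailer is not None else SmtpMailer()

    def notify(self, invoices):
        """Email every customer more than 30 days overdue; return the count."""
        count = 0
        for invoice in invoices:
            if invoice["days_overdue"] > 30:
                self.mailer.send(invoice["email"], "Invoice overdue")
                count += 1
        return count


def test_only_invoices_past_30_days_are_notified():
    fake = FakeMailer()
    notifier = OverdueNotifier(mailer=fake)
    invoices = [
        {"email": "a@example.com", "days_overdue": 45},
        {"email": "b@example.com", "days_overdue": 5},
    ]
    assert notifier.notify(invoices) == 1
    assert fake.sent == [("a@example.com", "Invoice overdue")]
```

The change to the production class is deliberately tiny: one optional constructor argument. That keeps the risk of the untested edit low while opening a seam wide enough to get the class under test.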

If there is anything to learn about working with legacy code, it may be that there is no one strategy to rule them all. Some tests are better than none, so don’t let perfect be the enemy of good.

Katas are a great way to practice your legacy-code wrangling skills. Here are a few to check out:

Trip Service


Legacy Train

Birthday Greetings

Gilded Rose

On a related note, if you’re working in a legacy codebase and are not feeling inclined to refactor someone else’s work, check out why refactoring matters.