Adaptive and destructive refactoring
Refactoring is changing the structure of a program without altering its behaviour. There are two types of refactoring.
- Adaptive refactoring
  Gradually change the program to the shape you want, while preserving the original behaviour in each step of the process.
  This method is essentially a composition of small refactoring steps, where each step strictly obeys the refactoring rule: it changes the program while preserving the original behaviour.
- Destructive refactoring
  Write the shape you want directly and then fix what's broken.
  This is a more straightforward approach, but only the change as a whole obeys the refactoring rule; the intermediate steps don't.
These methods are not mutually exclusive. Refactoring is recursive: each step can be broken into smaller steps that are themselves either destructive or adaptive.
Consider a variable rename. If you do it manually, changing the identifier breaks the program, and it stays broken until you update all its usages. It's a series of destructive steps that together form one adaptive step. This distinction matters more on a larger scale, though.
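To make the distinction concrete on a small example, here is a minimal sketch in Python; the `create_user` function and the boolean-to-role change are hypothetical, invented only for illustration. The destructive route would change the signature in place and then chase down every broken caller; the adaptive route adds the new shape next to the old one and keeps the program working the whole time.

```python
def create_user_with_role(name: str, role: str) -> dict:
    # the shape we actually want
    return {"name": name, "role": role}


def create_user(name: str, admin: bool) -> dict:
    # temporary shim: old signature, original behaviour, now expressed through the new function;
    # callers are migrated one by one, and deleting this shim is the final tiny step
    return create_user_with_role(name, "admin" if admin else "member")
```

Every intermediate state still compiles and behaves like the original, which is exactly what makes the sequence adaptive.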
Both methods evaluated
Each of these approaches has its pros and cons.
Adaptive Method
Pros
- It's easier to make sure you don't break something accidentally
- You can stop anytime, and you don't have a broken program
Cons
- Finding the right composition of steps is a puzzle of its own; sometimes it's too complicated to be worth the time
- Intermediate states can leave the code in a worse shape than it was before
- The path can be very long, even when it's not difficult to find which steps to take and each step itself is simple
- Sometimes a transformation to the exact end shape isn't even possible
Destructive Method
Pros
- Usually the shortest path from start to finish (semantically)
- You can write the shape you want directly
Cons
- It's easy to break something unintentionally
- As the size of the change grows, so does the probability that something breaks. This probability quickly approaches 100%
- Intermediate states are broken and unusable
How to choose?
Sometimes it's difficult to tell which method is optimal. Here are some rules of thumb I found helpful:
- Prefer adaptive refactoring by default
  - Even though it can take longer, it's usually less risky
- If intermediate states in an adaptive option are improvements on their own, go adaptive
  - Similarly, if intermediate states make the code temporarily worse, start thinking about the destructive approach
- If you have good tests that cover all the relevant things that can break, you can go destructive
  - If you don't have good tests, adaptive is usually the way to go
- If you are not sure what the final result should look like, or how some things should be implemented, go adaptive
- If you go destructive, it's a good idea to keep the original code, write the new code on the side first, and then use some fuzzing to compare the results of both versions to make sure the old and new behaviour match (see the fuzzing sketch after this list)
  - If you have this option, destructive refactoring often becomes optimal. But it's more work and it bloats the code during the process. You want to keep the migration period as short as you can.
  - A/B testing or a gradual rollout with feature toggles is also an option, especially for larger changes
- Try to minimize destructive steps
  - As I said, refactoring is recursive, so if the destructive method seems like the only way to go, first do as many adaptive changes as you reasonably can to make the final destructive step less risky.
  - Also try to keep the breakage period as short as possible. Write the thing you want, delete the thing you don't want, and then fix all errors as fast as you can, even if the fixes are ugly. After that, you can go back to safer adaptive refactoring.
- For adaptive refactoring, it can be useful to make a commit after each step, even a tiny one. This makes it easy to find the change that accidentally broke something. Once I made 200 commits in a day this way, but it was worth it, because finding the problem I introduced was trivial. You can always squash these commits in the end.
  - (By the way, if this wasn't clear already: when using the destructive method, you don't merge broken states into the main branch.)
- Destructive refactoring is too tempting. Be aware of that.
  - It's a form of rewrite, and we know that rewrites are often not a good idea. We don't like the system in its current shape, but we usually underestimate the complexity of our envisioned rewrite and the time it takes to complete.
  - Rewrites often stay in a limbo state for a very long time, because we find many edge cases that were not obvious in the original solution, and getting back to a working state takes much longer than expected.
  - For the same reason, they often end up nowhere when priorities change. At that point, we would be much better off in the middle of an adaptive change, which is at least closer to the final form while still being functional.
- Avoid the puzzle trap in adaptive refactoring
  - People who like puzzles (like me) can sometimes spend ages trying to find the perfect combination of adaptive refactoring steps when a small destructive step would be just fine.
- A good IDE (and language) makes the adaptive method more effective
  - Some refactoring steps are especially prone to subtle errors; this is where actions like extract/inline function or various boolean logic transforms are invaluable (see the condition-inversion sketch after this list)
  - It also allows adaptive steps to be larger
  - It replaces some destructive steps with adaptive ones. Without an IDE, some actions (like changing the order of parameters) can only be done as destructive steps
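Here is the fuzzing comparison mentioned above as a minimal, hedged sketch: both versions live side by side during the migration and get hammered with random inputs. The `old_normalize`/`new_normalize` functions and the input generator are made-up stand-ins, not code from any real project.

```python
import random


def old_normalize(text: str) -> str:
    # original implementation, kept around during the migration
    return " ".join(text.split())


def new_normalize(text: str) -> str:
    # rewritten implementation whose behaviour we want to verify
    return " ".join(part for part in text.split() if part)


def fuzz_compare(runs: int = 10_000) -> None:
    # generate short random strings over a small alphabet and demand identical results
    alphabet = "ab \t\n"
    for _ in range(runs):
        sample = "".join(random.choice(alphabet) for _ in range(random.randint(0, 30)))
        assert old_normalize(sample) == new_normalize(sample), repr(sample)


if __name__ == "__main__":
    fuzz_compare()
    print("old and new behave the same on all sampled inputs")
```

A property-based testing library (such as Hypothesis) can take over the input generation and shrinking, but even a plain loop like this catches most behaviour drift.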
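And here is the condition-inversion sketch referenced in the IDE point; the retry logic and names are invented for the example. The mechanical De Morgan transform that an IDE applies preserves behaviour, while a hasty manual inversion often flips the operator but not every comparison.

```python
def should_retry(status: int, attempts: int) -> bool:
    # original condition
    return not (status == 200 or attempts >= 3)


def should_retry_ide(status: int, attempts: int) -> bool:
    # mechanical De Morgan transform: behaviour-preserving
    return status != 200 and attempts < 3


def should_retry_by_hand(status: int, attempts: int) -> bool:
    # typical manual slip: the operator was flipped, but the second comparison was not,
    # so early failures (e.g. status=500, attempts=1) are wrongly not retried
    return status != 200 and attempts >= 3
```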
Closing with an analogy
One other way to look at it:
- As the amount of change grows...
  - cost grows faster for adaptive refactoring
  - risk grows faster for destructive refactoring

Adaptive refactoring is closer to putting your money into a savings account. Destructive refactoring is more like investing in a startup.