Let's discuss: future of refactoring in the era of LLMs

For a long time we have designed our software to value backward compatibility and to minimise breaking changes.

This is true not only for widely used open-source software but also for internal software. The fact that companies maintain large monorepos precisely so they can sustain a fast rhythm of changes without breaking unknown parts of the codebase is a good illustration of how important it is to be able to refactor without fear.

Even in the pre-LLM era we had the technical ability to make some changes without fear, but we never really adopted it. For over 20 years we have had tools that implement a number of deterministic refactorings for typed languages. If a change consisted only of a sequence of such deterministic steps, we could automatically generate codebase migration scripts and ship them together with new versions of our libraries. We could perhaps even search GitHub for all usages of our libraries and programmatically produce PRs with the needed changes.
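
As a rough sketch of what one such deterministic step can look like, here is a naive rename codemod built on the libcst library. The function names are invented for illustration, and a real tool would resolve scopes and imports rather than matching identifier text:

    import libcst as cst

    class RenameFunction(cst.CSTTransformer):
        """Rewrite every bare reference to old_name into new_name.

        Deliberately naive: a production refactoring would be
        scope-aware instead of matching on identifier text alone.
        """

        def __init__(self, old_name: str, new_name: str) -> None:
            self.old_name = old_name
            self.new_name = new_name

        def leave_Name(self, original_node: cst.Name,
                       updated_node: cst.Name) -> cst.Name:
            if original_node.value == self.old_name:
                return updated_node.with_changes(value=self.new_name)
            return updated_node

    source = "result = fetch_data(url)\n"
    module = cst.parse_module(source)
    print(module.visit(RenameFunction("fetch_data", "fetch_records")).code)
    # -> result = fetch_records(url)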

Today, with all the progress in LLMs, it's not hard to imagine distributing automatically generated migration prompts instead of (or alongside) automated migration scripts. And for open-source code we can actually validate the results of such a migration: pull all the code that depends on our library, apply the migration prompt, build it, and run its tests. Anything that can be validated this way usually reaches very high confidence levels with LLMs.
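
A validation loop for such prompt-driven migrations might look roughly like the sketch below. The repository list, the my-llm-agent CLI, and the pytest invocation are all placeholders for whatever agent and build system a real setup would use:

    import subprocess
    import tempfile
    from pathlib import Path

    # Hypothetical downstream repositories that depend on our library.
    DEPENDENT_REPOS = [
        "https://github.com/example/consumer-a",
        "https://github.com/example/consumer-b",
    ]

    MIGRATION_PROMPT = (
        "Rename every call to fetch_data to fetch_records and update "
        "imports accordingly. Do not change any other behaviour."
    )

    def apply_migration(repo_dir: Path, prompt: str) -> None:
        # Placeholder: hand the checkout and the prompt to whatever
        # LLM coding agent is in use and let it edit files in place.
        subprocess.run(["my-llm-agent", "--prompt", prompt],
                       cwd=repo_dir, check=True)

    def migration_passes(repo_url: str) -> bool:
        with tempfile.TemporaryDirectory() as tmp:
            repo_dir = Path(tmp) / "repo"
            subprocess.run(["git", "clone", "--depth=1", repo_url,
                            str(repo_dir)], check=True)
            apply_migration(repo_dir, MIGRATION_PROMPT)
            # Assumes a Python project; substitute each repo's real
            # build and test commands here.
            tests = subprocess.run(["python", "-m", "pytest"],
                                   cwd=repo_dir)
            return tests.returncode == 0

    for url in DEPENDENT_REPOS:
        print(url, "ok" if migration_passes(url) else "FAILED")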

It's not hard to imagine even better LLMs in the future, capable of far more complex refactorings. That could remove much of the fear associated with change, which in turn would significantly improve the speed of innovation in old projects where many historical decisions are cemented into the public interface.

This is the future of software engineering I would want to see. What are your thoughts?

P.S. Migrating from one programming language to another should become a matter of a "prompted change" at some point too!

1 point | by ohcmon 3 hours ago

3 comments

  • codingdave 2 hours ago
    If that future came to fruition, we'd stop hearing "it works on my machine" and start hearing "it worked with my LLM".

    The problem with refactoring code, re-writing codebases, and other such major work is not the effort of re-coding. It is that you lose code that has been battle-tested in production over years of use: millions of man-hours of beating on it and turning it into a product that has survived every real-world edge case, attack, and scaling concern in its history.

    When you re-write code for the sake of re-writing code, you throw all that out and have a brand new codebase that needs to go through all the production pain all over again.

    So no: the trend I'm hearing, of people thinking code will just become an output of an LLM-driven build process, sounds quite naive to me.

    • ohcmon 2 hours ago
      > you lose the code that has been battle-tested

      I agree that this is still the most important thing, and I'm not trying to challenge it.

      At the same time, we have largely adopted the habit of bumping our dependencies whenever a new version does not include breaking changes (especially if there are known security vulnerabilities). My point is exactly about that: why do even simple renames, extractions, flattenings, or other mechanical changes have to be treated so differently from internal changes that do not touch the public interface?

  • pancsta 1 hour ago
    You should always distribute AST-level refactorings, which are deterministic, instead of prompts. You can easily generate those with a prompt, but then nothing really changes here besides who writes the migration (human vs. LLM).
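
    To make that concrete, a library release could ship a machine-readable manifest of deterministic operations instead of prose. A hypothetical shape for such a manifest (the operation names and symbols are invented for illustration):

        # Hypothetical migration manifest shipped with a release. Each
        # entry names a deterministic AST-level operation that any
        # conforming codemod tool (or an LLM) could apply mechanically.
        MIGRATIONS = [
            {"op": "rename_function",
             "from": "fetch_data", "to": "fetch_records"},
            {"op": "move_symbol",
             "from": "mylib.internal.helpers.retry", "to": "mylib.retry"},
        ]
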
  • seg_lol 2 hours ago
    This is an essay, not a conversation starter. You aren't asking any questions. Maybe you should try blogging.