This is a really open question. I'm not after an answer, only advice. Any past experience with refactoring legacy systems that people could pass on would be amazing. Here's some information about the software I am to refactor (you'll cringe!):

Web application

Database driven (MySQL)

PHP4 and PHP5

Most of the code is PHP4

Nearly all the code is procedural

Code that is PHP5 isn't OO. Example: a 10,000+ line file with one class and one function

Global variables used everywhere

No source control was used to write the software (you can see from the comments in the code)

Massive amounts of code repetition

No separation of concerns: user interface and logic are combined everywhere

Application relies on order of database tables

Few code comments

250,000+ lines of code

Application in heavy use

Basically, the software is our core product and I have been hired to do a major refactor (amongst other things). It's a massive task and I can't just dive in and fix all the little things; I need an overall strategy. I've written some scripts to tidy up indentation, removed commented-out code everywhere and put the project into a repo, but now it's time to do the real work.

I have a vague idea but am not sure how to go about it. I could leave the current code alone and write a layer of software over it that abstracts away all the horribleness. It would be good if the new layer had some sort of MVC architecture. At the same time I would go into the current code and remove redundancies, because otherwise the new layer would be built on bad code and the application could slow down even more.

1 Answer

Reverse engineer some user epics; not all the stories, but the "big picture" suites of functionality that comprise the system.

Prioritize based on quality, replaceability, value, etc., etc. This is subjective, but you need to pick a piece to work on.

Write unit tests, as best you can, that the legacy code passes.
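One way to read this step is as characterization testing: record what the legacy code does today, quirks included, and hold any replacement to that recording. The sketch below is in Python only to keep it short and runnable; the same shape carries over to PHP. `legacy_format_price`, `RECORDED` and `mismatches` are hypothetical names made up for illustration, not anything from the system in the question.

```python
def legacy_format_price(amount):
    # Hypothetical stand-in for a legacy routine.
    # Its quirk: it truncates to two decimals instead of rounding.
    if amount is None:
        return "0.00"
    return "%.2f" % (int(amount * 100) / 100.0)


# Expected values captured by running the legacy code once,
# not by reading a specification.
RECORDED = {
    None: "0.00",
    0: "0.00",
    1.5: "1.50",
    2.999: "2.99",
}


def mismatches(fn, recorded):
    """Return (input, expected, actual) for every recorded case fn gets wrong."""
    found = []
    for arg, expected in recorded.items():
        actual = fn(arg)
        if actual != expected:
            found.append((arg, expected, actual))
    return found
```

Running `mismatches` against the legacy function should return an empty list. Running it against a rewrite shows exactly which recorded behaviours changed, so you can decide case by case whether a difference is a bug fix or a regression.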

Write new code for just that piece.

Now comes the hard part. Bridges.

You need to deploy the new part and maintain the legacy while you're moving forward. You'll need to "bridge" data from legacy to new (and new back to legacy).
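A bridge can be as small as a pair of translation functions that sit between the two data shapes. This is a minimal sketch in Python, assuming a legacy row that arrives as a positional tuple with a 'Y'/'N' status flag. The `Customer` type, the column order and the flag values are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    # Hypothetical shape of a record on the new side.
    id: int
    name: str
    active: bool


def from_legacy(row):
    """Translate a legacy positional row into the new model."""
    # Assumed legacy layout: (cust_id, cust_nm, status), status is 'Y' or 'N'.
    cust_id, cust_nm, status = row
    return Customer(id=int(cust_id), name=cust_nm.strip(), active=(status == "Y"))


def to_legacy(customer):
    """Translate the new model back into the legacy row layout."""
    return (str(customer.id), customer.name, "Y" if customer.active else "N")
```

Keeping every translation in one place means that when the legacy side is finally retired, the bridge is a single module to delete rather than conversions scattered through the new code.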

You'll also have to write a bunch of Apache rewrite rules to redirect requests to the new code so that the URLs don't appear to change too much or too disruptively. Some change is inevitable as you rewrite. Some change is disruptive to users and requires mod_rewrite, 301 redirects, or both.
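The decision those rewrite rules encode is simple: a growing list of URL prefixes is served by the new code, and everything else still goes to legacy. The sketch below models only that decision, in Python for brevity. It is not Apache configuration, and the prefixes are hypothetical.

```python
# Hypothetical list of URL prefixes already migrated to the new code.
MIGRATED_PREFIXES = ("/invoices", "/customers")


def backend_for(path):
    """Return 'new' if the path belongs to a migrated area, else 'legacy'."""
    for prefix in MIGRATED_PREFIXES:
        # Match the prefix itself or anything beneath it, but not
        # look-alike paths such as /invoices-old.
        if path == prefix or path.startswith(prefix + "/"):
            return "new"
    return "legacy"
```

Each deployed epic adds an entry to the list. When nothing falls through to the legacy branch any more, the old code can be switched off.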

Once you've got that first epic (set of stories) deployed in production, you need to then rethink your epics and decomposition of what's left in the legacy that's still valuable. Reprioritize.

Write unit tests for the next epic. Rewrite the code. Create and revise the bridges. Deploy the next chunk.

You'll repeat the "partition, prioritize, test, refactor, deploy" loop until what's left over as legacy is of so little value that you no longer need any bridges. Then you decommission the old. Delete the code. Burn the bridges. No going back.