How to rebuild a lot of Windows boxes (idea)

Due to all the Y2K hype going around at the time, one or maybe several of the PHB management types decided it would be a good thing if every workstation in the company was rebuilt with Y2K compliant software. This was about April of 1999, which was sort of late to be starting compared with what other companies had done. Somehow, I ended up as the second in command of the whole rebuild thing.

This wasn't as high a position as you might at first think. I did a bit of planning, but mostly I did all the grunt work; I had become a professional installer of Windows. Fortunately, my boss attended the meetings with the PHB types, and I just solved problems and wrote procedures all day.

So the first thing I was asked to do was to check around and see how other companies had handled rebuilding 1300 Windows 95 and NT4 workstations. I couldn't find anything, so we had to make it all up ourselves. That's mostly the reason I'm writing this; hopefully someone can learn from it and leverage my experience.

We tried to do a lot of planning. We examined a few different scenarios, and tried to pick the one that would be the most cost effective and allow us to get finished fastest without interrupting the normal work of our users. We couldn't afford to have one of our departments shut down because we needed to rebuild their computers.

The scenarios we examined:

Build a machine in the build room, swap it with a user's machine during the day.

Go out on the floor during the day, and rebuild during business hours.

Rebuild at night, after everyone had gone home.

The first option was out, because we didn't have enough machines in stock to do that on a large scale. Replacing ten machines a day or whatever would have been too slow. It would have also interrupted the user's work, since we would have to shut them down in the middle of the day, copy files, set up email, and so on.

The second option was totally out, because we would make it hard for lots of people to do their jobs. The third option was the one we went with. Although it was going to be expensive because of the overtime, it was the fastest way to get things done. There were no users to worry about, and overtime was probably cheaper than the cost of putting a department out of business for a few days.

Building Machine Images

One of the things I did was write procedures on how to install Windows from scratch and apply all the fixes and set all the settings just so. The goal was to get as much as possible done before we began deploying machines, so that the time it would take to deploy a single machine would be as small as possible. I think that the checklist for an average machine with no special CAD stuff came out to about seventy items. The actual installation of Windows counted for maybe six of those. Creating a build from this checklist took something like three hours, if you had all the drivers already, and if you had already done it a million times while you were writing and testing the procedure.

Once we had a build, we used Norton Ghost to copy the contents of the Windows partition to a file, which was stored on a network drive. Because of the differences in machine models, we had to create a Ghost file for each different type of machine, so that we didn't have to install drivers after we started deploying the builds. Fortunately, the machines were mostly of three major types, so it wasn't too much trouble.

All of the software installed after the rebuild had to be approved by the Y2K department. This was sort of a pain in the ass, but it was the whole point of the thing after all. Y2K tended to take their time to get stuff approved. We waited for approved software lists from Y2K for half the bloody summer!

There was a large amount of software that needed to be reinstalled on some machines. The Ghost builds all contained standard company-wide software, but each department had its own set of software. For some departments, we created Ghost builds with all the software required by users in that department. This was done when the department was large enough and the software installation was long and complicated, such that significant time savings were gained. The rest of the software was repackaged (with Wise Installer and later SMS Installer) so that it could be installed with the options we wanted without any user prompting. This was one of the things that helped keep our man hours under control.

Even with the packages, we still had to write a lot of documentation about how to install the software. Things like the network location, options if any, that sort of thing. I ended up spending my days writing software installation documentation for the evening's rebuilds, if I wasn't patching a build or solving some other weird problem. A sheet with installation instructions became part of the kit handed out to team members every night.

Since by this time it was mid-June, and the PHB types wanted the whole thing done by the end of September, we had to really hurry. This meant lots of overtime. We had some eighteen hour days, trying to get everything working.

Each machine that was to be rebuilt had a little checklist. This had the machine's name, its location, a very short, point-form deployment checklist, and a list of additional software that needed to be installed. People were supposed to check the items off as they went, and then the checklists were collected and saved as part of a record of that machine having been rebuilt.

The rebuild teams consisted of about 3 to 8 people. Each member of the team would get four or five machines to do. We tried to give people groups of machines that were physically close together so that members could do more than one machine at a time. One of our team members was phenomenal; she could do 4 or 5 rebuilds at a time, and get finished before some people had finished doing two.
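The "keep each person's machines close together" idea can be sketched in a few lines of Python. This is only an illustration: the function name `assign_machines`, the `(name, location)` tuples, and the `per_person` parameter are all made up for the example, and sorting by a location string is a crude stand-in for real physical proximity.

```python
def assign_machines(machines, team, per_person=5):
    """Hand out machines so each person gets a run of nearby ones.

    machines: list of (machine_name, location) tuples (hypothetical format).
    team: list of team member names for the evening.
    per_person: maximum machines per person (four or five in the writeup).
    """
    # Sorting by location puts machines from the same area next to each other.
    ordered = sorted(machines, key=lambda m: (m[1], m[0]))
    assignments = {member: [] for member in team}
    for i, machine in enumerate(ordered):
        member_index = i // per_person  # consecutive chunks go to one person
        if member_index >= len(team):
            raise ValueError("more machines than this team can take tonight")
        assignments[team[member_index]].append(machine)
    return assignments
```

Calling `assign_machines(machines, ["Ann", "Bill"], per_person=5)` returns a dictionary keyed by team member, which could be printed onto each person's kit for the night.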

Each night, every member of the rebuild team for the evening was handed a kit. This kit contained: software installation instructions for things they were expected to install that evening, inventory sheets (to verify our inventory database), checklists for each machine they were to visit, and a deployment procedure.

The deployment procedure, the instructions for rebuilding a user's machine, went something like this:

Change the user's password, log in as them, and copy the following files to their user drive: *.xls, *.ppt, *.doc, *.pab, *.ost, *.pst. These are Microsoft Office and MS Outlook file formats. We also saved favorites. Since people weren't supposed to save junk on their hard drive anyway, and since they were warned well in advance that their machine was to be rebuilt, we didn't worry about anything else.

Rebuild the machine, i.e., use Ghost to copy the pre-built image down to the hard drive.

Change the PC name to match our naming scheme. If it was an NT machine, we would then also run a SID update (duplicate SIDs are the reason Microsoft doesn't like you to use a tool like Ghost to deploy machines) and bring the machine into the domain.

Copy back the user's files, which all got dumped into C:\My Documents, except for the favorites.

Leave the user a phone message telling them their new password and other information about the rebuild. This was something like:

This is ___(your name)___ from IS&T. We rebuilt your machine tonight, and we had to change your password at that time. Your new password is ___(whatever)___, a-b-c-d..., all in lowercase. If you require additional software or if you have any questions, please call the helpdesk.
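The file-saving step of the procedure above could be scripted roughly as follows. This is a minimal sketch, not the method described in the writeup, which does not say how the copy was actually done. The function name `backup_user_files` and both path arguments are hypothetical, favorites are not handled, and the destination is assumed to live outside the source tree (for example, on the user's network drive).

```python
import shutil
from pathlib import Path

# Extensions from the deployment procedure: Office documents and Outlook data.
EXTENSIONS = {".xls", ".ppt", ".doc", ".pab", ".ost", ".pst"}


def backup_user_files(source_root, dest_root, extensions=EXTENSIONS):
    """Copy matching files from source_root to dest_root, keeping subfolders.

    Returns the sorted list of destination paths that were written.
    """
    source_root = Path(source_root)
    dest_root = Path(dest_root)
    # Build the list first so files we create are never re-scanned.
    wanted = [
        path
        for path in source_root.rglob("*")
        if path.is_file() and path.suffix.lower() in extensions
    ]
    copied = []
    for path in wanted:
        target = dest_root / path.relative_to(source_root)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(path, target)  # copy2 also preserves timestamps
        copied.append(target)
    return sorted(copied)
```

Matching on the lowercased suffix means REPORT.DOC and report.doc are both picked up. Note that this sketch keeps the folder layout on the way out, whereas the writeup's restore step flattened everything into one folder.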

Working out the bugs

Every morning when I went to work, I dreaded finding out that we had blown away some top level executive's email archives or a design that some engineer had saved on his hard drive that was critical to the success of the company. The trip to work and the checking of my email and voice mail was probably the most stressful part of my day, at least once we got the technical and procedural bugs worked out.

There were a few procedural bugs. For example, the machine checklists, which had important information on them like the machine name, its location, and what additional software was required, were sometimes not ready in time. We finally solved this by getting the department secretary to get them ready for us.

The biggest technical problem was my attempt to use the multicast feature of Norton Ghost. Multicast was just too unreliable and too slow. Sometimes you couldn't connect machines to the multicast session, and sometimes it would stop halfway through. The first full scale test, with helpdesk people involved, was mostly a disaster: it was too error-prone, and we didn't get the whole department done as planned. I realized that multicast wasn't the solution I had hoped it would be. One of our helpdesk guys, Bill, really saved my bacon by making boot disks up for an easy single session Ghost. This worked much better, but it meant that we had to make a lot of boot disks and that they all had to have different NetBIOS names on them. Once those disks were in use, things really got going. Thanks, Bill!
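Coming up with a distinct NetBIOS name for every boot disk is easy to automate. The sketch below only generates the names; it does not write anything to a disk. The writeup doesn't say how the names were chosen, so the `GHOST` prefix and the function name `boot_disk_names` are hypothetical. The one hard rule it checks is that a NetBIOS computer name can use at most 15 characters.

```python
NETBIOS_MAX_LEN = 15  # usable characters in a NetBIOS computer name


def boot_disk_names(count, prefix="GHOST"):
    """Return `count` unique names such as GHOST001, GHOST002, ..."""
    # Zero-pad to at least three digits so the names sort nicely.
    width = max(3, len(str(count)))
    names = [f"{prefix}{i:0{width}d}".upper() for i in range(1, count + 1)]
    too_long = [name for name in names if len(name) > NETBIOS_MAX_LEN]
    if too_long:
        raise ValueError(f"names exceed {NETBIOS_MAX_LEN} characters: {too_long[:3]}")
    return names
```

For example, `boot_disk_names(3)` returns `['GHOST001', 'GHOST002', 'GHOST003']`, one name to label and configure on each disk.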

Aftermath

We managed to get about 95 percent of the machines in the company rebuilt by the end of September. The rest were special purpose machines, servers, or machines that we couldn't find (hence the inventory verification). I've heard that it's taken other companies over a year to do that many machines. Of course, they were probably just going from Windows 3.11 to Windows 95 and not really in a big rush, but I'm proud of getting it done that fast, especially since we only worked in the evenings.

Now they're talking about deploying Windows 2000. Fortunately, I think a saner approach is being taken, in that machines will be upgraded when they are replaced or need to be rebuilt. Except that Office 2000 will be going out at the same time, and there are going to be version incompatibilities between the old and the new. Oh god....