The Festival Floppies — September 23, 2016

In 2009, Josh Thomson was walking through the Timonium Hamboree and Computer Festival in Baltimore, Maryland. Among the booths of equipment, sales, and demonstrations, he found a vendor was selling an old collection of 3.5″ floppy disks for DOS and Windows. He bought it, and kept it.

A few years later, he asked me if I wanted them, and I said sure, and he mailed them to me. They fell into the everlasting Project Pile, and waited for my focus and attention.

They looked like this:

I was particularly interested in the floppies that appeared to be someone’s compilation of DOS and Windows programs in the most straightforward form possible – custom laser-printed directories on the labels, and no obvious theme as to why this shareware existed on them. They looked like this, separated out:

There were other floppies in the collection, as well:

They’d sat around for a few years while I worked on other things, but the time finally came this week to spend some effort to extract data.

There’s debates on how to do this that are both boring and infuriating, and I’ve ended friendships over them, so let me just say that I used a USB 3.5″ floppy drive (still available for cheap on Amazon; please take advantage of that) and a program called WinImage that will pull out a disk image in the form of a .ima file from the floppy drive. Yes, I could do a flux imaging of these disks, but sorry, that’s incredibly insane overkill. These disks contain files put on there by a person and we want those files, along with the accurate creation dates and the filenames and contents. WinImage does it.

Sometimes, the floppies have some errors and require trying over to get the data off them. Sometimes it takes a LOT of tries. If after a mass of tries I am unable to do a full disk scan into a disk image, I try just mounting it as A: in Windows and pulling the files off – they sometimes are just fine but other parts of the disk are dead. I make this a .ZIP file instead of a .IMA file. This is not preferred, but the data gets off in some form.

Some of them (just a handful) were not even up for this – they’re sitting in a small plastic bin and I’ll try some other methods in the future. The ratio of Imaged-ZIPed-Dead were very good, like 40-3-3.

I dumped most of the imaged files (along with the ZIPs) into this item.

This is a useful item if you, yourself, want to download about 100 disk image files and “do stuff” with them. My estimation is that all of you can be transported from the first floor to the penthouse of a skyscraper with 4 elevator trips. Maybe 3. But there you go, folks. They’re dropped there and waiting for you. Internet Archive even has a link that means “give me everything at once“. It’s actually not that big at all, of course – about 260 megabytes, less than half of a standard CD-ROM.

I could do this all day. It’s really easy. It’s also something most people could do, and I would hope that people sitting on top of 3.5” floppies from DOS or Windows machines would be up for paying the money for that cheap USB drive and something like WinImage and keep making disk images of these, labeling them as best they can.

I think we can do better, though.

The Archive is running the Emularity, which includes a way to run EM-DOSBOX, which can not only play DOS programs but even play Windows 3.11 programs as well.

Therefore, it’s potentially possible for many of these programs, especially ones particularly suited as stand-alone “applications”, to be turned into in-your-browser experiences to try them out. As long as you’re willing to go through them and get them prepped for emulation.

Which I did.

The Festival Floppies collection is over 500 programs pulled from these floppies that were imaged earlier this week. The only thing they have in common was that they were sitting in a box on a vendor table in Baltimore in 2009, and I thought in a glance they might run and possibly be worth trying out. After I thought this (using a script to present them for consideration), the script did all the work of extracting the files off the original floppy images, putting the programs into an Internet Archive item, and then running a “screen shotgun” I devised with a lot of help a few years back that plays the emulations, takes the “good shots” and makes them part of a slideshow so you can get a rough idea of what you’re looking at.

You either like the DOS/Windows aesthetic, or you do not. I can’t really argue with you over whatever direction you go – it’s both ugly and brilliant, simple and complex, dated and futuristic. A lot of it depended on the authors and where their sensibilities lay. I will say that once things started moving to Windows, a bunch of things took on a somewhat bland sameness due to the “system calls” for setting up a window, making it clickable, and so on. Sometimes a brave and hearty soul would jazz things up, but they got rarer indeed. On the other hand, we didn’t have 1,000 hobbyist and professional programs re-inventing the wheel, spokes, horse, buggy, stick shift and gumball machine each time, either.

Just browsing over the images, you probably can see cases where someone put real work into the whole endeavor – if they seem to be nicely arranged words, or have a particular flair with the graphics, you might be able to figure which ones have the actual programming flow and be useful as well. Maybe not a direct indicator, but certainly a flag. It depends on how much you want to crate-dig through these things.

Let’s keep going.

Using a “word cloud” script that showed up as part of an open source package, I rewrote it into something I call a “DOS Cloud”. It goes through these archives of shareware, finds all the textfiles in the .ZIP that came along for the ride (think README.TXT, READ.ME, FILEID.DIZ and so on) and then runs to see what the most frequent one and two word phrases are. This ends up being super informative, or not informative at all, but it’s automatic, and I like automatic. Some examples:

Certainly in the last case, those words are much more informative than the name D4W20 (which actually stands for “Destroyer for Windows Version 2.0”), and so the machine won the day. I’ve called this “bored intern” level before and I’d say it’s still true – the intern may be bored, but they never stop doing the process, either. I’m sure there’s some nascent class discussion here, but I’ll say that I don’t entirely think this is work for human beings anymore. It’s just more and more algorithms at this point. Reviews and contextual summaries not discernible from analysis of graphics and text are human work.

I and others will no doubt write more and more complicated methods for extracting or providing metadata for these items, and work I’m doing in other realms goes along with this nicely. At some point, the entries for each program will have a complication and depth that rivals most anything written about the subjects at the time, when they were the state of the art in computing experience. I know that time is coming, and it will be near-automatic (or heavily machine-assisted) and it will allow these legions of nearly-lost programs to live again as easily as a few mouse clicks.

But then what?

But Then What is rapidly becoming the greatest percentage of my consideration and thought, far beyond the relatively tiny hurdles we now face in terms of emulation and presentation. It’s just math now with a lot of what’s left (making things look/work better on phones, speeding up the browser interactions, adding support for disk swapping or printer output or other aspects of what made a computer experience lasting to its original users). Math, while difficult, has a way of outing its problems over time. Energy yields results. Processing yields processing.

No, I want to know what’s going to happen beyond this situation, when the phones and browsers can play old everything pretty accurately, enough that you’d “get it” to any reasonable degree playing around with it.

Where do we go from there? What’s going to happen now? This is where I’m kind of floating these days, and there are ridiculously scant answers. It becomes very “journey of the mind” as you shake the trees and only nuts come out.

To be sure, there’s a sliver of interest in what could be called “old games” or “retrogaming” or “remixes/reissues” and so on. It’s pretty much only games, it’s pretty much roughly 100 titles, and it’s stuff that has seeped enough into pop culture or whose parent companies still make enough bank that a profit motive serves to ensure the “IP” will continue to thrive, in some way.

The Gold Fucking Standard is Nintendo, who have successfully moved into such a radical space of “protecting their IP” that they’ve really successfully started moving into wrecking some of the past – people who make “fan remixes” might be up for debate as to whether they should do something with old Nintendo stuff, but laying out threats for people recording how they experienced the games, and for any recording of the games for any purpose… and sending legal threats at anyone and everyone even slightly referencing their old stuff, as a core function.. well, I’m just saying perhaps ol’ Nintendo isn’t doing itself any favors but on the other hand they can apparently be the most history-distorting dicks in this space quadrant and the new games still have people buy them in boatloads. So let’s just set aside the Gold Fucking Standard for a bit when discussing this situation. Nobody even comes close.

There’s other companies sort of taking this hard-line approach: “Atari”, Sega, Capcom, Blizzard… but again, these are game companies markedly defending specific games that in many cases they end up making money on. In some situations, it’s only one or two games they care about and I’m not entirely convinced they even remember they made some of the others. They certainly don’t issue gameplay video takedowns and on the whole, historic overview of the companies thrives in the world.

But what a small keyhole of software history these games are! There’s entire other experiences related to software that are both available, and perhaps even of interest to someone who never saw this stuff the first time around. But that’s kind of an educated guess on my part. I could be entirely wrong on this. I’d like to find out!

Pursuing this line of thought has sent me hurtling into What are even musuems and what are even public spaces and all sorts of more general questions that I have extracted various answers for and which it turns out are kind of turmoil-y. It also has informed me that nobody kind of completely knows but holy shit do people without managerial authority have ideas about it. Reeling it over to the online experience of this offline debated environment just solves some problems (10,000 people look at something with the same closeness and all the time in the world to regard it) and adds others (roving packs of shitty consultant companies doing rough searches on a pocket list of “protected materials” and then sending out form letters towards anything that even roughly matches it, and calling it a ($800) day).

Luckily, I happen to work for an institution that is big on experiments and giving me a laughably long leash, and so the experiment of instant online emulated computer experience lives in a real way and can allow millions of people (it’s been millions, believe it or not) to instantly experience those digital historical items every second of every day.

So even though I don’t have the answers, at all, I am happy that the unanswered portions of the Big Questions haven’t stopped people from deriving a little joy, a little wonder, a little connection to this realm of human creation.

I do have a box full of somewhat interesting-looking old floppies. I’d love to do something similar with them. Do you make these automation scripts publicly available online? Maybe on a git repository such as on GitHub?

If there’s original disks in them (especially copy protected ones) I strongly suggest using a proper flux-based dump to make sure the copy protection stays intact. There are people out there who will help you do this (most of them even for free)
For all other disks, the WinImage approach (heck, even the good old Linux ‘dd’ approach) is good enough. But make sure that the floppies are write-protected when you insert them, Windows has a bad habit of rewriting the boot sector and probably more even when you’re just trying to read from the disk

It reminds me when I was recovering 5.25″ floppies salvaged from basement of some software company… All of them infected by fungus. Generally by using large amounts of alcohol-based solution it was possible to make the disk readable once (raw-imaging and then relocation on image) and drive’s head still usable, but I had to keep all windows in my house open for 3 days to get rid of smell.
If there are bad sectors in disk on a file, and it cannot be retried, there is a last possibility: A relocation, like NDD did.
The image files are very handy, especially that they allow other types of analysis, not only related to files. There is only one problem – no user-friendly (modern user who doesn’t want to use a tty) tool to open them in Linux. OK, it’s possible to mount them… and destroy metadata by side effects of file manager (e.g. files with thumbnails, view types, last modified etc.). So I got a bit mad about it and I’m developing a GPL software with MTools in the back, only to work with disk images different ways and with a clear information what user wants to do with the image (and when it is modified, it is by user’s initiative, not the OS).

[…] But so much of the site seems scattered and fragmented. The breadth of the archive’s goals are frankly overwhelming, and of varying levels of quality and completeness. Much as a physical archive may contain a folder of letters from a given person, archive.org has collections of floppies that belonged to an individual donor, or even that were at one point acquired in a pile. In thinking about this very issue recently, Jason Scott recognized the value in preserving this kind of content but posed a crucial question: ‘But then what?’ […]