My very first job on the Mac project was to help Burrell Smith and Brian Howard verify that the disk controller was working properly. They had just wired up a Woz-style disk controller to the Macintosh prototype, but they had no way to test if it was working properly without writing a fairly complicated program.

When I arrived on the scene, they were trying to debug a small 68000 routine that they had written together. They had written a simple loop to access the disk controller and were watching it execute with a logic analyzer. But neither of them was much of a programmer, so I was able to help right away, even though I had never programmed the 68000 before.

The Apple II disk controller, designed six months after the Apple II itself was complete, was Steve Wozniak's crowning achievement. His five chip disk controller card out-performed competitive controllers that were four times as expensive by shifting most of the responsibilities from hardware to software. In Woz's approach, the software was responsible for doing all of the encoding and decoding, head stepping, etc. This allowed Woz to improve the capacity and performance over standard techniques.

In those days, most floppy disks used a recording technique called FM encoding, where one clock bit would precede each data bit to improve reliability. But that was overkill; it was possible to get more data on a disk if you used some of the clocks for data. So Woz used a technique called "group encoding" (although Woz called it "nybblizing") to get five bits out of every eight transitions instead of four. He later figured out how to use six bits out of eight (the "16 sector" format vs. the earlier "13 sector" - non-Wozian drives used 10 sectors), improving the density further.

The software stored eight transitions at a time into the disk controller's shift register. Since each transition was 4 microseconds long, each "nybble" of data had to be provided every 32 microseconds. Eventually, Woz figured out how to decode the data on the fly, in between fetching nybbles, so he could achieve the ultimate in performance, one to one interleave. But he never could figure out how to encode on the fly, since the Apple II's microprocessor just wasn't fast enough, and the timing had to be more precise for writing.
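The arithmetic in the last few paragraphs can be checked with a short Python sketch. The constants come straight from the text (4 microseconds per transition, eight transitions per nybble, and four, five, or six data bits per nybble). The function names are made up for illustration, and the rates ignore sector headers, gaps, and other formatting overhead.

```python
# Figures quoted in the text.
TRANSITION_US = 4            # microseconds per transition
TRANSITIONS_PER_NYBBLE = 8   # transitions shifted out per "nybble"

# Data bits carried by each group of eight transitions, per the text.
DATA_BITS = {
    "FM": 4,
    "5-of-8 (13 sector)": 5,
    "6-of-8 (16 sector)": 6,
}


def nybble_period_us():
    """Time the software has to supply (or consume) one nybble."""
    return TRANSITION_US * TRANSITIONS_PER_NYBBLE


def data_bits_per_second(bits_per_nybble):
    """Raw data rate, ignoring all formatting overhead (an idealization)."""
    return bits_per_nybble * 1_000_000 / nybble_period_us()


def gain_over_fm(bits_per_nybble):
    """Density relative to FM's four data bits per eight transitions."""
    return bits_per_nybble / DATA_BITS["FM"]


if __name__ == "__main__":
    print("nybble period:", nybble_period_us(), "microseconds")
    for name, bits in DATA_BITS.items():
        print(name, data_bits_per_second(bits), "bits/s,",
              gain_over_fm(bits), "x FM")
```

Under these assumptions, the six-of-eight scheme carries 1.5 times as much data as FM in the same 32-microsecond window, which is the density gain described above.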

But the Macintosh's microprocessor was at least four times faster than the Apple II, more like 10 times if you were manipulating 32 bit values, so writing one-to-one interleave disk routines was simple on the Macintosh. It was fun for me to achieve the holy grail of disk performance without breaking into a sweat. But first there was a big problem to solve.

Woz's disk technology required that the software feed it new data every 32 microseconds exactly. If we were even a single microsecond early or late, it would cause a glitch in the data and ruin it. In order to write the routines, I needed to know how fast the Macintosh executed each instruction. The manual gave the number of clocks for each instruction, but I wasn't sure how long it took to fetch from memory. So of course, I asked Burrell what the timings were, but I was surprised at his response.

"I don't know. The Mac is synchronous, just like the Apple II, so each instruction has the same timing, every time you execute it, so you will be able to write disk routines that have exact timing. I don't know what it is, so we'll just measure it. Why don't you write your routine and we'll measure it with the logic analyzer."

So I spent a couple of days writing the basic routines, and then sat down with Burrell and Brian in front of the logic analyzer and we watched each instruction execute, writing down how long each instruction took. They usually worked as we expected, but occasionally some things were surprising and I had to adjust the code. After a few fixes, I had the raw disk routines both reading and writing, doing the encoding and the decoding on the fly, achieving Woz's long-sought-after one-to-one interleave.

I expected to feel elated when I finally got the disk reading and writing, but it didn't feel that satisfying, because you couldn't really see it in action. The previous year, soon after Woz wrote his one-to-one interleave read routines, I made some fast slideshow disks for the Apple II, where the screen was filled with a new image in less than a second, twice as fast as previously possible. I thought it would be fun to use the new disk routines to read the slideshow disks on the Mac.

By this time, it was around 7pm and everyone else was going out to dinner. I was invited to come, but I was so close to getting the slideshow working that I didn't think I could concentrate on eating until I got it done. I was alone in the office when I finally got it working, the embryonic Mac reading and displaying images from an Apple II disk as fast as possible. It was far and away the coolest thing a Mac could do so far. It was fun to show it to everyone when they came back from dinner.

The description of 1:1 interleaving is not correct. The Apple ][ had no problem writing data forever without stopping, hence 1:1 interleaving was as easy as any other interleaving. Sectors were read in real time due to some impossible to explain [nearly impossible to do] code. But to do this I recall having to buffer the sector into certain RAM addresses, perhaps low RAM (the first 256 bytes). 6502 routines worked faster in this low RAM. Even if it wasn't low RAM, the 6502 read faster into known memory locations. After reading a sector, the data had to be moved to the desired location in RAM. This took longer than the time between sectors. Hence, reading was much improved by spacing data to run sequentially on every other sector, for a 2:1 interleaving ratio.
The key to a 1:1 interleaving was to read directly to the desired RAM locations, or to have a fast enough processor to move the sector's worth of data (128 bytes, or 256 bytes, I forget) to where it belongs in the short time between sectors of the floppy disk.
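
A toy model makes the interleave trade-off in the comment above concrete. This is a sketch, not a model of real Apple II timing: it assumes 16 sectors per track, measures time in units of one sector passing under the head, and uses a made-up `busy` figure for how long the processor spends moving each sector after reading it. All function names are hypothetical.

```python
def layout(n_sectors, interleave):
    """Return a list mapping physical slot -> logical sector number."""
    slots = [None] * n_sectors
    pos = 0
    for logical in range(n_sectors):
        while slots[pos] is not None:      # slot already used, slide forward
            pos = (pos + 1) % n_sectors
        slots[pos] = logical
        pos = (pos + interleave) % n_sectors
    return slots


def sector_times_to_read_track(n_sectors, interleave, busy):
    """Sector-times needed to read every sector in logical order.

    `busy` is the (assumed) time, in sector-times, that the processor
    spends moving a sector before it can start reading the next one.
    The head is assumed to start at the beginning of slot 0.
    """
    slots = layout(n_sectors, interleave)
    where = {logical: slot for slot, logical in enumerate(slots)}
    t = 0.0
    for logical in range(n_sectors):
        if logical > 0:
            t += busy                      # still moving the previous sector
        t += (where[logical] - t) % n_sectors   # wait for the sector to arrive
        t += 1                             # read it
    return t


def revolutions(n_sectors, interleave, busy):
    """Same result expressed in disk revolutions."""
    return sector_times_to_read_track(n_sectors, interleave, busy) / n_sectors


if __name__ == "__main__":
    for interleave, busy in [(1, 0.0), (1, 0.5), (2, 0.5)]:
        print(interleave, busy, revolutions(16, interleave, busy))
```

In this model, a 1:1 layout is ideal only when `busy` is zero, which corresponds to reading straight into the final memory location. With any copy time at all, the next sector has already started passing the head and the reader waits almost a full revolution for it, so a 2:1 layout finishes the track much sooner.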

from Henry Spragens on March 04, 2004 04:24:58

All this reminiscing caused me to dig my old Apple ][ outa the barn. Still runs. I did have to redo some of my code to access a 3.5" 800K drive from DOS 3.3, so my memory is refreshed on how this works. DOS 3.2 (13 sector) used a 2:1 interleave. For some reason DOS 3.3 (16 sector) shipped with 4:1 interleave, except COPY (Integer version), I believe still formatted at 2:1. I'm told 4:1 was chosen because that was optimum for Applesoft, which was much slower than Woz' Integer BASIC.
DOS has several parts, a command processor which hooked into the I/O stream (3D0G, 9DBFG bring back memories?), a File Manager which changes files to physical track-sectors, and RWTS which Reads-Writes-Tracks-Sectors by nybbilizing bytes from its buffer and sending them to disk. As Woz says, RWTS can read or write continuously a whole track. Obviously, there are pauses for the drive to step track to track, and track skew is another variable to optimize.
As shipped, the FM copies data 256 bytes at a time from application memory to a fixed buffer in RWTS, which then writes it to disk. On read, RWTS reads and denybbilizes 256 bytes into its buffer, and the FM then copies it to application memory. The trick is to modify the FM so it puts the target address into RWTS buffer pointers (self-modifying code) and let RWTS do its thing directly into program memory. This skips the copying of each sector twice between disk and program. Voila! 1:1 interleave!
The Apple Disk ][ is one of the cleverest hacks (in the good sense of the word) I've ever seen. BTW, Andy didn't mention that he managed to service the serial ports also while maintaining that 1:1 on the Mac.

from Philip Stephens on September 22, 2005 23:09:48

I remember many years ago I disassembled Locksmith's fast disk copy code to see how it worked. They figured out a way of decoding sectors that didn't result in the correct ordering of bits in the 256 bytes, but allowed them to re-encode the sector for writing fast enough to support a 1:1 interleave. It was very clever.
From what I could tell from this code, the problem with writing sectors from a fully decoded sector buffer using a 1:1 interleave was simply that the encoding scheme couldn't be implemented in few enough 6502 instructions.

from Brian Matthews on January 18, 2007 18:53:43

This stuff definitely brings back memories. I remember I hacked DOS to support 1:1 interleave and even had a DOS master disk I carried around so I could format disks with it.
I also remember working some late nights to figure out the format of the first sector, it was written differently, 4 bits per transition or something? I assume that was so the boot code on the card could read it. I finally figured it out and had a little program that could read and write the sector. I don't think I ever did anything with it though, once I had it figured out it wasn't fun any more. :-)

from Leo Lesniewicz on March 08, 2015 05:14:47

Back in the early 80's I was working for a computer dealer and was a hobbyist programmer on my Apple II. A friend of mine, Mike McLaren, was working for Legend Industries. They sold a card similar to the Apple Language Card, except with 64K of bank-switched memory. The one program power users needed more memory for back then was VisiCalc. In any case, one day I realized just how slow it was loading large binary files using DOS 3.3. It suddenly hit me that once I told DOS to bload, the first thing it did was read the file's Track/Sector List into a buffer. That TSL was really all I needed to know to load the file myself. So I patched DOS to jump to my own code after reading the TSL. My code took over reading through the TSL in memory and calling RWTS to load the file sectors several times faster than DOS would have. I thought it was kind of neat though limited, since it only sped up reading but not writing. I showed it to Mike and told him he could have the idea if it seemed useful. I believe it was a day later he not only had rewritten it to work on loads and bloads, but he also fit it into several small unused areas completely within DOS so it was loaded automatically. I don't know if they ever actually did anything with it, but it wasn't very long afterward that a competitor of theirs came out with Quick DOS, which sped up both DOS reading and writing and obsoleted my idea.