FUTURE DATA 8" DISK FORMAT
Chuck Guzis
Sydex, Inc.
July, 2013
This is a summary of what I can remember from when I processed a number of
Future Data disks 10 years ago. Time does not always preserve accuracy, so
this is what I've been able to deduce from the code I wrote back then.
1. Low-level format.
The disks that I was given were standard 32-sector 8" single-sided
hard-sectored floppies. The "hard sectored" aspect appears to be a red
herring as sector boundaries do not correspond to physical sector holes. My
guess is that the drive was set up to output sector holes on a different pin
than that used to signal the index hole. Shugart, Qume and Siemens drives
can be jumpered to do this, which allows the use of either hard- or
soft-sectored floppies.
The modulation (recording scheme) is neither FM, MFM nor MMFM, so a
conventional disk controller can't be used to read these disks. I used a
Catweasel Mk 1 (ISA) controller to provide a histogram of flux changes.
When these are accumulated and charted, three distinct peaks at t, 2t and 4t
will be noticed.
Since this isn't standard MFM (groupings would be at t, 1.5t and 2t), what
each time period represents requires a bit of guessing.
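A minimal sketch of the histogram step, assuming the controller software
hands back a plain list of flux-to-flux intervals measured in sample ticks.
The function names and the sample data are made up for illustration; this is
not the code used at the time.

```python
from collections import Counter

def interval_histogram(intervals):
    """Count how often each flux-to-flux interval (in ticks) occurs."""
    return Counter(intervals)

def find_peaks(hist, min_count=1):
    """Return interval values that are local maxima of the histogram."""
    peaks = []
    for value in sorted(hist):
        count = hist[value]
        if count < min_count:
            continue  # ignore sparsely populated bins
        if count >= hist.get(value - 1, 0) and count > hist.get(value + 1, 0):
            peaks.append(value)
    return peaks

# Made-up intervals clustered around t, 2t and 4t with t = 20 ticks.
sample = [19, 20, 20, 20, 21, 39, 40, 40, 41, 79, 80, 80, 81]
peaks = find_peaks(interval_histogram(sample), min_count=2)
```

With real data the clusters are broader, so some smoothing or wider bins
would probably be needed before picking peaks.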
One of the first things that I do when I get a strange format disk is
attempt to determine where sector headers start and where sector data
starts. This is usually quite visible just from looking at the histogram
data returned--along with a lot of guessing. Eventually, a pattern emerges.
Sector headers almost always have the cylinder (track) number embedded
somewhere--this is the way that seek errors are usually detected. The great
thing about these is that they'll stay the same for all the sectors on a
track and increment by one for each following track.
Guessing what t, 2t and 4t stand for is a real head-scratcher. It's
definitely not MFM, as the timings would be t, 1.5t and 2t. So a natural
guess would be group code. Since group code relies on keeping the average
flux change frequency low, a good guess would be t='1', 2t='01' and
3t='001'. (It's a little more complicated than this and involves a fair
amount of scribbling and head-scratching, but hopefully you get the idea).
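The first-guess mapping can be sketched as follows. It assumes a known base
period t (in the same ticks as the intervals) and simply rounds each interval
to the nearest multiple of t; the function name is hypothetical, and as noted
above the real decoding is a little more involved.

```python
def intervals_to_bits(intervals, t):
    """Turn flux intervals into a bit string, one '1' per flux change.

    An interval of about n*t becomes (n - 1) zeros followed by a one,
    so t -> '1', 2t -> '01' and 3t -> '001'.
    """
    bits = []
    for interval in intervals:
        n = max(1, round(interval / t))  # nearest whole number of periods
        bits.append("0" * (n - 1) + "1")
    return "".join(bits)
```

For example, intervals of 20, 41, 19 and 60 ticks with t = 20 give the bit
string 1011001.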
It's pretty safe to assume that, if this is group code, a 5-to-4 bit
mapping is being used, as we're probably dealing with an 8-bit byte system
here. (Other schemes are certainly possible.)
So what does the map look like?
Well, one way is to look at the stream of raw bits (decoded as above) on a
track. Sectors, even with hard-sectored disks, usually have some sort of ID
information with each sector. There's also some sort of synchronization
pattern preceding the sector ID information. It's entirely reasonable to
assume that sector IDs contain sector numbers and the increment is 1 from
sector to sector. Further, it's a safe bet to suspect that the track number
is part of the ID information so that seek errors can be detected. And it's
also a pretty safe bet that all sector ID fields on the same track have the
same cylinder number. Given that, with a bit of staring and guessing, we
can determine the 5-bit group codes for each 4-bit data group. All of that
will reveal itself within the first 16 cylinders.
What I came up with is a table that looks like this:
11001 = 0
11011 = 1
10010 = 2
10011 = 3
11101 = 4
10101 = 5
10110 = 6
10111 = 7
11010 = 8
01001 = 9
01010 = A
01011 = B
11110 = C
01101 = D
01110 = E
01111 = F
The sole exception to the table of legal values above is the
synchronization pattern before the ID field and data field start, which is a
string of at least 10 '1' bits (this is an illegal string of group code).
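The table and the sync rule can be sketched like this. Two things are
assumptions and not something I can confirm from the notes: that the first
5-bit group of each pair is the high nibble of the byte, and that the caller
has already worked out where the first group after the sync burst begins
(the end of a run of '1' bits is ambiguous, since a legal group can itself
start with '1'). The function names are made up.

```python
import re

GCR_TO_NIBBLE = {
    "11001": 0x0, "11011": 0x1, "10010": 0x2, "10011": 0x3,
    "11101": 0x4, "10101": 0x5, "10110": 0x6, "10111": 0x7,
    "11010": 0x8, "01001": 0x9, "01010": 0xA, "01011": 0xB,
    "11110": 0xC, "01101": 0xD, "01110": 0xE, "01111": 0xF,
}

def find_sync_runs(bits, min_ones=10):
    """Return (start, end) index pairs for each run of >= min_ones '1' bits."""
    return [m.span() for m in re.finditer("1{%d,}" % min_ones, bits)]

def decode_gcr(bits):
    """Decode a bit string aligned on a 5-bit group boundary into bytes.

    ASSUMPTION: the first group of each pair is the high nibble.
    Raises KeyError if a group is not in the table; trailing bits that
    do not fill a whole group, or an odd final nibble, are ignored.
    """
    nibbles = [GCR_TO_NIBBLE[bits[i:i + 5]] for i in range(0, len(bits) - 4, 5)]
    return bytes((hi << 4) | lo for hi, lo in zip(nibbles[0::2], nibbles[1::2]))
```

No two legal groups placed side by side produce more than 8 consecutive '1'
bits, which is why a run of 10 or more can only be a sync burst.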
After the synchronization burst, 7 bytes of ID information follow. For
sector IDs, the second and fourth bytes are the cylinder number and sector
number respectively. I didn't bother discovering the significance of the
other bytes in the ID field, but two are probably a CRC or checksum.
When this is decoded, it's apparent that each cylinder has 52 sectors and
that the sector ID numbers are 0 through 51.
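A small sketch of picking the ID field apart, using only what is stated
above: 7 bytes, with the cylinder in the second byte and the sector in the
fourth. The function name and the returned dictionary layout are
illustrative, and the remaining bytes are passed through uninterpreted.

```python
SECTORS_PER_CYLINDER = 52

def parse_sector_id(id_bytes):
    """Extract cylinder and sector from a decoded 7-byte ID field."""
    if len(id_bytes) != 7:
        raise ValueError("ID field must be 7 bytes")
    cylinder = id_bytes[1]  # second byte
    sector = id_bytes[3]    # fourth byte
    if sector >= SECTORS_PER_CYLINDER:
        raise ValueError("sector number out of range: %d" % sector)
    # Bytes of unknown meaning (two are probably a CRC or checksum).
    other = bytes(id_bytes[i] for i in (0, 2, 4, 5, 6))
    return {"cylinder": cylinder, "sector": sector, "other": other}
```

Checking that the cylinder value matches the track being read, and that all
52 sector numbers turn up once, is a cheap sanity test on the decoding.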
After the sector ID field, another synchronization burst of "1" bits precedes
the data field, which is 131 bytes long--the first 128 of which are data and
the remainder are probably CRC and padding bytes.
At this point, it's possible to form a complete raw binary image of the
disk. The next step is to determine how the data is organized into files.
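One way to lay out the raw image, sketched under the assumption that sectors
are stored cylinder by cylinder in ascending sector-ID order and that any
sector that could not be read is left as zeros. The number of cylinders is a
parameter because it isn't stated above; the names are hypothetical.

```python
SECTORS_PER_CYLINDER = 52
SECTOR_SIZE = 128

def build_image(sectors, cylinders):
    """Assemble a flat image from a dict of (cylinder, sector) -> data.

    Each data value is the 128 data bytes of a sector, with the trailing
    bytes of the 131-byte data field already dropped.
    """
    image = bytearray(cylinders * SECTORS_PER_CYLINDER * SECTOR_SIZE)
    for (cyl, sec), data in sectors.items():
        if len(data) != SECTOR_SIZE:
            raise ValueError("sector data must be %d bytes" % SECTOR_SIZE)
        offset = (cyl * SECTORS_PER_CYLINDER + sec) * SECTOR_SIZE
        image[offset:offset + SECTOR_SIZE] = data
    return bytes(image)
```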
Staring at the disk image, it's apparent that there's a directory on the
first track, after the first sector on the disk (which appears to have bytes
of no significance, perhaps a boot sector) and that each entry is 16
bytes long, or 8 entries per sector. Looking to see where non-directory
data starts, it's obvious that there are 64 directory slots, followed by the
first data sector.
The first entry in the directory is called "DIR", so the directory has a
slot for itself. It appears that files are not fragmented--each file
appears to occupy consecutive sectors (this may not be correct, but that
pattern was followed on the 6 disks that I was given).
After a bit more staring, the apparent directory entry structure appears to
be this way:
    10 bytes  file name; if deleted, the name is hex FF
     1 byte   starting cylinder of the data
     1 byte   attribute: 03 = directory, 04 = text file (tentatively)
     2 bytes  number of sectors in the file
     1 byte   unknown; always seems to be 1
     1 byte   unknown; seems to be 1 or 0 (for directories)
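The entry layout above can be sketched as a parser. Several details here are
guesses and are marked as such in the comments: the byte order of the 2-byte
sector count (made a parameter, defaulting to little-endian), ASCII names
padded with spaces or NULs, and treating an entry as deleted when its first
name byte is hex FF. The directory position (64 slots starting at the second
sector of the image) follows the description above. All names are made up.

```python
import struct
from collections import namedtuple

DirEntry = namedtuple(
    "DirEntry",
    "name deleted start_cylinder attribute sector_count unknown1 unknown2",
)

ENTRY_SIZE = 16
DIR_OFFSET = 128  # directory starts after the first 128-byte sector
DIR_SLOTS = 64

def parse_entry(raw, byteorder="little"):
    """Parse one 16-byte directory entry."""
    name, start_cyl, attr, count, u1, u2 = struct.unpack("<10sBB2sBB", raw)
    deleted = name[0] == 0xFF  # ASSUMPTION: first byte FF marks deletion
    # ASSUMPTION: ASCII name padded with spaces or NULs
    text = name.decode("ascii", errors="replace").rstrip(" \x00")
    # ASSUMPTION: byte order of the sector count is not known
    sectors = int.from_bytes(count, byteorder)
    return DirEntry(text, deleted, start_cyl, attr, sectors, u1, u2)

def read_directory(image, byteorder="little"):
    """Return all 64 directory slots from a raw disk image."""
    entries = []
    for slot in range(DIR_SLOTS):
        start = DIR_OFFSET + slot * ENTRY_SIZE
        entries.append(parse_entry(image[start:start + ENTRY_SIZE], byteorder))
    return entries
```

If the sector counts come out implausibly large for small files, trying
byteorder="big" is the obvious next step.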
After that, you have your files. Text files have lines ended by hex FB, FC
or FD. The end of a text file is marked by a byte of hex FE.
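A sketch of turning a text file's bytes into lines, based only on the
markers just described. It treats FB, FC and FD identically, since what
distinguishes them isn't known, and it assumes the text itself is ASCII. The
function name is hypothetical.

```python
LINE_ENDS = {0xFB, 0xFC, 0xFD}
END_OF_FILE = 0xFE

def decode_text(data):
    """Split raw file bytes into a list of text lines."""
    lines = []
    current = bytearray()
    for byte in data:
        if byte == END_OF_FILE:
            break  # anything after FE is ignored
        if byte in LINE_ENDS:
            lines.append(current.decode("ascii", errors="replace"))
            current = bytearray()
        else:
            current.append(byte)
    if current:  # keep a final line that has no terminator
        lines.append(current.decode("ascii", errors="replace"))
    return lines
```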