Is data stored in an MS-DOS hard disk image affected by changing the hard disk's physical geometry? E.g. if I improve an old algorithm that calculates the CHS dimensions of a hard disk to make it more efficient, would that break the data stored on said disk (which is stored as a flat disk image)? Or does MS-DOS always autodetect the disk geometry on startup and adjust its calculations for CHS values accordingly?

Just modified the geometry formula, then tested my large test games disk image (with an MS-DOS 5.0 OS and various games on it to test the functionality of my emulator) by mounting it in Dosbox and using it normally (through the Dosbox prompt). The new geometry is 3968,16,63 (C,H,S format) and inputting that (reordered to 512,S,H,C as used by Dosbox) mounts it correctly in Dosbox. Even tried some games in Dosbox and they all seem to run correctly :D So it seems that MS-DOS isn't affected by changing the physical geometry (CHS) of the hard disk emulation (with the same disk image size)?

It cares and knows only about the geometry it sees via standard BIOS disk IO as that is what it must use to access the disk.

If for some reason the geometry is different even if exact amount of sectors is identical, then it won't work, as even the partition table contains CHS values so the MBR may not find the partition to boot from.
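To illustrate why the CHS values in the partition table matter, here is a minimal sketch of how a packed CHS field from an MBR partition entry decodes, and how the same CHS tuple lands on a different sector once the geometry changes. The sample entry is hypothetical (built to match a 203,16,63 disk), and `decode_chs` and `chs_to_lba` are illustrative names, not code from any BIOS or emulator.

```python
import struct

def decode_chs(raw3):
    """Decode the packed 3-byte CHS field of an MBR partition entry."""
    head, sec_cyl, cyl_low = raw3
    sector = sec_cyl & 0x3F                       # bits 0-5: sector, 1-based
    cylinder = ((sec_cyl & 0xC0) << 2) | cyl_low  # bits 6-7: cylinder bits 8-9
    return cylinder, head, sector

def chs_to_lba(chs, heads, spt):
    """Map a CHS tuple to a linear sector number under a given geometry."""
    cylinder, head, sector = chs
    return (cylinder * heads + head) * spt + (sector - 1)

# Hypothetical 16-byte entry: active FAT16 partition from C/H/S 0/1/1 to 202/15/63.
entry = bytes([0x80, 1, 1, 0, 0x06, 15, 63, 202]) + struct.pack("<II", 63, 204561)
status, first, ptype, last, lba_start, count = struct.unpack("<B3sB3sII", entry)

start = decode_chs(first)
# Same CHS start, two geometries: 16 heads x 63 sectors versus 16 heads x 16 sectors.
print(start, chs_to_lba(start, 16, 63), chs_to_lba(start, 16, 16))
```

Under 16 heads and 63 sectors per track the start maps to the sector recorded in the entry's LBA field; under 16 heads and 16 sectors per track it maps somewhere else, so a boot loader that relies on the CHS fields would read the wrong sector.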

Geometry shown by BIOS may or may not be translated depending on the physical geometry, and the BIOS may not support it or may have different methods for performing the translation (none, large(old), large(new), lba, ...). That is assuming the physical geometry is even correctly detected by the BIOS.

All it needs to work is the translated geometry to be identical, so that DOS sees nothing has changed, and these translations may not be identical throughout implementations and computing eras.

Would changing the default CHS geometry that UniPCemu's ATA hard disk emulation reports at drive diagnostics offsets 54,55,56 (default CHS translation) make MS-DOS's use of the hard disk image buggy? Or will it continue to work without problems using the newly set geometry (assuming the BIOS doesn't need to translate it, e.g. with 20MB disk images)?

I don't know how BIOSes work to determine drive parameters, it's up to them to detect drives and translate if necessary.

I just re-read your previous post and noticed you were mounting disk images on dosbox. What was the exact command you used to mount them, and did you try to boot from that image?

Dosbox only needs CHS parameters when mounting an unformatted raw image or trying to boot from it. Besides, to boot from it, it needs the translated geometry if you want to boot a DOS that was installed using translated geometry.

The original geometry I used in UniPCemu (the MS-DOS 5.0a 2GB sfdimg disk image was formatted that way and used until recently) was the same (due to heads and sectors being forced to 16,63 and the cylinders being calculated automatically). I'll need to use a disk image with another format to check this properly (e.g. a 100MB disk image?).

I have a 100MB disk image, which used to be 203,16,63 in the old translation method(which it was formatted in). The new method would be 12800,16,1. So it would fail to read it, if changing cylinders screws up reading it?

fdisk does see the second disk, being only 4MB large, but the partition of 99MB is still there? CD-ing to the second harddisk on drive letter D doesn't detect it(Invalid drive specification) on Dosbox.

Mounting as filesystem (not raw, not booting from it) does not require geometry so it does not use your numbers. It does not need them because formatted images have the logical translated geometry in MBR and DOS partition boot sector. And yet you are trying to feed it the physical geometry which is wrong, DOS does not use physical geometry.

superfury wrote:I have a 100MB disk image, which used to be 203,16,63 in the old translation method(which it was formatted in). The new method would be 12800,16,1. So it would fail to read it, if changing cylinders screws up reading it?

Those images are of different size, the new one is larger. If you have a DOS image you want to boot from, you need to tell DOSBOX the translated geometry it will show to DOS.

Besides the physical geometry is unrealistic, no 100MB drive would ever have 12800 cylinders. So you cannot use such geometry in DOS untranslated, it needs translation, which is why dosbox needs translated numbers it shows to DOS as DOS uses translated numbers.

fdisk does see the second disk, being only 4MB large, but the partition of 99MB is still there? CD-ing to the second harddisk on drive letter D doesn't detect it(Invalid drive specification) on Dosbox.

Now trying on UniPCemu...

As I explained, that is untranslated geometry that BIOS cannot show so DOS cannot use, max is 1024 cyls, so that wraps around to 1*16*512=4MB disk. FDISK will read the MBR as that is always the first sector and thus it is readable, and the partition table has the old, already translated geometries in it as the partition was created with those values.
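The 4MB figure can be reproduced with a small sketch. It assumes the cylinder count is simply truncated to the 10 bits available in the INT 13h CHS interface; real BIOSes may clamp or reject the value instead, and `int13_visible_sectors` is an illustrative name.

```python
SECTOR_SIZE = 512

def int13_visible_sectors(cyls, heads, spt):
    """Sectors reachable if the cylinder count is truncated to 10 bits (assumption)."""
    wrapped_cyls = cyls & 0x3FF  # keep only the low 10 bits of the cylinder count
    return wrapped_cyls * heads * spt

sectors = int13_visible_sectors(12800, 16, 1)
print(sectors, sectors * SECTOR_SIZE // (1024 * 1024))  # 8192 sectors, 4 MiB
```

12800 cylinders truncate to 512, so 512 * 16 * 1 = 8192 sectors remain visible, which is the 4MB disk that fdisk reported.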

So I would need to create a new disk and move the files from the old disk to the newly created disk (which has an improved CHS partitioning that follows the ATA-5 appendix C specification)? The 2GB disk image has no problems, since the geometry is unchanged (although the new geometry might have some orphan sectors (only accessible through LBA addressing mode), depending on how close it can get with the CHS addressing format in the new spec).

Edit: Just made another small mistake (almost a classic, seeing how many times I did that in the past; I should have learned by now not to do this :) ): trying to load an unformatted, freshly partitioned disk image in WinImage. It mostly hangs Windows (I needed Task Manager, through extreme lag, a nearly unresponsive OS and hard disk thrashing, to terminate it, as it practically hangs the whole Windows 10 OS when trying to load the empty partition :S).

TRADITIONALCHSTRANSLATION is the old method used in UniPCemu. The other case(which is now activated) takes the largest size that can be found(defaulting to at least 1,1,1 when nothing is found, even for 0 sector images(empty .img file, which cannot be mounted in UniPCemu)). It should take the largest size available while adhering to the ATA-5 appendix C specs?

So this changes what to BIOS and DOS? The drive physical geometry C/H/S values are reported as per specification to BIOS?

But if BIOS uses LBA to access the drive, if there is no change in LBA total sectors, BIOS will most likely show identical logical translation to DOS no matter what the reported physical geometry is. Although for drives that can fit into 504MiB BIOS limits parameters should be reported in a way that does not need translation or LBA, so when mounting existing images, the parameters cannot be calculated, the parameters in the existing image must be used, no matter how non-optimal they are.

Yeah it's difficult, but in reality, is it really necessary to calculate CHS parameters for any possible size of image file? What if there are multiple solutions that are equally good at minimizing orphaned sectors, or that result in zero orphaned sectors? That's why many image generating programs just use X*16*63, as the granularity is good enough. When mounting existing images, the parameters are most likely generated again with the assumption of X*16*63, or when this does not hold, the parameters can be detected from the MBR and the FAT partition info sector. With the X*16*63 method, X can be at most 1024, so the maximum without translation is the 504MiB limit. For larger drives, sectors are always 63, and a bitshift method can be used by the BIOS on the cyl/head values, so the limit is roughly (X/F) * (16*F) * 63, where F = factor 1, 2, 4 or 8. A factor of 16 causes trouble because it gives 256 heads, so 1024*128*63 is the limit. Larger drives therefore used 15 heads to work around this, giving (X/F) * (15*F) * 63, where F = factor 1, 2, 4, 8 or 16, to get up to 1024*240*63, before LBA is necessary to go to 1024*255*63.
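The bitshift limits above can be checked with a short sketch. It assumes the simplest form of the method (halve the cylinders and double the heads until the cylinders fit in 1024 or the heads would reach 256); actual BIOS implementations differ in detail, and `bitshift_translate` is an illustrative name.

```python
def bitshift_translate(cyls, heads, spt):
    """Halve cylinders and double heads until cylinders fit, keeping heads below 256."""
    while cyls > 1024 and heads * 2 < 256:
        cyls >>= 1
        heads <<= 1
    # Anything still above 1024 cylinders is not reachable through CHS.
    return min(cyls, 1024), heads, spt

for geometry in [(3968, 16, 63), (16383, 16, 63), (16383, 15, 63)]:
    c, h, s = bitshift_translate(*geometry)
    print(geometry, "->", (c, h, s), c * h * s, "sectors")
```

With 16 heads the doubling stops at 128 heads, so the drive is cut off at 1024*128*63 sectors; with 15 heads it can continue to 240 heads.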

The words 54, 55 and 56 are supposed to be words 1,3,6. 54,55,56 are the current setting loaded by the INITIALIZE DEVICE PARAMETERS command. Words 1,3,6 are the default CHS translation used by the disk, loaded when loading the disk image only.

6.2.1 Definitions and value ranges of IDENTIFY DEVICE words (see 8.12)

1) Word 1 shall contain the number of user-addressable logical cylinders in the default CHS translation. If the content of words (61:60) is less than 16,514,064 then the content of word 1 shall be greater than or equal to one and less than or equal to 65,535. If the content of words (61:60) is greater than or equal to 16,514,064 then the content of word 1 shall be equal to 16,383.
2) Word 3 shall contain the number of user-addressable logical heads in the default CHS translation. The content of word 3 shall be greater than or equal to one and less than or equal to 16. For compatibility with some BIOSs, the content of word 3 may be equal to 15 if the content of word 1 is greater than 8192.
3) Word 6 shall contain the number of user-addressable logical sectors in the default CHS translation. The content of word 6 shall be greater than or equal to one and less than or equal to 63.
4) [(The content of word 1) ∗ (the content of word 3) ∗ (the content of word 6)] shall be less than or equal to 16,514,064.
5) Word 54 shall contain the number of user-addressable logical cylinders in the current CHS translation. The content of word 54 shall be greater than or equal to one and less than or equal to 65,535. After power-on or after a hardware reset the content of word 54 shall be equal to the content of word 1.
6) Word 55 shall contain the number of user-addressable logical heads in the current CHS translation. The content of word 55 shall be greater than or equal to one and less than or equal to 16. After power-on or after a hardware reset the content of word 55 shall be equal to the content of word 3.
7) Word 56 shall contain the number of user-addressable logical sectors in the current CHS translation. The content of word 56 should be equal to 63 for compatibility with all BIOSs. However, the content of word 56 may be greater than or equal to one and less than or equal to 255. At power-on or after a hardware reset the content of word 56 shall equal the content of word 6.
8) Words (58:57) shall contain the user-addressable capacity in sectors for the current CHS translation. The content of words (58:57) shall equal [(the content of word 54) ∗ (the content of word 55) ∗ (the content of word 56)]. The content of words (58:57) shall be less than or equal to 16,514,064. The content of words (58:57) shall be less than or equal to the content of words (61:60).
9) The content of words 54, 55, 56, and (58:57) may be affected by the host issuing an INITIALIZE DEVICE PARAMETERS command (see 8.16).
10) If the content of words (61:60) is greater than 16,514,064 and if the device does not support CHS addressing, then the content of words 1, 3, 6, 54, 55, 56, and (58:57) shall equal zero. If the content of word 1, word 3, or word 6 equals zero, then the content of words 1, 3, 6, 54, 55, 56, and (58:57) shall equal zero.
11) Words (61:60) shall contain the total number of user-addressable sectors. The content of words (61:60) shall be greater than or equal to one and less than or equal to 268,435,456.
12) The content of words 1, 54, (58:57), and (61:60) may be affected by the host issuing a SET MAX ADDRESS command (see 8.38).

Words 1/54, 3/55 and 6/56 are loaded with the function's result when an IDE disk is mounted. Currently the rules of appendix C are applied:

Annex C (informative)
Identify device data for devices with more than 1024 logical cylinders

C.1 Definitions and background information

The original IBM PC BIOS (Basic Input/Output System) imposed several restrictions on the support of devices, and these have been incorporated into many higher level software products. One such restriction limits the capacity of a device. BIOS software cannot support a device with more than 1,024 cylinders, 16 heads, and 63 sectors per track without translating an input logical geometry to a different output logical geometry. The maximum addressable capacity of a device that does not require BIOS translation is 1,032,192 sectors.

These rules allow BIOSes using bit shifting translation to access 15,481,935 (16,383∗15∗63) sectors, and BIOSes using LBA assisted translation to access 16,450,560 (1024∗255∗63) sectors. Extended BIOS functionality is defined in the ANSI Technical Report Enhanced BIOS Services for Disk Drives, which describes new services provided by BIOS firmware to support ATA hard disks up to 16 mega-terra-sectors.

C.2 Cylinder, head, and sector addressing

BIOSs and other software that operate a device in CHS translation employ a combination of IDENTIFY DEVICE data words 1, 3, 6, words 53-58, and words 60-61 to ascertain the appropriate translation to use and determine the capacity of a device.

Maximum compatibility is facilitated if the following guidelines are used. These guidelines limit the values placed into words 1, 3, 6, 53-58, and 60-61. Accessing beyond 15,481,935 sectors should be performed using LBA.

C.2.1 Word 1

For devices with a capacity less than or equal to 1,032,192 sectors, if IDENTIFY DEVICE data word 1 (Default Cylinders) does not specify a value greater than 1,024, then no guideline is necessary.

If a device is greater than 1,032,192 sectors, but less than or equal to 16,514,064 sectors, the maximum value that can be placed into this word is determined by the value in word 3 as shown in table C.1.

If a device is greater than 15,481,935 sectors and supports CHS, this word should contain 16,383 (3FFFh).

The INITIALIZE DEVICE PARAMETERS command does not change this value.

The value in this word is changed by the SET MAX ADDRESS command.

Table C.1 − Word 1 value for devices between 1,032,192 and 16,514,064 sectors

Value in word 3    Maximum value in word 1
 1    1h           65,535   FFFFh
 2    2h           65,535   FFFFh
 3    3h           65,535   FFFFh
 4    4h           65,535   FFFFh
 5    5h           32,767   7FFFh
 6    6h           32,767   7FFFh
 7    7h           32,767   7FFFh
 8    8h           32,767   7FFFh
 9    9h           16,383   3FFFh
10    Ah           16,383   3FFFh
11    Bh           16,383   3FFFh
12    Ch           16,383   3FFFh
13    Dh           16,383   3FFFh
14    Eh           16,383   3FFFh
15    Fh           16,383   3FFFh
16   10h           16,383   3FFFh

C.2.2 Word 3

IDENTIFY DEVICE data word 3 (Default Heads) does not specify a value greater than 16. If the device has less than or equal to 8,257,536 sectors, then set word 3 to 16 heads. If the device has more than 16,514,064 sectors, then set word 3 to 15 heads. If this value is set to 16 when the device has more than 16,514,064 sectors, some systems will not boot some operating systems.

The INITIALIZE DEVICE PARAMETERS command does not change this value.

C.2.3 Word 6

If the device is above 1,032,192 sectors then the value should be 63. This value does not exceed 63 (3Fh).

The INITIALIZE DEVICE PARAMETERS command does not change this value.

C.2.4 Use of words 53 through 58

Devices with a capacity over 1,032,192 sectors implement words 53-58. Devices with a capacity less than or equal to 1,032,192 sectors may also implement these words. These words define the address range for all sectors accessible in CHS mode under 16,514,064. The product of word 54, word 55, and word 56 must not exceed 16,514,064.

C.2.5 Words 60-61

IDENTIFY DEVICE data words 60-61 contain a 32-bit value that is equal to the total number of sectors that can be accessed using LBA. If the device is less than or equal to 15,481,935 sectors, this value should be the product of words 1, 3, and 6. Setting the total number of LBA sectors in this manner reduces the probability of conflicting device capacities being calculated by different operating systems.

C.3 Orphan sectors

The sectors, if any, between the last sector addressable in CHS mode and the last sector addressable in LBA mode are known as "orphan" sectors. A device may or may not allow access to these sectors in CHS addressing mode.

The values in words 1, 3, and 6 are selected such that the number of orphan sectors is minimized. Normally, the number of orphan sectors should not exceed ( [word55] x [word56] - 1 ). However, the host system may create conditions where there are a larger number of orphan sectors by issuing the INITIALIZE DEVICE PARAMETERS command with values other than the values in words 3 and 6. If the recommendation in C.2.5 is followed, there will be no orphan sectors and problems associated with new operating systems calculating a different device size from older operating systems will be minimized.
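As a worked example of the orphan sector definition in C.3, here is a small sketch that counts orphans for a given geometry and compares them with the ( heads x sectors - 1 ) bound. It assumes a 100MB image of exactly 204,800 sectors, which is what the 12800,16,1 geometry mentioned in this thread multiplies out to; the function names are illustrative.

```python
def orphan_sectors(total_lba_sectors, cyls, heads, spt):
    """Sectors reachable through LBA but not through the given CHS geometry."""
    chs_sectors = cyls * heads * spt
    if chs_sectors > total_lba_sectors:
        raise ValueError("CHS capacity exceeds the LBA capacity")
    return total_lba_sectors - chs_sectors

def orphan_bound(heads, spt):
    """Upper bound that C.3 says the orphan count should normally respect."""
    return heads * spt - 1

total = 204800  # assumed size of the 100MB image, in 512-byte sectors
for geometry in [(203, 16, 63), (800, 16, 16)]:
    c, h, s = geometry
    print(geometry, orphan_sectors(total, c, h, s), "orphans, bound", orphan_bound(h, s))
```

Under that assumption the 203,16,63 layout leaves 176 sectors unreachable through CHS, while 800,16,16 leaves none.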

Last edited by superfury on 2017-11-08 @ 23:13, edited 1 time in total.

Just found a bug in the CHS detection algorithm: it stopped counting cylinders at 0x3FFF instead of 0xFFFF. I've fixed this now. 2GB reports as 3968,16,63 (CHS) and 100MB reports as 12800,16,1 (CHS). That should be correct, according to the rules of ATA-5?

I can't rely on disk contents, as disks may start out blank(filled with zeroes for the entire image) for all UniPCemu's supported formats(creating static .img and sparse .sfdimg disk images), which only have a size and no geometry(except default by above function and a size in sectors only).

superfury wrote:The words 54, 55 and 56 are supposed to be words 1,3,6. 54,55,56 are the current setting loaded by the INITIALIZE DEVICE PARAMETERS command. Words 1,3,6 are the default CHS translation used by the disk, loaded when loading the disk image only.

I don't know if BIOSes send these, and if they do, do they ever select other geometries than the physical default. I believe this command has nothing to do with the way BIOS translates geometry to INT13H interface. But it may be also the way how BIOS implements DOS CHS to DRIVE CHS conversion, by setting the drive to virtual geometry. Most definitely DOS CHS to LBA calculations are done by the BIOS before LBA is sent to drive. You should know what your BIOS does.

The point was, your situation is the same as with real hardware. One hard drive may end up being translated differently at the INT13H interface between two computers depending on their BIOS, so DOS won't boot.

superfury wrote:Just found a bug in the CHS detection algorithm: it stopped counting cylinders at 0x3FFF instead of 0xFFFF. I've fixed this now. 2GB reports as 3968,16,63 (CHS) and 100MB reports as 12800,16,1 (CHS). That should be correct, according to the rules of ATA-5?

I still think 100MB drives were used back in the day when BIOS level translation was not invented, so it was impossible for cylinders to be over 1024.

The geometry given is generated by the function, which currently follows appendix C and the general rules quoted in my last post. If I were to reduce the cylinders to 1024, it wouldn't match the rules without loss of data, or would it? Do I need to add an additional constraint of 1024 cylinders for disks below 512MB to fix that?

Edit: After applying the limit for ~504MB and below, it now ends up with 800,16,16 for the 100MB disk image. Dosbox(as well as my other disk image booted through Dosbox) seems to read it fine now:D
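For readers who want to experiment, here is a minimal sketch of a geometry search with the 1024 cylinder limit applied to small disks. This is not the UniPCemu routine: the tie-break (fewest orphan sectors, then most heads, then most cylinders) is an assumption chosen because it reproduces the 800,16,16 and 3968,16,63 figures quoted in this thread, and `pick_geometry` is a hypothetical name. Larger disks simply follow the Annex C guidance of 63 sectors with 16 or 15 heads.

```python
NO_TRANSLATION_LIMIT = 1024 * 16 * 63   # 1,032,192 sectors, about 504MiB
CHS_LIMIT = 16383 * 16 * 63             # 16,514,064 sectors

def pick_geometry(total_sectors):
    """Return (cylinders, heads, sectors) for an image of total_sectors sectors."""
    if total_sectors > NO_TRANSLATION_LIMIT:
        # Annex C guidance: 63 sectors, 16 heads (15 above the CHS limit).
        heads, spt = (15 if total_sectors > CHS_LIMIT else 16), 63
        return min(total_sectors // (heads * spt), 16383), heads, spt
    best = None
    for heads in range(1, 17):
        for spt in range(1, 64):
            cyls = min(total_sectors // (heads * spt), 1024)
            if cyls < 1:
                continue
            orphans = total_sectors - cyls * heads * spt
            key = (orphans, -heads, -cyls)  # assumed tie-break order
            if best is None or key < best[0]:
                best = (key, (cyls, heads, spt))
    # Fall back to 1,1,1 for images too small to hold a single sector.
    return best[1] if best else (1, 1, 1)

print(pick_geometry(204800))    # 100MB image
print(pick_geometry(3999744))   # the 2GB image discussed above
```

Other tie-break orders give different but equally orphan-free answers for the 100MB case (for example 400,16,32), which is the ambiguity raised earlier in the thread.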

Anyone? Is this routine correct? It tries to be compatible with Bochs' 16x63 block geometry(which has priority over the older autodetection), while also being compatible with disk images formatted in the optimal geometry documented by the appendix C of the ATA/ATAPI-5 spec.

I've made the disk CHS partitioning autodetect a bit more configurable:
- Old sfdimg files will use a compatibility CHS layout (the defined ifdef part in my code).
- New sfdimg files will use the new improved autodetect CHS layout (based on the rules described earlier; this method is selected by a disk image extension block of 512 bytes).
- img files will default to the same CHS layout as new sfdimg files.
- img files with an accompanying .img.unipcemu.txt file will be treated the same as old .sfdimg files.
- img files with an accompanying .img.bochs.txt file will use the Bochs geometry (16 heads, 63 sectors per track) layout, for up to 65535 cylinders.
- img files with and without .img.*.txt can be converted to and from .sfdimg files, using the new disk image extension block to keep their geometry when converting between the two disk formats.

The newer format specifications (newly created sfdimg files and converted/defragmented sfdimg files) are incompatible with older UniPCemu versions, due to extensions to the file format. Older disk images can still be used on the new builds, as long as they're not defragmented (which recreates a new sfdimg file and adds the new extension block, which is unsupported by older UniPCemu versions).

So, to use the new format, just create and use it (both being in the new geometry format). To use Bochs static disk images or old UniPCemu static disk images, add a text file with the same filename which has .bochs.txt or .unipcemu.txt appended to the original disk filename, e.g. games.img.unipcemu.txt or bochsimage.img.bochs.txt
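The sidecar file convention for static .img files could be sketched like this. It is an illustration only, not UniPCemu's code: the function name, the returned mode labels and the injectable `exists` parameter are hypothetical, and checking the Bochs marker first is an assumption based on the statement that the Bochs layout has priority.

```python
import os

def img_geometry_mode(image_path, exists=os.path.exists):
    """Pick a CHS layout mode for a static .img file from its sidecar text files."""
    if exists(image_path + ".bochs.txt"):
        return "bochs"     # 16 heads, 63 sectors per track
    if exists(image_path + ".unipcemu.txt"):
        return "compat"    # same layout as old sfdimg files
    return "optimal"       # new autodetected layout (default)

# Usage with a fake directory listing instead of the real filesystem.
fake_files = {"bochsimage.img.bochs.txt", "games.img.unipcemu.txt"}
for name in ["bochsimage.img", "games.img", "new.img"]:
    print(name, img_geometry_mode(name, exists=fake_files.__contains__))
```

Passing `exists` as a parameter keeps the sketch testable without touching the disk; by default it uses os.path.exists.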

Is that a better way of handling the new and other geometries?

Edit: Just improved Bochs/Dosbox compatibility a little by making it create new empty static&dynamic disk images(using the settings menu option to create a disk image) in the Optimal format by default. It uses the Bochs format when using square(Numpad 4 key) instead of the normal way(like toggling readonly on non-CD-ROM drives).