The first time I saw this problem was some months ago when testing a hard drive. When the hard drive was connected, the floppy drive would intermittently say it was write protected. When the hard drive was removed the problem went away. But during my testing yesterday the problem still persisted when the hard drive was removed. It is almost like something was getting damaged when the hard drive was connected, but I do not really see what; that is just how it seemed.

I have had two other people email me about a similar problem. I told one guy to replace the CPU with the HC type, as this has solved a great number of DMA-type issues. But he still says he has the issue intermittently.

Going back to what I found before, the ROM chips themselves seem to be a contributing factor in all this. I've seen it before where one brand of ROM will work fine, and yet another brand will cause DMA issues and the drive gets corrupted. I think the problem is more likely to appear the faster the ROM chips are. I have not investigated this much as it would be almost impossible to diagnose and very time consuming, but I think oscillations appear in various parts of the motherboard depending on the loading and reaction speeds of the chips used on the bus. This is why changing the CPU to a HC type greatly reduces DMA issues, even though the CPU does not have anything to do with DMA transfers. Similarly, the ROM type can aggravate DMA issues.

I just keep thinking the whole ST design is just a "house of cards". Change one thing and some unrelated thing starts to malfunction. This is basically why it takes me so long to design upgrades for these machines. It is not really the time involved in designing the thing, it is working out why other things start to malfunction when they shouldn't. I would say 99% of design time goes on tracing stupid faults like this.

"I just keep thinking the whole ST design is just a "house of cards"." - I don't agree with that. Atari ST is better quality than average micros from that era.
In the period 1987-1992 I made several upgrades, most of my own design - like an EPROM programmer for the cartridge port, then TOS ROM upgrades (mostly what I made the programmer for), RAM expansions - all for me and other users. A little later an IDE adapter, then a large LCD display panel controller interface, etc. It all worked well and reliably without any additional measures, once the design was complete and error-free. I did not need 74F chips and the like. It all worked with 74LS and 20-25ns GALs. Looking at the signals with an oscilloscope showed they were pretty much clean. Unlike the Sinclair Spectrum 48K, where I could only say: 'how does it work at all with this mess?'
Problems for me started in the last 5 years or so. My Ataris have become more and more problematic. I removed the IDE adapters from the STE, Mega STE and 1040 ST, just to offload the bus. Basically, the approach now is: use only what is a must, don't connect much at once, and so on. Recapping the PSUs helped a little, but that cannot fix aged chips whose parameters have simply changed. And when errors accumulate, we get 'unexplainable' errors. Change everything to new, and it will work like a charm.
P.S. spell checker in Firefox sucks.

There is 2 kind of people: one thinking about moving to Mars after here becomes too bad, the others thinking about how to keep this planet habitable.

With every kit I have designed I have spent a long time debugging the ST hardware. You only have to look at the DMA investigation to see how complex the problems are... problems are endless. Also, just because something works on one motherboard doesn't mean it works on others. The MEGA ST design is terrible: very bad ringing on the clocks, and it needs buffering. Atari even had to patch the blitter because of noise on some signals. Even replacing the 68000 with a HC will not work on some motherboards, as it needs a 50pF load on the 8MHz clock to prevent ringing and other issues.

Same with other people's hardware. The PAK020 for example does not work on any of my machines, yet works fine in Jon's machines. Same problem with the MonSTer: IDE refuses to work on any motherboard, and I have 2 MonSTers (sold one last week). Of course others have both and never have any problems. But it proves the PAK has issues with many motherboards and hasn't been tested properly. It's why I didn't produce the PAK design myself, as I didn't want to spend the next 6+ months trying to figure out why it's not working on some motherboards.

My first 16MHz booster worked perfectly on one motherboard for weeks, until I killed the GLUE. I changed it, then found the booster no longer worked at all with a different GLUE IC. It is a very complex set of problems, and just because something works on one machine for years without problems does not mean it will work reliably, or at all, on other machines. I have proven this year after year. I have something like 20 test machines here to test my stuff on.

I can't speak for other hardware of the time. But from my point of view, knowing all the issues, I really don't know how the ST ever worked at all. Nobody will agree with me, everyone says the ST is a reliable design etc etc, but nobody has spent as much time as me debugging countless faults in the design. Mostly it is PCB layout issues causing problems rather than the ST design itself.

The ST uses good chips and is reliable in that way, yes, very good in fact, but the layout causes endless issues with bad signals, and I have been fighting with all this since 1994! Also like you say, chips aging now, even more tolerances in play, even more problems in recent years. What may have been reliable machine 30 years ago, is far from reliable in recent times, and this is where most of my time goes in fixing such issues so I can continue work on boosters etc.

Also like you say, chips aging now, even more tolerances in play, even more problems in recent years. What may have been reliable machine 30 years ago, is far from reliable in recent times, and this is where most of my time goes in fixing such issues so I can continue work on boosters etc.

If these observations are true (and I don't doubt they are), this would mean that sooner or later our beloved machines will cease to work, modified or not. But honestly, what could be the reason for that? Caps I get, they dry out and go out of spec, easy to fix. But the rest? I mean is it really the chips that age? Or is it the board itself? Maybe some magnetisation effect, causing hysteresis? And if the chips, what happens in them?

Ingo

“Perfection is achieved, not when there is nothing more to add, but when there is nothing left to take away.” - Antoine de Saint-Exupéry

But the rest? I mean is it really the chips that age? Or is it the board itself? Maybe some magnetisation effect, causing hysteresis? And if the chips, what happens in them?

I've read some theories on that, such as the sun's rays, EMPs, magnetic fields and solar flares causing all sorts of things in chips, with no other explanation as to why they die, so you might be onto something.

I would just assume some kind of thermal shock from years of being heated up when on, then cooled down when off, would not be good for them either. Also it's not like the power they get is super regulated, and over the years it goes further and further out of spec. I assume there's only so much tolerance for that until something burns out or blows.

These links point to game carts but I would assume its not too far a jump to other kinds of chips and tech from the 1980s :

"cosmic radiation is able to change the state of data in cartridges. There was a speed runner recently playing Super Mario 64 who’s cartridge was struck by cosmic radiation. It changed his Y axis value, but in other cases, like a hospital database being hit, it could conceivably break the entire database, simply by switching one single variable."

Humidity (in the air) is what harms chips in plastic cases. No wonder industrial and military packaging is much better.
And I guess there may be internal degradation too - nothing is 100% stable over time. Then, every power on or off is a shock, which does more harm than days of running.


Okay, I do get this as well. Everything that stores information as an electric charge will sooner or later give up, be it by imperfect insulation or electromagnetic exposure.

But can this be true for semiconductors in general, especially ICs?

I did some googling as well, and came up with a bunch of articles, explaining aging in transistors. One said: "[...]If you think of overall reliability, there are thermal issues, device aging issues and electromigration that happens at the level of the interconnect. Those three mechanisms affect the overall reliability of any circuit.[...]."

"[...]Over time, charge carriers (electrons for negative, or n-channel, MOSFETs; holes for positive, or p-channel, MOSFETs) with a little more energy than the average will stray out of the conductive channel between the source and drain and get trapped in the insulating dielectric. This process, called hot-carrier injection, eventually builds up electric charge within the dielectric layer, increasing the voltage needed to turn the transistor on. As this threshold voltage increases, the transistor switches more and more slowly[...]"

and

"[...]Yet another aging mechanism comes into play when a voltage applied to the gate creates electrically active defects, known as traps, within the dielectric. If they become too numerous, these charge traps can join and form an outright short circuit between the gate and the current channel. This kind of failure is called oxide breakdown, or more verbosely, time-dependent dielectric breakdown. Unlike the other aging mechanisms, which cause a gradual decline in performance, the breakdown of the dielectric can lead to the catastrophic failure of the transistor, causing the circuit it’s in to malfunction.[...]"

and in addition

"[...]As if the aging of transistors wasn’t enough to worry about, semiconductor engineers also have to grapple with the metal connections between transistors wearing out over time. The concern here is a phenomenon called electromigration, which damages the copper or aluminum connections that tie transistors together or link them to the outside world.[...]"

So, yes, it seems semiconductors do age and these observations are very correct.

... which is a well-known fact since at least the 1970s, yes. One can hope that at least some measures to make the ICs more long-living were already incorporated into the Atari ASICs. Countless literature has been written about that. Coincidentally, the introductory chapter from a Springer book about that is available for free right now: https://www.springer.com/cda/content/do ... p174706113. It covers CMOS IC reliability issues related both to manufacturing variations and aging.

This will be rather later than sooner, though, at least when you're referring to the point where all original hardware has stopped functioning. Component aging and subsequent failure is a statistical process; there's a long tail of components that will resist failing for a long time. (This is, BTW, also true for electrolytic capacitors, which is why statements such as "all capacitors are dead after 15 years" are rubbish.)
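To put a rough picture on the "long tail" idea, here is a minimal sketch using a Weibull survival curve, which is one common way to model wear-out failures. The scale and shape values below are made up purely for illustration and are not measured figures for any Atari part; the function name is also just a placeholder.

```python
import math

def surviving_fraction(age_years, scale_years, shape):
    """Weibull survival function: fraction of parts still working at a given age."""
    return math.exp(-((age_years / scale_years) ** shape))

# Hypothetical parameters, chosen only for illustration (not real reliability data).
SCALE_YEARS = 40.0  # characteristic life: about 63% of parts have failed by this age
SHAPE = 2.0         # a shape above 1 means the failure rate rises with age (wear-out)

for age in (15, 30, 45, 60):
    frac = surviving_fraction(age, SCALE_YEARS, SHAPE)
    print(f"{age} years: {frac:.1%} still working")
```

The point is only that the curve never drops to zero at a fixed age: with any parameters of this kind, some fraction of parts is still working long after the "typical" lifetime.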

Longevity of parts is of course related to heat exposure and use over time. High voltage parts tend to get hotter due to higher voltage drops and have a much higher failure rate than low voltage parts, BUT going to extreme amps at low voltage causes heat also. This is why for a lot of chips (graphics card chips on PCs for example) it has been quoted (I don't remember where) that if you run it at say 120C it will likely fail in under 2 years. Run it at 50C and it could easily last 10+ years. Some manufacturers have published heat vs life figures for their products. It's been some years since I looked into it, but it's pretty well known.

Similar with capacitors: run them at their heat limits and the datasheet quotes as little as 2,000 hours. SR98 PSU caps run hot because they are subject to switchmode spikes etc and also get "cooked" next to the heatsink. I did measure the temps after just 10 mins a few months ago, I think it was hitting 60C. Pretty much all PSUs which have been used a fair amount will now have bad caps. I saw this happening around 10 years ago. Does this mean they are all bad? Well, it depends on use; if stored at a low temp there may be some life left in them. Though manufacturers (at least last time I looked) quoted shelf life as 2 years. After that time, they don't guarantee the cap specs or life. It also depends on how generous the value is. Generally manufacturers use whatever minimum they can get away with. So once it starts going out of spec, bad things happen. It's why I go with overkill, so that in 10 years' time the cap will still be what the circuit needs to function correctly. Of course there is equipment which still runs. Many Ataris do; the DVE PSU and TT PSU are still going pretty well compared to the SR98s. Working "at its best"... well, that's another matter. Most people don't notice or care.
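As a rough illustration of the heat vs life point for electrolytic caps, here is a small sketch of the common rule of thumb that life roughly doubles for every 10C below the rated temperature. The 2,000 hour and 60C figures are the ones mentioned above; the 105C rating is an assumption (the post does not say what the caps are rated for), and the rule ignores ripple current, so treat the result as ballpark only.

```python
def estimated_cap_life_hours(rated_hours, rated_temp_c, actual_temp_c):
    """Rule-of-thumb electrolytic life: doubles for every 10C below rated temperature."""
    return rated_hours * 2 ** ((rated_temp_c - actual_temp_c) / 10)

RATED_HOURS = 2000     # datasheet life at rated temperature (figure quoted in the post)
RATED_TEMP_C = 105     # assumed rating; many caps are rated 85C instead
MEASURED_TEMP_C = 60   # temperature measured on the SR98 after about 10 minutes

hours = estimated_cap_life_hours(RATED_HOURS, RATED_TEMP_C, MEASURED_TEMP_C)
print(f"Estimated life at {MEASURED_TEMP_C}C: {hours:,.0f} hours "
      f"(about {hours / 8760:.1f} years powered on)")
```

With an 85C rating instead of 105C the estimate drops by a factor of four, which shows why the choice of cap matters so much in a hot PSU.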

It's clear that the STE has noise problems, even from the factory. I have 3 machines that have all developed the floppy write protect issue recently, and I heard of the same problem from 2 others recently. It only takes a slight drift in spec somewhere and it's enough to cause problems. It's only going to get worse over time. All we can do is build in fixes as faults appear. ICs will likely last a long time yet, but as to whether the Atari is still working in 50 years' time, I doubt it, not without mods and fixes at least.

Your PAK worked first time with no issues in my test board, and of the CPUs you sent me with it (I think you sent 8 including the one installed in the PAK) I seem to recall only one or two failed, and all the rest worked fine with no issues.

Which is in complete contrast to your experience of nothing working. The one thing with your PAK, is that it was an earlier revision than mine, with patch wires, and it wouldn't work at all with either of my 68882 FPUs, both of which work fine in my PAK and my Amiga accelerator board.

I have a board from a fairly beaten up 520 STFM on my bench atm, and it's fairly odd what it's up to, it just hangs for ages trying to read the floppy drive, but not always.

If I use a Gotek with a display, you can see the track info, and it flickers and does nothing. There are no mods on the board relating to the floppy, although it had been run for a long time with a 1.44Mb drive in it, so I wonder if it has killed the FD controller, or if the chip has just deteriorated over time.

I do think that your 'house of cards' analogy is spot on, everything was built to the lowest budget, and these machines were never meant to last as long as they have. As soon as something drifts out of spec, the rest soon follows.

I've said it before in relation to the DMA issue, that I think this variance of specs in components right from the day they were built, is what leads some machines to be problematic, and others to be fine.

I think it's inevitable that 30 years of being switched on and off, with varying quality of power supply, leads to a slow lingering death for some machines. Probably all the ones built on a Friday, or on the days a new cheaper batch of components arrived at the factory, will be the first to go!