Posted by Soulskill on Friday August 20, 2010 @07:49AM
from the blue-screen-of-literal-death dept.

An anonymous reader writes "Two years ago, Spanair flight JK-5022 crashed shortly after takeoff in Madrid, killing 154 of its 172 passengers and crew. El Pais online newspaper reports that the ground computer responsible for triggering an alarm after three failures are reported in a plane failed to do so. The computer was infected with trojans (Google translation of Spanish original)."

I'm not sure that banning Windows by name would be of much use. A quick trip down the router aisle at any computer store will show you more degenerate abuses of embedded Linux and VxWorks than you care to think about, and I'm told that things don't get better nearly as fast as you would hope as prices rise in other industry segments.

Anyone, though, using Windows in an environment where it could trivially be infected (i.e., internet-connected, or with contractors doing flash-drive upgrades) really needs to be shown the door, yesterday. I'm also not sure why there would be "a" computer responsible for raising the alarm. Commodity x86 gear is pretty reliable for what you pay, but it isn't that reliable. If the safety of one or more $100 million+ aircraft, and everybody on board, is at stake, why are there not multiple systems, all independently capable of raising the alarm?
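To put a rough number on the "multiple independent systems" point: if each monitor misses a given alarm with some probability, and the failures really are independent (a strong assumption, since shared software or a shared infection breaks it), the chance that every monitor misses it shrinks geometrically. The function name and the 1% figure below are made up for illustration, not taken from any real system.

```python
def prob_all_miss(p_miss: float, n_systems: int) -> float:
    """Probability that every one of n independent monitors misses the alarm.

    Assumes each monitor fails independently with the same probability p_miss.
    """
    if not 0.0 <= p_miss <= 1.0:
        raise ValueError("p_miss must be a probability between 0 and 1")
    if n_systems < 1:
        raise ValueError("need at least one system")
    return p_miss ** n_systems


# Hypothetical figure: each monitor misses 1 alarm in 100.
for n in (1, 2, 3):
    print(f"{n} system(s): chance all miss = {prob_all_miss(0.01, n):.0e}")
```

The catch is the independence assumption: three identical boxes running the same image with the same trojan are closer to one system than to three.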

Actually, two operational blowout preventers were called for in the regulatory specifications. It turns out the single blowout preventer had no battery juice available. The system is supposed to work by having the batteries close the hole automatically when detection of the control monitoring software fails. But if the batteries for the sole preventer don't have the juice when needed, bad things can happen. Someone thought the risks were negligible compared to the costs, so they settled for less.

HEY!
Why do you blame taxpayers for it?
We don't have any say in how our Government spends our money,
e.g., waging wars based on lies.
If I don't support the war, can I avoid paying that portion of my taxes? NOOOO.
So why blame us for the stupidity of the Government?

The more likely result, if we make all non-technical people aware of this particular instance: Government legislation that says *all* computers must use TPC such that they can only run programs that are created by authorized entities and signed with certs.

This is a one-way ticket to the cessation of all innovation in the field of computing.

Yeah because Linux is totally 100% immune to malware and never ever crashes!

If they couldn't properly isolate a mission-critical Windows system, guess what? They almost certainly wouldn't be able to properly secure a Linux or OSX system either. Relying on the small amount of Linux-based malware for security? That sounds an awful lot like security by obscurity to me. Relying on the rights system? There's plenty that you could do without admin rights that would potentially suppress or interfere with an alarm.

The Internet is not the only source of infection. What about removable media, removable drives, or other machines on a private network that can connect to either the Internet or removable media? Perimeter defences are part of good security, but they are not the whole of it.

We had to secure a computer at a company I worked at years ago. The IT department claimed it was secure (they had put Norton AV and a firewall on it). I laughed when the owner of the company told me about it. He asked if I could do better. I put the computer in a metal drawer, locked it, drilled a hole in the back for the cables to come out and handed him the key. "There, now it's secure." He thought I was kidding until I pointed out the USB ports and drive bays.

But if you don't have a network connection (and the machine is physically secured to protect the USB ports and removable media drives), then you don't NEED anti-virus software. Without a means for a virus to get onto the machine, it should be perfectly safe.

Having a live network connection only for the purpose of updating an unnecessary AV package provides a route of infection in itself. Unless the machine needs a network connection for another reason, then it shouldn't be connected to a network.

I'm not sure about Norton, but Symantec AV has gone beyond simple virus stuff for a while now.

Using Symantec we didn't block USB entirely, but it is possible. It did block the standard USB-type attacks, though. When USB drives were plugged in, the system logged all activity, including files, and sent the logs up to the central server.

Better than a drawer would have been a nice server rack...of course physical security is important. Someone could steal the drive and modify it and then put it back in. But I would th

I worked for many years in the security industry. We had to do this to prevent security guards turning off the machine when it alarmed, as it would interrupt their naps. Probably the best story I heard about a secure room was in Australian Defence. A contractor was installing a secure door to make a secure room (where you store your important documents and hard disks after hours). Once it was completed, a senior military guy comes down and is really impressed by this thick steel door with massive bolts etc. The contractor said it's pretty good, but he reckoned he could get inside within 10 seconds. The military guy cannot believe it and bets the guy $100 he can't do it. They lock the door and the contractor then proceeds to go to the side of the secure room and put his foot through the plasterboard panelling, kicking out a large chunk and allowing him to crawl into the room in about 5 seconds.

See, this is why government oversight is so expensive. Regulations have to be written for morons and swindlers. Here's the US Government standard.

(1) Class A Vaults.

(a) Reinforced Concrete. The wall, floor, and ceiling will be a minimum thickness of eight inches of reinforced concrete. The concrete mixture will have a compressive strength rating of at least 3,000 psi. Reinforcement will be accomplished with steel reinforcing rods, a minimum of 5/8 inches in diameter, positioned centrally and spaced horizontally and vertically 6 inches on center; rods will be tied or welded at the intersections. The reinforcing is to be anchored into the ceiling and floor to a minimum depth of one-half the thickness of the adjoining member.

(b) Modular. Modular panel wall, floor, and ceiling components, manufactured of intrusion-resistant material, intended for assembly at the place of use, and capable of being disassembled and relocated meeting Underwriters Laboratories, Inc. (UL) standards are approved for vault construction.

(c) Steel-lined. Vaults may be constructed of steel alloy-type, such as U.S. Steel T-1, having characteristics of high-yield tensile strength or normal structural steel with a minimum thickness of 1/4 inch. The metal plates are to be continuously welded to load-bearing steel members of a thickness equal to that of the plates. If the load-bearing steel members are being placed in a continuous floor and ceiling of reinforced concrete, they must be firmly affixed to a depth of one-half the thickness of the floor and ceiling. If the floor and/or ceiling construction are less than six inches of reinforced concrete, a steel liner is to be constructed the same as the walls to form the floor and ceiling of the vault. Seams where the steel plates meet horizontally and vertically are to be continuously welded together.

(2) Class B Vaults.

(a) Monolithic Concrete. The wall, floor, and ceiling will be a minimum thickness of four inches of monolithic concrete.

(b) Masonry Units. The wall will be brick, concrete block, or other masonry units not less than eight inches thick. The wall will extend to the underside of the roof slab above (from the true floor to the true ceiling). Hollow masonry units shall be the vertical-cell type (load bearing) filled with concrete and metal reinforcement bars. The floor and ceiling must be of a thickness determined by structural requirements, but not less than four inches of monolithic concrete construction.

(3) Class C Vaults. The floor and ceiling must be of a thickness determined by structural requirements, but not less than four inches of monolithic concrete construction. Walls must be not less than eight inches thick concrete block or hollow-clay tile or other masonry units. The wall will extend to the underside of the roof slab above (from the true floor to the true ceiling).

I don't know why you think this demonstrates any particular excess expense of government. It is no more complex or restrictive than any of hundreds of private sector construction specifications and design criteria that I have read.

Except that this was not really a mission critical system - it was a fault logging system in the maintenance department. So far as one can tell from a machine-translated popular article, it was meant to log if a single aircraft had a number of different faults logged close together, because faults at different stations might not otherwise get correlated. As such, it is basically an IT system with response requirements in minutes, not a real time system with fault tolerance requirements. One of the systems w

Well, either it had a hand in causing the deaths of 154 people, and therefore was a mission-critical system, or it wasn't a mission-critical system and the entire article is just a load of sensationalist garbage.

Considering that 154 people died because this system did not issue the warning it was supposed to, I would say it most certainly IS a mission critical system, it just isn't treated as one.

Of course, it sounds like the whole thing was a tragedy of errors. The pilot should have seen that slats and flaps were in the wrong position, the computer in question should have flagged the plane for grounding, the on board computer should have raised the alarm. There should have been maintenance records independent of the computer that should have raised the flag on pre-flight. Not one of those things happened and people died as a result.

I would call it a comedy of errors except that it's hard to call 154 deaths a comedy.

Think of it as: The boss person for the "mission critical applications" area was given a nice long lunch and presented with some back of the napkin math just before an upgrade.
The savings in hardware and software over a traditional OS were amazing... and that's how an off-the-shelf OS could get into a mission-critical area.
Marketing has its lists of areas to wine, dine, seduce and penetrate.

Those OSes designed for mission-critical use are, unfortunately, likely to be secure by obscurity. Something like VxWorks or QNX is not a big enough target for malware writers or blackhats, but I'm quite sure those platforms are full of holes simply because they are not very exposed. I'd say that Linux, perhaps with realtime extensions, would be a somewhat better platform -- it's exposed way more, and most of the holes have been patched.

I'd say that Linux, perhaps with realtime extensions, would be a somewhat better platform -- it's exposed way more, and most of the holes have been patched.

Does ground control really need realtime scheduling? It's basically a glorified traffic light system with cameras (radars). It doesn't really matter if it makes a decision a microsecond sooner or later, or even a whole second.

Anyway, a simple and efficient solution would be to run several parallel systems on different OSes, and raise an alarm if they di

No, those OSes are not secure. Quite the opposite. Almost all of them are very primitive, and have wide-open memory models that allow anything to run, allow anything running to touch any location in memory, and don't log a thing about it. More recent versions of them may have memory partitioning and privileged-user-only modes, but don't bet on the more recent versions being used even on brand new projects.

Because humans are humans. Possible chain of events: "Hmmm. I want to surf the internet but have no PC. But wait, there is our maintenance PC. If I install iTunes on it and connect it to my iPhone, I can surf during work. Hurray! I can even download the hot pics of my favorite celebrity that I received a link to from this Chinese guy."

The operating system really isn't the issue here; failure to isolate the system is. I've set up several Windows systems inside a double firewall, which in turn are set up with a VPN to whatever the systems needed to communicate with, and nothing else. Those did exactly what they needed to do because nothing else could get in or out. That a mission-critical system gets infected at all points to a serious flaw somewhere; a goddamned alarm system shouldn't need any active USB ports nor any access to the interne

It's STILL not a high-availability OS, and should not be treated as such. Windows can be great for normal business use when properly set up, but it isn't designed for mission-critical stuff. If your graphical shell can bring down the OS, it's probably not a good candidate for that kind of thing.

Sure, a Linux box can get rooted, but I've never seen one, and I've installed Linux on friends' computers when I got tired of reinstalling Windows for them after the thing slows to a crawl from malware. Once Linux was on it, they never got infected again.

Of course, to be victim of a trojan you have to know how to install a program;)

Undoubtedly; however, there are meant to be safety nets against pilot incompetence. If such a system was compromised (as noted in a comment below, this is slightly dubious), then that failure is partly responsible for the incident.

"On 17 August 2009, CIAIAC released an interim report on the incident [21]. The interim report confirmed the preliminary report's conclusion that the crash was caused by an attempt to take off with the flaps and slats retracted, which constituted an improper configuration, and noted that safeguards that should have prevented the crash failed to do so. The cockpit recordings revealed that the pilots omitted the "set and check the flap/slat lever and lights" item in the After Start checklist. In the Takeoff Imminent verification checklist the copilot just repeats the flaps and slats correct values without actually checking them, as shown by the physical evidence."

The pilots kind of revoked their own licenses. Permanently. All of the crew perished in the crash.

The thing that bugs me is that flight systems on passenger jets are multiply redundant and there are strict rules about what can and can't be done when there is a system failure. For instance, there are usually at least three autopilot systems, and if even one is indicating a fault then the flight crew has to perform all flight operations manually. WTF happened with regulatory control, that it didn't require this kind of redundancy and human oversight for critical systems on the ground as well?

Beyond the translated Spanish article I can't find anything else about this idea of an alerting system being infected with malware. Typically such systems are simple, embedded and not interfaced in ways which could cause them to run software they are not meant to.

This bit from wikipedia is interesting:

The MD-80 Advanced was to incorporate the advanced flight deck of the MD-88, including a choice of reference systems, with an inertial reference system as standard fitting and optional attitude-heading equipment. It was to be equipped with an electronic flight instrument system (EFIS), an optional second flight management system (FMS), light emitting diode (LED) dot matrix electronic engine and system displays. A Honeywell windshear computer and provision for an optional traffic-alert and collision avoidance system (TCAS) were also to be included. A new interior would have a 12% increase in overhead baggage space and stowage compartment lights that come on when the door opens, as well as new video system featuring drop-down LCD monitors above.[4]

This is an aggregating computer at Spanair HQ which is supposed to record aircraft alerts and notify when too many of them happen too close together. Its only connection with the on-board computer is that somehow it receives the alerts from it. Its OS is unstated. It is not a mission-critical system; it is a decision-support system. Even so, someone looks to have been careless.

Whoever modded up the above post - you've missed the point. There may have been a fault in the on-board management system - or human error failing to heed a warning - but nothing in TFA suggests that malware was in any way involved on the flight deck.

The summary is a bit misleading. The computer on the plane does not appear to be infected. What was infected was a warning control system computer at Spanair headquarters that monitored and recorded the planes. If I'm reading the article right, a component on the plane (it says "device" so it may not be a computer) failed at least twice before the flight took off. Since the central computer was infected with Trojans, it was not adequately recording nor triggering an alert that should have grounded that

Here is your complimentary guide to trolling this story:
1. Pretend only Windows can get infected with trojans.
2. If you can't do 1. adequately, then pretend Windows is somehow easier to infect with trojans than other OSes.
3. Accuse anyone who disagrees with you of being paid off.
4. Make thoughtless absolutist statements like Windows has no security model, and is not a networking OS.
5. Mention chair throwing as proof that MS personnel are unstable, but never mention wife murdering linux developers.
6. Repeat other MS bashers without researching what they're saying.
7. Mention "640k ought to be enough for anyone" as much as possible without giving thought to the brain dead simple idea that MS had nothing to do with the addressable memory limit of the 8086.
Following this guide is sure to get you modded up and liked by many other slashdotters, so be sure to follow it closely!

Problem with your rebuttal: Whether or not other systems can get trojans, you should NOT be using Windows for anything that needs 100% uptime to guarantee the safety of human lives, plain and simple. If the entire system can be locked up and made unresponsive by userland apps, then it isn't qualified to be responsible for the safety of human lives.

In response to your point 2, Windows *is* easier to infect than other operating systems. But that has little to do with the level of security/privileges in the OS these days (Win 7 is a *huge* step forward compared to, say, Windows 95, where you could bypass a login screen by hitting ESC). Rather, the reason Windows is easier to infect is market share.

Most virus infections still rely on good old social engineering: they e-mail themselves as an attachment to a user, and the user has to unwittingl

That's complete BS. I'm not even going to bother to refute your points, because they are nothing more than a red herring. It's not even important why; the fact itself is important: there are half a million pieces of malware for Windows and almost none for Linux (and no actively spreading virus as far as I know).

If you choose your neighborhood would you go for a war ridden zone or for Malibu? Sure, Malibu is just as vulnerable, actually it has less defenses than Kabul... it can *potentially* become worse

When someone's malicious Trojan, virus or other malicious code will be used as evidence in a murder/manslaughter trial; however, what is needed is a day when any seriously incompetent bit of code on a vital system has the potential to be used in criminal court. I'm a Mechanical Engineer and I have to have a certification and insurance even as a contractor. Why should I have to spend thousands of dollars a year so I can work on building the mechanical systems of the plane when the programmers

The infected computer was one being used by mechanics to enter maintenance log entries. According to the article, an alert is supposed to be raised if three failures in the same part or subsystem occurred. If I understand the broken English correctly, they would have taken the plane out of service had the maintenance log entry been completed before the plane attempted to take off.

But, the problem that was supposed to be logged was reportedly an overheated pitot tube. That was not the cause of the crash: the report says that the pilots did not set the flaps correctly and a warning alarm did not go off. This was not related to the problem with the computer being used by mechanics.

The article appears to be trying to link two independent events: a separate problem with the plane and an error by the pilots. Or maybe it's just the broken English translation.

Spanish is my mother tongue, so maybe I can shed more light after reading the original article:

Spanair's procedure is to log incidents right away whenever they are detected. Three accumulated incidents and the plane is grounded.

Two incidents had been found the day before the crash. One incident was detected on the day of the crash itself.

However, the technicians did not enter the incidents into the system right away, because the system was too slow (presumably due to the malware).

The system did not trigger any alarm on the day of the crash because the incidents had not been entered by the technicians. The plane was deemed airworthy, and then the accident happened due to the multiple causes described elsewhere.

This case is interesting from a legal perspective, where the point is to assign responsibility for the accident.
The malware did not cause the crash but it interfered with the logging protocols.
The technicians will probably be held responsible for not taking measures such as manually checking printed logs when the computer failed.
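For anyone who wants the rule spelled out, here is a minimal sketch of the "three accumulated incidents and the plane is grounded" logic as described above. Everything in it is hypothetical (the class name, the aircraft ID, counting per aircraft rather than per part or subsystem); it is not how Spanair's actual system worked, which the article does not describe.

```python
from collections import defaultdict

GROUNDING_THRESHOLD = 3  # "three accumulated incidents", per the description above


class IncidentLog:
    """Toy incident log: counts entered incidents per aircraft."""

    def __init__(self, threshold: int = GROUNDING_THRESHOLD):
        self.threshold = threshold
        self._counts = defaultdict(int)

    def record(self, aircraft_id: str) -> bool:
        """Enter one incident; return True if the aircraft should now be grounded."""
        self._counts[aircraft_id] += 1
        return self.is_grounded(aircraft_id)

    def is_grounded(self, aircraft_id: str) -> bool:
        return self._counts[aircraft_id] >= self.threshold


log = IncidentLog()
log.record("PLANE-1")  # first incident, entered the day before
log.record("PLANE-1")  # second incident, entered the day before
# The third incident happened but was never entered, so no alarm:
print(log.is_grounded("PLANE-1"))
```

The point of the sketch is the last line: the rule can only count what has been typed in, so a machine too slow to use defeats it just as well as one that is switched off.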

The infected computer was one being used by mechanics to enter maintenance log entries. According to the article, an alert is supposed to be raised if three failures in the same part or subsystem occurred. If I understand the broken English correctly, they would have taken the plane out of service had the maintenance log entry been completed before the plane attempted to take off.

But, the problem that was supposed to be logged was reportedly an overheated pitot tube. That was not the cause of the crash: the report says that the pilots did not set the flaps correctly and a warning alarm did not go off. This was not related to the problem with the computer being used by mechanics.

The article appears to be trying to link two independent events: a separate problem with the plane and an error by the pilots. Or maybe it's just the broken English translation.

Very true - the accident appears to have been the result of a series of crew errors that led to an improper takeoff condition:

From Wikipedia: On 17 August 2009, CIAIAC released an interim report on the incident [21]. The interim report confirmed the preliminary report's conclusion that the crash was caused by an attempt to take off with the flaps and slats retracted, which constituted an improper configuration, and noted that safeguards that should have prevented the crash failed to do so. The cockpit recordings revealed that the pilots omitted the "set and check the flap/slat lever and lights" item in the After Start checklist. In the Takeoff Imminent verification checklist the copilot just repeats the flaps and slats correct values without actually checking them, as shown by the physical evidence. All three safety barriers provided to avoid the takeoff in an inappropriate configuration were defeated: the configuration checklist, the confirm and verify checklist, and aircraft warning system (TOWS).

Had they not made a series of compounding errors the flight probably would have been uneventful; it appears the deactivated system was not related to the crash. It may be that some other systems were improperly set - ground vs flight mode - which caused problems and may have contributed to the accident; but none are related to the maintenance computer. Should the plane have been grounded due to an earlier problem? Maybe; but that may not have prevented the errors that led to the crash.

We'll never know what the pilots were thinking; but having aborted one takeoff they may have assumed, intentionally or not, that the systems were set for takeoff and did a cursory check as a result; I've seen that happen in other industries where checklists are used. You interrupt the expected course of actions and people simply pick up where they left off, without assuring the systems are properly set for operation.

Maybe the computer was infested with trojans, although no evidence is offered to support this, not even the names. If it was, that still doesn't say that the trojans caused the problem. After all, the computer must have been running well enough even with the infestation to seem to be working. I'm inclined to think that trojans may just be a way to not really address the real problem.

This opens a new legal can of worms - if a trojan or virus is found to be responsible (at least partially) for a plane crash, can the creator of this virus be held legally liable for the crew and passenger deaths?

I don't see why not. It might be hard to prove murder, but negligent homicide should be fairly easy to show. Reckless endangerment should be damned near an automatic conviction if you can prove that the person released the virus even if it DIDN'T hurt anyone.

The crash of an airliner these days is rarely due to a single cause. There's a saying in the industry that a crash occurs when the holes in the Swiss cheese happen to line up. This appears to have been the case with this particular crash.

The direct cause was that the pilots attempted to take off without setting take-off flaps.

They were rushing because they'd had a technical issue, and returned to the terminal after previously taxiing to the runway and completing the take-off checks. So they accidentally skipped the critical check that the flaps were deployed when they lined up to take off the second time.

There's a take-off configuration alarm that is supposed to alert the pilots, but it wasn't working.

It wasn't working because the engineer removed the circuit breaker that powered it, in order to turn off a stuck heater on a pitot tube that was due to a malfunctioning switch.

This particular fault had been noted on previous flights, so should have flagged a warning on the airline's fault monitoring system.

The fault monitoring system had a trojan.

Yup, the holes in the cheese certainly lined up that day. None of these, by itself, would have caused the crash.

Instead of indicting everyone under the sun, let's do something to fix it instead of tossing people in jail. Many people contributed a little, like Murder on the Orient Express. In the end, the ultimate responsibility rested on the Pilot-in-Command who paid the price for his mistakes. Let's learn from it instead.

1. Revise procedures so that the PNF (Pilot-Not-Flying) visually confirms the flap & slats indicator instead of just reading it to the PF (Pilot Flying)

The Spanish article cited in the summary does not allege any cause-and-effect relationship between the computer, the trojans, and the crash.

Nearly all crash investigations reveal factoids that cause suspicion and which invite people to jump to conclusions. Sometimes, the premature public debate on such issues cause emotional harm to victims, their families and other people involved.

I realize that I'm pissing into the wind to raise this topic. It's human nature to gossip. Slashdot is no different than any other public forum in this regard. It just frustrates me to see this happen again and again.

You are right, this is never alleged. But it is implied, and they clearly want people to take away the false impression from what is said and not said. Otherwise, it is a completely pointless thing to say. It would be like going out of your way to point out that the computer had a CRT screen and not an LCD screen. If there is no cause and effect (and I also believe there is not in this case), why make the statement?

This news puts Trojans in a new light. Taking over PCs to run scams is one thing; causing the deaths of 154 people is entirely different. Every top law enforcement agency and intelligence organization should be working to track down all of those responsible - from the guys who wrote the Trojans to the managers who allowed them to contaminate their computers, and very possibly those who wrote the vulnerable software and those who sold it for such a safety-critical application.

Between this and hospital computers rebooting themselves after auto-updating, how can people defend Windows in critical operations? At the very least run embedded Windows or something more specialized. Though, yes, I admit I'd rather see them not run Windows at all.

Even better would be if people didn't half-ass the engineering of their systems. Hospital computers auto-rebooting causes a problem? Disable it and manage reboots for updates some other way. Relatively critical system? Lock it down. No web surfing access, no external drives, no unapproved binaries etc.
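To make "no unapproved binaries" concrete, here is a minimal sketch of the check behind an allowlist: hash the file and see whether the hash is on an approved list. The function names are made up for illustration, and this is only the lookup; real enforcement has to happen in the OS before the binary is allowed to run, which this script does not do.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_approved(path: Path, approved_hashes: set) -> bool:
    """True only if the file's hash appears in the approved set."""
    return sha256_of(path) in approved_hashes
```

Anything not on the list gets refused by default, which is the opposite of what antivirus does (allow everything except known-bad files).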

And I would dearly love to see it in court. However I would imagine it would fit more under manslaughter rather than common law type murder, as I would imagine the trojan writer wasn't out to kill people. Though I would imagine you could argue malice is involved in writing trojans. I'm not a lawyer so don't take notice of anything I say.
Though going by the poorly translated article there was more going on than just the trojans; the infected computer may have been more of a contributing factor than the primary reason for the crash, due to reasons stated in the article.

Exactly my thoughts... there wasn't anyone willing to take the blame, or rather in this case they were most likely dead and the authorities weren't able to convincingly put the blame on someone who is dead and cannot defend themselves. That's ok... pointing fingers is never a cool thing to do BUT to say that it was a computer glitch is more than a little arrogant toward the people who are still alive and have been affected by the tragedy. It gives off an aura of a botched investigation and reeks of underhanded an

The computer, located at the headquarters of the airline in Palma de Mallorca, emits an alarm signal on the monitor when you register three similar technical problems in the same device

Pardon me if something got lost in translation, but why the hell was there not a computer on board that could have registered a series of failures and alerted the crew? It seems that would have been useful information for them to have.

Hate to rain on the IT parade here, but the investigation revealed that the aircrew had the aircraft on "in-flight" mode, leading to erroneous indications (forcing the first abort), and then excluding the no flaps/no slats pre-takeoff configuration error warning. The crew also called for the flaps/slats settings to be proper without actually checking them. In effect, they were able to defeat three separate safety measures to prevent exactly this kind of mishap from happening.

It does not appear that an infection of the mainframe maintenance computer is anything more than a side note in this particular mishap. It may, however, be something for airline maintenance personnel to be aware of to prevent future incidents.

The real question is why the aircrew are allowed to override a weight-on-wheels (WOW) sensor, when that is primarily used for troubleshooting by ground crews. Putting the aircraft into "flight" mode while on the ground requires special attention to actions/procedures (as in when a USAF F-4 shot up a maintenance truck when the WOW switch was in override and the weapons crew performed an ops check on the gun system--ops check good, BTW).

A computer controlling in-flight operations infected with trojans translates to a computer running MS windows. Why the fuck would anyone even think of this? This is like building a suspension bridge using legos and 6 year olds doing the assembly.

So when I fly, is my life really dependent on a tinker toy OS? That's fucked up! Someone should be beaten to death for this idea.

Maybe I wasn't clear: the mainframe in maintenance has nothing to do directly with in-flight operations. The computers on board are completely independent of those in the maintenance system. Now, if there are wireless connections allowing the maintenance mainframe and the aircraft to share information, it MAY be possible for a virus to gain access to the aircraft, but I am pretty sure this has not happened yet, possibly due to the security in place, or possibly no one has really tried to infect an aircraft wi