Tuesday, June 23, 2009

Medical data is generally considered private, and for good reasons: your medical records may reveal whether you are suffering from a contagious disease (your friends are probably interested in this), whether you have a genetic condition that increases your risk of certain illnesses (your health insurance company might be interested in this), whether you really had a severe cold on the Monday right after your vacation (your employer might be interested in this), and whether you suffer from a sexually transmitted disease or receive methadone as a substitute for illegal drugs (you simply think nobody should be interested in this).

All of this data is regularly printed in a doctor's practice. The printers are typically placed so that nobody can see what is being printed, and you might believe that your data is secure. This belief is not justified. In this study we showed that printed text can be reconstructed from a recording of the sound the printer emitted while printing. A majority of doctors' practices use dot-matrix printers (see below for the results of a survey we commissioned on the usage of dot-matrix printers), and in some cases they are even required to do so.

In effect, this means that anyone sitting in the reception area of a doctor's practice can record the sound of the printer and reconstruct the printed text. Our novel attack takes as input a sound recording of a dot-matrix printer processing text and recovers up to 72% of the printed words. After an upfront training phase, the attack is fully automated. It combines machine learning, audio processing, and speech recognition techniques, including spectrum features, Hidden Markov Models, and linear classification; moreover, it allows for feedback-based incremental learning.
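
To give a feel for where a Hidden Markov Model can help in a pipeline like this, here is a minimal sketch in Python. It is not our implementation. It assumes that an acoustic stage has already produced, for each printed word position, a few candidate words with scores, and that a bigram language model says how likely one word is to follow another. The function name `viterbi`, the candidate words, and all probabilities are made up for illustration.

```python
import math


def viterbi(candidates, bigram_logp, default_logp=math.log(1e-6)):
    """Return the most likely word sequence under a simple HMM.

    candidates:  one dict per word position, mapping a candidate word to
                 its (hypothetical) acoustic log-score.
    bigram_logp: dict mapping (previous_word, word) to a log-probability;
                 unseen pairs fall back to default_logp.
    """
    # best[word] = (log-score of the best path ending in word, that path)
    best = {word: (score, [word]) for word, score in candidates[0].items()}
    for position in candidates[1:]:
        new_best = {}
        for word, acoustic in position.items():
            # Pick the predecessor that maximises path score + transition.
            prev, (score, path) = max(
                best.items(),
                key=lambda item: item[1][0]
                + bigram_logp.get((item[0], word), default_logp),
            )
            transition = bigram_logp.get((prev, word), default_logp)
            new_best[word] = (score + transition + acoustic, path + [word])
        best = new_best
    return max(best.values(), key=lambda entry: entry[0])[1]


# Made-up acoustic scores: the sound alone slightly prefers "patent".
candidates = [
    {"the": math.log(0.9), "then": math.log(0.1)},
    {"patent": math.log(0.55), "patient": math.log(0.45)},
    {"has": math.log(0.6), "was": math.log(0.4)},
]

# Made-up bigram language model.
bigram_logp = {
    ("the", "patient"): math.log(0.3),
    ("the", "patent"): math.log(0.01),
    ("patient", "has"): math.log(0.4),
    ("patient", "was"): math.log(0.3),
    ("patent", "has"): math.log(0.05),
}

# Taking the best acoustic guess at each position, ignoring context.
greedy = [max(position, key=position.get) for position in candidates]
decoded = viterbi(candidates, bigram_logp)

print(greedy)   # ['the', 'patent', 'has']
print(decoded)  # ['the', 'patient', 'has']
```

In this toy example the sound alone favors "patent", but the language model makes "the patient has" the more plausible sequence, so decoding over the whole sentence corrects the second word.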