We have constructed targeted audio adversarial examples on speech-to-text transcription neural networks: given an arbitrary waveform, we can make a small perturbation that when added to the original waveform causes it to transcribe as any phrase we choose.

In prior work, we constructed hidden voice commands, audio that sounded like noise but transcribed to any phrases chosen by an adversary. With our new attack, we are able to improve this and make an arbitrary waveform transcribe as any target phrase.

The audio examples on this page are impressive -- the perturbation sounds like a little bit of background noise, such as you might hear on a heavily compressed telephone call, and is hard to perceive if you aren't listening out for it.
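To make "a small perturbation added to the original waveform" concrete, here is a minimal sketch that adds a bounded perturbation to 16-bit samples and reports how loud it is relative to the original. It does not show how the attack finds the perturbation. The sample values, the bound, and the function names are hypothetical, and measuring "small" as a peak-level ratio in decibels is an assumption, not something stated on the page.

```python
import math

INT16_MIN, INT16_MAX = -32768, 32767

def add_perturbation(waveform, perturbation, bound):
    """Clip each perturbation sample to [-bound, bound], then add it to the waveform."""
    clipped = [max(-bound, min(bound, d)) for d in perturbation]
    adversarial = [max(INT16_MIN, min(INT16_MAX, s + d))
                   for s, d in zip(waveform, clipped)]
    return adversarial, clipped

def relative_level_db(waveform, perturbation):
    """Peak level of the perturbation relative to the waveform, in dB.

    Assumes both signals contain at least one non-zero sample.
    """
    peak_signal = max(abs(s) for s in waveform)
    peak_noise = max(abs(d) for d in perturbation)
    return 20 * math.log10(peak_noise / peak_signal)

# Hypothetical samples: a short "recording" and a candidate perturbation.
waveform = [0, 8000, -10000, 4000]
raw_perturbation = [30, -80, 120, -10]

adversarial, clipped = add_perturbation(waveform, raw_perturbation, bound=50)
level_db = relative_level_db(waveform, clipped)
print(adversarial, round(level_db, 1))
```

The more negative the level, the quieter the added noise is compared with the original audio.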

The sticker “allows attackers to create a physical-world attack without prior knowledge of the lighting conditions, camera angle, type of classifier being attacked, or even the other items within the scene.” So, after such an image is generated, it could be “distributed across the Internet for other attackers to print out and use.”

This is why many AI researchers are worried about how these methods might be used to attack systems like self-driving cars. Imagine a little patch you can stick onto the side of the motorway that makes your sedan think it sees a stop sign, or a sticker that stops you from being identified by AI surveillance systems. “Even if humans are able to notice these patches, they may not understand the intent [and] instead view it as a form of art,” the researchers write.

Here is a 3D-printed turtle that is classified at every viewpoint as a “rifle” by Google’s InceptionV3 image classifier, whereas the unperturbed turtle is consistently classified as “turtle”.

We do this using a new algorithm for reliably producing adversarial examples that cause targeted misclassification under transformations like blur, rotation, zoom, or translation, and we use it to generate both 2D printouts and 3D models that fool a standard neural network at any angle. Our process works for arbitrary 3D models - not just turtles! We also made a baseball that classifies as an espresso at every angle! The examples still fool the neural network when we put them in front of semantically relevant backgrounds; for example, you’d never see a rifle underwater, or an espresso in a baseball mitt.
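The idea of an adversarial example that survives transformations can be sketched on a toy problem. This is an illustration under heavy assumptions, not the authors' algorithm: the "classifier" is a hypothetical two-class linear scorer over an 8-value signal, the transformations are circular shifts and brightness scaling, and the expectation is computed exactly over a small finite set instead of being sampled. The perturbation is kept within an L-infinity budget `eps`.

```python
# Toy sketch: optimise a perturbation against the average over transformations.
# All names, weights, and values here are hypothetical.
N = 8
W = {
    "turtle": [1, 1, 1, 1, 0, 0, 0, 0],
    "rifle":  [0, 0, 0, 0, 1, 1, 1, 1],
}
# Finite stand-in for a transformation distribution: (circular shift, brightness).
TRANSFORMS = [(k, a) for k in (-1, 0, 1) for a in (0.8, 1.0, 1.2)]

def transform(x, k, a):
    """Shift the signal circularly by k positions and scale it by a."""
    return [a * x[(i - k) % N] for i in range(N)]

def classify(x):
    """Return the label whose linear score is highest."""
    scores = {label: sum(w_i * x_i for w_i, x_i in zip(w, x))
              for label, w in W.items()}
    return max(scores, key=scores.get)

def margin_gradient(k, a, target, source):
    """Gradient, w.r.t. the untransformed input, of score[target] - score[source]."""
    d = [t - s for t, s in zip(W[target], W[source])]
    return [a * d[(j + k) % N] for j in range(N)]

def robust_attack(x, target, source, eps=0.2, step=0.05, iters=10):
    delta = [0.0] * N
    for _ in range(iters):
        # Average the gradient over every transformation (the "expectation").
        grad = [0.0] * N
        for k, a in TRANSFORMS:
            g = margin_gradient(k, a, target, source)
            grad = [acc + gi / len(TRANSFORMS) for acc, gi in zip(grad, g)]
        # Signed ascent step, then project back into the L-infinity ball.
        signs = [1 if g > 0 else -1 if g < 0 else 0 for g in grad]
        delta = [max(-eps, min(eps, d + step * s)) for d, s in zip(delta, signs)]
    # Keep the perturbed signal in the valid [0, 1] range.
    return [max(0.0, min(1.0, xi + di)) for xi, di in zip(x, delta)]

x = [0.6] * 4 + [0.4] * 4
x_adv = robust_attack(x, target="rifle", source="turtle")
before = {classify(transform(x, k, a)) for k, a in TRANSFORMS}
after = {classify(transform(x_adv, k, a)) for k, a in TRANSFORMS}
print(before, after)
```

The key design choice is that the gradient is averaged over all transformations before each step, so the perturbation is pushed toward the target label under every view at once, not just the untransformed one.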

[The] results suggest that classifiers based on modern machine learning techniques, even those that obtain excellent performance on the test set, are not learning the true underlying concepts that determine the correct output label. Instead, these algorithms have built a Potemkin village that works well on naturally occurring data, but is exposed as a fake when one visits points in space that do not have high probability in the data distribution.

Kalina Bontcheva leads the EU-funded PHEME project working to compute the veracity of social media content. She said reducing the amount of human oversight for Trending heightens the likelihood of failures, and of the algorithm being fooled by people trying to game it.
“I think people are always going to try and outsmart these algorithms — we’ve seen this with search engine optimization,” she said. “I’m sure that once in a while there is going to be a very high-profile failure.”
Less human oversight means more reliance on the algorithm, which creates a new set of concerns, according to Kate Starbird, an assistant professor at the University of Washington who has been using machine learning and other technology to evaluate the accuracy of rumors and information during events such as the Boston bombings.
“[Facebook is] making an assumption that we’re more comfortable with a machine being biased than with a human being biased, because people don’t understand machines as well,” she said.

This is an excellent essay from Cory Doctorow on mass surveillance in the post-Snowden era, and the difference between HUMINT and SIGINT. So much good stuff, including this (new to me) cite for "Goodhart's law", on secrecy as it affects adversarial classification:

The problem with this is that once you accept this framing, and note the happy coincidence that your paymasters just happen to have found a way to spy on everyone, the conclusion is obvious: just mine all of the data, from everyone to everyone, and use an algorithm to figure out who’s guilty. The bad guys have a Modus Operandi, as anyone who’s watched a cop show knows. Find the MO, turn it into a data fingerprint, and you can just sort the firehose’s output into “terrorist-ish” and “unterrorist-ish.”

Once you accept this premise, then it’s equally obvious that the whole methodology has to be kept from scrutiny. If you’re depending on three “tells” as indicators of terrorist planning, the terrorists will figure out how to plan their attacks without doing those three things.

This even has a name: Goodhart's law. "When a measure becomes a target, it ceases to be a good measure." Google started out by gauging a web page’s importance by counting the number of links they could find to it. This worked well before they told people what they were doing. Once getting a page ranked by Google became important, unscrupulous people set up dummy sites (“link-farms”) with lots of links pointing at their pages.
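The link-counting example can be sketched in a few lines. This is a deliberately naive model, assuming importance is just the number of distinct pages linking to a page; it is not a description of Google's actual ranking, and the page names and `rank_by_inbound_links` are hypothetical.

```python
from collections import Counter

def rank_by_inbound_links(links):
    """Return pages sorted by inbound link count (highest first, ties by name)."""
    counts = Counter()
    pages = set(links)
    for source, targets in links.items():
        unique_targets = set(targets) - {source}  # ignore duplicates and self-links
        counts.update(unique_targets)
        pages |= unique_targets
    return sorted(pages, key=lambda page: (-counts[page], page))

# A tiny hypothetical web: each page maps to the pages it links to.
web = {
    "news": ["encyclopedia"],
    "blog": ["encyclopedia", "news"],
    "forum": ["encyclopedia"],
    "encyclopedia": [],
    "spam": [],
}
honest_ranking = rank_by_inbound_links(web)

# Once the measure is known, a link farm of dummy pages can target it directly.
gamed_web = dict(web)
for i in range(5):
    gamed_web[f"farm-{i}"] = ["spam"]
gamed_ranking = rank_by_inbound_links(gamed_web)

print(honest_ranking[0], gamed_ranking[0])
```

Nothing about the "spam" page changed between the two rankings; only the measure was manipulated, which is the point of the quote.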

From last week's CEAS conference: research comparing SpamAssassin releases against the evolution of the surrounding spam environment. Nice work; I always wanted to write up something like this. (via JD)