How and why deepfake videos work — and what is at risk

Deepfakes swap celebrities' faces into porn videos and put words in politicians' mouths, but they could do a lot worse.

Deepfake definition

Deepfakes are fake videos or audio recordings that look and sound just like the real thing. Creating them was once the bailiwick of Hollywood special effects studios and of intelligence agencies producing propaganda, such as the CIA or GCHQ's JTRIG directorate. Today anyone can download deepfake software and create convincing fake videos in their spare time.

So far, deepfakes have been limited to amateur hobbyists putting celebrities' faces on porn stars' bodies and making politicians say funny things. However, it would be just as easy to create a deepfake of an emergency alert warning an attack was imminent, or destroy someone's marriage with a fake sex video, or disrupt a close election by dropping a fake video or audio recording of one of the candidates days before voting starts.

How dangerous are deepfakes?

This makes a lot of people nervous, so much so that Marco Rubio, the Republican senator from Florida and 2016 presidential candidate, called them the modern equivalent of nuclear weapons. "In the old days," he told an audience in Washington a couple weeks ago, "if you wanted to threaten the United States, you needed 10 aircraft carriers, and nuclear weapons, and long-range missiles. Today, you just need access to our internet system, to our banking system, to our electrical grid and infrastructure, and increasingly, all you need is the ability to produce a very realistic fake video that could undermine our elections, that could throw our country into tremendous crisis internally and weaken us deeply."

Political hyperbole skewed by frustrated ambition, or are deepfakes really a bigger threat than nuclear weapons? To hear Rubio tell it, we're headed for Armageddon. Not everyone agrees, however.

"As dangerous as nuclear bombs? I don't think so," Tim Hwang, director of the Ethics and Governance of AI Initiative at the Berkman-Klein Center and MIT Media Lab, tells CSO. "I think that certainly the demonstrations that we've seen are disturbing. I think they're concerning and they raise a lot of questions, but I'm skeptical they change the game in a way that a lot of people are suggesting."

How deepfakes work

Seeing is believing, the old saw has it, but the truth is that believing is seeing: Human beings seek out information that supports what they want to believe and ignore the rest.

Hacking that human tendency gives malicious actors a lot of power. We see this already with disinformation (so-called "fake news") that creates deliberate falsehoods that then spread under the guise of truth. By the time fact checkers start howling in protest, it's too late, and #PizzaGate is a thing.

Deepfakes exploit this human tendency using generative adversarial networks (GANs), in which two machine learning (ML) models duke it out. One ML model trains on a data set and then creates video forgeries, while the other attempts to detect the forgeries. The forger creates fakes until the other ML model can't detect the forgery. The larger the set of training data, the easier it is for the forger to create a believable deepfake. This is why videos of former presidents and Hollywood celebrities have been frequently used in this early, first generation of deepfakes — there's a ton of publicly available video footage to train the forger.
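The forger-versus-detector loop described above can be sketched in a few lines. This is a toy illustration, not how any real deepfake tool is built: instead of video frames, the "real data" is just numbers drawn around an assumed mean of 4.0, the forger is a single shift parameter `b`, and the detector is a one-input logistic model. All function names, parameters, and values here are hypothetical, chosen only to show the alternating training steps.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def detector(x, w, c):
    """Detector's estimated probability that sample x is real."""
    return sigmoid(w * x + c)


def detector_step(real, fake, w, c, lr):
    """One gradient-ascent step on log D(real) + log(1 - D(fake))."""
    n = len(real)
    grad_w = sum((1 - detector(r, w, c)) * r - detector(f, w, c) * f
                 for r, f in zip(real, fake)) / n
    grad_c = sum((1 - detector(r, w, c)) - detector(f, w, c)
                 for r, f in zip(real, fake)) / n
    return w + lr * grad_w, c + lr * grad_c


def forger_step(noise, b, w, c, lr):
    """One gradient-ascent step on log D(fake), where fake = noise + b."""
    grad_b = sum((1 - detector(z + b, w, c)) * w for z in noise) / len(noise)
    return b + lr * grad_b


def train(steps=500, batch=32, lr=0.05, real_mean=4.0, seed=0):
    """Alternate detector and forger updates; return the final (b, w, c)."""
    rng = random.Random(seed)
    b, w, c = 0.0, 0.0, 0.0
    for _ in range(steps):
        real = [rng.gauss(real_mean, 1.0) for _ in range(batch)]
        noise = [rng.gauss(0.0, 1.0) for _ in range(batch)]
        fake = [z + b for z in noise]
        w, c = detector_step(real, fake, w, c, lr)  # detector learns to spot fakes
        b = forger_step(noise, b, w, c, lr)         # forger learns to fool it
    return b, w, c


if __name__ == "__main__":
    b, w, c = train()
    print("forger shift:", b, "detector weights:", w, c)
```

Each round, the detector is nudged toward separating real samples from forged ones, and the forger is then nudged toward whatever the detector currently scores as real. Training of this kind is not guaranteed to settle; the two models can chase each other, which is one reason real GANs are hard to train.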

Shallow fakes are a problem, too

It turns out that low-tech doctored videos can be just as effective a form of disinformation as deepfakes, as the controversy surrounding the doctored video of President Trump's confrontation with CNN reporter Jim Acosta at a November press conference makes clear. Video clearly shows a female White House intern attempting to take the microphone from Acosta, but subsequent editing made it look like the CNN reporter attacked the intern.

The incident underscores the fears that video can be easily manipulated to discredit a target of the attacker's choice: a reporter, a politician, a business, a brand. Unlike so-called "deepfakes," however, where machine learning puts words in people's mouths, low-tech doctored video hews close enough to reality that it blurs the line between true and false.

FUD (fear, uncertainty and doubt) is familiar to folks working in the security trenches, and deploying that FUD as a weapon at scale can severely damage a business as well as an individual. Defending against FUD attacks is very difficult. Once the doubt has been sown that Acosta manhandled a female White House intern, a non-trivial portion of viewers will never forget that detail and will suspect it might be true.

Who's wagging whom?

Wag the Dog, the wickedly funny 1997 film co-written by David Mamet, satirized a president running for re-election who fakes a war using special effects to cover up a sex scandal. The film was prophetic: the ability to "fake TV news" has been around for a while and is now in the hands of pretty much every laptop owner on the planet.

GANs, of course, have many other uses than making fake sex videos and putting words in politicians' mouths. GANs are a big leap forward in what's known as "unsupervised learning," in which ML models teach themselves. This holds great promise for improving self-driving vehicles' ability to recognize pedestrians and bicyclists, and for making voice-activated digital assistants like Alexa and Siri more conversational. Some herald GANs as the rise of "AI imagination."

That said, there are so many other forms of effective disinformation that focusing on playing "Whack-a-Mole" with deepfakes is the wrong strategy, Hwang tells CSO. "I think that even in the present it turns out there are lots of cheap ways that don't require deep learning or machine learning to deceive and shape public opinion."

For instance, taking a video of people beating someone up in the street, and then creating a false narrative around that video — perhaps claiming that the attackers are immigrants to the U.S., for example — doesn't require a fancy ML algorithm, just a believable false narrative and a video that fits.

How to detect deepfakes

Detecting deepfakes is a hard problem. Amateurish deepfakes can, of course, be detected by the naked eye. Other signs that machines can spot include a lack of eye blinking or shadows that look wrong. GANs that generate deepfakes are getting better all the time, and soon we will have to rely on digital forensics to detect deepfakes — if we can, in fact, detect them at all.
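As a rough sketch of how a machine might check for the missing-blink sign mentioned above, the snippet below counts blinks in a clip and flags an unusually low blink rate. It assumes some upstream tool has already produced a per-frame "eye openness" score between 0 and 1; that input, the function names, and the thresholds (`closed_below`, `min_rate`) are all hypothetical values picked for illustration, not figures from any published detector.

```python
def count_blinks(eye_openness, closed_below=0.2):
    """Count open-to-closed transitions in a sequence of per-frame scores."""
    blinks = 0
    was_closed = False
    for score in eye_openness:
        is_closed = score < closed_below
        if is_closed and not was_closed:
            blinks += 1  # a new run of closed frames starts here
        was_closed = is_closed
    return blinks


def blinks_per_minute(eye_openness, fps=30.0, closed_below=0.2):
    """Blink rate for a clip recorded at the given frames per second."""
    if not eye_openness:
        raise ValueError("need at least one frame")
    minutes = len(eye_openness) / fps / 60.0
    return count_blinks(eye_openness, closed_below) / minutes


def looks_suspicious(eye_openness, fps=30.0, min_rate=5.0):
    """Flag clips whose blink rate falls below an assumed minimum."""
    return blinks_per_minute(eye_openness, fps) < min_rate


if __name__ == "__main__":
    never_blinks = [1.0] * 1800                      # one minute at 30 fps
    blinks_often = ([1.0] * 115 + [0.1] * 5) * 15    # 15 blinks in one minute
    print(looks_suspicious(never_blinks), looks_suspicious(blinks_often))
```

A check like this is easy to defeat once forgers know about it, which is the arms-race problem the rest of this section describes.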

This is such a hard problem that DARPA is throwing money at researchers to find better ways to authenticate video. However, because GANs can themselves be trained to learn how to evade such forensics, it's not clear that this is a battle we can win.

"Theoretically, if you gave a GAN all the techniques we know to detect it, it could pass all of those techniques," David Gunning, the DARPA program manager in charge of the project, told MIT Technology Review. "We don't know if there's a limit. It's unclear."

If we are unable to detect fake videos, we may soon be forced to distrust everything we see and hear, critics warn. The internet now mediates every aspect of our lives, and an inability to trust anything we see could lead to an "end of truth." This threatens not only faith in our political system but, over the longer term, our faith in a shared objective reality. If we can't agree on what is real and what is not, alarmists lament, how can we possibly debate policy issues?

Hwang thinks this is exaggeration, however. "This is one of my biggest critiques," he says. "I don't see us crossing some mystical threshold after which we're not going to know what's real and what's not."

At the end of the day, the hype around deepfakes may be the greatest protection we have. We are on alert that video can be forged in this way, and that takes the sting out of deepfakes.

Deepfake pornography: the realistic threat

Politicians have been lying for as long as politics has existed, and the threat of deepfakes to democracy is overblown. The more realistic threat is the creation of deepfake pornography that puts a celebrity's, or maybe an ex-girlfriend's, head on the body of a porn star.

It's like revenge porn, only without the porn. All the deepfake porn creator needs is a bunch of photos of the victim, easily taken from that person's social media feed, and a video of the victim's face that's several minutes long. Even celebrities, many of whom are used to a certain amount of public hating, have expressed horror on discovering their heads superimposed on the body of a porn star in a raunchy video.

Sometimes deepfakes aren't about gaslighting a population, but about bullying or harassment. This seems like a far more probable outcome than the existential threat to democracy that some carry on about.

How do we regulate to prevent deepfakes?

Are deepfakes legal? It's a thorny question, and unresolved. There's the First Amendment to consider, but also intellectual property law, privacy law, and the new revenge porn statutes that many states across the United States have enacted of late.

In many cases platforms such as Gfycat and Pornhub have actively removed deepfake porn videos from their websites, arguing that such content violates their terms of service. Deepfakes of the pornographic variety continue to be shared on less-mainstream platforms.

However, when it comes to political speech that is not of an abusive sexual nature, the lines get blurry. The First Amendment protects the right of a politician to lie to people. It protects the right to publish wrong information, by accident or on purpose. The marketplace of ideas is meant to sort the truth from falsehood, not a government censor, or a de facto censor enforcing arbitrary terms of service on a social media platform.

Regulators and lawmakers continue to grapple with this problem. Watch this space.

At a time when film could take weeks to cross an ocean, filmmakers would dramatize earthquakes or fires with tiny sets to make the news more lifelike. In the 1920s, sending black-and-white photographs over transoceanic cables was the latest rage, and filmmakers would use genuine photographs to create their scenes of destruction.

That changed in the 1930s, when audience members came to expect that what they were watching was the genuine article.

This story, "How and why deepfake videos work — and what is at risk," was originally published by CSO.