When Berlin mayor Franziska Giffey took a call from Vitali Klitschko, the mayor of Kyiv, in June, the pair discussed several important issues, including the status of Ukrainian refugees in Germany. It was a perfectly normal conversation between politicians, given the circumstances. Except Klitschko wasn’t real.
Although the mayor could see the face of the former boxer turned politician and was talking to him in real time, she was actually speaking to an imposter. Deepfakes – technology that creates realistic renders of famous faces using artificial intelligence (AI) – are now sophisticated enough to work in real time.
It’s not yet clear who the tricksters behind the incident were, nor what their intentions were, but the same group reportedly fooled the mayors of Vienna and Madrid using the same Klitschko deepfake.
Since deepfakes first emerged in 2017, a persistent worry has been that they could be used to meddle in politics and otherwise cause chaos. As with many things in recent times, those fears have been replaced with cold, hard reality.
Vitali Klitschko speaking to media during the NATO Summit in Spain in June 2022
The telltale signs of a deepfake
Dr Matthew Stamm, who researches multimedia forensics at Drexel University in Philadelphia, argues that with current technology it’s quite possible to fool people, even if the fakes falter under close inspection.
“If you give [a deepfake] a really close look, a lot of times there's physical cues like inconsistent or strange motion patterns, or something that seems off about the face,” he says. “I think we're still a good way away from making something that's very visually convincing that stands up to long-term scrutiny.”
Even if the video does look a little off, though, it has one important factor on its side: human psychology. “If we have to make a quick decision, information aligns with our preconceived biases, so we're often disposed to just believe it,” says Stamm.
Deepfakes don’t necessarily need to be pixel-perfect – they just need to be good enough. That makes it all the more important to recognise the telltale signs of a deepfake.
“To make a really good deepfake, you typically want an actor that looks reasonably like the person you're trying to fake,” Stamm continues. “Deepfakes typically falsify your face, they currently don't change the shape of your head. You see weird discontinuities around the sides and around the face, because it's basically putting a mask over the top of you.”
Perhaps the biggest giveaway is occlusion – moments when the face is partially covered, by a hand waving in front of it, for example – or when the subject moves rapidly or looks too far to the side. The software does not yet know how to handle these cases, so the illusion momentarily breaks. But given the pace of innovation, we should expect developers to solve this over time.
“A researcher called Siwei Lyu discovered that deepfakes don't blink,” says Stamm. “He published a paper, and within a week, we started seeing deepfakes in the wild that were blinking. People had [already] figured out how to model this behaviour.”
It’s inevitable that the fakery will continue to improve, which is why Stamm instead focuses his own research on hunting for other ‘forensic’ clues. “There are also statistical traces that show up that our eyes can't see,” he says. “If a criminal were to break into your house, they would leave behind fingerprints or hair. With digital signals, processing leaves behind its own traces. These are just statistical and typically invisible in nature, but researchers like myself are working to capture them.”
He likens it to how similar techniques can spot images that have been manipulated in software like Photoshop, but explains that video multiplies the complexity.
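To illustrate the kind of invisible statistical residue Stamm describes, here is a toy high-pass filter in Python – a hypothetical sketch, not a real forensic detector. Subtracting a local average from each pixel strips away the visible content and leaves only the fine-grained noise layer, which is where processing traces tend to hide:

```python
import numpy as np

def noise_residual(image: np.ndarray, k: int = 3) -> np.ndarray:
    """Subtract a k-by-k box-blur mean from each pixel, leaving the
    high-frequency 'noise' component. Illustrative only -- real
    forensic detectors use far more sophisticated, often learned, filters."""
    pad = k // 2
    padded = np.pad(image.astype(float), pad, mode="edge")
    blurred = np.zeros(image.shape, dtype=float)
    for dy in range(k):
        for dx in range(k):
            blurred += padded[dy:dy + image.shape[0], dx:dx + image.shape[1]]
    blurred /= k * k
    return image - blurred

# Synthetic example: a smooth gradient plus sensor-like noise
rng = np.random.default_rng(0)
img = np.tile(np.linspace(0, 255, 64), (64, 1)) + rng.normal(0, 2, (64, 64))
residual = noise_residual(img)
print(residual.shape, float(residual.std()))
```

The residual looks like featureless static to the eye, but its statistics shift when a region has been resampled, recompressed, or synthesised – which is what detectors measure.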
Quantifying the deepfake threat
Should we view the Klitschko trickery as an ominous sign of the future? Perhaps, Stamm thinks, but not yet. “Deepfakes aren't the primary concern,” he says. “You don't need to make a deepfake to fool people.”
He points to how one of the most common forms of misinformation is the recontextualised image – a photo stripped of its caption and used misleadingly, with no digital manipulation required. “[Imagine] a protest five years ago in the US and claim this was a protest that occurred yesterday in London,” says Stamm. “It's a real image just taken completely out of context. And people who want to believe this will, because you can even confront them with evidence that it's fake, but by this point, it's already affected their worldview.”
Another high-profile example of this was a video of US Speaker of the House Nancy Pelosi, which went viral in 2020 with claims that she was slurring her words. The video had simply been slowed down, something any standard video editor can do. “All they did was slow it down a little, and they misled a lot of people for at least a brief period of time,” he explains.
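Mechanically, that slow-motion trick is nothing more than retiming: every frame's presentation timestamp is stretched by the inverse of the playback speed. A minimal sketch (real editors also resample the audio so the pitch doesn't drop):

```python
def retime(pts_seconds, speed):
    """Stretch presentation timestamps so the clip plays at `speed`
    times its original rate (speed < 1 slows it down)."""
    return [t / speed for t in pts_seconds]

# Four frames of a 25 fps clip, slowed to 75% of normal speed
frames = [0.0, 0.04, 0.08, 0.12]
print(retime(frames, 0.75))
```

Video tools expose the same idea directly; ffmpeg's `setpts` filter, for instance, multiplies each frame's timestamp by a constant factor.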
As the Mayor of Berlin has demonstrated, we may, in future, need more sophisticated tools to help us spot disinformation when whatever we’re looking at is good enough to fool us. “I would expect these forensic analysis techniques to be deployed pretty widely,” says Stamm. “A number of governmental organisations are working to develop them within the US and throughout the rest of the world. Private industry is starting to adopt these techniques and you may see, over the next few years, them being integrated into social media or available through private companies.”