“Deep synthesis” supports many applications, including face swapping, face synthesis, speech synthesis, video generation, and even virtual humans. It has gradually emerged from the shadow of pornographic deepfake videos and entered an era of commercialization.
However, because the technology is poorly understood, people still hold many prejudices and misunderstandings about it, such as believing that “deep synthesis” is the same thing as “deepfake porn”, or that it will completely undermine social trust.
A number of deepfake porn sites have indeed appeared, such as mrdeepfakes.com and adultdeepfakes.com. Celebrity deepfake videos are posted and viewed on these adult sites, with the original faces replaced by those of Scarlett Johansson, Ariana Grande, Gal Gadot, and Emma Watson.
This article summarizes 5 misconceptions about the technology and hopes to clarify these misunderstandings to help people understand the development and application of deep synthesis technology more comprehensively.
Myth 1: Deep Synthesis Only Includes One Form: AI Face-Swapping
Besides AI face swapping, deep synthesis also includes face reenactment, face generation, speech synthesis, and other techniques, and it is developing toward whole-body synthesis and virtual humans.
Face swapping was the first form to become widely known to the public, and it is also the most widely used form of deep synthesis. In some AI face-swapping applications, users only need to upload a photo to make a deepfake video. On the DeepSwap website, for example, a user uploads a photo and an original video and can produce a deepfake video with one click.
Deep synthesis is moving from partial synthesis to whole-body synthesis, and from 2D synthesis to 3D synthesis. Internet companies around the world are currently experimenting with virtual human technology and using generative adversarial networks to transfer human motion from one subject to another. Virtual humans are promising in games, social networking, film and television, and medical care.
Myth 2: Anyone Can Create High-Quality Deep Synthetic Content
The threshold for using deep synthesis technology has been greatly lowered. With the help of deep synthesis applications, ordinary users can easily create and obtain entertaining synthetic content, such as AI face swaps, synthesized faces, and synthesized speech, on devices such as phones and computers. This kind of content is usually easy to identify and carries source markings, so it is unlikely to be mistaken for genuine material.
Therefore, although software such as DeepSwap and FaceMagic has begun to expose more people to deep synthesis technology, it is still difficult to create ultra-high-quality, refined deep synthetic content. Doing so requires substantial effort from skilled professionals and specialized machine-learning tools.
Myth 3: Deep Synthesis Is Deepfake
Calling all of this “deepfake” generalizes from one part to the whole; the term is not broad enough to cover all deep synthesis techniques and the content they produce. “Deepfake” was originally used only to describe pornographic AI face-swapped videos. It names one specific AI face-swapping technique, and the media later extended the term to refer to all deep synthesis technologies.
“Deep synthesis” has a broader meaning: the synthesis and automatic generation of speech, music, images, faces, videos, and other content with the help of artificial intelligence algorithms. The AI face swapping represented by “deepfake” is only one of these.
In addition, the unscientific term “deepfake” easily stigmatizes the underlying AI technology, which may stifle its potential social benefits and hinder its development and application. The AI technology behind deepfakes has great positive value in applications such as AI-synthesized news anchors, virtual singers, and face-swapping features in social media.
Therefore, although the emergence of deepfake has brought widespread attention to the AI technology behind it, it is not scientific to overemphasize the potential deceptiveness or possible negative effects of the technology.
Myth 4: Deep Synthesis Will Only Have a Negative Impact: Deepfake Porn
Highly realistic deep synthesis technology does carry the risk of being abused by the pornographic industry. Even so, its large positive value will continue to bring social benefits, and it is already widely used in film and television, entertainment, education, medical care, social networking, e-commerce, content marketing, artistic creation, scientific research, and many other fields.
For example, in the post-production of film and television works, deep synthesis technology has been used to “revive” actors or achieve “digital dubbing” in multiple languages. A large number of social and content applications such as AI anchors, virtual singers, AI face-swapping, and virtual humans have also begun to emerge.
In the medical field, deep synthesis technology can allow patients at risk of losing their voices to regain their “own voice”. It can also generate medical images that are indistinguishable from real ones for training AI systems, which helps address problems such as insufficient data and patient privacy protection.
In conclusion, deep synthesis is not a technology of “fakery” and “deception”. Like any other technology, it has created difficulties that must be faced, but these do not erase the progress it brings to society.
Myth 5: Deep Synthesis Is a Complete Shock to Media Trust
As artificial intelligence becomes widespread, cultivating the public's ability to evaluate information is also an important part of governance. Earlier editing tools such as Photoshop could also synthesize content to a certain extent, yet they did not destroy social trust. On the contrary, society adapted and put the technology to good use.
Some reports describe deep synthesis technology as a destroyer of social truth. They argue that its existence will lead people to distrust media information by default, and that the public can invoke “deepfake” to cast doubt on anything they want to doubt.
The problem is that false information could be concocted before this technology appeared, using traditional audio and video editing, or even without technical means, through simple methods such as quoting out of context. Media trust cannot be built merely by blocking a particular technology; it also requires regulating how content is produced, disseminated, and received.
The emergence of deep synthesis technology has made us realize that what we see is not necessarily “real”, which is an important opportunity to strengthen the ability of the public to distinguish information.
Before modern technology, most humans lived in a very small world. But modern technology, represented by the Internet, allows each of us to connect with the world. Technologies like deep synthesis can create beauty that does not exist in the physical world, let us feel it, and it will inevitably push the beauty of human life to the next level.