From Snapchat face swaps to a talking Mona Lisa, DeepFake technology has come a long way. This uncanny technology has given us chills with its advances in face swapping. Recently, we have seen Jim Carrey DeepFaked into Jack Nicholson's role in the cult classic The Shining, Nicolas Cage's face swapped with John Travolta's in the 1997 blockbuster Face/Off, and much more.
Recently, researchers from Bar-Ilan University and the Open University of Israel developed a model known as FSGAN for face swapping and reenactment in images and videos. Face swapping transfers a face from a source image to a target image, while face reenactment (or face puppeteering) uses the facial movements and expression deformations of a control face in one video to guide the motions and deformations of a face appearing in another video.
Behind the Model
The above image shows an overview of the Face Swapping GAN (FSGAN), which consists of three main components:
- Reenactment And Segmentation Generator: This component estimates the reenacted face and its segmentation, as well as the face and hair segmentation mask of the target image
- Face Inpainting Network: This component inpaints the missing pixels of the reenacted face, guided by the face and hair segmentation mask, to estimate a complete reenacted face
- Blending Of The Completed Face: The blending generator blends the completed reenacted face into the target's face using the segmentation mask
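The three stages above can be sketched as a simple pipeline. Note that this is only an illustrative skeleton: the real components are trained neural networks, and the function names and naive array operations below are hypothetical stand-ins, not the paper's implementation.

```python
import numpy as np

def reenact_and_segment(source_face, target_face):
    """Stand-in for the reenactment/segmentation generator.

    The real model warps the source face to the target's pose and
    predicts a face-and-hair mask; here we just copy the source and
    use a fixed square mask for illustration."""
    reenacted = source_face.copy()
    mask = np.zeros(target_face.shape[:2], dtype=bool)
    mask[2:6, 2:6] = True  # hypothetical face region
    return reenacted, mask

def inpaint(reenacted, mask):
    """Stand-in for the inpainting network: crudely fill pixels
    outside the mask with the mean colour of the visible face."""
    completed = reenacted.copy()
    completed[~mask] = reenacted[mask].mean()
    return completed

def blend(completed, target_face, mask):
    """Stand-in for the blending generator: paste the completed
    face into the target frame using the segmentation mask."""
    out = target_face.copy()
    out[mask] = completed[mask]
    return out

source = np.random.rand(8, 8, 3)
target = np.random.rand(8, 8, 3)
reenacted, mask = reenact_and_segment(source, target)
result = blend(inpaint(reenacted, mask), target, mask)
```

The key structural point survives even in this toy version: the segmentation mask produced in the first stage drives both the inpainting and the final blend.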
To train the generators, video sequences from the IJB-C dataset are used. IJB-C contains approximately 11k face videos, of which only the 5,500 that are in high definition were used for training.
FSGAN is a deep learning-based approach which can be applied to different subjects without requiring subject-specific training.
The key takeaways of this model are:
- Subject Agnostic Swapping And Reenactment: This model can simultaneously manipulate pose, expression and identity without requiring person-specific or pair-specific training, while producing high-quality and temporally coherent results
- Multiple View Interpolation: This model has the capability of interpolating between multiple views of the same face in a continuous manner based on reenactment
- New Loss Functions: Two new loss functions have been proposed which include a stepwise consistency loss for training face reenactment progressively in small steps, and a Poisson blending loss to train the face blending network to seamlessly integrate the source face into its new context
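The Poisson blending loss can be understood as a gradient-domain objective: match the source's gradients inside the face region while matching the target's pixels outside it. The numpy sketch below illustrates that idea only; the function name and formulation are assumptions for illustration, not the paper's exact loss.

```python
import numpy as np

def poisson_blend_loss(output, source, target, mask):
    """Illustrative Poisson-style blending loss: penalise gradient
    mismatch with the source inside the mask and pixel mismatch
    with the target outside it."""
    gy_o, gx_o = np.gradient(output, axis=(0, 1))
    gy_s, gx_s = np.gradient(source, axis=(0, 1))
    grad_term = (np.abs(gy_o - gy_s)[mask].mean()
                 + np.abs(gx_o - gx_s)[mask].mean())
    pix_term = np.abs(output - target)[~mask].mean()
    return grad_term + pix_term

# Tiny demo: a perfect blend (source gradients preserved in the mask,
# target pixels preserved outside it) incurs zero loss.
face = np.random.rand(8, 8, 3)
mask = np.zeros((8, 8), dtype=bool)
mask[2:6, 2:6] = True
zero_loss = poisson_blend_loss(face, face, face, mask)
```

Framing blending as a loss lets the blending network learn seamless compositing end-to-end, rather than running classic Poisson editing as a separate post-processing step.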
Limitation of This Model
- This model faces a few limitations. For example, the larger the angular difference between the source and target poses, the more the identity and texture quality degrade
- Also, too many iterations of the face reenactment generator may blur the texture of the image
- This method is limited to the resolution of the training data
- Another limitation arises from using a sparse landmark tracking method which fails to fully capture the complexity of facial expressions
The proposed model has outperformed several existing face-swapping methods without any explicit training on subject-specific images. Generative Adversarial Networks (GANs) have been used successfully for face manipulation tasks such as generating realistic images of fake faces, though these methods have both a good and a bad side. The researchers behind this project state that the method was introduced partly to draw attention to privacy concerns such as face de-identification and face swapping in images. Applications like face swapping and face reenactment are attracting significant research attention due to their uses in domains such as entertainment (visual media production, graphics, pattern recognition) and privacy (photorealistic face de-identification, exchanging faces in images), among others.
Read the paper here.