FaceSwap is an app that I originally created as an exercise for my students in "Mathematics in Multimedia" at the Warsaw University of Technology. The app is written in Python and uses face alignment, Gauss-Newton optimization and image blending to swap the face of a person seen by the camera with the face of a person in a provided image.
You will find a short presentation of the program's capabilities in the video below (click to go to YouTube):
To start the program you will have to run a Python script named zad2.py (Polish for exercise 2). You need to have Python 3 and some additional libraries installed. Once Python is on your machine, you should be able to install the libraries automatically by running pip install -r requirements.txt in the repo's root directory.
You will also have to download the face alignment model from here: http://sourceforge.net/projects/dclib/files/dlib/v18.10/shape_predictor_68_face_landmarks.dat.bz2 and unpack it to the main project directory.
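If you prefer to unpack the downloaded archive from Python rather than with an external tool, the following is a minimal sketch using only the standard library. The helper name unpack_bz2 is hypothetical and not part of the repo; it assumes the archive has already been downloaded.

```python
import bz2
import shutil
from pathlib import Path


def unpack_bz2(archive_path, target_dir="."):
    """Decompress a .bz2 file into target_dir and return the output path."""
    archive_path = Path(archive_path)
    # Dropping the trailing ".bz2" gives the name of the unpacked file.
    output_path = Path(target_dir) / archive_path.stem
    with bz2.open(archive_path, "rb") as src, open(output_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return output_path


# Example call, assuming the archive sits in the current directory:
# unpack_bz2("shape_predictor_68_face_landmarks.dat.bz2")
```

Run from the main project directory, the example call would leave shape_predictor_68_face_landmarks.dat next to zad2.py.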
A faster and more stable version of FaceSwap is available on Dropbox here. This new version is based on the Deep Alignment Network method, which is faster than the currently used method if run on a GPU and provides more stable and more precise facial landmarks. Please see the GitHub repository of Deep Alignment Network for setup instructions.
I hope to find time to include this faster version in the repo code soon.
The general outline of the method is as follows:
First we take the input image (the image of a person we want to see on our own face) and find the face region and its landmarks. Once we have those, we fit the 3D model to the landmarks (more on that later). The vertices of that model, projected to the image space, will be our texture coordinates.
Once that is finished and everything is initialized, the camera starts capturing images. For each captured image the following steps are taken:
The most crucial element of the entire process is the fitting of the 3D model. The model itself consists of:
The model is projected into the image space using the following equation:

$$s = a P \left( S_0 + \sum_{i=1}^{n} w_i S_i \right) + t$$

where s is the projected shape, a is the scaling parameter, P are the first two rows of a rotation matrix that rotates the 3D face shape, S_0 is the neutral face shape, w_1, ..., w_n are the blendshape weights, S_1, ..., S_n are the blendshapes, t is a 2D translation vector and n is the number of blendshapes.
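To make the projection concrete, here is a small NumPy sketch of it. The function name project_shape and the array layouts (vertices stored as columns, blendshapes stacked along the first axis) are assumptions made for illustration and may differ from what the repo code does.

```python
import numpy as np


def project_shape(a, R, t, w, S0, blendshapes):
    """Project a blendshape face model into the image plane.

    a: scalar scale; R: 3x3 rotation matrix; t: 2D translation;
    w: (n,) blendshape weights; S0: (3, V) neutral shape;
    blendshapes: (n, 3, V) blendshapes. Returns a (2, V) array.
    """
    P = R[:2]  # first two rows of the rotation matrix
    # Neutral shape plus the weighted sum of blendshapes: S0 + sum_i w_i S_i
    shape3d = S0 + np.tensordot(w, blendshapes, axes=1)
    return a * P @ shape3d + np.asarray(t, dtype=float).reshape(2, 1)
```

With R set to the identity, the projection simply drops the depth coordinate, scales the remaining two coordinates by a and shifts them by t.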
The model fitting is accomplished by minimizing the difference between the projected shape and the localized landmarks. The minimization is carried out with respect to the blendshape weights, scaling, rotation and translation, using the Gauss-Newton method.
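The fitting loop can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the repo: the rotation is parametrized as an axis-angle vector, the Jacobian is approximated with forward differences instead of being derived analytically, and all function names are hypothetical. The parameter vector is laid out as [a, r_x, r_y, r_z, t_x, t_y, w_1, ..., w_n].

```python
import numpy as np


def rodrigues(r):
    """Rotation matrix for an axis-angle vector r (Rodrigues' formula)."""
    angle = np.linalg.norm(r)
    if angle < 1e-12:
        return np.eye(3)
    k = r / angle
    K = np.array([[0, -k[2], k[1]], [k[2], 0, -k[0]], [-k[1], k[0], 0]])
    return np.eye(3) + np.sin(angle) * K + (1 - np.cos(angle)) * K @ K


def residual(params, S0, blendshapes, landmarks):
    """Projected shape minus target landmarks, flattened to a vector."""
    a, r, t, w = params[0], params[1:4], params[4:6], params[6:]
    shape3d = S0 + np.tensordot(w, blendshapes, axes=1)
    projected = a * rodrigues(r)[:2] @ shape3d + t.reshape(2, 1)
    return (projected - landmarks).ravel()


def gauss_newton(params, S0, blendshapes, landmarks, iters=20, eps=1e-6):
    """Minimize the squared residual with plain Gauss-Newton steps."""
    params = np.asarray(params, dtype=float).copy()
    for _ in range(iters):
        r = residual(params, S0, blendshapes, landmarks)
        # Forward-difference Jacobian, one column per parameter.
        J = np.empty((r.size, params.size))
        for j in range(params.size):
            step = np.zeros_like(params)
            step[j] = eps
            J[:, j] = (residual(params + step, S0, blendshapes, landmarks) - r) / eps
        # Gauss-Newton update: least-squares solution of J @ delta = -r.
        delta = np.linalg.lstsq(J, -r, rcond=None)[0]
        params += delta
        if np.linalg.norm(delta) < 1e-10:
            break
    return params
```

A reasonable starting point for this sketch is unit scale with zero rotation, translation and blendshape weights. Plain Gauss-Newton has no damping, so it relies on the initial guess being reasonably close to the solution.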
The code is licensed under the MIT license. Some of the data in the project is downloaded from 3rd party websites:
If you need help or you found the app useful, do not hesitate to let me know.
Marek Kowalski, homepage: http://home.elka.pw.edu.pl/~mkowals6/