Face Recognition with SphereFace in Python

Deep learning based face recognition models are now accepted as the state of the art because they dominate the face recognition field. SphereFace is a promising one that has already passed human-level performance. In this post, we are going to implement the SphereFace model in Python for face recognition.

Objective

Notice that a modern face recognition pipeline consists of 4 common stages: detect, align, represent and verify. So does SphereFace! Here, SphereFace is a regular CNN model responsible for the represent stage: it turns faces into vectors.



I pushed the source code of this study to GitHub as a notebook. You can support this study by starring⭐️ the repo.

Reference study

SphereFace is a joint work of researchers from Georgia Institute of Technology, Carnegie Mellon University and Sun Yat-Sen University.

The authors of the paper shared the pre-trained model for Caffe in this GitHub repo. The published model got 99.30% accuracy on the LFW data set. We will directly apply transfer learning in this post. In other words, we will use the pre-trained weights and no training will be applied.

Structure: https://github.com/wy1iu/sphereface/blob/master/train/code/sphereface_deploy.prototxt
Weights: https://drive.google.com/open?id=0B_geeR2lTMegb2F6dmlmOXhWaVk

Caffe within OpenCV

Caffe is not a lightweight framework and it is really problematic to install. To be honest, I did not manage to install it on Windows. However, the OpenCV dnn module provides a common interface to build and call Caffe models. The deep neural networks module is included in the contrib distribution.

#!pip install opencv-python==4.2.0.34
#!pip install opencv-contrib-python==4.3.0.36

Then, it is easy to build Caffe models.

import cv2

model = cv2.dnn.readNetFromCaffe("sphereface_deploy.prototxt", "sphereface_model.caffemodel")

Model structure

SphereFace expects (112x96x3) shaped inputs and returns 512-dimensional embeddings. It has a 20-layer architecture. Here, you can find a pretty visualization of the SphereFace-20 model.

SphereFace structure

Basically, it has 3x64 filters; 5x128 filters; 9x256 filters; 3x512 filters respectively, and all filters are 3×3 shaped. Notice that the sum of the first multipliers (3 + 5 + 9 + 3) is 20. It is equal to the number of layers of the model.
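
The layer count can be checked with a few lines of Python. This is only a sketch that restates the stage sizes from the paragraph above; the variable names are mine and do not come from the SphereFace code.

```python
# (number of 3x3 convolution layers, number of filters) per stage,
# restated from the paragraph above
stages = [(3, 64), (5, 128), (9, 256), (3, 512)]

# the first multipliers add up to the depth of the model
total_layers = sum(count for count, _ in stages)
print(total_layers)  # 20
```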

Pre-processing

Face detection and alignment are the pre-processing stages of a face recognition pipeline. We are going to use deepface to handle them. The preprocess_face function returns a (112x96x3) shaped array but the Caffe model expects (1x3x96x112) shaped inputs. The blobFromImage function handles this reshaping.

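#!pip install deepface
import cv2
from deepface.commons import functions

# input size passed to deepface as target_size
input_shape = (112, 96)

# img1_path and img2_path are the paths of the two face images to compare
img1 = functions.preprocess_face(img1_path, target_size=input_shape, detector_backend='opencv')[0]
img2 = functions.preprocess_face(img2_path, target_size=input_shape, detector_backend='opencv')[0]

# convert the pre-processed faces to 4-dimensional blobs for the dnn module
img1_blob = cv2.dnn.blobFromImage(img1)
img2_blob = cv2.dnn.blobFromImage(img2)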
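
To make the reshape more concrete, here is a small NumPy sketch of what blobFromImage does to the array layout when it is called with its default arguments: the channels axis moves to the front and a batch axis is added. The tiny image size is arbitrary and only used for illustration, and the sketch ignores the other options of blobFromImage (scaling, mean subtraction, resizing).

```python
import numpy as np

# Stand-in for a pre-processed face: height x width x channels (HWC).
# The sizes are arbitrary and only for illustration.
height, width = 4, 6
img = np.random.rand(height, width, 3).astype(np.float32)

# Move channels first (CHW), then add a leading batch axis (NCHW).
blob = np.transpose(img, (2, 0, 1))[np.newaxis, ...]

print(img.shape)   # (4, 6, 3)
print(blob.shape)  # (1, 3, 4, 6)
```

The pixel values are unchanged; only the axis order differs.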

Face detection can be done with many solutions such as OpenCV, SSD, Dlib or MTCNN. Here, SSD and MTCNN are modern deep learning based approaches whereas OpenCV haar cascade and Dlib HoG are legacy methods. You can monitor the detection performance of those methods in the following video.

Here, you can watch how to use different face detectors in Python.

RetinaFace is the cutting-edge technology for face detection. It can even detect faces in a crowd. Besides, it finds some facial landmarks including eye coordinates. In this way, its alignment score is high as well.

Representation

The Caffe model is a regular CNN. However, we will use this CNN model to find vector embeddings instead of performing classification.

model.setInput(img1_blob)
img1_representation = model.forward()[0]

model.setInput(img2_blob)
img2_representation = model.forward()[0]

The SphereFace model returns a 512-dimensional vector.

Verification

We will find the distance between the two representations in the verification stage. Cosine similarity is adopted in the original study but my experiments show that Euclidean distance performs slightly better.

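# assumes a deepface release that still ships deepface.commons.distance
from deepface.commons.distance import findCosineDistance, findEuclideanDistance

metric = "euclidean"

if metric == "euclidean":
    threshold = 17.212238311767578
    distance = findEuclideanDistance(img1_representation, img2_representation)
elif metric == "cosine":
    threshold = 0.4668717384338379
    distance = findCosineDistance(img1_representation, img2_representation)

# the pair is accepted as the same person if the distance is small enough
if distance <= threshold:
    print("they are same person")
else:
    print("they are not same person")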
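
For reference, the two distances can be written in a few lines of NumPy. This is a sketch that assumes the standard definitions (Euclidean distance as the L2 norm of the difference, cosine distance as 1 minus cosine similarity). The function names mirror the ones used above, but the bodies are my own illustration and not necessarily identical to the helpers used in the notebook.

```python
import numpy as np

def findEuclideanDistance(source_representation, test_representation):
    # L2 norm of the element-wise difference
    a = np.asarray(source_representation, dtype=np.float64)
    b = np.asarray(test_representation, dtype=np.float64)
    return float(np.linalg.norm(a - b))

def findCosineDistance(source_representation, test_representation):
    # 1 - cosine similarity; 0 means the vectors point in the same direction
    a = np.asarray(source_representation, dtype=np.float64)
    b = np.asarray(test_representation, dtype=np.float64)
    similarity = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return float(1.0 - similarity)

# toy vectors instead of real 512-dimensional embeddings
print(findEuclideanDistance([3.0, 4.0], [0.0, 0.0]))  # 5.0
print(findCosineDistance([1.0, 0.0], [0.0, 1.0]))     # 1.0
```

Note that the two metrics live on different scales, which is why each one needs its own threshold.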

Besides, I fed the unit test images of deepface to the SphereFace model to find the best threshold with the C4.5 algorithm. The threshold is 17.21 for Euclidean distance whereas it is 0.47 for cosine distance. Moreover, the distributions of positive and negative classes seem to be consistent. This section can be found in the notebook of the study as well.
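
The idea behind a tree-based threshold can be sketched as a single split on the distance value. The code below is a simplified illustration and not the full C4.5 algorithm: it picks the split that maximizes information gain (C4.5 itself uses gain ratio), and the distances and labels are made-up numbers, not the deepface unit test results.

```python
import math

def entropy(labels):
    # binary entropy of a list of 0/1 labels
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def best_threshold(distances, labels):
    # labels: 1 = same person, 0 = different persons
    pairs = sorted(zip(distances, labels))
    values = [d for d, _ in pairs]
    sorted_labels = [y for _, y in pairs]
    base = entropy(sorted_labels)
    n = len(pairs)
    best_gain, best_split = -1.0, None
    for i in range(1, n):
        if values[i] == values[i - 1]:
            continue  # no split point between equal values
        left, right = sorted_labels[:i], sorted_labels[i:]
        gain = base - (len(left) / n) * entropy(left) - (len(right) / n) * entropy(right)
        if gain > best_gain:
            best_gain = gain
            best_split = (values[i - 1] + values[i]) / 2
    return best_split, best_gain

# hypothetical distances between image pairs and their ground truth
distances = [10.2, 12.5, 14.1, 16.0, 18.3, 19.7, 21.4, 23.0]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

threshold, gain = best_threshold(distances, labels)
print(threshold, gain)  # the split falls between 16.0 and 18.3
```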

Distributions

Predictions

So, we have built the prediction infrastructure. Now, we can make predictions.

These are some results for true positives.

True positives

Also, these are some results for false positives.

False positives

Conclusion

So, we have covered a promising face recognition model in this post. Its accuracy is higher than that of many state-of-the-art models on a public data set. I pushed the source code of this study to GitHub as a notebook. You can support this study by starring⭐️ the repo.

Bonus: out-of-the-box library for face recognition

deepface is a lightweight face recognition framework for Python. It currently wraps several state-of-the-art face recognition models including VGG-Face, Facenet, OpenFace, Facebook DeepFace and DeepID.

Here, you can watch the how-to video for deepface.

Besides, you can run deepface in real time with your webcam. It can also apply facial attribute analysis: age, gender, emotion and ethnicity.

Large scale face recognition

Large scale face recognition requires applying face verification many times. However, we can compute and store the representations of the identities in our database once. In this way, we just need to find the representation of the target image at query time. Finding distances between representations can be handled very fast. So, we can find an identity in a large scale data set in just seconds. Deepface offers an out-of-the-box function to handle large scale face recognition as well.
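
The idea of storing representations once and comparing only at query time can be sketched with NumPy. This is a brute-force illustration under my own assumptions: the database below is random data standing in for pre-computed 512-dimensional embeddings, and the target is a slightly perturbed copy of one entry. It is not the deepface function mentioned above.

```python
import numpy as np

rng = np.random.default_rng(0)

# hypothetical database: 1000 stored identities x 512-dimensional embeddings
database = rng.normal(size=(1000, 512)).astype(np.float32)

# hypothetical target: a slightly perturbed copy of the entry at index 42
noise = 0.01 * rng.normal(size=512).astype(np.float32)
target = database[42] + noise

# one vectorized pass finds the Euclidean distance to every stored embedding
distances = np.linalg.norm(database - target, axis=1)
best_index = int(np.argmin(distances))

print(best_index)  # 42
```

In a real pipeline, the distance of the best match would still be compared against the verification threshold before accepting it as the same person.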

Notice that face recognition has O(n) time complexity and this might be problematic for data at the level of millions or billions. Here, approximate nearest neighbor (a-nn) algorithms reduce the time complexity dramatically. Spotify Annoy, Facebook Faiss and NMSLIB are amazing a-nn libraries. Besides, Elasticsearch wraps NMSLIB and it is highly scalable. You should run deepface with those a-nn libraries if you have a really large scale database.




2 Comments

  1. Hi Sefik,
    Is it possible to use DEEPFACE with a GPU (GeForce RTX 3090)?
    If yes, what should I do to allow DEEPFACE to take advantage of the GPU? My work environment is Windows 10.
    thanks!

    1. Yes, you can run deepface on GPU. However, the PyPI package installs the regular tensorflow distribution. That's why you should install tensorflow-gpu manually and run "pip install deepface --no-deps".
