ArcFace was developed by researchers at Imperial College London. It is a module of the InsightFace face analysis toolbox. The original study is based on MXNet and Python. However, we will run its third-party re-implementation on Keras. The original study got a 99.83% accuracy score on the LFW data set whereas the Keras re-implementation got 99.40%. So, the re-implementation seems robust as well, and the MXNet model appears to be reproducible with Keras. In this post, we will build the ArcFace model from scratch.
Face recognition pipeline
A pipeline consists of 4 common stages: detect, align, represent and verify. Herein, ArcFace is a regular face recognition algorithm responsible for representation.
Source code
The source code of this study is pushed to GitHub as a Jupyter notebook. You should clone it before starting this tutorial. Besides, you should clone the ResNet architecture building dependency and the pre-trained weights. BTW, you can support this study⭐️ if you star the repo.
Pre-trained model
The Keras re-implementation of ArcFace shares the pre-trained model in its repo. However, it is saved as a monolithic file: the model structure and the pre-trained weights are stored in a single h5 file. That model was saved with TensorFlow 2, and it might cause trouble if you try to load it with a different TensorFlow version. That’s why I prefer to build the model structure manually in code and save just the pre-trained weights to avoid version problems.
Model structure
ArcFace is mainly based on ResNet34 model. The following illustration explains the model. It has a very complex architecture.
I’ve already designed the network architecture from scratch. You should download ArcFace.py here, and then call its loadModel function as demonstrated below. Notice that the following program should be in the same directory as ArcFace.py.
# https://github.com/serengil/deepface/blob/master/deepface/basemodels/ArcFace.py
import ArcFace

# keep a reference to the model; it is used in the following steps
model = ArcFace.loadModel()
The ArcFace model expects (112, 112, 3) shaped inputs and returns 512-dimensional vector representations.
Pre-trained weights
Your friendly neighborhood blogger saved just the pre-trained weights and shared them on Google Drive. The file size is 133 MB. We have already built the model structure in the previous step. Now, we can load the pre-trained weights as shown below.
# Google Drive link: https://drive.google.com/uc?id=1LVB3CdVejpmGHM28BpqqkbZP5hDEcdZY
# GitHub link: https://github.com/serengil/deepface_models/releases/download/v1.0/arcface_weights.h5
model.load_weights("arcface_weights.h5")
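If you would rather not download the file by hand, the step above can be wrapped in a small helper. This is only a sketch: the URL is the GitHub link from the snippet above, while `ensure_weights` and its `fetch` parameter are hypothetical names I made up for illustration.

```python
import os
import urllib.request

# GitHub release link quoted in this post
WEIGHTS_URL = "https://github.com/serengil/deepface_models/releases/download/v1.0/arcface_weights.h5"


def ensure_weights(path="arcface_weights.h5", url=WEIGHTS_URL,
                   fetch=urllib.request.urlretrieve):
    """Download the weights file only if it is not on disk yet.

    Hypothetical helper: `fetch(url, path)` must write the file to `path`.
    It defaults to urllib's urlretrieve and can be swapped for testing.
    """
    if not os.path.isfile(path):
        fetch(url, path)
    return path
```

With this in place, you could call model.load_weights(ensure_weights()) and the 133 MB download would happen only on the first run.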
Early stages of pipeline
ArcFace is responsible for the representation stage of a face recognition pipeline whereas detection and alignment are early stages. Luckily, deepface can handle those early stages. It wraps opencv, ssd, mtcnn and dlib for face detection.
My experiments show that MTCNN is the most robust detector but it is also the slowest. SSD is the fastest one but its alignment is not as good as MTCNN's.
You can find out more about the math behind alignment in the following video:
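To give a rough idea of that math: alignment boils down to rotating the image until the two eyes sit on a horizontal line. The sketch below computes the rotation angle with atan2 and checks it by rotating the eye points themselves. It is my own simplified version, not necessarily what deepface does internally, and `eye_alignment_angle` and `rotate_point` are hypothetical names.

```python
import math


def eye_alignment_angle(left_eye, right_eye):
    """Angle in degrees between the eye line and the horizontal axis.

    Points are (x, y) pixel coordinates. "left" and "right" refer to the
    position in the image, not to the subject's point of view.
    """
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    return math.degrees(math.atan2(dy, dx))


def rotate_point(point, center, angle_deg):
    """Rotate `point` around `center` by `angle_deg` in the same coordinate system."""
    theta = math.radians(angle_deg)
    dx, dy = point[0] - center[0], point[1] - center[1]
    x = center[0] + dx * math.cos(theta) - dy * math.sin(theta)
    y = center[1] + dx * math.sin(theta) + dy * math.cos(theta)
    return (x, y)


# illustrative eye coordinates of a tilted face
left, right = (30, 40), (70, 60)
angle = eye_alignment_angle(left, right)
center = ((left[0] + right[0]) / 2, (left[1] + right[1]) / 2)

# rotating by the negative angle puts both eyes at the same height
aligned_left = rotate_point(left, center, -angle)
aligned_right = rotate_point(right, center, -angle)
```

When you rotate a real image instead of points, the sign of the angle depends on the rotation convention of the imaging library you use, so check it on a sample image first.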
Besides, face detectors return faces as rectangular areas. So, detected faces come with some noise such as the background. We can find 68 different landmarks of a face with dlib. In this way, we can crop out the noise around a facial image.
In addition, MediaPipe can find 468 landmarks. Please see its real time implementation in the following video. Recommended tutorials: Deep Face Detection with MediaPipe, Zoom Style Virtual Background Setup with MediaPipe.
Here, RetinaFace is the cutting-edge face detection technology. It can even detect faces in a crowd, and it finds facial landmarks including eye coordinates. That’s why its alignment score is very high.
Preprocessing
#!pip install deepface
from deepface.commons import functions

img1_path = "img1.jpg"
img2_path = "img2.jpg"

img1 = functions.preprocess_face(img1_path, target_size = (112, 112))
img2 = functions.preprocess_face(img2_path, target_size = (112, 112))
I’m going to test the model on two iconic characters of Game of Thrones: Emilia Clarke (Daenerys Targaryen) and Lena Headey (Cersei Lannister). In particular, Emilia’s daily appearance and her role look very different. I might not even recognize her if I didn’t know her. Similarly, Lena looks very different with short and long hair.
Representation
We already applied detection and alignment to the facial images and also resized them to the expected size. Now, we can feed those preprocessed facial images to ArcFace.
img1_embedding = model.predict(img1)[0]
img2_embedding = model.predict(img2)[0]
Verification
We fed the facial images to a CNN model and it represented each image of the pair as a 512-dimensional vector. Here, we expect the distance between the two representations to be low for the same person and higher for different persons.
from deepface.commons import distance as dst

metric = 'euclidean'

if metric == 'cosine':
    distance = dst.findCosineDistance(img1_embedding, img2_embedding)
elif metric == 'euclidean':
    distance = dst.findEuclideanDistance(img1_embedding, img2_embedding)
elif metric == 'euclidean_l2':
    distance = dst.findEuclideanDistance(dst.l2_normalize(img1_embedding), dst.l2_normalize(img2_embedding))
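If you wonder what those three metrics compute, here is a dependency-free sketch based on the textbook definitions. I assume the deepface helpers follow the same definitions; the function names below are my own, and the two short vectors are made up for illustration (real ArcFace embeddings have 512 dimensions).

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def cosine_distance(a, b):
    """1 - cosine similarity: 0 for same direction, 1 for orthogonal vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# toy vectors standing in for embeddings
a, b = [3.0, 4.0], [4.0, 3.0]

cosine = cosine_distance(a, b)
euclidean = euclidean_distance(a, b)
euclidean_l2 = euclidean_distance(l2_normalize(a), l2_normalize(b))
```

Notice that for unit-length vectors the squared euclidean_l2 distance equals two times the cosine distance, so these two metrics always rank pairs in the same order.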
Threshold
We have distance values, but how do we determine whether a distance is low or high? The easiest way is to feed lots of positive and negative instances. Then, a decision tree algorithm can find the best split point. Here, you can find a detailed tutorial: Fine Tuning the Threshold in Face Recognition. I fed the unit test images of deepface. Here, master.csv stores the image pairs and whether they are the same person or not. There are 37 positive and 239 negative instances, 276 in total.
It seems that the yes and no classes are well separated. That’s good. The target labels are unbalanced. That’s why the no class reaches higher values on the y-axis.
When I feed the distance values and target classes to the C4.5 algorithm, it finds the threshold values that maximize information gain.
def findThreshold(metric):
    if metric == 'cosine':
        return 0.6871912959056619
    elif metric == 'euclidean':
        return 4.1591468986978075
    elif metric == 'euclidean_l2':
        return 1.1315718048269017
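To make the idea behind those numbers more concrete, here is a minimal sketch of searching for the split point that maximizes information gain on a single distance column. It is a simplification and not the exact C4.5 procedure: I take the midpoint between neighboring distances as the candidate threshold, and the distances and labels below are made up for illustration.

```python
import math


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    result = 0.0
    for cls in set(labels):
        p = labels.count(cls) / total
        result -= p * math.log2(p)
    return result


def best_threshold(distances, labels):
    """Return (threshold, gain) for the split `distance <= threshold`
    with the highest information gain."""
    pairs = sorted(zip(distances, labels))
    sorted_labels = [label for _, label in pairs]
    base = entropy(sorted_labels)
    best = (None, -1.0)
    for i in range(1, len(pairs)):
        low, high = pairs[i - 1][0], pairs[i][0]
        if low == high:
            continue  # no threshold can separate identical distances
        left, right = sorted_labels[:i], sorted_labels[i:]
        weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        gain = base - weighted
        if gain > best[1]:
            best = ((low + high) / 2, gain)
    return best


# made-up distances: "yes" means same person, "no" means different persons
distances = [0.2, 0.3, 0.5, 0.9, 1.1, 1.3]
labels = ["yes", "yes", "yes", "no", "no", "no"]
threshold, gain = best_threshold(distances, labels)
```

On this toy data the two classes are perfectly separable, so the best split falls between 0.5 and 0.9. Real distance distributions overlap, and then the best threshold is the one that leaves the least mixed groups on both sides.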
Accuracy
As I mentioned before, this Keras re-implementation got 99.40% accuracy on the LFW data set. I also fed the unit test instances of deepface and found the best thresholds. Here are the precision, recall, f1-score and accuracy values for each metric based on the thresholds found.
| metric | precision | recall | f1-score | accuracy |
| --- | --- | --- | --- | --- |
| cosine | 0.98 | 0.89 | 0.93 | 0.97 |
| euclidean | 0.98 | 0.86 | 0.91 | 0.96 |
| euclidean_l2 | 0.98 | 0.89 | 0.93 | 0.97 |
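For readers who want to reproduce such a table on their own pairs, the four scores follow directly from the confusion matrix counts. The sketch below uses hypothetical counts chosen only to keep the arithmetic easy; they are not the counts behind the table above.

```python
def classification_scores(tp, fp, tn, fn):
    """Precision, recall, f1 and accuracy from confusion matrix counts.

    Assumes at least one predicted positive and one actual positive,
    otherwise the divisions below are undefined.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# hypothetical counts for an unbalanced test set (10 positive, 90 negative pairs)
scores = classification_scores(tp=8, fp=2, tn=88, fn=2)
```

This also shows why accuracy alone can be misleading on unbalanced data: in the toy example accuracy is 0.96 although only 8 of the 10 positive pairs are caught.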
Decision
We have both the distance and the threshold value for each distance metric. Now, let’s make the decision.
threshold = findThreshold(metric)

if distance <= threshold:
    print("they are same person")
else:
    print("they are different persons")
True positive pairs
ArcFace verifies the identities of Emilia Clarke and Lena Headey amazingly well. Emilia looks very different in her daily life and in her role in GOT. Similarly, Lena looks very different with short hair and with long hair. Still, the ArcFace model verifies them.
True negative pairs
ArcFace succeeded at verifying the true negative pairs as well. That’s amazing work!
Running ArcFace in deepface
We have built the ArcFace model from scratch and applied all stages of a pipeline step by step. On the other hand, we can build and run the ArcFace model within deepface with a few lines of code.
We can apply face verification or find an identity in a database. We just set the model name argument to ArcFace here.
#!pip install deepface
from deepface import DeepFace

# face verification
obj = DeepFace.verify("img1.jpg", "img2.jpg", model_name = 'ArcFace')
print(obj)

# face recognition
df = DeepFace.find("img1.jpg", db_path = "C:/my_db", model_name = 'ArcFace')
print(df.head())
deepface also wraps VGG-Face, Google FaceNet, OpenFace, Facebook DeepFace, DeepID and Dlib models. Herein, FaceNet, VGG-Face and Dlib outperform the others.
Real time face recognition
It supports real time face recognition as well.
Meanwhile, you can run face verification tasks directly in your browser with its custom ui built with ReactJS.
Anti-Spoofing and Liveness Detection
What if DeepFace is given fake or spoofed images? This becomes a serious issue if it is used in a security system. To address this, DeepFace includes an anti-spoofing feature for face verification or liveness detection.
Large scale face recognition
A face recognition pipeline actually verifies whether an image pair shows the same person or different persons. Herein, face recognition requires applying face verification several times. Deepface can find an identity in a database quickly because it stores the representations of the database items beforehand.
Notice that face recognition has O(n) time complexity, and this becomes problematic for data at the scale of millions. Herein, you can run deepface with Elasticsearch.
On the other hand, an approximate nearest neighbor (a-nn) algorithm is not guaranteed to always find the closest item. We can still apply the exact k-nn algorithm here. The map reduce technology of big data systems might satisfy both speed and confidence. MongoDB, Cassandra and Hadoop are among the most popular solutions in the NoSQL world. Besides, if you have a powerful database such as Oracle Exadata, then an RDBMS and regular SQL might satisfy your needs as well.
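The exact search described in this section can be sketched in a few lines. The database is just a mapping from an identity to its precomputed embedding, and every query scans all of it, which is where the O(n) cost comes from. The names `find_identity` and `database` are hypothetical, and the two-dimensional vectors stand in for real 512-dimensional embeddings.

```python
import math


def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def find_identity(target, database, threshold):
    """Linear scan over precomputed embeddings: one distance per entry.

    Returns (name, distance) of the closest entry, or (None, distance)
    when even the closest entry is farther than the threshold.
    """
    best_name, best_distance = None, math.inf
    for name, embedding in database.items():
        d = euclidean_distance(target, embedding)
        if d < best_distance:
            best_name, best_distance = name, d
    if best_distance <= threshold:
        return best_name, best_distance
    return None, best_distance


# toy database of precomputed embeddings
database = {"alice": [0.0, 0.0], "bob": [3.0, 4.0]}

match = find_identity([0.1, 0.0], database, threshold=1.0)
no_match = find_identity([10.0, 10.0], database, threshold=1.0)
```

The embeddings of the database items are computed once and stored, so only the target image has to pass through the model at query time.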
Approximate Nearest Neighbor
As explained in this tutorial, facial recognition models are used to verify whether a face pair shows the same person or different persons. This is actually face verification instead of face recognition, because face recognition requires performing face verification many times. Now, suppose that you need to find an identity in a billion-scale database, e.g. the citizen database of a country, where a citizen may have many images. An exact search has to compare the target against every entry, so it has O(n) time complexity per query, where n is the number of entries in your database (O(n log n) if you also sort all the distances to rank the candidates).
On the other hand, approximate nearest neighbor algorithms reduce the time complexity dramatically, to roughly O(log n)! Vector indexes such as Annoy, Voyager and Faiss, and vector databases such as Postgres with pgvector and RediSearch, run this kind of algorithm to find the vectors similar to a given vector in milliseconds, even among billions of entries.
So, if you have a robust facial recognition model, then it is not a big deal to run it on billions of entries!
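To get a feeling for how an approximate search avoids scanning everything, here is a toy index based on random-hyperplane hashing (a form of locality sensitive hashing). I picked this technique only because it fits in a few lines; the libraries mentioned above use their own, more sophisticated index structures. The class name `HyperplaneLSH` and its parameters are hypothetical.

```python
import random


class HyperplaneLSH:
    """Toy approximate nearest neighbor index using random hyperplanes."""

    def __init__(self, dim, num_bits=8, seed=0):
        rng = random.Random(seed)
        # each random hyperplane contributes one bit of the hash key
        self.planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                       for _ in range(num_bits)]
        self.buckets = {}

    def _key(self, vector):
        # the side of each hyperplane the vector falls on
        return tuple(
            sum(p * x for p, x in zip(plane, vector)) >= 0.0
            for plane in self.planes
        )

    def add(self, name, vector):
        self.buckets.setdefault(self._key(vector), []).append((name, vector))

    def query(self, vector):
        """Return the closest stored name in the query's bucket, or None."""
        candidates = self.buckets.get(self._key(vector), [])
        if not candidates:
            return None

        def squared_distance(item):
            return sum((a - b) ** 2 for a, b in zip(item[1], vector))

        return min(candidates, key=squared_distance)[0]


# toy 4-dimensional vectors standing in for embeddings
index = HyperplaneLSH(dim=4)
index.add("a", [1.0, 0.0, 0.0, 0.0])
index.add("b", [-1.0, 0.0, 0.0, 0.0])
index.add("c", [0.0, 1.0, 0.0, 0.0])

result = index.query([2.0, 0.0, 0.0, 0.0])
```

Only the items that share the query's bucket are compared, which is what makes the search fast. It is also why the result is approximate: a true nearest neighbor that lands in a different bucket is simply never seen.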
Ensemble
deepface wraps many face recognition models and they are all amazing. Herein, deepface offers a special boosting solution to improve the model accuracy. This comes with higher accuracy but it is really slow. If your priority is accuracy instead of speed, you might consider adopting this approach.
Tech Stack Recommendations
Face recognition is mainly based on representing facial images as vectors. Herein, storing the vector representations is a key factor for building robust facial recognition systems. I summarize the tech stack recommendations in the following video.
The Best Single Model
DeepFace has many cutting-edge models in its portfolio. Find out the best configuration for facial recognition model, detector, similarity metric and alignment mode.
DeepFace API
DeepFace offers a web service for face verification, facial attribute analysis and vector embedding generation through its API. You can watch a tutorial on using the DeepFace API here:
Additionally, DeepFace can be run with Docker to access its API. Learn how in this video:
Conclusion
In this post, we covered a state-of-the-art face recognition model: how to build and run it from scratch, and how it performs. We also saw how to run it with a few lines of code.
I pushed the source code of this study to GitHub. You can support this study if you star⭐️ the repo.
Deepface weights take a long time to load. I am using an Intel CoreI3 processor with 8 GB RAM, Windows. It takes 50 seconds to 2 minutes to load the weights.
“DeepFace.analyze” with images does not take that much time though.
Would you please suggest how I can reduce the loading time reasonably?
Hi, a wonderful notebook to follow through, most resources are accessible and convenient.
However, you mention that you have already provided the pre-trained weights for ArcFace in this notebook. I would like to know which dataset was used to train the model. Would it be all right to share the data?
I did not train the model. I just use the pre-trained weights.
The pre-trained weights gdrive link is not valid anymore.
You can find the up-to-date files here: https://github.com/serengil/deepface_models/releases/tag/v1.0
I will update the post soon