Face Recognition with Dlib in Python

Dlib is a powerful library with wide adoption in the image processing community, similar to OpenCV. Researchers mostly use its face detection and alignment modules. Beyond this, dlib offers a strong out-of-the-box face recognition module as well. Even though it is written in C++, it has a Python interface. In this post, we will mention how to apply face recognition with dlib in Python.

person-of-interest-face-recognition
Person of interest (2011)

Face recognition pipeline

A modern face recognition pipeline consists of 4 common stages: detect, align, represent and verify. Conveniently, all of those stages are covered in dlib's implementation.



Vlog

The following video explains how to apply face recognition within dlib. You can either watch the video or follow the blog post.

Model

Dlib's face recognition model is mainly inspired by ResNet-34. Davis E. King modified the regular ResNet structure, dropped some layers and rebuilt a neural network consisting of 29 convolution layers. It expects 150x150x3 sized inputs and represents face images as 128-dimensional vectors.

resnet-34
ResNet-34

He then re-trained the model on various data sets including FaceScrub and VGGFace2. In other words, the model learned how to find face representations from 3M samples. Then, he tested the built model on the labeled faces in the wild (LFW) data set, which is accepted as a baseline in face recognition research. He got 99.38% accuracy. On the other hand, human beings hardly reach a 97.53% score on the same data set. This means that the dlib face recognition model can compete with other state-of-the-art face recognition models, and with human beings as well.

Prerequisites

Dlib requires the facial landmark detector and ResNet model files. You can manually download the source files and decompress them. Alternatively, the following code block will download and unzip these required files if they don't exist in your current directory.

import os
import bz2
import gdown

def unzip_bz2_file(zipped_file_name):
    #decompress the downloaded archive next to it
    zipfile = bz2.BZ2File(zipped_file_name)
    data = zipfile.read()
    newfilepath = zipped_file_name[:-4] #discard .bz2 extension
    open(newfilepath, 'wb').write(data)

def download_file(url):
    output = url.split("/")[-1]
    gdown.download(url, output, quiet=False)
    return output

if not os.path.isfile('shape_predictor_5_face_landmarks.dat'):
    print("shape_predictor_5_face_landmarks.dat is going to be downloaded")
    url = "http://dlib.net/files/shape_predictor_5_face_landmarks.dat.bz2"
    output = download_file(url)
    unzip_bz2_file(output)

if not os.path.isfile('dlib_face_recognition_resnet_model_v1.dat'):
    print("dlib_face_recognition_resnet_model_v1.dat is going to be downloaded")
    url = "http://dlib.net/files/dlib_face_recognition_resnet_model_v1.dat.bz2"
    output = download_file(url)
    unzip_bz2_file(output)

Loading pre-trained models

We've downloaded the prerequisite files in the previous block. Now, we need to load the pre-trained models.

import dlib
detector = dlib.get_frontal_face_detector() #HoG based face detector
sp = dlib.shape_predictor("shape_predictor_5_face_landmarks.dat") #5 point facial landmark detector
facerec = dlib.face_recognition_model_v1("dlib_face_recognition_resnet_model_v1.dat") #ResNet based recognition model

Face detection and alignment

The following code block handles the loading, detection and alignment stages. Aligned faces will have the shape (150, 150, 3).

#load images
img1 = dlib.load_rgb_image("img1.jpg")
img2 = dlib.load_rgb_image("img2.jpg")

#detection: the 2nd argument upsamples the image once to help find smaller faces
img1_detection = detector(img1, 1)
img2_detection = detector(img2, 1)

#find 5 facial landmarks for the first detected face of each image
img1_shape = sp(img1, img1_detection[0])
img2_shape = sp(img2, img2_detection[0])

#alignment
img1_aligned = dlib.get_face_chip(img1, img1_shape)
img2_aligned = dlib.get_face_chip(img2, img2_shape)

On the other hand, we don't have to use dlib's own face detector here, because it is not necessarily the best among the open source solutions.

Face detection can be done with many solutions such as OpenCV, Dlib or MTCNN. OpenCV offers haar cascade and single shot multibox detector (SSD). Dlib offers Histogram of Oriented Gradients (HoG) and Max-Margin Object Detection (MMOD). Finally, MTCNN is a popular solution in the open source community as well. Herein, SSD, MMOD and MTCNN are modern deep learning based approaches, whereas haar cascade and HoG are legacy methods. Besides, SSD is the fastest one. You can monitor the detection performance of those methods in the following video.
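For instance, both of dlib's detectors can be consumed with a few lines of code. The following is a minimal sketch, assuming the MMOD weights file (mmod_human_face_detector.dat, available at dlib.net) has already been downloaded and decompressed.

import dlib

hog_detector = dlib.get_frontal_face_detector() #legacy HoG based detector
mmod_detector = dlib.cnn_face_detection_model_v1("mmod_human_face_detector.dat") #CNN based detector

img = dlib.load_rgb_image("img1.jpg")

hog_faces = hog_detector(img, 1) #returns dlib.rectangle objects
mmod_faces = mmod_detector(img, 1) #returns mmod detections wrapping a rectangle

for face in hog_faces:
    print("HoG:", face.left(), face.top(), face.right(), face.bottom())

for face in mmod_faces:
    #mmod results store the bounding box under the .rect attribute
    print("MMOD:", face.rect.left(), face.rect.top(), face.rect.right(), face.rect.bottom())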





Here, you can watch how to use different face detectors in Python.

You can find more on the math behind alignment in the following video:

Besides, face detectors detect faces in a rectangular area. So, detected faces come with some noise such as background. We can find 68 different landmarks of a face with dlib. In this way, we can get rid of any noise in a facial image.

Here, RetinaFace is the cutting-edge face detection technology. It can even detect faces in a crowd, and it finds facial landmarks including eye coordinates. That's why its alignment score is very high.
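The following is a minimal sketch of detection with the retina-face pip package; the response keys below reflect that package's output format, and img1.jpg is a placeholder file name.

from retinaface import RetinaFace

faces = RetinaFace.detect_faces("img1.jpg")
for key, face in faces.items():
    print(key, face["facial_area"]) #bounding box of the detected face
    print(face["landmarks"]["left_eye"], face["landmarks"]["right_eye"]) #eye coordinates used for alignment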

More precise way

Face detection does not have to be limited to rectangular areas. We can do it more precisely with facial landmark detection in dlib. It can find 68 facial landmark points on the face, including the jaw and chin, eyes and eyebrows, and the inner and outer areas of the lips and nose.
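A minimal sketch of 68 point landmark detection follows; it assumes the shape_predictor_68_face_landmarks.dat weights (available at dlib.net) are already in the current directory.

import dlib

detector = dlib.get_frontal_face_detector()
sp68 = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

img = dlib.load_rgb_image("img1.jpg")
faces = detector(img, 1)

landmarks = sp68(img, faces[0]) #68 points covering jaw, eyebrows, eyes, nose and lips
for i in range(landmarks.num_parts):
    point = landmarks.part(i)
    print(i, point.x, point.y)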





Here, you can find a deeply explained tutorial about facial landmarks detection with dlib.

Representation

We will feed the aligned faces to the ResNet model, and it will represent each face as a 128-dimensional vector.

img1_representation = facerec.compute_face_descriptor(img1_aligned)
img2_representation = facerec.compute_face_descriptor(img2_aligned)

Even though dlib returns representations in the dlib.vector type, we can easily convert them to numpy arrays to find the distance in the following step.

import numpy as np

img1_representation = np.array(img1_representation)
img2_representation = np.array(img2_representation)

Euclidean distance

Davis King proposes to use Euclidean distance to verify faces because he tuned the decision threshold for that metric.

def findEuclideanDistance(source_representation, test_representation):
    euclidean_distance = source_representation - test_representation
    euclidean_distance = np.sum(np.multiply(euclidean_distance, euclidean_distance))
    euclidean_distance = np.sqrt(euclidean_distance)
    return euclidean_distance
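Since the representations are already numpy arrays, the same distance can also be found with numpy's built-in norm function:

distance = np.linalg.norm(img1_representation - img2_representation) #equivalent one-liner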

Verification

We already have the representations of pairs. We also know how to find the distance between these vectors. King shared the tuned threshold as well.

distance = findEuclideanDistance(img1_representation, img2_representation)
threshold = 0.6 #distance threshold declared in dlib docs for 99.38% confidence score on LFW data set

if distance < threshold:
    print("they are same")
else:
    print("they are different")

Tests

I've tested the face recognition module of dlib on several pairs. The following code block will plot pairs side by side. I've used some unit test images of deepface.

import matplotlib.pyplot as plt

def plotPairs(img1, img2):
    fig = plt.figure()
    ax1 = fig.add_subplot(1, 2, 1)
    plt.imshow(img1); plt.axis('off')
    ax2 = fig.add_subplot(1, 2, 2)
    plt.imshow(img2); plt.axis('off')
    plt.show()
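A short usage sketch follows; the file names are placeholders for your own test pairs.

pairs = [("img1.jpg", "img2.jpg"), ("img1.jpg", "img3.jpg")] #hypothetical test pairs

for img1_path, img2_path in pairs:
    img1 = dlib.load_rgb_image(img1_path)
    img2 = dlib.load_rgb_image(img2_path)
    plotPairs(img1, img2)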

Results seem very satisfactory.

dlib-face-recognition-tests
Tests

Out-of-the-box pipeline

Dlib is a spectacular library. However, it expects you to apply all common stages of a face recognition pipeline yourself: detect, align, represent and verify. This might discourage you. Herein, the DeepFace library for Python handles all of those stages in the background, and you can run it with a few lines of code.

It is a hybrid face recognition framework wrapping the state-of-the-art face recognition models including University of Oxford’s VGG-Face, Google FaceNet, Carnegie Mellon University’s OpenFace, Facebook DeepFace, The Chinese University of Hong Kong’s DeepID and Dlib ResNet model.
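For instance, verifying a pair with the dlib model boils down to a single call; img1.jpg and img2.jpg are placeholder file names here.

from deepface import DeepFace

result = DeepFace.verify("img1.jpg", "img2.jpg", model_name = "Dlib")
print(result["verified"]) #True if both images belong to the same person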

dlib-in-deepface
Dlib ResNet model in deepface package

Here, you can find a video covering how to run deepface.





This comes with a real-time implementation as well.

Large scale face recognition

Besides, you can apply large scale face recognition.

Notice that face recognition has an O(n) time complexity, and this might be problematic for millions or billions level data. Herein, approximate nearest neighbor (a-nn) algorithms reduce the time complexity dramatically. Spotify Annoy, Facebook Faiss and NMSLIB are amazing a-nn libraries. Besides, Elasticsearch wraps NMSLIB and comes with high scalability. You should run deepface with those a-nn libraries if you have a really large scale database.
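The following is a minimal sketch with Spotify Annoy; embeddings and target_representation are assumed to be 128-dimensional vectors found beforehand with dlib.

from annoy import AnnoyIndex

index = AnnoyIndex(128, "euclidean") #dlib representations are 128-dimensional

for i, embedding in enumerate(embeddings): #embeddings: pre-computed vectors of your database
    index.add_item(i, embedding)

index.build(10) #10 trees; more trees give higher precision

neighbors = index.get_nns_by_vector(target_representation, 5) #5 approximate nearest neighbors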

On the other hand, a-nn algorithms do not guarantee to always find the closest one. We can still apply the exact k-nn algorithm here. The map reduce technology of big data systems might satisfy both speed and confidence here. MongoDB, Cassandra and Hadoop are the most popular NoSQL solutions. Besides, if you have a powerful database such as Oracle Exadata, then an RDBMS and regular SQL might satisfy your concerns as well.
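For reference, an exact k-nn search takes just a few lines with numpy, at the cost of an O(n) scan; embeddings is assumed to be a pre-computed (n, 128) matrix of stored representations.

import numpy as np

distances = np.linalg.norm(embeddings - target_representation, axis = 1) #distance to every stored face
k_nearest = np.argsort(distances)[:5] #indices of the 5 exact nearest neighbors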

Tech Stack Recommendations

Face recognition is mainly based on representing facial images as vectors. Herein, storing the vector representations is a key factor for building robust facial recognition systems. I summarize the tech stack recommendations in the following video.

Conclusion

So, we've mentioned how to use the out-of-the-box face recognition module of the dlib library. It seems that dlib comes with a highly competitive face recognition model, and it covers all common stages of a modern face recognition pipeline. Just importing dlib is enough to apply face verification.

Finally, I pushed the source code of this study to GitHub. You can support this work by starring⭐️ the repo.

