Labeled Faces in the Wild for Face Recognition

Nowadays, new state-of-the-art face recognition models are proposed almost every day. Researchers have adopted the Labeled Faces in the Wild (LFW) data set as a de facto standard for evaluating face recognition models and comparing them with existing ones. Luckily, scikit-learn provides the LFW data set as an out-of-the-box module. In this post, we will evaluate a state-of-the-art model on the LFW data set using the scikit-learn API.

Arya in the Hall of Faces, Game of Thrones

BTW, I pushed the source code of this study to GitHub. You can support this study if you star⭐️ the repo 🙏.

Vlog

You can either follow this tutorial or watch the following video. Both cover building a facial recognition pipeline with deepface for Python and testing it on the LFW data set.

We also ran some experiments to figure out which combination of facial recognition model, face detector, distance metric and alignment mode works best.

Loading LFW data set

The fetch_lfw_pairs function loads the LFW data set. By default it loads images in gray scale, so I’ll set its color argument to True. It also loads the train subset by default, but in this post I only evaluate an existing model on the LFW data set, so I just need the test set and set the subset argument to test. Finally, fetch_lfw_pairs reduces the image resolution by default. Setting the resize argument to 1 keeps the original resolution.

from sklearn.datasets import fetch_lfw_pairs

# store the result under its own name so the function is not overwritten
lfw_pairs = fetch_lfw_pairs(subset = 'test', color = True, resize = 1)

LFW pairs stores each image pair together with a label telling whether the two images show the same person or different persons.

pairs = lfw_pairs.pairs
labels = lfw_pairs.target
target_names = lfw_pairs.target_names

There are 1000 instances in the test set. The first half are same-person pairs, whereas the second half are different-person pairs.
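A quick sanity check of that layout can be sketched as follows. The helper name `summarize_pairs` is hypothetical, and the arrays below are small synthetic stand-ins so the snippet runs without downloading anything. With the real data you would pass the `pairs` and `labels` variables defined above.

```python
import numpy as np

def summarize_pairs(pairs, labels):
    """Return basic facts about an LFW-style pairs array (hypothetical helper)."""
    labels = np.asarray(labels)
    return {
        "n_pairs": int(pairs.shape[0]),
        "image_shape": tuple(pairs.shape[2:]),      # (height, width, channels)
        "same_person": int((labels == 1).sum()),
        "different_persons": int((labels == 0).sum()),
    }

# Synthetic stand-in: 6 pairs of 4x3 RGB images, first half labeled 1 (same person)
demo_pairs = np.zeros((6, 2, 4, 3, 3), dtype=np.float32)
demo_labels = np.array([1, 1, 1, 0, 0, 0])

summary = summarize_pairs(demo_pairs, demo_labels)
print(summary)
```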

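import matplotlib.pyplot as plt

actuals = []; predictions = []
for i in range(0, pairs.shape[0]):
    # each instance holds the two images of a pair
    pair = pairs[i]
    img1 = pair[0]
    img2 = pair[1]

    fig = plt.figure()

    # first image of the pair
    ax1 = fig.add_subplot(1,3,1)
    plt.imshow(img1/255)

    # second image of the pair
    ax2 = fig.add_subplot(1,3,2)
    plt.imshow(img2/255)

    # label of the pair: same person or different persons
    ax3 = fig.add_subplot(1,3,3)
    plt.text(0, 0.50, target_names[labels[i]])

    plt.show()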

The data set stores low-resolution images, as seen above. This will be challenging for a face recognition model.
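The plotting loop divides each image by 255, which assumes pixel values on a 0..255 scale. Whether that holds may depend on your scikit-learn version, so here is a minimal sketch of a defensive alternative. The helper `to_unit_range` is hypothetical, and it relies on the assumption that any array whose maximum exceeds 1 is encoded on a 0..255 scale.

```python
import numpy as np

def to_unit_range(img):
    """Return a float copy of img with values in [0, 1].

    Assumption: an array whose maximum exceeds 1 is on a 0..255 scale.
    """
    img = np.asarray(img, dtype=np.float32)
    if img.max() > 1.0:
        img = img / 255.0
    return np.clip(img, 0.0, 1.0)

scaled_255 = np.array([[[0.0, 127.5, 255.0]]])   # one pixel on a 0..255 scale
scaled_unit = np.array([[[0.0, 0.5, 1.0]]])      # the same pixel on a 0..1 scale

print(to_unit_range(scaled_255))
print(to_unit_range(scaled_unit))
```

Both inputs end up in the range that plt.imshow expects for float RGB images. The heuristic can misjudge a nearly black 0..255 image, so checking the value range of your own data once is still worthwhile.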

Face recognition app

We will use the deepface package for Python for face recognition.

Face recognition task

We’ve retrieved the image pairs in the code block above. The face recognition task will be handled in the same for loop. The DeepFace package for Python can handle face recognition with a few lines of code. It wraps several state-of-the-art face recognition models: VGG-Face, Google FaceNet, OpenFace, Facebook DeepFace, DeepID and Dlib ResNet. I can switch the face recognition model by specifying the model_name argument of the verify function. I will use the Dlib ResNet model in this experiment.

#!pip install deepface
from deepface import DeepFace

#deepface expects bgr instead of rgb
img1 = img1[:,:,::-1]; img2 = img2[:,:,::-1]

# save images into file system
img1_target = f"lfwe/test/{i}_1.jpg"
img2_target = f"lfwe/test/{i}_2.jpg"

# plt.imsave(img1_target, img1/255) #works for my mac
plt.imsave(img1_target, img1) #works for my debian

# plt.imsave(img2_target, img2/255) #works for my mac
plt.imsave(img2_target, img2) #works for my debian

# pass image paths
obj = DeepFace.verify(img1_target, img2_target,
   model_name = 'Dlib', distance_metric = 'euclidean')
prediction = obj["verified"]
predictions.append(prediction)

actual = True if labels[i] == 1 else False
actuals.append(actual)
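Conceptually, a verification call like the one above boils down to embedding both faces and comparing the distance between the embeddings with a threshold. The following is only a sketch of that idea, not DeepFace's internals: the function names, the toy 3-dimensional embeddings and the threshold value are all hypothetical.

```python
import math

def euclidean_distance(a, b):
    """Plain Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def verify_embeddings(emb1, emb2, threshold):
    """Declare 'same person' when the embeddings are closer than the threshold."""
    distance = euclidean_distance(emb1, emb2)
    return {"verified": distance <= threshold, "distance": distance}

# Toy 3-dimensional embeddings; real models produce much longer vectors
anchor = [0.10, 0.20, 0.30]
close = [0.12, 0.18, 0.33]
far = [0.90, 0.70, 0.10]
THRESHOLD = 0.6  # hypothetical value, not a tuned threshold of any real model

same = verify_embeddings(anchor, close, THRESHOLD)
different = verify_embeddings(anchor, far, THRESHOLD)
print(same["verified"], different["verified"])
```

In practice the threshold is tuned per model and per distance metric, which is why switching either of them changes the predictions.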

Evaluation

We stored the actual and predicted labels in dedicated variables. Scikit-learn offers accuracy metric calculations as out-of-the-box functions as well.

from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
accuracy = 100*accuracy_score(actuals, predictions)
precision = 100*precision_score(actuals, predictions)
recall = 100*recall_score(actuals, predictions)
f1 = 100*f1_score(actuals, predictions)

The performance of Dlib on the test subset of the LFW data set is shown below. The results seem satisfactory for low-resolution pairs.

instances = 1000
accuracy = '92.7 %'
precision = '94.20289855072464 %'
recall = '91.0 %'
f1 = '92.5737538148525 %'

The confusion matrix might give some insights as well.

from sklearn.metrics import confusion_matrix
cm = confusion_matrix(actuals, predictions)
print(cm)

tn, fp, fn, tp = cm.ravel()
print(tn, fp, fn, tp)

The confusion matrix is shown below.

cm = [
   [472,  28],
   [ 45, 455]
]

true_negative = 472
false_positive = 28
false_negative = 45
true_positive = 455
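As a cross-check, the four scores reported earlier can be recomputed by hand from these confusion matrix counts. This small sketch uses the standard definitions of accuracy, precision, recall and F1 and needs no library.

```python
# Counts taken from the confusion matrix above
tn, fp, fn, tp = 472, 28, 45, 455

accuracy = 100 * (tp + tn) / (tp + tn + fp + fn)    # share of correct decisions
precision = 100 * tp / (tp + fp)                    # how many predicted "same" pairs are right
recall = 100 * tp / (tp + fn)                       # how many true "same" pairs are found
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of precision and recall

print(f"accuracy = {accuracy:.1f} %")
print(f"precision = {precision:.4f} %")
print(f"recall = {recall:.1f} %")
print(f"f1 = {f1:.4f} %")
```

The recomputed values agree with the scikit-learn results. The model misses more same-person pairs (45 false negatives) than it wrongly accepts different-person pairs (28 false positives).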

Best Single Model

In this experiment, we used the Dlib model from DeepFace. However, you can configure the face recognition model, detector, distance metric and alignment mode in DeepFace to reach a higher accuracy. Find out the best configuration set.

Approximate Nearest Neighbor

The LFW data set has only 1000 items in its test set, which is very small. What if you need to run facial recognition on a larger data set?

As explained in this tutorial, facial recognition models are used to verify whether a face pair shows the same person or different persons. This is actually face verification rather than face recognition, because face recognition requires performing face verification many times. Now, suppose that you need to find an identity in a billion-scale database, e.g. the citizen database of a country, where a citizen may have many images. An exhaustive search compares the query against every entry, which costs O(n) verifications per query (O(n x logn) if you also sort all candidates by distance), where n is the number of entries in your database.
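A minimal sketch of that exhaustive baseline is given here: compare the query with every stored embedding and keep the closest one. The function name `brute_force_search`, the random 128-dimensional vectors and the database size are illustrative assumptions, not measurements from this study.

```python
import numpy as np

def brute_force_search(database, query):
    """Return (index, distance) of the closest row by scanning every entry."""
    distances = np.linalg.norm(database - query, axis=1)  # one distance per entry
    best = int(np.argmin(distances))
    return best, float(distances[best])

rng = np.random.default_rng(0)
database = rng.normal(size=(1000, 128))              # stand-in for stored face embeddings
query = database[42] + 0.01 * rng.normal(size=128)   # slightly noisy copy of entry 42

index, distance = brute_force_search(database, query)
print(index, distance)
```

The work grows linearly with the number of stored embeddings, which is what becomes impractical at billion scale.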

On the other hand, approximate nearest neighbor algorithms reduce the time complexity dramatically, to roughly O(logn)! Vector indexes such as Annoy, Voyager and Faiss, and vector databases such as Postgres with pgvector and RediSearch, run such algorithms to find vectors similar to a given vector, even among billions of entries, in just milliseconds.
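To give a feel for how an approximate index avoids scanning everything, here is a toy sketch based on random hyperplane hashing. It is a deliberately simple stand-in and not the algorithm of any of the libraries named above, which use more elaborate structures. The class name `RandomHyperplaneIndex` and all parameters are hypothetical.

```python
import numpy as np
from collections import defaultdict

class RandomHyperplaneIndex:
    """Toy approximate index: vectors sharing a sign pattern land in one bucket."""

    def __init__(self, dim, n_planes=8, seed=0):
        rng = np.random.default_rng(seed)
        self.planes = rng.normal(size=(n_planes, dim))
        self.buckets = defaultdict(list)
        self.vectors = []

    def _key(self, vector):
        # One bit per hyperplane: which side of the plane the vector falls on
        return tuple((self.planes @ vector > 0).astype(int))

    def add(self, vector):
        self.vectors.append(np.asarray(vector, dtype=float))
        self.buckets[self._key(vector)].append(len(self.vectors) - 1)

    def query(self, vector):
        """Search only the query's bucket; may miss the true nearest neighbor."""
        candidates = self.buckets.get(self._key(vector), [])
        if not candidates:
            return None
        dists = [np.linalg.norm(self.vectors[i] - vector) for i in candidates]
        return candidates[int(np.argmin(dists))]

rng = np.random.default_rng(1)
data = rng.normal(size=(2000, 64))   # stand-in for stored face embeddings
index = RandomHyperplaneIndex(dim=64)
for row in data:
    index.add(row)

hit = index.query(data[7])           # a stored vector always falls into its own bucket
bucket_size = len(index.buckets[index._key(data[7])])
print(hit, bucket_size)
```

Only the vectors in one bucket are compared with the query instead of all 2000, at the price of occasionally missing the true nearest neighbor. That trade-off between speed and exactness is what "approximate" refers to.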

So, if you have a robust facial recognition model, then it is not a big deal to run it on billions of entries!

Conclusion

So, we’ve covered how to evaluate a face recognition model on the LFW data set with the scikit-learn API. I plan to add the accuracy metrics of all models wrapped in the deepface package soon.

8 Comments

Puneet Kaur says:

July 31, 2023 at 5:41 am

i ran the same code as in video or blog, but accuracy is 50%. I am using google colab and written the same code as in your github , video or blog. But accuracy is 50%. Sir please tell me the solution.

Discover more from Sefik Ilkin Serengil