Nowadays, a new state-of-the-art face recognition model appears almost every day. Researchers have adopted the Labeled Faces in the Wild (LFW) data set as a de facto standard to evaluate face recognition models and compare them with existing ones. Luckily, scikit-learn provides the LFW data set as an out-of-the-box module. In this post, we will evaluate a state-of-the-art model on the LFW data set within the scikit-learn API.
BTW, I pushed the source code of this study to GitHub. You can support this study if you star ⭐ the repo 🙏.
🙋‍♂️ You may consider enrolling in my top-rated machine learning course on Udemy
Vlog
You can either follow this tutorial or watch the following video. They both cover building a facial recognition pipeline with deepface for Python and testing it on the LFW data set.
We also performed some experiments to figure out which facial recognition model, face detector, distance metric and alignment mode configuration performs best.
Loading LFW data set
The fetch_lfw_pairs function loads the LFW data set. By default, it loads images in grayscale; that's why I'll set its color argument to true. Besides, the default usage loads the train set images, but in this post I'll just evaluate an existing model on the LFW data set, so I only need the test set; that's why I'll set the subset argument to test. Finally, fetch_lfw_pairs decreases the image resolution by default. Setting the resize argument to 1 keeps the original size.
from sklearn.datasets import fetch_lfw_pairs

fetch_lfw_pairs = fetch_lfw_pairs(subset = 'test', color = True, resize = 1)
LFW pairs store image pairs and a label marking each pair as the same person or different persons.
pairs = fetch_lfw_pairs.pairs
labels = fetch_lfw_pairs.target
target_names = fetch_lfw_pairs.target_names
There are 1000 instances in the test set. The first half of them are same-person pairs whereas the second half are different-person pairs.
import matplotlib.pyplot as plt

actuals = []; predictions = []

for i in range(0, pairs.shape[0]):
   pair = pairs[i]
   img1 = pair[0]
   img2 = pair[1]

   fig = plt.figure()

   ax1 = fig.add_subplot(1, 3, 1)
   plt.imshow(img1 / 255)

   ax2 = fig.add_subplot(1, 3, 2)
   plt.imshow(img2 / 255)

   ax3 = fig.add_subplot(1, 3, 3)
   plt.text(0, 0.50, target_names[labels[i]])

   plt.show()
As seen, the data set stores low resolution images. This will be challenging for a face recognition model.
Face recognition app
We will use the deepface package for Python for face recognition.
Face recognition task
We've retrieved the pair images in the code block above. The face recognition task will be handled in the same for loop. The deepface package for Python can handle face recognition with a few lines of code. It wraps several state-of-the-art face recognition models: VGG-Face, Google FaceNet, OpenFace, Facebook DeepFace, DeepID and Dlib ResNet. I can switch the face recognition model by specifying the model_name argument of the verify function. I will use the Dlib ResNet model in this experiment.
#!pip install deepface
from deepface import DeepFace

#deepface expects bgr instead of rgb
img1 = img1[:, :, ::-1]; img2 = img2[:, :, ::-1]

#save images into file system
img1_target = f"lfwe/test/{i}_1.jpg"
img2_target = f"lfwe/test/{i}_2.jpg"

#plt.imsave(img1_target, img1/255) #works for my mac
plt.imsave(img1_target, img1) #works for my debian

#plt.imsave(img2_target, img2/255) #works for my mac
plt.imsave(img2_target, img2) #works for my debian

#pass image paths
obj = DeepFace.verify(img1_target, img2_target
   , model_name = 'Dlib', distance_metric = 'euclidean')

prediction = obj["verified"]
predictions.append(prediction)

actual = True if labels[i] == 1 else False
actuals.append(actual)
Evaluation
We stored the actual and prediction labels in dedicated variables. Sklearn offers accuracy metric calculations as out-of-the-box functions as well.
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

accuracy = 100 * accuracy_score(actuals, predictions)
precision = 100 * precision_score(actuals, predictions)
recall = 100 * recall_score(actuals, predictions)
f1 = 100 * f1_score(actuals, predictions)
The performance of Dlib on the LFW test subset is shown below. The results seem satisfactory for low resolution pairs.
instances = 1000
accuracy = 92.70%
precision = 94.20%
recall = 91.00%
f1 = 92.57%
The confusion matrix might give some insights as well.
from sklearn.metrics import confusion_matrix

cm = confusion_matrix(actuals, predictions)
print(cm)

tn, fp, fn, tp = cm.ravel()
print(tn, fp, fn, tp)
The confusion matrix is demonstrated below.
cm = [
   [472, 28],
   [ 45, 455]
]

true_negative = 472
false_positive = 28
false_negative = 45
true_positive = 455
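We can double-check the reported scores by re-deriving them directly from these four counts. This is a minimal sketch in plain Python using the confusion matrix entries above:

```python
# re-derive the metrics from the confusion matrix counts above
tn, fp, fn, tp = 472, 28, 45, 455

accuracy = 100 * (tp + tn) / (tp + tn + fp + fn)    # correct decisions over all pairs
precision = 100 * tp / (tp + fp)                    # how many predicted matches are real
recall = 100 * tp / (tp + fn)                       # how many real matches were found
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of precision and recall

print(round(accuracy, 2), round(precision, 2), round(recall, 2), round(f1, 2))
# 92.7 94.2 91.0 92.57
```

The numbers match the scikit-learn scores, which confirms the two halves of the evaluation are consistent.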
Best Single Model
In this experiment, we used the Dlib model within deepface. However, you can configure the face recognition model, face detector, distance metric and alignment mode in deepface to reach a higher accuracy. Find out the best configuration set.
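If you want to search for that configuration yourself, one approach is to enumerate every combination and benchmark each one on the same LFW pairs. The sketch below only builds the grid; the verify call inside the loop is shown as a comment, and the listed model and detector names are assumptions based on what deepface documents, so check them against your installed version.

```python
from itertools import product

# candidate values - assumed to match the options deepface exposes
models = ["VGG-Face", "Facenet", "OpenFace", "DeepFace", "DeepID", "Dlib"]
detectors = ["opencv", "ssd", "mtcnn", "dlib"]
metrics = ["cosine", "euclidean", "euclidean_l2"]
alignments = [True, False]

configs = list(product(models, detectors, metrics, alignments))
print(len(configs))  # 144 combinations to benchmark

for model, detector, metric, align in configs:
    # for each combination, run the same LFW loop as above, e.g.
    # obj = DeepFace.verify(img1_target, img2_target, model_name=model,
    #                       detector_backend=detector, distance_metric=metric,
    #                       align=align)
    pass
```

Benchmarking all combinations is expensive, so you may want to fix the best model first and then tune the detector, metric and alignment around it.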
Approximate Nearest Neighbor
The LFW test set has just 1000 pairs, which is very small. What if you need to run facial recognition on a much larger database?
As explained in this tutorial, facial recognition models are used to verify whether a face pair belongs to the same person or to different persons. This is actually face verification rather than face recognition, because face recognition requires performing face verification many times. Now, suppose that you need to find an identity in a billion-scale database, e.g. the citizen database of a country, where a citizen may have many images. A brute-force scan of this problem has O(n log n) time complexity (computing a distance for each of the n entries and then sorting the results), where n is the number of entries in your database.
On the other hand, the approximate nearest neighbor algorithm reduces the time complexity dramatically to O(log n)! Vector indexes such as Annoy, Voyager and Faiss, and vector databases such as Postgres with pgvector and RediSearch, run this algorithm to find similar vectors to a given vector, even among billions of entries, in just milliseconds.
So, if you have a robust facial recognition model then it is not a big deal to run it in billions!
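To make the contrast concrete, here is what the brute-force side looks like: a linear scan over every stored embedding. The vectors below are made-up toy embeddings standing in for real facial embeddings; in practice a vector index such as Annoy or Faiss replaces this O(n) scan with an approximate O(log n) lookup.

```python
import math

# toy 4-dimensional "embeddings" - made-up vectors standing in for
# real facial embeddings produced by a model such as Dlib ResNet
database = {
    "alice": [0.1, 0.9, 0.2, 0.4],
    "bob": [0.8, 0.1, 0.7, 0.3],
    "carol": [0.2, 0.8, 0.1, 0.5],
}

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# brute-force nearest neighbor: one distance computation per entry, O(n)
query = [0.1, 0.9, 0.15, 0.45]
best = min(database, key=lambda name: euclidean(database[name], query))
print(best)  # alice
```

The scan above is fine for three identities, but with a billion entries the per-query cost is exactly what the approximate index is designed to avoid.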
Conclusion
So, we've mentioned how to evaluate a face recognition model on the LFW data set within the scikit-learn API. I plan to add the accuracy metrics of all the models wrapped in the deepface package soon.
I pushed the source code of this study to GitHub. You can support this study if you star ⭐ the repo 🙏.
Support this blog if you like it!
I'm getting a ValueError in the 6th cell of the ipynb that you uploaded to GitHub.
“ValueError: (‘Detected face shape is ‘, (0, 92, 3), ‘. Consider to set enforce_detection argument to False.’)”
What’s the problem here?
No face was found in one of the image pairs you passed. Please follow the instruction in the exception message.
After setting enforce_detection = False, a new error appears.
It now shows "TypeError: Cannot handle this data type: (1, 1, 3), <f4".
How can I feed high resolution images? Is there any way to filter out small images, as this is a fixed dataset provided by scikit-learn? Can I manually select my dataset folder, or is there another way?
Thanks for the reply.
hi sefik, i want to apply deepface models to another database which is not in sklearn. Can you help me with code where the dataset has classes and the classes have images? Then how to make pairs and perform verification?
What kind of help do you expect?
I am running DeepFace models on a different dataset, and when i use mtcnn as the backend detector, RAM gets exhausted on colab or kaggle. Any solution?? And another thing i want to know: if i run these models on CPU instead of GPU, will the results vary??
Hello Sefik Sir , How to use YOLO detector within DeepFace as backend detector. Have you posted it?
i ran the same code as in the video and blog, on google colab and exactly as in your github, but accuracy is 50%. Sir please tell me the solution.