InsightFace entered the facial recognition world with two spectacular modules: the face recognition model ArcFace and the face detection model RetinaFace. Both models are state-of-the-art. In this post, we are going to focus on RetinaFace in TensorFlow.
A modern face recognition pipeline consists of 4 common stages: detect, align, represent and verify. Detection and alignment are the early and very important stages. RetinaFace handles those early stages with high confidence scores.
Vlog
You can either keep reading this tutorial or watch the following video. Both cover deep face detection with RetinaFace and deep face recognition with ArcFace.
Tensorflow re-implementation
The original implementation is based on MXNet, but I found this TensorFlow re-implementation by Stanislas Bertrand. This amazing study shows performance very close to the original MXNet implementation: it scores about 1% lower on the easy and medium WIDER FACE validation sets and about 2% lower on the hard one. That is promising and satisfying.
However, that repo is not pip / PyPI compatible; you have to download the source code manually. Besides, it depends on some C extensions, so you have to run the make command against the makefile in its root folder. Finally, its pre-trained weights are more than 100 MB. That is why they cannot be stored in the source code, and the repo expects you to download them from Dropbox manually.
Herein, I made that re-implementation pip compatible. Then, I simplified the source code and ported its C dependencies to pure Python. Finally, I built a mechanism that downloads the pre-trained weights from a Google Drive source in the background. You just need to import the library and call its detect faces function, so it can be run with a few lines of code!
Structure
RetinaFace is mainly based on an academic study: RetinaFace: Single-stage Dense Face Localisation in the Wild. The model design is based on feature pyramids, and independent context modules come after the feature pyramid levels.
The original work uses a ResNet152 backbone, whereas the TensorFlow re-implementation uses a ResNet50 backbone.
Feature pyramids are handled in the source code as shown below. You will not need the following code snippet; do not let it confuse you.
#feature pyramids
model = Model(inputs=data,
    outputs=[
        face_rpn_cls_prob_reshape_stride32,
        face_rpn_bbox_pred_stride32,
        face_rpn_landmark_pred_stride32,
        face_rpn_cls_prob_reshape_stride16,
        face_rpn_bbox_pred_stride16,
        face_rpn_landmark_pred_stride16,
        face_rpn_cls_prob_reshape_stride8,
        face_rpn_bbox_pred_stride8,
        face_rpn_landmark_pred_stride8
    ])
Installation
RetinaFace code is fully open-sourced and available on pip / PyPI. All you need is to run the following command; it downloads its prerequisites as well.
#!pip install retina-face
Face detection
Once you install the package, you can import the library. It exposes a detect faces function in its interface. The function expects the exact image path; passing an image as a numpy array is fine as well.
from retinaface import RetinaFace

img_path = "img1.jpg"
faces = RetinaFace.detect_faces(img_path)
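By the way, if you already have the image loaded in memory, you can pass the numpy array itself instead of the path. A minimal sketch:

import cv2
from retinaface import RetinaFace

#detect_faces also accepts a pre-loaded image as a numpy array
img = cv2.imread("img1.jpg")
faces = RetinaFace.detect_faces(img)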
The function then returns the facial area coordinates and some landmarks, including eye, nose and mouth coordinates, along with a confidence score.
{ "face_1": { "score": 0.9993440508842468, "facial_area": [155, 81, 434, 443], "landmarks": { "right_eye": [257.82974, 209.64787], "left_eye": [374.93427, 251.78687], "nose": [303.4773, 299.91144], "mouth_right": [228.37329, 338.73193], "mouth_left": [320.21982, 374.58798] } } }
You can highlight the facial area and landmarks once you have the response of the detect faces function.
import cv2

img = cv2.imread(img_path)

#detect_faces returns a dictionary keyed face_1, face_2, ...
identity = faces["face_1"]
facial_area = identity["facial_area"] #[x1, y1, x2, y2]
landmarks = identity["landmarks"]

#highlight facial area
cv2.rectangle(img, (facial_area[2], facial_area[3]), (facial_area[0], facial_area[1]), (255, 255, 255), 1)

#extract facial area
#facial_img = img[facial_area[1]: facial_area[3], facial_area[0]: facial_area[2]]

#highlight the landmarks; coordinates are floats and opencv expects integers
for point in landmarks.values():
    cv2.circle(img, tuple(int(round(p)) for p in point), 1, (0, 0, 255), -1)
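Besides, the response is keyed as face_1, face_2 and so on when several faces appear in the image. So, as a sketch, you can iterate over all of them:

#iterate over every detected face in the response
for key, identity in faces.items():
    x1, y1, x2, y2 = identity["facial_area"]
    cv2.rectangle(img, (x1, y1), (x2, y2), (255, 255, 255), 1)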
The output of RetinaFace seems very promising! Please expand the following image by clicking on it and focus on each face individually. You can find the base image here.
The default threshold value is set to 0.9. You can decrease it if you want to detect faces in low-resolution images.
resp = RetinaFace.detect_faces(img_path, threshold = 0.5)
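As a quick sketch, you can see how the number of detected faces grows as the threshold decreases. Here I assume at least one face is found at each threshold, so the response stays a dictionary:

for threshold in [0.9, 0.7, 0.5]:
    resp = RetinaFace.detect_faces(img_path, threshold = threshold)
    print(threshold, "->", len(resp), "faces detected")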
RetinaFace can detect the faces of fans in the stands if we decrease the threshold value to 0.5. That's really awesome! Please click the following image to see it in high resolution. You can find the base image here.
Face alignment
Notice that the returned landmarks include eye coordinates, so we can align the detected faces by rotating them until the eye coordinates become horizontal. Experiments show that alignment increases face recognition accuracy by almost 1%. The RetinaFace framework offers a custom extract faces function. It expects the exact image path or a numpy array, and it returns the detected faces themselves. It applies alignment by default, but you can turn this feature off by setting the align argument to False.
import matplotlib.pyplot as plt

faces = RetinaFace.extract_faces(img_path = "img.jpg", align = True)
for face in faces:
    plt.imshow(face)
    plt.show()
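You might want to persist the extracted faces as well. The plotting example above suggests the faces come as RGB numpy arrays; assuming that holds, matplotlib can save them directly:

#save each extracted face; imsave handles RGB arrays directly
for i, face in enumerate(faces):
    plt.imsave("face_" + str(i) + ".png", face)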
Detection and alignment results seem very clear.
You can find out more about the math behind face alignment in the following video:
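As a rough illustration of that math, alignment boils down to rotating the image by the angle of the line between the two eyes. The following helper is just a sketch of the idea, not the exact routine the package uses:

import math
from PIL import Image
from retinaface import RetinaFace

#an illustrative alignment helper, not the package's own implementation
def align_by_eyes(img_path):
    faces = RetinaFace.detect_faces(img_path)
    landmarks = faces["face_1"]["landmarks"]
    #right_eye appears on the left side of the image and vice versa
    x1, y1 = landmarks["right_eye"]
    x2, y2 = landmarks["left_eye"]
    #angle of the eye line with respect to the horizontal axis
    angle = math.degrees(math.atan2(y2 - y1, x2 - x1))
    #positive angles rotate counter-clockwise in PIL, levelling the eyes
    return Image.open(img_path).rotate(angle)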
Face recognition
The face recognition module of InsightFace is ArcFace and its face detection module is RetinaFace. This post covers the face detection module, but if you need to run an end-to-end facial recognition pipeline, consider using deepface.
#!pip install deepface
from deepface import DeepFace

DeepFace.verify("img1.jpg", "img2.jpg", model_name = "ArcFace", detector_backend = "retinaface")
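The verify function returns a dictionary including a boolean verified flag among other fields such as the measured distance. So, deciding whether two images are of the same person is straightforward:

result = DeepFace.verify("img1.jpg", "img2.jpg", model_name = "ArcFace", detector_backend = "retinaface")
if result["verified"]:
    print("these images are of the same person")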
ArcFace is not the only facial recognition model wrapped in deepface. The library wraps several models: VGG-Face, Google FaceNet, OpenFace, Facebook DeepFace, DeepID and Dlib. Experiments show that VGG-Face, FaceNet, Dlib and ArcFace outperform the others.
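Switching between those models is just a matter of the model_name argument. A sketch, using the names deepface expects (Google FaceNet is passed as Facenet):

#compare the wrapped models on the same image pair
for model in ["VGG-Face", "Facenet", "OpenFace", "DeepFace", "DeepID", "Dlib", "ArcFace"]:
    result = DeepFace.verify("img1.jpg", "img2.jpg", model_name = model)
    print(model, result["verified"])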
RetinaFace is wrapped in deepface directly. The library also wraps other state-of-the-art face detectors: opencv, ssd, dlib and mtcnn. Here, retinaface, dlib and mtcnn find facial landmarks including eye coordinates, which is why their detection + alignment score is high. On the other hand, opencv and ssd are faster at detection but their alignment score is low. If your priority is confidence, consider using mtcnn, retinaface or dlib; if your priority is speed, consider using opencv or ssd.
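Similarly, the face detector is controlled by the detector_backend argument. A sketch comparing the backends mentioned above:

#same model, different detectors; alignment is applied where landmarks are available
for backend in ["opencv", "ssd", "dlib", "mtcnn", "retinaface"]:
    result = DeepFace.verify("img1.jpg", "img2.jpg", model_name = "ArcFace", detector_backend = backend)
    print(backend, result["verified"])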
Here, you can watch the detection performance of those detectors. Running times are ignored in this video.
Conclusion
So, we've covered the amazing face detection model RetinaFace in this post. Experiments show that it can detect faces even in the wild. Here, RetinaFace and ArcFace complement each other to build a robust face recognition pipeline.
Here, you can find the source code of the study. You can support this study if you star ⭐ the repo 🙏.
Support this blog if you like it!