Face recognition technology is mainly built on face verification. Every scenario boils down to feeding two face photos to a convolutional neural network and retrieving their vector representations. A decision is then made based on the distance between those vectors, and that part is easy.
On the other hand, building a CNN model and calling its predict function are both costly operations. That is why applying face recognition to a large scale data set seems problematic. In this post, we will cover a workaround that handles large scale face recognition in an easy and fast way.
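To make the idea concrete, here is a minimal sketch of that flow with deepface: represent two photos as vectors and compare their cosine distance against a threshold. The file names and the 0.40 threshold are illustrative assumptions, not values from this post.

import numpy as np
from deepface import DeepFace

# illustrative file names; replace with your own photos
emb1 = np.array(DeepFace.represent(img_path = "img1.jpg", model_name = "VGG-Face")[0]["embedding"])
emb2 = np.array(DeepFace.represent(img_path = "img2.jpg", model_name = "VGG-Face")[0]["embedding"])

# cosine distance between the two vector representations
distance = 1 - np.dot(emb1, emb2) / (np.linalg.norm(emb1) * np.linalg.norm(emb2))

# 0.40 is an assumed threshold; each model has its own tuned value
print("same person" if distance < 0.40 else "different persons")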
A Face Recognition Pipeline
Let’s remember the flow of a modern face recognition pipeline.
It consists of four common stages: detect, align, represent and verify. We will focus on the represent and verify stages in this post.
Big O Notation
Firstly, face verification has O(1) complexity whereas face recognition has O(n) complexity in big O notation. In other words, face recognition requires calling the face verification function n times, where n is the number of instances in the database.
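The following toy sketch shows why recognition is O(n): it simply calls deepface's verify function once per identity in the database. The identity file names are illustrative assumptions.

from deepface import DeepFace

# illustrative database of face photos
identities = ["alice.jpg", "bob.jpg", "carol.jpg"]

# face recognition = n face verifications
for identity in identities:
    result = DeepFace.verify(img1_path = "target.jpg", img2_path = identity)
    if result["verified"]:
        print("match:", identity)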
Vlog
Now, you can either watch the following vlogs or follow this blog post. They both cover the large scale face recognition topic.
The first vlog focuses on just face verification for face pairs.
The second one covers face recognition, which requires applying face verification several times; deepface handles this in the background.
Approximate Nearest Neighbor
But this might be problematic for billion-level data. In this case, we can apply an approximate nearest neighbor algorithm to decrease the complexity. Spotify Annoy, Facebook Faiss and NMSLIB are very popular a-nn libraries.
Herein, Elasticsearch wraps NMSLIB, and it comes with a highly scalable architecture: we can run it on many clusters.
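For illustration, here is a sketch of indexing and querying embeddings with the official Elasticsearch Python client. It assumes an 8.x-style cluster, an index layout of my own choosing, and (name, embedding) pairs plus a target embedding like the ones produced later in this post.

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumed local instance

# assumed index layout: a keyword name plus a dense_vector embedding field
# set dims to your embedding size, e.g. len(target_representation)
es.indices.create(index = "faces", mappings = {
    "properties": {
        "name": {"type": "keyword"},
        "embedding": {
            "type": "dense_vector",
            "dims": 4096,
            "index": True,
            "similarity": "cosine",
        },
    }
})

# index each stored face (name and embedding from the represent step)
for name, embedding in representations:
    es.index(index = "faces", document = {"name": name, "embedding": embedding})

# approximate k-nn query for the target face
response = es.search(index = "faces", knn = {
    "field": "embedding",
    "query_vector": target_representation,
    "k": 5,
    "num_candidates": 100,
})
for hit in response["hits"]["hits"]:
    print(hit["_source"]["name"], hit["_score"])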
Face recognition is a complex task
Both building a face recognition model and calling its prediction function are costly operations.
Imagine the time required to look for a face in a data set consisting of 100 samples, based on the table above. Do we really need the building time plus 100 times the verification time? That would not be acceptable.
The hacker’s way
We might have hundreds of face photos in our database. The verification stage of the pipeline requires finding the distance between the vector of each image in the database and the vector of the target image.
The trick is that we can store the vector representations of the faces in our database in advance.
import os
import pickle
from deepface import DeepFace

#--------------------------
# walk the database folder and collect the exact paths of face photos
db_path = "my_db"  # folder storing the facial images

employees = []
for r, d, f in os.walk(db_path):  # r=root, d=directories, f=files
    for file in f:
        if '.jpg' in file:
            exact_path = r + "/" + file
            employees.append(exact_path)
#--------------------------
# find the vector embedding of each facial image
representations = []
for employee in employees:
    representation = DeepFace.represent(img_path = employee, model_name = "VGG-Face")[0]["embedding"]
    instance = []
    instance.append(employee)
    instance.append(representation)
    representations.append(instance)
#--------------------------
# store the embeddings on disk for later recognition tasks
f = open('representations.pkl', "wb")
pickle.dump(representations, f)
f.close()
So, we will already have representations of identities existing in the database when a face recognition task is called.
Then, we only need to find the vector representation of the single target face when we look for someone.
target_path = "target.jpg"

# detect and align the target face (represent also does this internally)
target_img = DeepFace.extract_faces(img_path = target_path)[0]["face"]

# find the vector embedding of the target image
target_representation = DeepFace.represent(img_path = target_path, model_name = "VGG-Face")[0]["embedding"]
Besides, we could build the face recognition model in advance and wait for a target face. In that case, we only need to spend the time mentioned in the verification row, which lasts less than a second even in the worst case scenario.
Then, all we need is to find the distances between the target vector and the source image vectors. As you can imagine, this can be handled very fast.
import pickle
import numpy as np
from deepface.commons import distance as dst  # moved to deepface.modules.verification in newer releases

# load representations of faces in the database
f = open('representations.pkl', 'rb')
representations = pickle.load(f)
f.close()

# find the distance between the target and every source embedding
distances = []
for i in range(0, len(representations)):
    source_name = representations[i][0]
    source_representation = representations[i][1]
    distance = dst.findCosineDistance(source_representation, target_representation)
    distances.append(distance)

# the identity with the minimum distance is the match
idx = np.argmin(distances)
matched_name = representations[idx][0]
My experiments show that finding an identity is completed in less than a second for a data set consisting of tens of instances if the face recognition model is already built.
Approximate Nearest Neighbor
As explained in this tutorial, facial recognition models are used to verify whether a face pair belongs to the same person or different persons. This is actually face verification rather than face recognition, because face recognition requires performing face verification many times. Now, suppose that you need to find an identity in a billion-scale database, e.g. the citizen database of a country, where a citizen may have many images. This problem has O(n log n) time complexity, where n is the number of entries in your database.
On the other hand, an approximate nearest neighbor algorithm reduces the time complexity dramatically to O(log n)! Vector indexes such as Annoy, Voyager and Faiss, and vector databases such as Postgres with pgvector and RediSearch, run this algorithm to find a similar vector to a given vector even among billions of entries in just milliseconds.
So, if you have a robust facial recognition model, then it is not a big deal to run it on billions of entries!
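As a minimal sketch of this idea, the snippet below builds a Spotify Annoy index over the representations stored earlier and queries the approximate nearest neighbors of the target embedding; the tree and neighbor counts are illustrative values.

from annoy import AnnoyIndex

# dimensionality is taken from the stored embeddings themselves
dims = len(representations[0][1])
index = AnnoyIndex(dims, "angular")  # angular metric approximates cosine distance

# add every stored embedding to the index
for i, (name, embedding) in enumerate(representations):
    index.add_item(i, embedding)

index.build(10)  # 10 trees: an illustrative speed vs accuracy trade-off

# approximate nearest neighbors of the target face
neighbor_ids = index.get_nns_by_vector(target_representation, 3)
for i in neighbor_ids:
    print(representations[i][0])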
DeepFace
Deepface offers an out-of-the-box find function to handle this task. All of those stages are handled in the background.
# !pip install deepface
from deepface import DeepFace

# find returns a list of pandas data frames
dfs = DeepFace.find(img_path = "target.jpg", db_path = "C:/my_db")
for df in dfs:
    print(df.head())
As seen, this can be handled in just a few lines of code.
Anti-Spoofing and Liveness Detection
What if DeepFace is given fake or spoofed images? This becomes a serious issue if it is used in a security system. To address this, DeepFace includes an anti-spoofing feature for face verification and liveness detection.
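As a minimal sketch, assuming a recent deepface release where extract_faces accepts an anti_spoofing flag, spoofed inputs can be filtered before verification.

from deepface import DeepFace

# anti_spoofing is a flag available in recent deepface releases
face_objs = DeepFace.extract_faces(img_path = "target.jpg", anti_spoofing = True)
for face_obj in face_objs:
    # each detected face reports whether it looks real or spoofed
    print("is real:", face_obj["is_real"])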
Map reduce technology
An approximate nearest neighbor algorithm reduces the time complexity dramatically, but it does not guarantee to always find the nearest ones. Big data technologies and NoSQL databases come with the power of map reduce, and we can run the exact k-nn algorithm easily in this way. Herein, MongoDB, Cassandra, Redis and Hadoop are strong candidates.
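To illustrate the map reduce idea without installing any of those systems, here is a toy sketch in plain Python: the map step finds the best match within chunks of the database in parallel worker processes, and the reduce step picks the global minimum. It assumes the representations list and target_representation from the earlier snippets.

from multiprocessing import Pool

import numpy as np

def find_local_best(chunk):
    # map step: exact nearest neighbor within one chunk of the database
    best_name, best_distance = None, float("inf")
    for name, embedding in chunk:
        source = np.array(embedding)
        target = np.array(target_representation)
        distance = 1 - np.dot(source, target) / (np.linalg.norm(source) * np.linalg.norm(target))
        if distance < best_distance:
            best_name, best_distance = name, distance
    return best_name, best_distance

if __name__ == "__main__":
    # split the database into 4 chunks and map them to worker processes
    chunks = [representations[i::4] for i in range(4)]
    with Pool(4) as pool:
        local_bests = pool.map(find_local_best, chunks)
    # reduce step: pick the global nearest neighbor among the local winners
    matched_name, min_distance = min(local_bests, key = lambda pair: pair[1])
    print(matched_name, min_distance)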
Tech Stack Recommendations
Face recognition is mainly based on representing facial images as vectors. Herein, storing the vector representations is a key factor for building robust facial recognition systems. I summarize the tech stack recommendations in the following video.
Conclusion
So, we have covered large scale face recognition in this blog post. Even though building a face recognition pipeline is a complex process, applying some hacking skills helps us find a workaround. In this way, we can find a given face in a large scale data set in just seconds.
You can support this study by starring the GitHub repo as well.
Does the code from your first code block assume that all of the images in the db_path directory are aligned / shaped in the correct way? If they're not (say the db_path has image files of Angelina Jolie taken off of Google Images), should there be an added step of extracting the faces before representing and pickling them?
Also, the second code block saves the output of extract_faces to the variable "target_img", but that variable isn't used. Should the target representation be created using "target_img" instead of "target_path"?
Yes, I suppose they are aligned already.