Large Scale Face Recognition for Deep Learning

Face recognition technology is mainly based on face verification. Every scenario depends on feeding two face photos to a convolutional neural network and retrieving their vector representations. A decision is then made based on the distance between those vectors, and that step is straightforward.

On the other hand, building a CNN model and calling its predict function are both costly operations. That is why applying face recognition to a large scale data set seems problematic. In this post, we will mention a workaround to handle large scale face recognition in an easy and fast way.



A Face Recognition Pipeline

Let’s remember the flow of a modern face recognition pipeline.

It consists of 4 common stages: detect, align, represent and verify. We will focus on represent and verify stages in this post.

Big O Notation

Firstly, face verification has O(1) complexity whereas face recognition has O(n) complexity in big O notation. In other words, face recognition requires calling the face verification function n times, where n is the number of instances in the database.
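To make that difference concrete, the plain Python sketch below models recognition as n verification calls. The 1-dimensional embeddings, names and the 0.4 threshold are toy placeholders, not values from any real model:

```python
# Minimal sketch: verification is a single distance check (O(1)),
# while recognition scans every identity in the database (O(n)).

def verify(emb1, emb2, threshold=0.4):
    # toy "distance": absolute difference of 1-dimensional embeddings
    return abs(emb1 - emb2) < threshold

def recognize(target_emb, database):
    # database: {name: embedding}; one verification call per identity
    return [name for name, emb in database.items()
            if verify(target_emb, emb)]

db = {"alice": 0.10, "bob": 0.90, "carol": 0.15}
print(recognize(0.12, db))  # alice and carol fall within the threshold
```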

Vlog

Now, you can either watch the following vlogs or follow this blog post. They both cover the large scale face recognition topic.

The first one focuses on face verification for face pairs. Face recognition then requires applying face verification several times, and deepface handles this in the background.

Approximate Nearest Neighbor

But this might be problematic for billion-level data. In this case, we can apply an approximate nearest neighbor algorithm to decrease the complexity. Spotify Annoy, Facebook Faiss and NMSLIB are very popular ANN libraries.

Herein, Elasticsearch wraps NMSLIB, and it comes with a highly scalable architecture: we can run it on a multi-node cluster.





Face recognition is a complex task

Both building a face recognition model and calling its prediction function are costly operations.

Time complexity of face recognition models

Imagine the time required to look for a face in a data set of 100 samples, based on the table above. Would we need the model building time plus 100 times the verification time? That would not be acceptable.

The hacker’s way

We might have hundreds of face photos in our database. The verification stage of a pipeline requires finding the distance between the vector of each image in the database and the vector of the target image.

The trick is that we can already store the vector representations of faces in our database.

import os
import pickle
from deepface import DeepFace
#--------------------------
# collect the image paths in the database folder
employees = []
for r, d, f in os.walk(db_path):  # r=root, d=directories, f=files
    for file in f:
        if file.endswith(".jpg"):
            exact_path = os.path.join(r, file)
            employees.append(exact_path)
#--------------------------
# find the vector embedding of each image
representations = []
for employee in employees:
    embedding = DeepFace.represent(
        img_path=employee, model_name="VGG-Face"
    )[0]["embedding"]
    representations.append([employee, embedding])
#--------------------------
# store the embeddings to avoid re-computing them later
with open("representations.pkl", "wb") as f:
    pickle.dump(representations, f)

So, we will already have the representations of the identities in the database when a face recognition task is called.

Then, we only need to find the vector representation of a single target face when we look for a face.

target_path = "target.jpg"
# represent detects and aligns the face internally before embedding it
target_representation = DeepFace.represent(
    img_path=target_path, model_name="VGG-Face"
)[0]["embedding"]

Besides, we could build the face recognition model in advance and wait for a target face. In this case, we only need to spend the time mentioned in the verification row, which lasts less than a second even in the worst case scenario.

Then, all we need is to find distances between target and source image vectors. This could be handled very fast as you imagine.

import pickle
import numpy as np
from deepface.commons import distance as dst

# load representations of faces in the database
with open('representations.pkl', 'rb') as f:
    representations = pickle.load(f)

distances = []
for source_name, source_representation in representations:
    distance = dst.findCosineDistance(source_representation, target_representation)
    distances.append(distance)

# find the identity with the minimum distance
idx = np.argmin(distances)
matched_name = representations[idx][0]

My experiments show that finding an identity is completed in less than a second for a data set consisting of tens of instances if the face recognition model is already built.
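The distance loop above can also be vectorized with NumPy, which scales better as the database grows. The function below is my own sketch of the cosine distance computed over a whole embedding matrix at once; the 2-dimensional vectors are toy placeholders (real VGG-Face embeddings have thousands of dimensions):

```python
import numpy as np

def cosine_distances(sources, target):
    # sources: (n, d) matrix of database embeddings; target: (d,) vector
    sources = np.asarray(sources, dtype=float)
    target = np.asarray(target, dtype=float)
    dots = sources @ target  # one dot product per database row
    norms = np.linalg.norm(sources, axis=1) * np.linalg.norm(target)
    return 1 - dots / norms  # cosine distance = 1 - cosine similarity

db = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
target = [1.0, 1.0]
d = cosine_distances(db, target)
print(int(np.argmin(d)))  # the third vector points in the same direction
```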





Approximate nearest neighbors

Face recognition is actually a type of k-nearest neighbor problem, and it has O(n x d) time complexity where n is the number of instances and d is the number of dimensions of the vectors. This might be problematic for millions of instances.

On the other hand, approximate nearest neighbor algorithms find nearest neighbors much faster. Annoy, a software package developed by the Spotify engineering team, implements an ANN algorithm. It decreases the time complexity to O(log n).

DeepFace

Deepface offers an out-of-the-box find function that handles all of these stages in the background.

# !pip install deepface
from deepface import DeepFace

# find returns a list of pandas data frames, one per detected face
dfs = DeepFace.find(img_path = "target.jpg", db_path = "C:/my_db")

for df in dfs:
   print(df.head())

As seen, this can be handled with just a few lines of code.

Map reduce technology

An approximate nearest neighbor algorithm reduces the time complexity dramatically, but it does not guarantee to always find the true nearest neighbors. Big data technologies and NoSQL databases come with the power of map reduce, and we can run the exact k-NN algorithm easily in this way. Herein, MongoDB, Cassandra, Redis and Hadoop are strong candidates.
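The map reduce idea can be sketched in plain Python: the map step computes the local nearest neighbor inside each data partition, and the reduce step picks the global minimum across partitions. The shard contents and 1-dimensional embeddings below are illustrative placeholders:

```python
# Map-reduce style exact nearest neighbor sketch: each partition finds
# its local best match (map), then the overall best is selected (reduce).

def local_best(partition, target):
    # map step: (name, distance) of the closest item in one partition
    return min(((name, abs(emb - target)) for name, emb in partition),
               key=lambda pair: pair[1])

def global_best(partitions, target):
    # reduce step: the smallest of the local minima
    local_minima = [local_best(p, target) for p in partitions]
    return min(local_minima, key=lambda pair: pair[1])

partitions = [
    [("alice", 0.10), ("bob", 0.90)],   # shard 1
    [("carol", 0.40), ("dave", 0.55)],  # shard 2
]
print(global_best(partitions, 0.52))  # dave is the closest overall
```

Because each shard only scans its own slice of the data, the map steps can run in parallel on different cluster nodes.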


Tech Stack Recommendations

Face recognition is mainly based on representing facial images as vectors. Herein, storing the vector representations is a key factor for building robust facial recognition systems. I summarize the tech stack recommendations in the following video.

Conclusion

So, we’ve mentioned large scale face recognition in this blog post. Even though building a face recognition pipeline is a complex process, applying some hacking skills helps us find a workaround. In this way, we can find a face in a large scale data set in just seconds.

You can support this study by starring the GitHub repo as well.

