Large Scale Face Recognition with Elasticsearch

Previously, we have mentioned large scale face recognition with Spotify Annoy, Facebook Faiss and NMSLIB. Those are spectacular but fairly low-level libraries, which brings scalability troubles in production pipelines. Today, we are going to mention the large scale face recognition task with a more scalable tool: Elasticsearch. Its approximate nearest neighbor functionality is based on NMSLIB, but it comes with a distributed cluster feature by default. Besides, although we will walk through a facial recognition task, the approach could be adapted to any NLP or reverse image search case, because they are all based on vector representations of entities and similarity search.

Dancing Houses, Amsterdam by Burak Arik
Vlog

You can either watch the following vlog or follow this blog. They both cover the large-scale facial recognition with Elasticsearch topic.


πŸ™‹β€β™‚οΈ You may consider to enroll my top-rated machine learning course on Udemy

Decision Trees for Machine Learning

BTW, I strongly recommend watching the following video about the math behind approximate nearest neighbors and its Python implementation from scratch.

Installation

We have to install Elasticsearch and its data visualization dashboard, Kibana. I worked with version 7.6.2 of both in this study, but you should install the latest version. Then, unzip the downloaded files and run bin/elasticsearch and bin/kibana on Linux and macOS, or bin/elasticsearch.bat and bin/kibana.bat on Windows, to get the Elasticsearch server up.

Elasticsearch console
Kibana console

Elasticsearch comes up on port 9200 and Kibana on port 5601 in my environment. Those are the default ports, but they could be different in your environment if some other application is already using them. I highlighted the ports in the console views above.

Elasticsearch client for python

The Elasticsearch server is up and we can communicate with it over a REST service. Herein, we can do it with the low-level Python client as well. We will need the following PyPI distribution.

#https://pypi.org/project/elasticsearch/
!pip install elasticsearch==7.10.0

Thereafter, we can get an Elasticsearch client up in Python programs. Notice that my Elasticsearch server was running on port 9200.

from elasticsearch import Elasticsearch
es = Elasticsearch([{'host': 'localhost', 'port': '9200'}])
Creating an index

We will create an index to store facial representations. The key decisions are the index name and its data structure. The name of the index is face_recognition. I will store the face representation vector and its file name in this index.





mapping = {
    "mappings": {
        "properties": {
            "title_vector":{
                "type": "dense_vector",
                "dims": 128
            },
            "title_name": {"type": "keyword"}
        }
    }
}

es.indices.create(index="face_recognition", body=mapping)

I will use the Google FaceNet model to represent faces as vectors. It expects 160×160 shaped inputs and represents faces as 128-dimensional vectors. That's why the dims value is 128 in the title_vector field. Besides, there are two types of vectors in Elasticsearch: sparse and dense. Every dimension stores a value in my case; that's why the type is dense_vector. A sparse vector, on the other hand, stores only the non-zero dimensions as index and value pairs. The create function returns the following result if the index is created successfully.
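To make the dense versus sparse distinction concrete, here is a minimal pure-Python sketch (not Elasticsearch code): a dense vector carries a value in every dimension, while a sparse layout keeps only the non-zero dimensions as index and value pairs.

```python
# A FaceNet embedding is dense: every dimension carries a value.
dense = [0.12, -0.45, 0.0, 0.87]  # toy 4-dim example instead of 128

# A sparse layout keeps only the non-zero dimensions as index -> value pairs.
sparse = {i: v for i, v in enumerate(dense) if v != 0.0}
print(sparse)  # {0: 0.12, 1: -0.45, 3: 0.87}

# Reconstructing the dense form recovers the original vector.
restored = [sparse.get(i, 0.0) for i in range(len(dense))]
assert restored == dense
```

A 128-dimensional FaceNet embedding has a value in almost every dimension, so the dense layout is the natural fit here.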

{'acknowledged': True,
'index': 'face_recognition',
'shards_acknowledged': True}

You can monitor the index in the Kibana dashboard now. Visit localhost:5601, then follow Dashboard (in the left menu) > Elasticsearch > Index Management.

Index management in Kibana dashboard

This command should be run only once. If you run the index creation command twice, it will return a resource already exists exception.

Face recognition pipeline

A modern face recognition pipeline consists of 4 common stages: detect, align, represent and verify.

Face recognition model

We will build FaceNet model via deepface framework for Python.

Deepface framework for python wraps several state-of-the-art face recognition models: VGG-Face, Google FaceNet, OpenFace, Facebook DeepFace, DeepID, Dlib and ArcFace.

We set the dimension of the dense vector to 128 in the previous step. This is actually the output size of the FaceNet model.

If you use a different face recognition model, the dimension size might be different. For example, VGG-Face represents faces as 2622-dimensional vectors.
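In other words, the dims field in the index mapping has to match the output size of whichever model you pick. A small sketch of that idea; the embedding sizes below are taken from the deepface documentation, so double-check them against the version you install:

```python
# Embedding sizes of some models wrapped by deepface
# (values per the deepface documentation; verify for your version).
MODEL_DIMS = {
    "VGG-Face": 2622,
    "Facenet": 128,
    "OpenFace": 128,
    "ArcFace": 512,
}

def build_mapping(model_name):
    """Build the face_recognition index mapping for a given model."""
    return {
        "mappings": {
            "properties": {
                "title_vector": {
                    "type": "dense_vector",
                    "dims": MODEL_DIMS[model_name],
                },
                "title_name": {"type": "keyword"},
            }
        }
    }

print(build_mapping("VGG-Face")["mappings"]["properties"]["title_vector"]["dims"])  # 2622
```

Passing build_mapping("Facenet") to es.indices.create would reproduce the mapping we defined earlier.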





FaceNet, VGG-Face, Dlib and ArcFace outperform the others. Here, you can watch how to determine the best model.

Storing face database in elasticsearch

I will read the image names and paths in the unit test folder of deepface. This is going to be my face database.

import os

files = []
for r, d, f in os.walk("deepface/tests/dataset/"):
    for file in f:
        if file.endswith(".jpg"):
            # os.path.join adds the separator that plain concatenation misses in subfolders
            files.append(os.path.join(r, file))

As you might remember, a modern face recognition pipeline consists of 4 common stages: detect, align, represent and verify. The represent function of the deepface framework handles detection and alignment in the background. We've already built the FaceNet model before. We will feed each facial image to the FaceNet model to find its vector representation, then store the face embedding and its exact path in the Elasticsearch index.

from deepface import DeepFace

for index, img_path in enumerate(files):
    # detection, alignment and representation are handled by deepface
    embedding = DeepFace.represent(img_path = img_path, model_name = "Facenet")[0]["embedding"]

    doc = {"title_vector": embedding, "title_name": img_path}
    es.create("face_recognition", id=index, body=doc)

Notice that you should store the facial vector representations of your database into the Elasticsearch once.
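If you expect to re-run the loader, one way to keep it idempotent is to derive the document id from the file path instead of a running counter, so the same image always targets the same id. A small sketch; doc_id_for is a hypothetical helper name of mine, not part of deepface or elasticsearch:

```python
import hashlib

def doc_id_for(img_path):
    """Derive a stable Elasticsearch document id from the image path."""
    return hashlib.sha256(img_path.encode("utf-8")).hexdigest()[:16]

# The same path always maps to the same id, so calling
# es.index(..., id=doc_id_for(path), body=doc) on a re-run would
# overwrite the existing doc instead of creating a duplicate.
print(doc_id_for("deepface/tests/dataset/img1.jpg"))
```

Note that es.create raises a conflict for an existing id, while es.index overwrites it; the latter is the forgiving choice for repeated loads.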

There are 61 instances in my database. You could store millions of instances and it would not be a big deal for Elasticsearch. You can monitor the final form of your index in Kibana as well. As seen, there are 61 docs in my index.

Index details in the Kibana dashboard
Target

I would like to search the following target image in the elasticsearch index.

Target image

Let’s find its vector representation first.

target_path = "target.jpg"
target_img = DeepFace.extract_faces(target_path)[0]["face"]
target_embedding = DeepFace.represent(img_path = target_path, model_name = "Facenet")[0]["embedding"]
Elasticsearch query

We will create a query to search the Elasticsearch index. We'll pass the target embedding as the query vector and define the distance metric in the source field. I show the usage of both euclidean distance and cosine similarity below.

query = {
    "size": 5,
    "query": {
        "script_score": {
            "query": {
                "match_all": {}
            },
            "script": {
                #"source": "cosineSimilarity(params.queryVector, 'title_vector') + 1.0",
                "source": "1 / (1 + l2norm(params.queryVector, 'title_vector'))", #euclidean distance
                "params": {
                    "queryVector": list(target_embedding)
                }
            }
        }
    }
}
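The two source expressions are easy to reproduce in plain Python. This sketch mirrors what the Painless script computes for each stored vector: both scores are shifted so they stay positive, as Elasticsearch requires.

```python
import math

def l2norm(u, v):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

query_vec = [1.0, 0.0]
stored = [0.0, 1.0]

# Euclidean distance is mapped into (0, 1]: identical vectors score 1.
euclidean_score = 1 / (1 + l2norm(query_vec, stored))

# Cosine similarity lies in [-1, 1], so +1.0 keeps the score non-negative.
cosine_score = cosine_similarity(query_vec, stored) + 1.0

print(round(euclidean_score, 4), round(cosine_score, 4))
```

In both cases a higher score means a closer match, which is why the nearest neighbors appear first in the response.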

Then, we will pass the query to the Elasticsearch index. Elasticsearch returned a response in 0.007 seconds in my environment! As the number of docs in the index grows, Elasticsearch can still respond in milliseconds thanks to its distributed architecture.





res = es.search(index="face_recognition", body=query)

Here, the response object stores the nearest neighbors of the target image. We need the exact image path, which was stored in the title_name field, and its similarity score.

import matplotlib.pyplot as plt

for i in res["hits"]["hits"]:
    candidate_name = i["_source"]["title_name"]
    candidate_score = i["_score"]
    print(candidate_name, ": ", candidate_score)

    candidate_img = DeepFace.extract_faces(candidate_name)[0]["face"]

    fig = plt.figure()

    # target image on the left, candidate on the right
    # extract_faces already returns a single RGB image, so no
    # batch index or channel reversal is needed
    ax1 = fig.add_subplot(1, 2, 1)
    plt.imshow(target_img)
    plt.axis('off')

    ax2 = fig.add_subplot(1, 2, 2)
    plt.imshow(candidate_img)
    plt.axis('off')

    plt.show()

    print("-------------------------")

This returns the following items as nearest neighbors in my Elasticsearch index.

Nearest neighbors

We might call this approach Elastic-Face from now on πŸ™‚

Lightweight way

Sometimes you don't have to have a database in your architecture at all, if your data is at the hundreds or thousands level.

Face recognition requires applying face verification several times. Deepface handles this in the background.

Map Reduce Technology

The approximate nearest neighbor algorithm reduces the time complexity dramatically, but it does not guarantee to always find the closest ones. If you have millions of instances and your concern is not to discard important candidates, big data systems and map reduce technology might meet your needs. Herein, mongoDB, Cassandra, Redis and Hadoop are the most popular solutions.
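For comparison, an exact (brute-force) nearest neighbor search is trivial to express and never discards a true neighbor; it just costs a full scan per query, which is exactly the kind of work map reduce systems parallelize across nodes. A minimal sketch:

```python
import math

def exact_knn(query, vectors, k=3):
    """Brute-force k nearest neighbors by Euclidean distance.

    Scans every stored vector, so it is O(n) per query, but unlike
    approximate methods it never misses a true nearest neighbor."""
    return sorted(range(len(vectors)), key=lambda i: math.dist(query, vectors[i]))[:k]

db = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [0.1, 0.0]]
print(exact_knn([0.0, 0.1], db, k=2))  # [0, 3]: the two closest vectors
```

With embeddings sharded across a cluster, each node would run this scan on its shard and a reducer would merge the per-shard top-k lists, which is the map reduce pattern in a nutshell.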

Elephant is an iconic mascot of hadoop
Tech Stack Recommendations

Face recognition is mainly based on representing facial images as vectors. Herein, storing the vector representations is a key factor for building robust facial recognition systems. I summarize the tech stack recommendations in the following video.





Conclusion

So, we've mentioned how to use Elasticsearch for a large scale face recognition task. Its approximate nearest neighbor functionality is based on the core framework NMSLIB, but it comes with a highly scalable, production-driven design, and we can run it on many clusters. In this way, we can search over billions of vectors very fast. This is basically the map reduce technology appearing in Hadoop.

Besides, even though we've adapted the face recognition task to Elasticsearch, we can apply this approach to any NLP or reverse image search problem, because they are all based on representing entities as vectors and the a-nn algorithm.

I pushed the source code of this study to GitHub. You can support this study if you star⭐️ the repo.


Support this blog if you like it!

Buy me a coffee

