Deep Face Recognition with Cassandra

We mostly need the map reduce power of NoSQL databases when the data becomes really massive. Herein, Apache Cassandra is a handy tool to store and process massive data. In this post, we are going to cover Cassandra from its installation to running a face recognition pipeline on top of it.

Cassandra for face recognition
Vlog

You can either read this post or watch the following video. They both cover face recognition with the Apache Cassandra wide column store.


πŸ™‹β€β™‚οΈ You may consider to enroll my top-rated machine learning course on Udemy

Decision Trees for Machine Learning

Vector Similarity Search with Cassandra

Cassandra server

Installing Cassandra on Windows is easy, but I had trouble installing it on Mac. I mostly install packages with the homebrew package manager, but Cassandra would not start when I installed it that way. That's why I had to install it manually. The following video might help if you plan to install it on your Mac.

I’ve done my experiments with the 3.11.9 release. It was the latest stable version when I was writing this post. It is distributed as a zip archive, so all you need to do is unzip it. Then, change your directory to the bin folder under the Cassandra home in the command prompt and run cassandra.bat on Windows or cassandra on Linux.

Starting Cassandra Server

The Cassandra server is now up on localhost at port 9042.

You can communicate with your Cassandra database through the Cassandra query language shell (cqlsh for short). It is accessible at CASSANDRA_HOME/bin/cqlsh. However, it requires Python 2.X, so I had to create a virtual environment with conda based on Python 2.7.18 to run the cqlsh command. Notice that you should run cqlsh in a separate command prompt window while the Cassandra server keeps running in its own window.





conda create --name cassandra python=2.7.18
conda activate cassandra
cd c:\apache-cassandra-3.11.9\bin
cqlsh
Cassandra query language shell (cqlsh)

Once cqlsh is up, you can run any database related command here.

The equivalent of the database term in an RDBMS is a keyspace in Cassandra. Let’s create the database first.

CREATE KEYSPACE deepface WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};
Creating keyspace (database)

Once your keyspace is created, you should see its name when you describe the keyspaces.

DESCRIBE keyspaces;
Describing keyspaces

As I mentioned before, we can run any database related command in cqlsh. However, we will run the create table, insert and select commands within Python in the next section.

Face recognition pipeline

A modern face recognition pipeline consists of 4 common stages: detect, align, represent and verify.

Both detection and alignment are preprocessing stages. Face recognition models are responsible for representing facial images as vectors. Those concepts might be confusing for newbies. You should watch the following video to find out how a face recognition pipeline works.

Herein, the deepface framework for Python covers all of those stages. Besides, it wraps several state-of-the-art face recognition models such as VGG-Face, Google FaceNet, OpenFace, Facebook DeepFace, Dlib and ArcFace.
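
For instance, a single verify call runs all four stages under the hood. The snippet below is only a minimal sketch; the two image paths are sample items from the unit tests folder of the deepface repo, and you can switch the model_name argument to any of the wrapped models.

from deepface import DeepFace

#detect, align, represent and verify two facial images end-to-end
result = DeepFace.verify(
    img1_path = "deepface/tests/dataset/img1.jpg",
    img2_path = "deepface/tests/dataset/img2.jpg",
    model_name = "Facenet"
)
print(result["verified"]) #True if both images belong to the same person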

FaceNet, VGG-Face and ArcFace outperform the others. Here, you can watch how to determine the best model.





FaceNet model

I will use the Google FaceNet model to represent facial images as vectors. The model expects inputs of shape (160, 160, 3) and represents facial images as 128-dimensional vectors.

from deepface import DeepFace
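
If you want to warm the model up once before looping over many images, deepface exposes a build_model helper as sketched below. This step is optional; recent deepface versions also build and cache the model on demand inside represent.

#optional: load the FaceNet weights once up front
facenet_model = DeepFace.build_model("Facenet")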
Creating local facial database

I will use the unit test items of deepface as a facial database. Once you clone the repo into your environment, you can get the list of image names with the os.walk function.

import os

facial_img_paths = []
#Available at: https://github.com/serengil/deepface/tree/master/tests/dataset
for root, directory, files in os.walk("deepface/tests/dataset"):
    for file in files:
        if '.jpg' in file:
            facial_img_paths.append(root+"/"+file)
Face detection and alignment

We stored the names of the facial images in the facial_img_paths list. We then read each image and apply the preprocessing steps. Luckily, deepface offers an extract_faces function covering both detection and alignment.

Deepface actually wraps several face detectors: opencv, ssd, dlib and mtcnn. Herein, mtcnn is the most robust one, but it is the slowest as well. SSD is the fastest one, but its alignment accuracy is not as robust as mtcnn’s. You can watch the face detection performance of those backends in the following video.

Herein, retinaface is the cutting-edge technology for face detection. It can even detect faces in a crowd. Besides, it finds some facial landmarks, including eye coordinates, so its alignment score is high as well.
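
The detector backend is just a parameter in deepface. The following sketch shows how detection and alignment could be run explicitly; the image path is a sample item from the unit tests folder, and you can swap retinaface for opencv, ssd, dlib or mtcnn.

#detect and align a face with a chosen backend
face_objs = DeepFace.extract_faces(
    img_path = "deepface/tests/dataset/img1.jpg",
    detector_backend = "retinaface",
    align = True
)
detected_face = face_objs[0]["face"] #aligned facial image as a numpy array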

Preprocessing

We will then pass the preprocessed face into the FaceNet model to obtain a facial representation.

import pandas as pd
from tqdm import tqdm

instances = []
for i in tqdm(range(0, len(facial_img_paths))):
    facial_img_path = facial_img_paths[i]
    #detect, align and represent the facial image as a 128-dimensional vector
    embedding = DeepFace.represent(img_path = facial_img_path, model_name = "Facenet")[0]["embedding"]
    
    instance = []
    instance.append(facial_img_path)
    instance.append(embedding)
    instances.append(instance)

df = pd.DataFrame(instances, columns = ["img_name", "embedding"])

The facial image name and its representation are stored as columns of the pandas data frame.





Local data set
Connecting to Cassandra

We’ve already created the deepface keyspace in cqlsh. It’s time to connect to it within Python. The cassandra-driver package offers a convenient interface to communicate with the Cassandra server.

#!pip install cassandra-driver
from cassandra.cluster import Cluster
cluster = Cluster(['127.0.0.1'], port=9042)
session = cluster.connect('deepface', wait_for_all_pools = True)

Notice that I connected to the host and port reported in the window where cassandra.bat was started.
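
As a quick sanity check, you can query one of Cassandra's built-in system tables through the freshly opened session; this is just a sketch to confirm the connection is alive.

#confirm the session works by reading the node's metadata
row = session.execute("SELECT cluster_name, release_version FROM system.local").one()
print(row.cluster_name, row.release_version)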

Creating tables

The table we need should store the file name and its vector representation. We could create it within cqlsh, but we can do it within Python as well.

session.execute('DROP TABLE IF EXISTS deepface.embeddings;')
session.execute('CREATE TABLE deepface.embeddings(img_id int PRIMARY KEY, img_name text, embedding list<double>);')

Cassandra requires a primary key. The exact image name is a unique value, so we could set the primary key to the image name. Alternatively, we can set it to an increasing integer id. My choice is the latter in this experiment.
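
For reference, the first option would look like the hypothetical table below, where the image name itself acts as the primary key; it is not the schema used in the rest of this post.

#hypothetical alternative schema keyed by the image name
session.execute('CREATE TABLE IF NOT EXISTS deepface.embeddings_by_name(img_name text PRIMARY KEY, embedding list<double>);')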

Storing the local database in Cassandra

We’ve stored the local facial database in a pandas data frame. We will iterate over its rows and execute an insert statement for each row.

for index, instance in tqdm(df.iterrows(), total = df.shape[0]):
    img_name = instance["img_name"]
    embedding = list(instance["embedding"])
    statement = "insert into deepface.embeddings (img_id, img_name, embedding) values (%d, '%s', %s);" % (index, img_name, embedding)
    
    session.execute(statement)
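
String formatting works for this experiment, but the driver can also bind the values itself through a prepared statement. The following is a sketch of that alternative over the same data frame.

#alternative: let cassandra-driver bind the values via a prepared statement
insert_stmt = session.prepare("INSERT INTO deepface.embeddings (img_id, img_name, embedding) VALUES (?, ?, ?)")
for index, instance in tqdm(df.iterrows(), total = df.shape[0]):
    session.execute(insert_stmt, (int(index), instance["img_name"], list(instance["embedding"])))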

We can query the table in cqlsh when execution is over.
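
If you prefer to stay in Python, the same peek can be done through the open session as well; the limit below is arbitrary.

#peek at a few stored rows
for row in session.execute('SELECT img_id, img_name FROM deepface.embeddings LIMIT 5;'):
    print(row.img_id, row.img_name)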

Query
Target image

We stored the embeddings of lots of images in Cassandra. As a use case, we will feed a new image and expect Cassandra to find the same person in the db. We have to apply the preprocessing and representation stages to the target image as well.

target_img_path = "target.png"
#detect and align the target face
target_img = DeepFace.extract_faces(img_path = target_img_path)[0]["face"]
#represent the target face as a 128-dimensional vector
target_embedding = DeepFace.represent(img_path = target_img_path, model_name = "Facenet")[0]["embedding"]

That’s the target image whose identity we will look for in the database.

Angelina Jolie as a target
Client side solution

We can retrieve the embeddings in the facial database and then find the distance of each item to the target image on the client side.

rows = session.execute('SELECT * FROM deepface.embeddings')

instances = []
for row in rows:
    instance = []
    instance.append(row.img_name)
    instance.append(row.embedding)
    instances.append(instance)

retrieved_df = pd.DataFrame(instances, columns = ["img_name", "embedding"])

The retrieved data frame stores the image name and its representation as columns. We should add the target representation as a column as well.





import numpy as np

target_duplicated = np.array([target_embedding,]*retrieved_df.shape[0])
retrieved_df['target'] = target_duplicated.tolist()
Data frame

We need to find the Euclidean distance between the embedding and target columns for each row.

def findEuclideanDistance(row):
    source = np.array(row['embedding'])
    target = np.array(row['target'])
    
    distance = (source - target)
    return np.sqrt(np.sum(np.multiply(distance, distance)))

retrieved_df['distance'] = retrieved_df.apply(findEuclideanDistance, axis = 1)

The threshold value is 10 for the FaceNet model and Euclidean distance pair. We should discard the rows having a larger distance value. You can see the threshold values for other model and metric pairs here.

retrieved_df = retrieved_df[retrieved_df['distance'] <= 10]
retrieved_df = retrieved_df.sort_values(by = ['distance'])
print(retrieved_df[['img_name', 'distance']])

The following items are the same person as the target image. They are all really Angelina Jolie.

Results for client side approach

This approach becomes more costly as the number of instances increases. Notice that it has O(n x d) time complexity, where n is the number of items in the database and d is the number of dimensions in the representation vector (128 for FaceNet).

Server side solution

As an alternative to the client side approach, we can pass the representation of the target image to Cassandra and expect it to find the distance values.

Cassandra query language (CQL) is not as easy to work with as SQL. We have to define a function to find the distance value. Luckily, we can write Java code in user-defined functions here. You can run the following command in either cqlsh or Python.

CREATE FUNCTION euclidean(source list<double>, target list<double>) CALLED ON NULL INPUT RETURNS double LANGUAGE java AS '
	double  distance = 0;
	for (int i=0;i<source.size();i++){
		double p = source.get(i);
		double q = target.get(i);
		
		distance = distance + (p - q) * (p - q);
	}
	distance = java.lang.Math.sqrt(distance);
	return distance;
';

By default, creating the function returns the exception “User-defined functions are disabled in cassandra.yaml – set enable_user_defined_functions=true to enable”. You should modify that parameter in CASSANDRA_HOME/conf/cassandra.yaml.
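
For the 3.11.x line, the relevant entry in cassandra.yaml is expected to look like the following; its default value is false, so flip it to true and restart the server.

enable_user_defined_functions: true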

Once the function is created, we can call it in select statements.

statement = 'SELECT img_name, deepface.euclidean(embedding, %s) as distance ' \
            'FROM deepface.embeddings' % (target_embedding)

rows = session.execute(statement)

instances = []
for row in rows:
    instance = []
    instance.append(row.img_name)
    instance.append(row.distance)
    instances.append(instance)

result_df = pd.DataFrame(instances, columns = ["img_name", "distance"])

The result data frame stores the image name and its distance to the target image in each row. All items in the db appear in the pandas data frame as well. That has O(n) time complexity, where n is the number of items in the database. Notice that the client side approach has O(n x d) time complexity. Actually, d can be neglected when n is massive.

Pandas can discard the distant ones fast.





result_df = result_df[result_df['distance'] < 10]
result_df = result_df.sort_values(by = ["distance"]).reset_index(drop = True)
Results for server side approach

Let’s see the images themselves.

Result set

To be honest, Cassandra is not a high level database like Hive. Unfortunately, we cannot add conditions on computed columns such as distance to the where clause. If we could, we could literally use the power of map reduce. However, Cassandra returns the distance of all instances and we have to discard the distant ones on the client side.

Remember that we could do that in MongoDB.

Lightweight way

If your task does not require high scalability, then a lightweight way exists!

Validation

Even though the client side and server side approaches return the same result, we can validate the results for the same sample set within deepface.

val_dfs = DeepFace.find(img_path = target_img_path, db_path = "deepface/tests/dataset"
, model_name = 'Facenet', distance_metric = 'euclidean'
, detector_backend = 'opencv')
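
In recent deepface versions, find returns a list of data frames, one per face detected in the target image; the exact distance column name differs a bit between versions, but the identity column is always there. A minimal way to inspect the matches:

#the first data frame holds the matches of the first (and only) detected face
print(val_dfs[0]["identity"].values)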

To get more information about its verify and find functions, you should watch the following videos.

Face recognition requires applying face verification several times.





The Best Single Model

DeepFace has many cutting-edge models in its portfolio. Find out the best configuration for facial recognition model, detector, similarity metric and alignment mode.

DeepFace API

DeepFace offers a web service for face verification, facial attribute analysis and vector embedding generation through its API. You can watch a tutorial on using the DeepFace API here:

Additionally, DeepFace can be run with Docker to access its API. Learn how in this video:

Large scale face recognition

Even though the power of map reduce is important, we can handle large scale face recognition with limited hardware. You might consider some solutions based on approximate nearest neighbors: Elasticsearch, Annoy, Faiss or NMSLIB. Those libraries reduce the time complexity dramatically.
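
As a taste of that approach, the sketch below indexes the FaceNet embeddings we already collected in the pandas data frame with Annoy (assuming pip install annoy) and queries the approximate nearest neighbors of the target vector.

#approximate nearest neighbor search over the 128-dimensional embeddings
from annoy import AnnoyIndex

annoy_index = AnnoyIndex(128, 'euclidean')
for i, embedding in enumerate(df['embedding']):
    annoy_index.add_item(i, list(embedding))
annoy_index.build(10) #10 trees

#5 approximate nearest neighbors of the target face
neighbor_ids = annoy_index.get_nns_by_vector(target_embedding, 5)
print(df.iloc[neighbor_ids]['img_name'].values)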

Other NoSQL solutions

As an alternative to Cassandra, MongoDB, Hadoop and Redis are strong NoSQL solutions. They come with the power of the map reduce technology. In particular, Redis is a key value store and Cassandra behaves like one when items are looked up by their primary key, so they offer high performance for the face verification task rather than for face recognition.

The elephant is the iconic mascot of Hadoop

Super Fast Vector Search

In this post, we focused on using the k-NN algorithm to find similar vectors. However, this approach becomes problematic with large databases due to its time complexity of O(n + n log(n)). Imagine indexing all images on Google! To address this, we use the approximate nearest neighbor algorithm, which significantly reduces complexity and allows for super-fast vector searches. With this method, you can find the nearest vectors in a billion-scale database in just milliseconds. Many vector databases and indexing tools, such as Annoy, Faiss, ElasticSearch, NMSLIB, and Redis, adopt a similar approach.

Tech Stack Recommendations

Face recognition is mainly based on representing facial images as vectors. Herein, storing the vector representations is a key factor for building robust facial recognition systems. I summarize the tech stack recommendations in the following video.





Conclusion

So, we’ve covered how to run a face recognition pipeline with Cassandra in this post. This is a fully experimental study. It might be problematic for really large data, because we cannot literally use the power of map reduce in Cassandra due to its limitations.

I pushed the source code of this study to GitHub as a notebook. You can support this study by starring⭐️ the repo.


Support this blog if you like it!

Buy me a coffee